It can be useful to identify the components that make up a table before getting into the nitty gritty of generating tables. Why? This will give us a language to speak about table composition and, in doing so, we’ll discover the merits of the different components, how they work together, and finally understand how to make effective table visualizations when the time comes. Tabular presentation can serve several purposes. We may want to present raw data values in unvarnished form, so that others can perform analyses from that data. We could also prepare a set of summarized results and tabulate that; this provides the reader with the results of an analysis. We could even produce a table with varied information on a particular topic and present it in a visually appealing way. With creativity and an eye to aesthetics, the reader will linger for a bit and explore the presented data, perhaps leaving with heightened understanding of the topic at hand and a feeling of edification.
The gt package can make all of this possible. The general principle is that we combine table components together and refine the presentation bit by bit. We’ve prepared a basic diagram here, showing how the main components of a table (and their subcomponents) fit together:
Here is a listing of the table components (from top to bottom):
the Table Header (optional; with a title and possibly a subtitle)
the Stub and the Stub Head (optional; contains row labels, optionally within row groups having row group labels and possibly summary labels when a summary is present)
the Column Labels (contains column labels, optionally under spanner column labels)
the Table Body (contains columns and rows of cells)
the Table Footer (optional; possibly with footnotes and source notes)
Once we have the input data for the table to present, we need to decide which of these components should be used. This chapter will show you how to introduce the input data to gt and how to add the various components together. Generally, the functions that produce or modify these table components will begin with tab_*() and the components will only be displayed when there is content for them (e.g., there won’t be a table footer unless you use a function that adds content to a footer). Now let’s get to making gt tables by starting with the very important, entry-point function: gt().
2.1 Making a gt table: start with gt()
When one provides table data to the gt() function, it generates a gt table object. This function is the initial step in a typical gt workflow. Once you possess the gt table object, you have the ability to perform styling transformations before rendering it as a display table of different formats.
Let’s use the exibble dataset for the next few examples as we learn how to make simple gt tables with the gt() function. The most basic thing to do is to just use gt() with the dataset as the input.
exibble |> gt()

| num | char | fctr | date | time | datetime | currency | row | group |
|---|---|---|---|---|---|---|---|---|
| 1.111e-01 | apricot | one | 2015-01-15 | 13:35 | 2018-01-01 02:22 | 49.950 | row_1 | grp_a |
| 2.222e+00 | banana | two | 2015-02-15 | 14:40 | 2018-02-02 14:33 | 17.950 | row_2 | grp_a |
| 3.333e+01 | coconut | three | 2015-03-15 | 15:45 | 2018-03-03 03:44 | 1.390 | row_3 | grp_a |
| 4.444e+02 | durian | four | 2015-04-15 | 16:50 | 2018-04-04 15:55 | 65100.000 | row_4 | grp_a |
| 5.550e+03 | NA | five | 2015-05-15 | 17:55 | 2018-05-05 04:00 | 1325.810 | row_5 | grp_b |
| NA | fig | six | 2015-06-15 | NA | 2018-06-06 16:11 | 13.255 | row_6 | grp_b |
| 7.770e+05 | grapefruit | seven | NA | 19:10 | 2018-07-07 05:22 | NA | row_7 | grp_b |
| 8.880e+06 | honeydew | eight | 2015-08-15 | 20:20 | NA | 0.440 | row_8 | grp_b |
From this, we get a very simple table with column labels and all of the body cells below. This is the simplest form of a gt table: it doesn’t restructure the data at all and closely resembles what you’d see when printing a tibble or data frame in the R console. The key difference is that you now have a presentation-ready table that can be rendered in HTML, PDF, or other formats. While this basic output is functional, the gt() function offers several arguments that let you add structure and context to your table right from the start.
This dataset has the row and group columns. The former contains unique values that are ideal for labeling rows, and this often happens in what is called the ‘stub’ (a reserved area that serves to label rows). With the gt() function, we can immediately place the contents of the row column into the stub column. To do this, we use the rowname_col argument with the name of the column to use in quotes.
exibble |> gt(rowname_col = "row")

| | num | char | fctr | date | time | datetime | currency | group |
|---|---|---|---|---|---|---|---|---|
| row_1 | 1.111e-01 | apricot | one | 2015-01-15 | 13:35 | 2018-01-01 02:22 | 49.950 | grp_a |
| row_2 | 2.222e+00 | banana | two | 2015-02-15 | 14:40 | 2018-02-02 14:33 | 17.950 | grp_a |
| row_3 | 3.333e+01 | coconut | three | 2015-03-15 | 15:45 | 2018-03-03 03:44 | 1.390 | grp_a |
| row_4 | 4.444e+02 | durian | four | 2015-04-15 | 16:50 | 2018-04-04 15:55 | 65100.000 | grp_a |
| row_5 | 5.550e+03 | NA | five | 2015-05-15 | 17:55 | 2018-05-05 04:00 | 1325.810 | grp_b |
| row_6 | NA | fig | six | 2015-06-15 | NA | 2018-06-06 16:11 | 13.255 | grp_b |
| row_7 | 7.770e+05 | grapefruit | seven | NA | 19:10 | 2018-07-07 05:22 | NA | grp_b |
| row_8 | 8.880e+06 | honeydew | eight | 2015-08-15 | 20:20 | NA | 0.440 | grp_b |
This sets up a table with a stub: the row labels are placed within the stub column, and a vertical dividing line has been placed on its right-hand side.
The group column can be used to divide the rows into discrete groups. Within that column, we see repetitions of the values "grp_a" and "grp_b". These serve both as ID values and the initial label for the groups. With the groupname_col argument in gt(), we can set up the row groups immediately upon creation of the table.
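As a minimal sketch of the groupname_col argument described above (the object name tbl_grouped is only illustrative), the row column can supply the stub labels while the group column defines the row groups:

```r
library(gt)

# Row labels come from the `row` column; row groups come from `group`
tbl_grouped <- exibble |>
  gt(rowname_col = "row", groupname_col = "group")

tbl_grouped
```

The values "grp_a" and "grp_b" then serve as both the group IDs and the initial group labels.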
If you’d rather set up row groups later (i.e., not in the gt() call), this is possible with the tab_row_group() function (and row_group_order() can help with the arrangement of row groups).
One more thing to consider with row groups is their layout. By default, row group labels reside in separate rows that appear above the group. However, we can use the row_group_as_column = TRUE option to put the row group labels within a secondary column within the table stub.
This could be done later if need be, and using tab_options(row_group.as_column = TRUE) would be the way to do it outside of the gt() call.
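The two routes to the same layout can be sketched as follows; the object names are illustrative, and both tables are expected to place the group labels in a column of the stub:

```r
library(gt)

# Option 1: request the layout when the table is created
tbl_in_call <- exibble |>
  gt(
    rowname_col = "row",
    groupname_col = "group",
    row_group_as_column = TRUE
  )

# Option 2: create the table first, then switch the layout with tab_options()
tbl_later <- exibble |>
  gt(rowname_col = "row", groupname_col = "group") |>
  tab_options(row_group.as_column = TRUE)

tbl_in_call
```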
2.1.1 Multi-column stubs for hierarchical row labels
When your data has natural hierarchical structure, you can create a multi-column stub by passing a vector of column names to rowname_col. This feature is particularly useful for financial reports with account hierarchies, clinical trial tables with multiple levels of categorization, or any situation where rows have parent-child relationships.
Let’s create a dataset with a two-level hierarchy (region and category) and display it with a multi-column stub:
The multi-column stub creates a clean visual hierarchy. Notice that repeating values in the first stub column (the region) are automatically consolidated, making it clear which categories belong to each region. The tab_stubhead() function also accepts a vector of labels, one for each level of the hierarchy.
This feature works seamlessly with formatting and styling functions. The stub columns are treated as a unit, allowing you to apply styles to the entire stub area while still being able to reference individual stub columns when needed.
2.1.2 Additional gt() options
Some datasets have rownames built in (mtcars famously has the car model names as the rownames). To use those rownames as row labels in the stub, the rownames_to_stub = TRUE option will prove to be useful.
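A short sketch of that option, using the first five rows of mtcars purely to keep the table small:

```r
library(gt)

# The car model names live in the data frame's rownames;
# rownames_to_stub = TRUE moves them into the table stub
tbl_cars <- mtcars |>
  head(5) |>
  gt(rownames_to_stub = TRUE)

tbl_cars
```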
By default, values in the body of a gt table (and their column labels) are automatically aligned. The alignment is governed by the types of values in a column. If you’d like to disable this form of auto-alignment, the auto_align = FALSE option can be taken.
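Here is a minimal sketch of turning auto-alignment off, assuming the same exibble setup used earlier in this section:

```r
library(gt)

# Disable type-based alignment of body values and column labels
tbl_no_align <- exibble |>
  gt(rowname_col = "row", auto_align = FALSE)

tbl_no_align
```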
What you’ll get from that is center-alignment of all table body values and all column labels. Note that row labels in the stub are still left-aligned (though it’s hard to see that in the previous example); auto_align has no effect on alignment within the table stub. It’s generally not recommended to use auto_align = FALSE since the automatic alignment choices are quite reasonable for most tables.
Whichever way you generate the initial gt table object, you can use it with a huge variety of functions in the package to further customize the presentation. Formatting body cells is commonly done with the family of formatting functions (e.g., fmt_number(), fmt_date(), etc.). The package supports formatting with internationalization (‘i18n’ features) and so locale-aware functions come with a locale argument. To avoid having to use that argument repeatedly, the gt() function has its own locale argument. Setting a locale there will make it available globally. Here’s an example of how that works in practice when setting locale = "fr" in gt() and using formatting functions:
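One way this could look is sketched below. The specific date_style and format values are assumptions chosen for illustration; any valid date style or CLDR datetime pattern would demonstrate the same locale behavior.

```r
library(gt)

tbl_locale <- exibble |>
  gt(
    rowname_col = "row",
    groupname_col = "group",
    locale = "fr"              # table-wide default locale
  ) |>
  fmt_number() |>              # picks up "fr" from gt()
  fmt_date(
    columns = date,
    date_style = "yMEd"        # assumed style, also formatted with "fr"
  ) |>
  fmt_datetime(
    columns = datetime,
    format = "EEEE, MMMM d, y",
    locale = "en"              # overrides the table-wide "fr" default
  )

tbl_locale
```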
In this example, the fmt_number() and fmt_date() functions understand that the locale for this table is "fr" (French), so the appropriate formatting for that locale is apparent in the num, currency, and date columns. However, in the fmt_datetime() call, we explicitly use the "en" (English) locale. This overrides the "fr" default set for this table and the end result is dates formatted with the English locale in the datetime column.
The process_md argument controls whether the contents of rowname_col and groupname_col should be interpreted as Markdown. By default (FALSE), the text appears literally. When set to TRUE, gt will render Markdown syntax in your row labels and row group labels. This is useful when your stub or grouping data contains formatted text.
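A small sketch of process_md in use follows. The products data frame is hypothetical and its prices are placeholder values; only the Markdown in the item and category columns matters here.

```r
library(gt)

# Hypothetical data with Markdown in the label columns (placeholder prices)
products <- data.frame(
  item = c("**Premium** Widget", "*Basic* Widget"),
  category = c("**Hardware**", "**Hardware**"),
  price = c(49.95, 19.95)
)

# process_md = TRUE renders the Markdown in row labels and row group labels
tbl_md <- products |>
  gt(
    rowname_col = "item",
    groupname_col = "category",
    process_md = TRUE
  )

tbl_md
```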
Without process_md = TRUE, you would see the literal Markdown syntax (e.g., **Premium** Widget instead of Premium Widget) in the stub and row group labels. Note that process_md specifically affects the rowname_col and groupname_col content; for Markdown in other parts of the table (like column labels or headers), use the md() helper function. If you need Markdown text in the table body cells to be rendered, use the fmt_markdown() formatting function, which is covered in Chapter 4.
2.2 The table header and footer
It’s possible to add a table header to the gt table, which includes a title and even a subtitle. The table header is an optional component of the table that appears above the column labels. You have the flexibility to use Markdown formatting for the header’s title and subtitle, allowing for greater customization. Additionally, if you intend to use HTML output for the table, you may utilize HTML in either the title or subtitle.
2.2.1 tab_header()
The tab_header() function adds a header section to your gt table, which can include a title and subtitle. These elements appear above the column labels and help provide context for the data presented. The header is particularly useful for giving tables descriptive names, explaining their purpose, or noting important details about the data source or time period.
Use the gtcars dataset to create a gt table. Add a header part to the table with the tab_header() function. We’ll add a title and the optional subtitle as well. With the md() helper function, we can make sure the Markdown formatting is interpreted and transformed.
gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  tab_header(
    title = md("Data listing from **gtcars**"),
    subtitle = md("`gtcars` is an R dataset")
  )

Title: Data listing from gtcars
Subtitle: gtcars is an R dataset

| mfr | model | msrp |
|---|---|---|
| Ford | GT | 447000 |
| Ferrari | 458 Speciale | 291744 |
| Ferrari | 458 Spider | 263553 |
| Ferrari | 458 Italia | 233509 |
| Ferrari | 488 GTB | 245400 |
If the table is intended solely as an HTML table, you could introduce your own HTML elements into the header. You can even use the htmltools package to help arrange and generate the HTML. Here’s an example of that, where two <div> elements are placed in a htmltools::tagList().
gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  tab_header(
    title = htmltools::tagList(
      htmltools::tags$div(
        style = htmltools::css(`text-align` = "center"),
        htmltools::HTML(web_image("https://www.r-project.org/logo/Rlogo.png"))
      ),
      htmltools::tags$div(
        "Data listing from ",
        htmltools::tags$strong("gtcars")
      )
    )
  )

Title: (R logo image) Data listing from gtcars

| mfr | model | msrp |
|---|---|---|
| Ford | GT | 447000 |
| Ferrari | 458 Speciale | 291744 |
| Ferrari | 458 Spider | 263553 |
| Ferrari | 458 Italia | 233509 |
| Ferrari | 488 GTB | 245400 |
If using HTML but doing something far simpler, we can use the html() helper function to declare that the text provided is HTML.
gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  tab_header(
    title = html("Data listing from <strong>gtcars</strong>"),
    subtitle = html("From <span style='color:red;'>gtcars</span>")
  )

Title: Data listing from gtcars
Subtitle: From gtcars

| mfr | model | msrp |
|---|---|---|
| Ford | GT | 447000 |
| Ferrari | 458 Speciale | 291744 |
| Ferrari | 458 Spider | 263553 |
| Ferrari | 458 Italia | 233509 |
| Ferrari | 488 GTB | 245400 |
The html() helper function tells gt to interpret the provided text as raw HTML, allowing for direct use of HTML tags like <strong> for bold text and <span> with inline styles for colored text. This approach gives you precise control over formatting when creating HTML output.
2.2.2 opt_align_table_header()
By default, table headers are center-aligned, which works well for most tables. However, sometimes you may want to align the header text to the left or right to match your table’s overall design or to create a specific visual effect.
The opt_align_table_header() function adjusts the horizontal alignment of both the title and the subtitle through its align argument, providing a quick and easy means to set the alignment to the left or right. It also serves as a convenient shortcut for <gt_tbl> |> tab_options(heading.align = <align>).
exibble |>
  gt(rowname_col = "row", groupname_col = "group") |>
  tab_header(
    title = "The title of the table",
    subtitle = "The table's subtitle"
  ) |>
  opt_align_table_header(align = "left")

Title: The title of the table
Subtitle: The table's subtitle

| | num | char | fctr | date | time | datetime | currency |
|---|---|---|---|---|---|---|---|
| grp_a | | | | | | | |
| row_1 | 1.111e-01 | apricot | one | 2015-01-15 | 13:35 | 2018-01-01 02:22 | 49.950 |
| row_2 | 2.222e+00 | banana | two | 2015-02-15 | 14:40 | 2018-02-02 14:33 | 17.950 |
| row_3 | 3.333e+01 | coconut | three | 2015-03-15 | 15:45 | 2018-03-03 03:44 | 1.390 |
| row_4 | 4.444e+02 | durian | four | 2015-04-15 | 16:50 | 2018-04-04 15:55 | 65100.000 |
| grp_b | | | | | | | |
| row_5 | 5.550e+03 | NA | five | 2015-05-15 | 17:55 | 2018-05-05 04:00 | 1325.810 |
| row_6 | NA | fig | six | 2015-06-15 | NA | 2018-06-06 16:11 | 13.255 |
| row_7 | 7.770e+05 | grapefruit | seven | NA | 19:10 | 2018-07-07 05:22 | NA |
| row_8 | 8.880e+06 | honeydew | eight | 2015-08-15 | 20:20 | NA | 0.440 |
The title and subtitle now align to the left edge of the table, creating a more document-like appearance. Left-aligned headers work well when the table is part of a larger report or when you want to establish a clear reading direction from left to right.
2.3 Adding source notes to the footer of the table
The footer section of a table sits beneath the table body and provides space for supplementary information. Source notes are one type of footer content. They typically cite data origins, provide general disclaimers, or offer context that applies to the entire table. Unlike footnotes (which link to specific cells via marks), source notes stand alone as general commentary.
2.3.1 tab_source_note()
It’s possible to add a source note to the footer part of the gt table with tab_source_note(). Several of these can be added to the footer and, to do that, we can simply use multiple calls of tab_source_note() (they will be inserted in the order provided). We can use Markdown formatting for the note, or, if the table is intended for HTML output, we can include HTML formatting.
Here is the function’s signature:
tab_source_note(data, source_note)
With three columns from the gtcars dataset, let’s create a gt table. We can use the tab_source_note() function to add a source note to the table footer. Here we are citing the data source but this function can be used for any text you’d prefer to display in the footer section.
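A minimal sketch of that table might look like this; the wording of the note is an assumption, and any plain text could be used in its place:

```r
library(gt)

tbl_source <- gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  # Plain-text source note placed in the table footer
  tab_source_note(source_note = "From edmunds.com")

tbl_source
```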
The md() and html() helper functions work with source notes, allowing you to add styled text, links, or other formatting. Here’s an example using Markdown to create a clickable link:
gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  tab_source_note(
    source_note = md("Data from [edmunds.com](https://www.edmunds.com), 2015.")
  ) |>
  tab_source_note(
    source_note = md("*Prices shown are MSRP in USD.*")
  )
This example demonstrates two source notes added in sequence, both using Markdown formatting. The first contains a hyperlink, and the second uses italics. For HTML-specific formatting, you could use html() instead:
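An HTML variant could be sketched as follows. The markup shown is only one possible choice, and it will render as intended only in HTML output:

```r
library(gt)

tbl_html_note <- gtcars |>
  dplyr::select(mfr, model, msrp) |>
  dplyr::slice(1:5) |>
  gt() |>
  tab_source_note(
    # html() marks the text as raw HTML rather than plain text
    source_note = html(
      "Data from <a href='https://www.edmunds.com'>edmunds.com</a>, 2015."
    )
  )

tbl_html_note
```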
Source notes provide a clean way to document your data’s provenance and add necessary context without cluttering the table itself. They appear in a dedicated footer area, visually separated from the data, making it clear that this information applies to the table as a whole rather than to specific cells or values.
2.4 Grouping together column labels with spanners
Column spanners are horizontal labels that stretch across multiple columns, grouping them under a common heading. They create visual hierarchy in the boxhead (the part of the table containing column labels) and help readers understand relationships between columns. For example, columns showing different years of population data might be grouped under a "Population" spanner, while density columns might share a "Density" spanner.
The part of the table that contains, at a minimum, column labels and, optionally, spanner labels is sometimes called the table boxhead. A spanner will occupy space over any number of contiguous column labels. With the tab_spanner() function, you can insert a spanner in the boxhead part of a gt table. This function allows for mapping to be defined by column names, existing spanner ID values, or a mixture of both.
2.4.1 tab_spanner()
With the tab_spanner() function, you can insert a spanner above column labels or existing spanners in the boxhead part of a gt table.
The spanners are placed in the order of calling tab_spanner(), so if a later call uses the same columns in its definition (or even a subset) as the first invocation, the second spanner will be overlaid atop the first. Options exist for forcibly inserting a spanner underneath others (with level, as space permits) and for full or partial spanner replacement (with replace).
Let’s create a gt table using a small portion of the gtcars dataset. Over several columns (hp, hp_rpm, trq, trq_rpm, mpg_c, mpg_h) we’ll use tab_spanner() to add a spanner with the label "performance". This effectively groups together several columns related to car performance under a unifying label.
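The table just described could be built along these lines. Taking the first eight rows and placing model in the stub are assumptions made to keep the sketch small:

```r
library(gt)

tbl_perf <- gtcars |>
  # starts_with() picks hp, hp_rpm, trq, trq_rpm, mpg_c, and mpg_h
  dplyr::select(model, starts_with(c("hp", "trq", "mpg"))) |>
  dplyr::slice(1:8) |>
  gt(rowname_col = "model") |>
  tab_spanner(
    label = "performance",
    columns = starts_with(c("hp", "trq", "mpg"))
  )

tbl_perf
```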
Notice that in the above table code, we used the starts_with() selection helper in both the dplyr select() statement and the gt tab_spanner() statement. Such use of tidyselect selection helpers is incredibly helpful for shortening the amount of code supplied in the columns argument across many gt functions.
With the default gather = TRUE option, columns selected for a particular spanner will be moved so that there is no separation between them. This can be seen with the example below that uses a subset of the towny dataset. The starting column order is name, latitude, longitude, population_2016, density_2016, population_2021, and density_2021. The first two uses of tab_spanner() deal with making separate spanners for the two population and two density columns. After their use, the columns are moved to this new ordering: name, latitude, longitude, population_2016, population_2021, density_2016, and density_2021. The third and final call of tab_spanner() doesn’t further affect the ordering of columns.
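A sketch of those three tab_spanner() calls is given here. Limiting the table to the five most populous municipalities, the spanner labels, and the "loc" ID are assumptions for illustration:

```r
library(gt)

tbl_towny <- towny |>
  dplyr::slice_max(population_2021, n = 5) |>
  dplyr::select(
    name, latitude, longitude,
    ends_with("2016"), ends_with("2021")
  ) |>
  gt() |>
  # Gathers population_2016 and population_2021 side by side
  tab_spanner(label = "Population", columns = starts_with("pop")) |>
  # Gathers density_2016 and density_2021 side by side
  tab_spanner(label = "Density", columns = starts_with("den")) |>
  # latitude and longitude are already adjacent, so nothing moves
  tab_spanner(
    label = md("*Location*"),
    columns = ends_with("itude"),
    id = "loc"
  )

tbl_towny
```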
While columns are moved, it is only the minimal amount of moving required (pulling in columns from the right) to ensure that columns are gathered under the appropriate spanners. With the last call, there are two more things to note: (1) label values can use the md() (or html()) helper functions to help create styled text, and (2) an id value may be supplied for reference later (e.g., for styling with tab_style() or applying footnotes with tab_footnote()).
It’s possible to stack multiple spanners atop each other with consecutive calls of tab_spanner(). It’s a bit like playing Tetris: putting a spanner down anywhere there is another spanner (i.e., there are one or more shared columns) means that the second spanner will reside a level above the prior one. Let’s look at a few examples of how this works, and we’ll also explore a few lesser-known placement tricks. Let’s use a cut-down version of exibble for this, set up a few level-one spanners, and then place a level-two spanner over two other spanners.
exibble_narrow <- exibble |> dplyr::slice_head(n = 3)

exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Row Information",
    columns = c(row, group)
  ) |>
  tab_spanner(
    label = "Numeric Values",
    columns = where(is.numeric),
    id = "num_spanner"
  ) |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  ) |>
  tab_spanner(
    label = "Numbers and Text",
    spanners = c("num_spanner", "text_spanner")
  )

Spanners (level 2): Numbers and Text, over Numeric Values and Text Values
Spanners (level 1): Numeric Values (num, currency); Text Values (char, fctr); Row Information (row, group)

| num | currency | char | fctr | date | time | datetime | row | group |
|---|---|---|---|---|---|---|---|---|
| 0.1111 | 49.95 | apricot | one | 2015-01-15 | 13:35 | 2018-01-01 02:22 | row_1 | grp_a |
| 2.2220 | 17.95 | banana | two | 2015-02-15 | 14:40 | 2018-02-02 14:33 | row_2 | grp_a |
| 33.3300 | 1.39 | coconut | three | 2015-03-15 | 15:45 | 2018-03-03 03:44 | row_3 | grp_a |
In the above example, we used the spanners argument to define where the "Numbers and Text"-labeled spanner should reside. For that, we supplied the "num_spanner" and "text_spanner" ID values for the two spanners associated with the num, currency, char, and fctr columns. Alternatively, we could have given those column names to the columns argument and achieved the same result. You could actually use a combination of spanners and columns to define where the spanner should be placed. Here is an example of just that:
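A hybrid definition might be sketched as follows. The object name exibble_narrow_gt and the top spanner's label are taken from the later examples in this section; the numeric columns are named explicitly here to keep the sketch simple:

```r
library(gt)

exibble_narrow <- exibble |> dplyr::slice_head(n = 3)

exibble_narrow_gt <- exibble_narrow |>
  gt() |>
  tab_spanner(
    label = "Numeric Values",
    columns = c(num, currency),
    id = "num_spanner"
  ) |>
  tab_spanner(
    label = "Text Values",
    columns = c(char, fctr),
    id = "text_spanner"
  ) |>
  # Hybrid placement: some columns by name, the rest via a spanner ID
  tab_spanner(
    label = "Text, Dates, Times, Datetimes",
    columns = contains(c("date", "time")),
    spanners = "text_spanner"
  )

exibble_narrow_gt
```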
And, again, we could have solely supplied all of the column names to columns instead of using this hybrid approach, but it is interesting to express the definition of spanners with this flexible combination. What if you wanted to extend the above example and place a spanner above the date, time, and datetime columns? If you tried that in the manner as exemplified above, the spanner will be placed in the third level of spanners:
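exibble_narrow_gt |>
  tab_spanner(
    label = "Date and Time Columns",
    columns = contains(c("date", "time")),
    id = "date_time_spanner"
  )

Spanners (level 3): Date and Time Columns (date, time, datetime)
Spanners (level 2): Text, Dates, Times, Datetimes
Spanners (level 1): Numeric Values (num, currency); Text Values (char, fctr)

| num | currency | char | fctr | date | time | datetime | row | group |
|---|---|---|---|---|---|---|---|---|
| 0.1111 | 49.95 | apricot | one | 2015-01-15 | 13:35 | 2018-01-01 02:22 | row_1 | grp_a |
| 2.2220 | 17.95 | banana | two | 2015-02-15 | 14:40 | 2018-02-02 14:33 | row_2 | grp_a |
| 33.3300 | 1.39 | coconut | three | 2015-03-15 | 15:45 | 2018-03-03 03:44 | row_3 | grp_a |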
Remember that the approach taken by tab_spanner() is to keep stacking atop existing spanners. But there is space next to the "Text Values" spanner on the first level. You can either revise the order of tab_spanner() calls or use the level argument to force the spanner into that level (so long as there is space).
exibble_narrow_gt |>
  tab_spanner(
    label = "Date and Time Columns",
    columns = contains(c("date", "time")),
    level = 1,
    id = "date_time_spanner"
  )

Spanners (level 2): Text, Dates, Times, Datetimes
Spanners (level 1): Numeric Values (num, currency); Text Values (char, fctr); Date and Time Columns (date, datetime, time)

| num | currency | char | fctr | date | datetime | time | row | group |
|---|---|---|---|---|---|---|---|---|
| 0.1111 | 49.95 | apricot | one | 2015-01-15 | 2018-01-01 02:22 | 13:35 | row_1 | grp_a |
| 2.2220 | 17.95 | banana | two | 2015-02-15 | 2018-02-02 14:33 | 14:40 | row_2 | grp_a |
| 33.3300 | 1.39 | coconut | three | 2015-03-15 | 2018-03-03 03:44 | 15:45 | row_3 | grp_a |
That puts the spanner in the intended level. If there aren’t free locations available in the level specified you’ll get an error stating which columns cannot be used for the new spanner (this can be circumvented, if necessary, with the replace = TRUE option). If you choose a level higher than the maximum occupied, then the spanner will be dropped down. Again, these behaviors are indicative of Tetris-like rules though they tend to work well for the application of spanners.
2.4.2 tab_spanner_delim()
The tab_spanner_delim() function can take specially-crafted column names and generate one or more spanner column labels (along with relabeling the column labels).
This is done by splitting the column name by a specified delimiter character (this is the delim) and placing the fragments from top to bottom (i.e., higher-level spanners to the column labels). Furthermore, the neighboring text fragments on different spanner levels will be coalesced together to put the span back into spanner. For instance, having the three side-by-side column names rating_1, rating_2, and rating_3 will (in the default case at least) result in a spanner with the label "rating" above columns with the labels "1", "2", and "3".
If we take a hypothetical table that includes the column names province.NL_ZH.pop, province.NL_ZH.gdp, province.NL_NH.pop, and province.NL_NH.gdp, we can see that we have a naming system that has a well-defined structure. We start with the more general to the left ("province") and move to the more specific on the right ("pop"). If the columns are in the table in this exact order, then things are in an ideal state as the eventual spanner column labels will form from this neighboring. When using tab_spanner_delim() here with delim set as "." we get the following text fragments:
province.NL_ZH.pop -> "province", "NL_ZH", "pop"
province.NL_ZH.gdp -> "province", "NL_ZH", "gdp"
province.NL_NH.pop -> "province", "NL_NH", "pop"
province.NL_NH.gdp -> "province", "NL_NH", "gdp"
This gives us the following arrangement of column labels and spanner labels:
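The hypothetical table above can be sketched in code. The data frame name and its single row of numbers are placeholders invented for illustration, not real statistics:

```r
library(gt)

# Hypothetical one-row table; the values are placeholders only
provinces <- data.frame(
  province.NL_ZH.pop = 1,
  province.NL_ZH.gdp = 2,
  province.NL_NH.pop = 3,
  province.NL_NH.gdp = 4
)

# Splitting on "." should give a "province" spanner on top,
# "NL_ZH" and "NL_NH" spanners beneath it, and "pop"/"gdp" column labels
tbl_provinces <- provinces |>
  gt() |>
  tab_spanner_delim(delim = ".")

tbl_provinces
```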
There might be situations where the same delimiter is used throughout but only the last instance requires a splitting. With a pair of column names like north_holland_pop and north_holland_area you would only want "pop" and "area" to be column labels underneath a single spanner ("north_holland"). To achieve this, the split and limit arguments are used and the values for each need to be split = "last" and limit = 1. This will give us the following arrangement:
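A sketch of the split and limit arguments with those two column names follows; the data frame and its values are placeholders invented for illustration:

```r
library(gt)

# Hypothetical one-row table; the values are placeholders only
north_holland <- data.frame(
  north_holland_pop = 1,
  north_holland_area = 2
)

# Split only once, starting from the last underscore, so that
# "north_holland" stays intact as the spanner label
tbl_nh <- north_holland |>
  gt() |>
  tab_spanner_delim(delim = "_", split = "last", limit = 1)

tbl_nh
```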
With a subset of the towny dataset, we can create a gt table and then use the tab_spanner_delim() function to automatically generate column spanner labels. In this case we have some column names in the form population_<year>. The underscore character is the delimiter that separates a common word "population" and a year value. In this default way of splitting, fragments to the right are lowest (really they become new column labels) and moving left we get spanners. Let’s have a look at how tab_spanner_delim() handles these column names:
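One possible version of this table is sketched below. The filter on names beginning with "F" and the use of fmt_integer() are assumptions made to keep the table small and readable; the object name towny_subset_gt matches the one used in the following text:

```r
library(gt)

towny_subset_gt <- towny |>
  dplyr::select(name, starts_with("population")) |>
  dplyr::filter(grepl("^F", name)) |>
  gt() |>
  # "population_1996" becomes spanner "population" over column label "1996"
  tab_spanner_delim(delim = "_") |>
  fmt_integer()

towny_subset_gt
```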
The spanner created through this use of tab_spanner_delim() is automatically given an ID value by gt. Because it’s hard to know what the ID value is, we can use tab_info() to inspect the table’s indices and ID values.
towny_subset_gt |> tab_info()

Information on ID and Label Values

| ID | Idx / Lvl | Label |
|---|---|---|
| Columns | | |
| name | 1 | name |
| population_1996 | 2 | 1996 |
| population_2001 | 3 | 2001 |
| population_2006 | 4 | 2006 |
| population_2011 | 5 | 2011 |
| population_2016 | 6 | 2016 |
| population_2021 | 7 | 2021 |
| Rows | | |
| << Index values 1 to 7 >> | | |
| Spanners | | |
| spanner-population_1996 | 1 | population |
From this informational table, we see that the ID for the spanner is "spanner-population_1996". Also, the columns are still accessible by the original column names (tab_spanner_delim() did change their labels though). Let’s use tab_style() to add some styles to the towny_subset_gt table.
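Styling by column name and by spanner ID could look like the sketch below. The table is rebuilt here so the block is self-contained (with the same assumed filter as before), and the particular styles are arbitrary choices:

```r
library(gt)

towny_subset_gt <- towny |>
  dplyr::select(name, starts_with("population")) |>
  dplyr::filter(grepl("^F", name)) |>
  gt() |>
  tab_spanner_delim(delim = "_") |>
  fmt_integer()

towny_styled <- towny_subset_gt |>
  # Body cells are still targeted with the original column name
  tab_style(
    style = cell_fill(color = "aquamarine"),
    locations = cells_body(columns = population_2021)
  ) |>
  # The spanner is targeted with the ID reported by tab_info()
  tab_style(
    style = cell_text(transform = "capitalize"),
    locations = cells_column_spanners(spanners = "spanner-population_1996")
  )

towny_styled
```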
We can plan ahead a bit and refashion the column names with dplyr before introducing the table to gt() and tab_spanner_delim(). Here the column labels have underscore delimiters where splitting is not wanted (so a period or space character is used instead). The usage of tab_spanner_delim() gives two levels of spanners. We can further touch up the labels after that with cols_label_with() and text_transform().
With a summarized, filtered, and pivoted version of the pizzaplace dataset, we can create another gt table and then use the tab_spanner_delim() function with the same delimiter/separator that was used in the tidyr pivot_wider() call. We can also process the generated column labels with cols_label_with().
This example demonstrates a sophisticated workflow combining pivoting, delimiter-based spanners, and dynamic label generation. The pivot_wider() creates columns like revenue.2015-01-01 and sold.2015-01-01, which tab_spanner_delim() splits into spanners (revenue, sold) and column labels (the dates). The cols_label_with() function then appends the day of the week to each date label, producing labels like "2015-01-01 (Thu)". The data_color() call adds a subtle green gradient to revenue cells, making it easy to spot higher-performing days at a glance.
2.5 The stub and row groups
The stub is a special column (or set of columns) on the left side of the table that holds row labels. When present, the stub serves as an identifier for each row, similar to how column labels identify columns. Row groups take this organization further by dividing rows into named sections, each with its own header row. Together, the stub and row groups create vertical structure in a table, making it easier to navigate and understand large datasets.
The stub is created when you designate a column for row labels using rowname_col in gt(). Once a stub exists, you can add a stubhead label (a header for the stub column itself) and organize rows into groups. Row groups appear as labeled sections that visually separate different categories of data.
2.5.1 tab_row_group()
Create a row group with a collection of rows. This requires specification of the rows to be included, either by supplying row labels, row indices, or through use of a select helper function like starts_with().
Here is the function’s signature:
tab_row_group(data, label, rows, id = label)
To set a default row group label for any rows not formally placed in a row group, we can use a separate call to tab_options(row_group.default_label = <label>). If this is not done and there are rows that haven’t been placed into a row group (where one or more row groups already exist), those rows will be automatically placed into a row group without a label. To restore labels for row groups not explicitly assigned a group, tab_options(row_group.default_label = "") can be used.
Using a subset of the gtcars dataset, let’s create a simple gt table with row labels (from the model column) inside of a stub. This eight-row table begins with no row groups at all but with a single use of the tab_row_group() function, we can specify a row group that will contain any rows where the car model begins with a number.
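This could be sketched as follows; the selected columns and the "numbered" label are assumptions for illustration:

```r
library(gt)

tbl_numbered <- gtcars |>
  dplyr::select(model, year, hp, trq) |>
  dplyr::slice(1:8) |>
  gt(rowname_col = "model") |>
  # Group every row whose label (the car model) begins with a digit
  tab_row_group(
    label = "numbered",
    rows = matches("^[0-9]")
  )

tbl_numbered
```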
This actually makes two row groups since there are row labels that don’t begin with a number. That second row group is a catch-all NA group, and it doesn’t display a label at all. Rather, it is set off from the other group with a double line. This may be a preferable way to display the arrangement of one distinct group and an ‘others’ or default group. If that’s the case but you’d like the order reversed, the row_group_order() function can be used for that.
Two more options include: (1) setting a default label for the ‘others’ group (done through tab_options()), and (2) creating row groups until there are no more unaccounted-for rows. Let’s try the first option in the next example:
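A sketch of the first option, reusing the same assumed subset of gtcars and giving the leftover rows the label "others":

```r
library(gt)

tbl_others <- gtcars |>
  dplyr::select(model, year, hp, trq) |>
  dplyr::slice(1:8) |>
  gt(rowname_col = "model") |>
  tab_row_group(
    label = "numbered",
    rows = matches("^[0-9]")
  ) |>
  # Rows not assigned to any group receive this default label
  tab_options(row_group.default_label = "others")

tbl_others
```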
The above use of the row_group.default_label in tab_options() gets the job done and provides a default label. One drawback is that the default/NA group doesn’t have an ID, so it can’t as easily be styled with tab_style(); however, row groups have indices and the index for the "others" group here is 1.
Another way to handle rows with NA values in the grouping column is through the omit_na_group argument in gt(). By default (FALSE), rows with NA in the groupname_col are assigned to a group labeled "NA". Setting omit_na_group = TRUE causes those rows to appear as ungrouped rows in the table body instead. This is useful when you want certain rows to stand apart from any row group, perhaps as header rows or separators.
Let’s see how this works. First, we’ll create a dataset where some rows have NA for the group column:
data_with_na_group <- dplyr::tibble(
  item = c("Category A Items", "Widget", "Gadget", "Category B Items", "Sprocket", "Cog"),
  group = c(NA, "A", "A", NA, "B", "B"),
  value = c(NA, 100, 150, NA, 200, 175)
)

data_with_na_group |>
  gt(rowname_col = "item", groupname_col = "group")
                      value
NA
  Category A Items    NA
  Category B Items    NA
A
  Widget              100
  Gadget              150
B
  Sprocket            200
  Cog                 175
With the default behavior, the rows with NA in the group column are placed in an "NA" group. Now let’s use omit_na_group = TRUE to have those rows appear outside of any group:
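A sketch of that call, assuming a gt version whose gt() function provides the omit_na_group argument described above. The dataset is redefined here so that the snippet runs on its own.

```r
library(gt)

data_with_na_group <- dplyr::tibble(
  item = c("Category A Items", "Widget", "Gadget", "Category B Items", "Sprocket", "Cog"),
  group = c(NA, "A", "A", NA, "B", "B"),
  value = c(NA, 100, 150, NA, 200, 175)
)

tbl_ungrouped <- data_with_na_group |>
  gt(
    rowname_col = "item",
    groupname_col = "group",
    # Leave rows with NA in `group` outside of any row group
    omit_na_group = TRUE
  )

tbl_ungrouped
```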
The rows that had NA for their group now appear as ungrouped rows, visually distinct from the grouped content. This pattern is particularly useful when you want to include descriptive header rows or section dividers that logically shouldn’t belong to any data group.
Now let’s try using tab_row_group() with our gtcars-based table such that all rows are formally assigned to different row groups. We’ll define two row groups with the (Markdown-infused) labels "**Powerful Cars**" and "**Super Powerful Cars**". The distinction between the groups is whether hp is less than or greater than 600 (and this is governed by the expressions provided to the rows argument).
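Here is a sketch of how that could be written. The ID values ("powerful" and "v_powerful"), the fill colors, and the decision to place cars with exactly 600 hp in the second group are illustrative assumptions.

```r
library(gt)

tbl_power <- gtcars |>
  dplyr::select(model, year, hp, trq) |>
  dplyr::slice(1:8) |>
  gt(rowname_col = "model") |>
  tab_row_group(
    label = md("**Powerful Cars**"),
    rows = hp < 600,
    id = "powerful"
  ) |>
  tab_row_group(
    label = md("**Super Powerful Cars**"),
    rows = hp >= 600,
    id = "v_powerful"
  ) |>
  # Style each row group label by referring to its markup-free ID
  tab_style(
    style = cell_fill(color = "gray85"),
    locations = cells_row_groups(groups = "powerful")
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "gray95"),
      cell_text(size = "larger")
    ),
    locations = cells_row_groups(groups = "v_powerful")
  )

tbl_power
```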
Setting the id values for each of the row groups makes things easier since you will have clean, markup-free ID values to reference in later calls (as was done with the tab_style() invocations in the example above). The use of the md() helper function makes it so that any Markdown provided for the label of a row group is faithfully rendered.
2.5.2 row_group_order()
By default, row groups appear in the order they were created with tab_row_group(). The row_group_order() function lets you rearrange them into any sequence you prefer.
Here is the function’s signature:
row_group_order(data, groups)
The groups argument takes a vector of row group ID values in the desired order. If a group was created without an explicit id, its label serves as the ID. The special value NA represents the default/unnamed group (rows not explicitly assigned to any group).
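The following sketch shows one possible reordering. The selection of manufacturers, the two rows kept per manufacturer, and the group labels "Italian" and "German" are assumptions chosen for illustration.

```r
library(gt)

tbl_ordered <- gtcars |>
  dplyr::filter(mfr %in% c("Ferrari", "Lamborghini", "BMW", "Audi", "Ford")) |>
  # Keep two cars per manufacturer to hold the table small
  dplyr::group_by(mfr) |>
  dplyr::slice_head(n = 2) |>
  dplyr::ungroup() |>
  dplyr::select(mfr, model, year, hp) |>
  gt(rowname_col = "model") |>
  tab_row_group(label = "German", rows = mfr %in% c("BMW", "Audi")) |>
  tab_row_group(label = "Italian", rows = mfr %in% c("Ferrari", "Lamborghini")) |>
  # No `id` was given, so the labels act as IDs; NA is the unnamed group
  row_group_order(groups = c("Italian", "German", NA)) |>
  cols_hide(columns = mfr)

tbl_ordered
```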
Here, Italian manufacturers appear first, followed by German, with any remaining rows in the unnamed group at the bottom. The cols_hide() call removes the mfr column since that information is now conveyed by the row group labels.
2.5.3 tab_stubhead()
Add a label to the stubhead of a gt table. The stubhead is the lone element that is positioned left of the column labels, and above the stub. If a stub does not exist, then there is no stubhead (so no change will be made when using this function in that case). We have the flexibility to use Markdown formatting for the stubhead label. Furthermore, if the table is intended for HTML output, we can use HTML for the stubhead label.
Here is the signature for tab_stubhead():
tab_stubhead(data, label)
Using a small subset of the gtcars dataset, we can create a gt table with row labels. Since we have row labels in the stub (via use of rowname_col = "model" in the gt() function call), we have a stubhead. Let’s add a stubhead label ("car") with the tab_stubhead() function to describe what’s in the stub.
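A minimal sketch, assuming a five-row subset with the model, year, hp, and trq columns:

```r
library(gt)

tbl_stubhead <- gtcars |>
  dplyr::select(model, year, hp, trq) |>
  dplyr::slice(1:5) |>
  gt(rowname_col = "model") |>
  # The stub exists (row labels come from `model`), so the label is shown
  tab_stubhead(label = "car")

tbl_stubhead
```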
The stubhead label "car" now appears above the stub column, clarifying that the row labels represent car models. Without a stubhead, readers might need to infer this from context. For tables with many rows or complex stub content, a clear stubhead label improves navigability.
2.5.4 tab_stub_indent()
Indentation of row labels is an effective way to establish structure in a table stub. The tab_stub_indent() function allows for fine control over row label indentation in the stub. We can either give an explicit indentation level or use a keyword directive that adjusts the current level.
Here is the function’s signature:
tab_stub_indent(data, rows, indent = "increase")
Let’s use a summarized version of the pizzaplace dataset to create a gt table with row groups and row labels. With the summary_rows() function, we’ll generate summary rows at the top of each row group. With tab_stub_indent() we can add indentation to the row labels in the stub.
pizzaplace |>
  dplyr::group_by(type, size) |>
  dplyr::summarize(
    sold = dplyr::n(),
    income = sum(price),
    .groups = "drop"
  ) |>
  gt(rowname_col = "size", groupname_col = "type") |>
  tab_header(title = "Pizzas Sold in 2015") |>
  fmt_integer(columns = sold) |>
  fmt_currency(columns = income) |>
  summary_rows(
    fns = list(label = "All Sizes", fn = "sum"),
    side = "top",
    fmt = list(
      ~ fmt_integer(., columns = sold),
      ~ fmt_currency(., columns = income)
    )
  ) |>
  tab_options(
    summary_row.background.color = "gray95",
    row_group.background.color = "#FFEFDB",
    row_group.as_column = TRUE
  ) |>
  tab_stub_indent(
    rows = everything(),
    indent = 2
  )
Pizzas Sold in 2015

                      sold     income
chicken   All Sizes   11,050   $195,919.50
          L           4,932    $102,339.00
          M           3,894    $65,224.50
          S           2,224    $28,356.00
classic   All Sizes   14,888   $220,053.10
          L           4,057    $74,518.50
          M           4,112    $60,581.75
          S           6,139    $69,870.25
          XL          552      $14,076.00
          XXL         28       $1,006.60
supreme   All Sizes   11,987   $208,197.00
          L           4,564    $94,258.50
          M           4,046    $66,475.00
          S           3,377    $47,463.50
veggie    All Sizes   11,649   $193,690.45
          L           5,403    $104,202.70
          M           3,583    $57,101.00
          S           2,663    $32,386.75
The indent argument accepts either a numeric value (0 through 5) or the keywords "increase" or "decrease". When using numeric values, 0 means no indentation and 5 is the maximum. The keywords adjust indentation relative to the current level, which is useful when building tables programmatically.
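To illustrate how an explicit level and a keyword can be combined, here is a small sketch. The outline data is hypothetical and rows are targeted by index; the first call sets a level directly and the second adjusts two of those rows relative to where they already are.

```r
library(gt)

# Hypothetical outline data, used only for this illustration
outline <- dplyr::tibble(
  item = c("Assets", "Current assets", "Cash", "Inventory", "Fixed assets"),
  amount = c(900, 400, 150, 250, 500)
)

tbl_indent <- outline |>
  gt(rowname_col = "item") |>
  # Explicit level: indent every row below the top-level entry by one step
  tab_stub_indent(rows = 2:5, indent = 1) |>
  # Keyword directive: push the two detail rows one step further in
  tab_stub_indent(rows = 3:4, indent = "increase")

tbl_indent
```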
Progressive indentation creates a visual hierarchy within each group, useful for showing parent-child relationships or levels of detail.
2.6 Column labels
Column labels appear at the top of each column and identify the data within. While gt uses column names from your data as default labels, you’ll often want to provide cleaner, more descriptive labels for presentation. Several functions help manage column labels: cols_label() for setting labels directly, cols_label_with() for applying transformations, and the cols_move() family of functions for reordering.
2.6.1 cols_label()
The cols_label() function assigns display labels to columns. These labels appear in the table while the underlying column names remain unchanged (useful for referencing columns in subsequent gt function calls).
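As a sketch of how labels with mixed markup might be assigned, the example below uses a few columns from the towny dataset. The exact label text and the five-row subset are assumptions for illustration.

```r
library(gt)

tbl_labels <- towny |>
  dplyr::select(name, population_2021, density_2021, land_area_km2) |>
  dplyr::slice_max(population_2021, n = 5) |>
  gt(rowname_col = "name") |>
  cols_label(
    # Markdown for simple emphasis
    population_2021 = md("**Population**"),
    density_2021 = md("Density, *people per sq. km*"),
    # Raw HTML for a true superscript
    land_area_km2 = html("Area, km<sup>2</sup>")
  )

tbl_labels
```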
The md() helper renders Markdown syntax, making "Population" bold and adding italics to the unit in "Density". The html() helper allows raw HTML, which we use here to create a proper superscript for the squared unit in "Area". Mixing these approaches gives you flexibility: Markdown for simple formatting and HTML when you need precise control over the output.
2.6.1.1 Incorporating units with gt’s units notation
Measurement units frequently appear in column labels, and it’s often clearer to include them in the label itself rather than using other methods to convey unit information. While the cols_units() function provides one approach, gt also supports a built-in units notation system that allows you to define units directly within column labels.
To use this notation, surround the portion of text representing the units with {{ and }}. This tells gt to interpret that text as a units definition and render it appropriately.
The units notation uses a succinct, ASCII-friendly syntax for writing measurement units. While it may feel somewhat familiar, it’s specifically designed for this purpose. Each component (unit names, parentheses, symbols) is treated as a separate entity, and you can flexibly add subscripts and exponents. Here are the key rules and examples:
Basic units and division:
"m/s" and "m / s" both render as “m/s” with proper formatting
whitespace on both sides of the slash, as in "m / s", does not change the result
Exponents:
exponents are specified with the ^ character
"m s^-1" displays with the “-1” as a proper exponent
"m /s" gives the same result as "m s^-1", since a slash attached directly to a unit ("/<unit>") is equivalent to "<unit>^-1"
"t_i^2.5" shows a t with an “i” subscript and a “2.5” exponent
Subscripts:
"E_h" renders as E with an “h” subscript
use the _ character for subscripts
"m[_0^2]" uses brackets with overstriking to set both subscript and superscript vertically aligned
Chemical formulas:
"g/L %C6H12O6%" encloses a chemical formula in % characters
numbers in formulas are automatically subscripted (e.g., C₆H₁₂O₆)
useful for biochemistry and chemistry tables
Automatic symbol conversions:
the letter “u” in "ug", "um", "uL", and "umol" converts to the Greek mu symbol (µ)
"degC" and "degF" render with a proper degree symbol (°C, °F)
these shortcuts make typing common units easier
Greek letters:
enclose Greek letter names in colons (e.g., :beta:, :sigma:)
lowercase: :alpha:, :beta:, :gamma:, :delta:, etc.
uppercase: :Alpha:, :Beta:, :Gamma:, :Delta:, etc.
works for the full Greek alphabet
Special symbols:
shorthand names enclosed in colons convert to proper symbols
examples: :angstrom:, :ohm:, :micro:, :degree:
provides access to scientific and mathematical symbols
Text formatting:
surround text with * for italics: "*m*/s" renders m/s with the m in italics
surround text with ** for bold: "**kg**" renders kg in bold
can be applied to unit names, subscripts, or exponents partially or fully
useful for emphasizing specific components
We can use units notation to cleanly express measurement units in column labels. By enclosing units in double braces ({{ and }}), gt automatically formats them with proper typography:
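A brief sketch with two towny columns, assuming a gt version that supports units notation in cols_label(). The label wording and the row subset are illustrative.

```r
library(gt)

tbl_units <- towny |>
  dplyr::select(name, land_area_km2, density_2021) |>
  dplyr::slice_max(land_area_km2, n = 5) |>
  gt(rowname_col = "name") |>
  cols_label(
    # Text inside {{ }} is interpreted as a units definition
    land_area_km2 = "Area, {{km^2}}",
    density_2021 = "Density, {{people/km^2}}"
  )

tbl_units
```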
This example demonstrates the units notation in action. The {{km^2}} syntax renders the squared-kilometer unit with a proper superscript, while {{people/km^2}} renders as “people/km²”, again with the exponent set as a superscript. The resulting column labels carry correctly typeset units without any manual markup.
Here’s a more complex example showing various features of units notation:
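The sketch below combines several of the rules listed earlier (a chemical formula, automatic symbol conversion, negative exponents, a subscript, and a named symbol). The data values and column names are hypothetical.

```r
library(gt)

# Hypothetical measurements, used only to carry the column labels
measurements <- dplyr::tibble(
  conc = c(1.2, 3.4),
  temp = c(20.5, 37.0),
  flux = c(0.12, 0.35),
  energy = c(0.5, 1.1),
  resist = c(4.7, 10.0)
)

tbl_units_mixed <- measurements |>
  gt() |>
  cols_label(
    conc = "Glucose, {{g/L %C6H12O6%}}",  # chemical formula between % signs
    temp = "Temperature, {{degC}}",       # degree symbol conversion
    flux = "Flux, {{umol m^-2 s^-1}}",    # mu conversion and exponents
    energy = "Energy, {{E_h}}",           # subscript
    resist = "Resistance, {{:ohm:}}"      # named symbol in colons
  )

tbl_units_mixed
```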
This notation system makes it straightforward to include properly formatted units without needing to manually construct HTML or Unicode characters.
2.6.2 cols_label_with()
When you need to transform many column labels programmatically, cols_label_with() applies a function to generate labels:
cols_label_with(data, columns = everything(), fn)
Rather than manually specifying labels for each column with cols_label(), this function applies a transformation function to column names to automatically generate readable labels. This is especially valuable when working with datasets that have systematic naming conventions.
Let’s see this in action with a subset of the towny dataset, which contains columns with underscored names like population_2021, density_2021, and land_area_km2. We’ll use cols_label_with() to automatically convert these technical column names into proper display labels:
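A sketch of this transformation, assuming a five-row subset of towny and a formula-style function for fn:

```r
library(gt)

tbl_auto_labels <- towny |>
  dplyr::select(name, population_2021, density_2021, land_area_km2) |>
  dplyr::slice_head(n = 5) |>
  gt(rowname_col = "name") |>
  cols_label_with(
    # Replace underscores with spaces, then apply title case
    fn = ~ tools::toTitleCase(gsub("_", " ", .x))
  )

tbl_auto_labels
```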
In this example, the transformation function does two things: first, gsub("_", " ", .x) replaces all underscores with spaces, converting "population_2021" to "population 2021". Then tools::toTitleCase() applies title case formatting, resulting in clean labels like "Population 2021", "Density 2021", and "Land Area Km2". The function is applied to all columns by default (you can limit it with the columns argument if needed).
This approach is particularly useful when column names follow a consistent pattern that can be transformed into readable labels. It saves considerable time compared to manually labeling each column, especially in tables with many columns that follow naming conventions.
2.6.3 Hiding columns with cols_hide()
Sometimes you need columns for calculations or row grouping but don’t want them displayed. The cols_hide() function removes columns from the visual output while keeping them accessible for other gt operations:
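The sketch below hides mfr first and then still refers to it when defining a row group. The manufacturers chosen, the row subset, and the group label are assumptions for illustration.

```r
library(gt)

tbl_hidden <- gtcars |>
  dplyr::filter(mfr %in% c("Ferrari", "Lamborghini")) |>
  dplyr::select(mfr, model, year, hp) |>
  dplyr::slice(1:6) |>
  gt(rowname_col = "model") |>
  # Hide the column from display; it stays in the underlying data
  cols_hide(columns = mfr) |>
  # The hidden column can still drive the grouping expression
  tab_row_group(label = "Ferrari", rows = mfr == "Ferrari")

tbl_hidden
```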
The mfr column is hidden but still serves as the grouping variable. Hidden columns can be referenced in tab_style(), fmt_*() functions, and other operations.
2.7 Inspecting table structure with tab_info()
As tables grow complex with multiple spanners, row groups, and customizations, it becomes helpful to inspect their structure. The tab_info() function generates a summary table showing column names, indices, and IDs for all table elements:
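A small sketch of its use follows. The table being inspected (a four-row gtcars subset with one spanner labeled "performance") is an assumption made for illustration.

```r
library(gt)

gt_tbl <- gtcars |>
  dplyr::select(model, year, starts_with("hp"), msrp) |>
  dplyr::slice(1:4) |>
  gt(rowname_col = "model") |>
  tab_spanner(label = "performance", columns = starts_with("hp"))

# tab_info() returns an informational table describing gt_tbl
info_tbl <- tab_info(gt_tbl)

info_tbl
```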
In this chapter, we’ve covered the essential structural components that form the foundation of every gt table. These building blocks (from the basic gt() function to headers, footers, spanners, stubs, and row groups) provide the scaffolding upon which all table presentation rests.
Understanding these components is crucial because they establish the logical organization of your data before any formatting or styling is applied. The header gives context, the stub and row groups create vertical structure, column labels and spanners organize the horizontal dimension, and the footer provides additional information. Each component serves a specific purpose in making data more accessible and interpretable to your readers.
We’ve explored many functions, among them gt(), tab_header(), tab_source_note(), tab_spanner(), tab_row_group(), and cols_label(), and they’ll appear repeatedly throughout your gt workflow. They form the vocabulary you’ll use to describe and build table structure, whether you’re creating simple data displays or complex analytical presentations.
As you progress through subsequent chapters, you’ll see how these structural foundations support more advanced capabilities. Chapter 3 and Chapter 4 cover formatting functions for numeric and non-numeric data, building upon the column organization you establish here. Chapter 5 introduces substitution and text transformation, completing the three-stage rendering pipeline. Chapter 6 and Chapter 7 address column modifications and summary rows, extending the structural concepts introduced in this chapter. The styling techniques in Chapter 8 leverage the component structure to apply visual enhancements precisely where needed, while Chapter 9 shows how to add footnotes that reference structural elements. Chapter 10 demonstrates nanoplots for embedding visualizations within cells. The advanced topics in Chapter 11 and Chapter 12 cover table groups and output formats, and Chapter 13 and Chapter 14 explore how to extend gt through external packages and your own extensions. All of these capabilities depend on the solid structural foundation established by the basic components covered here.
Master these fundamentals now, and the more sophisticated table-building techniques ahead will feel like natural extensions of what you already know. The time invested in understanding table structure will pay dividends in every gt table you create going forward.