Tables often need more than just raw data. They frequently require summary statistics that help readers understand the data at a glance: totals, averages, minimums, maximums, and other aggregations. In gt, we have powerful tools for adding these summaries both vertically (as summary rows) and horizontally (as summary columns).
The summary functions in gt address a common frustration in table-making: the disconnect between calculating summary statistics and presenting them elegantly. In many workflows, you might calculate means and totals in your data manipulation code, then manually position them in your table, style them differently from the data rows, and hope nothing breaks when the data changes. The gt approach is different. Summary rows and columns are computed from the table data itself, positioned automatically, and can be styled and formatted independently of the body data. When your underlying data changes, the summaries update automatically.
This chapter covers three primary functions: summary_rows() for adding summaries within row groups, grand_summary_rows() for adding summaries across the entire table, and summary_columns() for adding computed columns that aggregate across rows. We’ll also cover extract_summary(), which lets you pull out summary data for use elsewhere. Together, these functions provide a complete toolkit for building tables that don’t just present data but help readers understand it.
7.1 The role of summary rows in tables
Before diving into the mechanics of creating summary rows, it’s worth considering when and why you might use them. Summary rows serve several distinct purposes.
First, they provide context. A column of sales figures means more when you can see the total at the bottom. A list of response times becomes interpretable when you know the average. Summary rows transform raw data into information by providing reference points.
Second, they enable comparison. When data is organized into groups, summary rows for each group allow readers to compare not just individual values but aggregate performance. How does this quarter compare to last quarter? How does the East region compare to the West? Summary rows answer these questions at a glance.
Third, they support verification. In financial and accounting contexts, totals serve as check figures. Readers (and auditors) expect to see them, and their presence signals that the table has been carefully prepared.
Finally, they reduce cognitive load. Rather than requiring readers to mentally calculate averages or identify maximums, summary rows do the work for them. This is especially valuable when tables will be scanned quickly rather than studied carefully.
The gt package supports two types of summary rows: group summaries (created with summary_rows()) that appear within or alongside each row group, and grand summaries (created with grand_summary_rows()) that summarize the entire table. Both types can include multiple summary statistics, can be positioned at the top or bottom of their respective sections, and can be formatted independently of the body data.
7.2 Adding group-level summaries with summary_rows()
The summary_rows() function adds summary rows to row groups in your table. This requires that your table actually has row groups, which you typically create by specifying a groupname_col in gt() or by using tab_row_group(). The summary rows appear within each group, providing statistics calculated from just that group’s data.
Let’s start with a practical example. We’ll use the sp500 dataset, which contains daily stock market data, and create a table with weekly row groups:
This table shows daily stock prices grouped by week, with minimum, maximum, and mean values calculated for each week. The summary rows appear at the bottom of each group by default, clearly labeled with the function names.
7.2.1 Specifying aggregation functions
The fns argument is where you tell summary_rows() what statistics to calculate. There are several ways to specify aggregation functions, ranging from simple to highly customized.
The simplest approach is to provide function names as quoted strings:
For convenience, gt automatically adds na.rm = TRUE to common aggregation functions when you specify them as strings. This applies to "min", "max", "mean", "median", "sd", and "sum". This saves you from writing out the full expression every time.
When you need more control, you can use formula syntax. The right-hand side (RHS) of the formula contains the aggregation expression, where . represents the column data:
The named list elements become both the ID and the label for each summary row. In this example, we calculate a custom range statistic (the difference between maximum and minimum) alongside the average.
For maximum flexibility, you can provide a double-sided formula where the left-hand side (LHS) specifies the ID and label:
This approach lets you use different values for the row’s ID (used for targeting in other functions like tab_style()) and its label (displayed in the table). The label can include Markdown formatting via md() or HTML via html().
7.2.2 Targeting specific columns
By default, summary_rows() calculates aggregations for all columns. This is rarely what you want, especially when your table includes non-numeric columns or columns where aggregation doesn’t make sense.
The columns argument lets you specify which columns should receive summary calculations:
This adds a sum row only to grp_a, leaving grp_b without summaries. The groups argument accepts group IDs as strings, vectors of strings, or tidyselect helpers like starts_with() or matches().
7.2.4 Positioning summary rows
By default, summary rows appear at the bottom of each group. The side argument lets you place them at the top instead:
Placing totals at the top is common in financial reporting, where readers want to see the bottom line first. The summary row appears before the detail rows, inviting readers to drill down into the components.
7.2.5 Formatting summary values
The fmt argument accepts formatting expressions that control how summary values appear. You can use any gt formatting function:
When you provide a list of formatting expressions, they’re applied in order to the summary rows. In this example, the sum row uses fmt_number() and the mean row uses fmt_currency().
You can also target specific columns within the formatting expression:
This technique is useful when groups have different precision requirements. Perhaps one group contains whole-number counts while another contains measurements that warrant decimal places. The two-sided formula syntax keeps these formatting rules close to the summary definition, making the code easier to maintain.
7.2.6 Handling missing values
When aggregation produces NA values (for example, when all values in a column are NA), the missing_text argument controls what appears in the cell:
dplyr::tibble( group =c("A", "A", "B", "B"), value =c(1, 2, NA, NA))|>gt(groupname_col ="group")|>summary_rows( columns =value, fns =list("mean"), missing_text ="No data")
value
A
1
2
mean
1.5
B
NA
NA
mean
No data
The default missing_text is "---", which provides a visual placeholder that clearly indicates the absence of a calculable value (an em dash).
7.3 Adding table-level summaries with grand_summary_rows()
While summary_rows() operates on individual row groups, grand_summary_rows() calculates aggregations across all data in the table, regardless of grouping. Grand summary rows typically appear at the very bottom (or top) of the table, providing overall totals or averages.
This table now has both group-level summaries (min and max for each week) and grand summaries (overall min, max, and average for the entire period). The grand summary rows span all groups, summarizing the complete dataset.
7.3.1 Arguments and usage
The grand_summary_rows() function shares most of its arguments with summary_rows():
columns: Which columns to aggregate (defaults to all)
fns: Aggregation expressions (same syntax as summary_rows())
fmt: Formatting expressions
side: "bottom" (default) or "top"
missing_text: Text for NA values (default: "---")
The key difference is that grand_summary_rows() has no groups argument, since it always operates on all data.
countrypops|>filter(country_code_2%in%c("BE", "NL", "LU"))|>filter(year%%10==0)|>select(country_name, year, population)|>tidyr::pivot_wider(names_from =year, values_from =population)|>gt(rowname_col ="country_name")|>tab_header(title ="Populations of the Benelux Countries")|>tab_spanner(columns =everything(), label ="Year")|>fmt_integer()|>grand_summary_rows( fns =list(label ="TOTALS", id ="totals", fn ="sum"), fmt =~fmt_integer(.), side ="top")|>tab_style( locations =cells_grand_summary(), style =cell_fill(color ="lightblue"))
Populations of the Benelux Countries
Year
1960
1970
1980
1990
2000
2010
2020
TOTALS
20,954,090
23,033,246
24,373,192
25,300,739
26,613,063
28,017,933
29,610,523
Belgium
9,153,489
9,655,549
9,859,242
9,967,379
10,251,250
10,895,586
11,538,604
Luxembourg
313,970
339,171
364,150
381,850
436,300
506,953
630,419
Netherlands
11,486,631
13,038,526
14,149,800
14,951,510
15,925,513
16,615,394
17,441,500
In this example, the grand summary row appears at the top of the table (a common convention for totals) and is styled with a light blue background to distinguish it from the data rows.
7.3.2 Referencing other columns in aggregations
A powerful but often overlooked feature of summary rows is the ability to reference other columns within an aggregation formula. This is particularly useful when you need to compute a summary that depends on values from multiple columns, such as a weighted average or a ratio that must be calculated from totals rather than averaged directly.
Let’s consider a sales dataset where each row shows units sold and total profit for a product. If you want the grand summary to show the overall profit per unit, you cannot simply average the per-unit profits from each row (that would give incorrect results if the rows have different unit volumes). Instead, you need to sum all profits and divide by the sum of all units. The fns argument allows you to write aggregation expressions that reference any column in the table data, not just the column being summarized.
This example demonstrates several techniques worth noting. First, we call grand_summary_rows() three times, but all calls use the same id = "totals". This produces a single summary row in the table, not three separate rows. By splitting the aggregation across multiple calls, we can apply different logic to different columns while maintaining a unified row label.
Second, each call uses its own formatting: fmt_integer() for the units column, and fmt_currency() for the profit and profit-per-unit columns. This flexibility is essential when a single summary row must display values with fundamentally different formats.
Third, the profit_per_unit summary uses a custom formula that references the profit and units columns directly. The expression sum(profit) / sum(units) calculates the correct weighted profit per unit across all products (totaling "$37,750" profit divided by 1,120 units gives "$33.71" per unit), rather than averaging the individual ratios (which would incorrectly give "$33.04").
This technique works in both summary_rows() and grand_summary_rows(). Within a group summary, the column references resolve to just that group’s data. Within a grand summary, they resolve to the entire table’s data. This makes it straightforward to compute derived statistics like weighted averages, ratios of totals, or any calculation that requires multiple columns.
7.3.3 Combining group and grand summaries
Many real-world tables require multiple levels of aggregation. A financial report might show subtotals for each department and a grand total for the entire organization. A sales analysis might present regional summaries alongside company-wide figures. A scientific study might report group means as well as overall statistics. In these situations, you need both summary_rows() and grand_summary_rows() working together.
The key to making multi-level summaries work is visual differentiation. When readers encounter two or three types of summary rows in a single table, they need clear cues to distinguish subtotals from grand totals from data rows. This typically means using different background colors, text weights, or indentation levels for each type. Without such differentiation, the hierarchy becomes confusing and the summaries lose their communicative value.
The following example demonstrates this pattern with population data for two groups of countries:
countrypops|>filter(country_code_2%in%c("BR", "RU", "IN", "CN", "FR", "DE", "IT", "GB"))|>filter(year%%10==0)|>select(country_name, year, population)|>tidyr::pivot_wider(names_from =year, values_from =population)|>gt(rowname_col ="country_name")|>tab_row_group( label =md("*BRIC*"), rows =c("Brazil", "Russia", "India", "China"), id ="bric")|>tab_row_group( label =md("*Big Four*"), rows =c("France", "Germany", "Italy", "United Kingdom"), id ="big4")|>row_group_order(groups =c("bric", "big4"))|>tab_stub_indent(rows =everything())|>tab_header(title ="Populations of the BRIC and Big Four Countries")|>tab_spanner(columns =everything(), label ="Year")|>fmt_number(n_sigfig =3, suffixing =TRUE)|>summary_rows( fns =list(label =md("**Subtotal**"), id ="subtotal", fn ="sum"), fmt =~fmt_number(., n_sigfig =3, suffixing =TRUE), side ="bottom")|>grand_summary_rows( fns =list(label =md("**GRAND TOTAL**"), id ="grand_total", fn ="sum"), fmt =~fmt_number(., n_sigfig =3, suffixing =TRUE))|>tab_style( locations =cells_summary(), style =cell_fill(color ="lightgray"))|>tab_style( locations =cells_grand_summary(), style =list(cell_fill(color ="steelblue"),cell_text(color ="white")))
Populations of the BRIC and Big Four Countries
Year
1960
1970
1980
1990
2000
2010
2020
BRIC
Brazil
72.4M
95.4M
121M
149M
174M
194M
209M
China
667M
818M
981M
1.14B
1.26B
1.34B
1.41B
India
436M
546M
687M
865M
1.06B
1.24B
1.40B
Russia
120M
130M
139M
148M
147M
143M
145M
Subtotal
1.30B
1.59B
1.93B
2.30B
2.64B
2.92B
3.17B
Big Four
Germany
72.8M
78.2M
78.3M
79.4M
82.2M
81.8M
83.2M
France
47.4M
52.0M
55.3M
58.3M
60.9M
65.0M
67.6M
United Kingdom
52.4M
55.7M
56.3M
57.2M
58.9M
62.8M
66.7M
Italy
50.2M
53.8M
56.4M
56.7M
56.9M
59.8M
59.4M
Subtotal
223M
240M
246M
252M
259M
269M
277M
GRAND TOTAL
1.52B
1.83B
2.18B
2.55B
2.90B
3.19B
3.44B
This table shows populations for two groups of countries (BRIC and Big Four), with subtotals for each group and a grand total for all countries combined. The styling distinguishes the different types of summary rows, making the table’s structure immediately clear.
7.4 Adding summary columns with summary_columns()
While summary rows aggregate data vertically (down columns), summary columns aggregate data horizontally (across rows). The summary_columns() function computes row-wise aggregations and adds them as new columns to your table. This is useful when you want to show totals, averages, or other statistics that combine values from multiple columns within each row.
This adds a total column that sums the num and currency values for each row. The new column appears on the right side of the table by default and can be formatted, styled, and manipulated like any other column.
7.4.1 Specifying aggregation functions
The fns argument works similarly to the summary row functions, but the aggregation happens across columns rather than down rows. This horizontal aggregation is particularly useful for time-series data where columns represent periods (months, quarters, years) and you want to show totals or averages for each row. It’s also valuable when columns represent categories or components that naturally sum to a meaningful whole.
Unlike summary rows, which aggregate many observations into fewer summary values, summary columns preserve the row structure while adding derived values. Each row in your original data still appears in the output; you’re simply enriching it with computed statistics. This makes summary columns ideal for dashboards and reports where readers want to see both the details and the row-level summaries side by side.
You can add multiple summary columns at once by providing a list of functions. Each function generates a separate column, and you can specify names and labels for each.
7.4.2 Custom aggregation expressions
Simple aggregations like sums and means cover many use cases, but sometimes you need calculations that involve multiple columns in more complex ways. Perhaps you want to compute a percentage change between two columns, calculate a ratio, or apply a formula that references specific column positions. The bracket notation in summary_columns() makes this possible.
When you write .[1], .[2], and so on in your formula, you’re referring to the columns in the order they appear in your columns argument. This positional reference system lets you build arbitrarily complex expressions. The trade-off is readability: bracket notation is concise but can be cryptic if you’re not careful. For complex calculations, consider adding a comment in your code explaining what each position represents.
In this expression, .[1] refers to the first selected column (pop_1960), .[2] to the second (pop_2024), and .[3] to the third (years). This allows complex calculations that reference multiple columns.
7.4.3 Formatting summary columns
Summary columns often require different formatting than the source columns they’re derived from. A column of quarterly sales figures might use integer formatting, but the average of those quarters might warrant one decimal place. A column showing year-over-year change might need percentage formatting even when the source columns are raw numbers. The fmt argument lets you apply formatting to the new column immediately, without a separate formatting step.
This inline formatting is particularly convenient because it keeps the summary column’s definition self-contained. Everything about the column (what it calculates, what it’s called, and how it’s displayed) is specified in a single function call. This makes the code easier to read and maintain, especially when you’re creating multiple summary columns with different formatting requirements.
The summary column displays the Q1 average for each row, formatted as currency to match the source columns. By handling the formatting within summary_columns(), the new column integrates seamlessly with the rest of the table. Readers see consistent dollar formatting across all columns, whether they contain original data or computed summaries.
7.5 Styling summary rows
Summary rows can be styled like any other table element using tab_style(). The location helper functions cells_summary() and cells_grand_summary() target summary rows specifically.
7.5.1 Styling all summary rows
Summary rows serve a different purpose than data rows, and that difference should be visible. Readers scanning a table need to immediately distinguish totals and averages from individual observations. Without visual differentiation, summary rows blend into the data, and readers may misinterpret aggregated values as additional data points.
The most common styling choices for summary rows include background fills (light gray is a classic choice), bold text, borders above or below the row, and italic styling for certain types of summaries like averages. The goal is distinction without distraction: the styling should set summary rows apart without overwhelming the table’s visual hierarchy.
To apply uniform styling to all summary rows across all groups, use cells_summary() without any filtering arguments:
The gray background and bold text make the summary rows immediately identifiable. This blanket styling approach works well when all summary rows serve the same purpose and deserve equal visual weight.
7.5.2 Targeting specific summary rows
You can target specific summary rows by their ID using the rows argument in cells_summary():
By assigning IDs to summary rows in the fns argument, you gain precise control over their styling. Here, totals receive a blue background to signal their importance as definitive figures, while averages use italic text to suggest they’re derived statistics rather than absolute values. This kind of semantic styling helps readers interpret the numbers correctly.
7.5.3 Targeting specific groups and columns
The cells_summary() function accepts groups and columns arguments for precise targeting:
This level of precision is useful when you want to highlight specific values (perhaps a particularly important group, a column that exceeded targets, or a combination that warrants the reader’s attention). The cells_summary() helper accepts any combination of groups, columns, and rows arguments, giving you fine-grained control over exactly which cells receive styling.
7.5.4 Styling grand summary rows
Grand summary rows use the cells_grand_summary() location helper:
The dark background with white text creates strong visual emphasis appropriate for a grand total (the single most important number in many tables). Like cells_summary(), the cells_grand_summary() helper accepts columns and rows arguments for more targeted styling when needed.
7.6 Adding footnotes to summary rows
Footnotes can be attached to summary rows using tab_footnote() with the appropriate location helpers:
The footnote appears on every “Average” row across all groups, clarifying the calculation method. This is particularly important for summary statistics where multiple calculation methods exist (arithmetic vs. geometric mean, sample vs. population standard deviation) or where the handling of missing values affects interpretation. Footnotes on summary rows serve the same documentary purpose as footnotes elsewhere in the table, providing essential context without cluttering the display.
7.7 Extracting summary data with extract_summary()
Sometimes you need to access the computed summary values programmatically, rather than just displaying them in a table. The extract_summary() function retrieves summary data from a gt table as a list of data frames.
$summary_df_data_list
$summary_df_data_list$grp_a
# A tibble: 2 × 7
group_id row_id rowname group num currency `::rowname::`
<chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
1 grp_a sum sum NA 480. 65169. sum
2 grp_a mean mean NA 120. 16292. mean
$summary_df_data_list$grp_b
# A tibble: 2 × 7
group_id row_id rowname group num currency `::rowname::`
<chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
1 grp_b sum sum NA 9662550 1340. sum
2 grp_b mean mean NA 3220850 447. mean
$summary_df_data_list$`::GRAND_SUMMARY`
# A tibble: 1 × 7
group_id row_id rowname group num currency `::rowname::`
<chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
1 ::GRAND_SUMMARY sum sum NA 9663030. 66509. sum
The returned list contains one data frame per row group (including a "::grand_summary" element for grand summaries if present). Each data frame includes the row names, group information, and the formatted summary values.
This is useful for:
Reusing summary calculations in other analyses
Creating custom visualizations of summary data
Validating that summaries are calculated correctly
Building composite tables from multiple sources
7.8 Practical examples
The techniques covered in this chapter come together most clearly in real-world scenarios. The following examples demonstrate how summary rows and columns can be combined to create tables that serve specific professional purposes. Each example illustrates different aspects of the summary system and shows how formatting and styling choices reinforce the table’s communicative goals.
7.8.1 Financial statement with subtotals
Financial statements are perhaps the most natural application for summary rows. Accountants and financial analysts expect to see subtotals for logical groupings (revenue, expenses, assets, liabilities) along with grand totals that tie everything together. The following example creates a simplified income statement that demonstrates these conventions.
Notice how accounting notation (parentheses for negative values) is applied consistently to both the data rows and the summary rows. The subtotals use italic styling to distinguish them from line items, while the grand total (Net Income) receives bold text and a background fill to mark it as the bottom line. This visual hierarchy guides readers from detail to summary.
7.8.2 Sales report with multiple statistics
Business reports often need to show both detailed breakdowns and summary statistics. This example combines summary_columns() for row-wise totals and averages with grand_summary_rows() for column-wise company totals. The result is a table that answers multiple questions at once: How did each region perform? What were the quarterly trends? What’s the company-wide picture?
The use of column spanners ("Quarters" and "Summary") helps readers understand the table’s structure at a glance. The summary columns on the right provide immediate context for each region’s performance, while the grand summary row at the bottom gives the company-wide view. A manager scanning this table can quickly identify that East is the top-performing region and that sales have been trending upward across quarters.
7.8.3 Scientific data with statistical summaries
Scientific publications typically require summary statistics like means, standard deviations, and sample sizes for experimental data. This example shows how summary_rows() can present multiple statistics for each treatment group, formatted appropriately for each measure.
Each treatment group receives the same three summary statistics, formatted according to their nature: means and standard deviations show two decimal places, while sample sizes appear as integers. The light yellow background on summary rows distinguishes them from the raw data, making it easy for readers to find the statistics they need. This format is common in scientific papers and lab reports, where reviewers expect to see both the underlying data and its statistical summary.
These three examples illustrate the versatility of gt’s summary system. Whether you’re preparing financial reports, business analyses, or scientific publications, the same core functions adapt to meet domain-specific conventions and reader expectations.
7.9 Summary
The summary functions in gt provide a comprehensive system for adding aggregated data to your tables:
summary_rows() adds summary rows within each row group, useful for subtotals and group-level statistics
grand_summary_rows() adds summary rows that aggregate across the entire table, useful for overall totals and grand statistics
summary_columns() adds computed columns that aggregate across rows, useful for row totals and row-wise calculations
extract_summary() retrieves summary data for use outside the table
All three summary functions share a consistent interface with the fns argument for specifying aggregations, the fmt argument for formatting results, and the side argument for positioning. Summary elements can be styled and annotated using the standard gt functions with the cells_summary() and cells_grand_summary() location helpers.
By separating the calculation of summaries from their presentation, gt ensures that your tables remain reproducible and maintainable. When data changes, summaries update automatically. When you need different formatting, you can adjust it without recalculating. This approach embodies the gt philosophy: express your intent clearly, and let the package handle the details.