7 Summary rows and columns

Tables often need more than just raw data. They frequently require summary statistics that help readers understand the data at a glance: totals, averages, minimums, maximums, and other aggregations. In gt, we have powerful tools for adding these summaries both vertically (as summary rows) and horizontally (as summary columns).

The summary functions in gt address a common frustration in table-making: the disconnect between calculating summary statistics and presenting them elegantly. In many workflows, you might calculate means and totals in your data manipulation code, then manually position them in your table, style them differently from the data rows, and hope nothing breaks when the data changes. The gt approach is different. Summary rows and columns are computed from the table data itself, positioned automatically, and can be styled and formatted independently of the body data. When your underlying data changes, the summaries update automatically.

This chapter covers three primary functions: summary_rows() for adding summaries within row groups, grand_summary_rows() for adding summaries across the entire table, and summary_columns() for adding computed columns that aggregate across rows. We’ll also cover extract_summary(), which lets you pull out summary data for use elsewhere. Together, these functions provide a complete toolkit for building tables that don’t just present data but help readers understand it.

7.1 The role of summary rows in tables

Before diving into the mechanics of creating summary rows, it’s worth considering when and why you might use them. Summary rows serve several distinct purposes.

First, they provide context. A column of sales figures means more when you can see the total at the bottom. A list of response times becomes interpretable when you know the average. Summary rows transform raw data into information by providing reference points.

Second, they enable comparison. When data is organized into groups, summary rows for each group allow readers to compare not just individual values but aggregate performance. How does this quarter compare to last quarter? How does the East region compare to the West? Summary rows answer these questions at a glance.

Third, they support verification. In financial and accounting contexts, totals serve as check figures. Readers (and auditors) expect to see them, and their presence signals that the table has been carefully prepared.

Finally, they reduce cognitive load. Rather than requiring readers to mentally calculate averages or identify maximums, summary rows do the work for them. This is especially valuable when tables will be scanned quickly rather than studied carefully.

The gt package supports two types of summary rows: group summaries (created with summary_rows()) that appear within or alongside each row group, and grand summaries (created with grand_summary_rows()) that summarize the entire table. Both types can include multiple summary statistics, can be positioned at the top or bottom of their respective sections, and can be formatted independently of the body data.

7.2 Adding group-level summaries with `summary_rows()`

The summary_rows() function adds summary rows to row groups in your table. This requires that your table actually has row groups, which you typically create by specifying a groupname_col in gt() or by using tab_row_group(). The summary rows appear within each group, providing statistics calculated from just that group’s data.

Let’s start with a practical example. We’ll use the sp500 dataset, which contains daily stock market data, and create a table with weekly row groups:

sp500 |>
  filter(date >= "2015-01-05" & date <= "2015-01-16") |>
  arrange(date) |>
  mutate(week = paste0("W", strftime(date, format = "%V"))) |>
  select(-adj_close, -volume) |>
  gt(
    rowname_col = "date",
    groupname_col = "week"
  ) |>
  summary_rows(
    fns = list("min", "max", "mean"),
    fmt = ~ fmt_number(., decimals = 2, use_seps = FALSE)
  )

	open	high	low	close
W02
2015-01-05	2054.44	2054.44	2017.34	2020.58
2015-01-06	2022.15	2030.25	1992.44	2002.61
2015-01-07	2005.55	2029.61	2005.55	2025.90
2015-01-08	2030.61	2064.08	2030.61	2062.14
2015-01-09	2063.45	2064.43	2038.33	2044.81
min	2005.55	2029.61	1992.44	2002.61
max	2063.45	2064.43	2038.33	2062.14
mean	2035.24	2048.56	2016.85	2031.21
W03
2015-01-12	2046.13	2049.30	2022.58	2028.26
2015-01-13	2031.58	2056.93	2008.25	2023.03
2015-01-14	2018.40	2018.40	1988.44	2011.27
2015-01-15	2013.75	2021.35	1991.47	1992.67
2015-01-16	1992.25	2020.46	1988.12	2019.42
min	1992.25	2018.40	1988.12	1992.67
max	2046.13	2056.93	2022.58	2028.26
mean	2020.42	2033.29	1999.77	2014.93

This table shows daily stock prices grouped by week, with minimum, maximum, and mean values calculated for each week. The summary rows appear at the bottom of each group by default, clearly labeled with the function names.

7.2.1 Specifying aggregation functions

The fns argument is where you tell summary_rows() what statistics to calculate. There are several ways to specify aggregation functions, ranging from simple to highly customized.

The simplest approach is to provide function names as quoted strings:

sp500 |>
  filter(date >= "2015-01-05" & date <= "2015-01-09") |>
  select(date, open, high, low, close) |>
  gt(rowname_col = "date") |>
  tab_row_group(label = "First Week", rows = everything()) |>
  summary_rows(
    fns = list("min", "max", "sum"),
    fmt = ~ fmt_number(., decimals = 2)
  )

	open	high	low	close
First Week
2015-01-09	2063.45	2064.43	2038.33	2044.81
2015-01-08	2030.61	2064.08	2030.61	2062.14
2015-01-07	2005.55	2029.61	2005.55	2025.90
2015-01-06	2022.15	2030.25	1992.44	2002.61
2015-01-05	2054.44	2054.44	2017.34	2020.58
min	2,005.55	2,029.61	1,992.44	2,002.61
max	2,063.45	2,064.43	2,038.33	2,062.14
sum	10,176.20	10,242.81	10,084.27	10,156.04

For convenience, gt automatically adds na.rm = TRUE to common aggregation functions when you specify them as strings. This applies to "min", "max", "mean", "median", "sd", and "sum". This saves you from writing out the full expression every time.

When you need more control, you can use formula syntax. The right-hand side (RHS) of the formula contains the aggregation expression, where . represents the column data:

sp500 |>
  filter(date >= "2015-01-05" & date <= "2015-01-09") |>
  select(date, open, close) |>
  gt(rowname_col = "date") |>
  tab_row_group(label = "First Week", rows = everything()) |>
  summary_rows(
    fns = list(
      range = ~ max(.) - min(.),
      avg = ~ mean(., na.rm = TRUE)
    ),
    fmt = ~ fmt_number(., decimals = 2)
  )

	open	close
First Week
2015-01-09	2063.45	2044.81
2015-01-08	2030.61	2062.14
2015-01-07	2005.55	2025.90
2015-01-06	2022.15	2002.61
2015-01-05	2054.44	2020.58
range	57.90	59.53
avg	2,035.24	2,031.21

The named list elements become both the ID and the label for each summary row. In this example, we calculate a custom range statistic (the difference between maximum and minimum) alongside the average.

For maximum flexibility, you can provide a double-sided formula where the left-hand side (LHS) specifies the ID and label:

sp500 |>
  filter(date >= "2015-01-05" & date <= "2015-01-09") |>
  select(date, open, close) |>
  gt(rowname_col = "date") |>
  tab_row_group(label = "First Week", rows = everything()) |>
  summary_rows(
    fns = list(
      list(id = "daily_range", label = md("**Daily Range**")) ~ max(.) - min(.),
      list(id = "average", label = md("*Average*")) ~ mean(., na.rm = TRUE)
    ),
    fmt = ~ fmt_number(., decimals = 2)
  )

	open	close
First Week
2015-01-09	2063.45	2044.81
2015-01-08	2030.61	2062.14
2015-01-07	2005.55	2025.90
2015-01-06	2022.15	2002.61
2015-01-05	2054.44	2020.58
Daily Range	57.90	59.53
Average	2,035.24	2,031.21

This approach lets you use different values for the row’s ID (used for targeting in other functions like tab_style()) and its label (displayed in the table). The label can include Markdown formatting via md() or HTML via html().

7.2.2 Targeting specific columns

By default, summary_rows() calculates aggregations for all columns. This is rarely what you want, especially when your table includes non-numeric columns or columns where aggregation doesn’t make sense.

The columns argument lets you specify which columns should receive summary calculations:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum", "mean"),
    fmt = ~ fmt_number(., decimals = 1)
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480.1	65,169.3
mean	120.0	16,292.3
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	9,662,550.0	1,339.5
mean	3,220,850.0	446.5

You can use any tidyselect expression in columns:

exibble |>
  select(group, num, currency, char) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = where(is.numeric),
    fns = list("sum"),
    fmt = ~ fmt_number(., decimals = 1)
  )

	num	currency	char
grp_a
	1.111e-01	49.950	apricot
	2.222e+00	17.950	banana
	3.333e+01	1.390	coconut
	4.444e+02	65100.000	durian
sum	480.1	65,169.3	—
grp_b
	5.550e+03	1325.810	NA
	NA	13.255	fig
	7.770e+05	NA	grapefruit
	8.880e+06	0.440	honeydew
sum	9,662,550.0	1,339.5	—

Using where(is.numeric) ensures that only numeric columns receive summary calculations, even if the table structure changes later.

7.2.3 Targeting specific groups

Sometimes you only want summary rows for certain groups, not all of them. The groups argument controls which row groups receive summaries:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    groups = "grp_a",
    columns = c(num, currency),
    fns = list("sum"),
    fmt = ~ fmt_number(., decimals = 1)
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480.1	65,169.3
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440

This adds a sum row only to grp_a, leaving grp_b without summaries. The groups argument accepts group IDs as strings, vectors of strings, or tidyselect helpers like starts_with() or matches().

7.2.4 Positioning summary rows

By default, summary rows appear at the bottom of each group. The side argument lets you place them at the top instead:

countrypops |>
  filter(country_code_2 %in% c("BR", "RU", "IN", "CN")) |>
  filter(year %in% c(2000, 2010, 2020)) |>
  select(country_name, year, population) |>
  tidyr::pivot_wider(names_from = year, values_from = population) |>
  gt(rowname_col = "country_name") |>
  tab_row_group(label = "BRIC Nations", rows = everything()) |>
  summary_rows(
    fns = list(label = md("**TOTAL**"), id = "total", fn = "sum"),
    fmt = ~ fmt_integer(.),
    side = "top"
  )

	2000	2010	2020
BRIC Nations
TOTAL	2,641,182,884	2,917,737,961	3,167,623,685
Brazil	174018282	193701929	208660842
China	1262645000	1337705000	1411100000
India	1057922733	1243481564	1402617695
Russia	146596869	142849468	145245148

Placing totals at the top is common in financial reporting, where readers want to see the bottom line first. The summary row appears before the detail rows, inviting readers to drill down into the components.

7.2.5 Formatting summary values

The fmt argument accepts formatting expressions that control how summary values appear. You can use any gt formatting function:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum", "mean"),
    fmt = list(
      ~ fmt_number(., decimals = 0, use_seps = TRUE),
      ~ fmt_currency(., currency = "USD")
    )
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	$480.06	$65,169.29
mean	$120.02	$16,292.32
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	$9,662,550.00	$1,339.50
mean	$3,220,850.00	$446.50

When you provide a list of formatting expressions, they’re applied in order to the summary rows. In this example, the sum row uses fmt_number() and the mean row uses fmt_currency().

You can also target specific columns within the formatting expression:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum"),
    fmt = ~ fmt_number(., columns = num, decimals = 1) |>
           fmt_currency(columns = currency, currency = "EUR")
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480.1	€65,169.29
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	9,662,550.0	€1,339.50

For group-specific formatting, use a two-sided formula where the LHS identifies the group:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum"),
    fmt = list(
      "grp_a" ~ fmt_number(., decimals = 0),
      "grp_b" ~ fmt_number(., decimals = 2)
    )
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480	65,169
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	9,662,550.00	1,339.50

This technique is useful when groups have different precision requirements. Perhaps one group contains whole-number counts while another contains measurements that warrant decimal places. The two-sided formula syntax keeps these formatting rules close to the summary definition, making the code easier to maintain.

7.2.6 Handling missing values

When aggregation produces NA values (for example, when all values in a column are NA), the missing_text argument controls what appears in the cell:

dplyr::tibble(
  group = c("A", "A", "B", "B"),
  value = c(1, 2, NA, NA)
) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = value,
    fns = list("mean"),
    missing_text = "No data"
  )

	value
A
	1
	2
mean	1.5
B
	NA
	NA
mean	No data

The default missing_text is "---", which provides a visual placeholder that clearly indicates the absence of a calculable value (an em dash).

7.3 Adding table-level summaries with `grand_summary_rows()`

While summary_rows() operates on individual row groups, grand_summary_rows() calculates aggregations across all data in the table, regardless of grouping. Grand summary rows typically appear at the very bottom (or top) of the table, providing overall totals or averages.

sp500 |>
  filter(date >= "2015-01-05" & date <= "2015-01-16") |>
  arrange(date) |>
  mutate(week = paste0("W", strftime(date, format = "%V"))) |>
  select(-adj_close, -volume) |>
  gt(
    rowname_col = "date",
    groupname_col = "week"
  ) |>
  summary_rows(
    fns = list("min", "max"),
    fmt = ~ fmt_number(., decimals = 2, use_seps = FALSE)
  ) |>
  grand_summary_rows(
    columns = c(open, high, low, close),
    fns = list(
      list(id = "grand_min", label = "Overall Min") ~ min(.),
      list(id = "grand_max", label = "Overall Max") ~ max(.),
      list(id = "grand_avg", label = "Overall Avg") ~ mean(.)
    ),
    fmt = ~ fmt_number(., decimals = 2, use_seps = FALSE)
  )

	open	high	low	close
W02
2015-01-05	2054.44	2054.44	2017.34	2020.58
2015-01-06	2022.15	2030.25	1992.44	2002.61
2015-01-07	2005.55	2029.61	2005.55	2025.90
2015-01-08	2030.61	2064.08	2030.61	2062.14
2015-01-09	2063.45	2064.43	2038.33	2044.81
min	2005.55	2029.61	1992.44	2002.61
max	2063.45	2064.43	2038.33	2062.14
W03
2015-01-12	2046.13	2049.30	2022.58	2028.26
2015-01-13	2031.58	2056.93	2008.25	2023.03
2015-01-14	2018.40	2018.40	1988.44	2011.27
2015-01-15	2013.75	2021.35	1991.47	1992.67
2015-01-16	1992.25	2020.46	1988.12	2019.42
min	1992.25	2018.40	1988.12	1992.67
max	2046.13	2056.93	2022.58	2028.26
Overall Min	1992.25	2018.40	1988.12	1992.67
Overall Max	2063.45	2064.43	2038.33	2062.14
Overall Avg	2027.83	2040.92	2008.31	2023.07

This table now has both group-level summaries (min and max for each week) and grand summaries (overall min, max, and average for the entire period). The grand summary rows span all groups, summarizing the complete dataset.

7.3.1 Arguments and usage

The grand_summary_rows() function shares most of its arguments with summary_rows():

columns: Which columns to aggregate (defaults to all)
fns: Aggregation expressions (same syntax as summary_rows())
fmt: Formatting expressions
side: "bottom" (default) or "top"
missing_text: Text for NA values (default: "---")

The key difference is that grand_summary_rows() has no groups argument, since it always operates on all data.

countrypops |>
  filter(country_code_2 %in% c("BE", "NL", "LU")) |>
  filter(year %% 10 == 0) |>
  select(country_name, year, population) |>
  tidyr::pivot_wider(names_from = year, values_from = population) |>
  gt(rowname_col = "country_name") |>
  tab_header(title = "Populations of the Benelux Countries") |>
  tab_spanner(columns = everything(), label = "Year") |>
  fmt_integer() |>
  grand_summary_rows(
    fns = list(label = "TOTALS", id = "totals", fn = "sum"),
    fmt = ~ fmt_integer(.),
    side = "top"
  ) |>
  tab_style(
    locations = cells_grand_summary(),
    style = cell_fill(color = "lightblue")
  )

	Year
Populations of the Benelux Countries
	1960	1970	1980	1990	2000	2010	2020
TOTALS	20,954,090	23,033,246	24,373,192	25,300,739	26,613,063	28,017,933	29,610,523
Belgium	9,153,489	9,655,549	9,859,242	9,967,379	10,251,250	10,895,586	11,538,604
Luxembourg	313,970	339,171	364,150	381,850	436,300	506,953	630,419
Netherlands	11,486,631	13,038,526	14,149,800	14,951,510	15,925,513	16,615,394	17,441,500

In this example, the grand summary row appears at the top of the table (a common convention for totals) and is styled with a light blue background to distinguish it from the data rows.

7.3.2 Referencing other columns in aggregations

A powerful but often overlooked feature of summary rows is the ability to reference other columns within an aggregation formula. This is particularly useful when you need to compute a summary that depends on values from multiple columns, such as a weighted average or a ratio that must be calculated from totals rather than averaged directly.

Let’s consider a sales dataset where each row shows units sold and total profit for a product. If you want the grand summary to show the overall profit per unit, you cannot simply average the per-unit profits from each row (that would give incorrect results if the rows have different unit volumes). Instead, you need to sum all profits and divide by the sum of all units. The fns argument allows you to write aggregation expressions that reference any column in the table data, not just the column being summarized.

dplyr::tibble(
  product = c("Widget", "Gadget", "Sprocket", "Gizmo", "Doodad"),
  units = c(150, 280, 95, 420, 175),
  profit = c(4500, 11200, 3325, 12600, 6125)
) |>
  dplyr::mutate(profit_per_unit = profit / units) |>
  gt(rowname_col = "product") |>
  fmt_integer(columns = units) |>
  fmt_currency(columns = c(profit, profit_per_unit)) |>
  cols_label(
    units = "Units Sold",
    profit = "Total Profit",
    profit_per_unit = "Profit/Unit"
  ) |>
  grand_summary_rows(
    columns = units,
    fns = list(label = md("**TOTAL**"), id = "totals", fn = "sum"),
    fmt = ~ fmt_integer(.)
  ) |>
  grand_summary_rows(
    columns = c(profit, profit_per_unit),
    fns = list(label = md("**TOTAL**"), id = "totals", fn = "sum"),
    fmt = ~ fmt_currency(.)
  ) |>
  grand_summary_rows(
    columns = profit_per_unit,
    fns = list(label = md("**TOTAL**"), id = "totals") ~ sum(profit) / sum(units),
    fmt = ~ fmt_currency(.)
  ) |>
  tab_style(
    locations = cells_grand_summary(),
    style = cell_fill(color = "lightgray")
  )

	Units Sold	Total Profit	Profit/Unit
Widget	150	$4,500.00	$30.00
Gadget	280	$11,200.00	$40.00
Sprocket	95	$3,325.00	$35.00
Gizmo	420	$12,600.00	$30.00
Doodad	175	$6,125.00	$35.00
TOTAL	1,120	$37,750.00	$33.71

This example demonstrates several techniques worth noting. First, we call grand_summary_rows() three times, but all calls use the same id = "totals". This produces a single summary row in the table, not three separate rows. By splitting the aggregation across multiple calls, we can apply different logic to different columns while maintaining a unified row label.

Second, each call uses its own formatting: fmt_integer() for the units column, and fmt_currency() for the profit and profit-per-unit columns. This flexibility is essential when a single summary row must display values with fundamentally different formats.

Third, the profit_per_unit summary uses a custom formula that references the profit and units columns directly. The expression sum(profit) / sum(units) calculates the correct weighted profit per unit across all products (totaling "$37,750" profit divided by 1,120 units gives "$33.71" per unit), rather than averaging the individual ratios (which would incorrectly give "$33.04").

This technique works in both summary_rows() and grand_summary_rows(). Within a group summary, the column references resolve to just that group’s data. Within a grand summary, they resolve to the entire table’s data. This makes it straightforward to compute derived statistics like weighted averages, ratios of totals, or any calculation that requires multiple columns.

7.3.3 Combining group and grand summaries

Many real-world tables require multiple levels of aggregation. A financial report might show subtotals for each department and a grand total for the entire organization. A sales analysis might present regional summaries alongside company-wide figures. A scientific study might report group means as well as overall statistics. In these situations, you need both summary_rows() and grand_summary_rows() working together.

The key to making multi-level summaries work is visual differentiation. When readers encounter two or three types of summary rows in a single table, they need clear cues to distinguish subtotals from grand totals from data rows. This typically means using different background colors, text weights, or indentation levels for each type. Without such differentiation, the hierarchy becomes confusing and the summaries lose their communicative value.

The following example demonstrates this pattern with population data for two groups of countries:

countrypops |>
  filter(
    country_code_2 %in% c("BR", "RU", "IN", "CN", "FR", "DE", "IT", "GB")
  ) |>
  filter(year %% 10 == 0) |>
  select(country_name, year, population) |>
  tidyr::pivot_wider(names_from = year, values_from = population) |>
  gt(rowname_col = "country_name") |>
  tab_row_group(
    label = md("*BRIC*"),
    rows = c("Brazil", "Russia", "India", "China"),
    id = "bric"
  ) |>
  tab_row_group(
    label = md("*Big Four*"),
    rows = c("France", "Germany", "Italy", "United Kingdom"),
    id = "big4"
  ) |>
  row_group_order(groups = c("bric", "big4")) |>
  tab_stub_indent(rows = everything()) |>
  tab_header(title = "Populations of the BRIC and Big Four Countries") |>
  tab_spanner(columns = everything(), label = "Year") |>
  fmt_number(n_sigfig = 3, suffixing = TRUE) |>
  summary_rows(
    fns = list(label = md("**Subtotal**"), id = "subtotal", fn = "sum"),
    fmt = ~ fmt_number(., n_sigfig = 3, suffixing = TRUE),
    side = "bottom"
  ) |>
  grand_summary_rows(
    fns = list(label = md("**GRAND TOTAL**"), id = "grand_total", fn = "sum"),
    fmt = ~ fmt_number(., n_sigfig = 3, suffixing = TRUE)
  ) |>
  tab_style(
    locations = cells_summary(),
    style = cell_fill(color = "lightgray")
  ) |>
  tab_style(
    locations = cells_grand_summary(),
    style = list(
      cell_fill(color = "steelblue"),
      cell_text(color = "white")
    )
  )

	Year
Populations of the BRIC and Big Four Countries
	1960	1970	1980	1990	2000	2010	2020
BRIC
Brazil	72.4M	95.4M	121M	149M	174M	194M	209M
China	667M	818M	981M	1.14B	1.26B	1.34B	1.41B
India	436M	546M	687M	865M	1.06B	1.24B	1.40B
Russia	120M	130M	139M	148M	147M	143M	145M
Subtotal	1.30B	1.59B	1.93B	2.30B	2.64B	2.92B	3.17B
Big Four
Germany	72.8M	78.2M	78.3M	79.4M	82.2M	81.8M	83.2M
France	47.4M	52.0M	55.3M	58.3M	60.9M	65.0M	67.6M
United Kingdom	52.4M	55.7M	56.3M	57.2M	58.9M	62.8M	66.7M
Italy	50.2M	53.8M	56.4M	56.7M	56.9M	59.8M	59.4M
Subtotal	223M	240M	246M	252M	259M	269M	277M
GRAND TOTAL	1.52B	1.83B	2.18B	2.55B	2.90B	3.19B	3.44B

This table shows populations for two groups of countries (BRIC and Big Four), with subtotals for each group and a grand total for all countries combined. The styling distinguishes the different types of summary rows, making the table’s structure immediately clear.

7.4 Adding summary columns with `summary_columns()`

While summary rows aggregate data vertically (down columns), summary columns aggregate data horizontally (across rows). The summary_columns() function computes row-wise aggregations and adds them as new columns to your table. This is useful when you want to show totals, averages, or other statistics that combine values from multiple columns within each row.

exibble |>
  select(row, num, currency) |>
  gt(rowname_col = "row") |>
  summary_columns(
    columns = c(num, currency),
    fns = ~ sum(., na.rm = TRUE),
    new_col_names = "total",
    new_col_labels = md("**Total**")
  )

	num	currency	Total
row_1	1.111e-01	49.950	5.00611e+01
row_2	2.222e+00	17.950	2.01720e+01
row_3	3.333e+01	1.390	3.47200e+01
row_4	4.444e+02	65100.000	6.55444e+04
row_5	5.550e+03	1325.810	6.87581e+03
row_6	NA	13.255	1.32550e+01
row_7	7.770e+05	NA	7.77000e+05
row_8	8.880e+06	0.440	8.88000e+06

This adds a total column that sums the num and currency values for each row. The new column appears on the right side of the table by default and can be formatted, styled, and manipulated like any other column.

7.4.1 Specifying aggregation functions

The fns argument works similarly to the summary row functions, but the aggregation happens across columns rather than down rows. This horizontal aggregation is particularly useful for time-series data where columns represent periods (months, quarters, years) and you want to show totals or averages for each row. It’s also valuable when columns represent categories or components that naturally sum to a meaningful whole.

Unlike summary rows, which aggregate many observations into fewer summary values, summary columns preserve the row structure while adding derived values. Each row in your original data still appears in the output; you’re simply enriching it with computed statistics. This makes summary columns ideal for dashboards and reports where readers want to see both the details and the row-level summaries side by side.

dplyr::tibble(
  product = c("Widget", "Gadget", "Sprocket"),
  q1_sales = c(150, 200, 175),
  q2_sales = c(180, 190, 160),
  q3_sales = c(200, 210, 190),
  q4_sales = c(220, 205, 210)
) |>
  gt(rowname_col = "product") |>
  summary_columns(
    columns = ends_with("_sales"),
    fns = list(
      ~ sum(.),
      ~ mean(.)
    ),
    new_col_names = c("annual_total", "quarterly_avg"),
    new_col_labels = c("Annual Total", "Quarterly Avg")
  ) |>
  fmt_number(columns = quarterly_avg, decimals = 1)

	q1_sales	q2_sales	q3_sales	q4_sales	Annual Total	Quarterly Avg
Widget	150	180	200	220	750	187.5
Gadget	200	190	210	205	805	201.2
Sprocket	175	160	190	210	735	183.8

You can add multiple summary columns at once by providing a list of functions. Each function generates a separate column, and you can specify names and labels for each.

7.4.2 Custom aggregation expressions

Simple aggregations like sums and means cover many use cases, but sometimes you need calculations that involve multiple columns in more complex ways. Perhaps you want to compute a percentage change between two columns, calculate a ratio, or apply a formula that references specific column positions. The bracket notation in summary_columns() makes this possible.

When you write .[1], .[2], and so on in your formula, you’re referring to the columns in the order they appear in your columns argument. This positional reference system lets you build arbitrarily complex expressions. The trade-off is readability: bracket notation is concise but can be cryptic if you’re not careful. For complex calculations, consider adding a comment in your code explaining what each position represents.

countrypops |>
  arrange(country_name, year) |>
  group_by(country_name, country_code_2) |>
  summarize(
    pop_1960 = first(population),
    pop_2024 = last(population),
    years = n(),
    .groups = "drop"
  ) |>
  filter(!is.na(pop_1960), !is.na(pop_2024)) |>
  slice_max(pop_2024 - pop_1960, n = 5) |>
  select(country_name, pop_1960, pop_2024, years) |>
  gt(rowname_col = "country_name") |>
  summary_columns(
    columns = c(pop_1960, pop_2024, years),
    fns = ~ (.[2] - .[1]) / (.[3] - 1),
    new_col_names = "avg_annual_change",
    new_col_labels = md("**Avg. Annual Change**")
  ) |>
  fmt_integer() |>
  cols_label(
    pop_1960 = "1960 Pop.",
    pop_2024 = "2024 Pop.",
    years = "Years"
  )

	1960 Pop.	2024 Pop.	Years	Avg. Annual Change
India	435,990,338	1,450,935,791	65	15,858,523
China	667,070,000	1,408,975,000	65	11,592,266
Pakistan	45,709,310	251,269,164	65	3,211,873
Indonesia	88,296,070	283,487,931	65	3,049,873
Nigeria	45,053,782	232,679,478	65	2,931,652

In this expression, .[1] refers to the first selected column (pop_1960), .[2] to the second (pop_2024), and .[3] to the third (years). This allows complex calculations that reference multiple columns.

7.4.3 Formatting summary columns

Summary columns often require different formatting than the source columns they’re derived from. A column of quarterly sales figures might use integer formatting, but the average of those quarters might warrant one decimal place. A column showing year-over-year change might need percentage formatting even when the source columns are raw numbers. The fmt argument lets you apply formatting to the new column immediately, without a separate formatting step.

This inline formatting is particularly convenient because it keeps the summary column’s definition self-contained. Everything about the column (what it calculates, what it’s called, and how it’s displayed) is specified in a single function call. This makes the code easier to read and maintain, especially when you’re creating multiple summary columns with different formatting requirements.

dplyr::tibble(
  item = c("Revenue", "Expenses", "Profit"),
  jan = c(10000, 8000, 2000),
  feb = c(12000, 8500, 3500),
  mar = c(11000, 8200, 2800)
) |>
  gt(rowname_col = "item") |>
  summary_columns(
    columns = c(jan, feb, mar),
    fns = ~ mean(.),
    new_col_names = "avg",
    new_col_labels = "Q1 Average",
    fmt = ~ fmt_currency(., currency = "USD", decimals = 0)
  ) |>
  fmt_currency(columns = c(jan, feb, mar), currency = "USD", decimals = 0)

	jan	feb	mar	Q1 Average
Revenue	$10,000	$12,000	$11,000	$11,000
Expenses	$8,000	$8,500	$8,200	$8,233
Profit	$2,000	$3,500	$2,800	$2,767

The summary column displays the Q1 average for each row, formatted as currency to match the source columns. By handling the formatting within summary_columns(), the new column integrates seamlessly with the rest of the table. Readers see consistent dollar formatting across all columns, whether they contain original data or computed summaries.

7.5 Styling summary rows

Summary rows can be styled like any other table element using tab_style(). The location helper functions cells_summary() and cells_grand_summary() target summary rows specifically.

7.5.1 Styling all summary rows

Summary rows serve a different purpose than data rows, and that difference should be visible. Readers scanning a table need to immediately distinguish totals and averages from individual observations. Without visual differentiation, summary rows blend into the data, and readers may misinterpret aggregated values as additional data points.

The most common styling choices for summary rows include background fills (light gray is a classic choice), bold text, borders above or below the row, and italic styling for certain types of summaries like averages. The goal is distinction without distraction: the styling should set summary rows apart without overwhelming the table’s visual hierarchy.

To apply uniform styling to all summary rows across all groups, use cells_summary() without any filtering arguments:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum", "mean"),
    fmt = ~ fmt_number(., decimals = 1)
  ) |>
  tab_style(
    locations = cells_summary(),
    style = list(
      cell_fill(color = "lightgray"),
      cell_text(weight = "bold")
    )
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480.1	65,169.3
mean	120.0	16,292.3
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	9,662,550.0	1,339.5
mean	3,220,850.0	446.5

The gray background and bold text make the summary rows immediately identifiable. This blanket styling approach works well when all summary rows serve the same purpose and deserve equal visual weight.

7.5.2 Targeting specific summary rows

You can target specific summary rows by their ID using the rows argument in cells_summary():

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list(
      list(id = "total", label = "Total") ~ sum(., na.rm = TRUE),
      list(id = "average", label = "Average") ~ mean(., na.rm = TRUE)
    ),
    fmt = ~ fmt_number(., decimals = 1)
  ) |>
  tab_style(
    locations = cells_summary(rows = "total"),
    style = cell_fill(color = "lightblue")
  ) |>
  tab_style(
    locations = cells_summary(rows = "average"),
    style = cell_text(style = "italic")
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
Total	480.1	65,169.3
Average	120.0	16,292.3
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
Total	9,662,550.0	1,339.5
Average	3,220,850.0	446.5

By assigning IDs to summary rows in the fns argument, you gain precise control over their styling. Here, totals receive a blue background to signal their importance as definitive figures, while averages use italic text to suggest they’re derived statistics rather than absolute values. This kind of semantic styling helps readers interpret the numbers correctly.

7.5.3 Targeting specific groups and columns

The cells_summary() function accepts groups and columns arguments for precise targeting:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum"),
    fmt = ~ fmt_number(., decimals = 1)
  ) |>
  tab_style(
    locations = cells_summary(
      groups = "grp_a",
      columns = currency
    ),
    style = cell_fill(color = "lightgreen")
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
sum	480.1	65,169.3
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
sum	9,662,550.0	1,339.5

This level of precision is useful when you want to highlight specific values (perhaps a particularly important group, a column that exceeded targets, or a combination that warrants the reader’s attention). The cells_summary() helper accepts any combination of groups, columns, and rows arguments, giving you fine-grained control over exactly which cells receive styling.

7.5.4 Styling grand summary rows

Grand summary rows use the cells_grand_summary() location helper:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  grand_summary_rows(
    columns = c(num, currency),
    fns = list(
      list(id = "grand_total", label = md("**Grand Total**")) ~ sum(., na.rm = TRUE)
    ),
    fmt = ~ fmt_number(., decimals = 1)
  ) |>
  tab_style(
    locations = cells_grand_summary(),
    style = list(
      cell_fill(color = "navy"),
      cell_text(color = "white", weight = "bold")
    )
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
Grand Total	9,663,030.1	66,508.8

The dark background with white text creates strong visual emphasis appropriate for a grand total (the single most important number in many tables). Like cells_summary(), the cells_grand_summary() helper accepts columns and rows arguments for more targeted styling when needed.

7.6 Adding footnotes to summary rows

Footnotes can be attached to summary rows using tab_footnote() with the appropriate location helpers:

exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list(
      list(id = "avg", label = "Average") ~ mean(., na.rm = TRUE)
    ),
    fmt = ~ fmt_number(., decimals = 2)
  ) |>
  tab_footnote(
    footnote = "Arithmetic mean excluding missing values",
    locations = cells_summary(
      groups = everything(),
      columns = num,
      rows = "avg"
    )
  )

	num	currency
grp_a
	1.111e-01	49.950
	2.222e+00	17.950
	3.333e+01	1.390
	4.444e+02	65100.000
Average	120.02¹	16,292.32
grp_b
	5.550e+03	1325.810
	NA	13.255
	7.770e+05	NA
	8.880e+06	0.440
Average	3,220,850.00¹	446.50
¹ Arithmetic mean excluding missing values

The footnote appears on every “Average” row across all groups, clarifying the calculation method. This is particularly important for summary statistics where multiple calculation methods exist (arithmetic vs. geometric mean, sample vs. population standard deviation) or where the handling of missing values affects interpretation. Footnotes on summary rows serve the same documentary purpose as footnotes elsewhere in the table, providing essential context without cluttering the display.

7.7 Extracting summary data with `extract_summary()`

Sometimes you need to access the computed summary values programmatically, rather than just displaying them in a table. The extract_summary() function retrieves summary data from a gt table as a list of data frames.

gt_table <- exibble |>
  select(group, num, currency) |>
  gt(groupname_col = "group") |>
  summary_rows(
    columns = c(num, currency),
    fns = list("sum", "mean"),
    fmt = ~ fmt_number(., decimals = 2)
  ) |>
  grand_summary_rows(
    columns = c(num, currency),
    fns = list("sum"),
    fmt = ~ fmt_number(., decimals = 2)
  )

extract_summary(gt_table)

$summary_df_data_list
$summary_df_data_list$grp_a
# A tibble: 2 × 7
  group_id row_id rowname group   num currency `::rowname::`
  <chr>    <chr>  <chr>   <dbl> <dbl>    <dbl> <chr>        
1 grp_a    sum    sum        NA  480.   65169. sum          
2 grp_a    mean   mean       NA  120.   16292. mean         

$summary_df_data_list$grp_b
# A tibble: 2 × 7
  group_id row_id rowname group     num currency `::rowname::`
  <chr>    <chr>  <chr>   <dbl>   <dbl>    <dbl> <chr>        
1 grp_b    sum    sum        NA 9662550    1340. sum          
2 grp_b    mean   mean       NA 3220850     447. mean         

$summary_df_data_list$`::GRAND_SUMMARY`
# A tibble: 1 × 7
  group_id        row_id rowname group      num currency `::rowname::`
  <chr>           <chr>  <chr>   <dbl>    <dbl>    <dbl> <chr>        
1 ::GRAND_SUMMARY sum    sum        NA 9663030.   66509. sum

The returned list contains one data frame per row group (including a "::grand_summary" element for grand summaries if present). Each data frame includes the row names, group information, and the formatted summary values.

This is useful for:

Reusing summary calculations in other analyses
Creating custom visualizations of summary data
Validating that summaries are calculated correctly
Building composite tables from multiple sources

7.8 Practical examples

The techniques covered in this chapter come together most clearly in real-world scenarios. The following examples demonstrate how summary rows and columns can be combined to create tables that serve specific professional purposes. Each example illustrates different aspects of the summary system and shows how formatting and styling choices reinforce the table’s communicative goals.

7.8.1 Financial statement with subtotals

Financial statements are perhaps the most natural application for summary rows. Accountants and financial analysts expect to see subtotals for logical groupings (revenue, expenses, assets, liabilities) along with grand totals that tie everything together. The following example creates a simplified income statement that demonstrates these conventions.

dplyr::tibble(
  category = c(
    "Revenue", "Revenue", "Revenue",
    "Expenses", "Expenses", "Expenses", "Expenses"
  ),
  item = c(
    "Product Sales", "Service Revenue", "Licensing",
    "Cost of Goods", "Salaries", "Marketing", "Operations"
  ),
  amount = c(500000, 150000, 75000, -180000, -220000, -45000, -80000)
) |>
  gt(
    rowname_col = "item",
    groupname_col = "category"
  ) |>
  fmt_currency(columns = amount, currency = "USD", accounting = TRUE) |>
  summary_rows(
    fns = list(
      list(id = "subtotal", label = md("*Subtotal*")) ~ sum(.)
    ),
    fmt = ~ fmt_currency(., currency = "USD", accounting = TRUE)
  ) |>
  grand_summary_rows(
    fns = list(
      list(id = "net_income", label = md("**Net Income**")) ~ sum(.)
    ),
    fmt = ~ fmt_currency(., currency = "USD", accounting = TRUE)
  ) |>
  tab_style(
    locations = cells_summary(),
    style = cell_text(style = "italic")
  ) |>
  tab_style(
    locations = cells_grand_summary(),
    style = list(
      cell_fill(color = "lightgray"),
      cell_text(weight = "bold")
    )
  ) |>
  tab_header(
    title = "Income Statement",
    subtitle = "Fiscal Year 2024"
  )

	amount
Income Statement
Fiscal Year 2024
Revenue
Product Sales	$500,000.00
Service Revenue	$150,000.00
Licensing	$75,000.00
Subtotal	$725,000.00
Expenses
Cost of Goods	($180,000.00)
Salaries	($220,000.00)
Marketing	($45,000.00)
Operations	($80,000.00)
Subtotal	($525,000.00)
Net Income	$200,000.00

Notice how accounting notation (parentheses for negative values) is applied consistently to both the data rows and the summary rows. The subtotals use italic styling to distinguish them from line items, while the grand total (Net Income) receives bold text and a background fill to mark it as the bottom line. This visual hierarchy guides readers from detail to summary.

7.8.2 Sales report with multiple statistics

Business reports often need to show both detailed breakdowns and summary statistics. This example combines summary_columns() for row-wise totals and averages with grand_summary_rows() for column-wise company totals. The result is a table that answers multiple questions at once: How did each region perform? What were the quarterly trends? What’s the company-wide picture?

dplyr::tibble(
  region = rep(c("North", "South", "East", "West"), each = 3),
  quarter = rep(c("Q1", "Q2", "Q3"), 4),
  sales = c(
    120, 135, 142,
    98, 105, 118,
    156, 148, 162,
    88, 95, 102
  )
) |>
  tidyr::pivot_wider(names_from = quarter, values_from = sales) |>
  gt(rowname_col = "region") |>
  summary_columns(
    columns = c(Q1, Q2, Q3),
    fns = list(~ sum(.), ~ mean(.)),
    new_col_names = c("total", "avg"),
    new_col_labels = c("Total", "Avg"),
    fmt = ~ fmt_number(., decimals = 0)
  ) |>
  grand_summary_rows(
    columns = c(Q1, Q2, Q3, total, avg),
    fns = list(
      list(id = "company_total", label = "Company Total") ~ sum(.)
    ),
    fmt = ~ fmt_number(., decimals = 0)
  ) |>
  tab_spanner(label = "Quarters", columns = c(Q1, Q2, Q3)) |>
  tab_spanner(label = "Summary", columns = c(total, avg)) |>
  tab_header(title = "Regional Sales Report")

	Quarters			Summary
Regional Sales Report
	Q1	Q2	Q3	Total	Avg
North	120	135	142	397	132
South	98	105	118	321	107
East	156	148	162	466	155
West	88	95	102	285	95
Company Total	462	483	524	1,469	490

The use of column spanners ("Quarters" and "Summary") helps readers understand the table’s structure at a glance. The summary columns on the right provide immediate context for each region’s performance, while the grand summary row at the bottom gives the company-wide view. A manager scanning this table can quickly identify that East is the top-performing region and that sales have been trending upward across quarters.

7.8.3 Scientific data with statistical summaries

Scientific publications typically require summary statistics like means, standard deviations, and sample sizes for experimental data. This example shows how summary_rows() can present multiple statistics for each treatment group, formatted appropriately for each measure.

set.seed(23)

dplyr::tibble(
  treatment = rep(c("Control", "Treatment A", "Treatment B"), each = 4),
  replicate = rep(1:4, 3),
  response = c(
    rnorm(4, mean = 10, sd = 2),
    rnorm(4, mean = 15, sd = 2),
    rnorm(4, mean = 13, sd = 2)
  )
) |>
  gt(
    rowname_col = "replicate",
    groupname_col = "treatment"
  ) |>
  fmt_number(columns = response, decimals = 2) |>
  summary_rows(
    columns = response,
    fns = list(
      list(id = "mean", label = "Mean") ~ mean(.),
      list(id = "sd", label = "SD") ~ sd(.),
      list(id = "n", label = "n") ~ length(.)
    ),
    fmt = list(
      ~ fmt_number(., decimals = 2),
      ~ fmt_number(., decimals = 2),
      ~ fmt_integer(.)
    )
  ) |>
  tab_style(
    locations = cells_summary(),
    style = cell_fill(color = "lightyellow")
  ) |>
  tab_header(
    title = "Experimental Results",
    subtitle = "Response variable by treatment group"
  )

	response
Experimental Results
Response variable by treatment group
Control
1	10.39
2	9.13
3	11.83
4	13.59
Mean	11
SD	2
n	4
Treatment A
1	16.99
2	17.21
3	14.44
4	17.04
Mean	16
SD	1
n	4
Treatment B
1	13.09
2	16.15
3	13.44
4	10.91
Mean	13
SD	2
n	4

Each treatment group receives the same three summary statistics, formatted according to their nature: means and standard deviations show two decimal places, while sample sizes appear as integers. The light yellow background on summary rows distinguishes them from the raw data, making it easy for readers to find the statistics they need. This format is common in scientific papers and lab reports, where reviewers expect to see both the underlying data and its statistical summary.

These three examples illustrate the versatility of gt’s summary system. Whether you’re preparing financial reports, business analyses, or scientific publications, the same core functions adapt to meet domain-specific conventions and reader expectations.

7.9 Summary

The summary functions in gt provide a comprehensive system for adding aggregated data to your tables:

summary_rows() adds summary rows within each row group, useful for subtotals and group-level statistics
grand_summary_rows() adds summary rows that aggregate across the entire table, useful for overall totals and grand statistics
summary_columns() adds computed columns that aggregate across rows, useful for row totals and row-wise calculations
extract_summary() retrieves summary data for use outside the table

All three summary functions share a consistent interface with the fns argument for specifying aggregations, the fmt argument for formatting results, and the side argument for positioning. Summary elements can be styled and annotated using the standard gt functions with the cells_summary() and cells_grand_summary() location helpers.

By separating the calculation of summaries from their presentation, gt ensures that your tables remain reproducible and maintainable. When data changes, summaries update automatically. When you need different formatting, you can adjust it without recalculating. This approach embodies the gt philosophy: express your intent clearly, and let the package handle the details.