6 Modifying columns

While formatting functions operate on the values within cells, sometimes we need to modify the structure and arrangement of the columns themselves. The gt package provides a comprehensive set of cols_*() functions that allow us to relabel columns, set their widths, control alignment, reorder their positions, hide and reveal them, and even merge data from multiple columns into a single column. These operations are essential for transforming raw data into polished, presentation-ready tables.

In this chapter, we’ll explore each of these column modification functions in detail. We’ll see how cols_label() and cols_label_with() allow us to create human-readable column headers, how cols_width() and cols_align() give us control over the visual presentation, how the cols_move*() family lets us rearrange columns, how cols_hide() and cols_unhide() manage column visibility, and finally how the cols_merge*() functions allow us to combine data from multiple columns into single, information-rich cells.

6.1 Labels, widths, alignments

The column labels in a gt table serve as the reader’s guide to understanding the data. By default, gt uses the column names from your data frame, which are often abbreviated, in snake_case, or otherwise not ideal for presentation purposes. Similarly, the default column widths and alignments may not always be optimal for your particular data. This section covers the functions that give you fine-grained control over these presentational aspects.

6.1.1 `cols_label()`

The cols_label() function is your primary tool for assigning human-readable labels to columns. It accepts a set of column-label pairs, where you provide the column name (or use tidyselect) and the label you’d like to assign.

Here is the function’s signature:

cols_label(
 data,
 ...,
 .list = list2(...),
 .fn = NULL,
 .process_units = TRUE
)

Let’s create a table using a subset of the gtcars dataset and give the columns more descriptive labels:

gtcars |>
 dplyr::select(mfr, model, year, hp, mpg_c, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_label(
   mfr = "Manufacturer",
   model = "Model",
   year = "Year",
   hp = "Horsepower",
   mpg_c = "MPG (City)",
   msrp = "Price (USD)"
 )

Manufacturer	Model	Year	Horsepower	MPG (City)	Price (USD)
Ford	GT	2017	647	11	447000
Ferrari	458 Speciale	2015	597	13	291744
Ferrari	458 Spider	2015	562	13	263553
Ferrari	458 Italia	2014	562	13	233509
Ferrari	488 GTB	2016	661	15	245400

The resulting table is immediately more readable. Where mfr might leave readers guessing, “Manufacturer” communicates clearly. The transformation from mpg_c to “MPG (City)” not only spells out the abbreviation but also clarifies that this is the city fuel economy figure (as opposed to highway). These small touches make a substantial difference in how quickly readers can interpret the data.

You can also use the md() and html() helper functions to add formatting to your labels. This is particularly useful for adding units, subscripts, or other stylistic touches:

sp500 |>
 dplyr::slice(1:5) |>
 dplyr::select(date, open, high, low, close, volume) |>
 gt() |>
 cols_label(
   date = "Date",
   open = md("**Open**"),
   high = md("*High*"),
   low = md("*Low*"),
   close = md("**Close**"),
   volume = html("Volume<br><small>(shares)</small>")
 )

Date	Open	High	Low	Close	Volume (shares)
2015-12-31	2060.59	2062.54	2043.62	2043.94	2655330000
2015-12-30	2077.34	2077.34	2061.97	2063.36	2367430000
2015-12-29	2060.54	2081.56	2060.54	2078.36	2542000000
2015-12-28	2057.77	2057.77	2044.20	2056.50	2492510000
2015-12-24	2063.52	2067.36	2058.73	2060.99	1411860000

In this table, we’ve used Markdown formatting to make the “Open” and “Close” labels bold (arguably the most important prices for traders), while “High” and “Low” appear in italics. The volume label demonstrates the power of using raw HTML: the <br> creates a line break and <small> renders the unit clarification in a smaller font size. This kind of rich formatting in column labels can guide the reader’s attention and provide additional context without cluttering the data cells themselves.

For columns with measurement units, gt supports a convenient notation using curly braces. When .process_units = TRUE (the default), text within {...} is automatically formatted as units:

exibble |>
 dplyr::select(num, currency) |>
 dplyr::slice(1:4) |>
 gt() |>
 cols_label(
   num = "Value {{m/s^2}}",
   currency = "Amount {{USD}}"
 )

Value m/s²	Amount USD
0.1111	49.95
2.2220	17.95
33.3300	1.39
444.4000	65100.00

The units notation provides a clean, consistent way to display measurement units. Notice how m/s^2 is rendered with proper superscript formatting for the exponent (this happens automatically when using the double-curly-brace syntax). The result is scientifically accurate notation without requiring you to write HTML or Markdown.

6.1.2 `cols_label_with()`

When you need to apply a transformation function to multiple column labels at once, cols_label_with() is the more efficient approach. Rather than specifying each label individually, you provide a function that will be applied to the selected column names.

Here is the function’s signature:

cols_label_with(
 data,
 columns = everything(),
 fn
)

A common use case is converting snake_case column names to Title Case:

towny |>
 dplyr::select(name, population_2021, density_2021, land_area_km2) |>
 dplyr::slice_head(n = 5) |>
 gt() |>
 cols_label_with(
   columns = everything(),
   fn = ~ gsub("_", " ", .x) |> tools::toTitleCase()
 )

Name	Population 2021	Density 2021	Land Area Km2
Addington Highlands	2534	1.96	1293.99
Adelaide Metcalfe	3011	9.09	331.11
Adjala-Tosorontio	10989	29.58	371.53
Admaston/Bromley	2995	5.76	519.59
Ajax	126666	1900.75	66.64

The anonymous function (specified using the ~ syntax) performs two operations: first, gsub("_", " ", .x) replaces all underscores with spaces, and then tools::toTitleCase() capitalizes the first letter of each word. The result transforms population_2021 into “Population 2021” and land_area_km2 into “Land Area Km2”. This approach is far more efficient than manually specifying labels for each column, especially when dealing with datasets that have many columns following consistent naming conventions.

You can also be selective about which columns to transform. Here, we use starts_with() to only modify certain columns:

gtcars |>
 dplyr::select(mfr, model, hp, hp_rpm, trq, trq_rpm) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_label(
   mfr = "Manufacturer",
   model = "Model"
 ) |>
 cols_label_with(
   columns = starts_with(c("hp", "trq")),
   fn = ~ toupper(gsub("_", " ", .x))
 )

Manufacturer	Model	HP	HP RPM	TRQ	TRQ RPM
Ford	GT	647	6250	550	5900
Ferrari	458 Speciale	597	9000	398	6000
Ferrari	458 Spider	562	9000	398	6000
Ferrari	458 Italia	562	9000	398	6000
Ferrari	488 GTB	661	8000	561	3000

Here we combine two approaches: explicit labeling with cols_label() for the mfr and model columns (which need specific human-friendly names), and programmatic labeling with cols_label_with() for the performance-related columns. The starts_with() selector targets only columns beginning with “hp” or “trq”, converting them to uppercase with spaces. This results in labels like “HP”, “HP RPM”, “TRQ”, and “TRQ RPM”, which are appropriate for technical specifications where abbreviations are standard.

The cols_label_with() function is particularly powerful when combined with custom functions. For example, you could create a function that looks up labels from a data dictionary or applies domain-specific formatting rules.

6.1.3 `cols_width()`

The cols_width() function allows you to explicitly set the widths of one or more columns. This is especially useful when you want consistent column sizing across tables, when dealing with long text that might cause awkward wrapping, or when certain columns deserve more visual prominence.

Here is the function’s signature:

cols_width(
 data,
 ...,
 .list = list2(...)
)

The width values can be specified using the px() helper for pixel values or pct() for percentages. Let’s see a basic example:

exibble |>
 dplyr::select(char, fctr, date, time) |>
 dplyr::slice(1:4) |>
 gt() |>
 cols_width(
   char ~ px(150),
   fctr ~ px(100),
   date ~ px(120),
   time ~ px(100)
 )

char	fctr	date	time
apricot	one	2015-01-15	13:35
banana	two	2015-02-15	14:40
coconut	three	2015-03-15	15:45
durian	four	2015-04-15	16:50

By setting explicit pixel widths, we ensure that the columns maintain consistent sizing regardless of the content they contain. The char column receives the most space at 150 pixels since character data often varies in length, while fctr and time are narrower at 100 pixels. This explicit control prevents the table from looking unbalanced when some cells contain much more content than others.

Using percentage-based widths ensures that columns scale proportionally with the table’s overall width:

gtcars |>
 dplyr::select(mfr, model, year, hp, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_label(
   mfr = "Manufacturer",
   model = "Model",
   year = "Year",
   hp = "HP",
   msrp = "Price"
 ) |>
 cols_width(
   mfr ~ pct(20),
   model ~ pct(35),
   year ~ pct(10),
   hp ~ pct(10),
   msrp ~ pct(25)
 )

Manufacturer	Model	Year	HP	Price
Ford	GT	2017	647	447000
Ferrari	458 Speciale	2015	597	291744
Ferrari	458 Spider	2015	562	263553
Ferrari	458 Italia	2014	562	233509
Ferrari	488 GTB	2016	661	245400

With percentage-based widths totaling 100%, the columns will proportionally fill the available table width. The model column receives the largest share (35%) since car model names tend to be longer, while year and hp get just 10% each since they contain short numeric values. This approach is particularly valuable for responsive layouts where the table width may vary depending on the viewing context.

You can use tidyselect helpers to apply widths to multiple columns at once:

towny |>
 dplyr::select(name, starts_with("population"), starts_with("density")) |>
 dplyr::slice_head(n = 5) |>
 gt() |>
 cols_width(
   name ~ px(200),
   starts_with("population") ~ px(120),
   starts_with("density") ~ px(100)
 )

name	population_1996	population_2001	population_2006	population_2011	population_2016	population_2021	density_1996	density_2001	density_2006	density_2011	density_2016	density_2021
Addington Highlands	2429	2402	2512	2517	2318	2534	1.88	1.86	1.94	1.95	1.79	1.96
Adelaide Metcalfe	3128	3149	3135	3028	2990	3011	9.45	9.51	9.47	9.14	9.03	9.09
Adjala-Tosorontio	9359	10082	10695	10603	10975	10989	25.19	27.14	28.79	28.54	29.54	29.58
Admaston/Bromley	2837	2824	2716	2844	2935	2995	5.46	5.44	5.23	5.47	5.65	5.76
Ajax	64430	73753	90167	109600	119677	126666	966.84	1106.74	1353.05	1644.66	1795.87	1900.75

The starts_with() helper elegantly handles all population columns and all density columns with single rules, assigning 120 pixels to each population column and 100 pixels to each density column. This is considerably more maintainable than specifying widths for each column individually, and it automatically accommodates any new columns that might be added to the dataset later (provided they follow the same naming convention).

6.1.4 `cols_align()`

By default, gt automatically aligns columns based on their data types: numeric columns are right-aligned and text columns are left-aligned. The cols_align() function lets you override this behavior when needed.

Here is the function’s signature:

cols_align(
 data,
 align = c("auto", "left", "center", "right"),
 columns = everything()
)

Let’s see how to center-align specific columns:

exibble |>
 dplyr::select(char, num, currency, date) |>
 dplyr::slice(1:4) |>
 gt() |>
 cols_align(
   align = "center",
   columns = c(char, date)
 )

char	num	currency	date
apricot	0.1111	49.95	2015-01-15
banana	2.2220	17.95	2015-02-15
coconut	33.3300	1.39	2015-03-15
durian	444.4000	65100.00	2015-04-15

In this table, the char and date columns are now center-aligned, while num and currency retain their default right-alignment. Notice how the center alignment affects only the body cells; the column labels remain in their default positions. Centering text columns can create a more balanced visual appearance, though it may reduce readability for longer text values (so use this technique judiciously).

You might want to center all columns for a more symmetrical appearance, particularly in summary tables:

gtcars |>
 dplyr::count(mfr, name = "n_models") |>
 dplyr::arrange(desc(n_models)) |>
 dplyr::slice_head(n = 8) |>
 gt() |>
 cols_label(
   mfr = "Manufacturer",
   n_models = "Number of Models"
 ) |>
 cols_align(
   align = "center",
   columns = everything()
 )

Manufacturer	Number of Models
Ferrari	9
Audi	5
BMW	5
Aston Martin	4
Porsche	4
Lamborghini	3
Maserati	3
Mercedes-Benz	2

For this summary table showing model counts by manufacturer, center alignment creates a clean, symmetrical presentation. Since the table has only two columns and relatively short values in each, center alignment works well. The everything() selector applies the alignment to all columns in one call, keeping the code concise.

For numeric values that benefit from decimal alignment, gt offers cols_align_decimal(). This function aligns values along their decimal points, making it easier to compare values across rows:

dplyr::tibble(
 item = c("Widget A", "Widget B", "Widget C", "Total"),
 price = c(1.5, 24.99, 199.00, 225.49),
 quantity = c(100, 50, 5, 155)
) |>
 gt() |>
 cols_align_decimal(columns = price)

item	price	quantity
Widget A	1.50	100
Widget B	24.99	50
Widget C	199.00	5
Total	225.49	155

The price column now has its values aligned on the decimal point, making it easy to compare prices at a glance. Without decimal alignment, the varying number of digits before the decimal would make visual comparison more difficult. This is especially valuable in financial tables, scientific data, or any context where precise numeric comparison matters.

6.2 Moving columns

The order in which columns appear can significantly impact how readers interpret and navigate a table. While you could reorder columns in your source data before passing it to gt(), the cols_move*() functions provide a convenient way to rearrange columns as part of your table-building pipeline. This is especially useful when the order that makes sense for analysis differs from the order that works best for presentation.

6.2.1 `cols_move_to_start()`

The cols_move_to_start() function moves one or more columns to the beginning of the column series (i.e., to the left side of the table).

Here is the function’s signature:

cols_move_to_start(
 data,
 columns
)

Let’s say we have a dataset where we want to emphasize certain columns by placing them first:

gtcars |>
 dplyr::select(mfr, model, year, bdy_style, hp, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_move_to_start(columns = msrp)

msrp	mfr	model	year	bdy_style	hp
447000	Ford	GT	2017	coupe	647
291744	Ferrari	458 Speciale	2015	coupe	597
263553	Ferrari	458 Spider	2015	convertible	562
233509	Ferrari	458 Italia	2014	coupe	562
245400	Ferrari	488 GTB	2016	coupe	661

The msrp column (which originally appeared last in our selection) now leads the table. If price is the primary point of interest for our readers, placing it first ensures they see that information immediately. The remaining columns maintain their relative order, simply shifted one position to the right.

You can move multiple columns at once. They will appear in the order you specify:

exibble |>
 dplyr::select(-row, -group) |>
 dplyr::slice(1:4) |>
 gt() |>
 cols_move_to_start(columns = c(date, time, datetime))

date	time	datetime	num	char	fctr	currency
2015-01-15	13:35	2018-01-01 02:22	0.1111	apricot	one	49.95
2015-02-15	14:40	2018-02-02 14:33	2.2220	banana	two	17.95
2015-03-15	15:45	2018-03-03 03:44	33.3300	coconut	three	1.39
2015-04-15	16:50	2018-04-04 15:55	444.4000	durian	four	65100.00

All three temporal columns now appear at the beginning of the table, in the order we specified: date, then time, then datetime. This reorganization groups related information together, which is often more logical for the reader than the arbitrary order that might result from the original data structure.

6.2.2 `cols_move_to_end()`

Conversely, cols_move_to_end() moves columns to the end of the column series (i.e., to the right side of the table). This is often useful for placing less important columns or summary columns at the end.

Here is the function’s signature:

cols_move_to_end(
 data,
 columns
)

In the following example, we move the identifying columns to the end so that the numeric data appears first:

sp500 |>
 dplyr::select(date, open, high, low, close, volume) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_move_to_end(columns = date)

open	high	low	close	volume	date
2060.59	2062.54	2043.62	2043.94	2655330000	2015-12-31
2077.34	2077.34	2061.97	2063.36	2367430000	2015-12-30
2060.54	2081.56	2060.54	2078.36	2542000000	2015-12-29
2057.77	2057.77	2044.20	2056.50	2492510000	2015-12-28
2063.52	2067.36	2058.73	2060.99	1411860000	2015-12-24

By moving date to the end, the numeric trading data (open, high, low, close, volume) takes center stage. This arrangement might be preferred when the focus is on price movements rather than the specific dates (perhaps in a context where readers already know the time period being examined).

Multiple columns can be moved together:

gtcars |>
 dplyr::select(mfr, model, year, hp, trq, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_move_to_end(columns = c(mfr, model))

year	hp	trq	msrp	mfr	model
2017	647	550	447000	Ford	GT
2015	597	398	291744	Ferrari	458 Speciale
2015	562	398	263553	Ferrari	458 Spider
2014	562	398	233509	Ferrari	458 Italia
2016	661	561	245400	Ferrari	488 GTB

Now the numeric specifications (year, hp, trq, msrp) appear first, followed by the identifying information (mfr, model). This layout prioritizes the data over the labels, a presentation choice that might suit analytical contexts where readers are comparing specifications across vehicles.

6.2.3 `cols_move()`

For more precise control over column placement, cols_move() allows you to position columns after a specified reference column.

Here is the function’s signature:

cols_move(
 data,
 columns,
 after
)

This function is particularly useful when you want to group related columns together:

gtcars |>
 dplyr::select(mfr, model, year, hp, hp_rpm, trq, trq_rpm, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_move(
   columns = msrp,
   after = model
 )

mfr	model	msrp	year	hp	hp_rpm	trq	trq_rpm
Ford	GT	447000	2017	647	6250	550	5900
Ferrari	458 Speciale	291744	2015	597	9000	398	6000
Ferrari	458 Spider	263553	2015	562	9000	398	6000
Ferrari	458 Italia	233509	2014	562	9000	398	6000
Ferrari	488 GTB	245400	2016	661	8000	561	3000

The msrp column is now positioned immediately after model, giving readers the price information right alongside the vehicle identification. The after argument specifies the reference column; the moved column(s) will appear directly to its right. This flexibility is valuable when you need to interleave columns rather than simply push them to the start or end.

Here’s another example where we reorder columns to group performance metrics together:

gtcars |>
 dplyr::select(mfr, model, year, mpg_c, mpg_h, hp, trq) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_move(
   columns = c(hp, trq),
   after = year
 )

mfr	model	year	hp	trq	mpg_c	mpg_h
Ford	GT	2017	647	550	11	18
Ferrari	458 Speciale	2015	597	398	13	17
Ferrari	458 Spider	2015	562	398	13	17
Ferrari	458 Italia	2014	562	398	13	17
Ferrari	488 GTB	2016	661	561	15	22

Both hp and trq have been moved to follow year, grouping the power-related metrics together. The fuel economy columns (mpg_c, mpg_h) now appear at the end. By thoughtfully arranging columns, you help readers understand the logical structure of your data. In this case, suggesting a grouping of identity information, power metrics, and efficiency metrics.

Using tidyselect helpers with cols_move() can be quite powerful for organizing tables with many columns:

towny |>
 dplyr::select(name, latitude, longitude, starts_with("population"), starts_with("density")) |>
 dplyr::slice_head(n = 5) |>
 gt() |>
 cols_move(
   columns = starts_with("density"),
   after = name
 )

name	density_1996	density_2001	density_2006	density_2011	density_2016	density_2021	latitude	longitude	population_1996	population_2001	population_2006	population_2011	population_2016	population_2021
Addington Highlands	1.88	1.86	1.94	1.95	1.79	1.96	45.00000	-77.25000	2429	2402	2512	2517	2318	2534
Adelaide Metcalfe	9.45	9.51	9.47	9.14	9.03	9.09	42.95000	-81.70000	3128	3149	3135	3028	2990	3011
Adjala-Tosorontio	25.19	27.14	28.79	28.54	29.54	29.58	44.13333	-79.93333	9359	10082	10695	10603	10975	10989
Admaston/Bromley	5.46	5.44	5.23	5.47	5.65	5.76	45.52917	-76.89694	2837	2824	2716	2844	2935	2995
Ajax	966.84	1106.74	1353.05	1644.66	1795.87	1900.75	43.85833	-79.03639	64430	73753	90167	109600	119677	126666

The starts_with("density") selector captures all density-related columns and positions them immediately after name. The original column order placed population columns before density columns, but now density comes first. When working with wide datasets that have many similarly-named columns, tidyselect helpers make reorganization straightforward and your code more resilient to future changes in the underlying data.

6.3 Hiding columns

There are situations where you want to retain columns in your data for reference or for use in calculations, but you don’t want them to appear in the rendered table. The cols_hide() function makes columns invisible while keeping them available for other gt operations. This is particularly useful when you need a column’s data for conditional formatting, merging operations, or calculations, but displaying that column would clutter the table.

6.3.1 `cols_hide()`

The cols_hide() function takes one or more columns and hides them from the table display. The columns remain part of the underlying gt object and can still be referenced in other function calls.

Here is the function’s signature:

cols_hide(
 data,
 columns
)

Let’s create a table where we use a column for sorting but don’t want to display it:

gtcars |>
 dplyr::select(mfr, model, year, hp, msrp) |>
 dplyr::arrange(desc(hp)) |>
 dplyr::slice(1:8) |>
 gt() |>
 cols_hide(columns = year)

mfr	model	hp	msrp
Ferrari	LaFerrari	949	1416362
Ferrari	F12Berlinetta	731	319995
Lamborghini	Aventador	700	397500
Ferrari	GTC4Lusso	680	298000
Ferrari	488 GTB	661	245400
Ferrari	FF	652	295000
Chevrolet	Corvette	650	88345
Ford	GT	647	447000

The table displays the top 8 cars by horsepower, but the year column is no longer visible. We retain the data in our pipeline (and could use it for other operations like conditional formatting), but it doesn’t appear in the rendered output. This is cleaner than removing the column entirely from the source data, especially when the column might be needed for other purposes.

A common use case is hiding columns that have been merged into other columns. When you use cols_merge() (discussed later), the source columns remain in the table unless explicitly hidden:

exibble |>
 dplyr::select(char, fctr, num, currency) |>
 dplyr::slice(1:4) |>
 gt() |>
 cols_hide(columns = c(fctr, currency))

char	num
apricot	0.1111
banana	2.2220
coconut	33.3300
durian	444.4000

Only char and num remain visible; fctr and currency are hidden. The data is still present in the gt object, but it simply isn’t rendered. This technique is particularly useful when you want to keep auxiliary data available for reference while presenting a simplified view to the reader.

You can use tidyselect helpers to hide multiple related columns:

gtcars |>
 dplyr::select(mfr, model, year, hp, hp_rpm, trq, trq_rpm, mpg_c, mpg_h) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_hide(columns = ends_with("_rpm"))

mfr	model	year	hp	trq	mpg_c	mpg_h
Ford	GT	2017	647	550	11	18
Ferrari	458 Speciale	2015	597	398	13	17
Ferrari	458 Spider	2015	562	398	13	17
Ferrari	458 Italia	2014	562	398	13	17
Ferrari	488 GTB	2016	661	561	15	22

The ends_with("_rpm") selector hides both hp_rpm and trq_rpm in a single statement. The table shows horsepower and torque values without the RPM figures (perhaps the RPM data is too technical for a general audience, or it’s being reserved for a more detailed companion table). Either way, the data remains accessible within the gt object should you need it.

6.3.2 `cols_unhide()`

The cols_unhide() function reverses the hiding operation, making previously hidden columns visible again. This is useful in programmatic contexts where you might conditionally show or hide columns, or when you’ve inherited a gt table object and want to reveal certain columns.

Here is the function’s signature:

cols_unhide(
 data,
 columns
)

Let’s demonstrate a workflow where we hide and then selectively unhide columns:

gtcars |>
 dplyr::select(mfr, model, year, hp, trq, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_hide(columns = c(year, hp, trq)) |>
 cols_unhide(columns = hp)

mfr	model	hp	msrp
Ford	GT	647	447000
Ferrari	458 Speciale	597	291744
Ferrari	458 Spider	562	263553
Ferrari	458 Italia	562	233509
Ferrari	488 GTB	661	245400

We first hide three columns (year, hp, trq), then selectively reveal hp. The result is a table showing mfr, model, hp, and msrp (with year and trq remaining hidden). This pattern demonstrates the composability of gt operations: you can hide broadly and then unhide specifically, which can be useful when building tables programmatically or when adapting templates for different contexts.

The cols_unhide() function is particularly valuable when working with functions that return gt tables. You can receive a table, inspect which columns are hidden, and then unhide specific ones as needed. This follows the general gt philosophy that table construction operations should be reversible and composable.

6.4 Merging column data together

One of the more powerful column modification capabilities in gt is the ability to merge data from multiple columns into a single column. This is achieved through the cols_merge*() family of functions. Merging is useful when you want to present related information together in a compact format (for example, combining a value with its uncertainty, showing a range from two values, or displaying a count alongside its percentage).

It’s important to understand that when columns are merged, the data from the source columns is combined into a target column. The source columns are not automatically hidden; you typically want to use cols_hide() afterward if you don’t want the source columns to remain visible. However, some of the specialized merge functions handle this automatically.

6.4.1 `cols_merge()`

The cols_merge() function provides a flexible way to combine values from multiple columns using a pattern string. The pattern uses column indices (e.g., {1}, {2}) to reference the columns being merged.

Here is the function’s signature:

cols_merge(
 data,
 columns,
 rows = everything(),
 hide_columns = columns[-1],
 pattern = NULL,
 sep = " "
)

Let’s combine manufacturer and model into a single column:

gtcars |>
 dplyr::select(mfr, model, year, hp, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_merge(
   columns = c(mfr, model),
   pattern = "{1} {2}"
 ) |>
 cols_label(mfr = "Vehicle")

Vehicle	year	hp	msrp
Ford GT	2017	647	447000
Ferrari 458 Speciale	2015	597	291744
Ferrari 458 Spider	2015	562	263553
Ferrari 458 Italia	2014	562	233509
Ferrari 488 GTB	2016	661	245400

The manufacturer and model names are now combined into a single “Vehicle” column, with values like “Ford GT” and “Ferrari 458 Speciale”. The {1} and {2} placeholders in the pattern refer to the first and second columns specified, respectively. Notice that the model column has been automatically hidden (the default behavior), leaving us with a cleaner four-column table.

The pattern string is quite flexible. You can include any literal text between the placeholders:

gtcars |>
 dplyr::select(mfr, model, year, hp, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_merge(
   columns = c(model, year),
   pattern = "{1} ({2})"
 ) |>
 cols_label(model = "Model (Year)")

mfr	Model (Year)	hp	msrp
Ford	GT (2017)	647	447000
Ferrari	458 Speciale (2015)	597	291744
Ferrari	458 Spider (2015)	562	263553
Ferrari	458 Italia (2014)	562	233509
Ferrari	488 GTB (2016)	661	245400

Now each cell in the model column contains the model name followed by the year in parentheses, such as "GT (2017)" or "458 Speciale (2015)". The parentheses are literal characters in the pattern, demonstrating how you can embed any formatting you need. This is a common convention for displaying names with supplementary information.

You can merge more than two columns:

gtcars |>
 dplyr::select(mfr, model, year, hp, msrp) |>
 dplyr::slice(1:5) |>
 gt() |>
 cols_merge(
   columns = c(mfr, model, year),
   pattern = "{1} {2}, {3}"
 ) |>
 cols_label(mfr = "Vehicle Information")

Vehicle Information	hp	msrp
Ford GT, 2017	647	447000
Ferrari 458 Speciale, 2015	597	291744
Ferrari 458 Spider, 2015	562	263553
Ferrari 458 Italia, 2014	562	233509
Ferrari 488 GTB, 2016	661	245400

All three identifying pieces of information (manufacturer, model, and year) are now combined into a single column, producing entries like “Ford GT, 2017”. This aggressive consolidation reduces the table from five columns to three, creating a more compact presentation when the identifying information is secondary to the numeric data.

Note that by default, cols_merge() hides all columns except the first one specified in columns. If you want to retain visibility of certain columns, you can adjust the hide_columns argument.

6.4.2 `cols_merge_uncert()`

When presenting scientific or statistical data, it’s common to show a value along with its uncertainty (e.g., error bounds). The cols_merge_uncert() function is specifically designed for this purpose, creating nicely formatted value ± uncertainty presentations.

Here is the function’s signature:

cols_merge_uncert(
 data,
 col_val,
 col_uncert,
 rows = everything(),
 sep = " +/- ",
 autohide = TRUE
)

Let’s create a dataset with values and their uncertainties:

measurements <- dplyr::tibble(
 sample = c("A", "B", "C", "D"),
 value = c(12.4, 8.7, 15.2, 10.1),
 uncertainty = c(0.3, 0.2, 0.5, 0.4)
)

measurements |>
 gt() |>
 cols_merge_uncert(
   col_val = value,
   col_uncert = uncertainty
 ) |>
 cols_label(value = "Measurement")

sample	Measurement
A	12.4 ± 0.3
B	8.7 ± 0.2
C	15.2 ± 0.5
D	10.1 ± 0.4

Each measurement now displays with its associated uncertainty using the conventional ± notation: “12.4 ± 0.3”, “8.7 ± 0.2”, and so on. This format is standard in scientific literature and immediately communicates both the measured value and its precision. The uncertainty column has been automatically hidden thanks to the autohide = TRUE default.

The function uses ± as the default separator, which is standard notation for uncertainties. By default, autohide = TRUE means the uncertainty column is automatically hidden after merging.

You can also use asymmetric uncertainties by providing two uncertainty columns in col_uncert:

asymmetric_data <- dplyr::tibble(
 experiment = c("Trial 1", "Trial 2", "Trial 3"),
 result = c(50.2, 48.9, 51.5),
 lower_bound = c(2.1, 1.8, 2.5),
 upper_bound = c(3.2, 2.4, 3.1)
)

asymmetric_data |>
 gt() |>
 cols_merge_uncert(
   col_val = result,
   col_uncert = c(lower_bound, upper_bound)
 ) |>
 cols_label(result = "Result")

experiment	Result
Trial 1	50.2+3.2 −2.1
Trial 2	48.9+2.4 −1.8
Trial 3	51.5+3.1 −2.5

When uncertainties differ above and below the measured value, providing two uncertainty columns produces asymmetric notation like “50.2 (+3.2/−2.1)”. This format accurately represents situations where error bounds are not symmetric (common in fields like particle physics, astrophysics, and certain statistical analyses where confidence intervals are naturally asymmetric).

6.4.3 `cols_merge_range()`

The cols_merge_range() function combines two columns that represent the bounds of a range (e.g., minimum and maximum values) into a single column with appropriate range notation.

Here is the function’s signature:

cols_merge_range(
 data,
 col_begin,
 col_end,
 rows = everything(),
 sep = NULL,
 autohide = TRUE,
 locale = NULL
)

This function is exceedingly useful for displaying ranges like price ranges, temperature ranges, or date ranges:

price_ranges <- dplyr::tibble(
  product = c("Basic", "Standard", "Premium", "Enterprise"),
  min_price = c(9.99, 29.99, 79.99, 199.99),
  max_price = c(19.99, 49.99, 149.99, 499.99)
)

price_ranges |>
  gt() |>
  cols_merge_range(
    col_begin = min_price,
    col_end = max_price
  ) |>
  cols_label(min_price = "Price Range") |>
  fmt_currency(columns = min_price, decimals = 2)

product	Price Range
Basic	$9.99–19.99
Standard	$29.99–49.99
Premium	$79.99–149.99
Enterprise	$199.99–499.99

Each product tier now shows its price range in a single column with values like “$9.99–$19.99”. The en-dash (–) separator is the typographically correct character for ranges, and cols_merge_range() uses it by default. By applying fmt_currency() to the target column, both the minimum and maximum values in each cell receive currency formatting.

Note that when you apply formatting functions like fmt_currency(), you should apply them to the target column (the first column in the merge operation) for the formatting to take effect on the merged result.

Here’s another example using the gtcars dataset to show horsepower ranges by manufacturer:

gtcars |>
 dplyr::group_by(mfr) |>
 dplyr::summarize(
   min_hp = min(hp),
   max_hp = max(hp),
   n_models = n(),
   .groups = "drop"
 ) |>
 dplyr::slice_head(n = 8) |>
 gt() |>
 cols_merge_range(
   col_begin = min_hp,
   col_end = max_hp
 ) |>
 cols_label(
   mfr = "Manufacturer",
   min_hp = "Horsepower Range",
   n_models = "Models"
 )

Manufacturer	Horsepower Range	Models
Acura	573–573	1
Aston Martin	430–608	4
Audi	430–560	5
BMW	315–560	5
Bentley	500–500	1
Chevrolet	650–650	1
Dodge	645–645	1
Ferrari	553–949	9

This summary table shows the range of horsepower values for each manufacturer’s lineup, presented as entries like “245–647” for Audi or “562–691” for Ferrari. At a glance, readers can see which manufacturers offer broader or narrower performance ranges. The compact range notation conveys the same information as two separate columns but in a more digestible format.

6.4.4 `cols_merge_n_pct()`

A very common table pattern is showing counts alongside their percentages. The cols_merge_n_pct() function automates this, creating entries like “150 (25.0%)” from separate count and percentage columns.

Here is the function’s signature:

cols_merge_n_pct(
 data,
 col_n,
 col_pct,
 rows = everything(),
 autohide = TRUE
)

Let’s create a frequency table showing counts and percentages:

gtcars |>
 dplyr::count(bdy_style, name = "n") |>
 dplyr::mutate(pct = n / sum(n) * 100) |>
 gt() |>
 cols_merge_n_pct(
   col_n = n,
   col_pct = pct
 ) |>
 cols_label(
   bdy_style = "Body Style",
   n = "Count (Percentage)"
 )

Body Style	Count (Percentage)
convertible	5 (10.638298)
coupe	32 (68.085106)
hatchback	2 (4.255319)
sedan	8 (17.021277)

The table now shows entries like “coupe: 18 (34.62%)” and “sedan: 12 (23.08%)”, communicating both the raw count and its proportion of the total. This dual presentation helps readers understand both the absolute frequency and relative importance of each category. The function automatically formats the percentage and wraps it in parentheses, following the conventional notation used in statistical and research publications.

This function is particularly useful for survey results, categorical summaries, and any situation where you want to show both absolute and relative frequencies:

towny |>
 dplyr::mutate(
   size_category = dplyr::case_when(
     population_2021 >= 100000 ~ "Large",
     population_2021 >= 20000 ~ "Medium",
     TRUE ~ "Small"
   )
 ) |>
 dplyr::count(size_category, name = "count") |>
 dplyr::mutate(percentage = count / sum(count) * 100) |>
 dplyr::arrange(desc(count)) |>
 gt() |>
 cols_merge_n_pct(
   col_n = count,
   col_pct = percentage
 ) |>
 cols_label(
   size_category = "Municipality Size",
   count = "Count (Pct)"
 )

Municipality Size	Count (Pct)
Small	325 (78.502415)
Medium	61 (14.734300)
Large	28 (6.763285)

This categorical summary of Ontario municipalities shows how many fall into each size category along with the percentage. We can immediately see that “Small” municipalities dominate the dataset while “Large” municipalities are relatively rare. The merged format prevents the table from becoming unwieldy with separate count and percentage columns, while still conveying the complete picture.

The cols_merge_n_pct() function handles the formatting of the percentage values and applies appropriate parentheses automatically, creating a clean and professional appearance for your frequency tables.

6.5 Adding columns and rows

While the previous sections focused on modifying existing columns, gt also provides functions to add entirely new columns and rows directly within the table-building pipeline. This allows you to perform calculations and add data without returning to dplyr or modifying your source data frame.

6.5.1 `cols_add()`

The cols_add() function works similarly to dplyr::mutate(), allowing you to create new columns based on expressions involving existing columns. This is particularly useful for calculated fields, unit conversions, or derived values.

Here is the function’s signature:

cols_add(
  .data,
  ...,
  .before = NULL,
  .after = NULL
)

Let’s create a table that calculates population density from existing columns:

towny |>
  dplyr::select(name, population_2021, land_area_km2) |>
  dplyr::filter(population_2021 > 100000) |>
  dplyr::slice_max(population_2021, n = 8) |>
  gt() |>
  cols_add(
    density = population_2021 / land_area_km2
  ) |>
  fmt_integer(columns = population_2021) |>
  fmt_number(columns = c(land_area_km2, density), decimals = 1) |>
  cols_label(
    name = "Municipality",
    population_2021 = "Population",
    land_area_km2 = "Area (km²)",
    density = "Density (per km²)"
  )

Municipality	Population	Area (km²)	Density (per km²)
Toronto	2,794,356	631.1	4,427.8
Ottawa	1,017,449	2,788.2	364.9
Mississauga	717,961	292.7	2,452.6
Brampton	656,480	265.9	2,469.0
Hamilton	569,353	1,118.3	509.1
London	422,324	420.5	1,004.3
Markham	338,503	210.9	1,604.8
Vaughan	323,103	272.4	1,186.0

The density column is calculated directly in the gt pipeline, dividing population by land area. This keeps the calculation close to where it’s displayed, making the table code more self-contained and easier to understand.

You can add multiple columns in a single call, and use the .before or .after arguments to position them:

gtcars |>
  dplyr::select(mfr, model, hp, mpg_c, mpg_h) |>
  dplyr::slice(1:6) |>
  gt() |>
  cols_add(
    mpg_avg = (mpg_c + mpg_h) / 2,
    hp_per_mpg = hp / mpg_avg,
    .after = mpg_h
  ) |>
  fmt_number(columns = c(mpg_avg, hp_per_mpg), decimals = 1) |>
  cols_label(
    mfr = "Make",
    model = "Model",
    hp = "HP",
    mpg_c = "City MPG",
    mpg_h = "Hwy MPG",
    mpg_avg = "Avg MPG",
    hp_per_mpg = "HP/MPG"
  )

Make	Model	HP	City MPG	Hwy MPG	Avg MPG	HP/MPG
Ford	GT	647	11	18	14.5	44.6
Ferrari	458 Speciale	597	13	17	15.0	39.8
Ferrari	458 Spider	562	13	17	15.0	37.5
Ferrari	458 Italia	562	13	17	15.0	37.5
Ferrari	488 GTB	661	15	22	18.5	35.7
Ferrari	California	553	16	23	19.5	28.4

Here we calculate an average MPG from city and highway figures, then compute a horsepower-to-efficiency ratio. Both new columns are placed after the highway MPG column using .after = mpg_h.

The cols_add() function is also useful for adding constant values or flags:

exibble |>
  dplyr::select(char, num, currency) |>
  dplyr::slice(1:4) |>
  gt() |>
  cols_add(
    status = c("Active", "Pending", "Active", "Closed"),
    reviewed = TRUE,
    .before = char
  ) |>
  cols_label(
    status = "Status",
    reviewed = "Reviewed",
    char = "Item",
    num = "Value",
    currency = "Amount"
  ) |>
  fmt_tf(columns = reviewed, tf_style = "check-mark")

Status	Reviewed	Item	Value	Amount
Active	✔	apricot	0.1111	49.95
Pending	✔	banana	2.2220	17.95
Active	✔	coconut	33.3300	1.39
Closed	✔	durian	444.4000	65100.00

You can even start with an empty table and build it up entirely through cols_add():

dplyr::tibble() |>
  gt() |>
  cols_add(
    product = c("Widget", "Gadget", "Sprocket"),
    price = c(29.99, 49.99, 19.99),
    quantity = c(100, 50, 200)
  ) |>
  cols_add(
    total = price * quantity
  ) |>
  fmt_currency(columns = c(price, total)) |>
  fmt_integer(columns = quantity)

product	price	quantity	total
Widget	$29.99	100	$2,999.00
Gadget	$49.99	50	$2,499.50
Sprocket	$19.99	200	$3,998.00

This approach can be useful when building tables programmatically or when the table structure itself is part of the computation.

6.5.2 `rows_add()`

Just as cols_add() adds columns, rows_add() adds new rows to your table. This is useful for inserting summary rows, spacer rows, or additional data points directly in the gt pipeline.

Here is the function’s signature:

rows_add(
  .data,
  ...,
  .list = list2(...),
  .before = NULL,
  .after = NULL,
  .n_empty = NULL
)

Let’s add a row to a simple table:

dplyr::tibble(
  item = c("Apples", "Oranges", "Bananas"),
  quantity = c(50, 30, 45),
  price = c(1.20, 0.95, 0.65)
) |>
  gt() |>
  rows_add(
    item = "Grapes",
    quantity = 25,
    price = 2.50
  ) |>
  fmt_currency(columns = price) |>
  fmt_integer(columns = quantity)

item	quantity	price
Apples	50	$1.20
Oranges	30	$0.95
Bananas	45	$0.65
Grapes	25	$2.50

The new row is added at the bottom by default. You can use .before or .after to position it elsewhere:

dplyr::tibble(
  rank = 1:3,
  name = c("Alice", "Bob", "Carol"),
  score = c(95, 87, 82)
) |>
  gt() |>
  rows_add(
    rank = 2,
    name = "New Entry",
    score = 90,
    .after = 1
  ) |>
  fmt_integer()

rank	name	score
1	Alice	95
2	New Entry	90
2	Bob	87
3	Carol	82

The new row is inserted after row 1, pushing the original rows 2 and 3 down. Note that you’ll need to handle any renumbering or reordering logic yourself; rows_add() simply inserts the data.

You can add multiple rows at once by providing vectors:

dplyr::tibble(
  category = c("Electronics"),
  sales = c(15000)
) |>
  gt() |>
  rows_add(
    category = c("Clothing", "Food", "Books"),
    sales = c(8500, 12000, 3200)
  ) |>
  fmt_currency(columns = sales, currency = "USD", decimals = 0) |>
  cols_label(
    category = "Category",
    sales = "Q4 Sales"
  )

Category	Q4 Sales
Electronics	$15,000
Clothing	$8,500
Food	$12,000
Books	$3,200

The .n_empty argument lets you add blank rows, which can be useful for visual spacing:

dplyr::tibble(
  section = c("Introduction", "Methods"),
  page = c(1, 15)
) |>
  gt() |>
  rows_add(
    section = "Results",
    page = 32,
    .after = 2
  ) |>
  rows_add(
    .n_empty = 1,
    .after = 2
  ) |>
  sub_missing(missing_text = "") |>
  cols_label(
    section = "Section",
    page = "Page"
  )

Section	Page
Introduction	1
Methods	15

Results	32

Like cols_add(), you can build a table from scratch using rows_add():

dplyr::tibble(
  time = lubridate::POSIXct(),
  event = character(0)
) |>
  gt() |>
  rows_add(
    time = lubridate::ymd_hms("2024-01-15 09:30:00"),
    event = "Meeting started"
  ) |>
  rows_add(
    time = lubridate::ymd_hms("2024-01-15 10:45:00"),
    event = "Break"
  ) |>
  rows_add(
    time = lubridate::ymd_hms("2024-01-15 11:00:00"),
    event = "Meeting resumed"
  ) |>
  fmt_datetime(columns = time, date_style = "yMd", time_style = "Hm")

time	event
1/15/2024 09:30	Meeting started
1/15/2024 10:45	Break
1/15/2024 11:00	Meeting resumed

This pattern of starting with an empty structure and incrementally adding rows is useful for building tables programmatically, such as logging events or accumulating results in an interactive application.

Both cols_add() and rows_add() integrate seamlessly with other gt functions. You can format, style, label, and otherwise manipulate the added columns and rows just like any other part of the table. This makes them powerful tools for keeping your table logic contained within the gt pipeline rather than scattered across data manipulation steps.

6.6 Summary

This chapter has covered the comprehensive set of functions that modify table structure at the column and row level. These functions transform raw data into polished, presentation-ready table elements.

The key capabilities we’ve explored:

column labels: cols_label() assigns human-readable labels, with support for Markdown, HTML, and units notation. cols_label_with() applies transformations programmatically across multiple columns.
column dimensions: cols_width() controls column widths using pixels, percentages, or content-based sizing. cols_align() and cols_align_decimal() ensure proper alignment for readability.
column arrangement: the cols_move*() family repositions columns, while cols_hide() and cols_unhide() control visibility without removing data.
column merging: cols_merge() combines multiple columns using flexible patterns. Specialized variants like cols_merge_uncert(), cols_merge_range(), and cols_merge_n_pct() handle common merge scenarios with appropriate formatting.
adding columns and rows: cols_add() creates new columns from expressions, enabling calculated fields and derived values directly in the gt pipeline. rows_add() inserts new rows, useful for additional data points or spacer rows.

These structural modifications work alongside the formatting and styling functions you’ve already learned. A typical workflow might involve formatting values, then adjusting column labels, setting appropriate widths, reordering columns for logical flow, and merging related information.

The next chapter explores summary rows and columns, which extend table structure vertically and horizontally with computed aggregations. You’ll learn to add totals, averages, and other statistics that help readers understand your data at a glance.

6.1 Labels, widths, alignments

6.1.1 cols_label()

6.1.2 cols_label_with()

6.1.3 cols_width()

6.1.4 cols_align()

6.2 Moving columns

6.2.1 cols_move_to_start()

6.2.2 cols_move_to_end()

6.2.3 cols_move()

6.3 Hiding columns

6.3.1 cols_hide()

6.3.2 cols_unhide()

6.4 Merging column data together

6.4.1 cols_merge()

6.4.2 cols_merge_uncert()

6.4.3 cols_merge_range()

6.4.4 cols_merge_n_pct()

6.5 Adding columns and rows

6.5.1 cols_add()

6.5.2 rows_add()

6.6 Summary

6.1.1 `cols_label()`

6.1.2 `cols_label_with()`

6.1.3 `cols_width()`

6.1.4 `cols_align()`

6.2.1 `cols_move_to_start()`

6.2.2 `cols_move_to_end()`

6.2.3 `cols_move()`

6.3.1 `cols_hide()`

6.3.2 `cols_unhide()`

6.4.1 `cols_merge()`

6.4.2 `cols_merge_uncert()`

6.4.3 `cols_merge_range()`

6.4.4 `cols_merge_n_pct()`

6.5.1 `cols_add()`

6.5.2 `rows_add()`