This file illustrates how to use modelsummary to produce PDF, HTML, and RTF (Microsoft Word-compatible) files from a single Rmarkdown document.

A first table

To begin we estimate three linear regression models using the mtcars data. Then, we load the modelsummary package and call the modelsummary function:

library(modelsummary)

models <- list()
models[[1]] <- lm(mpg ~ cyl, mtcars)
models[[2]] <- lm(mpg ~ cyl + drat, mtcars)
models[[3]] <- lm(hp ~ cyl + drat, mtcars)

modelsummary(models,
             title='Determinants of Car Features',
             statistic='conf.int')

Without any modification, this code will produce a nice table, whether the document is compiled to PDF, HTML, or RTF.^[The Rmarkdown output in the header must be set to either "pdf_document", "html_document", or "rtf_document".] In fact, tables produced using any of the modelsummary built-in options and arguments should compile correctly in all three formats.

One major benefit of modelsummary is that regression tables can be customized using the powerful gt and kableExtra libraries. To achieve this, we post-process the output of modelsummary() using functions from the gt or kableExtra packages. Achieving this in an Rmarkdown file requires some slight, but very easy, tweaks to our code. The next two sections explain what those tweaks are.

Output formats: gt vs. kableExtra

modelsummary supports several table-making packages: gt, kableExtra, flextable, huxtable, DT. These packages are fantastic, but they each have (minor) disadvantages. For instance,

In the rest of this example file, we will demonstrate how to use gt and kableExtra functions to customize tables. However, we will not create tables using gt in PDF documents, and we will not create tables using kableExtra in RTF documents. To achieve this, we detect the Rmarkdown output format and create two boolean variables:

is_rtf <- knitr::opts_knit$get("rmarkdown.pandoc.to") == 'rtf'
is_latex <- knitr::opts_knit$get("rmarkdown.pandoc.to") == 'latex'

In chunks with gt code, we set eval=!is_latex. In code chunks with kableExtra code, we set eval=!is_rtf.

Customizing tables with gt

We can use functions from the gt package to post-process and customize a modelsummary table:

library(gt)

modelsummary(models, 
             output = 'gt',
             title = 'Table customized using `gt` functions.',
             stars = TRUE,
             notes = c('First custom note to contain text.',
                       'Second custom note with different content.')) %>%
   # spanning labels
   tab_spanner(label = 'Miles / Gallon', columns = 2:3) %>%
   tab_spanner(label = 'Horsepower', columns = 4) %>%
   # color
   tab_style(style = cell_text(color = "blue", weight = "bold"),
             locations = cells_body(columns = 1)) %>%
   tab_style(style = cell_fill(color = "pink"),
             locations = cells_body(rows = 3))
library(gt)

adelie <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402702-20b1d280-b52a-11ea-9950-f3a03133fd45.png', height = 100)
gentoo <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402718-27404a00-b52a-11ea-9ad3-dd7562f6438d.png', height = 100)
chinstrap <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402708-23acc300-b52a-11ea-9a77-de360a0d1f7d.png', height = 100)

cap <- 'Flipper lengths (mm) of the famous penguins of Palmer Station, Antarctica.'
f <- (`Species` = species) ~ (` ` = flipper_length_mm) * (`Distribution` = Histogram) + flipper_length_mm * sex * ((`Avg.` = Mean)*Arguments(fmt='%.0f') + (`Std. Dev.` = SD)*Arguments(fmt='%.1f'))
datasummary(f,
            data = penguins,
            output = 'gt',
            title = cap,
            notes = 'Artwork by @allison_horst',
            sparse_header = TRUE) %>%
    text_transform(locations = cells_body(columns = 1, rows = 1), fn = adelie) %>%
    text_transform(locations = cells_body(columns = 1, rows = 2), fn = chinstrap) %>%
    text_transform(locations = cells_body(columns = 1, rows = 3), fn = gentoo) %>%
    tab_style(style = list(cell_text(color = "#FF6700", size = 'x-large')), locations = cells_body(rows = 1)) %>%
    tab_style(style = list(cell_text(color = "#CD51D1", size = 'x-large')), locations = cells_body(rows = 2)) %>%
    tab_style(style = list(cell_text(color = "#007377", size = 'x-large')), locations = cells_body(rows = 3)) %>%
    tab_options(table_body.hlines.width = 0, table.border.top.width = 0, table.border.bottom.width = 0) %>%
    cols_align('center', columns = 3:6)

Customizing tables with kableExtra

We can use functions from the kableExtra package to post-process and customize a modelsummary table. To do this, we load the kableExtra package, and we set output='kableExtra' in the modelsummary() call:

library(kableExtra)

modelsummary(models,
             output = 'kableExtra',
             title = 'Table customized using `kableExtra` functions.',
             stars = TRUE,
             notes = c('First custom note to contain text.',
                       'Second custom note with different content.')) %>%
    # spanning labels
    add_header_above(c(" " = 1, "Miles / Gallon" = 2, "Horsepower" = 1)) %>%
    # color
    row_spec(3, background = 'pink') %>%
    column_spec(1, color = 'blue')

Variable labels

Some packages like haven can assign attributes to the columns of a dataset for use as labels. Most of the functions in modelsummary can display these labels automatically. For example:

library(haven)
dat <- mtcars
dat$am <- haven::labelled(dat$am, label = "Transmission")
dat$mpg <- haven::labelled(dat$mpg, label = "Miles per Gallon")
mod <- lm(hp ~ mpg + am, dat = dat)

modelsummary(mod, coef_rename = TRUE)

datasummary_skim(dat[, c("mpg", "am", "drat")])

Warning: The labelling system in haven does not appear to fully support factor variables. If you want to use factors, it is often best to convert the variable in question to a labelled character vector instead.



vincentarelbundock/gtsummary documentation built on Nov. 6, 2024, 11:07 p.m.