datasummary_correlation: Generate a correlation table for all numeric variables in...
In vincentarelbundock/gtsummary: Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready

View source: R/datasummary_correlation.R

datasummary_correlation

R Documentation

Generate a correlation table for all numeric variables in your dataset.

Description

The names of the variables displayed in the correlation table are the names of the columns in the data. You can rename those columns (with or without spaces) to produce a table of human-readable variables. See the Details and Examples sections below, and the vignettes on the modelsummary website:

https://modelsummary.com/
https://modelsummary.com/articles/datasummary.html

Usage

datasummary_correlation(
  data,
  output = getOption("modelsummary_output", default = "default"),
  method = getOption("modelsummary_method", default = "pearson"),
  fmt = 2,
  align = getOption("modelsummary_align", default = NULL),
  add_rows = getOption("modelsummary_add_rows", default = NULL),
  add_columns = getOption("modelsummary_add_columns", default = NULL),
  title = getOption("modelsummary_title", default = NULL),
  notes = getOption("modelsummary_notes", default = NULL),
  escape = getOption("modelsummary_escape", default = TRUE),
  stars = getOption("modelsummary_stars", default = FALSE),
  ...
)

Arguments

`data`	A data.frame (or tibble)
`output`	filename or object type (character string). Supported filename extensions: .docx, .html, .tex, .md, .txt, .csv, .xlsx, .png, .jpg. Supported object types: "default", "html", "markdown", "latex", "latex_tabular", "typst", "data.frame", "tinytable", "gt", "kableExtra", "huxtable", "flextable", "DT", "jupyter". The "modelsummary_list" value produces a lightweight object which can be saved and fed back to the `modelsummary` function. The "default" output format can be set to "tinytable", "kableExtra", "gt", "flextable", "huxtable", "DT", or "markdown". If the user does not choose a default value, the packages listed above are tried in sequence. Session-specific configuration: `options("modelsummary_factory_default" = "gt")`. Persistent configuration: `config_modelsummary(output = "markdown")`. Warning: Users should not supply a file name to the `output` argument if they intend to customize the table with external packages. See the 'Details' section. LaTeX compilation requires the `booktabs` and `siunitx` packages, but `siunitx` can be disabled or replaced with global options. See the 'Details' section.
`method`	character or function. Character: "pearson", "kendall", "spearman", or "pearspear" (Pearson correlations above and Spearman correlations below the diagonal). Function: takes a data.frame with numeric columns and returns a square matrix or data.frame with unique row.names and colnames corresponding to variable names. Note that `datasummary_correlation_format` can often be useful for formatting the output of custom correlation functions.
`fmt`	how to format numeric values: integer, user-supplied function, or `modelsummary` function. Integer: number of decimal digits. User-supplied function: any function which accepts a numeric vector and returns a character vector of the same length. `modelsummary` functions: `fmt = fmt_significant(2)`: two significant digits (at the term-level); `fmt = fmt_sprintf("%.3f")`: see `?sprintf`; `fmt = fmt_identity()`: unformatted raw values.
`align`	A string with a number of characters equal to the number of columns in the table (e.g., `align = "lcc"`). Valid characters: l, c, r, d. "l": left-aligned column; "c": centered column; "r": right-aligned column; "d": dot-aligned column. For LaTeX/PDF output, this option requires at least version 3.0.25 of the siunitx LaTeX package. See the LaTeX preamble help section below for commands to insert in your LaTeX preamble.
`add_rows`	a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. You can define a "position" attribute of integers to set the row positions. See Examples section below.
`add_columns`	a data.frame (or tibble) with the same number of rows as your main table.
`title`	string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as `\\label{tab:mytable}` directly to the title string, while also specifying `escape=FALSE`.
`notes`	list or vector of notes to append to the bottom of the table.
`escape`	boolean. `TRUE` escapes or substitutes LaTeX/HTML characters which could prevent the file from compiling/displaying; it applies to all cells, captions, and notes. Users can have more fine-grained control by setting `escape=FALSE` and using an external command such as: `modelsummary(model, "latex") |> tinytable::format_tt(j=1:5, escape=TRUE)`
`stars`	to indicate statistical significance. FALSE (default): no significance stars. TRUE: `c("+" = .1, "*" = .05, "**" = .01, "***" = 0.001)`. Named numeric vector for custom stars such as `c('*' = .1, '+' = .05)`. Note: a legend will not be inserted at the bottom of the table when the `estimate` or `statistic` arguments use "glue strings" with `{stars}`.
`...`	other parameters are passed through to the table-making packages.
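
The contracts that the table above states for `method` (a function returning a square matrix with matching row and column names) and `fmt` (a function mapping a numeric vector to a character vector of the same length) can be checked in base R before the functions are passed on. The following is a minimal sketch: `cor_spearman` and `fmt_signed` are hypothetical names chosen for illustration, and the final call assumes that the modelsummary package is installed.

```r
# Illustrative data: three numeric columns from a built-in dataset
dat <- mtcars[, c("mpg", "hp", "wt")]

# Hypothetical custom `method`: Spearman correlations on pairwise-complete data
cor_spearman <- function(x) {
  cor(x, method = "spearman", use = "pairwise.complete.obs")
}

# Check the contract: square matrix, row names equal to column names
m <- cor_spearman(dat)
stopifnot(is.matrix(m), nrow(m) == ncol(m), identical(rownames(m), colnames(m)))

# Hypothetical custom `fmt`: two decimals with an explicit sign
fmt_signed <- function(x) sprintf("%+.2f", x)

# Check the contract: character output, same length as the input
s <- fmt_signed(c(0.5, -0.25))
stopifnot(is.character(s), length(s) == 2)

# Only call the package if it is available (assumption: modelsummary is installed)
if (requireNamespace("modelsummary", quietly = TRUE)) {
  tab <- modelsummary::datasummary_correlation(
    dat,
    method = cor_spearman,
    fmt = fmt_signed,
    output = "data.frame")
  print(tab)
}
```

Requesting `output = "data.frame"` is a convenient way to inspect the formatted cells before choosing a table-making backend.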

Version 2.0.0, kableExtra, and tinytable

Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend. Learn more at: https://vincentarelbundock.github.io/tinytable/

Revert to kableExtra for one session:

options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')

Global Options

The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:

options(modelsummary_output = "modelsummary_list")
options(modelsummary_statistic = '({conf.low}, {conf.high})')
options(modelsummary_stars = TRUE)
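
The Usage section shows that argument defaults are read with `getOption(..., default = ...)`, so a global option only takes effect while it is set. The base R sketch below illustrates that lookup and how to restore the previous state afterwards; it uses only `options()` and `getOption()` and does not call modelsummary itself.

```r
# Value that the `method` argument would pick up before any change
before <- getOption("modelsummary_method", default = "pearson")

# options() returns the previous values, which makes restoring them easy
old <- options(modelsummary_method = "spearman", modelsummary_stars = TRUE)

# While the options are set, the lookups return the new values
method_now <- getOption("modelsummary_method", default = "pearson")
stars_now <- getOption("modelsummary_stars", default = FALSE)

# Restore the previous state (options that were unset become unset again)
options(old)
method_after <- getOption("modelsummary_method", default = "pearson")
```

Saving the return value of `options()` keeps session-wide changes from leaking into later code, which is useful inside functions or scripts that are sourced by others.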

Options not specific to given arguments are listed below.

Model labels: default column names

This global option changes the style of the default column headers:

options(modelsummary_model_labels = "roman")

The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"

Table-making packages

modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:

options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
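
Before pointing a file format at a particular backend, it can help to know which of the six packages are installed. The sketch below is an illustration in base R, not the package's own selection logic: it walks the backends in the order listed in the `output` argument description, and falling back to "markdown" when none is installed is an assumption made for this example.

```r
# Backends in the order given in the documentation of the `output` argument
backends <- c("tinytable", "kableExtra", "gt", "flextable", "huxtable", "DT")

# TRUE for each backend whose namespace can be loaded
installed <- vapply(backends, requireNamespace, logical(1), quietly = TRUE)

# First installed backend; the "markdown" fallback is an assumption of this sketch
first_available <- if (any(installed)) backends[installed][1] else "markdown"

print(installed)
print(first_available)
```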

Table themes

Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/articles/appearance.html

modelsummary_theme_gt
modelsummary_theme_kableExtra
modelsummary_theme_huxtable
modelsummary_theme_flextable
modelsummary_theme_dataframe

Model extraction functions

modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:

options(modelsummary_get = "easystats")

options(modelsummary_get = "broom")

options(modelsummary_get = "all")

Formatting numeric entries

By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:

options(modelsummary_format_numeric_latex = "plain")

options(modelsummary_format_numeric_latex = "mathmode")

A similar option can be used to display numerical entries using MathJax in HTML tables:

options(modelsummary_format_numeric_html = "mathjax")
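
To make the difference between these settings concrete, the base R sketch below mimics the three LaTeX treatments described above on an already-formatted cell. `wrap_latex_numeric` is a hypothetical helper written for this illustration, and the style name "siunitx" is a label invented here for the default behavior; neither is part of modelsummary.

```r
# Hypothetical helper: shows how a formatted number would appear in the LaTeX source
wrap_latex_numeric <- function(x, style = c("siunitx", "plain", "mathmode")) {
  style <- match.arg(style)
  switch(style,
    siunitx  = sprintf("\\num{%s}", x),  # default behavior: \num{} from siunitx
    plain    = x,                        # "plain": leave the entry untouched
    mathmode = sprintf("$%s$", x))       # "mathmode": enclose in dollar signs
}

cell <- "0.78"
cat(wrap_latex_numeric(cell, "siunitx"), "\n")
cat(wrap_latex_numeric(cell, "plain"), "\n")
cat(wrap_latex_numeric(cell, "mathmode"), "\n")
```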

LaTeX preamble

When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.

\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}

Examples

library(modelsummary)

# clean variable names (base R)
dat <- mtcars[, c("mpg", "hp")]
colnames(dat) <- c("Miles / Gallon", "Horse Power")
datasummary_correlation(dat)

# clean variable names (tidyverse)
library(tidyverse)
dat <- mtcars %>%
  select(`Miles / Gallon` = mpg,
         `Horse Power` = hp)
datasummary_correlation(dat)

# `correlation` package objects
if (requireNamespace("correlation", quietly = TRUE)) {
  co <- correlation::correlation(mtcars[, 1:4])
  datasummary_correlation(co)

  # add stars to easycorrelation objects
  datasummary_correlation(co, stars = TRUE)
}

# alternative methods
datasummary_correlation(dat, method = "pearspear")

# custom function
cor_fun <- function(x) cor(x, method = "kendall")
datasummary_correlation(dat, method = cor_fun)

# rename columns alphabetically and include a footnote for reference
note <- sprintf("(%s) %s", letters[1:ncol(dat)], colnames(dat))
note <- paste(note, collapse = "; ")

colnames(dat) <- sprintf("(%s)", letters[1:ncol(dat)])

datasummary_correlation(dat, notes = note)

# `datasummary_correlation_format`: custom function with formatting
dat <- mtcars[, c("mpg", "hp", "disp")]

cor_fun <- function(x) {
  out <- cor(x, method = "kendall")
  datasummary_correlation_format(
    out,
    fmt = 2,
    upper_triangle = "x",
    diagonal = ".")
}

datasummary_correlation(dat, method = cor_fun)

# use kableExtra and psych to color significant cells
library(psych)
library(kableExtra)

dat <- mtcars[, c("vs", "hp", "gear")]

cor_fun <- function(dat) {
  # compute correlations and format them
  correlations <- data.frame(cor(dat))
  correlations <- datasummary_correlation_format(correlations, fmt = 2)

  # calculate pvalues using the `psych` package
  pvalues <- psych::corr.test(dat)$p

  # use `kableExtra::cell_spec` to color significant cells
  for (i in 1:nrow(correlations)) {
    for (j in 1:ncol(correlations)) {
      if (pvalues[i, j] < 0.05 && i != j) {
        correlations[i, j] <- cell_spec(correlations[i, j], background = "pink")
      }
    }
  }
  return(correlations)
}

# The `escape=FALSE` is important here!
datasummary_correlation(dat, method = cor_fun, escape = FALSE)
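
The `add_rows` argument takes a data.frame with the same number of columns as the main table, plus an optional "position" attribute. The sketch below builds such a row in base R. It assumes that the correlation table has one label column followed by one column per variable, and that the modelsummary package is installed; the row contents are made up for illustration.

```r
dat <- mtcars[, c("mpg", "hp")]

# One label column plus one column per variable (assumed table layout)
new_row <- data.frame("Source", "mtcars", "mtcars")

# Place the extra row at the top instead of appending it to the bottom
attr(new_row, "position") <- 1

# Only call the package if it is available (assumption: modelsummary is installed)
if (requireNamespace("modelsummary", quietly = TRUE)) {
  tab <- modelsummary::datasummary_correlation(
    dat,
    add_rows = new_row,
    output = "data.frame")
  print(tab)
}
```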

References

Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.

vincentarelbundock/gtsummary documentation built on Feb. 15, 2025, 11:22 p.m.
