manhattan_plot: Manhattan Plotting


manhattan_plot    R Documentation

Manhattan Plotting

Description

A generic function for drawing a Manhattan plot.

Usage

manhattan_plot(x, ...)

manhattan_plot.default(x, ...)

## S3 method for class 'data.frame'
manhattan_plot(
  x,
  chromosome = NULL,
  outfn = NULL,
  signif = c(5e-08, 1e-05),
  pval.colname = "pval",
  chr.colname = "chr",
  pos.colname = "pos",
  label.colname = NULL,
  highlight.colname = NULL,
  chr.order = NULL,
  signif.col = NULL,
  chr.col = NULL,
  highlight.col = NULL,
  rescale = TRUE,
  rescale.ratio.threshold = 5,
  signif.rel.pos = 0.2,
  chr.gap.scaling = 1,
  color.by.highlight = FALSE,
  preserve.position = FALSE,
  thin = NULL,
  thin.n = 1000,
  thin.bins = 200,
  pval.log.transform = TRUE,
  plot.title = ggplot2::waiver(),
  plot.subtitle = ggplot2::waiver(),
  plot.width = 10,
  plot.height = 5,
  plot.scale = 1,
  point.size = 0.75,
  label.font.size = 2,
  max.overlaps = 20,
  x.label = "Chromosome",
  y.label = expression(-log[10](p)),
  ...
)

## S3 method for class 'MPdata'
manhattan_plot(
  x,
  chromosome = NULL,
  outfn = NULL,
  signif = NULL,
  signif.col = NULL,
  rescale = TRUE,
  rescale.ratio.threshold = 5,
  signif.rel.pos = 0.2,
  chr.gap.scaling = NULL,
  color.by.highlight = FALSE,
  label.colname = NULL,
  x.label = "Chromosome",
  y.label = expression(-log[10](p)),
  point.size = 0.75,
  label.font.size = 2,
  max.overlaps = 20,
  plot.title = ggplot2::waiver(),
  plot.subtitle = ggplot2::waiver(),
  plot.width = 10,
  plot.height = 5,
  plot.scale = 1,
  ...
)

## S4 method for signature 'GRanges'
manhattan_plot(
  x,
  chromosome = NULL,
  outfn = NULL,
  signif = c(5e-08, 1e-05),
  pval.colname = "pval",
  label.colname = NULL,
  highlight.colname = NULL,
  chr.order = NULL,
  signif.col = NULL,
  chr.col = NULL,
  highlight.col = NULL,
  rescale = TRUE,
  rescale.ratio.threshold = 5,
  signif.rel.pos = 0.2,
  chr.gap.scaling = 1,
  color.by.highlight = FALSE,
  preserve.position = FALSE,
  thin = NULL,
  thin.n = 1000,
  thin.bins = 200,
  pval.log.transform = TRUE,
  plot.title = ggplot2::waiver(),
  plot.subtitle = ggplot2::waiver(),
  plot.width = 10,
  plot.height = 5,
  plot.scale = 1,
  point.size = 0.75,
  label.font.size = 2,
  max.overlaps = 20,
  x.label = "Chromosome",
  y.label = expression(-log[10](p)),
  ...
)

Arguments

x

a data.frame, an object that extends data.frame (e.g. a tibble), an MPdata object, or a GRanges object.

...

additional arguments passed on to geom_label_repel.

chromosome

a character. This is supplied if a manhattan plot of a single chromosome is desired. If NULL, then all the chromosomes in the data will be plotted.

outfn

a character. File name to save the Manhattan Plot. If outfn is supplied (i.e. !is.null(outfn)), then the plot is not drawn in the graphics window.

signif

a numeric vector. P-value significance thresholds to be drawn on the manhattan plot. At least one value should be provided. Default value is c(5e-08, 1e-05). If signif is not NULL and x is an MPdata object, the signif argument overrides the value stored in the MPdata object.

pval.colname

a character. Column name of x containing the p-values.

chr.colname

a character. Column name of x containing chromosome.

pos.colname

a character. Column name of x containing position.

label.colname

a character. Name of the column in MPdata$data to be used for labeling.

highlight.colname

a character. If you desire to color certain points (e.g. significant variants) rather than color by chromosome, you can specify the category in this column, and provide the color mapping in highlight.col. Ignored if NULL.

chr.order

a character vector. Order of chromosomes presented in manhattan plot.

signif.col

a character vector of the same length as signif. It contains colors for the lines drawn at signif. If NULL, the smallest value is colored black while the others are grey. If x is an MPdata object, behaves similarly to signif.

chr.col

a character vector of the same length as chr.order. It contains colors for the chromosomes. The names of the vector should match chr.order. If NULL, default colors are applied using RColorBrewer.

highlight.col

a character vector. It contains color mapping for the values from highlight.colname.

rescale

a logical. If TRUE, the plot will rescale itself depending on the data. See Details.

rescale.ratio.threshold

a numeric. Threshold of the ratio that triggers the rescale. See Details.

signif.rel.pos

a numeric between 0.1 and 0.9. If the plot is rescaled, where should the significance threshold be positioned?

chr.gap.scaling

a numeric. Scaling factor for the gap between chromosomes, if you wish to change it. If x is an MPdata object, the gap is scaled relative to the gap stored in the object.

color.by.highlight

a logical. Should the points be colored based on a highlight column?

preserve.position

a logical. If TRUE, the width of each chromosome reflects the number of variants and the position of each variant is correctly scaled. If FALSE, the width of each chromosome is equal and the variants are equally spaced.

thin

a logical. If TRUE, thinPoints will be applied. Defaults to TRUE if chromosome is NULL. Defaults to FALSE if chromosome is supplied.

thin.n

an integer. Maximum number of points per horizontal partition of the plot. Defaults to 1000.

thin.bins

an integer. Number of bins to partition the data. Defaults to 200.

pval.log.transform

a logical. If TRUE, the p-value will be transformed to -log10(p-value).

plot.title

a character. Plot title

plot.subtitle

a character. Plot subtitle

plot.width

a numeric. Plot width in inches. Corresponds to width argument in ggsave function.

plot.height

a numeric. Plot height in inches. Corresponds to height argument in ggsave function.

plot.scale

a numeric. Plot scale. Corresponds to scale argument in ggsave function.

point.size

a numeric. Size of the points.

label.font.size

a numeric. Size of the labels.

max.overlaps

an integer. Exclude text labels that overlap too many other elements.

x.label

a character. x-axis label

y.label

a character or an expression. y-axis label

Details

This generic function accepts the result of a GWAS in the form of a data.frame or an MPdata object produced by manhattan_data_preprocess. The function will throw an error if another type of object is passed.

Having rescale = TRUE is useful when there are points with very high -log10(p.value). In this case, the function attempts to split the plot into two different scales, with the split happening near the strictest significance threshold. More precisely, the plot is rescaled when

-log10(pvalue) / (strictest significance threshold) ≥ rescale.ratio.threshold
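
As a rough illustration of the condition above, the following base R sketch computes the ratio for a toy set of p-values. It assumes that the strictest threshold enters the ratio on the -log10 scale and that the largest -log10(p) in the data is the value being compared; the function's internal computation (for example, any rounding) may differ. The variable names pvals, strictest, ratio and would_rescale are made up for this example.

```r
# Toy p-values: one very strong signal among otherwise modest ones.
pvals <- c(0.2, 0.03, 1e-4, 1e-9, 1e-60)
signif <- c(5e-08, 1e-05)
rescale.ratio.threshold <- 5

# Strictest threshold = smallest p-value cutoff, taken on the -log10 scale
# (an assumption made for this sketch).
strictest <- -log10(min(signif))

# Ratio of the largest -log10(p) in the data to the strictest threshold.
ratio <- max(-log10(pvals)) / strictest

# TRUE here suggests that the plot would be split into two scales.
would_rescale <- ratio >= rescale.ratio.threshold
```

With these toy values the ratio is roughly 60 / 7.3, which exceeds the default threshold of 5, so a plot of such data would be a candidate for rescaling.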

If you wish to add annotation to the points, provide the name of the column to label.colname. The labels are added with ggrepel.

Be careful though: if the annotation column contains a large number of variants, then the plotting could take a long time, and the labels will clutter up the plot. For those points with no annotation, you have the choice to set them as NA or "".
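
The labeling advice above can be sketched as follows. This is a minimal example under stated assumptions: the data are simulated as in the Examples section, the column name label and the variant names are invented for illustration, and the plotting call is guarded so that the snippet still runs when ggmanh is not installed.

```r
set.seed(1)
gwasdat <- data.frame(
  chromosome = rep(1:5, each = 30),
  position = c(replicate(5, sample(1:300, 30))),
  pvalue = rbeta(150, 1, 1)^5
)

# Label only the three smallest p-values; every other point gets "" so that
# no label is drawn for it.
gwasdat$label <- ""
top <- order(gwasdat$pvalue)[1:3]
gwasdat$label[top] <- paste0("var", top)  # made-up variant names

if (requireNamespace("ggmanh", quietly = TRUE)) {
  p <- ggmanh::manhattan_plot(
    gwasdat,
    pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
    chr.order = as.character(1:5),
    label.colname = "label"
  )
}
```

Keeping the number of non-empty labels small keeps both the plotting time and the clutter down.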

Value

A gg object if is.null(outfn); NULL if !is.null(outfn).
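
The two return modes can be sketched as follows. The example assumes that ggmanh is installed (the call is skipped otherwise) and writes to a temporary file; the file name and the plot dimensions are arbitrary choices for illustration.

```r
set.seed(1)
gwasdat <- data.frame(
  chromosome = rep(1:5, each = 30),
  position = c(replicate(5, sample(1:300, 30))),
  pvalue = rbeta(150, 1, 1)^5
)

# Temporary output file; the extension determines the format used when saving.
outfile <- tempfile(fileext = ".png")

if (requireNamespace("ggmanh", quietly = TRUE)) {
  # With outfn supplied, the plot is saved to the file instead of being drawn
  # in the graphics window.
  res <- ggmanh::manhattan_plot(
    gwasdat,
    pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
    chr.order = as.character(1:5),
    outfn = outfile, plot.width = 8, plot.height = 4
  )
  saved <- file.exists(outfile)

  # Without outfn, the plot object is returned and can be modified further.
  p <- ggmanh::manhattan_plot(
    gwasdat,
    pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
    chr.order = as.character(1:5)
  )
}
```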

Examples


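# Simulate GWAS results: 5 chromosomes with 30 variants each.
set.seed(1)  # make the simulated data reproducible
gwasdat <- data.frame(
  "chromosome" = rep(1:5, each = 30),
  "position" = c(replicate(5, sample(1:300, 30))),
  "pvalue" = rbeta(150, 1, 1)^5
)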

manhattan_plot(
  gwasdat, pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
  chr.order = as.character(1:5)
)

mpdata <- manhattan_data_preprocess(
  gwasdat, pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
  chr.order = as.character(1:5)
)

manhattan_plot(mpdata)
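
A further sketch shows how highlight.colname, highlight.col and color.by.highlight might be combined to color points by category rather than by chromosome. The column name group, the cutoff of 0.01 and the two colors are arbitrary choices for illustration, and the plotting call is skipped when ggmanh is not installed.

```r
set.seed(1)
gwasdat <- data.frame(
  chromosome = rep(1:5, each = 30),
  position = c(replicate(5, sample(1:300, 30))),
  pvalue = rbeta(150, 1, 1)^5
)

# Assign each variant to a category (cutoff chosen only for illustration).
gwasdat$group <- ifelse(gwasdat$pvalue <= 0.01, "Significant", "Not significant")

# Named vector mapping each category to a color.
highlight_colors <- c("Significant" = "red", "Not significant" = "grey")

if (requireNamespace("ggmanh", quietly = TRUE)) {
  p <- ggmanh::manhattan_plot(
    gwasdat,
    pval.colname = "pvalue", chr.colname = "chromosome", pos.colname = "position",
    chr.order = as.character(1:5),
    highlight.colname = "group",
    highlight.col = highlight_colors,
    color.by.highlight = TRUE
  )
}
```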


leejs-abv/ggmanh documentation built on Sept. 19, 2024, 10:13 p.m.