pairsGBSR: Draw a scatter plot of a pair of specified statistics

View source: R/PlotFunctions.R

pairsGBSR    R Documentation

Description

Draw a scatter plot of a pair of specified statistics

Usage

pairsGBSR(
  x,
  stats1 = "dp",
  stats2 = "missing",
  target = "marker",
  size = 0.5,
  alpha = 0.8,
  color = c(Marker = "darkblue", Sample = "darkblue"),
  fill = c(Marker = "skyblue", Sample = "skyblue"),
  smooth = FALSE
)

Arguments

x

A GbsrGenotypeData object.

stats1

A string specifying the first statistic of the pair to be plotted. See Details for the accepted values.

stats2

A string specifying the second statistic of the pair to be plotted. See Details for the accepted values.

target

Either or both of "marker" and "sample", e.g. target = "marker" to draw a scatter plot only for markers (SNPs).

size

A numeric value to specify the dot size of a scatter plot.

alpha

A numeric value between 0 and 1 to specify the transparency of dots in a scatter plot.

color

A named vector with the elements "Marker" and "Sample" to specify the border color of the dots in the scatter plots.

fill

A named vector with the elements "Marker" and "Sample" to specify the fill color of the dots in the scatter plots. stats = "geno" only requires "Ref", "Het" and "Alt", while the others use the value named "Marker".

smooth

A logical value indicating whether to draw a smoothed line through the data points. See also ggplot2::stat_smooth().
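
As a rough sketch of how the appearance arguments combine, the call below plots both targets with custom colors and a smoothed line. It assumes the sample GDS file shipped with GBScleanR and uses only the arguments listed above; the color names and the size and alpha values are arbitrary choices for illustration.

```r
library(GBScleanR)

# Load the example data bundled with the package.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)

# "missing" and "het" rely on the genotype summaries from countGenotype().
gds <- countGenotype(gds)

# Plot per-marker and per-sample statistics together.
# The colors below are illustrative, not package defaults.
p <- pairsGBSR(
  gds,
  stats1 = "missing",
  stats2 = "het",
  target = c("marker", "sample"),
  size = 1,
  alpha = 0.5,
  color = c(Marker = "black", Sample = "black"),
  fill = c(Marker = "orange", Sample = "steelblue"),
  smooth = TRUE
)
print(p)

# Close the connection to the GDS file.
closeGDS(gds)
```

Because the return value is a ggplot object, it should be possible to add further ggplot2 layers or themes to `p` before printing.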

Details

You can draw a scatter plot of the per-marker and/or per-sample summary statistics specified by stats1 and stats2. The stats1 and stats2 arguments can take the following values:

missing

Proportion of missing genotype calls.

het

Proportion of heterozygote calls.

raf

Reference allele frequency.

dp

Total read counts.

ad_ref

Reference allele read counts.

ad_alt

Alternative allele read counts.

rrf

Reference allele read frequency.

mean_ref

Mean of reference allele read counts.

sd_ref

Standard deviation of reference allele read counts.

median_ref

Quantile of reference allele read counts.

mean_alt

Mean of alternative allele read counts.

sd_alt

Standard deviation of alternative allele read counts.

median_alt

Quantile of alternative allele read counts.

mq

Mapping quality.

fs

Phred-scaled p-value (strand bias)

qd

Variant Quality by Depth

sor

Symmetric Odds Ratio (strand bias)

mqranksum

Alt vs. Ref read mapping qualities

readposranksum

Alt vs. Ref read position bias

baseqranksum

Alt vs. Ref base qualities

To draw scatter plots for "missing", "het", or "raf", you need to run countGenotype() first to obtain the statistics. Similarly, "dp", "ad_ref", "ad_alt", and "rrf" require values obtained via countRead(). "mq", "fs", "qd", "sor", "mqranksum", "readposranksum", and "baseqranksum" only work with target = "marker", and only if your data contains those values as supplied by SNP calling tools such as GATK.
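
The prerequisites described above can be summarized as a small lookup. The function `required_step()` below is a hypothetical helper written for illustration only; it is not part of GBScleanR, and it covers only the statistics for which the text above names a prerequisite.

```r
# Hypothetical helper (not part of GBScleanR): return the preparatory step
# that the Details text names for a given statistic.
required_step <- function(stat) {
  genotype_stats <- c("missing", "het", "raf")
  read_stats <- c("dp", "ad_ref", "ad_alt", "rrf")
  annotation_stats <- c("mq", "fs", "qd", "sor",
                        "mqranksum", "readposranksum", "baseqranksum")

  if (stat %in% genotype_stats) {
    "countGenotype()"
  } else if (stat %in% read_stats) {
    "countRead()"
  } else if (stat %in% annotation_stats) {
    "SNP caller annotations (target = \"marker\" only)"
  } else {
    # No prerequisite is stated for this statistic in the text above.
    NA_character_
  }
}

# Look up the steps needed before plotting "dp" against "missing".
steps <- vapply(c("dp", "missing"), required_step, character(1))
print(steps)
```

A lookup like this can be handy for checking a pair of statistics before calling pairsGBSR(), so that a missing summary step is noticed early.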

Value

A ggplot object.

Examples

# Load data in the GDS file and instantiate a GbsrGenotypeData object.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)

# Summarize genotype count information to be used in `pairsGBSR()`
gds <- countGenotype(gds)

# Draw scatter plots of missing rate vs heterozygosity.
pairsGBSR(gds, stats1 = "missing", stats2 = "het")

# Close the connection to the GDS file
closeGDS(gds)
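
A further sketch, under the same assumptions as the example above (the bundled sample.gds file), shows a read-count statistic plotted per sample. Per the Details section, "dp" needs the values from countRead() and "missing" needs those from countGenotype(), so both are run first.

```r
library(GBScleanR)

# Load the example data bundled with the package.
gds_fn <- system.file("extdata", "sample.gds", package = "GBScleanR")
gds <- loadGDS(gds_fn)

# Collect both genotype-based and read-based summary statistics.
gds <- countGenotype(gds)
gds <- countRead(gds)

# Draw total read counts vs. missing rate for samples only.
p_sample <- pairsGBSR(gds, stats1 = "dp", stats2 = "missing",
                      target = "sample")
print(p_sample)

# Close the connection to the GDS file.
closeGDS(gds)
```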



tomoyukif/GBScleanR documentation built on Oct. 31, 2024, 2:43 a.m.