pgxSegprocess: Extract, analyse and visualize "pgxseg" files

View source: R/pgxSegprocess.R

pgxSegprocess    R Documentation

Extract, analyse and visualize "pgxseg" files

Description

This function extracts segment variants, CNV frequency, and metadata from local "pgxseg" files and supports survival data visualization.

Usage

pgxSegprocess(
  file,
  group_id = "group_id",
  show_KM_plot = FALSE,
  return_metadata = FALSE,
  return_seg = FALSE,
  return_frequency = FALSE,
  assembly = "hg38",
  cnv_column_idx = 6,
  bin_size = 1e+06,
  overlap = 1000,
  soft_expansion = 0.1,
  ...
)

Arguments

file

A string specifying the path and name of the "pgxseg" file from which the data is read.

group_id

A string specifying which ID is used for grouping in the KM plot or the CNV frequency calculation. Default is "group_id".

show_KM_plot

A logical value determining whether to return the Kaplan-Meier plot based on metadata. Default is FALSE.

return_metadata

A logical value determining whether to return metadata. Default is FALSE.

return_seg

A logical value determining whether to return segment data. Default is FALSE.

return_frequency

A logical value determining whether to return CNV frequency data. The frequency calculation is based on segments in segment data and specified group id in metadata. Default is FALSE.
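To make the idea of a group-wise frequency concrete, here is a minimal sketch in base R. It is not the package's implementation: the data frame `calls`, the helper `group_frequency`, and the choice to express frequency as a percentage of samples are all assumptions made for illustration, and the values are made up.

```r
# Toy per-sample CNV calls for a single genomic bin (made-up values).
calls <- data.frame(
  sample_id = c("s1", "s2", "s3", "s4", "s5"),
  group_id  = c("A", "A", "A", "B", "B"),
  state     = c("DUP", "DEL", "DUP", NA, "DEL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: percentage of samples in each group that carry
# the given CNV state in this bin. Samples without a call count as "no CNV".
group_frequency <- function(calls, state) {
  hit <- !is.na(calls$state) & calls$state == state
  100 * vapply(split(hit, calls$group_id), mean, numeric(1))
}

gain_freq <- group_frequency(calls, "DUP")
loss_freq <- group_frequency(calls, "DEL")
```

Repeating this for every bin and every group would give one gain and one loss value per bin and group, which is the general shape of a CNV frequency profile.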

assembly

A string specifying the genome assembly version to apply to CNV frequency calculation and plotting. Allowed options are "hg19" and "hg38". Default is "hg38".

cnv_column_idx

Index of the column specifying the CNV state used for calculating CNV frequency. The index must be at least 6, with the default set to 6. The CNV states should either contain "DUP" for duplications and "DEL" for deletions, or level-specific CNV states represented using Experimental Factor Ontology (EFO) codes.

bin_size

Size of genomic bins used in CNV frequency calculation to split the genome, in base pairs (bp). Default is 1,000,000.

overlap

Numeric value defining the amount of overlap between a bin and a segment required for the segment to be considered a bin-specific CNV, in base pairs (bp). Default is 1,000.
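The following base R sketch illustrates one way such an overlap threshold could work. The helper `overlap_bp`, the coordinates, and the use of `>=` for the comparison are assumptions for illustration only; the package may handle boundaries differently.

```r
# Hypothetical helper: number of base pairs shared by a segment and a bin
# (0 if the two intervals do not overlap).
overlap_bp <- function(seg_start, seg_end, bin_start, bin_end) {
  pmax(0, pmin(seg_end, bin_end) - pmax(seg_start, bin_start))
}

# Four 1 Mb bins and one segment (made-up coordinates).
bins <- data.frame(
  start = c(0, 1000000, 2000000, 3000000),
  end   = c(1000000, 2000000, 3000000, 4000000)
)
seg_start <- 1999500
seg_end   <- 3200000

bins$overlap <- overlap_bp(seg_start, seg_end, bins$start, bins$end)

# Assumed rule: the segment counts for a bin only if it covers at least
# `overlap` base pairs of that bin.
overlap <- 1000
bins$called <- bins$overlap >= overlap
```

In this toy case the segment touches the second bin by only 500 bp, so that bin is not called, while the third and fourth bins are.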

soft_expansion

Fraction of bin_size to determine merge criteria. During the generation of genomic bins, division starts at the centromere and expands towards the telomeres on both sides. If the size of the last bin is smaller than soft_expansion * bin_size, it will be merged with the previous bin. Default is 0.1.
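The merge rule described above can be sketched in base R as follows. This is an illustrative reading of the description, not the package's code: the function `make_arm_bins` and the centromere and telomere coordinates are hypothetical.

```r
# Hypothetical helper: split one chromosome arm into bins, starting at the
# centromere-side coordinate `from` and expanding towards the telomere `to`.
make_arm_bins <- function(from, to, bin_size = 1e6, soft_expansion = 0.1) {
  arm_length <- abs(to - from)
  n_full     <- floor(arm_length / bin_size)
  remainder  <- arm_length - n_full * bin_size
  widths     <- rep(bin_size, n_full)
  if (remainder > 0) {
    if (remainder < soft_expansion * bin_size && n_full > 0) {
      # Last bin would be too small: merge it into the previous bin.
      widths[n_full] <- widths[n_full] + remainder
    } else {
      # Otherwise keep it as a shorter bin of its own.
      widths <- c(widths, remainder)
    }
  }
  direction <- sign(to - from)
  edges <- from + direction * c(0, cumsum(widths))
  left  <- edges[-length(edges)]
  right <- edges[-1]
  data.frame(start = pmin(left, right), end = pmax(left, right))
}

# q arm: centromere at 10 Mb, telomere at 13.05 Mb (made-up coordinates).
# The 50 kb remainder is below 0.1 * 1 Mb, so it is merged into the third bin.
q_bins <- make_arm_bins(10000000, 13050000)

# p arm: centromere at 10 Mb, telomere at position 0; bins expand downwards.
p_bins <- make_arm_bins(10000000, 0)
```

With these inputs the q arm yields three bins, the last one 1.05 Mb wide, and the p arm yields ten bins of 1 Mb each.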

...

Other parameters relevant to the KM plot. These include pval, pval.coord, pval.method, conf.int, linetype, and palette (see ggsurvplot from survminer).

Value

Segment data, a CNV frequency object, metadata, or KM plots from local "pgxseg" files, depending on the options selected.

Examples

library(pgxRpi)
# Locate the example "pgxseg" file shipped with the package
file_path <- system.file("extdata", "example.pgxseg", package = "pgxRpi")
info <- pgxSegprocess(file = file_path, show_KM_plot = TRUE, return_seg = TRUE, return_metadata = TRUE)

progenetix/pgxRpi documentation built on Nov. 4, 2024, 11:31 p.m.