Linnorm.PCA: Linnorm-PCA Clustering pipeline for subpopulation Analysis


View source: R/Linnorm.PCA.R

Description

This function first applies the Linnorm transformation to the dataset. It then performs principal component analysis (PCA) on the transformed data and uses k-means clustering to identify subpopulations of cells.

Usage

Linnorm.PCA(datamatrix, RowSamples = FALSE, input = "Raw", MZP = 0,
  DataImputation = TRUE, num_PC = 3, num_center = c(1:20), Group = NULL,
  Coloring = "kmeans", pca.scale = FALSE, kmeans.iter = 2000,
  plot.title = "PCA K-means clustering", ...)

Arguments

datamatrix

The matrix or data frame that contains your dataset. Each row is a feature (or Gene) and each column is a sample (or replicate). Raw Counts, CPM, RPKM, FPKM or TPM are supported. Undefined values such as NA are not supported. It is not compatible with log transformed datasets.

RowSamples

Logical. In the datamatrix, if each row is a sample and each column is a feature, set this to TRUE so that you don't need to transpose it. Linnorm works slightly faster with this argument set to TRUE, but the difference should be negligible for smaller datasets. Defaults to FALSE.

input

Character. "Raw" or "Linnorm". If you have already transformed your dataset with Linnorm, set input to "Linnorm" so that you can pass the Linnorm transformed dataset as the "datamatrix" argument. Defaults to "Raw".

MZP

Double >= 0, <= 1. Minimum non-Zero Portion Threshold for this function. Genes not satisfying this threshold will be removed from the analysis. For example, if set to 0.3, genes that are non-zero in fewer than 30 percent of the samples will be removed. Defaults to 0.
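
The threshold can be pictured as a row filter on the fraction of non-zero entries. The following is a minimal base R sketch of that idea, not Linnorm's internal code. The toy matrix counts and the helper variables nonzero_portion, keep and filtered are invented for illustration, and features are assumed to be in rows.

```r
# Toy count matrix: 4 genes (rows) x 5 samples (columns); values are invented.
counts <- matrix(
  c(0, 0, 0, 0, 7,   # geneA: non-zero in 1 of 5 samples (0.2)
    3, 0, 5, 0, 0,   # geneB: non-zero in 2 of 5 samples (0.4)
    1, 2, 3, 4, 5,   # geneC: non-zero in all samples (1.0)
    0, 0, 0, 0, 0),  # geneD: never detected (0.0)
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), paste0("s", 1:5))
)

MZP <- 0.3

# Fraction of samples with a non-zero value, per gene.
nonzero_portion <- rowMeans(counts != 0)

# Keep genes that reach the threshold (assumed here to be inclusive).
keep <- nonzero_portion >= MZP
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)
```

With MZP = 0.3, only geneB and geneC remain in this toy example.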

DataImputation

Logical. Perform data imputation on the dataset after transformation. Defaults to TRUE.

num_PC

Integer >= 2. Number of principal components to be used in k-means clustering. Defaults to 3.

num_center

Numeric vector. Number of clusters to be tested for k-means clustering. fpc, vegan, mclust and apcluster packages are used to determine the number of clusters needed. If only one number is supplied, it will be used and this test will be skipped. Defaults to c(1:20).
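
To give a feel for what testing several cluster numbers means, here is a small base R sketch that scans candidate values of k and records the total within-cluster sum of squares. This is a simple elbow-style heuristic chosen for illustration. It is not the selection procedure Linnorm applies through fpc, vegan, mclust and apcluster, and the data pts and the names candidates and wss are invented.

```r
set.seed(7)
# Invented 2-D data with three well-separated groups of 15 points each.
pts <- rbind(
  matrix(rnorm(30, mean = 0,  sd = 0.5), ncol = 2),
  matrix(rnorm(30, mean = 5,  sd = 0.5), ncol = 2),
  matrix(rnorm(30, mean = 10, sd = 0.5), ncol = 2)
)

candidates <- 1:6  # plays the role of num_center in this sketch

# Total within-cluster sum of squares for every candidate k.
wss <- sapply(candidates, function(k) {
  kmeans(pts, centers = k, iter.max = 2000, nstart = 10)$tot.withinss
})

# The drop in wss flattens once k reaches the number of simulated groups.
round(wss, 1)
```

Supplying a single value instead of a range corresponds to skipping this kind of scan entirely.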

Group

Character vector with length equal to the sample size. Each element of this vector corresponds to one of the columns (samples) in the datamatrix. In the plot, the shape of the point that represents each sample is determined by its group assignment. Defaults to NULL.

Coloring

Character. "kmeans" or "Group". If Group is not NULL, coloring in the PCA plot will reflect each sample's group. Otherwise, coloring will reflect k-means clustering results. Defaults to "kmeans".

pca.scale

Logical. Sets the "scale." parameter of the prcomp function (used for principal component analysis). When TRUE, the variables are scaled to unit variance before the analysis takes place. Defaults to FALSE.
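
To illustrate what this switch changes, the sketch below calls prcomp directly on invented data. It does not go through Linnorm.PCA, and the matrix x with its column names is hypothetical.

```r
set.seed(1)
# Invented data: 30 observations (rows) x 3 variables with very different spreads.
x <- cbind(small  = rnorm(30, sd = 1),
           medium = rnorm(30, sd = 10),
           large  = rnorm(30, sd = 100))

pca_unscaled <- prcomp(x, scale. = FALSE)  # variables keep their own variance
pca_scaled   <- prcomp(x, scale. = TRUE)   # each variable rescaled to unit variance

# Without scaling, the high-variance variable dominates the first component.
round(abs(pca_unscaled$rotation[, 1]), 2)
round(abs(pca_scaled$rotation[, 1]), 2)

# With scaling, the component variances sum to the number of variables.
sum(pca_scaled$sdev^2)
```

Leaving the option at FALSE keeps the relative variances of the input as they are, which may be preferable when the data have already been put on a comparable scale by a transformation.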

kmeans.iter

Numeric. Number of iterations in k-means clustering. Defaults to 2000.

plot.title

Character. Set the title of the plot. Defaults to "PCA K-means clustering".

...

Arguments that will be passed to Linnorm's transformation function.

Details

This function performs PCA clustering using Linnorm transformation.
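
As a rough picture of the PCA and clustering steps, the sketch below runs prcomp and kmeans from base R on simulated data. It is an illustration under stated assumptions, not Linnorm's implementation: log1p merely stands in for the Linnorm transformation, no filtering or imputation is done, the number of clusters is fixed at 2 rather than selected, and all object names are invented.

```r
set.seed(42)
# Simulated counts: 50 genes (rows) x 20 samples (columns), two groups of 10.
# The first 10 genes are expressed more highly in the second group.
n_genes <- 50
base_mean <- rep(20, n_genes)
shifted_mean <- base_mean
shifted_mean[1:10] <- 200
group1 <- matrix(rpois(n_genes * 10, base_mean), nrow = n_genes)
group2 <- matrix(rpois(n_genes * 10, shifted_mean), nrow = n_genes)
counts <- cbind(group1, group2)
colnames(counts) <- paste0("cell", 1:20)

# Stand-in transformation (NOT Linnorm): log1p compresses the count scale.
transformed <- log1p(counts)

# PCA with samples as observations, so transpose genes x samples first.
pca <- prcomp(t(transformed), scale. = FALSE)

# Cluster samples on the first num_PC principal components.
# iter.max = 2000 mirrors the kmeans.iter default (assumed mapping).
num_PC <- 3
km <- kmeans(pca$x[, 1:num_PC], centers = 2, iter.max = 2000, nstart = 10)

# Cluster assignment per sample.
km$cluster
```

In this simulation the two groups differ strongly, so the clusters should coincide with the simulated groups. Real single-cell data are rarely this clean.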

Value

It returns a list with the following objects:

Examples

# Load the package that provides Linnorm.PCA and the example data:
library(Linnorm)
# Obtain example matrix:
data(Islam2011)
# Example:
PCA.results <- Linnorm.PCA(Islam2011)

Linnorm documentation built on Nov. 8, 2020, 6:48 p.m.