neighSmooth: Euclidean neighbor smoothing

View source: R/neighSmooth.R

neighSmooth R Documentation

Euclidean neighbor smoothing

Description

For each event, this function constructs a variable that shows the average value among the event's k nearest neighbors in Euclidean space. First, the k nearest neighbors are identified for cell x. Then, the average value across these neighbors is returned as the result for cell x. The function builds on an idea previously put forward in the following works:

- Burns TJ (2019). Sconify: A toolkit for performing KNN-based statistics for flow and mass cytometry data. R package version 1.4.0.

- Hart GT, Tran TM, Theorell J, Schlums H, Arora G, Rajagopalan S, et al. Adaptive NK cells in people exposed to Plasmodium falciparum correlate with protection from malaria. J Exp Med. 2019 Jun 3;216(6):1280–90.
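The two steps described above can be illustrated with a minimal brute-force sketch in base R. This is not the package's implementation: knnMeanSmooth is a hypothetical helper, it computes the full pairwise distance matrix (feasible only for small data), and it assumes that each event counts among its own k neighbors.

```r
# Hypothetical helper, not part of DepecheR: brute-force k-nearest-neighbor
# mean smoothing using base R only.
knnMeanSmooth <- function(focus, space, k) {
    space <- as.matrix(space)
    # Full pairwise Euclidean distance matrix (only feasible for small data)
    distMat <- as.matrix(dist(space))
    vapply(seq_len(nrow(space)), function(i) {
        # order() puts the event itself first (distance 0); keeping the
        # event among its own neighbors is an assumption of this sketch
        neighbors <- order(distMat[i, ])[seq_len(k)]
        mean(focus[neighbors])
    }, numeric(1))
}

# Toy data: two well-separated groups on a line, with noisy 0/1 labels
space <- c(0, 0.1, 0.2, 10, 10.1, 10.2)
focus <- c(0, 0, 1, 1, 1, 0)
smoothed <- knnMeanSmooth(focus, space, k = 3)
# Each event gets the mean label of its group: 1/3 for the first three
# events and 2/3 for the last three
```

In this toy case the smoothing replaces each noisy label with the local group average, which is the behavior the description refers to.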

Usage

neighSmooth(
  focusData,
  euclidSpaceData,
  neighRows = seq_len(nrow(as.matrix(focusData))),
  ctrlRows,
  kNeighK = max(100, round(nrow(as.matrix(euclidSpaceData))/10000)),
  kMeansK = max(1, round(nrow(as.matrix(euclidSpaceData))/1000)),
  method = "mean"
)
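
The default expressions for kNeighK and kMeansK in the signature above scale with the number of rows in euclidSpaceData. As a small illustration, the sketch below re-evaluates those two expressions for a given number of events. The helper defaultKs is hypothetical and only mirrors the defaults shown in the signature.

```r
# Hypothetical helper that re-evaluates the default expressions from the
# Usage section for a given number of events (rows in euclidSpaceData)
defaultKs <- function(nEvents) {
    c(
        kNeighK = max(100, round(nEvents / 10000)),
        kMeansK = max(1, round(nEvents / 1000))
    )
}

smallDefaults <- defaultKs(20000)  # kNeighK = 100, kMeansK = 20
largeDefaults <- defaultKs(2e6)    # kNeighK = 200, kMeansK = 2000
```

With these defaults, kNeighK stays at 100 until the dataset exceeds roughly one million events, whereas kMeansK grows by one cluster per thousand events.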

Arguments

focusData

The data that should be smoothed. Should be a matrix with the variables to be smoothed as columns.

euclidSpaceData

The data cloud in which the nearest neighbors for the events should be identified. Can be a vector, matrix or dataframe.

neighRows

The rows in the dataset that correspond to the neighbors of the focusData points. This can be all the focusData points (the default) or a subset, depending on the setup.

ctrlRows

Optionally, a set of control rows that are used to remove background signal from the neighRows data before the result is returned.

kNeighK

The number of nearest neighbors. By default 100, or the number of rows in euclidSpaceData divided by 10000 if that is larger.

kMeansK

The number of clusters in the initial step of the algorithm. A higher number leads to shorter runtime, but potentially lower accuracy.

method

The method to use for the smoothing. Three values are possible: "mean" (default), "median" and "mode".
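
To illustrate how the three method values differ, the sketch below summarizes the focus values of one event's neighbors in each of the three ways. The helper summarizeNeighbors is hypothetical and not part of the package. Base R has no function for the statistical mode, so this sketch takes the most frequent value, which is an assumption about how the mode is defined here.

```r
# Hypothetical helper, not part of DepecheR: summarize the focus values of
# one event's neighbors with the chosen method.
summarizeNeighbors <- function(values, method = "mean") {
    switch(method,
        mean = mean(values),
        median = median(values),
        # Most frequent value; table() sorts by value, so ties are
        # resolved in favor of the smallest value
        mode = {
            counts <- table(values)
            as.numeric(names(counts)[which.max(counts)])
        },
        stop("method must be 'mean', 'median' or 'mode'")
    )
}

neighborValues <- c(1, 1, 1, 2, 10)
summarizeNeighbors(neighborValues, "mean")    # 3
summarizeNeighbors(neighborValues, "median")  # 1
summarizeNeighbors(neighborValues, "mode")    # 1
```

The example shows why the choice matters: a single outlying neighbor (10) pulls the mean up, while the median and the mode are unaffected.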

Value

A smoothed object with the same dimensions as focusData.

Examples

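# Load the example datasets
data(testData)
data(testDataSNE)
# Select the markers that define the Euclidean space for the neighbor search
euclidSpaceData <-
    testData[, c(
        "SYK", "CD16", "CD57", "EAT.2",
        "CD8", "NKG2C", "CD2", "CD56"
    )]
## Not run: 
# Smooth the numeric-coded label column over each event's nearest neighbors
smoothGroupVector <- neighSmooth(
    focusData = as.numeric(testData$label),
    euclidSpaceData
)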

## End(Not run)

Theorell/DepecheR documentation built on July 27, 2023, 8:13 p.m.