wasserstein.test: Two-sample test to check for differences between two...
In goncalves-lab/waddR: Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

wasserstein.test

R Documentation

Two-sample test to check for differences between two distributions using the 2-Wasserstein distance

Description

Two-sample test to check for differences between two distributions using the 2-Wasserstein distance, either using the semi-parametric permutation testing procedure with a generalized Pareto distribution (GPD) approximation to estimate small p-values accurately or the test based on asymptotic theory

Usage

wasserstein.test(x, y, method = c("SP", "ASY"), permnum = 10000)

Arguments

`x`	sample (vector) representing the distribution of condition `A`
`y`	sample (vector) representing the distribution of condition `B`
`method`	testing procedure to be employed: "SP" for the semi-parametric permutation testing procedure with GPD approximation, "ASY" for the test based on asymptotic theory; if no method is specified, "SP" will be used by default.
`permnum`	number of permutations used in the permutation testing procedure (if `method="SP"` is performed); default is 10000

Details

Details concerning the two testing procedures (i.e. the semi-parametric permutation testing procedure with GPD approximation and the test based on asymptotic theory) can be found in Schefzik et al. (2020).

Note that the asymptotic theory-based test (method="ASY") should only be employed when the samples x and y can be assumed to come from continuous distributions. In contrast, the semi-parametric test (method="SP") can be used for samples coming from continuous or discrete distributions.

Value

A vector, see Schefzik et al. (2020) for details:

d.wass: 2-Wasserstein distance between the two samples computed by quantile approximation
d.wass^2: squared 2-Wasserstein distance between the two samples computed by quantile approximation
d.comp^2: squared 2-Wasserstein distance between the two samples computed by decomposition approximation
d.comp: 2-Wasserstein distance between the two samples computed by decomposition approximation
location: location term in the decomposition of the squared 2-Wasserstein distance between the two samples
size: size term in the decomposition of the squared 2-Wasserstein distance between the two samples
shape: shape term in the decomposition of the squared 2-Wasserstein distance between the two samples
rho: correlation coefficient in the quantile-quantile plot
pval: The p-value of the semi-parametric or the asymptotic theory-based test, depending on the specified method
p.ad.gpd: in case the GPD fitting is performed: p-value of the Anderson-Darling test to check whether the GPD actually fits the data well (otherwise NA). This output is only returned when performing the semi-parametric test (method="SP")!
N.exc: in case the GPD fitting is performed: number of exceedances (starting with 250 and iteratively decreased by 10 if necessary) that are required to obtain a good GPD fit, i.e. p-value of Anderson-Darling test \geq 0.05 (otherwise NA). This output is only returned when performing the semi-parametric test (method="SP")!
perc.loc: fraction (in %) of the location part with respect to the overall squared 2-Wasserstein distance obtained by the decomposition approximation
perc.size: fraction (in %) of the size part with respect to the overall squared 2-Wasserstein distance obtained by the decomposition approximation
perc.shape: fraction (in %) of the shape part with respect to the overall squared 2-Wasserstein distance obtained by the decomposition approximation
decomp.error: relative error between the squared 2-Wasserstein distance computed by the quantile approximation and the squared 2-Wasserstein distance computed by the decomposition approximation

References

Schefzik, R., Flesch, J., and Goncalves, A. (2020). waddR: Using the 2-Wasserstein distance to identify differences between distributions in two-sample testing, with application to single-cell RNA-sequencing data.

Examples

set.seed(24)
x<-rnorm(100)
y1<-rnorm(150)
y2<-rexp(150,3)
y3<-rpois(150,2)

#for reproducibility, set a seed for the semi-parametric, permutation-based test
set.seed(32)
wasserstein.test(x,y1,method="SP",permnum=10000)
wasserstein.test(x,y1,method="ASY")

set.seed(33)
wasserstein.test(x,y2,method="SP",permnum=10000)
wasserstein.test(x,y2,method="ASY")

set.seed(34)
#only consider SP method, as Poisson distribution is discrete
wasserstein.test(x,y3,method="SP",permnum=10000)

goncalves-lab/waddR documentation built on June 29, 2023, 12:18 a.m.

goncalves-lab/waddR index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

goncalves-lab/waddR
Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

wasserstein.test: Two-sample test to check for differences between two...
In goncalves-lab/waddR: Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

Two-sample test to check for differences between two distributions using the 2-Wasserstein distance

Description

Usage

Arguments

Details

Value

References

Examples

Related to wasserstein.test in goncalves-lab/waddR...

R Package Documentation

Browse R Packages

We want your feedback!

goncalves-lab/waddR Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

wasserstein.test: Two-sample test to check for differences between two... In goncalves-lab/waddR: Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

Two-sample test to check for differences between two distributions using the 2-Wasserstein distance

Description

Usage

Arguments

Details

Value

References

Examples

Related to wasserstein.test in goncalves-lab/waddR...

R Package Documentation

Browse R Packages

We want your feedback!

goncalves-lab/waddR
Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

wasserstein.test: Two-sample test to check for differences between two...
In goncalves-lab/waddR: Statistical tests for detecting differential distributions based on the 2-Wasserstein distance