View source: R/mvplnDataGenerator.R
mvplnDataGenerator | R Documentation |
This function simulates data from a mixture of matrix variate Poisson-log normal (MVPLN) distributions. Each dataset contains n random matrices (units), each of dimension r x p, where r is the number of occasions and p is the number of responses/variables.
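To make the model concrete, the following is a minimal sketch of how a single r x p count matrix could be drawn under a matrix variate Poisson-log normal model. It is an illustration under stated assumptions, not the package's implementation: it assumes the latent matrix X has column-stacked covariance kronecker(Omega, Phi), assumes counts are Poisson with rate exp(X), and ignores any normalization the package may apply. The parameter values are made up for the example.

```r
# Illustrative sketch only; not the mixMVPLN implementation.
set.seed(1)
r <- 2  # number of occasions
p <- 3  # number of responses/variables

M   <- matrix(6, nrow = r, ncol = p)            # r x p matrix of means
Phi <- matrix(c(1.0, 0.3,
                0.3, 1.3), nrow = r)            # r x r covariance between occasions
Omega <- matrix(0.2, nrow = p, ncol = p)        # p x p covariance between responses
diag(Omega) <- 0.7

# Covariance of the column-stacked latent matrix vec(X)
Sigma <- kronecker(Omega, Phi)                  # (r * p) x (r * p)

# Draw vec(X) ~ N(vec(M), Sigma) via the Cholesky factor
L <- t(chol(Sigma))                             # lower triangular, L %*% t(L) == Sigma
x <- as.vector(M) + as.vector(L %*% rnorm(r * p))
X <- matrix(x, nrow = r, ncol = p)              # latent log-rates, r x p

# Counts are conditionally Poisson given the latent matrix
Y <- matrix(rpois(r * p, lambda = exp(x)), nrow = r, ncol = p)
```

Repeating the draw n times, with the component parameters chosen according to the mixing proportions, would give a dataset of the shape described above.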
mvplnDataGenerator(
nOccasions,
nResponses,
nUnits,
mixingProportions,
matrixMean,
phi,
omega
)
nOccasions |
A positive integer indicating the number of occasions (r). Each matrix Y_j has size r x p, and the dataset contains n such matrices, j = 1,...,n. Each Y_j holds k ∈ 1,...,p responses/variables over i ∈ 1,...,r occasions. |
nResponses |
A positive integer indicating the number of responses/variables (p). Each matrix Y_j has size r x p, and the dataset contains n such matrices, j = 1,...,n. Each Y_j holds k ∈ 1,...,p responses/variables over i ∈ 1,...,r occasions. |
nUnits |
A positive integer indicating the number of units (n). Each matrix Y_j has size r x p, and the dataset contains n such matrices, j = 1,...,n. |
mixingProportions |
A numeric vector of length equal to the total number of components, giving the proportion of each component. The entries should sum to 1. |
matrixMean |
The matrices of means (M), one r x p matrix per component/cluster. The matrices for all components should be stacked row-wise via rbind, giving (G * r) rows and p columns for G components. See example. |
phi |
The covariance matrices containing the variances and covariances between the r occasions, one r x r matrix per component/cluster. The matrices for all components should be stacked row-wise via rbind. See example. |
omega |
The covariance matrices containing the variances and covariances of the p responses/variables, one p x p matrix per component/cluster. The matrices for all components should be stacked row-wise via rbind. See example. |
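Because the per-component matrices are stacked with rbind, a dimension mismatch is easy to introduce. The sketch below shows one way to check the stacked inputs before calling the generator. The helper checkStackedInputs is hypothetical and not part of mixMVPLN; it assumes only what the argument descriptions state, namely that G components yield (G * r) x p, (G * r) x r and (G * p) x p stacked matrices and that the mixing proportions sum to 1.

```r
# Hypothetical helper for illustration; not part of mixMVPLN.
checkStackedInputs <- function(nOccasions, nResponses, mixingProportions,
                               matrixMean, phi, omega) {
  G <- length(mixingProportions)  # number of components
  stopifnot(
    isTRUE(all.equal(sum(mixingProportions), 1)),
    is.matrix(matrixMean),
    all(dim(matrixMean) == c(G * nOccasions, nResponses)),
    is.matrix(phi),
    all(dim(phi) == c(G * nOccasions, nOccasions)),
    is.matrix(omega),
    all(dim(omega) == c(G * nResponses, nResponses))
  )
  invisible(TRUE)
}

# Two components, r = 2 occasions, p = 3 responses (placeholder values)
stackedM     <- rbind(matrix(6, nrow = 2, ncol = 3),
                      matrix(1, nrow = 2, ncol = 3))   # 4 x 3
stackedPhi   <- rbind(diag(2), diag(2))                # 4 x 2
stackedOmega <- rbind(diag(3), diag(3))                # 6 x 3

checkStackedInputs(nOccasions = 2, nResponses = 3,
                   mixingProportions = c(0.79, 0.21),
                   matrixMean = stackedM,
                   phi = stackedPhi,
                   omega = stackedOmega)
```

The check stops with an error if any stacked matrix has the wrong shape, for example when the mean matrix of one component is left out of the rbind call.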
Returns an S3 object of class mvplnDataGenerator with results.
dataset - Simulated dataset of n matrices, each of dimension r x p, where r is the number of occasions and p is the number of responses/variables.
truemembership - A numeric vector indicating the membership of each observation.
units - A positive integer indicating the number of units used for simulating the data.
occassions - A positive integer indicating the number of occasions used for simulating the data.
variables - A positive integer indicating the number of responses/variables used for simulating the data.
mixingProportions - A numeric vector indicating the mixing proportion of each component.
means - Matrix of means used for simulating the data.
phi - Covariance matrix containing the variances and covariances between the r occasions used for simulating the data.
psi - Covariance matrix containing the variances and covariances of the p responses/variables used for simulating the data.
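As an aside on how truemembership relates to mixingProportions, the sketch below draws component labels for n units and compares the observed proportions with the requested ones. It assumes labels are sampled independently with the mixing proportions as probabilities; the package may assign memberships differently, so treat this purely as an illustration.

```r
# Illustrative sketch; the package may assign memberships differently.
set.seed(1234)
mixingProportions <- c(0.79, 0.21)
nUnits <- 100

# One component label per unit
membership <- sample(seq_along(mixingProportions),
                     size = nUnits,
                     replace = TRUE,
                     prob = mixingProportions)

# Observed share of units in each component
observed <- table(factor(membership,
                         levels = seq_along(mixingProportions))) / nUnits
print(observed)
```

With a finite number of units the observed proportions will generally differ somewhat from the requested mixing proportions.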
Anjali Silva, anjali@alumni.uoguelph.ca, Sanjeena Dang, sanjeenadang@cunet.carleton.ca.
Silva, A. et al. (2018). Finite Mixtures of Matrix Variate Poisson-Log Normal Distributions for Three-Way Count Data. arXiv preprint arXiv:1807.08380.
Aitchison, J. and C. H. Ho (1989). The multivariate Poisson-log normal distribution. Biometrika 76.
Silva, A. et al. (2019). A multivariate Poisson-log normal mixture model for clustering transcriptome sequencing data. BMC Bioinformatics 20.
# Example 1
# Generating simulated matrix variate count data
set.seed(1234)
trueG <- 2 # total number of components (G)
truer <- 2 # total number of occasions (r)
truep <- 3 # total number of responses (p)
trueN <- 100 # total number of units (n)
truePiG <- c(0.79, 0.21) # mixing proportions, one per component
# M is an r x p matrix of means, one per component
trueM1 <- matrix(rep(6, (truer * truep)),
                 ncol = truep,
                 nrow = truer,
                 byrow = TRUE)
trueM2 <- matrix(rep(1, (truer * truep)),
                 ncol = truep,
                 nrow = truer,
                 byrow = TRUE)
# Stack the component mean matrices row-wise
trueMall <- rbind(trueM1, trueM2)
# Phi is an r x r matrix
# Loading needed packages for generating data
# if (!require(clusterGeneration)) install.packages("clusterGeneration")
# library("clusterGeneration")
# Covariance matrix containing variances and covariances between r occasions
# truePhi1 <- clusterGeneration::genPositiveDefMat("unifcorrmat",
# dim = truer,
# rangeVar = c(1, 1.7))$Sigma
truePhi1 <- matrix(c(1.075551, -0.488301,
-0.488301, 1.362777), nrow = 2)
truePhi1[1, 1] <- 1 # For identifiability issues
# truePhi2 <- clusterGeneration::genPositiveDefMat("unifcorrmat",
# dim = truer,
# rangeVar = c(0.7, 0.7))$Sigma
truePhi2 <- matrix(c(0.7000000, 0.6585887,
0.6585887, 0.7000000), nrow = 2)
truePhi2[1, 1] <- 1 # For identifiability issues
truePhiall <- rbind(truePhi1, truePhi2)
# Omega is a p x p matrix
# Covariance matrix containing variances and covariances between p responses
# trueOmega1 <- clusterGeneration::genPositiveDefMat("unifcorrmat", dim = truep,
# rangeVar = c(1, 1.7))$Sigma
trueOmega1 <- matrix(c(1.0526554, 1.0841910, -0.7976842,
1.0841910, 1.1518811, -0.8068102,
-0.7976842, -0.8068102, 1.4090578),
nrow = 3)
# trueOmega2 <- clusterGeneration::genPositiveDefMat("unifcorrmat", dim = truep,
# rangeVar = c(0.7, 0.7))$Sigma
trueOmega2 <- matrix(c(0.7000000, 0.5513744, 0.4441598,
0.5513744, 0.7000000, 0.4726577,
0.4441598, 0.4726577, 0.7000000),
nrow = 3)
trueOmegaAll <- rbind(trueOmega1, trueOmega2)
# Generate simulated data
sampleData <- mixMVPLN::mvplnDataGenerator(nOccasions = truer,
                                           nResponses = truep,
                                           nUnits = trueN,
                                           mixingProportions = truePiG,
                                           matrixMean = trueMall,
                                           phi = truePhiall,
                                           omega = trueOmegaAll)
# Example 2
trueG <- 1 # total number of components (G)
truer <- 2 # total number of occasions (r)
truep <- 3 # total number of responses (p)
trueN <- 1000 # total number of units (n)
truePiG <- 1L # mixing proportion for G = 1
# M is an r x p matrix of means
trueM1 <- matrix(c(6, 5.5, 6, 6, 5.5, 6),
                 ncol = truep,
                 nrow = truer,
                 byrow = TRUE)
trueMall <- rbind(trueM1)
# Phi is an r x r matrix
set.seed(1)
# truePhi1 <- clusterGeneration::genPositiveDefMat(
# "unifcorrmat",
# dim = truer,
# rangeVar = c(0.7, 1.7))$Sigma
truePhi1 <- matrix(c(1.3092747, 0.3219674,
0.3219674, 1.3233794), nrow = 2)
truePhi1[1, 1] <- 1 # for identifiability issues
truePhiall <- rbind(truePhi1)
# Omega is a p x p matrix
set.seed(1)
# trueOmega1 <- clusterGeneration::genPositiveDefMat(
# "unifcorrmat",
# dim = truep,
# rangeVar = c(1, 1.7))$Sigma
trueOmega1 <- matrix(c(1.1625581, 0.9157741, 0.8203499,
0.9157741, 1.2216287, 0.7108193,
0.8203499, 0.7108193, 1.2118854), nrow = 3)
trueOmegaAll <- rbind(trueOmega1)
# Generate simulated data
set.seed(1)
sampleData2 <- mixMVPLN::mvplnDataGenerator(
nOccasions = truer,
nResponses = truep,
nUnits = trueN,
mixingProportions = truePiG,
matrixMean = trueMall,
phi = truePhiall,
omega = trueOmegaAll)
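The examples overwrite the [1, 1] entry of each Phi matrix with 1 for identifiability. Editing a single entry of a covariance matrix can in principle break positive definiteness, so a quick check may be worthwhile. The sketch below applies such a check to the two Phi matrices from Example 1. The helper isPositiveDefinite is hypothetical and not part of mixMVPLN; the tolerance is an arbitrary choice.

```r
# Hypothetical helper for illustration; not part of mixMVPLN.
isPositiveDefinite <- function(S, tol = 1e-8) {
  # Symmetric with all eigenvalues strictly above the tolerance
  isSymmetric(S) &&
    all(eigen(S, symmetric = TRUE, only.values = TRUE)$values > tol)
}

# Phi matrices from Example 1, with the [1, 1] entry fixed at 1
checkPhi1 <- matrix(c(1.075551, -0.488301,
                      -0.488301, 1.362777), nrow = 2)
checkPhi1[1, 1] <- 1
checkPhi2 <- matrix(c(0.7000000, 0.6585887,
                      0.6585887, 0.7000000), nrow = 2)
checkPhi2[1, 1] <- 1

phiIsValid <- c(isPositiveDefinite(checkPhi1),
                isPositiveDefinite(checkPhi2))

# A symmetric matrix that is not positive definite, for contrast
badMatrix <- matrix(c(1, 2,
                      2, 1), nrow = 2)
badIsValid <- isPositiveDefinite(badMatrix)
```

The same check can be applied to each Omega matrix before the matrices are stacked with rbind.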