This class provides fast, memory-efficient manipulation of large datasets presented in matrix form. It is used to load, store, and manipulate large datasets, e.g. genotype and gene expression matrices. When a dataset is loaded, it is sliced into blocks of 1,000 rows (the default size). This allows imputing, standardizing, and performing other operations on the data with minimal memory overhead.
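The slicing idea can be sketched in base R. This is only an illustration of processing a matrix block by block, not the package's implementation; the helper `slice_rows` and the random data are hypothetical.

```r
# Hypothetical helper (not part of MatrixEQTL): split row indices into
# consecutive blocks of at most slice_size rows.
slice_rows <- function(n_rows, slice_size = 1000L) {
  idx <- seq_len(n_rows)
  split(idx, ceiling(idx / slice_size))
}

# Illustrative data: 2,500 rows, 4 columns
mat <- matrix(rnorm(2500 * 4), nrow = 2500)
blocks <- slice_rows(nrow(mat))

# Work on one block at a time, here centering each row of the block,
# so that only one block needs to be modified in memory at once.
centered <- lapply(blocks, function(idx) {
  block <- mat[idx, , drop = FALSE]
  block - rowMeans(block)
})

block_sizes <- lengths(blocks)  # two full blocks and one partial block
```

With the default block size, 2,500 rows give two blocks of 1,000 rows and a final block of 500.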
# x[[i]] indexing allows easy access to individual slices.
# It is equivalent to x$getSlice(i) and x$setSlice(i, value)
## S4 method for signature 'SlicedData'
x[[i]]
## S4 replacement method for signature 'SlicedData'
x[[i]] <- value
# The following commands work as if x was a simple matrix object
## S4 method for signature 'SlicedData'
nrow(x)
## S4 method for signature 'SlicedData'
ncol(x)
## S4 method for signature 'SlicedData'
dim(x)
## S4 method for signature 'SlicedData'
rownames(x)
## S4 method for signature 'SlicedData'
colnames(x)
## S4 replacement method for signature 'SlicedData'
rownames(x) <- value
## S4 replacement method for signature 'SlicedData'
colnames(x) <- value
# SlicedData object can be easily transformed into a matrix
# preserving row and column names
## S4 method for signature 'SlicedData'
as.matrix(x)
# length(x) can be used in place of x$nSlices()
# to get the number of slices in the object
## S4 method for signature 'SlicedData'
length(x)
x | A SlicedData object.
i | Number of a slice.
value | New content for the slice / new row or column names.
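As a brief illustration of the generics listed above, the sketch below builds a small SlicedData object and queries it as if it were a matrix. It assumes the MatrixEQTL package is installed; the matrix, its names, and the variable names are made up for the example.

```r
library(MatrixEQTL)  # assumption: the package is installed

# Toy 4 x 3 matrix; values and names are illustrative only
mat <- matrix(as.numeric(1:12), nrow = 4, ncol = 3,
              dimnames = list(paste0("row", 1:4), paste0("col", 1:3)))

sliced <- SlicedData$new()
sliced$CreateFromMatrix(mat)
sliced$ResliceCombined(sliceSize = 2L)  # two slices of two rows each

dims <- dim(sliced)          # rows and columns, as for a matrix
n_slices <- length(sliced)   # same as sliced$nSlices()
second <- sliced[[2]]        # same as sliced$getSlice(2): rows 3 and 4
row_labels <- rownames(sliced)
col_labels <- colnames(sliced)

# Replace the second slice with its values doubled
sliced[[2]] <- second * 2
full <- as.matrix(sliced)    # plain matrix with row and column names
```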
SlicedData is a reference class (envRefClass). Its methods can change the values of the fields of the class.
dataEnv: environment. Stores the slices of the data matrix. The slices should be accessed via the getSlice() and setSlice() methods.
nSlices1: numeric. Number of slices. For internal use. The value should be accessed via the nSlices() method.
rowNameSlices: list. Slices of row names.
columnNames: character. Column names.
fileDelimiter: character. Delimiter separating values in the input file.
fileSkipColumns: numeric. Number of columns with row labels in the input file.
fileSkipRows: numeric. Number of rows with column labels in the input file.
fileSliceSize: numeric. Maximum number of rows in a slice.
fileOmitCharacters: character. Missing value (NaN) representation in the input file.
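The file* fields describe the input format read by the LoadFile method. The sketch below sets them explicitly and loads a tiny file. It assumes the MatrixEQTL package is installed; the temporary file, its contents, and the variable names are invented for illustration.

```r
library(MatrixEQTL)  # assumption: the package is installed

# Write a small tab-delimited file: one header row, one column of row labels
path <- tempfile(fileext = ".txt")
writeLines(c("id\ts1\ts2",
             "gene1\t1\t4",
             "gene2\t2\t5",
             "gene3\tNA\t6"), path)

expr <- SlicedData$new()
expr$fileDelimiter <- "\t"       # values are tab-separated
expr$fileOmitCharacters <- "NA"  # how missing values are written in the file
expr$fileSkipRows <- 1           # one row of column labels
expr$fileSkipColumns <- 1        # one column of row labels
expr$fileSliceSize <- 2          # at most two rows per slice
expr$LoadFile(path)

loaded_dim <- dim(expr)
loaded_rows <- rownames(expr)
loaded_cols <- colnames(expr)
unlink(path)  # remove the temporary file
```

Setting the fields before calling LoadFile keeps the format description on the object, so the same settings apply if the object is reused for another file.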
initialize(mat): Creates the object from a matrix.
nSlices(): Returns the number of slices.
nCols(): Returns the number of columns in the matrix.
nRows(): Returns the number of rows in the matrix.
Clear(): Clears the object. Removes the data slices and the row and column names.
Clone(): Makes a copy of the object. Changes to the copy do not affect the source object.
CreateFromMatrix(mat): Creates a SlicedData object from a matrix.
LoadFile(filename, skipRows = NULL, skipColumns = NULL, sliceSize = NULL, omitCharacters = NULL, delimiter = NULL, rowNamesColumn = 1): Loads a data matrix from a file. filename should be a character string. The remaining parameters specify the file format and have the same meaning as the file* fields. The additional rowNamesColumn parameter specifies which of the columns of row labels to use as row names.
SaveFile(filename): Saves the data to a file. filename should be a character string.
getSlice(sl): Retrieves the sl-th slice of the matrix.
setSlice(sl, value): Sets the sl-th slice of the matrix.
ColumnSubsample(subset): Reorders/subsets the columns according to subset. Acts as M = M[ ,subset] for a matrix M.
RowReorder(ordr): Reorders rows according to ordr. Acts as M = M[ordr, ] for a matrix M.
RowMatrixMultiply(multiplier): Multiplies each row by the multiplier. Acts as M = M %*% multiplier for a matrix M.
CombineInOneSlice(): Combines all slices into one. The whole matrix can then be obtained via $getSlice(1).
IsCombined(): Returns TRUE if the number of slices is 1 or 0.
ResliceCombined(sliceSize = -1): Cuts the data into slices of sliceSize rows. If sliceSize is not defined, the value of the fileSliceSize field is used.
GetAllRowNames(): Returns all row names in one vector.
RowStandardizeCentered(): Sets the mean of each row to zero and the sum of squares to one.
SetNanRowMean(): Imputes missing values in each row with the row mean. Rows full of NaN values are imputed with zeros.
RowRemoveZeroEps(): Removes rows of zeros and those that are nearly zero.
FindRow(rowname): Finds a row by name. Returns a pair: the slice number and the row number within the slice. If no row is found, the function returns NULL.
rowMeans(x, na.rm = FALSE, dims = 1L): Returns a vector of row means. Works as rowMeans but requires dims to be equal to 1L.
rowSums(x, na.rm = FALSE, dims = 1L): Returns a vector of row sums. Works as rowSums but requires dims to be equal to 1L.
colMeans(x, na.rm = FALSE, dims = 1L): Returns a vector of column means. Works as colMeans but requires dims to be equal to 1L.
colSums(x, na.rm = FALSE, dims = 1L): Returns a vector of column sums. Works as colSums but requires dims to be equal to 1L.
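Several of the methods above can be chained on one object. The following is a minimal sketch, assuming the MatrixEQTL package is installed; the toy matrix, the row name "no_such_row", and the variable names are hypothetical and chosen only to make the effect of each call easy to follow.

```r
library(MatrixEQTL)  # assumption: the package is installed

# Toy 4 x 3 matrix with one missing value; rows are
# row1 = (1, 5, 9), row2 = (2, 6, NA), row3 = (3, 7, 11), row4 = (4, 8, 12)
mat <- matrix(as.numeric(1:12), nrow = 4, ncol = 3,
              dimnames = list(paste0("row", 1:4), paste0("col", 1:3)))
mat[2, 3] <- NA

sliced <- SlicedData$new()
sliced$CreateFromMatrix(mat)
sliced$ResliceCombined(sliceSize = 2L)  # two slices of two rows each
combined <- sliced$IsCombined()         # FALSE while there are two slices

sliced$SetNanRowMean()           # NA in row2 becomes the mean of 2 and 6
sliced$ColumnSubsample(c(1, 3))  # acts like M[, c(1, 3)]
sliced$RowReorder(c(3, 1, 2))    # acts like M[c(3, 1, 2), ]

sliced$CombineInOneSlice()       # gather everything into slice 1
result <- sliced$getSlice(1)
missing_row <- sliced$FindRow("no_such_row")  # NULL when the name is absent
```

Because SlicedData is a reference class, each call modifies the object in place; use Clone() first if the original data must be kept.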
Andrey Shabalin ashabalin@vcu.edu
The package website: http://www.bios.unc.edu/research/genomic_software/Matrix_eQTL/
# Create a SlicedData variable