Description Usage Arguments Details Value Negative, non-positive, and saturated values Missing values Weighted normalization Within-slide normalization Author(s) References See Also Examples
Weighted affine normalization between channels and arrays.
This method will both remove curvature in the M vs A plots that are due to an affine transformation of the data. In other words, if there are (small or large) biases in the different (red or green) channels, biases that can be equal too, you will get curvature in the M vs A plots and this type of curvature will be removed by this normalization method.
Moreover, if you normalize all slides at once (recommended), this method will also bring the signals on the same scale such that the log-ratios for different slides are comparable. Thus, do not normalize the scale of the log-ratios between slides afterward.
It is recommended to normalize as many slides as possible in one run. The result is that if creating log-ratios between any channels and any slides, they will contain as little curvature as possible.
Furthermore, since the relative scale between any two channels on any two slides will be one if one normalizes all slides (and channels) at once it is possible to add or multiply with the same constant to all channels/arrays without introducing curvature. Thus, it is easy to rescale the data afterwards as demonstrated in the example.
1 2 | ## S3 method for class 'RGData'
normalizeAffine(this, slides=NULL, groupBy=NULL, ...)
|
slides |
|
groupBy |
|
... |
Additional arguments accepted by
|
A line is fitted robustly throught the (y_R,y_G) observations using an iterated re-weighted principal component analysis (IWPCA), which minimized the residuals that are orthogonal to the fitted line. Each observation is down-weighted by the inverse of the absolute residuals, i.e. in L_1.
Returns a list
containing estimated biases, slopes, and distances
to the optimal line for each slide.
Affine normalization applies equally well to negative values. Thus,
contrary to normalization methods applied to log-ratios, such as curve-fit
normalization methods, affine normalization, will not set these to NA
.
Data points that are saturated in one or more channels are not used to estimate the normalization function, but they are normalized.
The estimation of the affine normalization function will be made based
on only complete non-saturated observations, i.e. observations that
contains no NA
values nor saturated values as defined by satSignal
.
Each data point can be assigned a weight in [0,1] specifying how much
it should affect the fitting of the affine normalization function.
Note that here a data point is here considered to be the vector
of values for all channels and all arrays included in the normalization.
For instance, for two-channel data with three arrays, each data point is
a vector (R1,G1,R2,G2,R3,G3) of length 6 where R and G represent the
red and the green channels.
Regardless of weights, all data points are normalized based on the fitted normalization function.
Data-point weights are obtained from probe weights, if given.
Weights can be set using *setProbeWeights()
.
If weights are specified, they will be used.
This normalization method normalized multiple channels/arrays at once.
To normalize array by array, like curve-fit normalization, use a loop,
e.g. for (kk in 1:nbrOfSlides(rg)) normalizeAffine(rg, slide=kk)
.
Henrik Bengtsson (http://www.braju.com/R/)
[1] Henrik Bengtsson and Ola Hössjer, Methodological Study of Affine Transformations of Gene Expression Data, Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method, BMC Bioinformatics, 2006, 7:100.
Internally, the light-weight function
normalizeAffine.matrix
is used.
For more information see RGData
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 | # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# First some utilities functions
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
plotPairwiseScans <- function(rg, channel, ...) {
pair <- NULL;
for (ii in 1:(nbrOfSlides(rg)-1)) {
for (jj in (ii+1):nbrOfSlides(rg)) {
pair <- cbind(pair, c(ii,jj));
}
}
opar <- par(mar=c(5,5,3,1));
xlab <- substitute(y[channel]^{(v)}, list=list(channel=channel));
ylab <- substitute(y[channel]^{(w)}, list=list(channel=channel));
plot(NA, xlab=xlab, ylab=ylab, ...);
box(lwd=2, col=switch(channel, "R"="red", "G"="green"));
colors <- 1:ncol(pair)+3;
for (kk in 1:ncol(pair)) {
s <- pair[1,kk];
t <- pair[2,kk];
yv <- rg[[channel]][,s];
yw <- rg[[channel]][,t];
ok <- (yv > 0 & yw > 0 & is.finite(yv) & is.finite(yw));
mc <- log((yv/yw)[ok], base=2);
ac <- log((yv*yw)[ok], base=2)/2;
points(ac,mc, pch=".", col=colors[kk]);
}
# Add a legend
names <- getSlideName(rg);
pairNames <- apply(pair, MARGIN=2, FUN=function(x) paste(names[x], collapse=","));
pairNames <- paste("(", pairNames, ")", sep="");
usr <- par("usr")
legend(x=usr[2],y=usr[3], legend=pairNames, fill=colors, xjust=1, yjust=0, cex=0.7);
par(opar);
}
plotPairDensities <- function(rg, xlim=c(-2,18), ylim=c(0,0.8), ...) {
colors <- seq(rg)+3;
opar <- par(mar=c(5,5,3,1));
xlab <- expression(log[2](y[c]));
plot(NA, xlim=xlim, ylim=ylim, xlab=xlab, ylab="density", ...);
for (ch in c("R", "G")) {
col <- switch(ch, "R"="red", "G"="green");
for (slide in seq(rg)) {
x <- rg[[ch]][,slide];
line <- density(log(x[is.finite(x) & x > 0], base=2));
lines(line, col=col, lwd=4);
lines(line, col=colors[slide], lwd=1);
}
}
# Add a legend
names <- getSlideName(rg);
if (!is.null(names)) {
usr <- par("usr")
legend(x=usr[2],y=usr[4], legend=names, fill=colors, xjust=1, yjust=1, cex=0.7);
}
par(opar);
}
# Draw the (R,G) grid of the (fitted) affine model
drawRGGrid <- function(maxSignal=16, by=1, drawCurve=TRUE, highlightLog1=FALSE, afit=NULL, aR=NULL, aG=NULL, b=NULL) {
x <- seq(-2*maxSignal,maxSignal, by=by)
# The grid ticks on the non-logarithmic scale and shifted -1.
X <- 2^x
# The input (R,G) grid.
R <- matrix(X, nrow=length(X), ncol=length(X))
G <- t(R)
if (!is.null(afit)) {
if (is.null(aR))
aR <- afit$a[1];
if (is.null(aG))
aG <- afit$a[2];
if (is.null(b))
b <- max(afit$b[-1]);
}
if (!is.null(aR) & !is.null(aG) & !is.null(b)) {
R <- aR + R
G <- aG + b*G
}
r <- log(R, base=2)
g <- log(G, base=2)
m <- r-g
a <- (r+g)/2
drawGrid(a,m, col="gray");
if (highlightLog1) {
z <- which(x == 0);
col <- c("lightblue", "gray");
lwd <- c(2,1);
for (kk in 1:2) {
lines(a[z,],m[z,], col=col[kk], lwd=lwd[kk]);
lines(a[,z],m[,z], col=col[kk], lwd=lwd[kk]);
}
}
if (drawCurve) {
lines(diag(a), diag(m), col="blue", lty=4, lwd=2);
}
} # drawRGGrid()
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Main example code
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# One array was scanned four times at four different PMT settings.
rg0 <- RGData$read("PMT-RGData.dat", path=system.file("data-ex", package="aroma.light"));
setLayout(rg0, Layout(4,4,17,17)); # Not really necessary!
setSlideName(rg0, c("500V","600V","700V","800V"));
# Reorder the slides in scan order
keepSlides(rg0, c("800V","500V","600V","700V"));
# Since scan "500V" seems to be an outlier, exclude it!
removeSlides(rg0, slide="500V")
rg <- clone(rg0);
# Multiscan calibration.
fit <- calibrateMultiscan(rg);
Device$set(3, height="108%");
subplots(9, nrow=3);
# Plot such that there is a right angle between the log(R) and log(G) axes.
Alim <- c(-2,18)
Mlim <- c(-1,1)*abs(diff(Alim))
ma <- as.MAData(rg);
plot(ma, xlim=Alim, ylim=Mlim, main="multiscan calibrated");
fit <- normalizeAffine(rg)
drawRGGrid(maxSignal=Alim[2], afit=fit[[1]])
ma <- as.MAData(rg);
plot(ma, xlim=Alim, ylim=Mlim, main="calibrated + normalized");
drawRGGrid(maxSignal=Alim[2])
plotPairDensities(rg, ylim=c(0,0.4), main="calibrated + normalized");
rg <- clone(rg0)
keepSlides(rg, slide="600V")
ma <- as.MAData(rg);
pmt <- paste("PMT", getSlideNames(ma));
plot(ma, xlim=Alim, ylim=Mlim, main=pmt);
cat("Signals (", pmt, ") before normalization:\n", sep="");
print(summary(rg))
fit <- normalizeAffine(rg)
cat("Signals (", pmt, ") after normalization:\n", sep="");
print(summary(rg))
cat("based on the estimates:\n");
str(fit)
drawRGGrid(maxSignal=Alim[2], afit=fit[[1]])
ma <- as.MAData(rg);
plot(ma, xlim=Alim, ylim=Mlim, main=paste(pmt, "normalized"));
drawRGGrid(maxSignal=Alim[2])
plotPairDensities(rg, ylim=c(0,0.4), main=paste(pmt, "normalized"));
rg <- clone(rg0)
ma <- as.MAData(rg);
plot(NA, xlim=Alim, ylim=Mlim, main="All scans");
for (kk in seq(ma))
points(ma, slide=kk, col=kk+1, pch=".");
cat("Signals before normalization:\n", sep="");
print(summary(rg))
fit <- normalizeAffine(rg)
cat("Signals after normalization:\n", sep="");
print(summary(rg))
cat("based on the estimates:\n");
str(fit)
drawRGGrid(maxSignal=Alim[2], afit=fit[[1]])
# To give evidence that all signals now contains the same bias
# and that the relative scale between all scans and all channels
# is one we rescale all signals by the same factor. The only affect
# this will have on the M vs A plot is that the data is shifted
# along the A dimension. Note that there is no rescaling the
# log-ratios!
#scale <- 1/2^5; # Shift such that A <- A - 5;
#for (ch in c("R", "G"))
# rg[[ch]] <- scale*rg[[ch]];
ma <- as.MAData(rg);
plot(NA, xlim=Alim, ylim=Mlim, main="All scans normalized");
for (kk in seq(ma))
points(ma, slide=kk, col=kk+1, pch=".");
drawRGGrid(maxSignal=Alim[2])
plotPairDensities(rg, ylim=c(0,0.4), main="All scans normalized");
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.