DE analysis (by NOISeq)

design.table <- fread(type.list$Design, header = TRUE)
noi.count <- fread(type.list$RSEM)[Type!="protein_coding",]
rna.id <- noi.count[,1]
rna.type <- noi.count[,2]
noi.count <- as.data.frame(noi.count[, round(.SD), .SDcols = -(1:2), with = TRUE])
rownames(noi.count) <- rna.id[[1]]
rna.type <- rna.type[[1]]
names(rna.type) <- rna.id[[1]]
noi.data <- NOISeq::readData(data = noi.count, factors = as.data.frame(design.table[,2]), biotype = rna.type)
# This 'pok gai' function use print as message,
# cannot be suppressed by invisible() or suppressMessages(),
# so the tricky sink blocks were used
{ sink("/dev/null"); seqbio.result <- NOISeq::noiseqbio(noi.data, norm = 'tmm', factor = 'condition'); sink(); }

Column {.tabset}

Expression plot

{ sink("/dev/null"); NOISeq::DE.plot(seqbio.result, graphic = "expr", q = 0.9); sink(); }

MD plot

{ sink("/dev/null"); NOISeq::DE.plot(seqbio.result, graphic = "MD", q = 0.8); sink(); }

Column

Description

Title: Differential expression analysis of novel and known lncRNA from LncPipe(NOIseq)

NOISeq takes both the log fold change and absolute difference of reads count into consideration when comparing two conditions. All pairs of replicates are calculated to determine empirical distribution functions. Differentially expressed genes is identified by a small P-value in both the log fold change and the absolute difference in reads count.

What we have done: Design.matrix file and reads count matrix were fed into edgeR, an R packages for identification differential expressed gene from RNA-seq or Chip-seq experiment. Before exact test statitistics, raw reads count matrix were filtered according to the user-defined parameter min.expressed.sample, which can be explained as follows: raw reads count was first normalized in to cpm (Counts Per Million reads) matrix , gene with cpm >1 in more than min.expressed.sample numbers were retained into futher analysis. in current version, we only implemented the standard comparison that only two condition were supportted in each analysis. Experimental without repelicates were not supported . Source code of differential expression analysis can be freely checked by https://github.com/bioinformatist/LncPipeReporter/edit/master/inst/rmd/DE.Rmd. Complementary experimental design can be performed seperately besed kallisto.count.txt generated by lncPipe, this file can be imported into another software IDEA, which focus on comprehensive differential expression analysis from expression matrix.

What dose plots Mean: In this section, Expression plot, MD plot were show at right panel.

Those plots showing different aspects of differential expression results. Expression plot is to compare the expression values in each condition for all features. Differentially expressed features can be highlighted. Manhattan plot is to compare the expression values in each condition across all the chromosome positions. Differentially expressed features can also be highlighted. MD plot shows the values for (M,D) statistics. Differentially expressed features can also be highlighted. Distribution plot displays the percentage of differentially expressed features per chromosome and biotype (if this information is provided by the user). For more information, please go the plot function instruction page at site http://www.imsbio.co.jp/RGM/R_rdfile?f=NOISeq/man/DE.plot.Rd&d=R_BC

Reference

Tarazona S, Furio-Tari P, Turra D, Pietro AD, Nueda MJ, Ferrer A and Conesa A (2015). “Data quality aware analysis of differential expression in RNA-seq with NOISeq R/Bioc package.” Nucleic Acids Research, 43(21), pp. e140.

Differential expressed lncRNAs table

{ sink("/dev/null"); degs <- NOISeq::degenes(seqbio.result); sink(); }
fwrite(degs, 'DE.csv', row.names = TRUE)
DT::datatable(head(degs, n = 80L)) %>% DT::formatRound(c('normal_mean', 'tumor_mean', 'theta', "log2FC"), digits = 2)


bioinformatist/LncPipe-Reporter documentation built on May 6, 2019, 2:30 p.m.