design.table <- fread(type.list$Design, header = TRUE) noi.count <- fread(type.list$RSEM)[Type!="protein_coding",] rna.id <- noi.count[,1] rna.type <- noi.count[,2] noi.count <- as.data.frame(noi.count[, round(.SD), .SDcols = -(1:2), with = TRUE]) rownames(noi.count) <- rna.id[[1]] rna.type <- rna.type[[1]] names(rna.type) <- rna.id[[1]] noi.data <- NOISeq::readData(data = noi.count, factors = as.data.frame(design.table[,2]), biotype = rna.type) # This 'pok gai' function use print as message, # cannot be suppressed by invisible() or suppressMessages(), # so the tricky sink blocks were used { sink("/dev/null"); seqbio.result <- NOISeq::noiseqbio(noi.data, norm = 'tmm', factor = 'condition'); sink(); }
{ sink("/dev/null"); NOISeq::DE.plot(seqbio.result, graphic = "expr", q = 0.9); sink(); }
{ sink("/dev/null"); NOISeq::DE.plot(seqbio.result, graphic = "MD", q = 0.8); sink(); }
Title: Differential expression analysis of novel and known lncRNA from LncPipe(NOIseq)
NOISeq takes both the log fold change and absolute difference of reads count into consideration when comparing two conditions. All pairs of replicates are calculated to determine empirical distribution functions. Differentially expressed genes is identified by a small P-value in both the log fold change and the absolute difference in reads count.
What we have done: Design.matrix
file and reads count matrix were fed into edgeR, an R packages for identification differential expressed gene from RNA-seq or Chip-seq experiment. Before exact test statitistics, raw reads count matrix were filtered according to the user-defined parameter min.expressed.sample
, which can be explained as follows: raw reads count was first normalized in to cpm (Counts Per Million reads) matrix , gene with cpm >1 in more than min.expressed.sample
numbers were retained into futher analysis. in current version, we only implemented the standard comparison that only two condition were supportted in each analysis. Experimental without repelicates were not supported . Source code of differential expression analysis can be freely checked by https://github.com/bioinformatist/LncPipeReporter/edit/master/inst/rmd/DE.Rmd. Complementary experimental design can be performed seperately besed kallisto.count.txt
generated by lncPipe, this file can be imported into another software IDEA, which focus on comprehensive differential expression analysis from expression matrix.
What dose plots Mean: In this section, Expression plot
, MD plot
were show at right panel.
Those plots showing different aspects of differential expression results. Expression plot is to compare the expression values in each condition for all features. Differentially expressed features can be highlighted. Manhattan plot is to compare the expression values in each condition across all the chromosome positions. Differentially expressed features can also be highlighted. MD plot shows the values for (M,D) statistics. Differentially expressed features can also be highlighted. Distribution plot displays the percentage of differentially expressed features per chromosome and biotype (if this information is provided by the user). For more information, please go the plot function instruction page at site http://www.imsbio.co.jp/RGM/R_rdfile?f=NOISeq/man/DE.plot.Rd&d=R_BC
Reference
Tarazona S, Furio-Tari P, Turra D, Pietro AD, Nueda MJ, Ferrer A and Conesa A (2015). “Data quality aware analysis of differential expression in RNA-seq with NOISeq R/Bioc package.” Nucleic Acids Research, 43(21), pp. e140.
{ sink("/dev/null"); degs <- NOISeq::degenes(seqbio.result); sink(); } fwrite(degs, 'DE.csv', row.names = TRUE) DT::datatable(head(degs, n = 80L)) %>% DT::formatRound(c('normal_mean', 'tumor_mean', 'theta', "log2FC"), digits = 2)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.