miRNA-seq / Differential expression analysis using edgeR

Description

This tool performs an analysis of differentially expressed sequences using the edgeR package for R.

Parameters


Details


Given an input table of count data for at least two samples, the edgeR package performs scaling, normalization and statistical analysis to identify differentially expressed genomic features between two experimental conditions. Note that the statistical analysis assumes at least two independent biological replicate samples for each experimental condition.

In its current implementation, the tool supports only single-factor experimental designs. The experimental conditions to be compared should be defined in the phenodata.tsv file, and the appropriate column should be selected using the 'Column describing groups' parameter.
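The replicate requirement noted above can be checked before running the tool. The following is a minimal sketch in Python, not part of the tool itself; the column names 'sample' and 'group' and the helper function names are assumptions made for illustration, so substitute whichever column you select with 'Column describing groups'.

```python
import csv
import io
from collections import Counter


def replicates_per_group(phenodata_text, group_column="group"):
    """Count how many samples belong to each group in a tab-separated table."""
    reader = csv.DictReader(io.StringIO(phenodata_text), delimiter="\t")
    return Counter(row[group_column] for row in reader)


def groups_lacking_replicates(group_counts, minimum=2):
    """Return the groups that have fewer than `minimum` samples."""
    return sorted(g for g, n in group_counts.items() if n < minimum)


# Hypothetical phenodata content; a real phenodata.tsv may hold more columns.
example = (
    "sample\tgroup\n"
    "s1.tsv\t1\n"
    "s2.tsv\t1\n"
    "s3.tsv\t2\n"
)

group_counts = replicates_per_group(example)
too_few = groups_lacking_replicates(group_counts)
print(dict(group_counts), too_few)
```

In this example the second group has a single sample, so it would be reported as lacking replicates.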

Scaling, which accounts for variation in library size, is done either by dividing by the average of the total counts for each sample or, if the user has filled in the 'library_size' column of the phenodata.tsv file, by dividing by the average of those values.
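To make the idea of library-size scaling concrete, here is a minimal Python sketch. It shows one common reading of the approach, in which each sample's counts are rescaled to the average library size; it is an illustration under that assumption, not the exact computation performed by the tool or by edgeR. The function name and the example data are hypothetical.

```python
def scale_to_mean_library_size(counts, library_sizes=None):
    """Rescale each sample's counts to the mean library size.

    counts: dict mapping sample name -> list of per-feature counts.
    library_sizes: optional dict of user-supplied sizes (analogous to a
    filled-in 'library_size' column); defaults to the column totals.
    """
    if library_sizes is None:
        # Total counts per sample serve as the library size
        library_sizes = {s: sum(v) for s, v in counts.items()}
    mean_size = sum(library_sizes.values()) / len(library_sizes)
    return {
        s: [c * mean_size / library_sizes[s] for c in v]
        for s, v in counts.items()
    }


# Hypothetical counts for three features in two samples
counts = {"sample1": [10, 30, 60], "sample2": [40, 60, 100]}
scaled = scale_to_mean_library_size(counts)
print(scaled)
```

After scaling, both samples sum to the same total, so differences between them are no longer driven by sequencing depth alone.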

To reduce the impact of so-called RNA composition bias, which can arise, for example, when a small number of genes are very highly expressed in one experimental condition but not in the other, an offset value can be estimated and built into the generalized linear model. The user can turn this normalization off using the 'Apply normalization' parameter.

Two methods are available for estimating the dispersion. The 'common' method estimates a single dispersion value shared by all features and assumes a small number of samples but many reads, whereas the 'tagwise' method might be better suited to small library sizes.

Statistical testing is performed using a generalized linear model, and the p-values are adjusted for multiple testing using classical approaches.
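The text does not name a specific adjustment method. As an illustration of what such an adjustment does, the following Python sketch implements the Benjamini-Hochberg procedure, one widely used classical approach; choosing it here is an assumption made for the example, and the tool may offer other methods.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping the adjusted values monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted)
```

Each adjusted value is at least as large as the raw p-value, which reflects the penalty for testing many features at once.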

Output

The analysis output consists of the following:


References

This tool uses the edgeR package for statistical analysis. Please read the following article for more detailed information:

MD Robinson, DJ McCarthy, and GK Smyth. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1):139–40, Jan 2010.
