This tool calculates general statistics of the reads in the given FASTQ or FASTA file.
It uses the PRINSEQ package: the statistics are calculated with the PRINSEQ option -stats_all.
The input data can be in FASTQ or FASTA format.
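For readers who want to reproduce the calculation outside this tool, the following is a minimal Python sketch of how a comparable PRINSEQ command line could be assembled. It assumes the standalone `prinseq-lite.pl` script with its `-fastq` and `-fasta` input options. The helper name `build_stats_command`, the suffix list, and the file name `reads.fastq` are hypothetical, and the command this tool runs internally may differ.

```python
import shlex
from pathlib import Path

# Assumption: these suffixes mark FASTA input; anything else is treated as FASTQ.
FASTA_SUFFIXES = {".fa", ".fasta", ".fna"}


def build_stats_command(input_path, script="prinseq-lite.pl"):
    """Return an argument list that asks PRINSEQ for all summary statistics.

    The script name is an assumption based on the standalone PRINSEQ
    distribution; adjust it to match the local installation.
    """
    suffix = Path(input_path).suffix.lower()
    input_flag = "-fasta" if suffix in FASTA_SUFFIXES else "-fastq"
    return ["perl", script, input_flag, str(input_path), "-stats_all"]


# Print the command instead of running it, so the sketch works without PRINSEQ.
print(shlex.join(build_stats_command("reads.fastq")))
```

The list form can be passed to `subprocess.run` unchanged. Guessing the format from the file suffix is only a convenience here and should be replaced with an explicit choice when file names are unreliable.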
The command output is a table that contains the following values:
| Category | Name | Description |
|---|---|---|
| stats_dinuc | aatt | Dinucleotide odds ratio for AA/TT. |
| stats_dinuc | acgt | Dinucleotide odds ratio for AC/GT. |
| stats_dinuc | agct | Dinucleotide odds ratio for AG/CT. |
| stats_dinuc | at | Dinucleotide odds ratio for AT. |
| stats_dinuc | catg | Dinucleotide odds ratio for CA/TG. |
| stats_dinuc | ccgg | Dinucleotide odds ratio for CC/GG. |
| stats_dinuc | cg | Dinucleotide odds ratio for CG. |
| stats_dinuc | gatc | Dinucleotide odds ratio for GA/TC. |
| stats_dinuc | gc | Dinucleotide odds ratio for GC. |
| stats_dinuc | ta | Dinucleotide odds ratio for TA. |
| stats_dupl | 3 | The number of 3' duplicates. |
| stats_dupl | 3maxd | |
| stats_dupl | 5 | The number of 5' duplicates. |
| stats_dupl | 5maxd | |
| stats_dupl | exact | The number of exact duplicates. |
| stats_dupl | exactmaxd | |
| stats_dupl | exactrevcomp | Number of exact duplicates with reverse complements. |
| stats_dupl | exactrevcompmaxd | |
| stats_dupl | revcomp | Number of 5'/3' duplicates with reverse complements. |
| stats_dupl | revcompmaxd | |
| stats_dupl | total | Total number of duplicates. |
| stats_info | bases | Total number of bases in the input file. |
| stats_info | reads | Number of reads in the input file. |
| stats_len | max | Length of the longest read. |
| stats_len | mean | Mean length of the reads. |
| stats_len | median | Median of the read lengths. |
| stats_len | min | Length of the shortest read. |
| stats_len | mode | Mode of the read lengths. |
| stats_len | modeval | Number of mode length sequences. |
| stats_len | range | Range of the sequence lengths. |
| stats_len | stddev | Standard deviation of the read lengths. |
| stats_ns | maxn | Maximum number of Ns in one read. |
| stats_ns | maxp | The maximum percentage of Ns per read. |
| stats_ns | seqswithn | Number of reads with ambiguous base N. |
| stats_tag | midnum | The number of predefined MIDs. |
| stats_tag | prob3 | The probability of a tag sequence at the 3'-end (in percentage). |
| stats_tag | prob5 | The probability of a tag sequence at the 5'-end (in percentage). |
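
To use these values in a script, the output table can be loaded into a nested dictionary keyed by category and name. The Python sketch below assumes each output line holds three whitespace-separated fields (category, name, value), which should be checked against the actual output file. The function names and the sample values are hypothetical and are not results from a real run.

```python
from collections import defaultdict


def _to_number(value):
    """Convert a field to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def parse_prinseq_stats(text):
    """Parse 'category name value' lines into {category: {name: value}}.

    Assumption: exactly three whitespace-separated fields per line.
    Lines with a different number of fields are skipped.
    """
    stats = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        category, name, value = fields
        stats[category][name] = _to_number(value)
    return dict(stats)


# Hypothetical sample output, for illustration only.
sample = (
    "stats_info\tbases\t20200\n"
    "stats_info\treads\t200\n"
    "stats_len\tmax\t101\n"
    "stats_len\tmean\t101.00\n"
)

stats = parse_prinseq_stats(sample)
print(stats["stats_info"]["reads"])
print(stats["stats_len"]["mean"])
```

Rows with no value, or with extra fields, are ignored by this sketch. A stricter parser could raise an error instead, which makes unexpected output easier to spot.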