Fast_Virome_Explorer - Estimate Virome Composition¶
Assess the viral composition of a sample based on the read counts of particular taxonomic units.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/fast_virome_explorer.snake
- Rule name: fast_virome_explorer__estimate_virome_composition
Input(s):
- reads_f: fastq file with sequences from forward strand
- reads_r: fastq file with sequences from reverse strand
- index: kallisto index created from reference database
- ref_lens: lengths of the individual reference genomes from the database
Output(s):
- composition: TSV table containing information about number of reads assigned to taxonomic units (most common species)
- abundance: TSV table containing the NCBI ID of every taxonomic unit found, with assigned read counts and transcripts per million
Custom - Fill Na Values With Virusnames¶
Python script that replaces blank (NA) values in the input TSV file with the virus name from the same row and writes the result to a new TSV file.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/fast_virome_explorer.snake
- Rule name: custom__fill_na_values_with_virusnames
Input(s):
- composition: TSV table containing information about number of reads assigned to taxonomic units (most common species), generated as output of previous rule
Output(s):
- checked_composition: new TSV table in which NA values are replaced with virus names from the first column
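The replacement described for this rule can be sketched as follows. This is a minimal illustration and not the script shipped with SnakeLines: the function name fill_na_with_virus_names, the assumption that the virus name sits in the first column, and the set of values treated as missing (empty cell or NA) are all hypothetical.

```python
# Assumption: these cell values count as "missing"; the real script may differ.
MISSING = {"", "NA"}


def fill_na_with_virus_names(lines):
    """Replace missing cells in tab-separated lines with the row's first cell.

    lines: iterable of TSV lines (strings), one row per line.
    Returns a list of rewritten lines without trailing newlines.
    """
    filled = []
    for line in lines:
        cells = line.rstrip("\n").split("\t")
        name = cells[0]  # assumed to hold the virus name
        filled.append(
            "\t".join(name if cell.strip() in MISSING else cell for cell in cells)
        )
    return filled
```

Rows without missing cells pass through unchanged, so the function can be applied to the whole table, header included.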
Custom - Convert To Tpm Metric¶
Python script that creates a new TSV table with the count metric converted to TPM (transcripts per million). It has to be enabled in the config by setting count_type: tpm.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/fast_virome_explorer.snake
- Rule name: custom__convert_to_tpm_metric
Input(s):
- checked_composition: TSV table checked by the previous rule, containing information about number of reads assigned to taxonomic units (most common species)
- abundance: TSV table containing the NCBI ID of every taxonomic unit found, with assigned read counts and transcripts per million, output of rule fast_virome_explorer__estimate_virome_composition
Output(s):
- checked_tpm_composition: new TSV table in which the count metric is changed from read count to TPM
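For orientation, the conventional definition of TPM can be sketched as below: each read count is divided by the length of its reference, and the resulting rates are rescaled so that they sum to one million. This is only an illustration of the metric. The SnakeLines script may simply take the precomputed values from the abundance table, and the function name counts_to_tpm and its dictionary inputs are hypothetical.

```python
def counts_to_tpm(read_counts, ref_lengths):
    """Convert read counts to transcripts per million (TPM).

    read_counts: dict mapping reference ID -> number of assigned reads
    ref_lengths: dict mapping reference ID -> reference length in bases
    """
    # Length-normalised rate for every reference
    rates = {ref: read_counts[ref] / ref_lengths[ref] for ref in read_counts}
    total = sum(rates.values())
    if total == 0:
        # No reads assigned at all: report zero everywhere
        return {ref: 0.0 for ref in rates}
    # Rescale so that all values sum to one million
    return {ref: rate / total * 1_000_000 for ref, rate in rates.items()}
```

Two references with the same reads-per-base rate receive the same TPM regardless of their absolute read counts, which is the reason for normalising by length before rescaling.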
Custom - Convert To Krona¶
Create a new Krona file from the input file.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/fast_virome_explorer.snake
- Rule name: custom__convert_to_krona
Input(s):
- composition: TSV table containing information about number of reads assigned to taxonomic units (most common species), output of one of the two previous rules (depending on the selected count metric)
Output(s):
- krona: new Krona file
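A conversion of this kind can be sketched as follows, assuming the target is the tab-delimited text layout that KronaTools' ktImportText accepts: one line per taxon, with the quantity first and the lineage after it, ordered from the broadest level to the most specific. The function name to_krona_text and the (count, lineage) input records are hypothetical; the actual column layout of the composition table is not specified here.

```python
def to_krona_text(records):
    """Build Krona text-import lines from (count, lineage) records.

    records: iterable of (count, lineage) pairs, where lineage is a list of
             taxon names ordered from the broadest to the most specific level.
    Returns a list of tab-separated lines: count, then one column per level.
    """
    lines = []
    for count, lineage in records:
        if count <= 0:
            continue  # taxa without any signal add nothing to the chart
        lines.append("\t".join([str(count)] + list(lineage)))
    return lines
```

The same function works for read counts and for TPM values, since Krona only needs a numeric quantity in the first column.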
Metaxa2 - Classify Reads¶
Find closest homologue sequence for each sequenced fragment
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/metaxa2.snake
- Rule name: metaxa2__classify_reads
Input(s):
- r1: Left side of sequenced fragments in gzipped fastq format
- r2: Right side of sequenced fragments in gzipped fastq format
- blast: Blast index of reference sequences (generated by Metaxa2 database builder)
- cutoffs: Auxiliary files from reference sequences (generated by Metaxa2 database builder)
- hmm: Auxiliary file from reference sequences (generated by Metaxa2 database builder)
Output(s):
- taxonomy: Summary taxonomies of classified sequenced fragments
Metaxa2 - Create Reference Index¶
Transform genomic sequences into Metaxa2 index for faster classification
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/metaxa2.snake
- Rule name: metaxa2__create_reference_index
Input(s):
- fasta: Genomic reference sequences in Fasta format
- tax: Taxonomies for each reference sequence
Output(s):
- blast: Blast index of reference sequences
- cutoffs: Auxiliary files from reference sequences
- hmm: Auxiliary file from reference sequences
Metaxa2 - Summarize Classification¶
Summarize taxonomies for each individual taxonomic level, e.g. for species, order …
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/metaxa2.snake
- Rule name: metaxa2__summarize_classification
Input(s):
- taxonomy: Classified fragments - output of metaxa2 tool
- nomatch_template: Auxiliary file for the edge case where no fragment is classified
- nomatch_tax_template: Auxiliary file for the edge case where no fragment is classified
Output(s):
- summary: Summarized taxonomy per species level (others should be generated accordingly)
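The idea of summarising at one taxonomic level can be sketched as below. The sketch does not reproduce Metaxa2's own logic or file formats: it assumes one semicolon-separated lineage string per classified fragment, and the function name summarize_at_level and the "Unclassified" label are hypothetical.

```python
from collections import Counter


def summarize_at_level(lineages, level):
    """Count fragments per taxon at a given depth (0 = broadest rank).

    lineages: iterable of semicolon-separated lineage strings, one per fragment
    level:    zero-based depth at which the counts are aggregated
    """
    counts = Counter()
    for lineage in lineages:
        taxa = [t.strip() for t in lineage.split(";") if t.strip()]
        if len(taxa) > level:
            # Keep the full path down to the level to avoid name collisions
            counts[";".join(taxa[: level + 1])] += 1
        else:
            # Fragment was not resolved down to the requested level
            counts["Unclassified"] += 1
    return dict(counts)
```

Calling the function once per depth yields one summary per taxonomic level, with fragments that were not resolved to that depth collected under a single label.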
Metaxa2 - Prepare For Krona¶
Convert metaxa2 classification files into standardised format suitable for generation of Krona reports
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/metaxa2.snake
- Rule name: metaxa2__prepare_for_krona
Input(s):
- classification: Summarized classification from Metaxa2 classifier
Output(s):
- krona: Tabular format suitable for Krona report generation
Rdp - Classify Reads¶
Find closest homologue sequence for each sequenced fragment
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/rdp.snake
- Rule name: rdp__classify_reads
Input(s):
- reads: Joined sequenced fragments in fasta format
Output(s):
- readtax: Individual taxonomy for each analysed fragment
- taxonomy: Summary taxonomies of classified sequenced fragments
Rdp - Prepare For Krona¶
Convert RDP classification files into standardised format suitable for generation of Krona reports
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/classification/read_based/rdp.snake
- Rule name: rdp__prepare_for_krona
Input(s):
- classification: Summarized classification from RDP classifier
Output(s):
- krona: Tabular format suitable for Krona report generation