Megahit - Assemble Reads Into Contigs¶
Assemble preprocessed reads into the larger genomic sequences, contigs. Also generate contig overlap graphs and initial scaffolds. Use ‘megahit’ program.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/assembly/assembler/megahit.snake
- Rule name: megahit__assemble_reads_into_contigs
Input(s):
- r1: gzipped fastq file with left reads
- r2: gzipped fastq file with right reads
Output(s):
- contigs: contigs in .fa file
- intermediate_contigs: Assembled contigs with k-mer size 99
Param(s):
- outdir: output directory (do not change)
- contigs: contig file generated by megahit (do not change)
Megahit - Generate Contig Graph¶
Create graph with contigs to visually assess complexity of the assembly
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/assembly/assembler/megahit.snake
- Rule name: megahit__generate_contig_graph
Input(s):
- intermediate_contigs: Assembled contigs with k-mer size 99
Output(s):
- contigs: contigs in .fa file
Param(s):
- outdir: output directory (do not change)
- contigs: contig file generated by megahit (do not change)
Spades - Assemble Reads Into Contigs¶
Assemble preprocessed reads into the larger genomic sequences, contigs. Also generate contig overlap graphs and initial scaffolds. Use ‘spades’ program.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/assembly/assembler/spades.snake
- Rule name: spades__assemble_reads_into_contigs
Input(s):
- r1: gzipped fastq file with left reads
- r2: gzipped fastq file with right reads
Output(s):
- fastg: assembled graph in .fastg format
- gfa: assembly graph in .gfa format
- contigs: contigs in .fa file
- scaffolds: scaffolds in .fa file
Param(s):
- outdir: output directory (do not change)
- contigs: contig file generated by spades (do not change)
- scaffolds: scaffolds file generated by spades (do not change)
- mode: mode of operation of spades (do not change), extracted from config file
- careful: whether to use –careful parameter for spades (do not change), extracted from config file
Unicycler - Assemble Reads Into Contigs¶
Assemble preprocessed reads into the larger genomic sequences, contigs. Also generate contig overlap graphs and initial scaffolds. Use ‘unicycler’ program.
Location
- Filepath: <SnakeLines_dir>/rules/paired_end/assembly/assembler/unicycler.snake
- Rule name: unicycler__assemble_reads_into_contigs
Input(s):
- r1: gzipped fastq file with left reads
- r2: gzipped fastq file with right reads
Output(s):
- gfa: assembly graph in .gfa format as needed by downstream analysis
- contigs: assembled contigs in .fa file
Param(s):
- outdir: output directory (do not change)
- contigs: contig file generated by spades (do not change)
- gfa: assembly graph in .gfa format as generated by unicycler