Analyse methylation profiles

Identify genomic regions with and without methylation. The pipeline expects bisulfite-converted reads.
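
To illustrate why converted reads are required, here is a minimal sketch of the underlying idea. It is not part of the pipeline: the function names `bisulfite_convert` and `call_methylation` are hypothetical, and the model is simplified to the forward strand with a gap-free alignment. Bisulfite treatment turns unmethylated cytosines into bases that are sequenced as T, while methylated cytosines remain C, so comparing a read with the reference reveals the methylation state of each cytosine.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment of the forward strand.

    Unmethylated C is read as T after conversion; methylated C stays C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Classify each reference C by the base seen in a gap-free aligned read."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue  # only cytosines carry methylation information here
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
        else:
            calls[i] = "ambiguous"  # sequencing error or true variant
    return calls


reference = "ACGTCCGA"
# Hypothetical sample in which only the cytosine at index 1 is methylated.
read = bisulfite_convert(reference, methylated_positions=[1])
print(read)
print(call_methylation(reference, read))
```

Real bisulfite aligners also handle the reverse strand, mismatches and indels; the sketch only shows the C/T comparison.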

Purpose

  • Methylation is an epigenetic marker for several diseases (e.g. cancer)
  • Compare methylation profiles between samples with different phenotypes (e.g. tissues)

Required inputs

  • Sequenced reads in gzipped FASTQ format
    • each sample is represented by two gzipped FASTQ files (R1 and R2)
    • these are the standard output files of paired-end sequencing
  • Reference genome in FASTA format

|-- reads/original
        |-- <sample_1>_R1.fastq.gz
        |-- <sample_1>_R2.fastq.gz
        |-- <sample_2>_R1.fastq.gz
        |-- <sample_2>_R2.fastq.gz
|-- reference/<reference>
        |-- <reference>.fa
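
Before starting a run it can help to verify that every sample really has both read files. The following is a minimal sketch, not part of the pipeline: the helper `find_read_pairs` is hypothetical and assumes the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming shown in the layout above.

```python
import re
from pathlib import Path

# Assumed naming convention: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])\.fastq\.gz$")


def find_read_pairs(reads_dir):
    """Return (complete pairs, samples missing a mate) for a reads directory."""
    mates = {}
    for path in sorted(Path(reads_dir).iterdir()):
        match = PATTERN.match(path.name)
        if match:
            mates.setdefault(match["sample"], {})[match["mate"]] = path
    complete = {s: (m["1"], m["2"]) for s, m in mates.items() if len(m) == 2}
    incomplete = sorted(s for s, m in mates.items() if len(m) != 2)
    return complete, incomplete


reads_dir = Path("reads/original")
if reads_dir.is_dir():
    pairs, missing = find_read_pairs(reads_dir)
    print("complete samples:", sorted(pairs))
    print("samples missing R1 or R2:", missing)
```

Files that do not match the pattern are ignored, so unrelated files in the directory do not cause false alarms.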

Generated outputs

  • Summary report of methylation profiles in sequenced samples

Example

How to run the example:

cd /usr/local/snakelines/example/mhv

snakemake \
   --snakefile ../../snakelines.snake \
   --configfile config_methylseq.yaml

Example configuration:

Planned improvements

  • Aggregate quality statistics from preprocessing and mapping with MultiQC
  • Include coverage tracks (Bismark can produce them as well)