Output

Output Structure

epigeneticbutton/
├── config/                     # Location for the main config file and recommended location for sample files and target files
├── data/                       # Location for test material and examples (e.g. zm_structural_RNAs.fa.gz)
├── Help/                       # Location for help files (e.g. Help_structural_RNAs_database_with_Rfam)
├── profiles/
│       ├── sge/                # Config file to run snakemake on a cluster managed by SGE
│       └── slurm/              # Config file to run snakemake on a cluster managed by SLURM
├── workflow/
│       ├── envs/               # Conda environment file for depencies
│       ├── rules/              # Snakemake files with data type analysis rules
│       ├── scripts/            # R scripts for plots
│       └── snakefile           # main snakefile
├── genomes/                    # Genome directories created upon run
│       └── {ref_genome}/       # Reference genome directories with sequence, annotation and indexes
└── results/                    # Results directories created upon run
        ├── combined/           # Combined analysis results
        │       ├── bedfiles/   # Peak calling results
        │       ├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
        │       ├── logs/       # Log files
        │       ├── matrix/     # Data matrices
        │       ├── plots/      # Visualization plots
        │       └── reports/    # Analysis reports
        └── <env>/      # Data type specific directories
                ├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
                ├── fastq/      # Processed FASTQ files
                ├── logs/       # Log files
                ├── mapped/     # Mapped reads (bam)
                ├── plots/      # Data type specific plots
                ├── reports/    # QC reports
                ├── tracks/     # Track files (bigwigs)
                └── */          # data-specific directories (e.g. 'peaks' for ChIP, 'peaks' and 'motifs' for TF, 'DEG' for RNA, 'DMRs' and 'methylcall' for mC, 'clusters' for sRNA)

Histone ChIP-seq

Output tree

ChIP/
├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
├── fastq/      # Processed FASTQ files
├── logs/       # Log files
├── mapped/     # Mapped reads (bam)
├── peaks/      # Peak files (MACS2 output) for each replicate, pseudo-replicate and merged biological replicates and selected peaks (shared by merged and both pseudo-replicates).
├── plots/      # Fingerprints (IP vs Input for each IP sample), IDR if at least two biological replicates
├── reports/    # QC reports (output from Cutadapt) and summary of mapping statistics and peak statistics (output from Bowtie2 and samtools)
└── tracks/     # Track files (bigwigs); log2FC of IP/Input for each rep and merged if at least 2 biological replicates

Mapping metrics

Data for each sample:

results/ChIP/reports/summary_ChIP_<paired>_mapping_stats_ChIP__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_mapping_stats_<analysis_name>_ChIP.txt

Plot:

results/combined/plots/mapping_stats_<analysis_name>_ChIP.pdf

Example:

mapping_stats_epicc_ChIP — Histogram of mapping metrics

(the actual output is in pdf format)

Peak metrics

Data for each sample:

results/ChIP/reports/summary_ChIP_peak_stats_ChIP__<line>__<tissue>__<sample_type>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_peak_stats_<analysis_name>_ChIP.txt

Plot:

results/combined/plots/peak_stats_<analysis_name>_ChIP.pdf

(see TF ChIP-seq for an example)

Fingerprints

Performed with Deeptools.

Plot for each biological replicate:

results/ChIP/plots/Fingerprint__final__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.png

(see TF ChIP-seq for an example)

IDR

Performed with IDR.

Plot for pairs of biological replicate:

results/ChIP/plots/idr_<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate1>_vs_<replicate2>__<ref_genome>.<narrow|broad>Peak.png

(see TF ChIP-seq for an example)

Upset Plot

Perfomed with ComplexUpset.

Table of combined peaks for all histone ChIP-seq samples in the analysis:

results/combined/bedfiles/combined_peaks__ChIP__<analysis_name>__<ref_genome>.bed

Table of combined peaks for all histone ChIP-seq samples in the analysis annotated based on the closest gene:
```
results/combined/bedfiles/annotated__combined_peaks__ChIP__<analysis_name>__<ref_genome>.bed
```

Upset plot:

results/combined/plots/Upset_combined_peaks__ChIP__<analysis_name>__<ref_genome>.pdf

(see Combined Output for an example)

TF ChIP-seq

Output tree

TF/
├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
├── fastq/      # Processed FASTQ files
├── logs/       # Log files
├── mapped/     # Mapped reads (bam)
├── motifs/     # Motifs analysis with the MEME suite, one folder per selected and idr peaks (and per replicates if so chosen in the config file)
├── peaks/      # Peak files (MACS2 output) for each replicate, pseudo-replicate and merged biological replicates and selected peaks (shared by merged and both pseudo-replicates).
├── plots/      # Fingerprints (IP vs Input for each IP sample), IDR if at least two biological replicates
├── reports/    # QC reports (output from Cutadapt) and summary of mapping statistics and peak statistics (output from Bowtie2 and samtools)
└── tracks/     # Track files (bigwigs); log2FC of IP/Input for each rep and merged if at least 2 biological replicates

Mapping metrics

Data for each sample:

results/TF/reports/summary_TF_<paired>_mapping_stats_<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_mapping_stats_<analysis_name>_TF.txt

Plot:

results/combined/plots/mapping_stats_<analysis_name>_TF.pdf

(see histone ChIP-seq for an example)

Peak metrics

Data for each sample:

results/TF/reports/summary_TF_peak_stats_<dat_type>__<line>__<tissue>__<sample_type>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_peak_stats_<analysis_name>_TF.txt

Plot:

results/combined/plots/peak_stats_<analysis_name>_TF.pdf

Example:

peak_stats_epicc_TF — Histogram of peak metrics

(the actual output is in pdf format)

Fingerprints

Performed with Deeptools.

Plot for each biological replicate:

results/ChIP/plots/Fingerprint__final__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.png

Example:

Fingerprint__final__TF_SUVH1__Col0__suvh1.1__IP__Rep1__ColCEN — Fingerprint plot comparing an IP to its Input

Irreproducible Discovery Rate

Performed with IDR.

Plot for pairs of biological replicates:

results/ChIP/plots/idr_<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate1>_vs_<replicate2>__<ref_genome>.<narrow|broad>Peak.png

Example:

idr_se__TF_SUVH1__Col0__suvh1.1__IP__Rep1_vs_Rep2__ColCEN_peaks.narrowPeak — Plot of Irreproducible Discovery Rate for two biological replicates

Motifs

Performed with the MEME suite.

Full output from selected peaks (and idr peaks if available) for each sample:

results/TF/motifs/selected_peaks__<data_type>__<line>__<tissue>__<sample_type>__<ref_genome>/meme/

which includes:

results/TF/motifs/selected_peaks__<data_type>__<line>__<tissue>__<sample_type>__<ref_genome>/meme/meme_out/meme.html

Example:

(the actual output is in html format, and others)

Upset Plot

Perfomed with ComplexUpset.

Table of combined peaks for all TF ChIP-seq samples in the analysis:

results/combined/bedfiles/combined_peaks__TF__<analysis_name>__<ref_genome>.bed

Table of combined peaks for all TF ChIP-seq samples in the analysis annotated based on the closest gene:
```
results/combined/bedfiles/annotated__combined_peaks__TF__<analysis_name>__<ref_genome>.bed
```

Upset plot:

results/combined/plots/Upset_combined_peaks__TF__<analysis_name>__<ref_genome>.pdf

(see Combined Output for an example)

RNA-seq

Output tree

RNA/
├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
├── DEG/ # Differential Expression Analysis results. Contains count tables, list of differential expression genes for all pairwise comparisons, gene expression tables and RData object for plotting gene expression (see `usage - plotting differential expression`)
├── fastq/      # Processed FASTQ files
├── GO/ # Gene Ontology Analysis results (optional). Contains GO terms enriched in sets of DEGs uniquely UP- or DOWN-regulated in each sample, and in additional GO analysis (see `usage - GO analysis`)
├── logs/       # Log files
├── mapped/     # Mapped reads (bam) (and STAR output files)
├── plots/      # Expression and GO analysis (optional)
├── reports/    # QC reports (output from Cutadapt) and summary of mapping statistics and peak statistics (output from STAR and samtools)
└── tracks/     # Track files (bigwigs); plus and minus strand (still in positive values) CPM for each replicate and merged all replicates per sample

Mapping metrics

Data for each sample:

results/RNA/reports/summary_RNA_<paired>_mapping_stats_<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_mapping_stats_<analysis_name>_RNA.txt

Plot:

results/combined/plots/mapping_stats_<analysis_name>_RNA.pdf

(see histone ChIP-seq for an example)

Differential Expression analysis

Counts from STAR; analysis performed with EdgeR.

Count data for each RNAseq sample:

results/RNA/DEG/counts__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.tab

Summary tables for all RNAseq samples used for the analysis:

results/RNA/DEG/counts__<analysis_name>__<ref_genome>.txt # Count data output by STAR
results/RNA/DEG/samples__<analysis_name>__<ref_genome>.txt # Table of samples information for edgeR analysis
results/RNA/DEG/genes_rpkm__<analysis_name>__<ref_genome>.txt # Table of gene expression values for all genes in all samples in Reads per Kilobase Million (RPKM)

Output tables of differentially expressed genes (DEG) for each pairwise comparison:

results/RNA/DEG/FC_<analysis_name>__<ref_genome>__<line_sample1>__<tissue_sample1>_vs_<line_sample2>__<tissue_sample2>.txt # all genes in logFC sample1/sample2 and their differential statistics
results/RNA/DEG/DEG_<analysis_name>__<ref_genome>__<line_sample1>__<tissue_sample1>_vs_<line_sample2>__<tissue_sample2>.txt # only DEGs

Output summary tables of DEGs for all pairwise comparisons:

results/RNA/DEG/summary_DEG_stats__<analysis_name>__<ref_genome>.txt # number of differential expressed genes in all pairwise comparisons and uniquely regulated in each sample
results/RNA/DEG/unique_DEGs__<analysis_name>__<ref_genome>.txt # list of genes uniquely regulated in each sample

Rdata object for plotting expression levels:

results/RNA/DEG/ReadyToPlot__<analysis_name>__<ref_genome>.RData

Global output from the differential analysis:

results/combined/plots/BCV_RNAseq_<analysis_name>_<ref_genome>.pdf # Biological Coefficient of Variation of all genes
results/combined/plots/MDS_RNAseq_<analysis_name>_<ref_genome>_d12.pdf # Multidimensional scaling of all the samples on the first two dimensions, with dots instead of labels
results/combined/plots/MDS_RNAseq_<analysis_name>_<ref_genome>_d12_labs.pdf # Multidimensional scaling of all the samples on the first two dimensions, with labels instead of dots
results/combined/plots/MDS_RNAseq_<analysis_name>_<ref_genome>_d23.pdf # Multidimensional scaling of all the samples on the first two dimensions, with dots instead of labels
results/combined/plots/MDS_RNAseq_<analysis_name>_<ref_genome>_d23_labs.pdf # Multidimensional scaling of all the samples on the first two dimensions, with labels instead of dots

Examples:

BCV_RNAseq_epicc_ColCEN — BCV plot for all genes

MDS2 — MDS plot for all RNA-seq samples

Heatmap of all DEGs across all samples:

results/combined/plots/Heatmap_RNAseq_cpm__<analysis_name>__<ref_genome>.pdf # all gene expression normalized by count per million
results/combined/plots/Heatmap_RNAseq_zscore__<analysis_name>__<ref_genome>.pdf # each gene normalized by Z-score

Example:

Heatmap_RNAseq_cpm__epicc__ColCEN — Heatmap of expression (CPM) in all samples for all the differentially expressed genes in this analysis

(the actual output is in pdf format)

Plots of expression level in all samples for the top 100 DEGs (if present):

results/combined/plots/plot_expression__<analysis_name>__<ref_genome>__unique_DEGs.pdf

Example:

RNAseq_expression — Histogram of gene expression (RPKM) in the different samples (dots = biological replicates, bar = mean) for one differentially expressed gene

Gene Ontology analysis

Performed with rrvgo and TopGO.

List of Gene Ontology (GO) terms and corresponding Gene IDs (GIDs) enriched in the DEGs uniquely UP- and DOWN-regulated in each sample:

results/RNA/GO/topGO_DOWN_in_<line>__<tissue>_BP_GOs.txt # Biological Process (BP) GO terms enriched in genes only DOWN-regulated in this sample
results/RNA/GO/topGO_DOWN_in_<line>__<tissue>_BP_GIDs.txt # genes in the Biological Process (BP) GO terms enriched in genes only DOWN-regulated in this sample
results/RNA/GO/topGO_DOWN_in_<line>__<tissue>_MF_GOs.txt # Molecular Function (MF) GO terms enriched in genes only DOWN-regulated in this sample
results/RNA/GO/topGO_DOWN_in_<line>__<tissue>_MF_GIDs.txt # genes in the Molecular Function (MF) GO terms enriched in genes only DOWN-regulated in this sample
results/RNA/GO/topGO_UP_in_<line>__<tissue>_BP_GOs.txt # Biological Process (BP) GO terms enriched in genes only UP-regulated in this sample
results/RNA/GO/topGO_UP_in_<line>__<tissue>_BP_GIDs.txt # genes in the Biological Process (BP) GO terms enriched in genes only UP-regulated in this sample
results/RNA/GO/topGO_UP_in_<line>__<tissue>_MF_GOs.txt # Molecular Function (MF) GO terms enriched in genes only UP-regulated in this sample
results/RNA/GO/topGO_UP_in_<line>__<tissue>_MF_GIDs.txt # genes in the Molecular Function (MF) GO terms enriched in genes only DOWN-regulated in this sample

Corresponding plots:

results/RNA/plots/topGO_DOWN_in_<line>__<tissue>_BP_treemap.pdf # Treemap of simplified BP terms in DOWN-regulated genes in this sample
results/RNA/plots/topGO_DOWN_in_<line>__<tissue>_MF_treemap.pdf # Treemap of simplified MF terms in DOWN-regulated genes in this sample
results/RNA/plots/topGO_UP_in_<line>__<tissue>_BP_treemap.pdf # Treemap of simplified BP terms in UP-regulated genes in this sample
results/RNA/plots/topGO_UP_in_<line>__<tissue>_MF_treemap.pdf # Treemap of simplified MF terms in UP-regulated genes in this sample

If not enough terms are enriched, these plots might not be created.

Example:

topGO_DOWN_in_Col0__suvh13_BP_treemap — Treemap of the enriched Gene Ontology term of the Biological Process class in unique Down-regulated genes of this sample

(the actual output is in pdf format)

small RNA-seq

Output tree

sRNA/
├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
├── clusters/ # Clusters and differential analysis when all samples are analyzed together, on de novo identified clusters, on all genes and on all TEs (optional)
├── fastq/      # Processed FASTQ files
├── logs/       # Log files
├── mapped/     # Subfolders of ShortStack output for each replicate
├── reports/    # QC reports (output from Cutadapt) and summary of size statistics
└── tracks/     # Track files (bigwigs); plus and minus strand (still in positive values) CPM for each replicate and merged all replicates per sample for each size chosen (default, 21, 22, 23 and 24nt)

Mapping statistics

Data for each sample:

results/sRNA/reports/sizes_stats__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_sizes_stats_<analysis_name>_sRNA.txt

Plot:

results/combined/plots/srna_sizes_stats_<analysis_name>_sRNA.pdf # all sizes found in the raw data
results/combined/plots/srna_sizes_stats_zoom_<analysis_name>_sRNA.pdf # zoom to chosen sizes (default 21 to 24nt)

Example:

srna_sizes_stats_epicc_sRNA — Histogram of size distribution in small RNA samples

Cluster and Differential Expression analysis

Counts from ShortStack; analysis performed with EdgeR.

ShortStack analysis on each replicate:

results/mapped/<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>/ # output folder from ShortStack with all cluster results and alignement files
results/mapped/<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>/clusters.bed # simplified bed-file of clusters for downstream analyses

For all samples in the analysis, two runs will be performed by default (three with the optional TE analysis), which will create the same output

The full ShortStack output will be located in these folders:

results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/ # Identifying small RNA clusters, normalizing accross all samples
results/clusters/<analysis_name>__<ref_genome>__on_all_genes/ # Limiting mapping to all genes, normalizing accross all samples
results/clusters/<analysis_name>__<ref_genome>__on_all_TEs/ # Limiting mapping to all TEs, normalizing accross all samples (optional)

Each folder (e.g. on new clusters) will also contain the files required for edgeR analysis and output from the differential analysis, following the same pattern than for DEG analysis of RNAseq:

results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/counts_for_edgeR.txt # Count data for edgeR analysis
results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/samples_for_edgeR.txt # Table of samples information for edgeR analysis
results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/FC_<line_sample1>__<tissue_sample1>_vs_<line_sample2>__<tissue_sample2>.txt # log Fold Change between each pairs of samples at all clusters
results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/DEG_<line_sample1>__<tissue_sample1>_vs_<line_sample2>__<tissue_sample2>.txt # only differentially regulated clusters between each pair of samples
results/clusters/<analysis_name>__<ref_genome>__on_new_clusters/unique_DEGs.txt # list of clusters uniquely regulated in each sample

For each differential analysis (e.g. on new clusters), similar output than for RNAseq DEGs will be generated, following the naming pattern sRNA_<analysis_name>_<ref_genome>__on_new_clusters pattern. It includes:

results/sRNA/reports/summary_DEG_stats__<analysis_name>__<refgenome>__on_new_clusters.txt # number of differential regulated clusters in all pairwise comparisons and uniquely regulated in each sample
results/combined/plots/BCV_RNAseq_<analysis_name>_<ref_genome>.pdf # Biological Coefficient of Variation of all genes
results/combined/plots/MDS_RNAseq_<analysis_name>_<ref_genome>_<d12|d12_labs|d23|d23_labs>.pdf # Multidimensional scaling, all four versions
results/combined/plots/Heatmap_sRNA_<cpm|zscore>__<analysis_name>__<ref_genome>__on_new_clusters.pdf # expression values accross all differentially regulated clusters by count per million and zscore

(See RNAseq for examples)

Upset Plot

Perfomed with ComplexUpset.

Table of combined clusters identified in at least one of the small RNA replicates in the analysis, split by chosen sizes:
```
results/combined/bedfiles/combined_clusters__sRNA__<analysis_name>__<ref_genome>.bed
```

Table of combined clusters annotated based on the closest gene:

results/combined/bedfiles/annotated__combined_clusters__sRNA__<analysis_name>__<ref_genome>.bed

Upset plot:

results/combined/plots/Upset_combined_clusters__sRNA__<analysis_name>__<ref_genome>.pdf

Example:

Upset_combined_clusters__sRNA__epicc__ColCEN — Upset plot of small RNA clusters

DNA methylation

Output tree

mC/
├── chkpts/     # Empty checkpoint files used for pipeline logic. Deleting them will trigger rerunning the corresponding analysis
├── DMRs/ # Differential Methylated Regions results. Contains list of DMRs for each context in pairwise comparisons and summary tables
├── fastq/      # Processed FASTQ files
├── logs/       # Log files
├── mapped/     # Mapped reads (bam)
├── methylcall/ # CX report (Bismark output) of methylation calls per cytosines
├── reports/    # QC reports (output from Cutadapt) and summary of mapping statistics and methylation statistics (output from Bismark)
└── tracks/     # Track files (bigwigs); strand-specific and merged methylation values (from 0 to 100%) for each replicate and all replicates of each sample merged

Mapping metrics

Data for each sample:

results/mC/reports/summary_mC_<paired>_mapping_stats_<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.txt

Summary table:

results/combined/reports/summary_mapping_stats_<analysis_name>_mC.txt

Plot:

results/combined/plots/mapping_stats_<analysis_name>_mC.pdf

(see histone ChIP-seq for an example)

Methylation Calls

Performed with Bismark

Methylation data for each sample:

results/mC/reports/final_report_<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.html # html summary output from Bismark
results/mC/reports/<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.deduplicated.cytosine_context_summary.txt # methylation level per sequence context output from Bismark
results/mC/reports/<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.deduplicated.M-bias.txt # M-bias data output from Bismark
results/mC/reports/<paired>__<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.deduplicated_splitting_report.txt # Methylation extraction statistics output from Bismark
results/mC/methylcall/<data_type>__<line>__<tissue>__<sample_type>__<replicate>__<ref_genome>.deduplicated.CX_report.txt.gz # Table with methylation values and coverage for each cytosine of the genome

Differential Methylated Regions analysis

Performed with DMRcaller

List of differentially methylated regions (DMRs) in each sequence context and summary tables for each pairwise comparison of samples:

results/mC/DMRs/<data_type_sample1>__<line_sample1>__<tissue_sample1>__<sample_type_sample1>__<ref_genome_sample1>__vs__<data_type_sample2>__<line_sample2>__<tissue_sample2>__<sample_type_sample2>__<ref_genome_sample2>__CG_DMRs.txt # Bed file of DMRs in the CG context between sample1 and sample2 including methylation values and statistics (if there are any CG DMRs)
results/mC/DMRs/<data_type_sample1>__<line_sample1>__<tissue_sample1>__<sample_type_sample1>__<ref_genome_sample1>__vs__<data_type_sample2>__<line_sample2>__<tissue_sample2>__<sample_type_sample2>__<ref_genome_sample2>__CHG_DMRs.txt # Bed file of DMRs in the CHG context between sample1 and sample2 including methylation values and statistics (if mC context is all and there are any CHG DMRs)
results/mC/DMRs/<data_type_sample1>__<line_sample1>__<tissue_sample1>__<sample_type_sample1>__<ref_genome_sample1>__vs__<data_type_sample2>__<line_sample2>__<tissue_sample2>__<sample_type_sample2>__<ref_genome_sample2>__CHH_DMRs.txt # Bed file of DMRs in the CHH context between sample1 and sample2 including methylation values and statistics (if mC context is all and there are any CHH DMRs)

results/mC/DMRs/summary__<data_type_sample1>__<line_sample1>__<tissue_sample1>__<sample_type_sample1>__<ref_genome_sample1>__vs__<data_type_sample2>__<line_sample2>__<tissue_sample2>__<sample_type_sample2>__<ref_genome_sample2>__DMRs.txt # summary table with the number of Hyper- and Hypo-methylated regions in each sequence context between sample1 and sample2

Combined Output

Upset Plot

Perfomed with ComplexUpset.

Table of combined peaks for all TF and histone ChIP-seq samples in the analysis:

results/combined/bedfiles/combined_peaks__all_chip__<analysis_name>__<ref_genome>.bed

Table of combined peaks for all TF and histone ChIP-seq samples in the analysis annotated based on the closest gene:
```
results/combined/bedfiles/annotated__combined_peaks__all_chip__<analysis_name>__<ref_genome>.bed
```

Upset plot:

results/combined/plots/Upset_combined_peaks__all_chip__<analysis_name>__<ref_genome>.pdf

Example:

Upset_combined_peaks__all_chip__epicc__ColCEN — Upset plot of peaks in all histone and TF ChIP-seq samples

Heatmaps and metaplots

Performed with Deeptools.

Heatmaps and metaplots will be performed on all genes in the reference genome, and on all TEs in the TE file provided if optional TE analysis is selected.
Three sets of heatmaps and metaplots will be generated with a corresponding prefix, aligning all regions on their Transcription Start Sites (tss), Transcription End sites (tes) and scaling all regions to the same lenght (regions) (used for examples below).
DNA methylation samples are treated separately for these outputs due to different interpolation method requirement for vizualization (‘nearest’ instead of ‘bilinear’ for the other data types).

Deeptool matrices and sorted region files:

results/combined/matrix/final_matrix_regions__most__<analysis_name>__<ref_genome>__all_genes.gz # matrix scaled by regions for all samples except mC on all genes
results/combined/matrix/final_matrix_regions__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix scaled by regions for mC samples on all genes
results/combined/matrix/sorted_final_matrix_regions__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix scaled by regions for mC samples on all genes sorted according to other samples (only present if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/matrix/Heatmap__regions__most__<analysis_name>__<ref_genome>__all_genes_sorted_regions.bed # bed-file of regions of all genes in sorted order of all samples except mC

results/combined/matrix/final_matrix_tss__most__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TSS for all samples except mC on all genes
results/combined/matrix/final_matrix_tss__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TSS for mC samples on all genes
results/combined/matrix/sorted_final_matrix_tss__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TSS for mC samples on all genes sorted according to other samples (only present if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/matrix/Heatmap__tss__most__<analysis_name>__<ref_genome>__all_genes_sorted_regions.bed # bed-file of regions aligned on TSS of all genes in sorted order of all samples except mC

results/combined/matrix/final_matrix_tes__most__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TES for all samples except mC on all genes
results/combined/matrix/final_matrix_tes__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TES for mC samples on all genes
results/combined/matrix/sorted_final_matrix_tes__mC__<analysis_name>__<ref_genome>__all_genes.gz # matrix of regions aligned on TES for mC samples on all genes sorted according to other samples (only present if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/matrix/Heatmap__tes__most__<analysis_name>__<ref_genome>__all_genes_sorted_regions.bed # bed-file of regions aligned on TES of all genes in sorted order of all samples except mC

#optional: if TE analysis is set to true, the same files will be generated for all regions in the provided TE file

Heatmaps:

results/combined/plots/Heatmap__regions__most__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps scaled by regions of all samples except mC, sorted by mean (default, can be changed in configuration)
results/combined/plots/Heatmap_sorted__regions__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps scaled by regions of mC samples, in the same sort order than all other samples above (if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/plots/Heatmap__regions__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps scaled by regions of mC samples, sorted by mean (default, can be changed in configuration if `heatmap_sort_mc_after_others` is set to `false`)

results/combined/plots/Heatmap__tss__most__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of all samples except mC, sorted by mean (default, can be changed in configuration)
results/combined/plots/Heatmap_sorted__tss__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of mC samples, in the same sort order than all other samples above (if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/plots/Heatmap__tss__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of mC samples, sorted by mean (default, can be changed in configuration if `heatmap_sort_mc_after_others` is set to `false`)

results/combined/plots/Heatmap__tes__most__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of all samples except mC, sorted by mean of all these samples (default, can be changed in configuration)
results/combined/plots/Heatmap_sorted__tes__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of mC samples, in the same sort order than all other samples above (if `heatmap_sort_mc_after_others` is set to `true`)
results/combined/plots/Heatmap__tes__mC__<analysis_name>__<ref_genome>__all_genes.pdf # heatmaps of regions aligned on TSS of mC samples, sorted by mean of all these mC samples (if `heatmap_sort_mc_after_others` is set to `false`)

#optional: if TE analysis is set to true, the same files will be generated for all regions in the provided TE file

Examples:

Heatmap__regions__most__epicc__ColCEN__all_genes — Heatmap of scaled regions for all samples but mC at all genes

Heatmap_sorted__regions__mC__epicc__ColCEN__all_genes — Heatmap of scaled regions for mC samples at all genes in the same sorted order than above

Metaplots:

results/combined/plots/Profile__regions__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) scaled by regions of all samples except mC on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__regions__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) scaled by regions of all samples except mC on all genes (all samples on the same plot, can get messy).
results/combined/plots/Profile__regions__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) scaled by regions of mC samples on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__regions__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) scaled by regions of mC samples on all genes (all samples on the same plot, can get messy).

results/combined/plots/Profile__tss__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TSS of all samples except mC on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__tss__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TSS of all samples except mC on all genes (all samples on the same plot, can get messy).
results/combined/plots/Profile__tss__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TSS of mC samples on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__tss__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TSS of mC samples on all genes (all samples on the same plot, can get messy).

results/combined/plots/Profile__tes__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TES of all samples except mC on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__tes__most__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TES of all samples except mC on all genes (all samples on the same plot, can get messy).
results/combined/plots/Profile__tes__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TES of mC samples on all genes (each sample on its own plot).
results/combined/plots/Profile_pergroup__tes__mC__<analysis_name>__<ref_genome>__all_genes.pdf # Metaplots of mean using lines (defaults, can be changed in configuration) aligned on TES of mC samples on all genes (all samples on the same plot, can get messy).

#optional: if TE analysis is set to true, the same files will be generated for all regions in the provided TE file

Examples:

Profile__tss__mC__epicc__ColCEN__all_genes — Metaplot of all genes aligned on their TSS for mC samples

Profile_pergroup__tss__mC__epicc__ColCEN__all_genes — Metaplot of all genes aligned on their TSS for mC samples on the same plot

(the actual outputs are in pdf format)

Browser screenshots

Performed with Gviz.

Browser shot over whole chromosomes:

results/combined/plots/Browser_full_chromosomes__all__epicc__ColCEN.pdf # Browser of all the samples on full-length chromsomes

Example:

Browser_full_chromosomes__all__epicc__ColCEN — Browser shot of full-length chromosome1 (binsize 50kb)

Browser shot over target regions:

results/combined/plots/Browser_interesting_genes__all__epicc__ColCEN.pdf # Browser of all the samples at the target loci `data/target_loci.bed`

Example:

Browser_DDM1 — Browser shot around a target loci, here the region surrounding the DDM1 gene (binsize 1bp)

(the actual output is in pdf format)