Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks

Abstract

Chronic lymphocytic leukemia (CLL) is characterized by substantial clinical heterogeneity, despite relatively few genetic alterations. To provide a basis for studying epigenome deregulation in CLL, we established genome-wide chromatin accessibility maps for 88 CLL samples from 55 patients using the ATAC-seq assay, which we complemented by ChIPmentation and RNA-seq data for a representative subset of samples. Furthermore, we devised a bioinformatic method for linking these chromatin profiles to clinical annotations. Our analysis identified sample-specific variation on top of a shared core of CLL regulatory regions. IGHV mutation status – which distinguishes the two major subtypes of CLL – was accurately predicted by the chromatin profiles, and gene regulatory networks inferred for IGHV-mutated vs. IGHV-unmutated samples identified characteristic regulatory differences between these two disease subtypes. In summary, we found widespread heterogeneity in the CLL chromatin landscape, established a community resource for studying epigenome deregulation in leukemia, and demonstrated the feasibility of chromatin accessibility mapping in cancer cohorts and clinical research.

Data Visualization: Interactive browser tracks for the chromatin accessibility landscape of CLL

All data are available for interactive browsing using a UCSC Genome Browser track hub.

This track hub includes several types of tracks:

  1. Signal intensity for individual ATAC-seq, ChIPmentation and RNA-seq samples
  2. Cohort-level map of chromatin-accessible regions in CLL
  3. Combined ATAC-seq signal for IGHV-mutated and IGHV-unmutated samples
  4. Combined ChIPmentation signal for three CLL subtypes (mCLL, iCLL, uCLL)

Data Analysis: The CLL chromatin accessibility map and gene regulatory networks for download

Analysis (description and download)

Cohort-level map of chromatin-accessible regions in CLL
  Accessibility values (quantile normalized, log2) and cohort-level statistics (217 MB)

Hypervariable regions within sample groups with different IGHV mutation status
  BED file with regions more variable in mCLL (19.3 KB)
  BED file with regions more variable in uCLL (37.5 KB)

Differentially expressed genes between CLL disease subtypes
  uCLL vs mCLL (25.5 KB)
  uCLL vs iCLL (43.5 KB)
  iCLL vs mCLL (10.8 KB)

Chromatin accessibility regions associated with IGHV mutation status
  BED file with associated regions (37.9 KB)
  Accessibility values (quantile normalized, log2) and cohort-level statistics of IGHV regions (3.2 MB)

Gene regulatory networks
  Transcription factor footprint-based inference of TF-DNA interactions and network inference. Networks are directed, weighted graphs.
  CLL cohort-level network (inferred from all samples) (6.9 MB)
  CD19+ DNase-seq network (from publicly available data) (10.3 MB)
  mCLL cohort-level network (inferred from IGHV-mutated samples) (7.9 MB)
  uCLL cohort-level network (inferred from IGHV-unmutated samples) (5.4 MB)
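
The gene regulatory networks listed above are described as directed, weighted graphs. As a rough illustration of working with such a graph, the sketch below loads an edge list into a nested dictionary and sums edge weights per node. The file layout is an assumption (tab-separated source, target and weight columns) and may not match the downloadable files; the function names (parse_edges, out_strength, in_strength) and the node names TF_A, TF_B, GENE_1 and GENE_2 are hypothetical placeholders, not part of this resource.

```python
from collections import defaultdict


def parse_edges(lines):
    """Build {source: {target: weight}} from tab-separated edge lines.

    Assumed layout (hypothetical): source <TAB> target <TAB> weight.
    Blank lines and lines starting with '#' are skipped.
    """
    graph = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target, weight = line.split("\t")[:3]
        graph[source][target] = float(weight)
    return dict(graph)


def out_strength(graph):
    """Sum of outgoing edge weights per source node (e.g. per TF)."""
    return {source: sum(targets.values()) for source, targets in graph.items()}


def in_strength(graph):
    """Sum of incoming edge weights per target node (e.g. per gene)."""
    totals = defaultdict(float)
    for targets in graph.values():
        for target, weight in targets.items():
            totals[target] += weight
    return dict(totals)


# Made-up edges for illustration only; these are not values from the dataset.
example_lines = [
    "# source\ttarget\tweight",
    "TF_A\tGENE_1\t0.75",
    "TF_A\tGENE_2\t0.5",
    "TF_B\tGENE_1\t0.25",
]
graph = parse_edges(example_lines)
tf_out = out_strength(graph)
gene_in = in_strength(graph)
```

Keeping direction explicit (source to target) makes it possible to compare, for example, the summed outgoing weight of a transcription factor between the mCLL and uCLL networks once both are loaded the same way.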

Data Access: All raw and processed data are available from the corresponding repositories

Processed chromatin accessibility data are openly available through the Gene Expression Omnibus (GEO) repository. To protect patient privacy, the raw sequence reads are available only under controlled access through the European Genome-phenome Archive (EGA), subject to a Data Access Agreement.

Raw sequencing data (available through the European Genome-phenome Archive (EGA))
  Chromatin accessibility (ATAC-seq): ATAC-seq raw sequence reads
  Histone marks (ChIPmentation): ChIPmentation raw sequence reads
  Gene expression (RNA-seq): RNA-seq raw sequence reads

Processed data (available at the Gene Expression Omnibus (GEO) repository)
  Chromatin accessibility (ATAC-seq): ATAC-seq peaks (BigWig and BED files)
  Histone marks (ChIPmentation): ChIPmentation histone peaks (BigWig and BED files)
  Gene expression (RNA-seq): RNA-seq gene expression values (CSV file)
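
The processed peak sets are distributed as BED files. In the standard BED format, the first three columns give the chromosome, a 0-based start and an exclusive end coordinate. The following is a minimal sketch of reading those three columns. The Region type, the parse_bed function and the example coordinates are illustrative assumptions; any additional columns in the actual files are not interpreted here.

```python
from typing import Iterable, List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)

    @property
    def length(self) -> int:
        return self.end - self.start


def parse_bed(lines: Iterable[str]) -> List[Region]:
    """Parse the first three BED columns, skipping header and comment lines."""
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        regions.append(Region(fields[0], int(fields[1]), int(fields[2])))
    return regions


# Made-up coordinates for illustration only; not taken from the dataset.
example_bed = [
    "track name=example",
    "chr1\t1000\t1500\tpeak_1",
    "chr2\t200\t450\tpeak_2",
]
regions = parse_bed(example_bed)
total_bp = sum(region.length for region in regions)
```

To read a downloaded file, the same function can be given an open file handle, for example `with open(path) as handle: regions = parse_bed(handle)`, where `path` points to a local copy of the BED file.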

Source Code: The analysis scripts are available from a git source code repository

Access the repository on GitHub or download the git repository.

Download the complete set of analysis outputs.

Citation

If you use this resource in your research, please cite:

André F. Rendeiro*, Christian Schmidl*, Jonathan C. Strefford*, Renata Walewska, Zadie Davis, Matthias Farlik, David Oscier, Christoph Bock
Chromatin accessibility maps of chronic lymphocytic leukemia identify subtype-specific epigenome signatures and transcription regulatory networks.
Nat. Commun. 7:11938 doi: 10.1038/ncomms11938 (2016).

*Shared first authors