Supplementary web site for

Kevin Y. Yip, Chao Cheng, Nitin Bhardwaj, James B. Brown, Jing Leng, Anshul Kundaje, Joel Rozowsky, Ewan Birney, Peter J. Bickel, Michael Snyder and Mark Gerstein

Classification of genomic regions based on experimentally-determined binding sites of more than 100 transcription-related factors in the whole human genome

Files of the six types of regions, with neighboring bins merged into regions

All files are in BED format and compressed by gzip.

Region type | GM12878 | H1-hESC | HeLa-S3 | Hep-G2 | K562 | All cell types
Binding Active Regions (BARs) | Region file | Region file | Region file | Region file | Region file | Region files in tar
Binding Inactive Regions (BIRs) | Region file | Region file | Region file | Region file | Region file | Region files in tar
Promoter-Proximal Regulatory Modules (PRMs) | Region file | Region file | Region file | Region file | Region file | Region files in tar
Gene-Distal Regulatory Modules (DRMs) | Region file | Region file | Region file | Region file | Region file | Region files in tar
High Occupancy of TRF (HOT) regions | Region file | Region file | Region file | Region file | Region file | Region files in tar
Intergenic High Occupancy of TRF (HOT) regions | Region file | Region file | Region file | Region file | Region file | Region files in tar
Low Occupancy of TRF (LOT) regions | Region file | Region file | Region file | Region file | Region file | Region files in tar
Intergenic Low Occupancy of TRF (LOT) regions | Region file | Region file | Region file | Region file | Region file | Region files in tar
All types of regions | Region files in tar | Region files in tar | Region files in tar | Region files in tar | Region files in tar | Region files in tar

Intersection of Gene-Distal Regulatory Modules (DRMs) and predicted enhancers of ChromHMM and Segway | Region file | Region file | Region file | Region file | Region file | Region files in tar
Intersection of Gene-Distal Regulatory Modules (DRMs) and predicted weak enhancers of ChromHMM and Segway | Region file | Region file | Region file | Region file | Region file | Region files in tar
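
Since the region files are gzip-compressed BED files, they can be read without decompressing them first. The following is a minimal Python sketch, not part of the original site: the names BedInterval and read_bed_gz are hypothetical, and it assumes that only the first three BED columns (chromosome, start, end) are needed.

```python
import gzip
from typing import Iterator, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def read_bed_gz(path: str) -> Iterator[BedInterval]:
    """Yield the first three columns of each record in a gzipped BED file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the optional BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            yield BedInterval(fields[0], int(fields[1]), int(fields[2]))
```

For example, list(read_bed_gz("regions.bed.gz")) would load every region of one downloaded file into memory, where "regions.bed.gz" stands for whatever name the file was saved under.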

Files related to second round of enhancer validations

List of predicted enhancers (before extending based on experimental requirements) Gzipped BED file

Files related to DRM-target transcript associations

Union of DRM bins from all cell lines Gzipped BED file
Union of DRM bins merged into regions, allowing gaps of 1 bin in size Gzipped BED file
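
Merging bins into regions while allowing gaps of one bin can be sketched as below. This is an illustrative Python reconstruction, not the procedure used to produce the file: the function name merge_bins is hypothetical, and the bin size is left as a parameter because it is not stated on this page.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) with BED-style half-open coordinates
Interval = Tuple[str, int, int]


def merge_bins(bins: Iterable[Interval], bin_size: int,
               max_gap_bins: int = 1) -> List[Interval]:
    """Merge bins on the same chromosome separated by at most max_gap_bins bins."""
    max_gap = max_gap_bins * bin_size
    merged: List[Interval] = []
    for chrom, start, end in sorted(bins):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Close enough to the previous region: extend it.
            _, prev_start, prev_end = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# bin_size=100 is an assumed value chosen only for illustration.
example = [("chr1", 0, 100), ("chr1", 100, 200),
           ("chr1", 300, 400), ("chr1", 600, 700)]
print(merge_bins(example, bin_size=100))
# [('chr1', 0, 400), ('chr1', 600, 700)]
```

In the example, the one-bin gap between positions 200 and 300 is bridged, while the two-bin gap between 400 and 600 starts a new region.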
DRM-transcript pairs and associated TFs
Each line contains a DRM-transcript pair with related fields in different columns:
  1. RNA setting (0-3)
  2. Experimental method (0, 1, 2: RNA-seq; 3: CAGE)
  3. Transcript selection (0: Poly A-; 1, 3: Poly A+; 2: Short RNA)
  4. Cell compartment (always whole cell)
  5. Parent gene of the transcript
  6. Chromosome of the transcript
  7. Coordinate of the TSS of the transcript, where the first position of each chromosome is counted as 1 (as opposed to 0 in BED files)
  8. Strand of the transcript
  9. The histone modification involved in the correlation
  10. Chromosome of the DRM
  11. First position of the DRM
  12. Last position of the DRM
  13. Correlation between the histone modification signals at the DRM and the expression level of the transcript
  14. P-value of correlation before correcting for multiple hypothesis testing
  15. P-value of correlation after correcting for multiple hypothesis testing
  16. List of cell lines with strong histone modification signals at the DRMs, and TRFs that bind the DRM in these cell lines in ENCODE naming format
  17. List of cell lines with strong histone modification signals at the DRMs, and TRFs that bind the DRM in these cell lines in HGNC naming format
  18. List of cell lines with strong histone modification signals at the DRMs, and the starting or ending location of a CTCF binding peak between the DRM and the transcript, if any, in the cell lines
Gzipped tab-delimited file
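
The 18 columns described above can be loaded with a short Python sketch such as the following. The field names in COLUMNS and the functions read_pairs and tss_to_bed are hypothetical, introduced only for illustration. The sketch assumes the file has no header row and that columns 7 and 11 to 15 hold plain numbers. Only the TSS coordinate is converted to BED convention, since it is the only coordinate whose numbering is stated above; the DRM positions are kept as given.

```python
import gzip
from typing import Dict, Iterator, Tuple

# Hypothetical field names, in the column order listed above.
COLUMNS = [
    "rna_setting", "method", "transcript_selection", "compartment",
    "gene", "tss_chrom", "tss_pos", "strand", "histone_mark",
    "drm_chrom", "drm_first", "drm_last", "correlation",
    "p_raw", "p_adjusted", "trfs_encode", "trfs_hgnc", "ctcf_peaks",
]


def read_pairs(path: str) -> Iterator[Dict[str, object]]:
    """Yield one dictionary per DRM-transcript pair."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) != len(COLUMNS):
                continue  # skip blank or malformed lines
            record: Dict[str, object] = dict(zip(COLUMNS, fields))
            for key in ("tss_pos", "drm_first", "drm_last"):
                record[key] = int(fields[COLUMNS.index(key)])
            for key in ("correlation", "p_raw", "p_adjusted"):
                record[key] = float(fields[COLUMNS.index(key)])
            yield record


def tss_to_bed(chrom: str, tss_pos: int) -> Tuple[str, int, int]:
    """Convert a 1-based TSS position to a 0-based, half-open BED interval."""
    return (chrom, tss_pos - 1, tss_pos)
```

A typical use would be to keep only the pairs that remain significant after multiple-testing correction, for example [r for r in read_pairs(path) if r["p_adjusted"] < 0.05], where path is the location of the downloaded file and 0.05 is an arbitrary threshold.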