ChIP-seq histone modification data allows for accurate prediction of transcription factor bound genes.
2010 AMATA Conference
Robert C McLeay and Timothy L Bailey
Motif scanning of gene promoter regions is commonly used to determine which genes are regulated by a particular transcription factor (TF). Since TF binding site motifs are short and degenerate, large numbers of false positives are predicted. In addition, TF binding is both tissue- and condition-specific; sites bound in one tissue may not be bound in another. These properties impair the accuracy of the many different analyses that use in silico TF binding site predictions.
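The false-positive problem can be seen in a minimal sketch of motif scanning with a log-odds position weight matrix. Everything here is illustrative: the 4-position count matrix, the uniform background, the pseudocount, and the function names are assumptions made for this example and do not come from the study.

```python
import math

BASES = "ACGT"
BACKGROUND = 0.25   # assumed uniform base frequencies
PSEUDOCOUNT = 0.5   # assumed smoothing to avoid log(0)

# Hypothetical count matrix: one row per motif position, columns A, C, G, T.
COUNTS = [
    [8, 1, 1, 0],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 1, 0, 9],
]


def log_odds_matrix(counts):
    """Convert raw counts to log2 odds against the background."""
    matrix = []
    for row in counts:
        total = sum(row) + 4 * PSEUDOCOUNT
        matrix.append(
            [math.log2(((c + PSEUDOCOUNT) / total) / BACKGROUND) for c in row]
        )
    return matrix


def scan(sequence, pwm):
    """Score every window of the sequence; returns (offset, score) pairs.

    Expects an uppercase sequence containing only A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


def best_hit(sequence, pwm):
    """Return the highest-scoring (offset, score) pair in the sequence."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


pwm = log_odds_matrix(COUNTS)
promoter = "TTACGTAA"  # toy promoter sequence
offset, score = best_hit(promoter, pwm)
```

Because the motif is only a few bases wide, many positions in a long promoter will score above any permissive threshold by chance, which is the source of the false positives described above.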
Chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-seq) has enabled genome-wide mapping of transcription factor binding sites and histone modifications. These datasets allow whole-genome evaluation of computational methods for predicting transcription factor binding, and of the association of condition- and tissue-specific histone modifications with TF binding. We use ChIP-seq data for 8 transcription factors as a gold standard for developing and evaluating methods for computational prediction of transcription factor binding.
We integrate histone modification information with traditional motif scanning in a fast, more accurate method for predicting TF-DNA binding. The method is widely applicable in many common analyses as an alternative to current motif scanning methods. Compared to an existing method that uses a phylogenetic motif model integrating evolutionary information to improve performance, our method predicts fewer than half as many false positives at 90% recovery of TF-bound promoters.
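The evaluation measure used here, false positives at 90% recovery of bound promoters, can be sketched as follows. The abstract does not say how the histone signal and the motif score are combined, so the additive `combined_score` below is only a placeholder assumption, and the toy scores, signals, and labels are invented for illustration.

```python
import math


def false_positives_at_recovery(scores, bound, recovery=0.9):
    """Count unbound promoters that score at or above the threshold
    needed to recover the requested fraction of bound promoters."""
    bound_scores = sorted((s for s, b in zip(scores, bound) if b), reverse=True)
    if not bound_scores:
        raise ValueError("at least one bound promoter is required")
    needed = math.ceil(recovery * len(bound_scores))
    threshold = bound_scores[needed - 1]
    return sum(1 for s, b in zip(scores, bound) if not b and s >= threshold)


def combined_score(motif_score, h3k4me3_signal, weight=1.0):
    """Assumed combination: motif score plus a weighted log of the
    H3K4me3 signal. This is a placeholder, not the published method."""
    return motif_score + weight * math.log1p(h3k4me3_signal)


# Toy data: per-promoter motif score, H3K4me3 signal, and ChIP-seq label.
motif_scores = [5.0, 4.0, 3.0, 4.5, 2.0, 4.8]
h3k4me3 = [10.0, 8.0, 0.0, 0.0, 20.0, 0.0]
is_bound = [True, True, False, False, True, False]

fp_motif_only = false_positives_at_recovery(motif_scores, is_bound)
fp_combined = false_positives_at_recovery(
    [combined_score(m, h) for m, h in zip(motif_scores, h3k4me3)], is_bound
)
```

In this toy set, a bound promoter with a weak motif match forces a low threshold when motif scores are used alone; adding the histone signal raises its score relative to the unbound promoters, so fewer of them pass the threshold.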
Our results show that the H3K4me3 histone modification, which indicates transcriptionally active genes, is more informative for predicting TF-bound genes than phylogenetic information, and for approximately half of bound promoters, more informative than the TF's DNA-binding affinity.
Last Updated ( Tuesday, 18 January 2011 23:55 )


