Methods | Labels | Positive set | Negative set | Classifier | Epigenetic marks data used | CRM candidates |
---|---|---|---|---|---|---|
Our method | TF binding | CRMs overlapping TF binding peaks | Randomly selected non-CRMs or CRMs not overlapping STARR-seq peaks | LR | CA, H3K27ac, H3K4me1, H3K4me3 | Predicted CRMs |
Matched Filter | STARR-seq & H3K27ac peaks | 2-kb regions around STARR-seq peaks overlapping H3K27ac or CA peaks | Randomly selected 2-kb bins not overlapping STARR-seq and H3K27ac/CA peaks | SVM, random forest, ridge regression | CA, H3K9ac, H3K27ac, H3K4me1, H3K4me2, H3K4me3 | 2-kb sliding windows |
REPTILE | EP300 binding | DMRs in ±1-kb regions around the summits of top EP300 peaks | Randomly selected 2-kb bins not overlapping EP300 peaks | Random forest | mCG, H3K4me1, H3K4me2, H3K4me3, H3K27me3, H3K9ac, H3K27ac | 2-kb sliding windows with 100-bp step size |
RFECS | EP300 binding | ±1-kb regions around the summits of top EP300 peaks | Randomly selected 2-kb bins not overlapping EP300 peaks | Random forest | mCG, H3K4me1, H3K4me2, H3K4me3, H3K27me3, H3K9ac, H3K27ac | 2-kb sliding windows with 100-bp step size |
DELTA | EP300 binding and promoters | Top EP300 peaks and all known promoters | Randomly selected 2-kb bins not overlapping EP300 peaks and promoters | AdaBoost | mCG, H3K4me1, H3K4me2, H3K4me3, H3K27me3, H3K9ac, H3K27ac | 2-kb sliding windows with 100-bp step size |
CSI-ANN | EP300 binding or known CRMs | Known CRMs or top EP300 peaks | Randomly selected 2-kb bins | Neural network | mCG, H3K4me1, H3K4me2, H3K4me3, H3K27me3, H3K9ac, H3K27ac | 2-kb sliding windows with 100-bp step size |
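
Several rows above describe CRM candidates as "2-kb sliding windows with 100-bp step size" and negative sets as 2-kb bins not overlapping a set of peaks. The following is a minimal Python sketch of those two ideas only, not the implementation used by any of the listed tools. The function names (`sliding_windows`, `overlaps`, `negative_candidates`), the 0-based half-open coordinate convention, and the toy chromosome length and peak are all assumptions made for illustration.

```python
def sliding_windows(chrom_length, window=2000, step=100):
    """Yield (start, end) windows over [0, chrom_length).

    Coordinates are 0-based and half-open (an assumption of this sketch).
    A chromosome shorter than `window` yields one truncated window.
    """
    last_start = max(chrom_length - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, min(start + window, chrom_length)


def overlaps(a, b):
    """Return True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def negative_candidates(windows, peaks):
    """Keep only the windows that overlap none of the given peaks.

    A negative set would then be drawn at random from this list,
    for example with random.sample.
    """
    return [w for w in windows if not any(overlaps(w, p) for p in peaks)]


# Toy example: a 5-kb sequence with a single hypothetical peak.
windows = list(sliding_windows(5000))
peaks = [(2500, 2600)]
negatives = negative_candidates(windows, peaks)
```

The linear scan over peaks in `negative_candidates` is adequate for a toy example; at genome scale an interval tree or a sorted sweep would be the more usual choice.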