predict_regulator_context_cofactors¶
Identify context-specific cofactors for a regulator of interest.
This command compares how the co-association pattern of a focal regulator changes across different functional region classes, such as active enhancers versus poised enhancers.
Overview¶
predict_regulator_context_cofactors is useful for regulators that may work with
different cofactors in different genomic contexts.
The command:
builds a labeled region dataset from functional BED files
trains a region function classifier, or loads an existing checkpoint
generates regulator embeddings for each region class
compares the focal regulator’s cofactor patterns between region classes
reports candidate context-specific cofactors
For multiple region classes, all class pairs are compared.
Basic Usage¶
Compare two region classes¶
chrombert-tools predict_regulator_context_cofactors \
--function-bed active_enh.bed --function-name active_enh \
--function-bed poised_enh.bed --function-name poised_enh \
--dual-regulator "CTCF;BRD4" \
--genome hg38 \
--resolution 1kb \
--odir output
Compare multiple region classes¶
With three classes, ChromBERT-tools compares all three class pairs.
chrombert-tools predict_regulator_context_cofactors \
--function-bed enhancer.bed --function-name enhancer \
--function-bed promoter.bed --function-name promoter \
--function-bed silencer.bed --function-name silencer \
--dual-regulator "EZH2" \
--genome hg38 \
--resolution 1kb \
--odir output
Reuse an existing checkpoint¶
Use --ft-ckpt to skip classifier fine-tuning.
chrombert-tools predict_regulator_context_cofactors \
--function-bed active_enh.bed --function-name active_enh \
--function-bed poised_enh.bed --function-name poised_enh \
--dual-regulator "CTCF;BRD4" \
--ft-ckpt path/to/region_function_finetuned.ckpt \
--genome hg38 \
--resolution 1kb \
--odir output
Run with Apptainer¶
Use --nv to enable GPU access.
apptainer exec --nv /path/to/chrombert-tools.sif chrombert-tools predict_regulator_context_cofactors \
--function-bed active_enh.bed \
--function-name active_enh \
--function-bed poised_enh.bed \
--function-name poised_enh \
--dual-regulator "CTCF;BRD4" \
--genome hg38 \
--resolution 1kb \
--odir output
Parameters¶
Class definition¶
--function-bed(file path or semicolon-separated paths)BED file(s) defining one functional region class.
Repeat this option to provide multiple classes. At least two classes are needed for context comparison. If only one class is provided, a background class is generated automatically.
--function-name(string, optional)Name of each functional region class.
Class names are used in output directory names and make results easier to read.
--function-mode(and | or, default: and)How to combine multiple BED files within one class.
andkeeps shared regions.orkeeps the union of regions.
Focal regulator¶
--dual-regulator(string, required)Regulators of interest, separated by semicolons. For example:
"EZH2;BRD4"--ignore-regulator(string, optional)Regulators to mask during fine-tuning, separated by semicolons.
Cofactor detection options¶
--threshold(float, default: 0.1)Minimum difference in regulator similarity between two classes.
Lower values keep more candidate cofactors. Higher values produce a shorter and more stringent candidate list.
--quantile(float, default: 0.95)Quantile used to build the regulator similarity graph for subnetwork visualization.
This option affects the PDF subnetwork figure, not the candidate cofactor table.
Training options¶
--ft-ckpt(file path, optional)Existing fine-tuned region function classifier.
If provided, ChromBERT-tools skips classifier fine-tuning.
--mode(fast | full, default: fast)Training mode.
fastuses a subset of regions for quicker training.fulluses all eligible regions.--batch-size(int, default: 4)Batch size used for training and embedding generation.
Reference and output options¶
--genome(hg38 | mm10, default: hg38)Reference genome.
--resolution(200bp | 1kb | 2kb | 4kb, default: 1kb)ChromBERT bin resolution. For
mm10, only1kbis currently supported.--odir(directory, default: ./output)Output directory. It will be created automatically if needed.
--chrombert-cache-dir(directory, default: ~/.cache/chrombert/data)Directory for ChromBERT reference files, model files, and cached data.
Required cache files¶
The command uses the following ChromBERT cache files:
ChromBERT reference region file
ChromBERT HDF5 feature file
metadata file
ChromBERT regulator list
pre-trained ChromBERT checkpoint
mask matrix
Outputs¶
The following files are written under <odir>.
dataset/Labeled region dataset and train, validation, and test splits.
train/Classifier fine-tuning outputs and checkpoints.
This directory is created only when
--ft-ckptis not provided.emb/region<i>/Mean regulator embeddings for region class
i.results/<i>_<j>_<name_i>_vs_<name_j>/Results for each pair of region classes.
Main files include:
dual_regulator_<reg>_candidate_cofactors.csvCandidate context-specific cofactors for the focal regulator.
dual_regulator_<reg>_subnetwork.pdfSubnetwork figure showing the focal regulator and its context-specific partners.
Interpretation¶
Candidate cofactors are regulators whose similarity to the focal regulator changes between two region classes.
A larger similarity difference suggests stronger context specificity.
For example, a cofactor may be strongly associated with the focal regulator in active enhancers but not in poised enhancers.
Tips¶
Use biologically meaningful region classes, such as active versus poised enhancers or gained versus lost peaks.
Use
--function-nameto make pairwise result directories easy to understand.Use
--thresholdto control how many candidate cofactors are reported.Use
--quantileto control the density of the subnetwork figure.Use
--ft-ckptto reuse a trained classifier and test different focal regulators.To see all options, run:
chrombert-tools predict_regulator_context_cofactors -h