Generate regulator representations from ChromBERT

Note: The examples below use only the Bash command line to extract regulator embeddings.

embed_regulator subcommand: generates regulator embeddings for the specified regulators across user-specified regions.

For the Python API, see `examples/api/embed_regulator.ipynb <../api/embed_regulator.ipynb>`__.

If you need to use an Apptainer container, see the `apptainer_use.ipynb <apptainer_use.ipynb>`__ tutorial for detailed instructions on using apptainer exec with chrombert-tools.

For more details, please refer to the `embed_regulator <https://chrombert-tools.readthedocs.io/en/latest/commands/embed_regulator.html>`__ command documentation.

[1]:
# Show the available options
!chrombert-tools embed_regulator -h
Usage: chrombert-tools embed_regulator [OPTIONS]

  Extract regulator embeddings on specified regions. Supports both general and
  cell-specific modes.

Options:
  --region FILE                   Region file.  [required]
  --regulator TEXT                Regulators of interest, e.g. EZH2 or
                                  EZH2;BRD4. Use ';' to separate multiple
                                  regulators.  [required]
  --cell-type-bw FILE             Cell type accessibility BigWig file. Used
                                  for cell-specific mode.
  --cell-type-peak FILE           Cell type accessibility Peak BED file. Used
                                  for cell-specific mode.
  --ft-ckpt FILE                  Fine-tuned checkpoint. If provided, use
                                  cell-specific model and skip fine-tuning.
  --odir DIRECTORY                Output directory.  [default: ./output]
  --oname TEXT                    Output name of the regulator embeddings.
                                  [default: regulator_emb]
  --genome [hg38|mm10]            Genome.  [default: hg38]
  --resolution [1kb|200bp|2kb|4kb]
                                  Resolution.  [default: 1kb]
  --mode [fast|full]              Used when training cell-specific model.
                                  [default: fast]
  --batch-size INTEGER            Batch size.  [default: 4]
  --num-workers INTEGER           Dataloader workers.  [default: 8]
  --chrombert-cache-dir DIRECTORY
                                  ChromBERT cache dir (contains config/
                                  checkpoint/ etc).  [default:
                                  ~/.cache/chrombert/data]
  -h, --help                      Show this message and exit.
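
When the command has to be run for several regulator sets or resolutions, it can help to assemble the argument list in Python. The sketch below is only an illustration: the helper name `build_embed_regulator_cmd` is hypothetical, and it uses only flags that appear in the help text above.

```python
import shlex


def build_embed_regulator_cmd(region, regulators, odir,
                              genome="hg38", resolution="1kb", ft_ckpt=None):
    """Assemble an argument list for `chrombert-tools embed_regulator`.

    Hypothetical helper; it only uses flags listed in the help output.
    """
    cmd = [
        "chrombert-tools", "embed_regulator",
        "--region", str(region),
        # The CLI expects a single string with ';' between regulator names.
        "--regulator", ";".join(regulators),
        "--odir", str(odir),
        "--genome", genome,
        "--resolution", resolution,
    ]
    if ft_ckpt is not None:
        # Optional fine-tuned checkpoint for cell-specific embeddings.
        cmd += ["--ft-ckpt", str(ft_ckpt)]
    return cmd


cmd = build_embed_regulator_cmd(
    region="../data/CTCF_ENCFF664UGR_sample100.bed",
    regulators=["EZH2", "BRD4", "CTCF"],
    odir="./output_emb_regulator_1kb",
)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the list directly to `subprocess.run` avoids shell quoting problems with the `;` separator in the regulator string.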

Generate regulator embeddings (pre-trained and general)

[ ]:
%%bash
# --region: focus regions
# --regulator: focus regulators
# --odir: output directory
# --genome: genome
# --resolution: resolution
chrombert-tools embed_regulator \
--region '../data/CTCF_ENCFF664UGR_sample100.bed' \
--regulator "EZH2;BRD4;CTCF;FOXA3;myod1;myF5" \
--odir "./output_emb_regulator_1kb" \
--genome "hg38" \
--resolution "1kb"
Region summary - total: 100, overlapping with ChromBERT: 100 (one region may overlap multiple ChromBERT regions, we keep overlaps with ≥50% coverage of either the ChromBERT bin or the input region), non-overlapping: 0
Note: All regulator names were converted to lowercase for matching.
Regulator count summary - requested: 6, matched in ChromBERT: 5, not found: 1, not found regulator: ['foxa3']
ChromBERT regulators: /mnt/Storage/home/chenqianqian/.cache/chrombert/data/config/hg38_6k_regulators_list.txt
Load pretrained ckpt /mnt/Storage/home/chenqianqian/.cache/chrombert/data/checkpoint/hg38_6k_1kb_pretrain.ckpt successfully!
Your supervised_file does not contain the 'label' column. Please verify whether ground truth column ('label') is required. If it is not needed, you may disregard this message.
Your supervised_file does not contain the 'label' column. Please verify whether ground truth column ('label') is required. If it is not needed, you may disregard this message.
Computing regulator embeddings: 100%|██████████| 25/25 [00:05<00:00,  4.91it/s]

Finished!
Focus region summary - total: 100, overlapping with ChromBERT: 100, non-overlapping: 0
Overlapping regions BED file: ./output_emb_regulator_1kb/overlap_region.bed
Non-overlapping regions BED file: ./output_emb_regulator_1kb/no_overlap_region.bed
Mean regulator embeddings saved to: ./output_emb_regulator_1kb/mean_regulator_emb.pkl
Region-aware regulator embeddings saved to: ./output_emb_regulator_1kb/region_aware_regulator_emb.hdf5
Embedding type: general
[ ]:
# mean_regulator_emb.pkl: one 768-dim vector per regulator (averaged across all regions)
import pickle
with open("./output_emb_regulator_1kb/mean_regulator_emb.pkl", "rb") as f:
    mean_regulator_emb_dict = pickle.load(f)

for key, value in mean_regulator_emb_dict.items():
    print(key, value.shape)

ezh2 (768,)
myod1 (768,)
brd4 (768,)
ctcf (768,)
myf5 (768,)
[6]:
# HDF5 file: the 'emb' group maps each matched regulator to its region-aware embeddings,
# each of shape (N_regions, 768)
import h5py
with h5py.File("./output_emb_regulator_1kb/region_aware_regulator_emb.hdf5", "r") as f:
    print(f.keys())
    print(f['emb'].keys())
    for i in f['emb'].keys():
        print(i, f['emb'][i].shape)
<KeysViewHDF5 ['emb', 'region']>
<KeysViewHDF5 ['brd4', 'ctcf', 'ezh2', 'myf5', 'myod1']>
brd4 (100, 768)
ctcf (100, 768)
ezh2 (100, 768)
myf5 (100, 768)
myod1 (100, 768)
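
Since the mean embeddings are described as averages across all regions, one possible sanity check is to average a regulator's region-aware matrix and compare it with the stored mean vector. The sketch below uses synthetic arrays with the shapes printed above. The function name `mean_matches_region_aware` and the tolerance are assumptions, and whether the real files agree to this tolerance has not been verified here.

```python
import numpy as np


def mean_matches_region_aware(region_emb, mean_emb, atol=1e-3):
    """Return True if averaging region_emb over regions reproduces mean_emb.

    region_emb: array of shape (N_regions, 768)
    mean_emb:   array of shape (768,)
    """
    region_emb = np.asarray(region_emb, dtype=np.float64)
    mean_emb = np.asarray(mean_emb, dtype=np.float64)
    return bool(np.allclose(region_emb.mean(axis=0), mean_emb, atol=atol))


# Synthetic stand-ins for the real data (assumption: real arrays would come
# from f['emb']['ctcf'][:] and mean_regulator_emb_dict['ctcf']).
rng = np.random.default_rng(0)
fake_region_emb = rng.normal(size=(100, 768))
fake_mean_emb = fake_region_emb.mean(axis=0)

print(mean_matches_region_aware(fake_region_emb, fake_mean_emb))
```

To use real data, replace the synthetic arrays with the matrix read from the HDF5 file and the vector loaded from the pickle file.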

Generate cell-type-specific embeddings using a fine-tuned checkpoint

Pass a fine-tuned checkpoint to the embed_regulator subcommand with --ft-ckpt. The subcommand then uses this checkpoint to generate cell-type-specific embeddings and skips fine-tuning.

[ ]:
# # Download example data
# # Myoblast and fibroblast data: ATAC-seq bigWig and peak files
# import subprocess
# import os
# if not os.path.exists('../data/myoblast_ENCFF647RNC_peak.bed'):
#     cmd = f'wget https://www.encodeproject.org/files/ENCFF647RNC/@@download/ENCFF647RNC.bed.gz -O ../data/myoblast_ENCFF647RNC_peak.bed.gz'
#     subprocess.run(cmd, shell=True)
#     cmd = f"gzip -d ../data/myoblast_ENCFF647RNC_peak.bed.gz"
#     subprocess.run(cmd, shell=True)

# if not os.path.exists('../data/myoblast_ENCFF149ERN_signal.bigwig'):
#     cmd = f'wget https://www.encodeproject.org/files/ENCFF149ERN/@@download/ENCFF149ERN.bigWig -O ../data/myoblast_ENCFF149ERN_signal.bigwig'
#     subprocess.run(cmd, shell=True)


## Fine-tune a cell-type-specific model
# '''
# --odir: output directory
# --acc_signal1: cell-type-specific accessibility signal
# --acc_peak1: cell-type-specific peak
# --genome: genome
# --resolution: resolution
# '''
# !chrombert-tools region_activity_regression \
# --odir "./output_cell_specific_emb_train" \
# --acc_signal1 "../data/myoblast_ENCFF149ERN_signal.bigwig" \
# --acc_peak1 "../data/myoblast_ENCFF647RNC_peak.bed" \
# --genome "hg38" \
# --resolution "1kb"

[ ]:
import glob
ft_ckpt_dir = "./output_cell_specific_emb_train/train/**/*.ckpt"  # Use checkpoints from embed_region.ipynb if available; otherwise, run the code above first

ft_ckpt = glob.glob(ft_ckpt_dir, recursive=True)[0]
ft_ckpt

'./output_cell_specific_emb_train/train/try_00_seed_55/lightning_logs/lightning_logs/version_0/checkpoints/epoch=0-step=37.ckpt'
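
Taking the first glob match raises an IndexError when no checkpoint exists, and picks an arbitrary file when several do. A slightly more defensive sketch is shown below. The helper name `pick_checkpoint` is hypothetical, and choosing the most recently modified file is an assumption about which checkpoint is wanted.

```python
import glob
import os


def pick_checkpoint(pattern):
    """Return the most recently modified file matching a glob pattern."""
    matches = glob.glob(pattern, recursive=True)
    if not matches:
        raise FileNotFoundError(
            f"No checkpoint matches {pattern!r}; run the fine-tuning step first."
        )
    # Assumption: the newest checkpoint is the one to use.
    return max(matches, key=os.path.getmtime)


try:
    ft_ckpt = pick_checkpoint("./output_cell_specific_emb_train/train/**/*.ckpt")
    print(ft_ckpt)
except FileNotFoundError as err:
    # Reached when fine-tuning has not been run yet.
    print(err)
```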
[ ]:
# --region: focus regions
# --regulator: focus regulators
# --odir: output directory
# --genome: genome
# --resolution: resolution
# --ft-ckpt: path to the fine-tuned checkpoint

!chrombert-tools embed_regulator \
--region '../data/CTCF_ENCFF664UGR_sample100.bed' \
--regulator "EZH2;BRD4;CTCF;FOXA3;myod1;myF5" \
--odir "./output_emb_regulator_1kb_load_ft" \
--genome "hg38" \
--resolution "1kb" \
--ft-ckpt {ft_ckpt}


Region summary - total: 100, overlapping with ChromBERT: 100 (one region may overlap multiple ChromBERT regions, we keep overlaps with ≥50% coverage of either the ChromBERT bin or the input region), non-overlapping: 0
Note: All regulator names were converted to lowercase for matching.
Regulator count summary - requested: 6, matched in ChromBERT: 5, not found: 1, not found regulator: ['foxa3']
ChromBERT regulators: /mnt/Storage/home/chenqianqian/.cache/chrombert/data/config/hg38_6k_regulators_list.txt
Using provided fine-tuned checkpoint: ./output_cell_specific_emb_train/train/try_00_seed_55/lightning_logs/lightning_logs/version_0/checkpoints/epoch=0-step=37.ckpt
Load pretrained ckpt /mnt/Storage/home/chenqianqian/.cache/chrombert/data/checkpoint/hg38_6k_1kb_pretrain.ckpt successfully!
Loading checkpoint from ./output_cell_specific_emb_train/train/try_00_seed_55/lightning_logs/lightning_logs/version_0/checkpoints/epoch=0-step=37.ckpt
Loading from pl module, remove prefix 'model.'
Loading from pl module, replace 'pretrain_model' with 'pretrain_model.chrombert'
Loaded 111/111 parameters
Your supervised_file does not contain the 'label' column. Please verify whether ground truth column ('label') is required. If it is not needed, you may disregard this message.
Your supervised_file does not contain the 'label' column. Please verify whether ground truth column ('label') is required. If it is not needed, you may disregard this message.
Computing regulator embeddings: 100%|███████████| 25/25 [00:05<00:00,  4.72it/s]

Finished!
Focus region summary - total: 100, overlapping with ChromBERT: 100, non-overlapping: 0
Overlapping regions BED file: ./output_emb_regulator_1kb_load_ft/overlap_region.bed
Non-overlapping regions BED file: ./output_emb_regulator_1kb_load_ft/no_overlap_region.bed
Mean regulator embeddings saved to: ./output_emb_regulator_1kb_load_ft/mean_regulator_emb.pkl
Region-aware regulator embeddings saved to: ./output_emb_regulator_1kb_load_ft/region_aware_regulator_emb.hdf5
Embedding type: cell-specific
[10]:
# mean_regulator_emb.pkl: one 768-dim vector per regulator (averaged across all regions)
import pickle
with open("./output_emb_regulator_1kb_load_ft/mean_regulator_emb.pkl", "rb") as f:
    mean_regulator_emb_dict2 = pickle.load(f)

for key, value in mean_regulator_emb_dict2.items():
    print(key, value.shape)

myod1 (768,)
brd4 (768,)
ezh2 (768,)
myf5 (768,)
ctcf (768,)
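
To quantify how much fine-tuning shifts each regulator's mean embedding, the general and cell-specific dictionaries can be compared with cosine similarity. This is a minimal sketch on synthetic vectors. The function names are hypothetical, and cosine similarity is one reasonable choice of metric rather than one prescribed by chrombert-tools.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity between two 1-D vectors."""
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def compare_embedding_dicts(general, cell_specific):
    """Cosine similarity for every regulator present in both dictionaries."""
    shared = sorted(set(general) & set(cell_specific))
    return {name: cosine_similarity(general[name], cell_specific[name])
            for name in shared}


# Synthetic stand-ins (assumption: in the notebook these would be
# mean_regulator_emb_dict and mean_regulator_emb_dict2).
rng = np.random.default_rng(0)
general = {"myod1": rng.normal(size=768), "ctcf": rng.normal(size=768)}
cell_specific = {name: vec + 0.1 * rng.normal(size=768)
                 for name, vec in general.items()}

for name, sim in compare_embedding_dicts(general, cell_specific).items():
    print(name, round(sim, 3))
```

Values close to 1 indicate that the fine-tuned embedding points in nearly the same direction as the general one.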
[16]:
mean_regulator_emb_dict["myod1"]
[16]:
array([-2.00714844e+00, -9.10444336e-01, -6.23415527e-01, -2.63640625e+00,
       -3.17535553e-01, -9.51425781e-01,  5.77041626e-02, -3.99008789e-01,
        5.66394043e-02,  1.28871155e+00,  1.35552734e+00,  1.43785156e+00,
       -1.16653320e+00,  8.17010498e-01,  1.14992187e+00,  1.71201660e+00,
        6.81149902e-01, -7.94629364e-01,  6.32702942e-01,  1.00140625e+00,
        5.96638184e-01, -3.69458008e-01,  1.31123657e-01, -7.75520020e-01,
        2.88800049e-01,  1.74105469e+00,  4.14042969e-01,  1.72488281e+00,
       -8.69281006e-01, -1.28226074e+00, -1.62515625e+00, -1.21843750e+00,
       -3.18720093e-01,  1.53009766e+00, -7.61884766e-01,  7.34122696e-01,
       -1.57476562e+00, -1.67339478e-01, -3.62263184e-01, -1.70914063e+00,
        5.93414307e-01, -6.88667908e-01, -1.42664063e+00,  1.46355469e+00,
        7.41562500e-01,  2.43287201e-01,  3.57158813e-01,  6.16872559e-01,
        9.48164062e-01,  7.90859375e-01, -1.81777344e+00, -3.42831421e-01,
        4.01247559e-01,  1.16589447e+00,  4.48511963e-01,  4.83542480e-01,
        5.19480591e-01,  3.51649170e-01,  2.30150146e-01,  6.07413940e-01,
       -1.48645172e-01,  1.27757813e+00,  3.80262756e-02, -1.12940918e+00,
        8.55761719e-01,  8.34048157e-01,  1.22499390e-01, -5.64844971e-01,
        9.40332031e-01, -1.24039063e+00, -1.31359375e+00,  3.57182217e-01,
        3.22676544e-01,  4.57579346e-01,  2.99638062e-01,  3.56275635e-01,
        1.01912842e-01, -6.78305664e-01, -2.08938477e+00,  1.19999695e-01,
       -5.53002930e-01, -3.18108368e-02,  1.31890625e+00, -4.65848999e-01,
       -8.18537598e-01, -7.77949219e-01, -1.67167664e-01, -4.79488525e-01,
        8.26343994e-01,  3.06250000e+00, -1.76580200e-01,  1.12200546e-01,
        5.25288086e-01, -7.57294922e-01, -8.52895508e-01,  1.39113281e+00,
       -8.86814575e-01, -2.56608887e-01, -1.74742187e+00, -2.53963318e-01,
        1.67380981e-01, -4.81821289e-01,  1.73007812e+00, -7.99831543e-01,
        9.74003906e-01,  1.42649109e+00, -1.59408203e+00,  6.37457275e-01,
       -6.03220978e-01,  1.17590576e+00, -1.68345703e+00, -1.90210937e+00,
        1.14430557e+00, -2.34554688e+00,  1.49607422e+00,  2.32515625e+00,
       -2.80343750e+00, -3.78395081e-01,  2.09871094e+00, -1.02825684e+00,
        2.31868286e-01, -1.67916260e+00,  4.98359375e-01,  1.71970703e+00,
        2.83859375e+00,  2.18250000e+00,  3.15953125e+00, -6.22880859e-01,
       -1.20151245e+00, -2.51757812e+00,  8.02050781e-01,  1.09616089e+00,
       -9.17924805e-01, -5.50449219e-01,  5.32348022e-01, -9.78625488e-01,
        1.86820312e+00,  2.77996094e+00, -7.29244766e-01, -1.18949219e+00,
       -2.41820312e+00,  3.00796875e+00, -1.71554688e+00,  9.73901367e-01,
        2.25383301e-01,  1.43796875e+00,  2.60085938e+00,  6.82389336e-01,
       -5.70057373e-01, -8.04321289e-02,  3.89861679e-01, -5.85273438e-01,
        8.96734619e-02,  6.88131104e-01, -2.24858398e-01, -3.21165771e-01,
        4.09040527e-01,  1.76969727e+00, -2.60419312e-01,  7.62579346e-01,
        3.58031250e+00,  4.02404785e-01, -1.15059204e-01, -5.42120361e-01,
        2.08531250e+00, -4.66711044e-01,  2.94753906e+00, -8.52666016e-01,
       -1.44047165e-02,  1.07292114e+00,  2.15652344e+00,  3.11909180e-01,
       -7.39666748e-01,  2.52505341e-01, -8.76993713e-01,  1.38707031e+00,
        1.06022141e+00, -1.68515625e+00, -1.14213409e-01, -2.82320313e+00,
        9.75546875e-01, -2.05156250e+00, -4.86699219e-01, -1.02558670e+00,
        6.16059875e-02,  1.10526001e+00,  8.17293396e-01, -7.02229614e-01,
       -1.66427734e+00, -3.01396332e-01,  8.89615784e-01, -2.52232666e-01,
        9.11213379e-01,  1.16242187e+00,  1.63697754e+00, -9.05286255e-01,
       -1.28825195e+00,  2.09849609e+00, -2.65554688e+00, -1.63730469e+00,
        8.36598206e-02,  5.06923828e-01, -4.06788940e-01,  1.42166016e+00,
        3.10000000e+00, -1.27703369e+00,  1.02823120e+00, -4.00808525e-01,
        4.24416504e-01, -1.44386154e+00, -6.79638672e-02, -1.37006016e-01,
        9.15174561e-01,  1.12489258e+00,  8.25676270e-01,  5.90401611e-01,
       -1.09051758e+00,  2.22084351e-01,  1.49259766e+00,  7.26042480e-01,
        3.15006332e-01,  7.01137695e-01, -1.11075134e+00,  1.93221375e+00,
        8.29833984e-02, -1.23856445e+00,  2.03783203e+00, -2.07185059e+00,
       -8.09875488e-02, -1.55132812e+00, -2.49175720e-01,  2.99873047e-01,
        1.07947571e+00, -6.68829346e-02, -1.90917969e+00, -1.89555664e+00,
        3.14598389e-01, -1.92484375e+00, -9.66872559e-01, -3.69750000e+00,
        1.94148437e+00,  1.38505859e+00, -2.55273438e+00, -1.61664063e+00,
        1.61192871e+00,  1.94166992e+00,  5.70705566e-01,  1.60687500e+00,
        8.56528320e-01,  1.11196777e+00, -1.45189819e-01, -3.49975586e-02,
        2.20729492e+00,  1.24660156e+00, -1.62078125e+00,  1.04156799e-01,
        2.51078125e+00, -1.31241600e+00, -4.04311523e-01,  1.30928711e+00,
        7.26962891e-01, -2.71437500e+00, -1.53775635e-01,  1.36487305e+00,
       -2.32367188e+00,  7.56818085e-01, -3.68907471e-01,  4.32073975e-01,
       -1.15209473e+00,  6.26113892e-01,  1.44355469e+00,  9.67501068e-02,
        2.65195312e+00,  6.59545898e-01,  9.54002686e-01, -2.82989502e-02,
        4.61075439e-01, -2.22339844e+00, -5.37540703e-01, -1.38912109e+00,
        7.58520508e-01,  2.86179688e+00,  6.15019531e-01, -4.51422119e-01,
       -2.41515625e+00,  6.43550644e-01, -1.64492188e+00, -8.97047119e-01,
       -2.24259766e+00, -1.43366241e-01,  1.44536438e+00,  9.55892334e-01,
        1.03867676e+00, -5.39111328e-02, -1.32259766e+00,  1.46113281e+00,
        3.80209351e-01,  8.66933594e-01,  5.02770386e-01, -1.96220093e-01,
       -1.18707520e+00,  3.61944885e-01, -1.87238464e-01, -1.10729492e+00,
       -2.52944412e-01,  1.12331055e+00,  3.17975731e-01,  2.62104034e-01,
       -1.13178391e+00,  5.16813965e-01, -1.54383148e+00, -1.32402344e-01,
       -2.69643631e-01, -1.06546349e+00,  7.02181396e-01, -1.04381348e+00,
       -1.84277344e+00,  2.27025146e-01, -2.79241943e-02,  1.08966003e+00,
       -3.14109375e+00,  5.37347412e-01,  1.06842285e+00, -1.76900253e-01,
        9.40092773e-01, -6.17617188e-01,  1.28341125e+00,  1.54602051e-02,
        1.17869141e+00,  4.74983215e-01,  3.97453613e-01,  3.88712158e-01,
       -2.48354034e-01,  8.38311768e-01, -3.23989258e-01, -9.24111328e-01,
        5.77162323e-01, -7.40173340e-01,  3.48197266e+00, -1.27662109e+00,
       -8.62502441e-01,  2.12859375e+00,  8.13399048e-01,  9.09035645e-01,
       -1.03413574e+00, -1.45170288e-01, -1.30475342e+00, -3.85631409e-01,
       -1.89346512e+00,  1.26906128e-01, -1.22648438e+00, -1.91066406e+00,
        1.19930664e+00, -7.63925781e-01,  3.07232056e-01,  1.24731445e-01,
        1.64064453e+00,  1.58062500e+00, -2.18607422e+00, -1.04061890e-01,
        1.67074219e+00,  6.25382080e-01, -9.23645020e-02, -5.46776123e-01,
        8.78173828e-01, -1.08832336e+00,  2.71274414e-01,  1.21378906e+00,
        3.14804688e-01,  1.03656891e+00,  1.42488281e+00,  6.88918457e-01,
        2.05190796e+00, -1.68602051e+00,  4.65122375e-01, -1.10646667e+00,
       -3.27265625e+00, -2.94537964e-01,  1.59527710e+00, -1.08423584e+00,
        3.13439178e-01,  1.56929688e+00, -1.91004639e-01, -1.25720703e+00,
        5.90668945e-01, -1.40332031e+00, -1.65127930e+00, -1.81992187e+00,
        2.30042969e+00,  9.40109863e-01, -5.82900238e-02,  8.59295044e-01,
       -2.55390625e+00, -1.65048828e-01,  4.53398437e-01, -1.85109375e+00,
        1.78828613e+00,  2.04976563e+00,  5.32927246e-01,  2.35404358e-01,
       -2.09846558e+00, -1.60828125e+00, -2.58046265e-01, -1.69642639e-01,
       -3.09369049e-01, -2.37062988e-01,  1.11488647e-01,  3.17606201e-01,
       -3.93266602e-01,  7.22810669e-01, -2.48407227e+00, -2.73335938e+00,
       -1.41004578e+00, -1.17780762e+00, -3.30265625e+00, -1.31652344e+00,
        1.56078125e+00, -8.19470215e-01, -1.29839355e+00, -1.70410156e+00,
       -9.53671875e-01,  2.12124023e-01, -7.44091797e-02,  1.55894531e+00,
        1.56722656e+00,  1.03309937e+00, -5.39793396e-01,  2.52023437e+00,
       -1.08765167e+00, -1.44437820e+00, -1.72872070e+00, -2.71109375e+00,
        2.37414062e+00, -1.55527344e+00, -1.69504852e-01, -1.22302704e-01,
        1.59392578e+00, -1.37915955e-01,  5.51521606e-01,  2.38128906e+00,
       -1.04698486e+00,  5.90571289e-01,  7.32524414e-01,  1.62797363e+00,
       -2.79783516e-01,  4.61818085e-01, -5.66184082e-01, -1.53886719e+00,
        6.69418335e-02,  3.93421021e-01,  8.48974609e-01, -1.74447266e+00,
       -1.95621094e+00,  1.62380859e+00,  9.28939209e-01, -2.05853271e-01,
        1.92864258e+00, -5.72733154e-01, -4.15306702e-01,  8.33985901e-02,
       -2.90367188e+00,  3.98443909e-01,  7.53767090e-01, -9.97451172e-01,
        3.71421509e-01, -2.60562500e+00,  3.98156738e-02, -2.37886719e+00,
       -1.05211914e+00,  1.09208496e+00,  4.79055176e-01, -1.61441406e+00,
       -6.21594238e-01, -6.17456055e-02,  1.88505859e+00,  3.96650085e-01,
       -2.34446869e-01,  1.00132568e+00, -7.45126953e-01,  2.94888306e-01,
       -1.73946411e+00, -1.19603516e+00, -1.73986206e-01,  1.37458984e+00,
       -7.07890472e-01, -9.92910156e-01, -4.60467300e-01, -6.44472656e-01,
        5.18164063e-01, -2.11035156e+00,  2.71789062e+00, -1.11872070e+00,
       -9.84179687e-01, -1.46265625e+00, -2.91497803e-02,  1.82758789e+00,
       -9.55273437e-02, -8.74555664e-01, -3.44703125e+00,  2.21149101e-01,
        1.50414551e+00, -2.14632813e+00,  2.53937500e+00,  1.34180664e+00,
        7.03002930e-02, -1.54203125e+00,  2.28167969e+00, -1.79726563e+00,
        1.85562500e+00,  3.61750000e+00, -4.18096924e-02,  2.90125000e+00,
        7.40133057e-01, -2.25750000e+00,  2.00187500e+00,  3.91113892e-01,
        2.09167969e+00, -2.01835938e+00,  6.34714050e-01, -5.72022705e-01,
        2.29160156e+00, -2.23110352e+00, -1.23963703e+00,  2.31101563e+00,
        1.24168701e-01,  3.15006485e-01, -2.38428955e-01, -1.80230331e-01,
        1.09986328e+00,  2.59796143e-02, -1.30803711e+00,  8.52315063e-01,
       -7.45949707e-01,  2.14289063e+00,  2.07546143e+00, -3.75312500e+00,
       -1.15952148e+00,  3.02164062e+00, -8.46391602e-01,  1.51843750e+00,
       -1.65148437e+00, -6.94909668e-02, -8.29309082e-02, -2.33101563e+00,
       -1.11158203e+00, -3.23921875e+00,  1.48829102e+00, -8.73114014e-01,
        2.93592529e-01, -6.24111328e-01,  9.81089478e-01,  2.15967468e+00,
        5.30773010e-01, -4.06218750e+00,  1.65850464e+00, -1.92410156e+00,
       -1.74106445e-01, -1.30167969e+00, -4.26748657e-01,  1.12265747e+00,
       -9.17494507e-01, -2.14519043e-01, -2.99458008e-01, -7.58866882e-02,
        2.77765274e-01, -2.75195312e+00,  1.33478592e+00,  5.96560669e-02,
       -3.19109116e-01,  1.62875000e+00, -2.54686432e-01,  6.65812378e-01,
       -8.24255371e-01,  4.21010284e-01, -4.08640625e+00,  2.22792969e+00,
       -2.34046875e+00, -1.61898437e+00, -5.07250000e+00, -1.21714844e+00,
       -7.55514374e-01, -1.46035156e-01, -6.31752930e-01, -3.96731567e-02,
       -9.95742187e-01, -4.45074463e-01, -2.14156250e+00, -2.86757813e+00,
        6.44411621e-01,  6.27407532e-01, -1.10255859e+00,  7.22151794e-01,
       -3.16686401e-01, -1.20341797e+00,  8.39140625e-01, -1.60287109e+00,
        1.72836914e+00,  1.72307281e+00,  1.23091064e+00,  1.49648438e+00,
        2.75891876e-01, -5.71337090e-01, -1.10046387e-01,  2.60692139e-01,
        1.96308594e+00,  4.85589600e-02,  7.72878418e-01, -2.47648438e+00,
        4.38879395e-02, -1.69274414e+00, -7.99071045e-01,  1.33708237e+00,
       -2.08765625e+00, -2.28496094e+00,  1.74921875e+00,  5.99792480e-02,
       -1.07402100e+00, -4.45556641e-03,  1.13477051e+00,  5.39805527e-01,
       -1.33244141e+00,  4.01300964e-01, -1.42958984e+00,  1.65416016e+00,
       -8.99942017e-02,  1.46550781e+00,  8.05783997e-01, -6.70324707e-01,
       -5.98672485e-02,  2.33123322e-01,  5.78369141e-01, -2.34968750e+00,
        1.46879089e+00, -1.92078613e+00,  6.29534912e-02,  8.70329590e-01,
       -3.20578125e+00,  2.45632813e+00, -1.21735168e+00, -1.02977051e+00,
       -1.39185547e+00, -3.47032471e-01,  1.00394531e+00,  1.67284180e+00,
       -4.90197449e-01,  1.07365295e+00, -1.72683594e+00, -5.89355469e-03,
        1.06814209e+00,  2.84515076e-01, -2.72585938e+00,  9.92958984e-01,
        4.10335693e-01, -5.37959099e-03,  1.03429443e+00, -1.78866211e+00,
       -3.18533325e-01, -6.00592651e-01, -2.03453674e-01, -1.51890057e+00,
       -5.40886230e-01, -3.48468750e+00, -7.87766113e-01,  1.28163757e+00,
        6.71931152e-01,  7.48067627e-01,  4.11619873e-01,  3.88984375e-01,
        8.74173393e-01,  1.03061401e+00,  5.47947388e-01,  1.66234375e+00,
       -1.65905762e+00, -8.26474609e-01, -4.06781311e-01,  7.48229980e-03,
        1.51480469e+00, -2.45607300e-01,  2.23062500e+00,  2.55617187e+00,
        7.02377930e-01, -1.26870193e-01,  1.14571655e+00, -5.04790039e-01,
        1.07267578e+00,  6.64550781e-02, -8.70028687e-01, -3.53742676e-01,
       -2.27591797e+00, -5.40488892e-01,  6.56863403e-01, -8.89880371e-01,
       -1.56632813e+00,  4.66386108e-01,  1.13094238e+00, -9.43209229e-01,
       -2.33849792e-01, -1.79449219e+00,  2.98351562e+00, -7.42726345e-01,
       -3.35262451e-01, -2.15810776e-01,  8.62402344e-01, -1.56720703e+00,
       -1.88660156e+00,  1.03242188e+00, -1.97515259e-01, -1.22083374e+00,
       -1.00701172e+00, -5.38162231e-01,  3.75019531e-01, -1.97047119e-01,
        1.55338867e+00, -3.21426792e-01, -1.12964844e+00, -5.61574707e-01,
       -3.22835938e+00,  1.15681763e-01,  1.01386230e+00,  2.83273437e+00,
       -6.88135986e-01, -8.05063477e-01, -1.21151855e+00, -1.84250565e+00,
        1.67531250e+00,  8.78274231e-01, -8.75699463e-01, -5.67340698e-01,
       -4.25079956e-01,  5.77982788e-01,  2.54203125e+00,  7.64074707e-01,
       -5.38920898e-01,  2.76625000e+00,  5.72746582e-01, -5.69689941e-01,
        1.15227661e-01, -4.83818359e-01,  5.14741821e-01,  3.97366028e-01,
       -2.04390625e+00, -1.47134766e+00, -8.83032227e-02,  1.26983643e-01,
       -1.71083008e+00,  6.48510284e-01,  1.96898438e+00, -6.74235229e-01,
       -2.94992187e+00,  1.96265625e+00,  1.40054688e+00,  1.19304581e-01,
        2.12167969e-01,  8.47376709e-01,  8.55093384e-02,  1.76919922e+00,
       -1.11777344e+00, -1.62474945e+00,  1.61957031e+00, -2.45205688e-01,
       -9.52685547e-01,  3.29078125e+00, -1.27666992e+00,  1.26856445e+00,
       -4.59438782e-01, -7.30628662e-01, -4.09105225e-01,  2.01921875e+00])
[15]:
mean_regulator_emb_dict2["myod1"]
[15]:
array([-2.05085937e+00, -6.61312561e-01, -7.87341309e-01, -2.56292969e+00,
       -2.15845642e-01, -7.26225586e-01,  1.12674332e-01, -2.44242554e-01,
        7.88000488e-02,  1.09402100e+00,  1.38145508e+00,  1.60011719e+00,
       -1.21123047e+00,  7.91054687e-01,  9.68378906e-01,  1.66437500e+00,
        4.65019531e-01, -7.14401245e-01,  5.66979179e-01,  1.17439453e+00,
        3.93154907e-01, -4.25457764e-01,  3.63861084e-02, -6.71879883e-01,
        1.86912842e-01,  2.00511719e+00,  2.07645264e-01,  1.96289062e+00,
       -1.07225830e+00, -1.07841812e+00, -1.36597656e+00, -1.37207031e+00,
       -2.77866821e-01,  1.77527344e+00, -4.60074158e-01,  6.37563477e-01,
       -1.61382812e+00, -3.39229126e-01, -2.87722168e-01, -1.97074219e+00,
        3.02783203e-01, -7.95349121e-01, -1.53894531e+00,  1.48253906e+00,
        5.90205078e-01,  2.16322327e-01,  3.10146484e-01,  1.08128906e+00,
        1.08240234e+00,  8.47650146e-01, -1.65046875e+00, -3.66970520e-01,
        4.82966309e-01,  1.47290527e+00,  1.50930786e-01,  3.73768921e-01,
        6.91992188e-01,  1.66020660e-01,  2.41939697e-01,  8.01503906e-01,
       -2.36784363e-01,  1.14530762e+00, -3.62182617e-01, -1.30558838e+00,
        6.21777344e-01,  1.11135864e+00,  1.43099365e-01, -2.76130829e-01,
        9.65839844e-01, -1.45187500e+00, -1.30001953e+00, -1.53430176e-02,
        6.25672607e-01,  5.79501953e-01,  4.57066345e-02,  2.71621094e-01,
       -5.00421143e-02, -9.60605469e-01, -1.66885742e+00,  4.77532806e-01,
       -7.02851563e-01,  1.00753555e-01,  1.17841797e+00, -2.58822403e-01,
       -7.97883301e-01, -8.85917969e-01, -3.28305664e-01, -5.74985352e-01,
        7.21161652e-01,  2.94046875e+00, -2.08743286e-01, -8.52685547e-02,
        8.03833008e-01, -6.49528198e-01, -7.31225586e-01,  1.25484375e+00,
       -6.74487305e-01, -5.73089905e-01, -2.06304687e+00, -2.99032402e-01,
        4.44533386e-01, -1.38718750e+00,  1.61937500e+00, -1.09908325e+00,
        8.24958496e-01,  1.44073242e+00, -1.28368164e+00,  2.86739120e-01,
       -3.83755493e-01,  1.05764038e+00, -2.04144531e+00, -1.86128906e+00,
        1.01802155e+00, -2.22031250e+00,  1.81101563e+00,  2.12585938e+00,
       -2.66492188e+00, -2.99461670e-01,  2.11394531e+00, -1.07218262e+00,
        2.54502258e-01, -1.56318756e+00,  5.28828125e-01,  1.99453125e+00,
        2.81132813e+00,  2.10335938e+00,  3.32218750e+00, -3.86325989e-01,
       -1.14366776e+00, -2.59671875e+00,  9.89498749e-01,  1.17443604e+00,
       -1.16202148e+00, -1.68182373e-01,  6.23451233e-01, -9.74484863e-01,
        1.66332031e+00,  2.63351562e+00, -5.98324890e-01, -1.23744141e+00,
       -2.15117188e+00,  3.06500000e+00, -1.59847656e+00,  6.77058105e-01,
        7.18682861e-02,  1.37535156e+00,  2.58226562e+00,  4.58153381e-01,
       -1.02019531e+00, -3.15383148e-01,  7.27435760e-01, -7.34436035e-01,
        2.35648193e-01,  5.15531616e-01, -6.30664062e-01, -3.47162476e-01,
        3.87912598e-01,  1.56608154e+00, -2.73369980e-01,  6.77634277e-01,
        3.25500000e+00,  3.06108398e-01, -1.95331421e-01, -2.06550293e-01,
        2.18960937e+00, -2.28028564e-01,  2.45427734e+00, -8.10014648e-01,
        2.13690186e-02,  8.24217911e-01,  2.36039062e+00,  3.13591309e-01,
       -4.90407715e-01,  1.74159317e-01, -6.72309189e-01,  1.59175781e+00,
        8.95465088e-01, -2.05648438e+00, -3.33265991e-01, -3.10640625e+00,
        1.12879883e+00, -1.89136719e+00, -6.47541504e-01, -9.38204346e-01,
       -1.84201050e-01,  1.13972656e+00,  1.06573975e+00, -5.92927246e-01,
       -1.87955078e+00,  2.33776855e-02,  6.68923950e-01, -7.72338867e-02,
        8.71679688e-01,  1.09829041e+00,  1.69481445e+00, -9.75502930e-01,
       -9.96142578e-01,  2.24031250e+00, -2.93656250e+00, -1.82171875e+00,
       -1.02130127e-01,  7.03747864e-01, -3.49942627e-01,  1.52703125e+00,
        3.23796875e+00, -1.48205811e+00,  1.31237793e+00, -2.49542847e-01,
        3.10366211e-01, -1.32963867e+00, -1.77262726e-01,  1.46704102e-01,
        9.35097656e-01,  1.49273437e+00,  6.53493652e-01,  5.09588623e-01,
       -9.75756836e-01,  3.27554321e-01,  1.48318848e+00,  9.10748291e-01,
        4.48646240e-01,  5.97904053e-01, -1.06177979e+00,  1.60405457e+00,
        3.33223495e-01, -1.51093750e+00,  2.37144531e+00, -1.86169434e+00,
        2.69682617e-01, -1.52690430e+00,  4.07789612e-02, -3.21929932e-02,
        1.20485840e+00,  1.26513062e-01, -1.66972656e+00, -1.95239258e+00,
        1.90753174e-01, -2.06554687e+00, -9.30069580e-01, -3.46054687e+00,
        2.03710938e+00,  1.37115234e+00, -2.33308594e+00, -1.44390625e+00,
        1.86792969e+00,  1.90544922e+00,  8.05932617e-01,  1.57294922e+00,
        7.77293091e-01,  1.34851563e+00, -3.25275879e-01, -1.38296814e-01,
        2.06625153e+00,  1.50800781e+00, -1.57250000e+00,  2.38860397e-01,
        2.43960937e+00, -1.15665527e+00, -4.36538086e-01,  1.35863281e+00,
        5.15566406e-01, -2.52195313e+00,  1.86914063e-01,  1.13677429e+00,
       -2.49640625e+00,  1.17843872e+00, -4.06827393e-01,  3.58597260e-01,
       -1.22629028e+00,  3.63383789e-01,  1.43833008e+00,  5.11184692e-03,
        2.82472656e+00,  9.57084961e-01,  7.33183594e-01,  3.24639893e-02,
        7.02733154e-01, -2.26441406e+00, -3.24601288e-01, -1.40468750e+00,
        9.11906738e-01,  2.73156250e+00,  4.03271828e-01, -5.98726196e-01,
       -2.15752686e+00,  6.58957520e-01, -1.60179688e+00, -1.21585205e+00,
       -2.57933594e+00, -4.40823364e-02,  1.22223877e+00,  9.17036133e-01,
        1.16793945e+00, -2.24260483e-01, -1.52406250e+00,  1.59421875e+00,
        7.86750488e-01,  1.06314453e+00,  5.11153145e-01, -5.13925171e-01,
       -9.40408325e-01,  3.92521133e-01, -3.15574951e-01, -9.83759766e-01,
       -3.65918694e-01,  9.39597168e-01,  2.55842514e-01,  3.66777344e-01,
       -6.54832153e-01,  6.74285889e-01, -1.68447266e+00, -1.20932236e-01,
       -3.47430611e-01, -1.35063232e+00,  7.77221680e-01, -1.18054688e+00,
       -1.82523437e+00,  5.87658081e-01,  2.51959991e-01,  1.71699844e+00,
       -3.30031250e+00,  5.91124878e-01,  1.13507812e+00, -4.84065552e-01,
        9.19794922e-01, -8.73740234e-01,  1.34066406e+00,  2.00467529e-01,
        1.27147461e+00,  2.91445923e-01,  1.89435730e-01,  4.07329102e-01,
       -6.44738770e-02,  1.59808594e+00, -3.82492065e-01, -8.59013672e-01,
        8.08148193e-01, -7.34416504e-01,  3.26426758e+00, -1.31271484e+00,
       -7.76286621e-01,  2.37046875e+00,  6.08492737e-01,  9.95504150e-01,
       -1.27484375e+00, -2.71942139e-01, -1.37347656e+00, -2.55531921e-01,
       -2.01771973e+00,  1.09028015e-01, -9.46821289e-01, -2.40593750e+00,
        1.24376953e+00, -7.13316650e-01,  4.09053497e-01,  3.85610046e-01,
        1.85941406e+00,  1.43863281e+00, -2.14732910e+00,  1.09719849e-01,
        1.71527344e+00,  7.95456543e-01, -3.11889648e-02, -5.01947784e-01,
        9.24254150e-01, -7.68483887e-01,  4.84973755e-01,  1.34972656e+00,
        7.50589905e-01,  6.99008789e-01,  1.31558594e+00,  7.56796875e-01,
        1.84374023e+00, -1.60166992e+00,  4.91282349e-01, -8.03011017e-01,
       -2.87156250e+00, -4.51069946e-01,  1.40758301e+00, -9.09594727e-01,
        3.97942505e-01,  1.48281250e+00, -4.84467773e-01, -1.45325195e+00,
        8.37739258e-01, -1.50078125e+00, -1.58973633e+00, -2.05968750e+00,
        2.46183594e+00,  3.23842773e-01,  1.79915314e-01,  6.99672852e-01,
       -2.81109375e+00, -6.89929199e-02,  6.73004761e-01, -2.20617187e+00,
        2.03369629e+00,  2.00960938e+00,  3.26506348e-01,  4.28153076e-01,
       -2.14474609e+00, -1.33585938e+00, -3.90772705e-01, -2.59727783e-01,
       -3.26272793e-01, -1.43914032e-01,  1.00330620e-01,  4.20254517e-01,
       -6.12248535e-01,  6.80528564e-01, -2.60113281e+00, -2.77335938e+00,
       -1.58927734e+00, -1.04891357e+00, -3.57781250e+00, -1.58781250e+00,
        1.19042725e+00, -9.14228516e-01, -1.19572845e+00, -1.92682617e+00,
       -1.19278168e+00, -3.02227783e-02, -5.05890503e-01,  1.42324219e+00,
        1.72875000e+00,  1.14106720e+00, -9.71446228e-02,  2.42445312e+00,
       -1.11151978e+00, -1.79885254e+00, -1.51688141e+00, -2.38929687e+00,
        2.39382813e+00, -1.44837891e+00,  7.61900330e-02,  1.63415527e-01,
        1.62427734e+00, -4.57282715e-01,  5.76583252e-01,  2.47652344e+00,
       -1.23423828e+00,  6.82714844e-02,  8.13623047e-01,  1.03857666e+00,
       -5.55716553e-01,  6.98461914e-01, -6.86641541e-01, -1.55226562e+00,
        3.81727295e-01,  4.92665710e-01,  8.46636353e-01, -1.73730469e+00,
       -1.98187500e+00,  1.87070312e+00,  1.00655518e+00, -2.79589844e-02,
        2.03894531e+00, -8.03793335e-01, -6.59117432e-01,  1.26654663e-01,
       -3.21656250e+00,  2.81565552e-01,  6.47315369e-01, -1.05736328e+00,
        5.16902161e-01, -2.61890625e+00,  1.96346855e-01, -2.25382813e+00,
       -1.12675781e+00,  1.26701172e+00,  5.31846924e-01, -1.81152344e+00,
       -2.63232422e-01, -1.40225220e-01,  1.98787109e+00,  4.93017273e-01,
        1.45902252e-02,  8.34350586e-01, -8.90975342e-01,  3.98046875e-01,
       -1.93466797e+00, -1.24080078e+00, -2.23403397e-01,  1.67918945e+00,
       -6.06226273e-01, -9.04462891e-01, -4.43297119e-01, -7.15157166e-01,
        4.14121094e-01, -2.48898437e+00,  2.74937500e+00, -1.37769287e+00,
       -1.27783203e+00, -1.18955566e+00,  9.16809082e-03,  2.16457031e+00,
        2.10989990e-01, -8.36191406e-01, -3.65906250e+00,  1.67991943e-01,
        1.50858765e+00, -1.84367188e+00,  2.52273438e+00,  1.47150391e+00,
        2.23183060e-01, -1.60636719e+00,  2.48023438e+00, -2.21757813e+00,
        1.83027344e+00,  3.75562500e+00, -2.36709900e-01,  2.87718750e+00,
        4.23222656e-01, -2.30460938e+00,  2.26720703e+00,  6.10048218e-01,
        2.28482422e+00, -1.97027100e+00,  6.58295898e-01, -6.55173340e-01,
        2.18015625e+00, -2.24639648e+00, -1.22239334e+00,  2.60804687e+00,
        4.47711792e-01,  5.20633240e-01, -1.00163574e-01, -4.96318054e-02,
        1.00755615e+00, -1.99332275e-01, -1.34664063e+00,  9.84943848e-01,
       -7.16840820e-01,  2.18738281e+00,  2.14732422e+00, -3.79093750e+00,
       -1.16030273e+00,  2.73613281e+00, -7.41704102e-01,  1.58890625e+00,
       -1.69183594e+00,  5.42187500e-02, -1.46743164e-01, -2.64320312e+00,
       -3.16657104e-01, -3.47203125e+00,  1.60509766e+00, -9.56304092e-01,
       -1.43622665e-01, -6.41766052e-01,  7.49204102e-01,  2.25924316e+00,
        5.50970459e-01, -4.37515625e+00,  1.66417969e+00, -2.11218750e+00,
        7.95092773e-02, -1.33597656e+00, -4.94359741e-01,  1.23941406e+00,
       -8.49823227e-01, -3.59886475e-01, -7.94155884e-01, -4.15328979e-02,
        1.88248596e-01, -2.52718750e+00,  1.23927979e+00, -1.65345650e-01,
       -2.34155273e-01,  1.88484375e+00, -2.29455566e-01,  9.02575684e-01,
       -5.44145508e-01,  2.66872025e-01, -4.07406250e+00,  2.21960938e+00,
       -2.39468750e+00, -1.77634766e+00, -5.36812500e+00, -1.33270508e+00,
       -5.33262939e-01, -1.27152405e-01, -8.78852539e-01, -2.40968018e-01,
       -1.02166992e+00, -2.30050049e-01, -1.88812500e+00, -2.78515625e+00,
        9.21898956e-01,  8.49580841e-01, -1.04361328e+00,  7.45051270e-01,
       -3.41076660e-01, -1.31458984e+00,  6.86579590e-01, -1.43131348e+00,
        1.49782104e+00,  1.78210938e+00,  1.04004150e+00,  1.39100586e+00,
        3.47036972e-01, -6.58861084e-01, -1.76011963e-01,  1.12172546e-01,
        1.77157379e+00, -1.18941040e-01,  9.11992187e-01, -2.56500000e+00,
        1.31149597e-01, -1.47107788e+00, -8.73352966e-01,  1.23367065e+00,
       -1.99187500e+00, -2.39062500e+00,  1.89062500e+00, -3.03448486e-01,
       -7.17751465e-01,  3.30694580e-01,  1.04075195e+00,  6.92363281e-01,
       -1.58154297e+00,  1.89809113e-01, -1.38005859e+00,  1.43126953e+00,
       -3.91256104e-01,  1.95394531e+00,  1.05529785e+00, -6.33846436e-01,
       -2.10534058e-01,  4.06136093e-01,  4.54258423e-01, -2.52031250e+00,
        1.28060913e+00, -1.87144043e+00,  1.87130890e-01,  8.53940430e-01,
       -2.98671875e+00,  2.34712891e+00, -1.15458008e+00, -1.26313965e+00,
       -1.71025391e+00, -2.86844482e-01,  1.28646484e+00,  1.37339844e+00,
       -6.32166443e-01,  8.02758789e-01, -2.04296875e+00,  3.59297791e-01,
        1.09906250e+00,  4.52526855e-01, -2.52382813e+00,  9.27130737e-01,
        4.40196533e-01, -9.40657806e-02,  1.01741455e+00, -1.75638672e+00,
       -2.96520996e-01, -7.85144653e-01, -3.73627930e-01, -1.49948730e+00,
       -3.21943359e-01, -3.68953125e+00, -9.98813477e-01,  1.09686035e+00,
        8.18315735e-01,  8.88117065e-01,  5.90143051e-01,  9.76889038e-02,
        1.05575195e+00,  1.03966106e+00,  3.70018311e-01,  1.69648438e+00,
       -1.31366539e+00, -1.16993652e+00, -1.99609375e-01,  2.16119385e-01,
        1.80812744e+00, -4.76788483e-01,  2.57046875e+00,  2.75093750e+00,
        4.90071640e-01, -2.27823486e-01,  1.44493164e+00, -3.76059570e-01,
        1.27132812e+00,  3.44092560e-01, -4.56129761e-01, -5.52122192e-01,
       -2.61710938e+00, -7.49263458e-01,  6.00527344e-01, -6.32537842e-01,
       -1.90523438e+00,  1.84763336e-01,  1.00579102e+00, -1.15336823e+00,
        1.07827759e-02, -1.86058594e+00,  3.24195313e+00, -4.30988770e-01,
       -1.39783936e-01, -1.92712860e-01,  1.18355469e+00, -1.21367187e+00,
       -1.89398438e+00,  9.36657715e-01, -6.58792877e-02, -1.06725586e+00,
       -9.82617188e-01, -2.01152039e-01,  6.77980957e-01, -4.15167236e-01,
        1.20622742e+00, -3.79840088e-02, -1.33363281e+00, -8.38652344e-01,
       -3.02960938e+00, -4.31264305e-02,  7.54372559e-01,  2.76335937e+00,
       -9.35374756e-01, -1.21389282e+00, -1.33839844e+00, -1.73993164e+00,
        2.02773438e+00,  1.12481934e+00, -1.08450928e+00, -7.14863281e-01,
       -5.32045898e-01,  5.31797485e-01,  2.30437500e+00,  5.29749756e-01,
       -2.84665833e-01,  2.63390625e+00,  8.94520264e-01, -6.54831543e-01,
        1.40474854e-01, -5.95154648e-01,  4.60836182e-01,  6.76105347e-01,
       -2.07484375e+00, -1.47687500e+00, -4.85385590e-01,  2.89507046e-01,
       -1.48268707e+00,  8.25686035e-01,  2.13335937e+00, -6.78273926e-01,
       -2.93984375e+00,  1.72941406e+00,  1.55945313e+00,  1.51052780e-01,
        1.39624634e-01,  4.86284180e-01,  1.51637268e-01,  1.61369141e+00,
       -1.18324219e+00, -1.63455566e+00,  1.37053223e+00,  1.93296719e-01,
       -1.22253540e+00,  3.41437500e+00, -1.20662109e+00,  1.30581055e+00,
       -6.89124756e-01, -5.24299316e-01, -8.14514160e-02,  2.30187500e+00])
[7]:
# HDF5 file with one dataset per matched regulator under the 'emb' group;
# each dataset holds that regulator's region-aware embeddings, shape (N_regions, 768)
import h5py
with h5py.File("./output_emb_regulator_1kb_load_ft/region_aware_regulator_emb.hdf5", "r") as f:
    print(f.keys())
    print(f['emb'].keys())
    for i in f['emb'].keys():
        print(i, f['emb'][i].shape)
<KeysViewHDF5 ['emb', 'region']>
<KeysViewHDF5 ['brd4', 'ctcf', 'ezh2', 'myf5', 'myod1']>
brd4 (100, 768)
ctcf (100, 768)
ezh2 (100, 768)
myf5 (100, 768)
myod1 (100, 768)
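
As a possible next step, the per-regulator matrices can be averaged over regions and compared pairwise. The sketch below is only an illustration, not part of chrombert-tools: the helper names (`mean_embedding`, `cosine_similarity`, `pairwise_similarity`) are hypothetical, and it assumes the HDF5 layout printed above (an `emb` group with one `(N_regions, 768)` dataset per regulator). The file is read only if it exists, so the helpers can also be tried on toy data.

```python
import math
import os


def mean_embedding(rows):
    """Column-wise mean of an (N_regions, dim) matrix given as nested lists."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarity(emb_by_regulator):
    """Cosine similarity of region-averaged embeddings for every regulator pair."""
    means = {name: mean_embedding(rows) for name, rows in emb_by_regulator.items()}
    names = sorted(means)
    return {
        (a, b): cosine_similarity(means[a], means[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Path taken from the cell above; adjust it to your own --odir.
path = "./output_emb_regulator_1kb_load_ft/region_aware_regulator_emb.hdf5"
if os.path.exists(path):
    import h5py  # imported lazily so the helpers work without the file

    with h5py.File(path, "r") as f:
        # Assumed layout: f["emb"][<regulator>] is an (N_regions, 768) dataset.
        emb = {name: f["emb"][name][:].tolist() for name in f["emb"].keys()}
    for (a, b), sim in pairwise_similarity(emb).items():
        print(f"{a} vs {b}: {sim:.3f}")
```

Averaging over regions discards region-level variation, so this gives only a coarse summary; comparing regulators region by region may be more appropriate for some analyses.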

Generate cell-type-specific embeddings

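Cell-type-specific mode takes two chromatin accessibility files for the cell type of interest as input: a peak file in BED format (--cell-type-peak) and a signal file in BigWig format (--cell-type-bw).

The embed_regulator subcommand uses these files to build a cell-type-specific model and then generates cell-type-specific regulator embeddings with it.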

[ ]:
# --region: regions of interest (BED file)
# --regulator: regulators of interest, separated by ';'
# --odir: output directory
# --genome: genome assembly (hg38 or mm10)
# --resolution: bin resolution (1kb here)
# --cell-type-bw: cell-type-specific accessibility BigWig file
# --cell-type-peak: cell-type-specific accessibility peak BED file

!chrombert-tools embed_regulator \
--region '../data/CTCF_ENCFF664UGR_sample100.bed' \
--regulator "EZH2;BRD4;CTCF;FOXA3;myod1;myF5" \
--odir "./output_emb_regulator_1kb_load_ft" \
--genome "hg38" \
--resolution "1kb" \
--cell-type-bw ../data/myoblast_ENCFF149ERN_signal.bigwig \
--cell-type-peak ../data/myoblast_ENCFF647RNC_peak.bed