Juicer
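The Juicer suite is a commonly used toolset for Hi-C data analysis. It consists of Juicer, Juicebox, and Juicertools.
It is the most convenient option: Juicer generates Hi-C maps and structural features directly from raw.fq data; Juicebox provides visualization and manual editing; the edited map is then returned to Juicer to complete scaffolding.
Juicertools provides additional downstream analyses (TADs, loops, ...).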
Install
conda activate hic
conda install hcc::juicer ## juicer_tools -h
# conda install conda-forge::gnu-coreutils
conda install bioconda::bwa
git clone https://github.com/theaidenlab/juicer.git
cd juicer/CPU
wget
ln -s juicer_tools.1.9.9_jcuda.0.8.jar juicer_tools.jar
cd ..
ln -s CPU scripts ## some steps hard-code the scripts/ path
cd ..
export PATH=$PATH:$PWD/juicer/CPU/ ## juicer.sh ----- juicer/CPU/juicer.sh
export PATH=$PATH:$PWD/juicer/misc/ ## generate_site_positions.py ----- juicer/misc/generate_site_positions.py
# ln -s juicer/CPU scripts
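After the PATH exports above, it can help to confirm that everything resolves before starting a long run. A minimal sketch; `check_tools` is a hypothetical helper name, not part of Juicer, and it only checks that each command is found on PATH, not that it works.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
  ct_missing=0
  for ct_tool in "$@"; do
    if ! command -v "$ct_tool" >/dev/null 2>&1; then
      echo "missing: $ct_tool" >&2
      ct_missing=1
    fi
  done
  return "$ct_missing"
}

# Usage (tool names taken from the install steps above):
#   check_tools juicer.sh generate_site_positions.py bwa java
```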
For Juicebox, install the desktop version; don't bother with WSL.
Test data: Juicer's test data, .hic files downloaded from ENCODE, or the workshop data.
Juicer
# Inputs to prepare beforehand:
#   -z  bwa index hg38.fa
#   -p  chromosome sizes: samtools faidx hg38.fa; cut -f 1,2 hg38.fa.fai > hg38.chrom.sizes
#   -y  restriction sites: generate_site_positions.py HindIII hg38 hg38.fa
#   -d  top directory; reads are matched as fq_folder/<fastq and splits (trimmed)>/*_R*.fastq*
#   -D  /path/to/juicer install directory
#   -t  number of threads
juicer.sh \
  -z ref/hg38.fa \
  -p ref/hg38.chrom.sizes \
  -y ref/hg38_HindIII.txt \
  -d fq_folder \
  -D $PWD/juicer \
  -t 4
# juicer.sh -z ref/hg38.fa -p ref/hg38.chrom.sizes -y ref/hg38_HindIII.txt -d fq_folder -D $PWD/juicer -t 4
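The file passed with -p is a two-column table of sequence name and length. A minimal sketch that derives the same table straight from the FASTA with awk, for when samtools is not at hand. `fasta_chrom_sizes` is a hypothetical helper name, and the sketch assumes an uncompressed FASTA with Unix line endings.

```shell
# Hypothetical helper: print "<name>\t<length>" for every record of a FASTA file.
# Same intent as: samtools faidx ref.fa; cut -f 1,2 ref.fa.fai
fasta_chrom_sizes() {
  awk '
    /^>/ {                          # a header line starts a new record
      if (name != "") print name "\t" len
      name = substr($1, 2)          # drop ">" and keep only the first word
      len = 0
      next
    }
    { len += length($0) }           # sequence line: add its length
    END { if (name != "") print name "\t" len }
  ' "$1"
}

# Usage:
#   fasta_chrom_sizes ref/hg38.fa > ref/hg38.chrom.sizes
```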
Output:
fq_folder/aligned/
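├── merged_nodups.txt # deduplicated contact pairs in Juicer's own text format; convert to .pairs before processing with pairtools
├── inter.hic # multi-resolution Hi-C contact map with normalization vectors, reads with MAPQ >= 1 (for downstream analysis)
└── inter_30.hic # the same map built only from reads with MAPQ >= 30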
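To get a feel for how much data the MAPQ >= 30 map keeps, the contacts can be counted directly from merged_nodups.txt. A minimal sketch, assuming Juicer's long format in which column 9 is mapq1 and column 12 is mapq2. `count_mapq_pairs` is a hypothetical helper name, and the count only approximates the filter applied when the .hic file is built.

```shell
# Hypothetical helper: count contacts whose two read ends both reach a MAPQ threshold.
# Assumes whitespace-separated columns with mapq1 in column 9 and mapq2 in column 12.
# $1 = merged_nodups-style file, $2 = threshold (default 30)
count_mapq_pairs() {
  awk -v min="${2:-30}" '
    { total++ }
    $9 >= min && $12 >= min { pass++ }
    END { printf "total=%d mapq>=%d=%d\n", total, min, pass }
  ' "$1"
}

# Usage:
#   count_mapq_pairs fq_folder/aligned/merged_nodups.txt 30
```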
Juicebox
Visualization; see the related videos.
Try: java -jar Juicebox.jar -m hg38.chrom.sizes -p fq_folder/aligned/inter.hic
Juicertools
- Find Loops
juicer_tools hiccups -m 512 -r 5000,10000 -k KR fq_folder/aligned/inter.hic loops_output
-m matrixSize size of the submatrix processed per block
-k normalization (NONE/VC/VC_SQRT/KR) normalization method
-c chromosome(s)
-r resolution(s) bin size(s) in bp
-f fdr
-p peak width
-i window
-t thresholds
-d centroid distances
--ignore-sparsity
specified_loop_list
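Loop calls are written as BEDPE-like text, so a quick sanity check is to look at the span between the two anchors. A minimal sketch, assuming tab-separated columns 1-6 are chr1 x1 x2 chr2 y1 y2 and that header lines start with "#". `loop_span_summary` is a hypothetical helper name, and the loop file is passed in explicitly because the exact output file name depends on the juicer_tools version.

```shell
# Hypothetical helper: summarize anchor-to-anchor spans in a BEDPE-like loop list.
# Assumes columns 1-6 = chr1 x1 x2 chr2 y1 y2; lines starting with "#" are skipped.
loop_span_summary() {
  awk -F'\t' '
    /^#/ || NF < 6 { next }                   # skip headers and short lines
    {
      span = ($5 + $6) / 2 - ($2 + $3) / 2    # distance between anchor midpoints
      if (span < 0) span = -span
      n++
      sum += span
      if (n == 1 || span < min) min = span
      if (n == 1 || span > max) max = span
    }
    END {
      if (n) printf "loops=%d min=%d mean=%.0f max=%d\n", n, min, sum / n, max
      else print "loops=0"
    }
  ' "$1"
}

# Usage:
#   loop_span_summary path/to/loops.bedpe
```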
- Find TADs --- these are really contact domains, more like sub-TADs?
juicer_tools arrowhead -m 2000 -r 10000 -k KR fq_folder/aligned/inter.hic tads_output
-c chromosome(s)
-m matrix size
-r resolution
-k normalization (NONE/VC/VC_SQRT/KR)
--ignore-sparsity flag
feature_list
control_list
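- Extract a matrix from a .hic file
juicer_tools dump <observed/oe> <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr1>[:x1:x2] <chr2>[:y1:y2] <BP/FRAG> <binsize> [outfile]
- Normalization: the first two arguments select the matrix type and the normalization, e.g. observed/expected:
juicer_tools dump oe <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr1>[:x1:x2] <chr2>[:y1:y2] <BP/FRAG> <binsize> [outfile]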
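The dumped matrix can be post-processed with plain text tools. A minimal sketch that sums contact values per genomic distance, assuming the dump output is sparse triplets of `<binStart1> <binStart2> <value>` for an intrachromosomal BP matrix. `contacts_by_distance` is a hypothetical helper name; NaN values, which can appear after normalization, are skipped.

```shell
# Hypothetical helper: sum contact values per genomic distance from sparse triplets.
# Assumes each line is "<binStart1> <binStart2> <value>" (whitespace-separated).
contacts_by_distance() {
  awk '
    NF == 3 && $3 != "NaN" {
      d = $2 - $1
      if (d < 0) d = -d
      sum[d] += $3
    }
    END { for (d in sum) printf "%d\t%g\n", d, sum[d] }
  ' "$1" | sort -n
}

# Usage (matrix.txt is a hypothetical outfile name given to dump):
#   contacts_by_distance matrix.txt > distance_decay.txt
```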