# DRAM
https://github.com/shafferm/DRAM
```bash
wget https://raw.githubusercontent.com/shafferm/DRAM/master/environment.yaml
conda env create -f environment.yaml -n DRAM
conda activate DRAM
DRAM-setup.py prepare_databases --output_dir DRAM_data --skip_uniref --threads 30 --verbose
```
Setup failed with this error after about 100 minutes:
```
1:41:00.068935: DRAM databases and forms downloaded
1:41:00.242283: Files moved to final destination
1:41:00.244326: Setting database paths
Traceback (most recent call last):
File "/data1/mlee/miniconda3/envs/DRAM/bin/DRAM-setup.py", line 146, in <module>
args.func(**args_dict)
File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 526, in prepare_databases
set_database_paths(**output_dbs, use_current_locs=False, update_description_db=True, start_time=start_time)
File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 350, in set_database_paths
db_dict = check_exists_and_add_to_location_dict(description_db_loc, 'description_db', db_dict)
File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 303, in check_exists_and_add_to_location_dict
if check_file_exists(loc):
File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 34, in check_file_exists
raise ValueError("Database location does not exist: %s" % db_loc)
ValueError: Database location does not exist: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/description_db.sqlite
```
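Since the failure only shows up at the very end of a long run, it may help to keep a full log when re-running. A minimal sketch, assuming bash; `run_logged` is a hypothetical helper name of mine, not part of DRAM:

```bash
#!/usr/bin/env bash
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of DRAM): runs a command, copies its
# stdout and stderr to LOGFILE while still printing them, and returns
# the command's own exit code rather than tee's.
run_logged() {
    local log="$1"
    shift
    "$@" 2>&1 | tee "$log"
    return "${PIPESTATUS[0]}"
}
```

It could then wrap the setup command from above, e.g. `run_logged DRAM-setup.log DRAM-setup.py prepare_databases --output_dir DRAM_data --skip_uniref --threads 30 --verbose`.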
Looking through the GitHub issues, I found [this thread](https://github.com/shafferm/DRAM/issues/14) and tried this:
```bash
DRAM-setup.py print_config
```
But the database locations aren't stored in the config, even though the files downloaded and are present:
```
KEGG db: None
KOfam db: None
KOfam KO list: None
UniRef db: None
Pfam db: None
Pfam hmm dat: None
dbCAN db: None
dbCAN family activities: None
RefSeq Viral db: None
MEROPS peptidase db: None
VOGDB db: None
VOG annotations: None
Description db: None
Genome summary form: /Users/shafferm/lab/DRAM/data/genome_summary_form.tsv
Module step form: /Users/shafferm/lab/DRAM/data/module_step_form.tsv
ETC module database: /Users/shafferm/lab/DRAM/data/etc_module_database.tsv
Function heatmap form: /Users/shafferm/lab/DRAM/data/function_heatmap_form.tsv
AMG database: /Users/shafferm/lab/DRAM/data/amg_database.tsv
```
```
ls DRAM_data/*
DRAM_data/amg_database.20210406.tsv DRAM_data/peptidases.20210406.mmsdb_h DRAM_data/refseq_viral.20210406.mmsdb
DRAM_data/CAZyDB.07302020.fam-activities.txt DRAM_data/peptidases.20210406.mmsdb_h.dbtype DRAM_data/refseq_viral.20210406.mmsdb.dbtype
DRAM_data/dbCAN-HMMdb-V9.txt DRAM_data/peptidases.20210406.mmsdb_h.index DRAM_data/refseq_viral.20210406.mmsdb_h
DRAM_data/dbCAN-HMMdb-V9.txt.h3f DRAM_data/peptidases.20210406.mmsdb.idx DRAM_data/refseq_viral.20210406.mmsdb_h.dbtype
DRAM_data/dbCAN-HMMdb-V9.txt.h3i DRAM_data/peptidases.20210406.mmsdb.idx.dbtype DRAM_data/refseq_viral.20210406.mmsdb_h.index
DRAM_data/dbCAN-HMMdb-V9.txt.h3m DRAM_data/peptidases.20210406.mmsdb.idx.index DRAM_data/refseq_viral.20210406.mmsdb.idx
DRAM_data/dbCAN-HMMdb-V9.txt.h3p DRAM_data/peptidases.20210406.mmsdb.index DRAM_data/refseq_viral.20210406.mmsdb.idx.dbtype
DRAM_data/etc_mdoule_database.20210406.tsv DRAM_data/peptidases.20210406.mmsdb.lookup DRAM_data/refseq_viral.20210406.mmsdb.idx.index
DRAM_data/function_heatmap_form.20210406.tsv DRAM_data/peptidases.20210406.mmsdb.source DRAM_data/refseq_viral.20210406.mmsdb.index
DRAM_data/genome_summary_form.20210406.tsv DRAM_data/Pfam-A.hmm.dat.gz DRAM_data/refseq_viral.20210406.mmsdb.lookup
DRAM_data/kofam_ko_list.tsv DRAM_data/pfam.mmspro DRAM_data/refseq_viral.20210406.mmsdb.source
DRAM_data/kofam_profiles.hmm DRAM_data/pfam.mmspro.dbtype DRAM_data/vog_annotations_latest.tsv.gz
DRAM_data/kofam_profiles.hmm.h3f DRAM_data/pfam.mmspro_h DRAM_data/vog_latest_hmms.txt
DRAM_data/kofam_profiles.hmm.h3i DRAM_data/pfam.mmspro_h.dbtype DRAM_data/vog_latest_hmms.txt.h3f
DRAM_data/kofam_profiles.hmm.h3m DRAM_data/pfam.mmspro_h.index DRAM_data/vog_latest_hmms.txt.h3i
DRAM_data/kofam_profiles.hmm.h3p DRAM_data/pfam.mmspro.idx DRAM_data/vog_latest_hmms.txt.h3m
DRAM_data/module_step_form.20210406.tsv DRAM_data/pfam.mmspro.idx.dbtype DRAM_data/vog_latest_hmms.txt.h3p
DRAM_data/peptidases.20210406.mmsdb DRAM_data/pfam.mmspro.idx.index
DRAM_data/peptidases.20210406.mmsdb.dbtype DRAM_data/pfam.mmspro.index
DRAM_data/database_files:
kofam_profiles merops_peptidases_nr.faa pfam.mmsmsa pfam.mmsmsa.index viral.1.protein.faa.gz viral.merged.protein.faa.gz vog.hmm.tar.gz
kofam_profiles.tar.gz Pfam-A.full.gz pfam.mmsmsa.dbtype tmp viral.2.protein.faa.gz vogdb_hmms
```
Then I looked at this [thread](https://github.com/shafferm/DRAM/issues/26) and tried to make the description db manually:
```bash
DRAM-setup.py set_database_locations \
--kofam_hmm_loc DRAM_data/kofam_profiles.hmm \
--kofam_ko_list_loc DRAM_data/kofam_ko_list.tsv \
--pfam_db_loc DRAM_data/pfam.mmspro \
--pfam_hmm_dat DRAM_data/Pfam-A.hmm.dat.gz \
--dbcan_db_loc DRAM_data/dbCAN-HMMdb-V9.txt \
--dbcan_fam_activities DRAM_data/CAZyDB.07302020.fam-activities.txt \
--vogdb_db_loc DRAM_data/vog_latest_hmms.txt \
--vog_annotations DRAM_data/vog_annotations_latest.tsv.gz \
--viral_db_loc DRAM_data/refseq_viral.20210406.mmsdb \
--peptidase_db_loc DRAM_data/peptidases.20210406.mmsdb \
--genome_summary_form_loc DRAM_data/genome_summary_form.20210406.tsv \
--module_step_form_loc DRAM_data/module_step_form.20210406.tsv \
--etc_module_database_loc DRAM_data/etc_mdoule_database.20210406.tsv \
--function_heatmap_form_loc DRAM_data/function_heatmap_form.20210406.tsv \
--amg_database_loc DRAM_data/amg_database.20210406.tsv \
--update_description_db
```
Note: I had to leave out `--description_db_loc DRAM_data/description_db.sqlite`. With it, the command failed immediately saying the file doesn't exist, even though the help menu says:
```
--description_db_loc DESCRIPTION_DB_LOC
Location to write description sqlite db (default: None)
```
That output went to a file named "None" in the current directory (the default for the `--description_db_loc` argument above):
```bash
ls -lt None
# -rw-r--r-- 1 mike mike 289640448 Apr 6 15:27 None
file None
# None: SQLite 3.x database, last written using SQLite version 3035004
```
So I moved it to the expected location:
```bash
mv None DRAM_data/description_db.sqlite
```
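A more cautious version of that move, as a sketch: it checks that the stray file really starts with the SQLite 3 header (the same thing `file` reported above) and refuses to overwrite an existing destination. `move_stray_db` is a hypothetical helper name, not something DRAM provides:

```bash
#!/usr/bin/env bash
# move_stray_db SRC DEST
# Hypothetical helper (not part of DRAM): moves SRC to DEST only if SRC
# looks like a SQLite 3 database and DEST does not already exist.
move_stray_db() {
    local src="$1" dest="$2"
    if [[ ! -f "$src" ]]; then
        echo "no such file: $src" >&2
        return 1
    fi
    # SQLite 3 files begin with the 16-byte header "SQLite format 3\0";
    # compare the first 15 printable bytes.
    if [[ "$(head -c 15 "$src")" != "SQLite format 3" ]]; then
        echo "not a SQLite 3 database: $src" >&2
        return 1
    fi
    if [[ -e "$dest" ]]; then
        echo "refusing to overwrite: $dest" >&2
        return 1
    fi
    mv -- "$src" "$dest"
}
```

Here that would be `move_stray_db None DRAM_data/description_db.sqlite`.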
Looking again:
```bash
DRAM-setup.py print_config
```
```
KEGG db: None
KOfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_profiles.hmm
KOfam KO list: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_ko_list.tsv
UniRef db: None
Pfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/pfam.mmspro
Pfam hmm dat: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/dbCAN-HMMdb-V9.txt
dbCAN family activities: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/CAZyDB.07302020.fam-activities.txt
RefSeq Viral db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/refseq_viral.20210406.mmsdb
MEROPS peptidase db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/peptidases.20210406.mmsdb
VOGDB db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_latest_hmms.txt
VOG annotations: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_annotations_latest.tsv.gz
Description db: None
Genome summary form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/genome_summary_form.20210406.tsv
Module step form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/module_step_form.20210406.tsv
ETC module database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/etc_mdoule_database.20210406.tsv
Function heatmap form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/function_heatmap_form.20210406.tsv
AMG database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/amg_database.20210406.tsv
```
Updating the location of the description db:
```bash
DRAM-setup.py set_database_locations \
--description_db_loc DRAM_data/description_db.sqlite \
--update_description_db
```
After that, the config looked OK, I think:
```bash
DRAM-setup.py print_config
```
```
KEGG db: None
KOfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_profiles.hmm
KOfam KO list: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_ko_list.tsv
UniRef db: None
Pfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/pfam.mmspro
Pfam hmm dat: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/dbCAN-HMMdb-V9.txt
dbCAN family activities: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/CAZyDB.07302020.fam-activities.txt
RefSeq Viral db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/refseq_viral.20210406.mmsdb
MEROPS peptidase db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/peptidases.20210406.mmsdb
VOGDB db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_latest_hmms.txt
VOG annotations: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_annotations_latest.tsv.gz
Description db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/description_db.sqlite
Genome summary form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/genome_summary_form.20210406.tsv
Module step form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/module_step_form.20210406.tsv
ETC module database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/etc_mdoule_database.20210406.tsv
Function heatmap form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/function_heatmap_form.20210406.tsv
AMG database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/amg_database.20210406.tsv
```
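Rather than eyeballing the config, one option is to check that each listed path exists on disk. A rough sketch, assuming the `Label: /path` (or `Label: None`) line format shown in the `print_config` output above; `check_config_paths` is a hypothetical helper name, not part of DRAM:

```bash
#!/usr/bin/env bash
# check_config_paths
# Hypothetical helper (not part of DRAM): reads "Label: value" lines from
# stdin and reports whether each value is unset ("None"), present on disk,
# or missing. Returns 1 if any configured path is missing.
check_config_paths() {
    local status=0 line label path
    while IFS= read -r line; do
        # Skip anything that is not a "Label: value" line
        [[ "$line" == *": "* ]] || continue
        label="${line%%: *}"   # text before the first ": "
        path="${line#*: }"     # text after the first ": "
        if [[ "$path" == "None" ]]; then
            echo "UNSET   ${label}"
        elif [[ -e "$path" ]]; then
            echo "OK      ${label}"
        else
            echo "MISSING ${label}: ${path}"
            status=1
        fi
    done
    return "$status"
}
```

Usage would be `DRAM-setup.py print_config | check_config_paths`. With `--skip_uniref` and no KEGG, the KEGG and UniRef entries are expected to show as unset, so only `MISSING` lines count as a failure here.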
Testing:
```bash
time DRAM.py annotate -i 'test-genomes/*.fasta' -o test-DRAM-output --min_contig_size 1000
# 50 minutes
time DRAM.py distill \
-i test-DRAM-output/annotations.tsv \
-o test-DRAM-output-summaries \
--trna_path test-DRAM-output/trnas.tsv \
--rrna_path test-DRAM-output/rrnas.tsv
```