# DRAM

https://github.com/shafferm/DRAM

```bash
wget https://raw.githubusercontent.com/shafferm/DRAM/master/environment.yaml
conda env create -f environment.yaml -n DRAM

# activate the new environment so DRAM-setup.py is on the PATH
conda activate DRAM

DRAM-setup.py prepare_databases --output_dir DRAM_data --skip_uniref --threads 30 --verbose
```

Got this error after about 100 minutes:

```
1:41:00.068935: DRAM databases and forms downloaded
1:41:00.242283: Files moved to final destination
1:41:00.244326: Setting database paths
Traceback (most recent call last):
  File "/data1/mlee/miniconda3/envs/DRAM/bin/DRAM-setup.py", line 146, in <module>
    args.func(**args_dict)
  File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 526, in prepare_databases
    set_database_paths(**output_dbs, use_current_locs=False, update_description_db=True, start_time=start_time)
  File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 350, in set_database_paths
    db_dict = check_exists_and_add_to_location_dict(description_db_loc, 'description_db', db_dict)
  File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 303, in check_exists_and_add_to_location_dict
    if check_file_exists(loc):
  File "/data1/mlee/miniconda3/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 34, in check_file_exists
    raise ValueError("Database location does not exist: %s" % db_loc)
ValueError: Database location does not exist: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/description_db.sqlite
```

Looking through the GitHub issues, I found [this thread](https://github.com/shafferm/DRAM/issues/14) and tried this:

```bash
DRAM-setup.py print_config
```

But the database locations aren't stored in the config, even though the databases downloaded and are present:

```
KEGG db: None
KOfam db: None
KOfam KO list: None
UniRef db: None
Pfam db: None
Pfam hmm dat: None
dbCAN db: None
dbCAN family activities: None
RefSeq Viral db: None
MEROPS peptidase db: None
VOGDB db: None
VOG annotations: None
Description db: None
Genome summary form: /Users/shafferm/lab/DRAM/data/genome_summary_form.tsv
Module step form: /Users/shafferm/lab/DRAM/data/module_step_form.tsv
ETC module database: /Users/shafferm/lab/DRAM/data/etc_module_database.tsv
Function heatmap form: /Users/shafferm/lab/DRAM/data/function_heatmap_form.tsv
AMG database: /Users/shafferm/lab/DRAM/data/amg_database.tsv
```

```
ls DRAM_data/*
DRAM_data/amg_database.20210406.tsv           DRAM_data/peptidases.20210406.mmsdb_h           DRAM_data/refseq_viral.20210406.mmsdb
DRAM_data/CAZyDB.07302020.fam-activities.txt  DRAM_data/peptidases.20210406.mmsdb_h.dbtype    DRAM_data/refseq_viral.20210406.mmsdb.dbtype
DRAM_data/dbCAN-HMMdb-V9.txt                  DRAM_data/peptidases.20210406.mmsdb_h.index     DRAM_data/refseq_viral.20210406.mmsdb_h
DRAM_data/dbCAN-HMMdb-V9.txt.h3f              DRAM_data/peptidases.20210406.mmsdb.idx         DRAM_data/refseq_viral.20210406.mmsdb_h.dbtype
DRAM_data/dbCAN-HMMdb-V9.txt.h3i              DRAM_data/peptidases.20210406.mmsdb.idx.dbtype  DRAM_data/refseq_viral.20210406.mmsdb_h.index
DRAM_data/dbCAN-HMMdb-V9.txt.h3m              DRAM_data/peptidases.20210406.mmsdb.idx.index   DRAM_data/refseq_viral.20210406.mmsdb.idx
DRAM_data/dbCAN-HMMdb-V9.txt.h3p              DRAM_data/peptidases.20210406.mmsdb.index       DRAM_data/refseq_viral.20210406.mmsdb.idx.dbtype
DRAM_data/etc_mdoule_database.20210406.tsv    DRAM_data/peptidases.20210406.mmsdb.lookup      DRAM_data/refseq_viral.20210406.mmsdb.idx.index
DRAM_data/function_heatmap_form.20210406.tsv  DRAM_data/peptidases.20210406.mmsdb.source      DRAM_data/refseq_viral.20210406.mmsdb.index
DRAM_data/genome_summary_form.20210406.tsv    DRAM_data/Pfam-A.hmm.dat.gz                     DRAM_data/refseq_viral.20210406.mmsdb.lookup
DRAM_data/kofam_ko_list.tsv                   DRAM_data/pfam.mmspro                           DRAM_data/refseq_viral.20210406.mmsdb.source
DRAM_data/kofam_profiles.hmm                  DRAM_data/pfam.mmspro.dbtype                    DRAM_data/vog_annotations_latest.tsv.gz
DRAM_data/kofam_profiles.hmm.h3f              DRAM_data/pfam.mmspro_h                         DRAM_data/vog_latest_hmms.txt
DRAM_data/kofam_profiles.hmm.h3i              DRAM_data/pfam.mmspro_h.dbtype                  DRAM_data/vog_latest_hmms.txt.h3f
DRAM_data/kofam_profiles.hmm.h3m              DRAM_data/pfam.mmspro_h.index                   DRAM_data/vog_latest_hmms.txt.h3i
DRAM_data/kofam_profiles.hmm.h3p              DRAM_data/pfam.mmspro.idx                       DRAM_data/vog_latest_hmms.txt.h3m
DRAM_data/module_step_form.20210406.tsv       DRAM_data/pfam.mmspro.idx.dbtype                DRAM_data/vog_latest_hmms.txt.h3p
DRAM_data/peptidases.20210406.mmsdb           DRAM_data/pfam.mmspro.idx.index
DRAM_data/peptidases.20210406.mmsdb.dbtype    DRAM_data/pfam.mmspro.index

DRAM_data/database_files:
kofam_profiles         merops_peptidases_nr.faa  pfam.mmsmsa         pfam.mmsmsa.index  viral.1.protein.faa.gz  viral.merged.protein.faa.gz  vog.hmm.tar.gz
kofam_profiles.tar.gz  Pfam-A.full.gz            pfam.mmsmsa.dbtype  tmp                viral.2.protein.faa.gz  vogdb_hmms
```

Then I looked at this [thread](https://github.com/shafferm/DRAM/issues/26) and tried to set the locations and build the description db manually:

```bash
DRAM-setup.py set_database_locations \
    --kofam_hmm_loc DRAM_data/kofam_profiles.hmm \
    --kofam_ko_list_loc DRAM_data/kofam_ko_list.tsv \
    --pfam_db_loc DRAM_data/pfam.mmspro \
    --pfam_hmm_dat DRAM_data/Pfam-A.hmm.dat.gz \
    --dbcan_db_loc DRAM_data/dbCAN-HMMdb-V9.txt \
    --dbcan_fam_activities DRAM_data/CAZyDB.07302020.fam-activities.txt \
    --vogdb_db_loc DRAM_data/vog_latest_hmms.txt \
    --vog_annotations DRAM_data/vog_annotations_latest.tsv.gz \
    --viral_db_loc DRAM_data/refseq_viral.20210406.mmsdb \
    --peptidase_db_loc DRAM_data/peptidases.20210406.mmsdb \
    --genome_summary_form_loc DRAM_data/genome_summary_form.20210406.tsv \
    --module_step_form_loc DRAM_data/module_step_form.20210406.tsv \
    --etc_module_database_loc DRAM_data/etc_mdoule_database.20210406.tsv \
    --function_heatmap_form_loc DRAM_data/function_heatmap_form.20210406.tsv \
    --amg_database_loc DRAM_data/amg_database.20210406.tsv \
    --update_description_db
```

Note: I needed to remove `--description_db_loc DRAM_data/description_db.sqlite`, otherwise it failed immediately saying the file doesn't exist, despite the help menu describing that argument as the location to write to:

```
  --description_db_loc DESCRIPTION_DB_LOC
                        Location to write description sqlite db (default: None)
```

Without that argument, the description db was written to a file literally named "None" in the current directory (the default for the `--description_db_loc` argument above):

```bash
ls -lt None
# -rw-r--r-- 1 mike mike 289640448 Apr 6 15:27 None

file None
# None: SQLite 3.x database, last written using SQLite version 3035004
```

So I moved it to the expected location:

```bash
mv None DRAM_data/description_db.sqlite
```

Looking again:

```bash
DRAM-setup.py print_config
```

```
KEGG db: None
KOfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_profiles.hmm
KOfam KO list: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_ko_list.tsv
UniRef db: None
Pfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/pfam.mmspro
Pfam hmm dat: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/dbCAN-HMMdb-V9.txt
dbCAN family activities: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/CAZyDB.07302020.fam-activities.txt
RefSeq Viral db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/refseq_viral.20210406.mmsdb
MEROPS peptidase db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/peptidases.20210406.mmsdb
VOGDB db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_latest_hmms.txt
VOG annotations: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_annotations_latest.tsv.gz
Description db: None
Genome summary form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/genome_summary_form.20210406.tsv
Module step form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/module_step_form.20210406.tsv
ETC module database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/etc_mdoule_database.20210406.tsv
Function heatmap form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/function_heatmap_form.20210406.tsv
AMG database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/amg_database.20210406.tsv
```

The description db is still unset, so updating its location now that the file exists:

```bash
DRAM-setup.py set_database_locations \
    --description_db_loc DRAM_data/description_db.sqlite \
    --update_description_db
```

After that the config looked OK, I think:

```bash
DRAM-setup.py print_config
```

```
KEGG db: None
KOfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_profiles.hmm
KOfam KO list: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/kofam_ko_list.tsv
UniRef db: None
Pfam db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/pfam.mmspro
Pfam hmm dat: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/dbCAN-HMMdb-V9.txt
dbCAN family activities: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/CAZyDB.07302020.fam-activities.txt
RefSeq Viral db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/refseq_viral.20210406.mmsdb
MEROPS peptidase db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/peptidases.20210406.mmsdb
VOGDB db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_latest_hmms.txt
VOG annotations: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/vog_annotations_latest.tsv.gz
Description db: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/description_db.sqlite
Genome summary form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/genome_summary_form.20210406.tsv
Module step form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/module_step_form.20210406.tsv
ETC module database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/etc_mdoule_database.20210406.tsv
Function heatmap form: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/function_heatmap_form.20210406.tsv
AMG database: /data3/Data_Processing/mlee/ref-dbs/DRAM_data/amg_database.20210406.tsv
```

Testing:

```bash
time DRAM.py annotate -i 'test-genomes/*.fasta' -o test-DRAM-output --min_contig_size 1000
# 50 minutes

time DRAM.py distill \
    -i test-DRAM-output/annotations.tsv \
    -o test-DRAM-output-summaries \
    --trna_path test-DRAM-output/trnas.tsv \
    --rrna_path test-DRAM-output/rrnas.tsv
```
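
Since most of the troubleshooting above came down to eyeballing `print_config` output for `None` entries and paths that don't exist, here is a rough sketch of a helper that could do that check automatically. `check_dram_config` is a hypothetical function of my own, not part of DRAM, and it assumes the `Label: /path` line format shown in the outputs above; if DRAM changes that format, it would need adjusting.

```bash
#!/usr/bin/env bash
# Hypothetical helper (NOT part of DRAM): reads "Label: value" lines on stdin,
# in the format printed by `DRAM-setup.py print_config` above, and reports
# whether each configured location is unset ("None"), present, or missing.
check_dram_config() {
    local status=0 line label path
    while IFS= read -r line || [[ -n "$line" ]]; do
        # Skip anything that doesn't look like "label: value".
        [[ "$line" == *": "* ]] || continue
        label="${line%%: *}"   # text before the first ": "
        path="${line#*: }"     # text after the first ": "
        if [[ "$path" == "None" ]]; then
            printf 'UNSET    %s\n' "$label"
        elif [[ -e "$path" ]]; then
            printf 'OK       %s\n' "$label"
        else
            printf 'MISSING  %s -> %s\n' "$label" "$path"
            status=1
        fi
    done
    # Non-zero only when a configured path does not exist on disk.
    return "$status"
}

# Usage (assumes the DRAM conda environment is active):
#   DRAM-setup.py print_config | check_dram_config
```

Unset entries are only reported, not treated as failures, because `KEGG db` and `UniRef db` are expected to be `None` when those databases are skipped, as they were here.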