# Tuco RNAseq work

#### Begin by downloading the data

```
lftp -c 'set ssl:verify-certificate no;
set ftp:ssl-protect-data true;
set ftp:ssl-force true;
open -u i20240711_150PE_NVX2+_25B_Lee_M004370,Aepujaji9oungah -e "mirror -c; quit" ftp://gslanalyzer.qb3.berkeley.edu:990'
```

#### Genome vs transcriptome approach

Need to determine whether mapping to the genome (https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_004027165.1/) or a _de novo_ transcriptome approach will work better. The genome quality is probably low, so two checks will inform the decision:

- [x] BUSCO %'s C:74.8%[S:69.5%,D:5.3%],F:9.4%,M:15.8%,n:9226
- [ ] Mapping rates

#### Run BUSCO

```
#!/bin/bash
#SBATCH --partition=macmanes,shared
#SBATCH -J tuco_busco
#SBATCH --cpus-per-task=24
#SBATCH --output busco_tuco.log
#SBATCH --exclude=node117

# Load the BUSCO conda environment
module purge
module load anaconda/colsa
conda activate busco-5.4.7

# Point BUSCO at the shared Augustus config directory
export AUGUSTUS_CONFIG_PATH=/mnt/lz01/hcgs/shared/databases/busco/augustus_config/

# Assess genome completeness against the mammalia_odb10 lineage
busco -i $HOME/tuco/genome/GCA_004027165.1_CteSoc_v1_BIUU_genomic.fna \
  --cpu 24 \
  --out busco_tuco \
  --lineage $HOME/mammalia_odb10 \
  --download_path $HOME/databases/busco/ \
  --offline \
  --config /mnt/lz01/hcgs/shared/databases/busco/config.ini \
  --mode genome
```
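
#### Parse the BUSCO summary (sketch)

To compare the genome against a _de novo_ transcriptome later, it may help to split the BUSCO one-line summary into numbers. The following is a minimal sketch, not part of the original workflow: it works on the summary string recorded in the checklist above, and `busco_field` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/bash
# Summary string copied from the BUSCO checklist entry above.
busco_line='C:74.8%[S:69.5%,D:5.3%],F:9.4%,M:15.8%,n:9226'

# Hypothetical helper: print the number that follows one category key
# (C, S, D, F, M, or n) in a BUSCO one-line summary.
busco_field() {
    local key="$1" line="$2"
    printf '%s\n' "$line" \
        | grep -oE "(^|[^A-Za-z])${key}:[0-9.]+" \
        | sed -E 's/.*://'
}

complete=$(busco_field C "$busco_line")
fragmented=$(busco_field F "$busco_line")
missing=$(busco_field M "$busco_line")
n_buscos=$(busco_field n "$busco_line")

# Sanity check: complete + fragmented + missing should total 100%.
total=$(awk -v c="$complete" -v f="$fragmented" -v m="$missing" \
    'BEGIN { printf "%.1f", c + f + m }')

echo "complete=${complete}% fragmented=${fragmented}% missing=${missing}% of ${n_buscos} BUSCOs (total ${total}%)"
```

The same helper could be pointed at the summary line from a transcriptome BUSCO run so the two sets of percentages can be compared side by side.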