bgenevcius
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # PLASTICIDADE - EUSCHISTUS (DENIS) ## Table of Contents [TOC] ## Basic commands Conectando no servidor ``` ssh denis@143.107.244.181 -p 2205 ``` Transferencia de arquivos de servidor para local ``` #de local para servidor #a sintaxe é simples: scp -P <porta> <caminho e arquivo de origem> <diretorio de destino> sudo scp -P 2205 /home/bruno/GoogleDrive/[ARTIGO]Denis/T26_1.fq.gz denis@143.107.244.181:/home/denis/raw_seqs #de servidor para servidor sudo scp caminho/do/arquivo.txt usuário@servidor-de-destino:/caminho/destino/ #At: /home/denis/results/Trim_Reads sudo scp ./T4_1.trim.fq.gz cunha@lem.ib.usp.br:/home/cunha/denis/raw_seqs ``` Checar md5 ``` md5sum <arquivo> md5sum * #checar todos de uma vez md5sum * > checklist.chk # generates a list of checksums for any file that matches * md5sum -c checklist.chk # runs through the list to check them ``` Análise em screen ``` #rodar análises em screen screen #iniciar um screen e nomea-lo screen -S session_name #sair do screen sem fechar ctrl+a+d #encerrar o screen em definitivo exit #recuperar screen screen -r #se tiver mais de um aberto screen -r <numeros> ``` Verificar se um programa está instalado (which is a BASH program that looks through everything you have installed, and tells you what folder it is installed to. If it can’t find the program you asked for, it returns nothing, i.e. gives you no results.) ``` which <programa> ``` Verificar a versão de um programa ``` <programa> -v ``` Instalar um programa ``` sudo apt-get install <programa> ``` man (short for manual) displays detailed documentation (also referred as man page or man file) for bash commands. It is a powerful resource to explore bash commands, understand their usage and flags. ``` man ls ``` In most commands the flags can be combined together in no particular order to obtain the desired results/output. ``` ls -Fa ls -laF ``` Contar o número de linhas em um arquivo ``` wc -l <arquivo> ``` ## Description of the data | Sample | Sex ratio | Batch | QC result | |:------:|:---------:|:-----:|:---------:| | 4 | 2MF | A | Hold | | 7 | 2MF | A | Hold | | 26 | M2F | A | Hold | | 29 | 2MF | B | Pass | | 30 | 2MF | C | Pass | | 31 | M2F | B | Pass | | 32 | 2MF | C | Pass | | 34 | M2F | C | Pass | | 36 | M2F | C | Fail | | 37 | M2F | C | Pass | | 38 | M2F | C | Fail | | 39 | 2MF | C | Pass | Files: T4_1.fq.gz T4_2.fq.gz T7_1.fq.gz T7_2.fq.gz T26_1.fq.gz T26_2.fq.gz T29_1.fq.gz T29_2.fq.gz T30_1.fq.gz T30_2.fq.gz T31_1.fq.gz T31_2.fq.gz T32_1.fq.gz T32_2.fq.gz T34_1.fq.gz T34_2.fq.gz T36_1.fq.gz T36_2.fq.gz T37_1.fq.gz T37_2.fq.gz T38_1.fq.gz T38_2.fq.gz T39_1.fq.gz T39_2.fq.gz Our samples are paired-end. Paired end reads are produced when the fragment size used in the sequencing process is much longer (typically 250 - 500 bp long) and the ends of the fragment are read in towards the middle. This produces two “paired” reads. One from the left hand end of a fragment and one from the right with a known separation distance between them. (The known separation distance is actually a distribution with a mean and standard deviation as not all original fragments are of the same length.) This extra information contained in the paired end reads can be useful for helping to tie pieces of sequence together during the assembly process. ![](https://i.imgur.com/0o5M682.jpg) ## 0 - Quality control of raw reads FastQC will will process one sample at a time and give you an output report for each sample separately. Don’t need to unzip raw read files because fastqc can cope with zipped files (.gz). MultiQC will combine all the outputs from FastQC analysis and give you one QC report for all processed samples, making them more easily comparable. ``` fastqc <arquivo> fastqc *.fq.gz #analisar todos de uma vez multiqc . #the dot stands for the local directory, and is obligatory. mv *fastqc* /dados/home/denis/results/fastQC #Passando o arquivo html para o local (the command is run from your local machine) scp -P 2205 denis@143.107.244.181:/home/denis/results/fastQC/multiqc_report.html . ``` About FastQC: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/ About MultiQC: https://multiqc.info/ Resultados: ![](https://i.imgur.com/DO13bCQ.png) ![](https://i.imgur.com/zITQDj0.png) ![](https://i.imgur.com/OZOHW1f.png) Este gráfico indica que as amostras provavelmente estão contaminadas: ![](https://i.imgur.com/nlpErws.png) "Warnings in this module usually indicate a problem with the library. Sharp peaks on an otherwise smooth distribution are normally the result of a specific contaminant (adapter dimers for example), which may well be picked up by the overrepresented sequences module. Wider or multiple peaks may represent contamination with a different species." ## 1 - Preprocessing (trimming) We will use a program called Trimmomatic to filter poor quality reads and trim poor quality bases from our samples. About Trimmomatic: https://datacarpentry.org/wrangling-genomics/03-trimming/ http://www.usadellab.org/cms/index.php?page=trimmomatic Manual: http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf We must first specify whether we have paired end (PE) or single end (SE) reads. You can specify threads to indicate the number of processors on your computer that you want Trimmomatic to use. In most cases using multiple threads (processors) can help to run the trimming faster. ``` #Running Trimmomatic version 0.39: for infile in *_1.fq.gz > do > base=$(basename ${infile} _1.fq.gz) > trimmomatic PE -threads 12 -phred33 ${infile} ${base}_2.fq.gz \ > ${base}_1.trim.fq.gz ${base}_1un.trim.fq.gz \ > ${base}_2.trim.fq.gz ${base}_2un.trim.fq.gz \ > SLIDINGWINDOW:4:20 MINLEN:36 ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 > done ``` I tested quality control again (for only paired reads): ``` fastqc *1.trim* fastqc *2.trim* mv *fastqc* ~/results/fastQc_Trim_Reads/ multiqc . ``` Results: ![](https://i.imgur.com/RRcVI6m.png) ![](https://i.imgur.com/Mru5nko.png) ![](https://i.imgur.com/imvbLP0.png) ![](https://i.imgur.com/bKbxE7C.png) As amostras continuam contaminadas: ![](https://i.imgur.com/wLXZXW5.png) ## 2 - Transcriptome assembly The goal of a sequence assembler is to produce long contiguous pieces of sequence (contigs) from reads. The contigs are sometimes then ordered and oriented in relation to one another to form scaffolds. The distances between pairs of a set of paired end reads is useful information for this purpose. Even if a reference genome is available, de novo assembly should be performed, as it can recover transcripts that are transcribed from segments of the genome that are missing from the reference genome assembly. ### 2.1 - Genome-free De novo transcriptome assembly RNA sequencing data is typically used for gene expression analysis via mapping reads to a reference genome. However, for organisms without a high-quality reference genome or gene annotation, de novo transcriptome assembly is a viable alternative We will be using Spades v3.13.0, which contains the pipeline rnaSPades. rnaSPAdes is a tool for de novo transcriptome assembly from RNA-Seq data and is suitable for all kind of organisms. It accepts gzip-compressed files. Manual: https://cab.spbu.ru/files/release3.14.1/rnaspades_manual.html ``` #assembling a transcriptome for each sample nano script_spades rnaspades.py -1 T4_1.trim.fq.gz -2 T4_2.trim.fq.gz -o rnaSpades_T4 rnaspades.py -1 T7_1.trim.fq.gz -2 T7_2.trim.fq.gz -o rnaSpades_T7 rnaspades.py -1 T26_1.trim.fq.gz -2 T26_2.trim.fq.gz -o rnaSpades_T26 rnaspades.py -1 T29_1.trim.fq.gz -2 T29_2.trim.fq.gz -o rnaSpades_T29 rnaspades.py -1 T31_1.trim.fq.gz -2 T31_2.trim.fq.gz -o rnaSpades_T31 rnaspades.py -1 T30_1.trim.fq.gz -2 T30_2.trim.fq.gz -o rnaSpades_T30 rnaspades.py -1 T32_1.trim.fq.gz -2 T32_2.trim.fq.gz -o rnaSpades_T32 rnaspades.py -1 T34_1.trim.fq.gz -2 T34_2.trim.fq.gz -o rnaSpades_T34 rnaspades.py -1 T36_1.trim.fq.gz -2 T36_2.trim.fq.gz -o rnaSpades_T36 rnaspades.py -1 T37_1.trim.fq.gz -2 T37_2.trim.fq.gz -o rnaSpades_T37 rnaspades.py -1 T38_1.trim.fq.gz -2 T38_2.trim.fq.gz -o rnaSpades_T38 rnaspades.py -1 T39_1.trim.fq.gz -2 T39_2.trim.fq.gz -o rnaSpades_T39 bash script_spades ``` ``` #assembling a transcriptome using all samples rnaspades.py --pe1-1 T26_1.trim.fq.gz --pe1-2 T26_2.trim.fq.gz --pe2-1 T29_1.trim.fq.gz --pe2-2 T29_2.trim.fq.gz --pe3-1 T31_1.trim.fq.gz --pe3-2 T31_2.trim.fq.gz --pe4-1 T4_1.trim.fq.gz --pe4-2 T4_2.trim.fq.gz --pe5-1 T7_1.trim.fq.gz --pe5-2 T7_2.trim.fq.gz -o Transcriptome_denovo ``` ```` #assembling a transcriptome including the new samples #An alternative way to specify an input data set for SPAdes is to create a YAML data set file. By using a YAML file you can provide an unlimited number of paired-end libraries. nano libraries.yaml [ { orientation: "fr", type: "paired-end", right reads: [ "/home/denis/results/Trim_Reads/T4_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T7_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T26_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T29_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T30_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T31_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T32_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T34_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T36_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T37_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T38_1.trim.fq.gz", "/home/denis/results/Trim_Reads/T39_1.trim.fq.gz" ], left reads: [ "/home/denis/results/Trim_Reads/T4_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T7_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T26_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T29_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T30_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T31_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T32_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T34_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T36_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T37_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T38_2.trim.fq.gz", "/home/denis/results/Trim_Reads/T39_2.trim.fq.gz" ] } ] rnaspades.py --dataset libraries.yaml -o Transcriptome_denovo2 ```` #### Doing the same in Trinity to compare ```` #At: /home/denis/results/Trim_Reads Trinity --max_memory 50G --seqType fq --left T4_1.trim.fq.gz,T7_1.trim.fq.gz,T26_1.trim.fq.gz,T29_1.trim.fq.gz,T30_1.trim.fq.gz,T31_1.trim.fq.gz,T32_1.trim.fq.gz,T34_1.trim.fq.gz,T36_1.trim.fq.gz,T37_1.trim.fq.gz,T38_1.trim.fq.gz,T39_1.trim.fq.gz --right T4_2.trim.fq.gz,T7_2.trim.fq.gz,T26_2.trim.fq.gz,T29_2.trim.fq.gz,T30_2.trim.fq.gz,T31_2.trim.fq.gz,T32_2.trim.fq.gz,T34_2.trim.fq.gz,T36_2.trim.fq.gz,T37_2.trim.fq.gz,T38_2.trim.fq.gz,T39_2.trim.fq.gz --CPU 15 --min_contig_length 199 --full_cleanup ```` ### 2.2 - Genome-Guided De novo transcriptome Assembly We will be using Trinity v2.1.1. If a genome sequence is available, Trinity offers a method whereby reads are first aligned to the genome, partitioned according to locus, followed by de novo transcriptome assembly at each locus. In this use-case, the genome is only being used as a substrate for grouping overlapping reads into clusters that will then be separately fed into Trinity for de novo transcriptome assembly. This is very much unlike typical genome-guided approaches (eg. cufflinks) where aligned reads are stitched into transcript structures and where transcript sequences are reconstructed based on the reference genome sequence. Here, transcripts are reconstructed based on the actual read sequences. Why do this? You may have a reference genome, but your sample likely comes from an organism with a genome that isn't an exact match to the reference genome. Genome-guided de novo assembly should capture the sequence variations contained in your RNA-Seq sample in the form of the transcripts that are de novo reconstructed. In comparison to genome-free de novo assembly, it can also help in cases where you have paralogs or other genes with shared sequences, since the genome is used to partition the reads according to locus prior to doing any de novo assembly. If you have a highly fragmented draft genome, then you are likely better off performing a genome-free de novo transcriptome assembly. Users must provide read alignments to Trinity as a coordinate-sorted bam file. Use GSNAP, TopHat, STAR or other favorite RNA-Seq read alignment tool to generate the bam file, and be sure it's coordinate sorted by running 'samtools sort' on it. We will be using STAR v2.7.10a. Manual: https://physiology.med.cornell.edu/faculty/skrabanek/lab/angsd/lecture_notes/STARmanual.pdf Aligning reads using STAR is a two step process: #### Generating genome indeces files ```` #creating a directory for the indices mkdir star_index STAR --runThreadN 8 --runMode genomeGenerate --genomeDir /home/denis/Ref_genome/star_index --genomeFastaFiles /home/denis/Ref_genome/GCA_003667255.1_E_heros_v1.0_sep_2018_genomic.fna ```` #### Mapping reads to the genome ```` mkdir STAR #At: /home/denis/results/Trim_Reads #Creating a loop to go through the entire directory and run each aligment in serial: for i in *_1.trim.fq.gz > do > STAR --genomeDir /home/denis/Ref_genome/star_index \ > --runThreadN 10 \ > --readFilesCommand zcat \ > --readFilesIn $i ${i%_1.trim.fq.gz}_2.trim.fq.gz \ > --outFileNamePrefix /home/denis/results/STAR/${i%_1.trim.fq.gz} \ > --outSAMtype BAM SortedByCoordinate \ > --outSAMunmapped Within #output unmapped reads within the main SAM file (i.e. Aligned.out.sam) > done #Mapping all the samples togheter STAR --genomeDir /home/denis/Ref_genome/star_index \ --runThreadN 10 \ --readFilesCommand zcat \ --readFilesIn T26_1.trim.fq.gz,T29_1.trim.fq.gz,T31_1.trim.fq.gz,T4_1.trim.fq.gz,T7_1.trim.fq.gz T26_2.trim.fq.gz,T29_2.trim.fq.gz,T31_2.trim.fq.gz,T4_2.trim.fq.gz,T7_2.trim.fq.gz \ --outFileNamePrefix /home/denis/results/STAR/All \ --outSAMtype BAM SortedByCoordinate \ --outSAMunmapped Within ```` #### Assessing Alignment Quality ```` #At: /home/denis/results/STAR multiqc . #Passando o arquivo html para o local scp -P 2205 denis@143.107.244.181:/home/denis/results/STAR/multiqc_report.html . ```` Results: ![](https://i.imgur.com/9nFeW5R.png) #### Running Trinity About Trinity https://github.com/trinityrnaseq/trinityrnaseq/wiki/Running-Trinity ```` Trinity --genome_guided_bam AllAligned.sortedByCoord.out.bam --max_memory 50G --genome_guided_max_intron 10000 --CPU 15 ```` ## 3 - Assessing transcriptome assembly quality with busco BUSCO (Benchmarking Universal Single-Copy Orthologs) provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs. Manual: https://busco.ezlab.org/busco_userguide.html#conda-package Creating a new environment with BUSCO installed ``` conda create -n teste -c conda-forge -c bioconda busco=5.3.1 # To activate this environment, use conda activate teste # To deactivate an active environment, use conda deactivate ``` Running BUSCO 5.3.1 for the transcriptome of each sample ``` # I am here now: ~/results/Trasncriptome_assembly #vamos usar como input o transcriptoma com maior númeor de contigs (soft filtered) nano script_busco busco -i rnaSpades_T4/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T4 -m transcriptome busco -i rnaSpades_T7/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T7 -m transcriptome busco -i rnaSpades_T26/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T26 -m transcriptome busco -i rnaSpades_T29/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T29 -m transcriptome busco -i rnaSpades_T31/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T31 -m transcriptome busco -i rnaSpades_T30/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T30 -m transcriptome busco -i rnaSpades_T32/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T32 -m transcriptome busco -i rnaSpades_T34/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T34 -m transcriptome busco -i rnaSpades_T36/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T36 -m transcriptome busco -i rnaSpades_T37/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T37 -m transcriptome busco -i rnaSpades_T38/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T38 -m transcriptome busco -i rnaSpades_T39/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_T39 -m transcriptome bash script_busco #tive que ativar o ambiente teste novamente dentro da screen ``` Now we use a python script (generate_plot.py) that builds R scripts for the graphs, this will build an .R file that may be directly run in R. ``` #first create a folder and then copy the BUSCO short summary file from each of the runs you want to plot into this folder. mkdir BUSCO_summaries cp short_summary.specific.hemiptera_odb10.busco_T4.txt ../BUSCO_summaries/ cp short_summary.specific.hemiptera_odb10.busco_T7.txt ../BUSCO_summaries/ cp short_summary.specific.hemiptera_odb10.busco_T26.txt ../BUSCO_summaries/ cp short_summary.specific.hemiptera_odb10.busco_T29.txt ../BUSCO_summaries/ cp short_summary.specific.hemiptera_odb10.busco_T31.txt ../BUSCO_summaries/ #Then simply run the script giving as argument the name of the folder you created containing the summaries you wish to plot. python3 /home/denis/anaconda3/envs/teste/bin/generate_plot.py -wd BUSCO_summaries #Passando os arquivos gerados para o local scp -P 2205 denis@143.107.244.181:/home/denis/results/Busco/BUSCO_summaries/busco_figure.png . scp -P 2205 denis@143.107.244.181:/home/denis/results/Busco/BUSCO_summaries/busco_figure.R . ``` results: ![](https://i.imgur.com/oYzlhWS.png) Running BUSCO 5.3.1 for the denovo and genome-guided transcriptomes, and for E. heros genome ``` #At: ~/results/Trasncriptome_assembly busco -i Transcriptome_denovo/soft_filtered_transcripts.fasta -l hemiptera_odb10 -o busco_denovo -m transcriptome busco -i Transcriptome_gguided/trinty_out_dir/Trinity-GG.fasta -l hemiptera_odb10 -o busco_gguided -m transcriptome #At: ~/Ref_genome busco -i GCA_003667255.1_E_heros_v1.0_sep_2018_genomic.fna -l hemiptera_odb10 -o busco_genome -m genome #At: ~/results/Busco python3 /home/denis/anaconda3/envs/teste/bin/generate_plot.py -wd BUSCO_summaries2 ``` results: ![](https://i.imgur.com/GlgT97G.png) ## 4 - Keeping only the longest isoforms many of contigs assembled are “repeated” versions of the same gene with little variation (i.e. in few nucleotides). Although these repetitions may be biologically significant (i.e. isoforms), probably most of them are assembly artifacts. With this step we remove those, keeping only the longest isoforms. ```` #Spades #Passando o script para o servidor scp -P 2205 ./get_longest_isoforms_rnaspades.sh denis@143.107.244.181:/home/denis/results/Trasncriptome_assembly/Transcriptome_denovo #Rodando o script bash ./get_longest_isoforms_rnaspades.sh hard_filtered_transcripts.fasta ```` ```` #Trinity perl ./get_longest_isoform_seq_per_trinity_gene.pl Trinity.fasta > Trinity.longest ```` ## 5 - Transcriptome Annotation Transcriptome annotation provides insight into the function and biological process of transcripts and the proteins they encode. ### 5.1 - Transcriptome Annotation with Trinotate Trinotate v3.2.2 is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. About Trinotate: https://github.com/Trinotate/Trinotate.github.io/blob/master/index.asciidoc #### Obtaining the protein database files by running this Trinotate build process. ```` /home/denis/anaconda3/bin/Build_Trinotate_Boilerplate_SQLite_db.pl Trinotate #Prepare the protein database for blast searches by: makeblastdb -in uniprot_sprot.pep -dbtype prot #Uncompress and prepare the Pfam database for use with hmmscan like so: gunzip Pfam-A.hmm.gz hmmpress Pfam-A.hmm ```` #### Extracting the longest open reading frames ```` TransDecoder.LongOrfs -t /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest ```` #### predicting the likely coding region ``` TransDecoder.Predict -t /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest --retain_pfam_hits /home/denis/results/HMMER/TrinotatePFAM.out --retain_blastp_hits /home/denis/results/Blast/Trinity_transcriptome/blastp.outfmt6 ``` The set of candidate coding regions can be found as files '.transdecoder.' where extensions include .pep, .cds, .gff3, and .bed #### Capturing BLAST Homologies ```` #Search Trinity transcripts blastx -query /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest -db /home/denis/results/Annotation/Trinotate/uniprot_sprot.pep -num_threads 15 -max_target_seqs 1 -outfmt 6 > blastx.outfmt6 #I got this warning: "[blastx] Examining 5 or more matches is recommended", but it does not stop the job. # search Transdecoder-predicted proteins blastp -query /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.pep -db /home/denis/results/Annotation/Trinotate/uniprot_sprot.pep -num_threads 15 -max_target_seqs 1 -outfmt 6 > blastp.outfmt6 ```` #### Running HMMER to identify protein domains ```` hmmscan --cpu 20 --domtblout TrinotatePFAM.out /home/denis/results/Annotation/Trinotate/Pfam-A.hmm /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.pep > pfam.log ```` #### Running signalP to predict signal peptides ```` signalp -f short -n signalp2.out /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.pep ```` #### Running tmHMM to predict transmembrane regions ```` tmhmm --short < /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.pep > tmhmm.out ```` #### Running RNAMMER to identify rRNA transcripts ```` RnammerTranscriptome.pl --transcriptome /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest --path_to_rnammer /home/denis/results/RNAmmer/Installation/rnammer ```` #### Loading Above Results into a Trinotate SQLite Database ```` #1. Load transcripts and coding regions /home/denis/anaconda3/bin/get_Trinity_gene_to_trans_map.pl /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest > gene_trans_map Trinotate Trinotate.sqlite2 init --gene_trans_map Trinity.fasta.gene_trans_map --transcript_fasta /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest --transdecoder_pep /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.pep ```` ```` #2. Loading BLAST homologies # load protein hits Trinotate Trinotate.sqlite LOAD_swissprot_blastp /home/denis/results/Blast/Trinity_transcriptome/blastp.outfmt6 # load transcript hits Trinotate Trinotate.sqlite LOAD_swissprot_blastx /home/denis/results/Blast/Trinity_transcriptome/blastx.outfmt6 ```` ```` #3. Load Pfam domain entries Trinotate Trinotate.sqlite LOAD_pfam /home/denis/results/HMMER/TrinotatePFAM.out ```` ```` #4. Load transmembrane domains Trinotate Trinotate.sqlite LOAD_tmhmm /home/denis/results/tmHMM/tmhmm.out ```` ```` #5. Load signal peptide predictions Trinotate Trinotate.sqlite LOAD_signalp /home/denis/results/signalP/signalp.out ```` #### Output an Annotation Report ```` Trinotate Trinotate.sqlite report > trinotate_annotation_report2.xls ```` ### 5.2 - Transcriptome Annotation with eggNOG mapper We will be using eggnog-mapper v2.1.7 which is a tool for fast functional annotation of novel sequences (genes or proteins) using precomputed eggNOG-based orthology assignments. To start an annotation job, provide a FASTA file containing your query sequences (-i option), specify a project name which will be used as a prefix for all the output files (-o option), and run emapper.py About eggnog-mapper: https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.7#Conda_bioconda_channel_version I installed version 2.1.7 using a conda environment. ```` conda create -n eggnog -c bioconda -c conda-forge eggnog-mapper conda activate eggnog conda deactivate ```` #### Running emapper.py ```` emapper.py --cpu 15 -i /home/denis/results/TransDecoder/Trinity.longest.transdecoder.cds -o E_heros --itype CDS --translate -m diamond --evalue 0.000001 --pident 50 --query_cover 70 --seed_ortholog_evalue 0.000001 --target_orthologs all --report_orthologs --go_evidence all --md5 --decorate_gff yes --excel python3 /home/pedro/Programs/eggnog-mapper/emapper.py --cpu 6 -i /home/pedro/Calliphoridae_pedro_gi/Cmac_ANNOT/Cmac_unnanotated_transcripts.fasta -o Cbez_annot_eggnog --data_dir /home/pedro/Programs/eggnog-mapper/data/ --itype CDS --translate -m diamond --evalue 0.000001 --pident 50 --query_cover 70 --seed_ortholog_evalue 0.000001 --target_orthologs all --report_orthologs --go_evidence all --md5 --decorate_gff yes ```` #### Doing the same for spades transcriptome ```` TransDecoder.LongOrfs -t /home/denis/results/Trasncriptome_assembly/Spades_denovo/Spades.longest.hf blastp -query /home/denis/results/TransDecoder/Spades_transcriptome/Spades.longest.hf.transdecoder_dir/longest_orfs.pep \ -db /home/denis/results/Annotation/Trinotate/uniprot_sprot.pep -max_target_seqs 1 \ -outfmt 6 -evalue 1e-5 -num_threads 15 > blastp.outfmt6 hmmscan --cpu 15 --domtblout pfam.domtblout /home/denis/results/Annotation/Trinotate/Pfam-A.hmm /home/denis/results/TransDecoder/Spades_transcriptome/Spades.longest.hf.transdecoder_dir/longest_orfs.pep TransDecoder.Predict -t /home/denis/results/Trasncriptome_assembly/Spades_denovo/Spades.longest.hf --retain_pfam_hits /home/denis/results/HMMER/pfam.domtblout --retain_blastp_hits /home/denis/results/Blast/Spades_transcriptome/blastp.outfmt6 emapper.py --cpu 15 -i /home/denis/results/TransDecoder/Spades_transcriptome/Spades.longest.hf.transdecoder.cds -o E_heros --itype CDS --translate -m diamond --evalue 0.000001 --pident 50 --query_cover 70 --seed_ortholog_evalue 0.000001 --target_orthologs all --report_orthologs --go_evidence all --md5 --decorate_gff yes --excel ```` ### 5.2 - Transcriptome Annotation with Diamond We will be using Diamond v0.9.21. Diamond is an almostdrop-in replacement for blastp and blastx, both due to its speed, and due to the fact that it mimics the BLAST command line function calls and output formats. The main drawback of the tool is that it can only operate with amino acid sequences as targets. However, it does accept both nucleotide and protein queries. Therefore, it is a great choice for performing protein versus protein (or translated nucleotide versus protein) searches while annotating de novo assembled transcriptomes. About: https://github.com/bbuchfink/diamond First I downloaded the protein sequences from *Halyomorpha halys* and made a database with them: ```` diamond makedb --in Hhal_2.0_protein.faa -d Hhal ```` Running a search in blastx mode: ```` diamond blastx -d Hhal.dmnd -q /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest -o matches.tsv --threads 20 ```` reporting only the alignments whose score is at most 10% lower than the best alignment score for a query: ```` diamond blastx -d Hhal.dmnd -q /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest -o matches2.tsv --top 10 --threads 20 ```` In both ways, 50911 queries were aligned. Para saber quantas das ~25.000 proteínas do proteoma de H. halys deram match com os inputs: ```` #Isolando a segunda coluna do arquivo matches2.tsv awk '{print $2}' matches2.tsv > contig_codes.txt #removendo os contigs repetidos sort contig_codes.txt | uniq -c > couts.txt #contando o número de linhas wc -l counst.txt ```` Resultado: 22.167 ## 6 - Measuring expression levels We will be using Kallisto v0.46.2. About: https://github.com/pachterlab/kallisto kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. #### Building an index Pseudoalignment requires processing a transcriptome file to create a “transcriptome index”. ```` kallisto index -i transcripts.idx /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest ```` #### Quantification Now you can quantify abundances of the transcripts using the two read files reads_1.fastq.gz and reads_2.fastq.gz (kallisto can read in either plain-text or gzipped read files). Important note: only supply one sample at a time to kallisto. The multiple FASTQ (pair) option is for users who have samples that span multiple FASTQ files. ```` nano script_kallisto kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T4 -b 100 /home/denis/results/Trim_Reads/T4_1.trim.fq.gz /home/denis/results/Trim_Reads/T4_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T7 -b 100 /home/denis/results/Trim_Reads/T7_1.trim.fq.gz /home/denis/results/Trim_Reads/T7_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T26 -b 100 /home/denis/results/Trim_Reads/T26_1.trim.fq.gz /home/denis/results/Trim_Reads/T26_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T29 -b 100 /home/denis/results/Trim_Reads/T29_1.trim.fq.gz /home/denis/results/Trim_Reads/T29_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T30 -b 100 /home/denis/results/Trim_Reads/T30_1.trim.fq.gz /home/denis/results/Trim_Reads/T30_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T31 -b 100 /home/denis/results/Trim_Reads/T31_1.trim.fq.gz /home/denis/results/Trim_Reads/T31_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T32 -b 100 /home/denis/results/Trim_Reads/T32_1.trim.fq.gz /home/denis/results/Trim_Reads/T32_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T34 -b 100 /home/denis/results/Trim_Reads/T34_1.trim.fq.gz /home/denis/results/Trim_Reads/T34_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T36 -b 100 /home/denis/results/Trim_Reads/T36_1.trim.fq.gz /home/denis/results/Trim_Reads/T36_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T37 -b 100 /home/denis/results/Trim_Reads/T37_1.trim.fq.gz /home/denis/results/Trim_Reads/T37_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T38 -b 100 /home/denis/results/Trim_Reads/T38_1.trim.fq.gz /home/denis/results/Trim_Reads/T38_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/T39 -b 100 /home/denis/results/Trim_Reads/T39_1.trim.fq.gz /home/denis/results/Trim_Reads/T39_2.trim.fq.gz -t 20 bash script_kallisto ```` #### Doing the same for spades transcriptome ```` kallisto index -i transcripts.idx /home/denis/results/Trasncriptome_assembly/Spades_denovo/Spades.longest.hf nano script_kallisto kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T4 -b 100 /home/denis/results/Trim_Reads/T4_1.trim.fq.gz /home/denis/results/Trim_Reads/T4_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T7 -b 100 /home/denis/results/Trim_Reads/T7_1.trim.fq.gz /home/denis/results/Trim_Reads/T7_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T26 -b 100 /home/denis/results/Trim_Reads/T26_1.trim.fq.gz /home/denis/results/Trim_Reads/T26_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T29 -b 100 /home/denis/results/Trim_Reads/T29_1.trim.fq.gz /home/denis/results/Trim_Reads/T29_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T30 -b 100 /home/denis/results/Trim_Reads/T30_1.trim.fq.gz /home/denis/results/Trim_Reads/T30_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T31 -b 100 /home/denis/results/Trim_Reads/T31_1.trim.fq.gz /home/denis/results/Trim_Reads/T31_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T32 -b 100 /home/denis/results/Trim_Reads/T32_1.trim.fq.gz /home/denis/results/Trim_Reads/T32_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T34 -b 100 /home/denis/results/Trim_Reads/T34_1.trim.fq.gz /home/denis/results/Trim_Reads/T34_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T36 -b 100 /home/denis/results/Trim_Reads/T36_1.trim.fq.gz /home/denis/results/Trim_Reads/T36_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T37 -b 100 /home/denis/results/Trim_Reads/T37_1.trim.fq.gz /home/denis/results/Trim_Reads/T37_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T38 -b 100 /home/denis/results/Trim_Reads/T38_1.trim.fq.gz /home/denis/results/Trim_Reads/T38_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Spades.transcriptome/T39 -b 100 /home/denis/results/Trim_Reads/T39_1.trim.fq.gz /home/denis/results/Trim_Reads/T39_2.trim.fq.gz -t 20 bash script_kallisto ```` #### Doing the same only for coding sequences ```` kallisto index -i transcripts.idx /home/denis/results/TransDecoder/Trinity_transcriptome/Trinity.longest.transdecoder.cds nano script_kallisto kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T4 -b 100 /home/denis/results/Trim_Reads/T4_1.trim.fq.gz /home/denis/results/Trim_Reads/T4_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T7 -b 100 /home/denis/results/Trim_Reads/T7_1.trim.fq.gz /home/denis/results/Trim_Reads/T7_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T26 -b 100 /home/denis/results/Trim_Reads/T26_1.trim.fq.gz /home/denis/results/Trim_Reads/T26_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T29 -b 100 /home/denis/results/Trim_Reads/T29_1.trim.fq.gz /home/denis/results/Trim_Reads/T29_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T30 -b 100 /home/denis/results/Trim_Reads/T30_1.trim.fq.gz /home/denis/results/Trim_Reads/T30_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T31 -b 100 /home/denis/results/Trim_Reads/T31_1.trim.fq.gz /home/denis/results/Trim_Reads/T31_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T32 -b 100 /home/denis/results/Trim_Reads/T32_1.trim.fq.gz /home/denis/results/Trim_Reads/T32_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T34 -b 100 /home/denis/results/Trim_Reads/T34_1.trim.fq.gz /home/denis/results/Trim_Reads/T34_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T36 -b 100 /home/denis/results/Trim_Reads/T36_1.trim.fq.gz /home/denis/results/Trim_Reads/T36_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T37 -b 100 /home/denis/results/Trim_Reads/T37_1.trim.fq.gz /home/denis/results/Trim_Reads/T37_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T38 -b 100 /home/denis/results/Trim_Reads/T38_1.trim.fq.gz /home/denis/results/Trim_Reads/T38_2.trim.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/cds/T39 -b 100 /home/denis/results/Trim_Reads/T39_1.trim.fq.gz /home/denis/results/Trim_Reads/T39_2.trim.fq.gz -t 20 bash script_kallisto ```` #### Doing the same only for M samples ```` nano script_kallisto kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Initial_males/T6M -b 100 /home/denis/raw_seqs/machos_e_femeas/T6M_1.fq.gz /home/denis/raw_seqs/machos_e_femeas/T6M_2.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Initial_males/T8M -b 100 /home/denis/raw_seqs/machos_e_femeas/T8M_1.fq.gz /home/denis/raw_seqs/machos_e_femeas/T8M_2.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Initial_males/T9M -b 100 /home/denis/raw_seqs/machos_e_femeas/T9M_1.fq.gz /home/denis/raw_seqs/machos_e_femeas/T9M_2.fq.gz -t 20 kallisto quant -i transcripts.idx -o /home/denis/results/Kallisto/Initial_males/T11M -b 100 /home/denis/raw_seqs/machos_e_femeas/T11M_1.fq.gz /home/denis/raw_seqs/machos_e_femeas/T11M_2.fq.gz -t 20 bash script_kallisto ```` ## 7 - Evaluating differentially expressed genes first lets keep only columns 1 and 4 from the outputs of Kallisto, these are (1) the contig code and (4) estimated read counts ```` #tail -2 list all the file starting from 2nd line #sort put it in alphabetical order using the first column #cut -f 1,4 keeps only columns 1 and 4 nano script_import tail -n +2 /home/denis/results/Kallisto/T4/abundance.tsv | cut -f 1,4 > T4_abundance.txt tail -n +2 /home/denis/results/Kallisto/T7/abundance.tsv | cut -f 1,4 > T7_abundance.txt tail -n +2 /home/denis/results/Kallisto/T26/abundance.tsv | cut -f 1,4 > T26_abundance.txt tail -n +2 /home/denis/results/Kallisto/T29/abundance.tsv | cut -f 1,4 > T29_abundance.txt tail -n +2 /home/denis/results/Kallisto/T30/abundance.tsv | cut -f 1,4 > T30_abundance.txt tail -n +2 /home/denis/results/Kallisto/T31/abundance.tsv | cut -f 1,4 > T31_abundance.txt tail -n +2 /home/denis/results/Kallisto/T32/abundance.tsv | cut -f 1,4 > T32_abundance.txt tail -n +2 /home/denis/results/Kallisto/T34/abundance.tsv | cut -f 1,4 > T34_abundance.txt tail -n +2 /home/denis/results/Kallisto/T36/abundance.tsv | cut -f 1,4 > T36_abundance.txt tail -n +2 /home/denis/results/Kallisto/T37/abundance.tsv | cut -f 1,4 > T37_abundance.txt tail -n +2 /home/denis/results/Kallisto/T38/abundance.tsv | cut -f 1,4 > T38_abundance.txt tail -n +2 /home/denis/results/Kallisto/T39/abundance.tsv | cut -f 1,4 > T39_abundance.txt bash script_import ```` ```` paste T4_abundance.txt T7_abundance.txt T26_abundance.txt T29_abundance.txt T30_abundance.txt T31_abundance.txt T32_abundance.txt T34_abundance.txt T36_abundance.txt T37_abundance.txt T38_abundance.txt T39_abundance.txt > all_samples_expression.txt ```` Lets try using columns 1 and 5 from the outputs of Kallisto, these are (1) the contig code and (5) Transcripts Per Million (TPM) ```` nano script_import2 tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T4/abundance.tsv | cut -f 1,5 > T4_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T7/abundance.tsv | cut -f 1,5 > T7_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T26/abundance.tsv | cut -f 1,5 > T26_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T29/abundance.tsv | cut -f 1,5 > T29_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T30/abundance.tsv | cut -f 1,5 > T30_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T31/abundance.tsv | cut -f 1,5 > T31_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T32/abundance.tsv | cut -f 1,5 > T32_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T34/abundance.tsv | cut -f 1,5 > T34_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T36/abundance.tsv | cut -f 1,5 > T36_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T37/abundance.tsv | cut -f 1,5 > T37_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T38/abundance.tsv | cut -f 1,5 > T38_abundance.txt tail -n +2 /home/denis/results/Kallisto/Trinity.transcriptome/T39/abundance.tsv | cut -f 1,5 > T39_abundance.txt bash script_import2 ```` ```` paste T4_abundance.txt T7_abundance.txt T26_abundance.txt T29_abundance.txt T30_abundance.txt T31_abundance.txt T32_abundance.txt T34_abundance.txt T36_abundance.txt T37_abundance.txt T38_abundance.txt T39_abundance.txt > all_samples_expression2.txt ```` M samples ```` nano script_import tail -n +2 /home/denis/results/Kallisto/Initial_males/T6M/abundance.tsv | cut -f 1,4 > T6M_abundance.txt tail -n +2 /home/denis/results/Kallisto/Initial_males/T8M/abundance.tsv | cut -f 1,4 > T8M_abundance.txt tail -n +2 /home/denis/results/Kallisto/Initial_males/T9M/abundance.tsv | cut -f 1,4 > T9M_abundance.txt tail -n +2 /home/denis/results/Kallisto/Initial_males/T11M/abundance.tsv | cut -f 1,4 > T11M_abundance.txt bash script_import ```` ```` paste T6M_abundance.txt T8M_abundance.txt T9M_abundance.txt T11M_abundance.txt > M_samples_expression.txt ````` ## TransPi https://github.com/PalMuc/TransPi ```` nextflow run TransPi.nf --all --reads '/home/denis/raw_seqs/plasticidade/*_[1,2].fq.gz' \ --k 25,35,55,75,85 --maxReadLen 150 -profile conda nextflow run TransPi.nf --onlyAsm --reads '/home/denis/results/Trim_Reads/*_[1,2].trim.fq.gz' \ --k 25,35,55,75,85 --maxReadLen 150 -profile conda ```` ## rnaQUAST https://github.com/ablab/rnaquast ```` python ./rnaQUAST.py --transcripts /home/denis/results/Trasncriptome_assembly/Trinity_denovo/Trinity.longest /home/denis/results/Trasncriptome_assembly/Spades_denovo/hard_filtered_transcripts_long_isof.fa --busco hemiptera_odb10 --gene_mark ````

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully