# *Dickeya* and *Pectobacterium* (plant pathogens) vs. *Sodalis* and *Blochmannia* (insect symbionts) Project Packet

The goal of our project is to understand the genomic factors that differentiate bacterial plant pathogens from insect symbionts. More specifically, we are interested in the genes, and categories of genes, responsible for the differences between the pathogens *Dickeya* and *Pectobacterium* and the symbionts *Blochmannia* and *Sodalis*. These bacteria share a common ancestor and a similar pattern of evolution, which enables clearer identification of the factors that may differentiate symbionts from pathogens.

The first pathogenic genus, *Dickeya*, is a soft rot pathogen that breaks down succulent, fleshy plant parts such as potatoes, roots, tubers, stem cuttings, and leaves. [8] The second pathogenic genus, *Pectobacterium*, synthesizes and secretes large amounts of exoenzymes that break down plant cell walls, such as cellulases, proteases, polygalacturonases, and pectate lyases. These enzymes cause disease in a variety of plant species. [9] Both of these pathogens are thought to have evolved from a common symbiotic ancestor. [4]

The first symbiont genus we are exploring, *Sodalis*, has a broad range of symbiotic and free-living species, including a common species that serves as a tsetse fly endosymbiont. This range of lifestyles should be reflected in the genetic diversity of the genus, and by including the free-living species *S. ligni* we aim to capture that diversity. The other symbiont genus we examine is *Blochmannia*, which is commonly present in carpenter ants, where it helps the ant process nitrogen. [11]

We chose 16 species in total from *Sodalis*, *Blochmannia*, *Dickeya*, and *Pectobacterium*, plus 1 outgroup for comparison:

```
Sodalis glossinidius
Sodalis praecaptivus
Sodalis pierantonius
Sodalis ligni (*free living)
Blochmannia floridanus
Blochmannia pennsylvanicus
Blochmannia herculeanus
Blochmannia vicinus
Pectobacterium carotovorum
Pectobacterium atrosepticum
Pectobacterium wasabiae
Pectobacterium aquaticum
Dickeya dadantii Ech703
Dickeya dadantii Ech586
Dickeya zeae
Dickeya dianthicola
Pseudomonas aeruginosa (Outgroup)
```

The FASTA files, and all subsequent test results, for these species can be found in the project folder, which you can reach with the following command:

`cd /courses/bi278/tsbroa25/project`

Once there, use the `ls` command to see the available folders and files.

Note: The genome files and rpsblast results can be found within each species folder in the genomes folder.

A review by Czajkowski et al. on the detection, identification, and differentiation of *Pectobacterium* and *Dickeya* species causing potato blackleg and tuber soft rot identified *pel* and *hrp* as key genes involved in virulence in *Pectobacterium* and *Dickeya*. [2] Zientz et al. identified *ureF* as a key gene involved in symbiotic behavior in bacteria. [11] Given these findings, we want to explore how often these genes are observed in our chosen taxa to determine whether they could be key markers that differentiate symbiotic from pathogenic bacteria.
## Gathering Reference Genomes

1. Use `ssh username@bi278` to access the bi278 server.
2. Set the working directory: `cd /courses/bi278/tsbroa25/project`
3. Download the data using the following command, making sure the GCF accession corresponds to the genome of interest: `datasets download genome accession GCF_000091565.1 --include protein,genome --filename GCF_000091565.1.zip`
4. Unzip the file using the following command: `unzip GCF_000091565.1.zip`. Unzipping the file results in a folder with the `GCF_*********.1` naming convention that contains a .fna file and a .faa file.
5. Rename all folders using taxa names (e.g. Blochmannia_floridanus) and place them into /courses/bi278/tsbroa25/project/genomes
6. Copy all .faa files to /courses/bi278/tsbroa25/GenomesFaa and rename them with taxa names (e.g. Blochmannia_floridanus.faa)
7. Copy all .fna files to /courses/bi278/tsbroa25/genomic.fna_files and rename them with taxa names (e.g. Blochmannia_floridanus_genomic.fna)

## Using Shell Scripts to determine GC% for reference genomes

Closely related genomes should have similar GC content and genome size, so using the shell script written in Lab 3 to determine GC content helps us weed out any abnormal genomic data in our dataset. The script we wrote in Lab03 is a GC content counter that takes in a FASTA file and returns its GC content. To write the script, we opened an editor in the Unix shell using

```
nano GCContentCounter.sh
```

and then wrote the following:

```
# count the G and C bases, skipping FASTA header lines
grep -v ">" /courses/bi278/tsbroa25/project/genomic.fna_files/$1 | tr -d -c GCgc | wc -c
# count all A, T, G and C bases, skipping FASTA header lines
grep -v ">" /courses/bi278/tsbroa25/project/genomic.fna_files/$1 | tr -d -c ATGCatgc | wc -c
```

The script looks for the FASTA file named by its first argument (`$1`) in the genomic.fna_files folder. It uses `grep` to drop the header lines, `tr` to keep only the bases of interest, and `wc -c` to count them: first the number of G and C bases, then the total number of bases (A, C, T, G). The script returns these 2 values, and we divide the GC count by the total count to get the GC content.
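The division can also be done in the same step as the counting. The following is a minimal sketch, not the script we ran: the function name `gc_fraction` and the demo file are hypothetical, and it assumes a plain nucleotide FASTA file in which only header lines contain `>`.

```shell
#!/bin/sh
# Sketch: print the GC fraction of a FASTA file as a single number.
# gc_fraction is a hypothetical helper, not part of GCContentCounter.sh.
gc_fraction() {
    # Count G/C bases and all A/T/G/C bases, skipping header lines.
    gc=$(grep -v ">" "$1" | tr -d -c GCgc | wc -c)
    total=$(grep -v ">" "$1" | tr -d -c ATGCatgc | wc -c)
    # Divide with awk because the shell only does integer arithmetic.
    awk -v gc="$gc" -v total="$total" 'BEGIN {
        if (total == 0) { print "NA"; exit }
        printf "%.3f\n", gc / total
    }'
}

# Tiny made-up example: 4 of the 8 bases are G or C.
demo=$(mktemp)
printf '>demo\nATGC\nGGAT\n' > "$demo"
gc_fraction "$demo"
rm -f "$demo"
```

Unlike the lab script, this sketch takes the full path to the FASTA file rather than a file name inside genomic.fna_files. As a check on the arithmetic, 193204 GC bases out of 705557 total bases gives 0.274.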
```
sh FILENAME.sh FASTAFILE
```

runs a shell script on the target FASTAFILE, e.g.

```
sh GCContentCounter.sh Blochmannia_floridanus_genomic.fna
```

This returns 2 values: the number of GC bases in the genome and the total number of (ACTG) bases in the genome. For this specific genome, the output looks like:

```
193204
705557
```

Running this shell script on all of our genomes, we were able to fill out the following table for genome size and GC content:

| Organism | Genome size (bp) | GC content (fraction) |
| -------- | -------- | ----------- |
| *Sodalis glossinidius* | 4292502 | 0.545 |
| *Sodalis praecaptivus* | 5159425 | 0.571 |
| *Sodalis pierantonius* | 4513140 | 0.561 |
| *Sodalis ligni* | 6384591 | 0.550 |
| *Blochmannia floridanus* | 705557 | 0.274 |
| *Blochmannia pennsylvanicus* | 791654 | 0.296 |
| *Blochmannia herculeanus* | 790899 | 0.297 |
| *Blochmannia vicinus* | 778501 | 0.289 |
| *Pectobacterium carotovorum* | 4892225 | 0.520 |
| *Pectobacterium atrosepticum* | 5024241 | 0.511 |
| *Pectobacterium wasabiae* | 5043228 | 0.506 |
| *Pectobacterium aquaticum* | 4464425 | 0.512 |
| *Dickeya dadantii Ech703* | 4679450 | 0.550 |
| *Dickeya dadantii Ech586* | 4818394 | 0.536 |
| *Dickeya zeae* | 4740052 | 0.534 |
| *Dickeya dianthicola* | 4909058 | 0.557 |
| *Pseudomonas aeruginosa* (Outgroup) | 6264404 | 0.666 |

Genome size and GC content are consistent within each genus, so we feel confident in the quality of the genomes we have selected.

### Downloading the Sequence of the Potential Ancestral Genes

We need the sequences of our benchmark genes, *pel*, *ureF*, and *hrp*, as well as other potential ancestral genes, to run the analyses that help us understand the differentiating factors between pathogens and symbionts. Open the linked NCBI databases for one of the benchmark papers (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4320782/) and select Taxonomy. Choose "protein" under "*Dickeya solani*". Then search for each potential ancestral gene by name in the search box.
Download one of the matching FASTA files as plain text and save it in the "query_sequences" folder inside the "significant_gene_queries" folder. Gene sequences can also be found by going to https://www.ncbi.nlm.nih.gov and searching *gene name* *species name* in the "Protein" database.

![Screenshot 2023-12-02 at 2.25.08 PM](https://hackmd.io/_uploads/BJylBbFHT.png)

![Screenshot 2023-12-02 at 2.25.23 PM](https://hackmd.io/_uploads/HyY4HWtr6.png)

![Screenshot 2023-12-02 at 2.25.32 PM](https://hackmd.io/_uploads/BkhNBWtrp.png)

FASTA sequences can be copied and pasted into the terminal to create a .txt file, e.g.:

```
echo "WLGNKWEIPLKMLVMCYGYIVVEGLITAGLKLVPFGQCSAQKLLMAAFKWFSIAWRHADLLTDEELGGSFPLQSIASSCHEEQKFRLFRS" > ureF.txt
```

All .txt protein sequences are saved to /courses/bi278/tsbroa25/project/significant_gene_queries/query_sequences

## Make a protein database of reference genomes and run Blastp

Now that we have our genes of interest and reference genomes, we need to build a database that we can search with our genes of interest to check for their presence in each genus. Make a protein database for each species' genome:

```
makeblastdb -in /courses/bi278/tsbroa25/project/GenomesFaa/Sodalis_praecaptivus.faa -dbtype prot -title S_praecaptivus_prot -out blastdb/S_praecaptivus_prot/S_praecaptivus_prot -parse_seqids
```

This command should put all protein database files for this specific genome in the "blastdb" folder.
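Because the same command is repeated for every genome, it can be wrapped in a loop. The sketch below is one possible way to do that, not the commands we actually ran: the helper names `db_name` and `make_all_dbs` and the `DRY_RUN` switch are hypothetical, and the database name is derived from the file name by abbreviating the genus to its first letter (so `Sodalis_praecaptivus.faa` becomes `S_praecaptivus_prot`). Names derived this way may not match hand-chosen ones such as `D_dadantii_prot`, so check them before using them elsewhere.

```shell
#!/bin/sh
# Sketch: build one protein BLAST database per .faa file in a folder.
# db_name, make_all_dbs and DRY_RUN are hypothetical names for illustration.

# Genus_species.faa -> G_species_prot
db_name() {
    base=$(basename "$1" .faa)
    genus=${base%%_*}
    rest=${base#*_}
    printf '%s_%s_prot\n' "$(printf '%.1s' "$genus")" "$rest"
}

# Usage: make_all_dbs FAA_DIR OUT_DIR
make_all_dbs() {
    for faa in "$1"/*.faa; do
        [ -e "$faa" ] || continue   # skip if the folder has no .faa files
        name=$(db_name "$faa")
        if [ -n "$DRY_RUN" ]; then
            # Print the command instead of running it.
            echo "makeblastdb -in $faa -dbtype prot -title $name -out $2/$name/$name -parse_seqids"
        else
            makeblastdb -in "$faa" -dbtype prot -title "$name" \
                -out "$2/$name/$name" -parse_seqids
        fi
    done
}

# Preview the commands without running makeblastdb:
DRY_RUN=1 make_all_dbs /courses/bi278/tsbroa25/project/GenomesFaa blastdb
```

Running with `DRY_RUN=1` first makes it easy to confirm the derived database names before any files are written.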
Run this command for each species' genome. We then combine the individual protein databases into one large database using blastdb_aliastool.

Check the current value of the environment variable `$BLASTDB`:

```
echo $BLASTDB
```

Tell BLAST where the databases that we want in our environment are located:

```
export BLASTDB="$BLASTDB:/courses/bi278/tsbroa25/project/blastdb"
```

Use blastdb_aliastool to combine the protein databases:

```
blastdb_aliastool -dbtype prot -title blastdb2 -out big_blast_db -dblist "B_floridanus_prot B_herculeanus_prot B_pennsylvanicus_prot B_vicinus_prot D_dadantii_Ech586_prot D_dadantii_prot D_dianthicola_prot D_zeae_prot P_aquaticum_prot P_atrosepticum_prot P_carotovorum_prot P_wasabiae_prot P_aeruginosa_prot S_glossinidius_prot S_ligni_prot S_pierantonius_prot S_praecaptivus_prot"
```

Run blastp with each of the significant genes that we want to analyze and save the results to .txt files, e.g.:

```
blastp -task blastp-fast -num_threads 2 -db big_blast_db -evalue 1e-6 -query nitrogenaseReductase.txt -out nitrognase_reductase_BLAST_results.txt
```

Do the same with format 6 blastp results (the output file name is the same as above, so run this in a separate folder to avoid overwriting the first result):

```
blastp -task blastp-fast -num_threads 2 -db big_blast_db -evalue 1e-6 -query nitrogenaseReductase.txt -outfmt "6 std ppos qcovs stitle sscinames staxid" -out nitrognase_reductase_BLAST_results.txt
```

BLAST format 6 results are located in /courses/bi278/tsbroa25/project/significant_gene_queries/BLAST_format6_results

## BLAST Results

Running blastp with our genes of interest against our protein database, we found the following results and alignments among our species. For all of the BLAST queries we used an e-value threshold of 1e-6, so each of our alignments is at least that significant.

Genes of Study:

#### Virulence factors:

#### *hrp*:

*hrp* genes allow plant pathogenic bacteria to elicit a hypersensitive response (HR) in resistant plants and cause disease in susceptible plants.
Many of these *hrp* proteins are highly similar to proteins involved in the type III secretion apparatus and flagellar assembly in animal pathogens. [6] Specific *hrp* genes involved in virulence in *Dickeya* include *hrpA-hrpE*, which are responsible for a type III secretion system, and *hrpN* and *hrpW*, which are responsible for the harpins secreted by the type III secretion system. Harpins are proteins targeted to the extracellular space of plant tissues. [8] *hrpL* in *Pectobacterium* may contribute specifically to the severity of soft rot in the plants it affects, and it is also responsible for type III secretion systems in this taxon. [7]

Running the *hrp* query against our database (containing the genomes of all taxa of study), we found *hrp* genes only in our pathogenic taxa, *Dickeya* and *Pectobacterium*. This indicates that the gene is only significantly present in the pathogenic bacteria in our project and is likely a good indicator of pathogenicity.

| Taxa | Protein Identified | E-value |
| -------- | -------- | ----------- |
| *Dickeya zeae* | hypothetical protein | 3.6E-28 |
| *Dickeya* | MULTISPECIES: hypothetical protein | 2.2E-27 |
| *Dickeya dianthicola* | hypothetical protein | 2.5E-26 |
| *Pectobacteriaceae* | MULTISPECIES: Hrp pili protein Hrp | 2.52E-17 |
| *Pectobacteriaceae* | MULTISPECIES: Hrp pili protein HrpA | 2.52E-17 |

#### *pel*:

The *pel* locus in pathogenic bacteria contains seven genes encoding functions involved in the synthesis and export of polysaccharides used in biofilm production. [1] In pathogenic and virulent bacteria, biofilms are part of bacterial survival mechanisms. The bacteria build a matrix consisting of proteins, polysaccharides, and eDNA that protects them and lets them employ survival strategies to avoid the defense systems of their host. Bacteria in a biofilm matrix that stay dormant and hidden from the host's immune system can cause local tissue damage and acute infection.
[10] Running this gene against our database, we found closely related serralysin family metalloproteases solely in our pathogenic bacteria.

| Taxa | Protein Identified | E-value |
| -------- | -------- | ----------- |
| *Dickeya* | MULTISPECIES: serralysin family metalloprotease | 0 |
| *Dickeya zeae* | serralysin family metalloprotease | 0 |
| *Dickeya zeae* | serralysin family metalloprotease | 0 |
| *Dickeya* | MULTISPECIES: serralysin family metalloprotease | 0 |
| *Pectobacterium* | MULTISPECIES: serralysin family metalloprotease | 0 |
| *Pectobacterium wasabiae* | serralysin family metalloprotease | 0 |
| *Dickeya dianthicola* | serralysin family metalloprotease | 0 |
| *Pectobacterium aquaticum* | serralysin family metalloprotease | 0 |
| *Dickeya zeae* | serralysin family metalloprotease | 0 |
| *Dickeya dianthicola* | serralysin family metalloprotease | 0 |
| *Dickeya parazeae* | serralysin family metalloprotease | 0 |
| *Dickeya dianthicola* | serralysin family metalloprotease | 0 |
| *Dickeya zeae* | serralysin family metalloprotease | 0 |
| *Dickeya parazeae* | serralysin family metalloprotease | 0 |
| *Pectobacterium atrosepticum* | serralysin family metalloprotease | 0 |
| *Dickeya dianthicola* | hypothetical protein | 2.08E-134 |

#### Symbiont factors:

#### *ureF*:

*ureF* is an accessory factor needed for nickel incorporation into the active site of the enzyme urease. Because urease requires nickel for activity, *ureF* can act as a regulatory factor for urease activity. Urease activity results in the production of CO2 and NH3, a potent cell toxin that damages membrane potential and ion transport. As found by Zientz et al., it is a key factor in symbiosis in certain *Blochmannia* species.
[11]

| Taxa | Protein Identified | E-value |
| -------- | -------- | ----------- |
| *Candidatus Sodalis pierantonius* | urease accessory UreF family protein | 1.82E-67 |
| *Candidatus Blochmannia vicinus* | urease accessory UreF family protein | 1.99E-67 |
| *Candidatus Blochmannia herculeanus* | urease accessory UreF family protein | 1.19E-65 |
| *Candidatus Blochmannia floridanus* | urease accessory UreF family protein | 2E-64 |
| *Sodalis praecaptivus* | urease accessory UreF family protein | 4.03E-64 |
| *Candidatus Blochmannia pennsylvanicus* | urease accessory UreF family protein | 4.32E-62 |
| *Sodalis ligni* | urease accessory UreF family protein | 3.81E-56 |

Running the *ureF* gene against our protein database, we found evidence of *ureF* solely in *Sodalis* and *Blochmannia* taxa, indicating that it is likely associated with symbiotic behavior and could be used as an indicator of symbionts.

## Running Orthofinder

We ran orthofinder to find homologous groups of genes and compare the gene differences between plant pathogens and insect symbionts. By cross-examining these results with our BLASTp results, we can strengthen and verify our findings.

working directory: `/courses/bi278/tsbroa25/project/GenomesFaa`

1. First we collected all predicted protein sequences (PROKKA*.faa equivalent) from the genomes into a directory (the GenomesFaa folder) and ran orthofinder within that folder:

```
orthofinder -f ./ -X
```

*Notes: orthofinder results are located in the GenomesFaa folder with the name "Ortho_Results_Nov09"

2. Look at the overall statistics

```
cat Ortho_Results_Nov09/Comparative_Genomics_Statistics/Statistics_Overall.tsv
```

It shows a summary of the number of genes in orthogroups, the number of unassigned genes, the percentage of genes in orthogroups, etc.
Part of the output:

```
Number of species                                      17
Number of genes                                        57798
Number of genes in orthogroups                         53780
Number of unassigned genes                             4018
Percentage of genes in orthogroups                     93.0
Percentage of unassigned genes                         7.0
Number of orthogroups                                  6413
Number of species-specific orthogroups                 375
Number of genes in species-specific orthogroups        1155
Percentage of genes in species-specific orthogroups    2.0
Mean orthogroup size                                   8.4
Median orthogroup size                                 6.0
G50 (assigned genes)                                   12
G50 (all genes)                                        12
O50 (assigned genes)                                   1529
O50 (all genes)                                        1697
Number of orthogroups with all species present         456
Number of single-copy orthogroups                      356
```

3. Check the description of genes

```
annotate_orthogroups --orthogroups_tsv Phylogenetic_Hierarchical_Orthogroups/N0.tsv --hog True --fasta_dir ./ --file_endings faa --out N0.simple_annotation.tsv --simple True
```

*Notes: "N0.simple_annotation.tsv" in "GenomesFaa" shows the description of genes

4. Check out the Phylogenetic_Hierarchical_Orthogroups file

```
cat Ortho_Results_Nov09/Phylogenetic_Hierarchical_Orthogroups/N0.tsv
# import "N0.tsv" into a google sheet for analysis
```

Link to the Google sheet: https://docs.google.com/spreadsheets/d/1eDsu4LhBe5eoKXhQmZVuRAJ9Q0AFPkUOPjIGyyPnLig/edit?usp=sharing

5. Shared orthogroups

By creating a filter and analyzing the results in the Google sheet, we found that 116 orthogroups are shared among the pathogens and 1 orthogroup is shared among the symbionts. Overall, 456 orthogroups are shared by all the species.

6. Unique gene analysis

By creating a filter in the Google sheet, we found that the one gene unique to the insect symbionts (*Sodalis* and *Blochmannia*) is **N0.HOG0002692**, which is described as a "*MarR* family transcriptional regulator". *MarR* transcription factors enable bacterial responses to chemical signals, affecting gene activity, regulating functions like organic compound degradation, and controlling virulence gene expression.
[3] This could explain why insect symbionts (*Sodalis* and *Blochmannia*) lack virulence genes but contain regulatory functional genes, such as the nitrogen fixation gene, which can assist the host with nitrogen processing.

## Running RPSBlast for COG Recognition

We used our orthofinder and BLAST results to guide our COG analysis. Starting from the genes we found in the previous two steps, we analyzed their corresponding COG categories to see whether the differences between our symbionts and pathogens stretch beyond those specific genes and are observed across an entire cluster of genes. This should enable us to verify our orthofinder and BLAST results and give a better idea of the larger trends differentiating the pathogens and symbionts.

Running RPSBlast:

Note: RPS Blast was conducted within each species folder found in the following directory

`cd /courses/bi278/tsbroa25/project/genomes`

1. Pull in the database:

```
export BLASTDB="/courses/bi278/Course_Materials/blastdb"
echo $BLASTDB
```

2. Run the BLAST on the query sequence

```
rpsblast -query *.faa -db Cog -evalue 0.01 -max_target_seqs 25 -outfmt "6 qseqid sseqid evalue qcovs stitle" > FILE.rpsblast.table
```

3. Keep only the most significant COG for each protein

`awk -F'[\t,]' '!x[$1]++ && $4>=70 {print $1,$5}' OFS="\t" *.rpsblast.table > FILE.cog.table`

4. Assign a functional category to the COGs

`awk -F "\t" 'NR==FNR {a[$1]=$2;next} {if ($2 in a){print $1, $2, a[$2]} else {print $0}}' OFS="\t" /courses/bi278/Course_Materials/blastdb/cognames2003-2014.tab *.cog.table > temp`

5. Keep the first category

`awk -F "\t" '{if ( length($3)>1 ) { $3 = substr($3, 1, 1) } else { $3 = $3 }; print}' OFS="\t" temp > temp2`

6. Retrieve the full description

`awk -F "\t" 'NR==FNR {a[$1]=$2;next} {if ($3 in a){print $0, a[$3]} else {print $0}}' OFS="\t" /courses/bi278/Course_Materials/blastdb/fun2003-2014.tab temp2 > FILE.cog.categorized`

7. Remove the temporary files

`rm temp temp2`

### Excel Analysis of COG Results

After receiving the results from rpsblast, the files can be copied to Excel to be analyzed more easily. When the files are first pasted in, Excel will not recognize them as a table. Use the text-to-columns function under the Data tab to break each row into separate columns for the gene ID, COG category, and COG summary. After doing this, you can count the number of genes in each COG category using a count-if function:

`=countif(RANGE, CRITERIA)`

Once we have the number of genes in each COG category for each species, we divide that number by the total number of categorized genes to obtain percent compositions. These percent compositions are then compared to build the figures found in the results section.

Link to Excel Sheet With the Results: https://www.icloud.com/iclouddrive/0f0lXjm5Q_JgjmthEfVIyoSZA#Genomics_Project_COG_Categorizations

#### Results from Cog Recognition

### Figure 1.

![Screenshot 2023-12-01 at 5.21.40 PM](https://hackmd.io/_uploads/BJSYnCvST.png)

**Figure 1.** Average percent composition of selected COG categories within the pathogenic and symbiotic species of interest.

**A.** Percent composition of categorized genes within the secondary metabolites biosynthesis, transport, and catabolism COGs. The virulence gene *pel* corresponds with this classification, so the higher percent composition seen in *Dickeya* and *Pectobacterium* is expected.

**B.** Percent composition of genes within the intracellular trafficking, secretion, and vesicular transport COGs. The virulence *hrp* gene corresponds with this classification of COG.
However, the percent compositions of genes within these COGs were relatively constant across all our genera.

**C.** Percent composition of all genes that are categorized within a transcription COG. The transcription COG classification was chosen for analysis because the *MarR* regulator found during orthofinder analysis falls within this classification.

**D.** Percent composition of genes within the translation, ribosomal structure, and biogenesis COGs. The *ureF* gene found using BlastP as unique to the symbionts falls within this COG classification.

By analyzing the COG classifications corresponding to our genes of interest, we saw a clear difference between the pathogens and symbionts only within the secondary metabolites biosynthesis, transport, and catabolism classification. Thus the differences between the genomes of our pathogenic and symbiotic genera may not be as broad and drastic as previously thought. Interestingly, the *Sodalis* genus tends to have percent compositions more similar to the pathogens than to *Blochmannia*, the other symbiont. This could be due to *Blochmannia*'s small genome size. A smaller genome has fewer genes overall but still must retain certain genes, such as many housekeeping genes, to sustain life, so its percent compositions may differ drastically from those of large genomes.

## Conclusion

The virulence genes *hrp* and *pel* were found only in the *Pectobacterium* and *Dickeya* taxa, indicating their involvement in pathogenicity through biofilms and harpin secretions that stimulate disease in targeted plants. Conversely, the presence of *ureF* genes only in the symbiotic taxa (*Sodalis*, *Blochmannia*) indicates that the urease complex may be involved in symbiotic behavior among insect symbionts.
The one orthogroup that the insect symbionts share in our orthofinder results, the "*MarR* family transcriptional regulator", can regulate the expression of virulence genes, so this result was to be expected. Furthermore, according to the orthofinder data, the pathogens share 116 orthogroups, but the insect symbionts share only one orthogroup. The percent compositions determined through COG analysis further reinforce this finding, with *Sodalis* having drastically different percent compositions than *Blochmannia*. This suggests that plant pathogenic bacteria may be required to share more common genes to properly function.

## References

[1] Colvin KM, Irie Y, Tart CS, Urbano R, Whitney JC, Ryder C, Howell PL, Wozniak DJ, Parsek MR. The Pel and Psl polysaccharides provide Pseudomonas aeruginosa structural redundancy within the biofilm matrix. Environ Microbiol. 2012 Aug;14(8):1913-28. doi: 10.1111/j.1462-2920.2011.02657.x. Epub 2011 Dec 19. PMID: 22176658; PMCID: PMC3840794.

[2] Czajkowski R, Pérombelon M, Jafra S, Lojkowska E, Potrykus M, van der Wolf J, Sledz W. Detection, identification and differentiation of Pectobacterium and Dickeya species causing potato blackleg and tuber soft rot: a review. Ann Appl Biol. 2015 Jan;166(1):18-38. doi: 10.1111/aab.12166. Epub 2014 Oct 27. PMID: 25684775; PMCID: PMC4320782.

[3] Deochand, D. K., & Grove, A. (2017). MarR family transcription factors: dynamic variations on a common scaffold. Critical reviews in biochemistry and molecular biology, 52(6), 595–613. https://doi.org/10.1080/10409238.2017.1344612

[4] Husník, F., Chrudimský, T. & Hypša, V. Multiple origins of endosymbiosis within the Enterobacteriaceae (γ-Proteobacteria): convergence of complex phylogenetic approaches. BMC Biol 9, 87 (2011). https://doi.org/10.1186/1741-7007-9-87

[5] Jay, Z. J., & Inskeep, W. P. (2015). The distribution, diversity, and importance of 16S rRNA gene introns in the order Thermoproteales. Biology direct, 10, 35. https://doi.org/10.1186/s13062-015-0065-6

[6] Lindgren P. B. (1997). The role of hrp genes during plant-bacterial interactions. Annual review of phytopathology, 35, 129–152. https://doi.org/10.1146/annurev.phyto.35.1.129

[7] Nam, Hyo-Song & Park, Ju-Yeon & Kang, Beom-Ryong & Lee, Sung-Hee & Cha, Jae-Soon & Kim, Young. (2011). Alternative Sigma Factor HrpL of Pectobacterium carotovorum 35 is Important for the Development of Soft-rot Symptoms. Research in Plant Disease. 17. 111-120. 10.5423/RPD.2011.17.2.111.

[8] Reverchon, S. and Nasser, W. (2013), Dickeya dadantii pathogenicity. Environmental Microbiology Reports, 5: 622-636. https://doi.org/10.1111/1758-2229.12073

[9] Tláskal, V., Pylro, V. S., Žifčáková, L., & Baldrian, P. (2021). Ecological Divergence Within the Enterobacterial Genus Sodalis: From Insect Symbionts to Inhabitants of Decomposing Deadwood. Frontiers in Microbiology, 12, 668644. https://doi.org/10.3389/fmicb.2021.668644

[10] Vestby LK, Grønseth T, Simm R, Nesse LL. Bacterial Biofilm and its Role in the Pathogenesis of Disease. Antibiotics (Basel). 2020 Feb 3;9(2):59. doi: 10.3390/antibiotics9020059. PMID: 32028684; PMCID: PMC7167820.

[11] Zientz E, Beyaert I, Gross R, Feldhaar H. Relevance of the endosymbiosis of Blochmannia floridanus and carpenter ants at different stages of the life cycle of the host. Appl Environ Microbiol. 2006 Sep;72(9):6027-33. doi: 10.1128/AEM.00933-06. PMID: 16957225; PMCID: PMC1563639.


    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully