# The soapberry bug genome project

Dave Angelini, Colby College

13 December 2019

![](https://live.staticflickr.com/65535/47624606181_c18a2e5b12.jpg)

## The bugs

The red-shouldered soapberry bug *Jadera haematoloma* (Hemiptera: Heteroptera: Rhopalidae) is a scentless plant bug native to the US Gulf Coast. It feeds on several native plants of the soapberry family (Sapindaceae) and, since the mid-twentieth century, has adapted to live on the introduced Chinese goldenrain tree (*Koelreuteria* spp.). This host shift, along with the abundance of red-shouldered soapberry bugs in urban environments, has made *J. haematoloma* an excellent model for the study of rapid adaptive evolution ([Tsai 2013](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3689129/); [Panfilio & Angelini 2018](https://www.sciencedirect.com/science/article/pii/S2214574517301153)). Researchers have examined rapid evolution in beak length (e.g. [Carroll & Loye 1987](https://academic.oup.com/aesa/article/80/3/373/10793); [Yu & Andrés 2014](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3931560/); [Cenzer 2016](https://academic.oup.com/aesa/article/80/3/373/10793); [2017](https://www.journals.uchicago.edu/doi/10.1086/693456)), the wing/reproductive polyphenism ([Carroll et al. 2003](http://soapberrybug.org/_dbase_upl/Carroll_et_al._Ann_Ent_2003.pdf); [Fawcett et al. 2018](https://www.nature.com/articles/s41467-018-04102-1)), and several other life history traits ([Carroll 1991](http://soapberrybug.org/_dbase_upl/C1991.pdf); [Carroll et al. 1998](http://soapberrybug.org/_dbase_upl/Carroll_et_al._Ev_Eco_1998.pdf)) in this species.

My lab has been studying appendage development and wing polyphenism in the bugs. We wanted a draft genome sequence as a resource for developmental genetics, as context for genotyping and population studies, and as a point of comparison to the genomes of other insects, especially *Oncopeltus fasciatus* ([Panfilio et al. 2019](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1660-0)).

## Genome project history

The soapberry bug karyotype was described by Lelia Porter in 1917. Males have 13 chromosomes, and the species appears to use an X0 sex determination system. Males therefore carry six pairs of autosomes and a single X, which is slightly larger than the smallest autosome ([Porter 1917](https://www.journals.uchicago.edu/doi/10.2307/1536296)). Based on its meiotic behavior, the smallest chromosome has been described as an "m-chromosome": it does not appear to form chiasmata during meiotic prophase, and it migrates to the poles early in anaphase ([Ueshima 1979](https://www.amazon.com/Hemiptera-II-Heteroptera-Animal-cytogenetics/dp/344326008X)).

![](https://i.imgur.com/BQw7Qpq.jpg =250x)

> Camera lucida drawing of a spermatocyte in prophase I. From [Porter (1917)](https://www.journals.uchicago.edu/doi/10.2307/1536296), Figure 5.

In 2015, Spencer Johnston at Texas A&M used flow cytometry to estimate the genome size of *Jadera haematoloma* at about 1.95 Gbp.

In 2018, an anonymous gift was made to [Colby College](https://www.colby.edu/) to support research in genomics and bioinformatics, and we were given the green light to use these funds for a genome sequencing project. Additional funding was provided by [Maine INBRE](https://inbre.maineidea.net/) and the Colby [Department of Biology](http://www.colby.edu/bio/).

We contracted [Dovetail Genomics](https://dovetailgenomics.com/) for sequencing and assembly. Dovetail offers a combination of library preparation methods, including Hi-C proximity end-pairing, which allows for assembly to chromosome length. To reduce heterozygosity, we chose to sequence bugs from a lab population, originally from [Plantation Key](https://www.google.com/maps/place/Coral+Rd,+Islamorada,+FL+33070/@24.9778928,-80.6211314,10z/) in Tavernier, FL, after crossing full siblings for five generations.
Dovetail isolated DNA and prepared libraries from one of these inbred males.

## Progress so far

In August 2019, Dovetail returned the draft genome assembly. The total sequence length across all scaffolds was 2.08 Gbp, close to the flow cytometry estimate of 1.95 Gbp. Seven large scaffolds contain 89.9% of the sequence and likely represent the seven chromosomes seen in the bug's karyotype.

![](https://i.imgur.com/hQpEsD2.png =450x)

> The chromosomes of *J. haematoloma* are represented by seven scaffolds over 1 Mbp in length. Here, the length of these scaffolds is plotted on a log scale against their sequencing depth, measured as the number of reads mapping per million assembled base pairs (CPM). Scaffolds are named as chromosomes based on their size and read depth, following [Porter (1917)](https://www.journals.uchicago.edu/doi/10.2307/1536296).

Metrics such as the distribution of ambiguous nucleotides and repetitive sequences indicate that the genome assembly is of high quality.

We are now in the annotation phase of this project. A preliminary survey using BLAST found orthologs for 80 of 81 candidate genes.

| chromosome      |        length (Mbp)        | number of candidate genes | gene density (genes per Mbp) |
| :-------------- | :------------------------: | :-----------------------: | :--------------------------: |
| Chr1            |           559.6            |            24             |            0.0429            |
| Chr2            |           375.1            |            12             |            0.0320            |
| Chr3            |           293.5            |            13             |            0.0443            |
| Chr4            |           240.4            |             9             |            0.0374            |
| Chr5            |           193.1            |            19             |            0.0984            |
| X               |           179.5            |            12             |            0.0669            |
| m               |            28.9            |             0             |              0               |
| other scaffolds | each <0.56 (211.5 overall) |             1             |       0.0047 (overall)       |

We are currently using de novo gene prediction methods to further characterize the genome. Gene expression studies are also underway to characterize genes involved in nutritionally dependent plasticity in wing growth and patterning.
In the future, the genome sequence will also allow population-level differences among wild bugs to be mapped and placed in the context of genes and other genomic features.

---
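
For readers who want to recheck the numbers in the table above, here is a minimal Python sketch that recomputes gene density (candidate genes per Mbp) for each chromosome-scale scaffold, along with the share of the assembly those seven scaffolds contain. The lengths and gene counts are copied from the table. The names `scaffolds`, `gene_density` and `summarize` are made up for this illustration and are not part of any analysis pipeline used in the project.

```python
# Scaffold lengths (Mbp) and candidate-gene counts, copied from the table above.
scaffolds = {
    "Chr1": (559.6, 24),
    "Chr2": (375.1, 12),
    "Chr3": (293.5, 13),
    "Chr4": (240.4, 9),
    "Chr5": (193.1, 19),
    "X": (179.5, 12),
    "m": (28.9, 0),
}
OTHER_SCAFFOLDS_MBP = 211.5   # combined length of all remaining scaffolds
FLOW_CYTOMETRY_GBP = 1.95     # genome size estimate from flow cytometry


def gene_density(length_mbp, n_genes):
    """Candidate genes per Mbp of scaffold sequence."""
    return n_genes / length_mbp


def summarize(scaffolds, other_mbp):
    """Total assembly size and the fraction held in chromosome-scale scaffolds."""
    chrom_mbp = sum(length for length, _ in scaffolds.values())
    total_mbp = chrom_mbp + other_mbp
    return {
        "chromosome_mbp": chrom_mbp,
        "total_gbp": total_mbp / 1000,
        "chromosome_fraction": chrom_mbp / total_mbp,
    }


summary = summarize(scaffolds, OTHER_SCAFFOLDS_MBP)

for name, (length_mbp, n_genes) in scaffolds.items():
    print(f"{name:>4}  {length_mbp:7.1f} Mbp  {gene_density(length_mbp, n_genes):.4f} genes/Mbp")

print(f"total assembly: {summary['total_gbp']:.2f} Gbp")
print(f"in seven largest scaffolds: {summary['chromosome_fraction']:.1%}")
print(f"assembly / flow cytometry estimate: {summary['total_gbp'] / FLOW_CYTOMETRY_GBP:.2f}")
```

Because the table lengths are rounded to 0.1 Mbp, the recomputed share in the seven largest scaffolds comes out at about 89.8%, slightly below the 89.9% reported above. The difference is presumably a rounding effect.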