# *Vanessa cardui* candidate exon structures

27 November 2022

Based on BLAST of transcripts from the [Reed & Pillardy transcriptome](http://www.butterflygenome.org/?q=node/5) to the chromosome-level NCBI [VanCard 2.2](https://www.ncbi.nlm.nih.gov/data-hub/genome/GCA_905220365.2/) genome assembly.

## Sex determination genes

### *vermilion*

![](https://i.imgur.com/fnL4V9a.png)

- chr19, plus strand
- 8 exons
- The second of our sgRNAs, T7-Vc-v-618-Cas9sg, targets a polymorphic site.
- CRISPR with T7-Vc-v-258-Cas9sg; diagnose with primers Vc-v-192F/311R

### *dsx*

![](https://i.imgur.com/z3cyKjx.png)

- chr20, minus strand, positions 8672278 to 8559025
- 5 exons
- Unlike in other insects, exon 1 is quite large (848 bp). It includes the DM DNA-binding domain. Our two sgRNAs target upstream of that domain. The OD1 dimerization domain covers all of exon 2 and a bit of exon 3. An in-frame stop appears in exon 4. Exon 5 is entirely 3' UTR.
- CRISPR with either sgRNA; diagnose with primers Vc-dsx-40F/515R

### *fem1*

![](https://i.imgur.com/Dl3z0pG.png)

![](https://i.imgur.com/6j4pXZ9.png)

- chr23, minus strand, positions 11278992 to 11274460
- 9 exons
- The second sgRNA, T7-Vc-fem1-613-Cas9sg, targets the conserved TRPC
- CRISPR with T7-Vc-fem1-613-Cas9sg; diagnose with primers Vc-fem1-2442F/3728R

### *MASC*

![](https://i.imgur.com/HWeFpfZ.png)

![](https://i.imgur.com/wrzfwnG.png)

![](https://i.imgur.com/WvWLUyM.png)

- There's a tandem duplication of *MASC* on the Z chromosome. The two genes have the same exon structure through their coding sequence but differ in their 5' UTR exon structures. The first one is the better match for the transcript we've identified as the *MASC* ortholog, but both genomic sequences encode equally good BLAST hits to the *Bombyx mori MASC* protein.
- For better or worse, the sgRNAs match identical sequence in both genes.
- CRISPR with T7-Vc-Masc-526-Cas9sg; diagnose with primers Vc-Masc-390F/647R

### *PSI*

![](https://i.imgur.com/doJTE5r.png)

![](https://i.imgur.com/exBRY5s.png)

- chr24, plus strand, positions 5806168 to 5820890
- 16 exons
- *PSI* has sooo many exons! Worse, each exon is very short, about 140 bp, and they are separated by much larger introns. This structure will make it very difficult to design primers to assess deletions. It may be necessary to design them against intronic sequences.
- Both sgRNAs are within exons, and the second, T7-Vc-PSI-660-Cas9sg, is just downstream of the first KH RNA-binding domain.
- CRISPR with T7-Vc-PSI-660-Cas9sg; diagnose with primers Vc-PSI-5397F/6372R

## Pigmentation genes

### *DDC*

![](https://i.imgur.com/Mc2erY6.png)

![](https://i.imgur.com/GdMwTnz.png)

- Two paralogs in inverted orientation on chr7. One is on the minus strand, positions 9456501 to 9452160. The second is on the plus strand, positions 9215777 to 9220173.
- 5 exons in both genes

### *optix*

![](https://i.imgur.com/L8TYtJd.png)

- chr3, plus strand, positions 1545272 to 1546123
- 1 exon

### *tan*

![](https://i.imgur.com/IhOM79D.png)

- chrZ, minus strand, positions 9446355 to 9405144
- 8 exons

### *yellow*

![](https://i.imgur.com/T6Sh6lg.png)

- chr4, plus strand, positions 15807490 to 15817683
- 3 exons
- *yellow* is a family of several related proteins (orthologs of ancient paralogs). I initially pulled six. [Zhang et al. (2017)](https://academic.oup.com/genetics/article/205/4/1537/6066393) identified ten. They appear to mostly have 1-to-1 orthology with genes in *Drosophila*. Zhang et al. targeted the actual *yellow* ortholog with CRISPR in *V. cardui* and got mosaic loss-of-melanin phenotypes in larval cuticle and adult wings. Their targets are annotated in the map above. Zhang et al. also targeted *yellow-d*, which produced small knock-out clones with subtle changes in pigment ratios. CRISPR targeting of *yellow-h2* and *yellow-h3* was lethal at late pupal stages.

#### *V. cardui yellow* genes obtained with BLAST search using *D. melanogaster yellow*

| provisional name | transcript ID | exons | chr | strand | Dmel BLAST |
| :--------------: | :-----------: | :-------: | :--: | :----: | --------------------- |
| yellow | c23148_g2 | 3 | chr4 | plus | yellow |
| yellow_2 | c25544_g1 | 3 | chr12 | plus | yellow-b |
| yellow_3 | c23765_g1 | 4 | chr5 | plus | yellow-h |
| yellow_4 | c27346_g2 | 7 | chr6 | minus | yellow-c |
| yellow_5 | c23140_g1 | 3 | chr5 | plus | yellow-h |
| yellow_6 | c24086_g1 | 11 | chr30 | minus | yellow-f2 |

## Reference genes

### *EF1a*

- Likely a good reference gene.
- chr1, plus strand, positions 3418201 to 3419592
- 1 exon

---
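The positions listed throughout these notes can be turned into genomic spans and strand-independent region strings with a few lines of Python. This is only a sketch for bookkeeping: it assumes the positions are 1-based and inclusive (as BLAST reports them), and the names `LOCI`, `span_bp`, `to_region` and `mean_intron_bp` are made up for illustration. The `chrN` labels are the shorthand used in these notes, not necessarily the sequence names in the assembly FASTA. The intron estimate for *PSI* is rough, since it treats every exon as about 140 bp.

```python
# Loci exactly as listed in these notes: (chromosome, strand, first, last).
# For minus-strand genes the notes give the larger coordinate first.
LOCI = {
    "dsx": ("chr20", "minus", 8672278, 8559025),
    "fem1": ("chr23", "minus", 11278992, 11274460),
    "PSI": ("chr24", "plus", 5806168, 5820890),
    "optix": ("chr3", "plus", 1545272, 1546123),
    "tan": ("chrZ", "minus", 9446355, 9405144),
    "yellow": ("chr4", "plus", 15807490, 15817683),
    "EF1a": ("chr1", "plus", 3418201, 3419592),
}


def span_bp(first, last):
    """Genomic span in bp, assuming 1-based inclusive coordinates."""
    return abs(last - first) + 1


def to_region(chrom, first, last):
    """Region string with the smaller coordinate first, whatever the strand."""
    return f"{chrom}:{min(first, last)}-{max(first, last)}"


def mean_intron_bp(span, n_exons, mean_exon_bp):
    """Rough mean intron length, assuming all exons are about the same size."""
    return (span - n_exons * mean_exon_bp) / (n_exons - 1)


for gene, (chrom, strand, first, last) in LOCI.items():
    print(gene, strand, to_region(chrom, first, last), span_bp(first, last), "bp")

# PSI: 16 exons of roughly 140 bp each (assumed uniform for this estimate)
psi_span = span_bp(*LOCI["PSI"][2:])
print("PSI mean intron, approx.:", round(mean_intron_bp(psi_span, 16, 140)), "bp")
```

Sorting the coordinates inside `to_region` avoids having to remember which genes are on the minus strand when pulling sequence for primer design.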