---
title: 'acoRn: A Forest Adventure in Search of Oak Parents'
tags:
- R
- ecology
- ecological-modelling
- parental-assignment
authors:
- name: Nikos Pechlivanis
orcid: 0000-0003-2502-612X
affiliation: 1
- name: Fotis Psomopoulos
orcid: 0000-0002-0222-4273
affiliation: 1
- name: Aristotelis C. Papageorgiou
orcid: 0000-0001-6657-7820
affiliation: 2
affiliations:
- name: Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, Greece
index: 1
- name: Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece
index: 2
date: 14 June 2024
bibliography: paper.bib
---
# Summary
In this study, we present `acoRn`, an open-source R package designed for exclusion-based parentage assignment. Utilizing the principles of Mendelian segregation, `acoRn` analyzes multilocus genotype data from potential parents and offspring to identify likely parentage relationships, while accommodating genotyping errors, missing data, and duplicate genotypes. We demonstrated the application of `acoRn` by analyzing synthetic datasets of adult and progeny trees within a specific forest stand. Our findings indicated that only a small subset of adult trees contributed to the juvenile generation, showcasing the tool's capability to elucidate parentage patterns. `acoRn` is effective not only for oak trees but also for a wide range of organisms, making it a versatile tool for parentage analysis. Its ability to process diverse datasets and deliver clear results highlights its utility in studying reproductive relationships and population dynamics in biological research. `acoRn` serves as a valuable resource for researchers seeking robust parentage assignment methods and is freely available on GitHub at [**npechl/acoRn**](https://github.com/npechl/acoRn).
# Statement of need
Parentage assignment is a critical technique employed across multiple biological disciplines to gain insights into the reproductive relationships and genetic structures of populations [@Huang_et_al]. This method is essential for several key applications [@Jones_et_al]:
- **Biodiversity Conservation**: In conservation biology, understanding the genetic relationships within and between populations is fundamental for preserving genetic diversity. Parentage assignment helps identify which individuals are contributing offspring to the next generation, thereby informing strategies to maintain or enhance genetic variability, support breeding programs for endangered species, and manage habitat restoration efforts.
- **Breeding Programs**: In agricultural and horticultural contexts, parentage assignment is crucial for selective breeding programs. By accurately identifying parent-offspring relationships, breeders can make informed decisions to enhance desirable traits such as disease resistance, yield, or growth rates. This technique ensures the traceability of genetic lineage, aids in the prevention of inbreeding, and facilitates the development of new varieties or breeds with optimized characteristics.
- **Understanding Mating Patterns**: Analyzing mating patterns within a population provides insights into reproductive strategies, mate choice, and the genetic structure of populations. Parentage assignment can reveal patterns such as polygamy, monogamy, or assortative mating, which have significant implications for the genetic health and evolution of populations.
- **Gene Flow Studies**: Gene flow refers to the transfer of genetic material between populations. Parentage assignment helps track the movement of genes across geographical and ecological boundaries, thereby elucidating the connectivity between populations. This information is crucial for understanding how populations adapt to changing environments and for managing gene flow to prevent genetic isolation.
- **Hybridization Studies**: In natural and managed populations, hybridization between species or subspecies can have profound effects on genetic diversity and adaptation. Parentage assignment allows researchers to identify hybrid individuals and assess the extent and impact of hybridization events. This is particularly important in conservation, where hybridization with non-native species can threaten the genetic integrity of native populations.
Here, we introduce `acoRn`, an open-source R package designed for exclusion-based parentage assignment. Leveraging Mendelian segregation principles, `acoRn` compares multilocus genotype data from potential parents and offspring to identify likely parentage relationships, even in the presence of genotyping errors, missing values, and duplicate genotypes.
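The exclusion principle described above can be illustrated with a minimal sketch. This is not `acoRn`'s internal implementation: the function names `count_mismatches` and `is_excluded`, the `max_mismatch` tolerance, and the toy allele values are all hypothetical and chosen only for illustration. The sketch assumes diploid genotypes stored as a two-column matrix with one row per locus, and it skips loci with missing values rather than counting them as evidence against a candidate.

```R
# Illustrative sketch only; not acoRn's internal implementation.
# A genotype is a matrix with one row per locus and one column per allele copy.

# Count loci at which a candidate parent shares no allele with the offspring.
# Loci with missing data (NA) in either individual are skipped.
count_mismatches <- function(offspring, candidate) {
  mismatches <- 0L
  for (locus in seq_len(nrow(offspring))) {
    o <- offspring[locus, ]
    p <- candidate[locus, ]
    if (anyNA(o) || anyNA(p)) next
    if (!any(o %in% p)) mismatches <- mismatches + 1L
  }
  mismatches
}

# A candidate is excluded when its mismatches exceed the tolerated number.
# max_mismatch = 0 is strict exclusion; larger values allow for genotyping errors.
is_excluded <- function(offspring, candidate, max_mismatch = 0L) {
  count_mismatches(offspring, candidate) > max_mismatch
}

# Toy data: three loci, alleles coded as fragment sizes (hypothetical values).
child   <- rbind(c(101, 105), c(200, 204), c(150, 150))
adult_a <- rbind(c(101, 103), c(204, 208), c(150, 152))  # shares an allele at every locus
adult_b <- rbind(c(103, 107), c(200, 200), c(NA, NA))    # no shared allele at locus 1

is_excluded(child, adult_a)                     # FALSE
is_excluded(child, adult_b)                     # TRUE
is_excluded(child, adult_b, max_mismatch = 1L)  # FALSE
```

Allowing a small number of mismatches trades some exclusion power for robustness: a true parent with a single genotyping error is retained, at the cost of occasionally retaining an unrelated adult.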
**`acoRn`** offers two main algorithms:
1. The first generates synthetic genotype data based on user-defined parameters, including the number of trees, variant frequencies, and the number of loci.
2. The second uses genotype data from multiple samples to identify parentage patterns.
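To make the first of these concrete, the following sketch shows one way synthetic genotypes could be drawn from user-defined allele frequencies and then combined into progeny under random mating. It is a simplified illustration, not the code behind `create_mock_parents()` or `create_mock_progeny()`: the helper names `simulate_genotypes` and `make_offspring`, the allele labels, and the frequencies are assumptions made for this example, and selfing is not excluded.

```R
# Illustrative sketch only; acoRn's own generators are create_mock_parents()
# and create_mock_progeny(). All names and values below are hypothetical.
set.seed(42)

# Draw diploid genotypes for n individuals.
# allele_freqs: list with one named frequency vector per locus.
# Returns an n x (2 * n_loci) character matrix, two allele columns per locus.
simulate_genotypes <- function(n, allele_freqs) {
  per_locus <- lapply(allele_freqs, function(freq) {
    alleles <- sample(names(freq), size = 2 * n, replace = TRUE, prob = freq)
    matrix(alleles, nrow = n, ncol = 2)
  })
  do.call(cbind, unname(per_locus))
}

# Produce one offspring: at each locus, each parent transmits one of its
# two alleles with equal probability (Mendelian segregation).
make_offspring <- function(mother, father) {
  n_loci <- length(mother) / 2
  child <- character(2 * n_loci)
  for (l in seq_len(n_loci)) {
    cols <- c(2 * l - 1, 2 * l)
    child[cols] <- c(sample(mother[cols], 1), sample(father[cols], 1))
  }
  child
}

freqs <- list(
  locus1 = c(A = 0.5, B = 0.3, C = 0.2),
  locus2 = c(A = 0.6, B = 0.4)
)
adults <- simulate_genotypes(n = 10, allele_freqs = freqs)

# Random mating: draw a mother and a father independently for each mating event.
n_progeny <- 5
mothers <- sample(nrow(adults), n_progeny, replace = TRUE)
fathers <- sample(nrow(adults), n_progeny, replace = TRUE)
progeny <- t(mapply(
  function(m, f) make_offspring(adults[m, ], adults[f, ]),
  mothers, fathers
))

dim(adults)   # 10 rows (trees), 4 columns (2 loci x 2 alleles)
dim(progeny)  # 5 rows (offspring), 4 columns
```

Because the true parents of every simulated offspring are recorded in `mothers` and `fathers`, datasets of this kind give a known answer against which an assignment method can be checked.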
In our study, we applied `acoRn` to synthetic datasets of adult and juvenile trees within a forest stand. We created six representative cases, each with synthetic datasets for both parents and juveniles, following a random mating model for a selected number of parents and mating events (Table XX). We then used `acoRn` to assign parents in each of the six cases. The results (Figure XX, Table XXX) indicated that...
In the real-world case, we applied `acoRn` to a small mixed oak forest in which three different oak species occur. ... After duplicate genotypes and missing values were addressed, parentage relationships were identified using the plots and tables generated by `acoRn`. The analysis revealed that only a small subset of adult trees contributed to the next generation, demonstrating `acoRn`'s capability to uncover parentage patterns.
`acoRn` is versatile and suitable for a wide range of organisms beyond oak trees. It efficiently handles various datasets and provides clear results, making it a valuable tool for researchers studying reproductive relationships and population dynamics.
# Installation
`acoRn` is licensed under the [MIT License](https://opensource.org/license/mit) and can be easily installed from [GitHub](https://github.com/npechl/acoRn) as follows:
```R
# install.packages("remotes")
remotes::install_github("npechl/acoRn")
```
# Usage
## Synthetic genotype data generation
```R
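library(acoRn)
# Generate synthetic parent genotypes using the function's default arguments
parents <- create_mock_parents()
# Generate progeny from the first element of the object returned above
offspring <- create_mock_progeny(parents[[1]], fparents = 5, mparents = 5, prog = 5)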
```
## Parental assignment
```R
library(acoRn)
data("parents")
data("offspring")
r <- acoRn(parents, offspring)
head(r)
```
# References
[to be added to paper.bib]
@article{Huang_et_al,
author = {Huang, Kang and Mi, Rui and Dunn, Derek W and Wang, Tongcheng and Li, Baoguo},
title = "{Performing Parentage Analysis in the Presence of Inbreeding and Null Alleles}",
journal = {Genetics},
volume = {210},
number = {4},
pages = {1467--1481},
year = {2018},
month = {10},
abstract = "{Parentage analysis is an important method that is used widely in zoological and ecological studies. Current mathematical models of parentage analyses usually assume that a population has a uniform genetic structure and that mating is panmictic. In a natural population, the geographic or social structure of a population, and/or nonrandom mating, usually leads to a genetic structure and results in genotypic frequencies deviating from those expected under the Hardy-Weinberg equilibrium (HWE). In addition, in the presence of null alleles, an observed genotype represents one of several possible true genotypes. The true father of a given offspring may thus be erroneously excluded in parentage analyses, or may have a low or negative LOD score. Here, we present a new mathematical model to estimate parentage that includes simultaneously the effects of inbreeding, null alleles, and negative amplification. The influences of these three factors on previous model are evaluated by Monte-Carlo simulations and empirical data, and the performance of our new model is compared under controlled conditions. We found that, for both simulated and empirical data, our new model outperformed other methods in many situations. We make available our methods in a new, free software package entitled parentage. This can be downloaded via http://github.com/huangkang1987/parentage.}",
issn = {1943-2631},
doi = {10.1534/genetics.118.301592},
url = {https://doi.org/10.1534/genetics.118.301592},
eprint = {https://academic.oup.com/genetics/article-pdf/210/4/1467/42214485/genetics1467.pdf},
}
@article{Jones_et_al,
author = {Jones, Adam and Small, Clayton and Paczolt, Kimberly and Ratterman, Nicholas},
year = {2010},
month = {01},
pages = {6--30},
title = {A practical guide to methods of parentage analysis},
volume = {10},
journal = {Molecular Ecology Resources},
doi = {10.1111/j.1755-0998.2009.02778.x}
}