Pedigree-Aware Genotype Simulation

Margarita Geleta

EECS Department, University of California, Berkeley

Technical Report No. UCB/

December 1, 2024

http://www2.eecs.berkeley.edu/Pubs/TechRpts/Hold/75799c90aab78548f97b8fe2698ead53.pdf

Genomic sequence propagation through time is intricately linked to our family ties, shaping human genetic diversity across generations. This diversity has been influenced by various factors, including migration, genetic drift, and marriage patterns, together with varying family structures. Family trees, combined with genetic data, hold immense promise in forensics, where kinship inference, despite its challenges, has already demonstrated transformative potential by solving hundreds of cold cases through genetic genealogy. The objective of this research is to explore the efficient integration of genetic and genealogical data to enhance kinship inference. To achieve this, we developed a forward-in-time genotype simulator capable of generating large-scale synthetic SNP biobank datasets, incorporating genealogical structure and sex-specific recombination. The simulator provides a framework for quantifying pairwise relatedness in terms of time-to-coalescence (t2c) across the genome, rather than tracking solely identity-by-descent (IBD) features. This approach yields a more nuanced understanding of shared ancestry, which is crucial for improving the accuracy of kinship inference and advancing genetic genealogy research.

Advisors: Nilah Ioannidis


BibTeX citation:

@mastersthesis{Geleta:31565,
    Author= {Geleta, Margarita},
    Editor= {Ioannidis, Nilah and Ioannidis, Alexander},
    Title= {Pedigree-Aware Genotype Simulation},
    School= {EECS Department, University of California, Berkeley},
    Year= {2024},
    Number= {UCB/},
    Abstract= {Genomic sequence propagation through time is intricately linked to our family ties, shaping human genetic diversity across generations. This diversity has been influenced by various factors, including migration, genetic drift, and marriage patterns, together with varying family structures. Family trees, combined with genetic data, hold immense promise in forensics, where kinship inference, despite its challenges, has already demonstrated transformative potential by solving hundreds of cold cases through genetic genealogy. The objective of this research is to explore the efficient integration of genetic and genealogical data to enhance kinship inference. To achieve this, we developed a forward-in-time genotype simulator capable of generating large-scale synthetic SNP biobank datasets, incorporating genealogical structure and sex-specific recombination. The simulator provides a framework for quantifying pairwise relatedness in terms of time-to-coalescence (t2c) across the genome, rather than tracking solely identity-by-descent (IBD) features. This approach yields a more nuanced understanding of shared ancestry, which is crucial for improving the accuracy of kinship inference and advancing genetic genealogy research.},
}

EndNote citation:

%0 Thesis
%A Geleta, Margarita
%E Ioannidis, Nilah
%E Ioannidis, Alexander
%T Pedigree-Aware Genotype Simulation
%I EECS Department, University of California, Berkeley
%D 2024
%8 December 1
%@ UCB/
%F Geleta:31565