Computational approaches to understanding the genetic architecture of complex traits

Brielin Brown

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2016-194
December 9, 2016

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-194.pdf

Advances in DNA sequencing technology have resulted in the ability to generate genetic data at costs unimaginable even ten years ago. This has resulted in a tremendous amount of data, with large studies providing genotypes of hundreds of thousands of individuals at millions of genetic locations. This rapid increase in the scale of genetic data necessitates the development of computational methods that can analyze this data rapidly without sacrificing statistical rigor.

The low cost of DNA sequencing also provides an opportunity to tailor medical care to an individual's unique genetic signature. However, this type of precision medicine is limited by our understanding of how genetic variation shapes disease. Our understanding of so-called complex diseases is particularly poor, and most identified variants explain only a tiny fraction of the variance in the disease that is expected to be due to genetics. This is further complicated by the fact that most studies of complex disease go directly from genotype to phenotype, ignoring the complex biological processes that take place in between.

Herein, we discuss several advances in the field of complex trait genetics. We begin with a review of computational and statistical methods for working with genotype and phenotype data, as well as a discussion of methods for analyzing RNA-seq data in an effort to bridge the gap between genotype and phenotype. We then describe our methods for 1) improving power to detect common variants associated with disease, 2) determining the extent to which different world populations share similar disease genetics, and 3) identifying genes that show differential expression between the two haplotypes of a single individual. Finally, we discuss opportunities for future investigation in this field.

Advisor: Lior Pachter


BibTeX citation:

@phdthesis{Brown:EECS-2016-194,
    Author = {Brown, Brielin},
    Title = {Computational approaches to understanding the genetic architecture of complex traits},
    School = {EECS Department, University of California, Berkeley},
    Year = {2016},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-194.html},
    Number = {UCB/EECS-2016-194},
    Abstract = {Advances in DNA sequencing technology have resulted in the ability to generate genetic data at costs unimaginable even ten years ago. This has resulted in a tremendous amount of data, with large studies providing genotypes of hundreds of thousands of individuals at millions of genetic locations. This rapid increase in the scale of genetic data necessitates the development of computational methods that can analyze this data rapidly without sacrificing statistical rigor.
The low cost of DNA sequencing also provides an opportunity to tailor medical care to an individual's unique genetic signature. However, this type of precision medicine is limited by our understanding of how genetic variation shapes disease. Our understanding of so-called complex diseases is particularly poor, and most identified variants explain only a tiny fraction of the variance in the disease that is expected to be due to genetics. This is further complicated by the fact that most studies of complex disease go directly from genotype to phenotype, ignoring the complex biological processes that take place in between.
Herein, we discuss several advances in the field of complex trait genetics. We begin with a review of computational and statistical methods for working with genotype and phenotype data, as well as a discussion of methods for analyzing RNA-seq data in an effort to bridge the gap between genotype and phenotype. We then describe our methods for 1) improving power to detect common variants associated with disease, 2) determining the extent to which different world populations share similar disease genetics, and 3) identifying genes that show differential expression between the two haplotypes of a single individual. Finally, we discuss opportunities for future investigation in this field.}
}

EndNote citation:

%0 Thesis
%A Brown, Brielin
%T Computational approaches to understanding the genetic architecture of complex traits
%I EECS Department, University of California, Berkeley
%D 2016
%8 December 9
%@ UCB/EECS-2016-194
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-194.html
%F Brown:EECS-2016-194