Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny
Eleazar Eskin and Eran Halperin and Richard M. Karp
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-02-1196
, 2002
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/CSD-02-1196.pdf
Each person's genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person's genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. <p>Since experimental determination of a person's genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site. <p>We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2-SAT we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. <p>Our algorithms have been tested on real data and give biologically meaningful results.
BibTeX citation:
@techreport{Eskin:CSD-02-1196, Author= {Eskin, Eleazar and Halperin, Eran and Karp, Richard M.}, Title= {Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny}, Year= {2002}, Month= {Aug}, Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5823.html}, Number= {UCB/CSD-02-1196}, Abstract= {Each person's genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person's genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. <p>Since experimental determination of a person's genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site. <p>We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2-SAT we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. <p>Our algorithms have been tested on real data and give biologically meaningful results.}, }
EndNote citation:
%0 Report %A Eskin, Eleazar %A Halperin, Eran %A Karp, Richard M. %T Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny %I EECS Department, University of California, Berkeley %D 2002 %@ UCB/CSD-02-1196 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5823.html %F Eskin:CSD-02-1196