On the Divergence of Alleles in Nested Subsamples from Finite Populations |
| |
Authors: | Richard R. Hudson and Norman L. Kaplan |
| |
Affiliation: | National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709 |
| |
Abstract: | Within-population variation at the DNA level will rarely be studied by sequencing loci of randomly chosen individuals. Instead, individuals will usually be chosen for sequencing based on some knowledge of their genotype. Data collected in this way require new sampling theory. Motivated by these observations, we have examined the sampling properties of a finite population model with two mutation processes and with no selection or recombination. One mutation process generates new alleles according to an infinite-alleles model, and the other generates polymorphisms at sites according to an infinite-sites model. A sample of n genes is considered. The stationary distribution of the number of segregating sites in a subsample from one of the allelic classes in the sample, conditional on the allelic configuration of the sample, is studied. A recursive scheme is developed to compute the moments of this distribution, and it is shown that the distribution is functionally independent of the number of additional alleles in the sample and their respective frequencies in the sample. For the case in which the sample contains only two alleles, the distribution of the number of segregating sites in a subsample containing both alleles, conditional on the sample frequencies of the alleles, is studied. The results are applied to the analysis of DNA sequences of two alleles found at the Adh locus of Drosophila melanogaster. No significant departure from the neutral model is detected. |