Similar literature
 20 similar documents found
1.
Michael Lynch 《Genetics》2009,182(1):295-301
A new generation of high-throughput sequencing strategies will soon lead to the acquisition of high-coverage genomic profiles of hundreds to thousands of individuals within species, generating unprecedented levels of information on the frequencies of nucleotides segregating at individual sites. However, because these new technologies are error prone and yield uneven coverage of alleles in diploid individuals, they also introduce the need for novel methods for analyzing the raw read data. A maximum-likelihood method for the estimation of allele frequencies is developed, eliminating both the need to arbitrarily discard individuals with low coverage and the requirement for an extrinsic measure of the sequence error rate. The resultant estimates are nearly unbiased with asymptotically minimal sampling variance, thereby defining the limits to our ability to estimate population-genetic parameters and providing a logical basis for the optimal design of population-genomic surveys.

2.
A method for reconstructing allele frequencies characteristic of an original ethnically homogeneous population before the start of migration processes is described. Information on both the ethnic group studied and offspring of interethnic marriages is used to estimate the allele frequencies. This makes it possible to increase the informativeness of the sample, which, in the case of ethnic heterogeneity, depends not only on allele frequencies and the total sample size, but also on the ethnic structure of the sample. The problem of estimating allele frequency in an ethnically heterogeneous sample has been solved analytically for diallelic loci. It has been demonstrated that, if offspring of interethnic marriages with the same degree of outbreeding are added to a sample of the ethnic group studied, the sample informativeness does not change. To utilize the information contained in the phenotypes of the offspring of interethnic marriages, representatives of the population from which migration occurs should be included in the sample. The size of the sample ensuring the preassigned accuracy of estimation is minimized at a certain ratio between the numbers of the offspring of interethnic marriages and the “immigrants.” To analyze polyallelic loci, a software package has been developed that allows estimating allele frequencies, determining the errors of these estimates, and planning the sample ensuring the preassigned accuracy of estimation. The package is available free at http://mga.bionet.nsc.ru/PopMixed/PopMixed.html. Translated from Genetika, Vol. 41, No. 7, 2005, pp. 990–996. Original Russian text copyright © 2005 by Axenovich, Kirichenko.

3.
A procedure is described for estimating allele frequencies in indigenous populations of Siberia using phenotype data not only for pure-blood representatives of the ethnic groups examined, but also for the descendants of mixed marriages. Implementing the method requires reconstructing the pedigree structure of the sample examined. Including data on descendants of mixed marriages in the analysis increases the information content of the sample and decreases the variance of the estimates obtained. The advantages of the method are illustrated using the example of the Tundra Nentsy, for whom it was shown that the variance of blood group allele frequency estimates can be reduced by a factor of approximately 1.5.

4.
Sequencing of pooled samples (Pool-Seq) using next-generation sequencing technologies has become increasingly popular, because it represents a rapid and cost-effective method to determine allele frequencies for single nucleotide polymorphisms (SNPs) in population pools. Validation of allele frequencies determined by Pool-Seq has been attempted using an individual genotyping approach, but these studies tend to use samples from existing model organism databases or DNA stores, and do not validate a realistic setup for sampling natural populations. Here we used pyrosequencing to validate allele frequencies determined by Pool-Seq in three natural populations of Arabidopsis halleri (Brassicaceae). The allele frequency estimates of the pooled population samples (consisting of 20 individual plant DNA samples) were determined after mapping Illumina reads to (i) the publicly available, high-quality reference genome of a closely related species (Arabidopsis thaliana) and (ii) our own de novo draft genome assembly of A. halleri. We then pyrosequenced nine selected SNPs using the same individuals from each population, resulting in a total of 540 samples. Our results show a highly significant and accurate relationship between pooled and individually determined allele frequencies, irrespective of the reference genome used. Allele frequencies differed on average by less than 4%. There was no tendency for either the Pool-Seq or the individual-based approach to yield higher or lower estimates of allele frequencies. Moreover, the rather high coverage in the mapping to the two reference genomes, ranging from 55 to 284x, had no significant effect on the accuracy of the Pool-Seq. A resampling analysis showed that only very low coverage values (below 10-20x) would substantially reduce the precision of the method. We therefore conclude that a pooled re-sequencing approach is well suited for analyses of genetic variation in natural populations.
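
As a rough illustration of the comparison described in this abstract, the following Python sketch estimates a SNP allele frequency from pooled read counts and measures how far pooled estimates sit from individually genotyped ones. The function names and all numbers are hypothetical and are not taken from the study; the abstract does not describe the authors' pipeline, so this is only a minimal sketch of the general idea.

```python
def pooled_allele_frequency(alt_reads, total_reads):
    """Estimate the alternate-allele frequency at one SNP as the
    fraction of pooled reads that carry the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def mean_abs_difference(pool_freqs, individual_freqs):
    """Mean absolute difference between pooled and individually
    determined allele frequencies across the same set of SNPs."""
    if len(pool_freqs) != len(individual_freqs) or not pool_freqs:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [abs(p - q) for p, q in zip(pool_freqs, individual_freqs)]
    return sum(diffs) / len(diffs)


# Made-up example: 30 of 120 reads carry the alternate allele.
pool_estimate = pooled_allele_frequency(30, 120)
```

A mean absolute difference computed this way is one possible reading of "differed on average by less than 4%"; the study may have used a different summary.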

5.
Comparing allele frequencies among populations that differ in environment has long been a tool for detecting loci involved in local adaptation. However, such analyses are complicated by an imperfect knowledge of population allele frequencies and neutral correlations of allele frequencies among populations due to shared population history and gene flow. Here we develop a set of methods to robustly test for unusual allele frequency patterns and correlations between environmental variables and allele frequencies while accounting for these complications based on a Bayesian model previously implemented in the software Bayenv. Using this model, we calculate a set of “standardized allele frequencies” that allows investigators to apply tests of their choice to multiple populations while accounting for sampling and covariance due to population history. We illustrate this first by showing that these standardized frequencies can be used to detect nonparametric correlations with environmental variables; these correlations are also less prone to spurious results due to outlier populations. We then demonstrate how these standardized allele frequencies can be used to construct a test to detect SNPs that deviate strongly from neutral population structure. This test is conceptually related to FST and is shown to be more powerful, as we account for population history. We also extend the model to next-generation sequencing of population pools, a cost-efficient way to estimate population allele frequencies, but one that introduces an additional level of sampling noise. The utility of these methods is demonstrated in simulations and by reanalyzing human SNP data from the Human Genome Diversity Panel populations and pooled next-generation sequencing data from Atlantic herring. An implementation of our method is available from http://gcbias.org.

6.

Background

DNA barcoding refers to the use of short DNA sequences for rapid identification of species. Genetic distance or character attributes of a particular barcode locus discriminate the species. We report an efficient approach to analyze short sequence data for discrimination between species.

Methodology and Principal Findings

A new approach, Oligonucleotide Frequency Range (OFR) of barcode loci for species discrimination, is proposed. The OFR of loci that discriminate between species was characteristic of each species, i.e., the maxima and minima within a species did not overlap with those of other species. We compared the species resolution ability of different barcode loci using p-distance, Euclidean distance of oligonucleotide frequencies, a nucleotide-character based approach and the OFR method. The species resolution by OFR was either higher than or comparable to that of the other methods. A short fragment of 126 bp of the internal transcribed spacer region in the ribosomal RNA gene was sufficient to discriminate a majority of the species using OFR.

Conclusions/Significance

Oligonucleotide frequency range of a barcode locus can discriminate between species. Ability to discriminate species using very short DNA fragments may have wider applications in forensic and conservation studies.
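
The range criterion described above (within-species minima and maxima that do not overlap between species) can be sketched in Python as follows. This is an assumption-laden illustration, not the published OFR implementation: the function names, the choice of oligonucleotide length `k`, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer (oligonucleotide) in one sequence."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def frequency_range(seqs, kmer):
    """(min, max) frequency of one k-mer across the sequences of a species."""
    freqs = [kmer_frequencies(s, len(kmer)).get(kmer, 0.0) for s in seqs]
    return min(freqs), max(freqs)


def ranges_overlap(range_a, range_b):
    """True if two closed (min, max) intervals share at least one value."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


# Toy data (hypothetical): two "species", two sequences each.
species_a = ["AAAA", "AAAT"]
species_b = ["TTTT", "TTTA"]
range_a = frequency_range(species_a, "A")
range_b = frequency_range(species_b, "A")
discriminates = not ranges_overlap(range_a, range_b)
```

Under this reading, an oligonucleotide discriminates two species when its frequency ranges in the two species do not overlap; how the published method combines many oligonucleotides is not stated in the abstract.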

7.
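Research progress in DNA barcoding   Total citations: 4, self-citations: 0, citations by others: 4
DNA barcoding is a new biological identification system that uses standardized short gene fragments with sufficient variation to identify species rapidly and accurately. In 2003, Hebert et al. at the University of Guelph, Canada, first formally proposed the concept of DNA barcoding. The Consortium for the Barcode of Life was established in 2004 and currently has more than two hundred member organizations from 50 countries. In May 2007, the University of Guelph set up the world's first DNA barcoding identification center, and in January 2009 the International Barcode of Life project (iBOL) was formally launched, with the Chinese Academy of Sciences representing China alongside Canada, the United States, and the European Union as the four central nodes of iBOL. The mitochondrial cytochrome c oxidase gene COI combines highly universal primers with a fast evolutionary rate, making it an ideal DNA barcode for animals. COI performs poorly in plants, however, so the ribosomal ITS sequence and the plastid sequences rbcL, matK, and trnH-psbA have successively been introduced into plant DNA barcoding research. Although DNA barcoding research is still in its infancy and faces great challenges, a growing number of studies show that DNA barcodes can be widely applied to the classification and identification of organisms. It is a simple, efficient, and accurate species identification technique that has achieved notable results in studies of animals, plants, and microorganisms, and it is one of the fastest-developing frontiers in the life sciences. This paper presents a comprehensive analysis of the development and application of DNA barcodes, the current state of the related research literature in China, the challenges DNA barcoding faces, and its prospects, with the aim of promoting the development of DNA barcoding and taxonomic research in China.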

8.
A Measure of Population Subdivision Based on Microsatellite Allele Frequencies   Total citations: 48, self-citations: 9, citations by others: 48
M. Slatkin 《Genetics》1995,139(1):457-462

9.
P. E. Jorde  N. Ryman 《Genetics》1996,143(3):1369-1381
We studied temporal allele frequency shifts over 15 years and estimated the genetically effective size of four natural populations of brown trout (Salmo trutta L.) on the basis of the variation at 14 polymorphic allozyme loci. The allele frequency differences between consecutive cohorts were significant in all four populations. There were no indications of natural selection, and we conclude that random genetic drift is the most likely cause of temporal allele frequency shifts at the loci examined. Effective population sizes were estimated from observed allele frequency shifts among cohorts, taking into consideration the demographic characteristics of each population. The estimated effective sizes of the four populations range from 52 to 480 individuals, and we conclude that the effective size of natural brown trout populations may differ considerably among lakes that are similar in size and other apparent characteristics. In spite of their different effective sizes all four populations have similar levels of genetic variation (average heterozygosity) indicating that excessive loss of genetic variability has been retarded, most likely because of gene flow among neighboring populations.
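
To make the idea of estimating effective size from temporal allele frequency shifts concrete, here is a minimal Python sketch of the classic temporal method for discrete generations: the standardized variance Fc of Nei and Tajima, corrected for sampling as in Waples. It is an assumption that this simpler estimator is a useful stand-in. The study above additionally accounts for the demographic characteristics of each population (cohorts and overlapping generations), which this sketch does not attempt. Function names and numbers are hypothetical.

```python
def nei_tajima_fc(freqs_t0, freqs_t1):
    """Standardized variance in allele frequency change,
    Fc = mean over alleles of (x - y)^2 / ((x + y) / 2 - x * y),
    where x and y are frequencies of the same allele at two time points."""
    terms = []
    for x, y in zip(freqs_t0, freqs_t1):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip alleles fixed or absent in both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles")
    return sum(terms) / len(terms)


def temporal_ne(fc, generations, n_sampled_t0, n_sampled_t1):
    """Effective size from Fc after removing the sampling contribution:
    Ne = t / (2 * (Fc - 1/(2*S0) - 1/(2*St))).
    Returns infinity when sampling noise explains all of the change."""
    drift = fc - 1 / (2 * n_sampled_t0) - 1 / (2 * n_sampled_t1)
    if drift <= 0:
        return float("inf")
    return generations / (2 * drift)


# Made-up example: one allele moves from 0.5 to 0.6 in one generation,
# with 50 individuals sampled at each time point.
fc_example = nei_tajima_fc([0.5], [0.6])
ne_example = temporal_ne(fc_example, 1, 50, 50)
```

With these made-up inputs Fc is 0.04, of which 0.02 is attributed to sampling, giving an estimate of about 25.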

10.
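Research progress of DNA barcoding technology in animal taxonomy   Total citations: 18, self-citations: 0, citations by others: 18
DNA barcoding is a new method of biological classification, a product of combining molecular biology and bioinformatics. The idea is that, just as a scanner reads a barcode in a shop, every species on Earth can be identified by rapidly analyzing a short segment of its DNA (mitochondrial cytochrome c oxidase subunit I, mt COI). Over the past three years the technique has become a hot topic in taxonomy. In theory, DNA barcoding plays an important role in the classification and identification of organisms, but it is currently also the subject of considerable international debate. This review covers the origin, development, principles, and procedures of DNA barcoding and its application in animal taxonomy, highlights the significance and feasibility of applying the technique to parasite taxonomy, and discusses problems that may arise when DNA barcoding is applied to biological classification.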

11.
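Principles and procedures of DNA barcoding for fungi   Total citations: 1, self-citations: 0, citations by others: 1
Liu Shuyan  Zhang Ao  Li Yu 《Journal of Fungal Research (菌物研究)》2012,10(3):205-209
DNA barcoding is a method of species identification based on analyzing the DNA sequence of a relatively short target gene. By scanning one or more relevant genes across a broad range of taxa, it identifies unknown species or discovers new ones. The technique shows its strengths where traditional taxonomy runs into obstacles. Compared with other organisms, fungi have unique and complex life histories, so their morphological identification is constrained by the developmental stage of the fungus itself. Researchers in China and abroad have searched for a standard DNA barcode suitable for most fungi, but no gene fragment meeting all the criteria has yet been found. This paper introduces the concept, underlying principles, procedures, and advantages and disadvantages of DNA barcoding, and discusses the prospects for its application in fungal research in China.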

12.
13.
N. Lehman  R. K. Wayne 《Genetics》1991,128(2):405-416
A restriction-site survey of 327 coyotes (Canis latrans) from most parts of their North American range reveals 32 mitochondrial DNA (mtDNA) genotypes. The genotypes are not strongly partitioned in space, suggesting that there is high gene flow among coyote subpopulations. Consequently, each new geographic location added to the study has a decreasing probability of containing a mtDNA genotype that had not been previously discovered. This being the case, by using Monte Carlo sampling experiments, we can estimate the total number of genotypes that would be found if all possible localities were surveyed. This estimate of total genotypic variability agrees qualitatively with estimates based on theoretical considerations of the expected number of alleles in a stable population. We also predict effective population sizes from genotype data. The accuracy of these estimates is thought to be dependent on the fact that coyotes are not highly genetically structured, a situation which may apply to highly mobile species.

14.
High-throughput pooled resequencing offers significant potential for whole genome population sequencing. However, its main drawback is the loss of haplotype information. In order to regain some of this information, we present LDx, a computational tool for estimating linkage disequilibrium (LD) from pooled resequencing data. LDx uses an approximate maximum likelihood approach to estimate LD (r²) between pairs of SNPs that can be observed within and among single reads. LDx also reports r² estimates derived solely from observed genotype counts. We demonstrate that the LDx estimates are highly correlated with r² estimated from individually resequenced strains. We discuss the performance of LDx using more stringent quality conditions and infer via simulation the degree to which performance can improve based on read depth. Finally we demonstrate two possible uses of LDx with real and simulated pooled resequencing data. First, we use LDx to infer genomewide patterns of decay of LD with physical distance in D. melanogaster population resequencing data. Second, we demonstrate that r² estimates from LDx are capable of distinguishing alternative demographic models representing plausible demographic histories of D. melanogaster.
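
For readers unfamiliar with the LD statistic named in this abstract, the sketch below computes the textbook squared correlation between two biallelic SNPs from counts of the four two-site haplotypes, such as might be tallied from reads spanning both sites. This is not LDx's approximate maximum likelihood estimator, and the function name and counts are hypothetical.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Textbook LD statistic r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)),
    with D = pAB - pA * pB, from counts of the four haplotypes
    at two biallelic sites (alleles A/a at site 1, B/b at site 2)."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes observed")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    d = p_AB - p_A * p_B
    return d * d / denom


# Made-up counts: perfect association versus no association.
r2_complete = ld_r_squared(50, 0, 0, 50)
r2_none = ld_r_squared(25, 25, 25, 25)
```

In the first toy case only the AB and ab haplotypes occur, so r² is 1; in the second all four haplotypes are equally common, so r² is 0.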

15.
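A DNA barcode is a DNA sequence that can be used for species identification. This paper reviews the DNA barcode-based analytical methods developed in recent years and their application to species identification and the discovery of cryptic species, chiefly genetic distance methods, phylogenetic tree methods, similarity-based matching methods, diagnostic methods, and statistical classification methods, with the aim of providing a reference for the wider application of this technique.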

16.
The genetic polymorphism of apolipoprotein C in normal plasma of four European sheep breeds (Suffolk, Corriedale, Cheviot, and Finn) was first detected using one-dimensional polyacrylamide gel isoelectric focusing (pH 2.5–5.0) followed by immunoblotting with antihuman apolipoprotein CII antibody. Six phenotypes (1-1, 2-1, 2-2, 3-1, 3-2, and 3-3) were identified in the 4.3–4.8 pH range, consisting of the combination of three isoform groups. On the basis of family and population data, these phenotypes were controlled autosomally by three codominant alleles, designated APOC*1, APOC*2, and APOC*3, the first being the most common allele. The frequency distributions of these alleles were similar between the Suffolk and Corriedale sheep, and between the Cheviot and Finn sheep. The former breeds had a significantly lower APOC*2 frequency than the latter breeds (P < 0.001). The mean plasma total-, HDL- and LDL-cholesterol levels of type 3-1 animals were significantly higher compared to type 1-1 animals in the Suffolk sheep (P 0.04). However, these differences were not seen in the Corriedale sheep.
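
Because the three alleles are codominant, each phenotype reveals its genotype directly, and allele frequencies follow from simple gene counting. The Python sketch below shows that calculation; the function name is hypothetical and the phenotype counts are invented for illustration, not taken from the study.

```python
from collections import Counter


def allele_frequencies(phenotype_counts):
    """Gene counting for codominant alleles: each animal with phenotype
    "i-j" contributes one copy of allele i and one copy of allele j."""
    allele_counts = Counter()
    for phenotype, n_animals in phenotype_counts.items():
        first, second = phenotype.split("-")
        allele_counts[first] += n_animals
        allele_counts[second] += n_animals
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no animals counted")
    return {allele: n / total for allele, n in allele_counts.items()}


# Invented counts for 100 animals, keyed by the phenotype labels
# used in the abstract (1-1, 2-1, 2-2, 3-1, ...).
example_counts = {"1-1": 50, "2-1": 30, "2-2": 10, "3-1": 10}
example_freqs = allele_frequencies(example_counts)
```

With these invented counts the 200 gene copies split 140, 50, and 10 among alleles 1, 2, and 3, giving frequencies of 0.70, 0.25, and 0.05.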

17.
With the global biodiversity crisis, DNA barcoding aims at fast species identification and the revelation of cryptic species diversity. For more than 10 years, large amounts of DNA barcode data have been accumulating in publicly available databases, most of which were analyzed with distance or tree-building methods that have often been disputed, especially for revealing cryptic species. In this context, overlooked cryptic diversity may exist in the available barcoding data. Character-based DNA barcoding, however, has a good chance of detecting this overlooked cryptic diversity. In this study, marine mollusks served as an ideal case for detecting overlooked potential cryptic species in existing cytochrome c oxidase I (COI) sequences with character-based DNA barcoding. A total of 1081 COI sequences of mollusks, belonging to 176 species of 25 families of Gastropoda, Cephalopoda, and Lamellibranchia, were subjected to character analysis. As a whole, the character-based barcoding results were consistent with previous distance and tree-building analyses for species discrimination. More importantly, quite a number of the species analyzed were divided into distinct clades with unique diagnostic characters. Based on the concept of cryptic species revelation in character-based barcoding, these species divided into separate taxonomic groups might be potential cryptic species. The detection of this overlooked potential cryptic diversity shows that the character-based barcoding mode has advantages in revealing cryptic biodiversity. With the development of DNA barcoding, making the best use of barcoding data is worthy of our attention for species conservation.

18.
Sugarcane borers are economically damaging insects with species that vary in distribution patterns both geographically and temporally, and vary based on ecological niche. Currently, identification of sugarcane borers is mostly based on morphological characters. However, morphological identification requires taxonomic expertise. An alternative method to identify sugarcane borers is the use of molecular data. DNA barcoding based on partial cytochrome c oxidase subunit 1 (COI) sequences has proven to be a useful tool for rapid and accurate species determination in many insect taxa. This study was conducted to test the effectiveness of DNA barcodes to discriminate among sugarcane borer species in China. Partial sequences of the COI gene (709 bp) were obtained from six species collected from different geographic areas. Results showed that the pairwise intraspecies genetic distance was < 0.02, whereas the interspecies genetic distance ranged from 0.117 to 0.182. Results from a neighbor-joining tree showed that the six sugarcane borer species were clearly separated. These results suggested that the partial COI sequences had high barcoding resolution in discriminating among sugarcane borer species. Our study emphasized the use of DNA barcodes for identification of the analyzed sugarcane borer species and represents an important step for building a comprehensive barcode library for sugarcane borers in China.
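
The pairwise genetic distances reported here can be illustrated with the simplest distance measure, the uncorrected p-distance (the proportion of aligned sites that differ). The abstract does not say which distance model the study used, so treating it as a p-distance is an assumption, and the function name and toy sequences below are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences: the fraction
    of compared sites that differ. Sites where either sequence has a gap
    or an ambiguous base (anything other than A, C, G, T) are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments (hypothetical), differing at 1 of 8 sites.
toy_distance = p_distance("ACGTACGT", "ACGTACGA")
```

A barcoding gap of the kind described would show up as within-species values that are all small (below 0.02 in the study) and between-species values that are all clearly larger.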

19.
Iain Mathieson  Gil McVean 《Genetics》2013,193(3):973-984
Inferring the nature and magnitude of selection is an important problem in many biological contexts. Typically when estimating a selection coefficient for an allele, it is assumed that samples are drawn from a panmictic population and that selection acts uniformly across the population. However, these assumptions are rarely satisfied. Natural populations are almost always structured, and selective pressures are likely to act differentially. Inference about selection ought therefore to take account of structure. We do this by considering evolution in a simple lattice model of spatial population structure. We develop a hidden Markov model based maximum-likelihood approach for estimating the selection coefficient in a single population from time series data of allele frequencies. We then develop an approximate extension of this to the structured case to provide a joint estimate of migration rate and spatially varying selection coefficients. We illustrate our method using classical data sets of moth pigmentation morph frequencies, but it has wide applications in settings ranging from ecology to human evolution.

20.
It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.
