Similar articles
20 similar articles found (search time: 0 ms)
1.
DeGiorgio M  Jankovic I  Rosenberg NA 《Genetics》2010,186(4):1367-1387
Gene diversity, a commonly used measure of genetic variation, evaluates the proportion of heterozygous individuals expected at a locus in a population, under the assumption of Hardy-Weinberg equilibrium. When using the standard estimator of gene diversity, the inclusion of related or inbred individuals in a sample produces a downward bias. Here, we extend a recently developed estimator shown to be unbiased in a diploid autosomal sample that includes known related or inbred individuals to the general case of arbitrary ploidy. We derive an exact formula for the variance of the new estimator, H, and present an approximation to facilitate evaluation of the variance when each individual is related to at most one other individual in a sample. When examining samples from the human X chromosome, which represent a mixture of haploid and diploid individuals, we find that H performs favorably compared to the standard estimator, both in theoretical computations of mean squared error and in data analysis. We thus propose that H is a useful tool in characterizing gene diversity in samples of arbitrary ploidy that contain related or inbred individuals.
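For orientation, the "standard estimator" mentioned in this abstract is usually written as n/(n-1) * (1 - sum of squared sample allele frequencies), where n is the number of sampled alleles. The sketch below computes only that standard form, assuming unrelated, non-inbred individuals. It is not the relatedness-corrected estimator H, whose formula the abstract does not give. The function name `gene_diversity` and the example sample are illustrative assumptions.

```python
from collections import Counter


def gene_diversity(alleles):
    """Standard gene diversity estimator for a list of sampled allele labels.

    Assumes the sampled alleles come from unrelated, non-inbred individuals;
    this is the estimator that the abstract reports as biased downward
    when relatives are included, not the corrected estimator H.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    counts = Counter(alleles)
    # Sum of squared sample allele frequencies (sample homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    # The n / (n - 1) factor corrects for sampling a finite number of alleles.
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: two diploid individuals, alleles listed one by one.
sample = ["A", "A", "a", "a"]
print(gene_diversity(sample))
```

Passing one flat list of alleles keeps the sketch independent of ploidy, but it also discards which alleles belong to which individual, and that is the information a relatedness-aware estimator would need.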

2.
3.
Gene diversity is an important measure of genetic variability in inbred populations. The survival of species in changing environments depends on, among other factors, the genetic variability of the population. In this communication, I have derived the uniformly minimum variance unbiased estimator of gene diversity. The proposed estimator of gene diversity does not assume that the inbreeding coefficient is known. I have also provided the approximate variance of this estimator according to Fisher's method. In addition, I have developed a numerical resampling-based method for obtaining variances and confidence intervals based on the maximum likelihood estimator and the uniformly minimum variance unbiased estimator. Efficiency in estimation of the gene diversity based on these two estimators is discussed. In accordance with the simulation results, I found that the uniformly minimum variance estimator developed in this report is more accurate for estimation of gene diversity than the maximum likelihood estimator.

4.
Testing for Hardy-Weinberg equilibrium in samples with related individuals   (Total citations: 2; self-citations: 0; citations by others: 2)
Bourgain C  Abney M  Schneider D  Ober C  McPeek MS 《Genetics》2004,168(4):2349-2361
When the classical chi-square goodness-of-fit test for Hardy-Weinberg (HW) equilibrium is used on samples with related individuals, the type I error can be greatly inflated. In particular, the test is inappropriate in population isolates where the individuals are related through multiple lines of descent. In this article, we propose a new test for HW (the QL-HW test) suitable for any sample with related individuals, including large inbred pedigrees, provided that their genealogy is known. Performed conditional on the pedigree structure, the QL-HW test detects departures from HW that are not due to the genealogy. Because the computation of the QL-HW test becomes intractable for very polymorphic loci in large inbred pedigrees, a simpler alternative, the GCC-HW test, is also proposed. The statistical properties of the QL-HW and GCC-HW tests are studied through simulations considering a sample of independent nuclear families, a sample of extended outbred genealogies, and samples from the Hutterite population, a highly inbred North American isolate. Finally, the method is used to test a set of 143 biallelic markers spanning 82 genes in this latter population.
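As a point of reference, the classical chi-square goodness-of-fit test that this abstract criticizes can be sketched for a single biallelic marker as follows. This is a minimal illustration that assumes unrelated individuals, which is exactly the assumption the abstract says fails in pedigree samples. It is not the QL-HW or GCC-HW test. The function name `hw_chi_square` and the genotype counts are hypothetical.

```python
import math


def hw_chi_square(n_aa, n_ab, n_bb):
    """Classical chi-square test of Hardy-Weinberg proportions (1 degree of freedom).

    Takes observed counts of the three genotypes at a biallelic marker and
    returns (statistic, p_value). Assumes unrelated individuals.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    # Allele frequency estimated by counting alleles in the sample.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for genotypes AA, AB, BB.
stat, p_value = hw_chi_square(30, 40, 30)
print(stat, p_value)
```

The p-value uses the closed form for one degree of freedom, so the sketch needs only the standard library. A marker with more than two alleles would need a general chi-square distribution function instead.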

5.
We show that the number of lineages ancestral to a sample, as a function of time back into the past, which we call the number of lineages as a function of time (NLFT), is a nearly deterministic property of large-sample gene genealogies. We obtain analytic expressions for the NLFT for both constant-sized and exponentially growing populations. The low level of stochastic variation associated with the NLFT of a large sample suggests using the NLFT to make estimates of population parameters. Based on this, we develop a new computational method of inferring the size and growth rate of a population from a large sample of DNA sequences at a single locus. We apply our method first to a sample of 1,212 mitochondrial DNA (mtDNA) sequences from China, confirming a pattern of recent population growth previously identified using other techniques, but with much smaller confidence intervals for past population sizes due to the low variation of the NLFT. We further analyze a set of 63 mtDNA sequences from blue whales (BWs), concluding that the population grew in the past. This calls for reevaluation of previous studies that were based on the assumption that the BW population was fixed.

6.
SUMMARY: Graphical modeling is used to extend the gene counting method to compute maximum likelihood estimates of allele frequencies for samples of individuals related in extended pedigrees. Genotypes may be missing or partially observed, and error rates can be simultaneously estimated. AVAILABILITY: The Java classes and Javadocs pages for GeneCountAlleles can be obtained from bioinformatics.med.utah.edu/~alun, which also has more information on its use and file formats.
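The gene counting method that this abstract extends is an expectation-maximization scheme. In its simplest textbook setting, it estimates an allele frequency when some genotypes are only partially observed. The sketch below covers that simple case only: unrelated individuals, one biallelic locus, and a dominant allele A, so that genotypes AA and Aa cannot be told apart. It does not implement the pedigree-based graphical model or error-rate estimation of GeneCountAlleles, and the function name and counts are illustrative assumptions.

```python
def gene_counting_dominant(n_dominant, n_recessive, tol=1e-12, max_iter=1000):
    """Estimate the frequency of dominant allele A by gene counting (EM).

    n_dominant: individuals showing the dominant phenotype (genotype AA or Aa).
    n_recessive: individuals showing the recessive phenotype (genotype aa).
    Assumes unrelated individuals and Hardy-Weinberg proportions.
    """
    n = n_dominant + n_recessive
    if n == 0:
        raise ValueError("no individuals observed")
    if n_dominant == 0:
        return 0.0
    p = 0.3  # arbitrary starting value strictly between 0 and 1
    for _ in range(max_iter):
        q = 1.0 - p
        # E step: split the dominant class into expected AA and Aa counts.
        denom = p * p + 2.0 * p * q
        exp_aa = n_dominant * p * p / denom
        exp_ab = n_dominant * 2.0 * p * q / denom
        # M step: count A alleles among the 2n alleles in the sample.
        p_new = (2.0 * exp_aa + exp_ab) / (2.0 * n)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


# Hypothetical counts: 84 dominant and 16 recessive phenotypes.
print(gene_counting_dominant(84, 16))
```

In this simple case the iteration converges to the closed-form answer, in which the recessive allele frequency is the square root of the recessive phenotype proportion. That makes the sketch easy to check, and it shows why iteration only becomes necessary in richer settings such as the pedigrees treated in the paper.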

7.
8.
Genomic techniques commonly used for assessing distributions of microorganisms in the environment often produce small sample sizes. We investigated artificial neural networks for analyzing the distributions of nitrite reductase genes (nirS and nirK) and two sets of dissimilatory sulfite reductase genes (dsrAB1 and dsrAB2) in small sample sets. Data reduction (to reduce the number of input parameters), cross-validation (to measure the generalization error), weight decay (to adjust model parameters to reduce generalization error), and importance analysis (to determine which variables had the most influence) were useful in developing and interpreting neural network models that could be used to infer relationships between geochemistry and gene distributions. A robust relationship was observed between geochemistry and the frequencies of genes that were not closely related to known dissimilatory sulfite reductase genes (dsrAB2). Uranium and sulfate appeared to be the most related to distribution of two groups of these unusual dsrAB-related genes. For the other three groups, the distributions appeared to be related to pH, nickel, nonpurgeable organic carbon, and total organic carbon. The models relating the geochemical parameters to the distributions of the nirS, nirK, and dsrAB1 genes did not generalize as well as the models for dsrAB2. The data also illustrate the danger (generating a model that has a high generalization error) of not using a validation approach in evaluating the meaningfulness of the fit of linear or nonlinear models to such small sample sizes.

9.
An investigation has been made of the relation between species diversity and the lognormal distribution of individuals among species, using phytoplankton samples from the Indian Ocean. The area under the truncated lognormal curve representing a sample gives the total number of species, S, while the logarithmic standard deviation, σ, gives a measure of the scatter of distribution of individuals among species, the other factor affecting diversity. Using a method described by Hald, truncated lognormal curves were fitted to the phytoplankton data and estimates obtained for the mean, ξ, the logarithmic standard deviation, σ, the number of species in the modal octave, No, and the number of species in the population universe, N. Since one is interested in a sample only in so far as it reflects the properties of the population from whence it came, estimated population parameters were used to define measures of diversity by which means it was hoped to obtain a diversity index independent of sample size, i.e., a diversity value related more to the population than to the sample.

10.
The eukaryotic biodiversity in historical air-dried samples of Dutch agricultural soil has been assessed by random sequencing of an 18S rRNA gene library and by denaturing gradient gel electrophoresis. Representatives of nearly all taxa of eukaryotic soil microbes could be identified, demonstrating that it is possible to study eukaryotic microbiota in samples from soil archives that have been stored for more than 30 years at room temperature. In a pilot study, 41 sequences were retrieved that could be assigned to fungi and a variety of aerobic and anaerobic protists such as cercozoans, ciliates, xanthophytes (stramenopiles), heteroloboseans, and amoebozoans. A PCR-denaturing gradient gel electrophoresis analysis of samples collected between 1950 and 1975 revealed significant changes in the composition of the eukaryotic microbiota.

11.
Family studies of individual tissues have shown that gene expression traits are genetically heritable. Here, we investigate cis and trans components of heritability both within and across tissues by applying variance-components methods to 722 Icelanders from family cohorts, using identity-by-descent (IBD) estimates from long-range phased genome-wide SNP data and gene expression measurements for approximately 19,000 genes in blood and adipose tissue. We estimate the proportion of gene expression heritability attributable to cis regulation as 37% in blood and 24% in adipose tissue. Our results indicate that the correlation in gene expression measurements across these tissues is primarily due to heritability at cis loci, whereas there is little sharing of trans regulation across tissues. One implication of this finding is that heritability in tissues composed of heterogeneous cell types is expected to be more dominated by cis regulation than in tissues composed of more homogeneous cell types, consistent with our blood versus adipose results as well as results of previous studies in lymphoblastoid cell lines. Finally, we obtained similar estimates of the cis components of heritability using IBD between unrelated individuals, indicating that transgenerational epigenetic inheritance does not contribute substantially to the "missing heritability" of gene expression in these tissue types.

12.
Iqbal A  Lim YA  Surin J  Sim BL 《PloS one》2012,7(2):e31139

Background

Currently, there is a lack of vital information in the genetic makeup of Cryptosporidium especially in developing countries. The present study aimed at determining the genotypes and subgenotypes of Cryptosporidium in hospitalized Malaysian human immunodeficiency virus (HIV) positive patients.

Methodology/Principal Findings

In this study, 346 faecal samples collected from Malaysian HIV positive patients were genetically analysed via PCR targeting the 60 kDa glycoprotein (gp60) gene. Eighteen (5.2% of 346) isolates were determined as Cryptosporidium positive with 72.2% (of 18) identified as Cryptosporidium parvum whilst 27.7% as Cryptosporidium hominis. Further gp60 analysis revealed C. parvum belonging to subgenotypes IIaA13G1R1 (2 isolates), IIaA13G2R1 (2 isolates), IIaA14G2R1 (3 isolates), IIaA15G2R1 (5 isolates) and IIdA15G1R1 (1 isolate). C. hominis was represented by subgenotypes IaA14R1 (2 isolates), IaA18R1 (1 isolate) and IbA10G2R2 (2 isolates).

Conclusions/Significance

These findings highlighted the presence of high diversity of Cryptosporidium subgenotypes among Malaysian HIV infected individuals. The predominance of the C. parvum subgenotypes signified the possibility of zoonotic as well as anthroponotic transmissions of cryptosporidiosis in HIV infected individuals.

13.
14.
15.
An approximately unbiased (AU) test that uses a newly devised multiscale bootstrap technique was developed for general hypothesis testing of regions in an attempt to reduce test bias. It was applied to maximum-likelihood tree selection for obtaining the confidence set of trees. The AU test is based on the theory of Efron et al. (Proc. Natl. Acad. Sci. USA 93:13429-13434; 1996), but the new method provides higher-order accuracy yet simpler implementation. The AU test, like the Shimodaira-Hasegawa (SH) test, adjusts the selection bias overlooked in the standard use of the bootstrap probability and Kishino-Hasegawa tests. The selection bias comes from comparing many trees at the same time and often leads to overconfidence in the wrong trees. The SH test, though safe to use, may exhibit another type of bias such that it appears conservative. Here I show that the AU test is less biased than other methods in typical cases of tree selection. These points are illustrated in a simulation study as well as in the analysis of mammalian mitochondrial protein sequences. The theoretical argument provides a simple formula that covers the bootstrap probability test, the Kishino-Hasegawa test, the AU test, and the Zharkikh-Li test. A practical suggestion is provided as to which test should be used under particular circumstances.

16.
An unbiased genome-wide analysis of zinc-finger nuclease specificity   (Total citations: 1; self-citations: 0; citations by others: 1)
Zinc-finger nucleases (ZFNs) allow gene editing in live cells by inducing a targeted DNA double-strand break (DSB) at a specific genomic locus. However, strategies for characterizing the genome-wide specificity of ZFNs remain limited. We show that nonhomologous end-joining captures integrase-defective lentiviral vectors at DSBs, tagging these transient events. Genome-wide integration site analysis mapped the actual in vivo cleavage activity of four ZFN pairs targeting CCR5 or IL2RG. Ranking loci with repeatedly detectable nuclease activity by deep-sequencing allowed us to monitor the degree of ZFN specificity in vivo at these positions. Cleavage required binding of ZFNs in specific spatial arrangements on DNA bearing high homology to the intended target site and only tolerated mismatches at individual positions of the ZFN binding sites. Whereas the consensus binding sequence derived in vivo closely matched that obtained in biochemical experiments, the ranking of in vivo cleavage sites could not be predicted in silico. Comprehensive mapping of ZFN activity in vivo will facilitate the broad application of these reagents in translational research.

17.
18.
Analysis of genomic DNA derived from cells and fresh or fixed tissues often requires whole genome amplification prior to microarray screening. Technical hurdles to this process are the introduction of amplification bias and/or the inhibitory effects of formalin fixation on DNA amplification. Here we demonstrate a balanced-PCR procedure that allows unbiased amplification of genomic DNA from fresh or modestly degraded paraffin-embedded DNA samples. Following digestion and ligation of a target and a control genome with distinct linkers, the two are mixed and amplified in a single PCR, thereby avoiding biases associated with PCR saturation and impurities. We demonstrate genome-wide retention of allelic differences following balanced-PCR amplification of DNA from breast cancer and normal human cells and genomic profiling by array-CGH (cDNA arrays, 100 kb resolution) and by real-time PCR (single gene resolution). Comparison of balanced-PCR with multiple displacement amplification (MDA) demonstrates equivalent performance between the two when intact genomic DNA is used. When DNA from paraffin-embedded samples is used, balanced PCR overcomes problems associated with modest DNA degradation and produces unbiased amplification whereas MDA does not. Balanced-PCR allows amplification and recovery of modestly degraded genomic DNA for subsequent retrospective analysis of human tumors with known outcomes.

19.
20.
Genotypic diversity: estimation and prediction in samples   (Total citations: 10; self-citations: 1; citations by others: 10)
Stoddart JA  Taylor JF 《Genetics》1988,118(4):705-711
We show that a commonly used statistic of genotypic diversity can be used to reflect one form of deviation from panmixia, viz. clonal reproduction, by comparing observed and predicted sample statistics. The characteristics of the statistic, in particular its relationship with population genotypic diversity, are formalised and a method of predicting the genotypic diversity of a sample drawn from a panmictic population using allelic frequencies and sample size is developed. The sensitivity of some possible tests of significance of the deviation from panmictic expectations is examined using computer simulations. Goodness-of-fit tests are robust but produce an unacceptably high level of type II error. With means and variances calculated either from Monte Carlo simulations or from distributional and series approximations, t-tests perform better than goodness-of-fit tests. Under simulation, both forms of t-test exhibit acceptable rates of type I error. Rates of type II are usually large when allele frequencies are severely skewed although the latter test performs the better in those conditions.
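The abstract does not state the formula of the genotypic diversity statistic. The sketch below assumes it is the inverse of the sum of squared sample genotype frequencies, the form commonly attributed to this paper, and that assumption should be checked against the article itself. The Monte Carlo part is a simplified single-locus illustration of how a mean and variance under panmixia could be simulated from allele frequencies and sample size. It is not the authors' procedure, and all function names, frequencies, and parameter values are hypothetical.

```python
import random
from collections import Counter


def genotypic_diversity(genotypes):
    """Inverse of the sum of squared genotype frequencies (assumed form)."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("empty sample")
    counts = Counter(genotypes)
    return 1.0 / sum((c / n) ** 2 for c in counts.values())


def simulate_panmictic_g(allele_freqs, sample_size, replicates=1000, seed=0):
    """Monte Carlo mean and variance of the statistic under random mating.

    allele_freqs: dict mapping allele label to frequency at a single locus.
    Each simulated diploid genotype is two independent allele draws.
    """
    rng = random.Random(seed)
    alleles = list(allele_freqs)
    weights = [allele_freqs[a] for a in alleles]
    values = []
    for _ in range(replicates):
        sample = []
        for _ in range(sample_size):
            pair = rng.choices(alleles, weights=weights, k=2)
            # Sort so that ("a", "A") and ("A", "a") count as one genotype.
            sample.append(tuple(sorted(pair)))
        values.append(genotypic_diversity(sample))
    mean = sum(values) / replicates
    var = sum((v - mean) ** 2 for v in values) / (replicates - 1)
    return mean, var


# Hypothetical observed sample and allele frequencies.
observed = ["AA", "AA", "Aa", "aa"]
print(genotypic_diversity(observed))
print(simulate_panmictic_g({"A": 0.5, "a": 0.5}, sample_size=20))
```

An observed value far below the simulated mean would be the kind of deviation the abstract links to clonal reproduction. Judging its significance would still require one of the tests the paper compares.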
