Similar literature
1.

Background  

With the completion of the HapMap project, a variety of computational algorithms and tools have been proposed for haplotype inference, tag SNP selection and genome-wide association studies. Simulated data are commonly used in evaluating these newly developed approaches. In addition to simulations based on population models, empirical data generated by perturbing real data have also been used because they may inherit specific properties from the real data. However, no publicly available tool generates large-scale simulated variation data that takes into account knowledge from the HapMap project.

2.
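The Human Microbiome Project is also known as the second human genome project. It was funded by the US National Institutes of Health and formally launched in 2007. The plan was to spend US$150 million over five years to sequence 900 human microbial genomes. Its goals are to explore the feasibility of studying the human microbiome; to study the relationship between changes in the human microbiome and disease and health; and to provide information and technical support for other scientific research.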

3.
Surveys of biochemical and molecular genetic variation in natural populations have generated a wealth of data, but this valuable resource has not been adequately preserved. We hope to prevent further loss by establishing a community database for population genetic surveys. We explored the feasibility of a population genetics database by developing a prototype for animal mitochondrial DNA (mtDNA) surveys. This prototype includes the specification of a format for data files that are to be submitted to the database, an open-source object database that encapsulates data with methods to display and analyze data, and a website where data can be retrieved in either its original form or extensible markup language (XML). Data from more than 50 published surveys of mtDNA variation were retrieved from the literature and entered into the database. We hope that the population genetics community will support this project by contributing both data and expertise.  相似文献   

4.
Kang CJ  Marjoram P 《Genetics》2011,189(2):595-605
We live in an age in which our ability to collect large amounts of genome-wide genetic variation data offers the promise of providing the key to the understanding and treatment of genetic diseases. Over the next few years this effort will be spearheaded by so-called next-generation sequencing technologies, which provide vast amounts of short-read sequence data at relatively low cost. This technology is often used to detect unknown variation in regions that have been linked with a given disease or phenotype. However, error rates are significant, leading to some nontrivial issues when it comes to interpreting the data. In this article, we present a method with which to address questions of widespread interest: calling variants and estimating the population mutation rate. We show performance of the method using simulation studies before applying our approach to an analysis of data from the 1000 Genomes project.  相似文献   
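As a point of reference for the population mutation rate mentioned in this abstract, the classical Watterson estimator can be sketched in a few lines. This is not the method of Kang and Marjoram, which also has to deal with sequencing error; it is a textbook baseline that assumes error-free data, and the function name `watterson_theta` and the input values are illustrative assumptions.

```python
def watterson_theta(segregating_sites: int, sample_size: int) -> float:
    """Classical Watterson estimate of the population mutation rate (theta).

    Assumes error-free sequences; this is a baseline sketch, not the
    error-aware method described in the abstract.
    """
    if sample_size < 2:
        raise ValueError("need at least two sequences")
    if segregating_sites < 0:
        raise ValueError("number of segregating sites cannot be negative")
    # Harmonic number a_n = sum_{i=1}^{n-1} 1/i
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return segregating_sites / a_n


# Hypothetical example: 10 segregating sites observed in 5 sequences.
theta = watterson_theta(segregating_sites=10, sample_size=5)
print(round(theta, 3))
```

With short-read data, sequencing errors tend to inflate the count of segregating sites, which is one reason an error-aware approach is needed in practice.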

5.
The distribution of spatial genetic variation across a region can shape evolutionary dynamics and impact population persistence. Local population dynamics and among‐population dispersal rates are strong drivers of this spatial genetic variation, yet for many species we lack a clear understanding of how these population processes interact in space to shape within‐species genetic variation. Here, we used extensive genetic and demographic data from 10 subpopulations of greater sage‐grouse to parameterize a simulated approximate Bayesian computation (ABC) model and (i) test for regional differences in population density and dispersal rates for greater sage‐grouse subpopulations in Wyoming, and (ii) quantify how these differences impact subpopulation regional influence on genetic variation. We found a close match between observed and simulated data under our parameterized model and strong variation in density and dispersal rates across Wyoming. Sensitivity analyses suggested that changes in dispersal (via landscape resistance) had a greater influence on regional differentiation, whereas changes in density had a greater influence on mean diversity across all subpopulations. Local subpopulations, however, varied in their regional influence on genetic variation. Decreases in the size and dispersal rates of central populations with low overall and net immigration (i.e. population sources) had the greatest negative impact on genetic variation. Overall, our results provide insight into the interactions among demography, dispersal and genetic variation and highlight the potential of ABC to disentangle the complexity of regional population dynamics and project the genetic impact of changing conditions.  相似文献   

6.
Data on the genetic structure of tree and shrub populations on the continental scale have accumulated dramatically over the past decade. However, our ability to make inferences on the impact of the last ice age still depends crucially on the availability of informative palaeoecological data. This is well illustrated by the results from a recent project, during which new pollen fossil maps were established and the variation in chloroplast DNA was studied in 22 European species of trees and shrubs. Species exhibit very different levels of genetic variation between and within populations, and obviously went through very different histories after Ice Ages. However, when palaeoecological data are non-informative, inferences on past history are difficult to draw from entirely genetic data. On the other hand, as illustrated by a study in ponderosa pine, when we can infer the species' history with some certainty, coalescent simulations can be used and new hypotheses can be tested.  相似文献   

7.

Background  

Next generation ultra-sequencing technologies are starting to produce extensive quantities of data from entire human genome or exome sequences, and therefore new software is needed to present and analyse this vast amount of information. The 1000 Genomes project has recently released raw data for 629 complete genomes representing several human populations through their Phase I interim analysis and, although there are certain public tools available that allow exploration of these genomes, to date there is no tool that permits comprehensive population analysis of the variation catalogued by such data.  相似文献   

8.
Many species appear to be undergoing shifts in phenology, arising from climate change. To predict the direction and magnitude of future changes requires an understanding of how phenology depends on climatic variation. Species show large‐scale spatial variation in phenology (affected by differentiation among populations) as well as variation in phenology from year‐to‐year at the same site (affected predominantly by local plasticity). Teasing apart spatial and temporal variation in phenology should allow improved predictions of phenology under climate change. This study is the first to quantify large‐scale spatial and temporal variation in the entire emergence pattern of species, and to test the relationships found by predicting future data. We use data from up to 33 years of permanent transect records of butterflies in the United Kingdom to fit and test models for 15 butterfly species. We use generalized additive models to model spatial and temporal variation in the distribution of adult butterflies over the season, allowing us to capture changes in the timing of emergence peaks, relative sizes of peaks and/or number of peaks in a single analysis. We develop these models using data for 1973–2000, and then use them to predict phenologies from 2001 to 2006. For six of our study species, a model with only spatial variation in phenology is the best predictor of the future, implying that these species have limited plasticity. For the remaining nine species, the best predictions come from a model with both spatial and temporal variation in phenology; for four of these, growing degree‐days have similar effects over space and time, implying high levels of plasticity. The results show that statistical phenology models can be used to predict phenology shifts in a second time period, suggesting that it should be feasible to project phenologies under climate change scenarios, at least over modest time scales.  相似文献   

9.
wombsoft is an R package that analyses individually georeferenced multilocus genotypes to infer genetic boundaries between populations. It is based on the Wombling method, which estimates the systemic function by looking for local variation in allele frequencies. This study presents an original way of estimating the systemic function, based on local polynomial regression, and a binomial test to assess the significance of boundaries. The method applies to codominant or dominant markers and allows for missing data. R can be downloaded from http://www.r‐project.org/ and wombsoft from http://www‐leca.ujf‐grenoble.fr/logiciels.htm or http://www.r‐project.org/ .

10.
Environmental variation over a species's range creates differing pressures to which organisms must adjust in order to survive. Taxa can respond to these pressures at population and individual levels, leading to localized phenotypic differentiation. Assessing the spatial distribution of phenotypic variation can illuminate how dramatically varying environmental factors shape phenotypes and may forecast a taxon's ability to adapt should conditions change. We characterized morphological variation along a transect sampled in the Grinnell Resurvey project to determine whether Gambel's white‐footed mouse (Peromyscus maniculatus gambelii), a generalist taxon inhabiting the full elevational range of habitats in Yosemite National Park and surrounding areas, has responded morphologically to variation in its environment. We quantified variation in modern P. m. gambelii cranial shape using 2D generalized Procrustes analysis and Euclidean distance matrix‐based geometric morphometrics. We performed multivariate regression of shape coordinates on elevation to test for environmental influences on shape within the principal geographic dimension of change along the transect. We observe a statistically significant correlation with shape on elevation for occlusal and lateral views of the cranium, explaining a small percentage of the overall variation in shape. Modern P. m. gambelii crania show a pattern of flexion in which the angle of the cranial base decreases at higher elevations. Results of EDMA parallel these findings, but highlight additional areas of the cranium that vary with elevation. Collectively, the patterns of variation detected suggest a biological response to the environment that warrants further study. This work lays the foundation for comparison with morphological data from historical specimens, which can address evolutionary scenarios generated from our findings, and for investigation of other taxa included in the resurvey project. J. Morphol. 271:897–909, 2010. 
© 2010 Wiley‐Liss, Inc.

11.
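Song SH  Teng XF  Xiao JF 《Hereditas (Beijing)》2018,40(11):1048-1054
With the implementation of the Human Genome Project and the international 1000 Genomes Project, whole-genome data for hundreds of Chinese individuals have been made public. Building a high-accuracy reference genome sequence for the Chinese population, and discovering and interpreting the sequence variants specific to it, is the foundation of future precision medicine research in China. To meet the need for scientific management and in-depth study of the continually growing body of Chinese genome data in future precision medicine research, the Beijing Institute of Genomics, Chinese Academy of Sciences, has developed and established the Virtual Chinese Genome Database (VCGDB) and the Genome Variation Map (GVM), both based on whole-genome sequencing data from Chinese populations. They provide data search, sharing, download and online analysis services to users in China and abroad. This article focuses on the features and functions of these two databases, as well as their future development and application prospects, in the hope of providing useful information for the promotion, development and improvement of reference genome and genome variation map resources for the Chinese population.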

12.
Noor MA  Cunningham AL  Larkin JC 《Genetics》2001,159(2):581-588
We examine the effect of variation in gene density per centimorgan on quantitative trait locus (QTL) mapping studies using data from the Drosophila melanogaster genome project and documented regional rates of recombination. There is tremendous variation in gene density per centimorgan across this genome, and we observe that this variation can cause systematic biases in QTL mapping studies. Specifically, in our simulated mapping experiments of 50 equal-effect QTL distributed randomly across the physical genome, very strong QTL are consistently detected near the centromeres of the two major autosomes, and few or no QTL are often detected on the X chromosome. This pattern persisted with varying heritability, marker density, QTL effect sizes, and transgressive segregation. Our results are consistent with empirical data collected from QTL mapping studies of this species and its close relatives, and they explain the "small X-effect" that has been documented in genetic studies of sexual isolation in the D. melanogaster group. Because of the biases resulting from recombination rate variation, results of QTL mapping studies should be taken as hypotheses to be tested by additional genetic methods, particularly in species for which detailed genetic and physical genome maps are not available.  相似文献   

13.
This study investigates the size, composition and concentration of airborne particles. These features are examined from continuously recorded volumetric daily air samples, taken by Burkard and Hirst traps from the center of Cardiff City, and from samples from selected sites around Cardiff. The set of slides is unique as it dates from 1954 to the present day and contains data that precede any other routine measurements of PM10. Image analysis has not been used previously to examine PM10 from slides taken by Hirst-type traps, but it has been demonstrated as an important application in alternative projects. The advantages of being able to perform simple but tedious measurements quickly make it an important tool for this project. It can also measure a number of images simultaneously and quantify parameters that would otherwise have been based on qualitative subjective comparisons. Environmental data including wind speeds, rainfall and temperature measurements are investigated to examine the influence on the temporal variation of the abundance and characteristics of airborne particulate matter. Confounding factors that may have impacts on cardiovascular and respiratory illness are being examined. These include data on aeroallergens (pollen and fungal spore counts), nitrogen oxide, sulphur dioxide, and carbon monoxide. The project will be extended to an analysis of the results in relation to health data.

14.
DNA microarray gene expression and microarray-based comparative genomic hybridization (aCGH) have been widely used for biomedical discovery. Because of the large number of genes and the complex nature of biological networks, various analysis methods have been proposed. One such method is "gene shaving," a procedure that identifies subsets of genes with coherent expression patterns and large variation across samples. Since combining genomic information from multiple sources can improve classification and prediction of diseases, in this paper we proposed a new method, "ICA gene shaving" (ICA, independent component analysis), for jointly analyzing gene expression and copy number data. First we used ICA to analyze joint measurements, gene expression and copy number, of a biological system and project the data onto statistically independent biological processes. Next, we used these results to identify patterns of variation in the data and then applied an iterative shaving method. We investigated the properties of our proposed method by analyzing both simulated and real data. We demonstrated the robustness of our method to noise using simulated data. Using breast cancer data, we showed that our method is superior to the Generalized Singular Value Decomposition (GSVD) gene shaving method for identifying genes associated with breast cancer.

15.
Gene frequencies of coat colour and horn types were assessed in 22 Nordic cattle breeds in a project aimed at establishing genetic profiles of the breeds under study. The coat colour loci yielding information on genetic variation were: extension, agouti, spotting, brindle, dun dilution and colour sided. The polled locus was assessed for two alleles. A profound variation between breeds was observed in the frequencies of both colour and horn alleles, with the older breeds generally showing greater variation in observed colour, horn types and segregating alleles than the modern breeds. The correspondence between the present genetic distance matrix and previous molecular marker distance matrices was low (r = 0.08 – 0.12). The branching pattern of a neighbour-joining tree disagreed to some extent with the molecular data structure. The current data indicate that 70% of the total genetic variation could be explained by differences between the breeds, suggesting a much greater breed differentiation than typically found at protein and microsatellite loci. The marked differentiation of the cattle breeds and the observed disagreements with previous molecular data in the topology of the phylogenetic trees are most likely a result of selection on the phenotypic characters analysed in this study.

16.

Background

Breast density is a significant breast cancer risk factor. Currently, there is no standard method for measuring this important factor. Work presented here represents an essential component of an ongoing project that seeks to determine the appropriate method for calibrating (standardizing) mammography image data to account for the x-ray image acquisition influences. Longer term goals of this project are to make accurate breast density measurements in support of risk studies.

Methods

Logarithmic response calibration curves and effective x-ray attenuation coefficients were measured from two full field digital mammography (FFDM) systems with breast tissue equivalent phantom imaging and compared. Normalization methods were studied to assess the possibility of reducing the amount of calibration data collection. The percent glandular calibration map functional form was investigated. Spatial variations in the calibration data were used to assess the uncertainty in the calibration application by applying error propagation analyses.

Results

Logarithmic response curves are well approximated as linear. Measured effective x-ray attenuation coefficients are characteristic quantities independent of the imaging system and are in agreement with those predicted numerically. Calibration data collection can be reduced by applying a simple normalization technique. The calibration map is well approximated as linear. Intrasystem calibration variation was on the order of four percent, which was approximately half of the intersystem variation.

Conclusion

FFDM systems provide a quantitative output, and the calibration quantities presented here may be used for data acquired on similar FFDM systems.  相似文献   

17.
18.
Immune defense is temperature dependent in cold‐blooded vertebrates (CBVs) and thus directly impacted by global warming. We examined whether immunity and within‐host infectious disease progression are altered in CBVs under realistic climate warming in a seasonal mid‐latitude setting. Going further, we also examined how large thermal effects are in relation to the effects of other environmental variation in such a setting (critical to our ability to project infectious disease dynamics from thermal relationships alone). We employed the three‐spined stickleback and three ecologically relevant parasite infections as a “wild” model. To generate a realistic climatic warming scenario we used naturalistic outdoors mesocosms with precise temperature control. We also conducted laboratory experiments to estimate thermal effects on immunity and within‐host infectious disease progression under controlled conditions. As experimental readouts we measured disease progression for the parasites and expression in 14 immune‐associated genes (providing insight into immunophenotypic responses). Our mesocosm experiment demonstrated significant perturbation due to modest warming (+2°C), altering the magnitude and phenology of disease. Our laboratory experiments demonstrated substantial thermal effects. Prevailing thermal effects were more important than lagged thermal effects and disease progression increased or decreased in severity with increasing temperature in an infection‐specific way. Combining laboratory‐determined thermal effects with our mesocosm data, we used inverse modeling to partition seasonal variation in Saprolegnia disease progression into a thermal effect and a latent immunocompetence effect (driven by nonthermal environmental variation and correlating with immune gene expression). The immunocompetence effect was large, accounting for at least as much variation in Saprolegnia disease as the thermal effect. 
This suggests that managers of CBV populations in variable environments may not be able to reliably project infectious disease risk from thermal data alone. Nevertheless, such projections would be improved by primarily considering prevailing thermal effects in the case of within‐host disease and by incorporating validated measures of immunocompetence.

19.
The definition of haplotype blocks of single-nucleotide polymorphisms (SNPs) has been proposed so that the haplotypes can be used as markers in association studies and to efficiently describe human genetic variation. The International Haplotype Map (HapMap) project to construct a comprehensive catalog of haplotypic variation in humans is underway. However, a number of factors have already been shown to influence the definition of blocks, including the population studied and the sample SNP density. Here, we examine the effect that marker selection has on the definition of blocks and the pattern of haplotypes by using comparable but complementary SNP sets and a number of block definition methods in various genomic regions and populations that were provided by the Encyclopedia of DNA Elements (ENCODE) project. We find that the chosen SNP set has a profound effect on the block-covered sequence and block borders, even at high marker densities. Our results question the very concept of discrete haplotype blocks and the possibility of generalizing block findings from the HapMap project. We comparatively apply the block-free tagging-SNP approach and discuss both the haplotype approach and the tagging-SNP approach as means to efficiently catalog genetic variation.  相似文献   
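Both block definitions and tagging-SNP selection commonly rest on a pairwise measure of linkage disequilibrium between SNPs. A minimal sketch of one widely used measure, r², computed from phased two-SNP haplotypes, is given here. The function name `r_squared`, the 0/1 allele coding and the toy haplotypes are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence, Tuple


def r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Pairwise LD (r^2) between two biallelic SNPs.

    Each haplotype is a pair (a, b) with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    # Allele frequencies of the '1' allele at each SNP
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of the haplotype carrying '1' at both SNPs
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation from independence between the two SNPs
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Hypothetical toy data: two SNPs in complete LD, then two independent SNPs.
linked = [(0, 0), (1, 1), (0, 0), (1, 1)]
unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(r_squared(linked), r_squared(unlinked))
```

Because r² is computed only between the SNPs actually genotyped, any block or tag set derived from it depends on which markers were chosen, which is the sensitivity the abstract examines.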

20.
A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.  相似文献   
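The separation of technical and biological variation described in this abstract is usually expressed through the negative binomial mean-variance relationship, in which the dispersion is the squared biological coefficient of variation. A minimal sketch of that relationship is given here. It does not reproduce edgeR's estimation procedure, and the function names and numbers are illustrative assumptions.

```python
import math


def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial count with the given mean.

    The first term is Poisson-like technical variation; the second term
    grows with the square of the mean and reflects biological variation.
    """
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2


def biological_cv(dispersion: float) -> float:
    """Biological coefficient of variation implied by a dispersion value."""
    if dispersion < 0:
        raise ValueError("dispersion must be non-negative")
    return math.sqrt(dispersion)


# Hypothetical gene: expected count of 100 reads, dispersion of 0.04.
print(nb_variance(100, 0.04), biological_cv(0.04))
```

For highly expressed genes the quadratic term dominates, so a gene-specific dispersion largely determines how much replicate-to-replicate variation is expected.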
