Similar articles
 20 similar articles found (search time: 15 ms)
1.

Background

Population differentiation is the result of demographic and evolutionary forces. Whole genome datasets from the 1000 Genomes Project (October 2012) provide an unbiased view of genetic variation across populations from Europe, Asia, Africa and the Americas. Common population-specific SNPs (MAF > 0.05) reflect a deep history and may have important consequences for health and wellbeing. Their interpretation is contextualised by currently available genome data.

Results

The identification of common population-specific (CPS) variants (SNPs and SSV) is influenced by admixture and the sample size under investigation. Nine of the populations in the 1000 Genomes Project (2 African, 2 Asian (including a merged Chinese group) and 5 European) revealed that the African populations (LWK and YRI), followed by the Japanese (JPT) have the highest number of CPS SNPs, in concordance with their histories and given the populations studied. Using two methods, sliding 50-SNP and 5-kb windows, the CPS SNPs showed distinct clustering across large genome segments and little overlap of clusters between populations. iHS enrichment score and the population branch statistic (PBS) analyses suggest that selective sweeps are unlikely to account for the clustering and population specificity. Of interest is the association of clusters close to recombination hotspots. Functional analysis of genes associated with the CPS SNPs revealed over-representation of genes in pathways associated with neuronal development, including axonal guidance signalling and CREB signalling in neurones.
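The two window schemes mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of non-overlapping 5-kb bins (rather than a sliding 5-kb window) and the input layout (SNP positions, and a True/False CPS flag per SNP ordered along one chromosome) are assumptions made only for illustration.

```python
from collections import Counter

def count_per_kb_window(cps_positions, window_bp=5000):
    """Count CPS SNPs per non-overlapping window of window_bp bases.

    Simplification (assumption): fixed bins instead of a sliding window.
    Returns {window_index: count}, where window_index = position // window_bp.
    """
    return dict(Counter(pos // window_bp for pos in cps_positions))

def count_per_snp_window(is_cps, window_snps=50):
    """Slide a window of window_snps consecutive SNPs along the chromosome.

    is_cps holds one flag per SNP (ordered by position); the result holds
    the number of CPS SNPs in each window, advancing one SNP at a time.
    """
    n = len(is_cps)
    if n < window_snps:
        return []
    current = sum(is_cps[:window_snps])
    counts = [current]
    for start in range(1, n - window_snps + 1):
        # Add the SNP entering the window, drop the SNP leaving it.
        current += is_cps[start + window_snps - 1] - is_cps[start - 1]
        counts.append(current)
    return counts
```

Windows with unusually high counts under either scheme would be the candidate clusters; how "unusually high" is judged is not specified in the abstract and is left out of the sketch.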

Conclusions

Common population-specific SNPs are non-randomly distributed throughout the genome and are significantly associated with recombination hotspots. Since the variant alleles of most CPS SNPs are the derived allele, they likely arose in the specific population after a split from a common ancestor. Their proximity to genes involved in specific pathways, including neuronal development, suggests evolutionary plasticity of selected genomic regions. Contrary to expectation, selective sweeps did not play a large role in the persistence of population-specific variation. This suggests a stochastic process towards population-specific variation which reflects demographic histories and may have some interesting implications for health and susceptibility to disease.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-437) contains supplementary material, which is available to authorized users.

2.
3.
4.

Background

Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, there have been no studies to assess the genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32× to 48×).

Results

We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found an allele frequency of 31.25% for the H1/H2 inversion polymorphism, higher than the roughly 25% observed in European populations.

Conclusion

This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-963) contains supplementary material, which is available to authorized users.

5.

Background

Dairy cattle breeding objectives are in general similar across countries, but environment and management conditions may vary, giving rise to slightly different selection pressures applied to a given trait. This potentially leads to different selection pressures to loci across the genome that, if large enough, may give rise to differential regions with high levels of homozygosity. The objective of this study was to characterize differences and similarities in the location and frequency of homozygosity related measures of Jersey dairy cows and bulls from the United States (US), Australia (AU) and New Zealand (NZ).

Results

The populations consisted of a subset of genotyped Jersey cows born in US (n = 1047) and AU (n = 886) and Jersey bulls progeny tested from the US (n = 736), AU (n = 306) and NZ (n = 768). Differences and similarities across populations were characterized using a principal component analysis (PCA) and a run of homozygosity (ROH) statistic (ROH45), which counts the frequency of a single nucleotide polymorphism (SNP) being in a ROH of at least 45 SNPs. Regions that exhibited high frequencies of ROH45 and those that had significantly different ROH45 frequencies between populations were investigated for their association with milk yield traits. Within sex, the PCA revealed slight differentiation between the populations, with the greatest occurring between the US and NZ bulls. Regions with high levels of ROH45 for all populations were detected on BTA3 and BTA7 while several other regions differed in ROH45 frequency across populations, the largest number occurring for the US and NZ bull contrast. In addition, multiple regions with different ROH45 frequencies across populations were found to be associated with milk yield traits.
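As a rough illustration of the ROH45 statistic described above, the following Python sketch flags, for each animal, the SNPs that sit inside a run of at least 45 consecutive homozygous calls, and then reports the per-SNP frequency across animals. The genotype coding (0 and 2 homozygous, anything else treated as a run breaker) and the function names are assumptions; the abstract does not say whether heterozygous or missing calls were tolerated inside a run.

```python
def roh_flags(genotypes, min_snps=45):
    """Flag SNPs lying in a run of >= min_snps consecutive homozygous calls.

    Assumed coding: 0 or 2 = homozygous; any other value (1, None, -1)
    ends the current run.
    """
    flags = [False] * len(genotypes)
    run_start = None
    # A trailing heterozygous sentinel closes a run that reaches the end.
    for i, g in enumerate(list(genotypes) + [1]):
        if g in (0, 2):
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_snps:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags

def roh_frequency(animals, min_snps=45):
    """Per-SNP fraction of animals in which the SNP lies inside a qualifying ROH."""
    flags = [roh_flags(g, min_snps) for g in animals]
    return [sum(column) / len(animals) for column in zip(*flags)]
```

Comparing the resulting frequency vectors between populations, SNP by SNP, is one way the between-population contrasts could be set up; the statistical test used in the study is not described in the abstract.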

Conclusion

Multiple regions exhibited differential ROH45 across AU, NZ and US cow and bull populations; one interpretation is that these locations of the genome are undergoing differential directional selection. Two regions on BTA3 and BTA7 had high ROH45 frequencies across all populations and will be investigated further to determine the gene(s) undergoing directional selection.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1352-4) contains supplementary material, which is available to authorized users.

6.

Background

This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types.

Results

The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for “cell cycle” and “ECM (extracellular matrix) organization” Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the “cell cycle” and “ECM” signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified.

Conclusions

Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1403-x) contains supplementary material, which is available to authorized users.

7.

Background

Top-down mass spectrometry plays an important role in intact protein identification and characterization. Top-down mass spectra are more complex than bottom-up mass spectra because they often contain many isotopomer envelopes from highly charged ions, which may overlap with one another. As a result, spectral deconvolution, which converts a complex top-down mass spectrum into a monoisotopic mass list, is a key step in top-down spectral interpretation.

Results

In this paper, we propose a new scoring function, L-score, for evaluating isotopomer envelopes. By combining L-score with MS-Deconv, a new software tool, MS-Deconv+, was developed for top-down spectral deconvolution. Experimental results showed that MS-Deconv+ outperformed existing software tools in top-down spectral deconvolution.

Conclusions

L-score shows high discriminative ability in identification of isotopomer envelopes. Using L-score, MS-Deconv+ reports many correct monoisotopic masses missed by other software tools, which are valuable for proteoform identification and characterization.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1140) contains supplementary material, which is available to authorized users.

8.

Background

Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study.

Results

Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this “gold-standard” comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues.

Conclusions

Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-649) contains supplementary material, which is available to authorized users.

9.
10.

Background

The Tibetan pig is one of the domestic animals indigenous to the Qinghai-Tibet Plateau. Several geographically isolated pig populations are distributed throughout the Plateau. It remains an open question whether these populations have experienced different demographic histories and have evolved independent adaptive loci for the harsh environment of the Plateau. To address these questions, we investigated ~40,000 genetic variants across the pig genome in a broad panel of 678 individuals from 5 Tibetan geographic populations and 34 lowland breeds.

Results

Using a series of population genetic analyses, we show that Tibetan pig populations have marked genetic differentiations. Tibetan pigs appear to be 3 independent populations corresponding to the Tibetan, Gansu and Sichuan & Yunnan locations. Each population is more genetically similar to its geographic neighbors than to any of the other Tibetan populations. By applying a locus-specific branch length test, we identified both population-specific and -shared candidate genes under selection in Tibetan pigs. These genes, such as PLA2G12A, RGCC, C9ORF3, GRIN2B, GRID1 and EPAS1, are involved in high-altitude physiology including angiogenesis, pulmonary hypertension, oxygen intake, defense response and erythropoiesis. A majority of these genes have not been implicated in previous studies of highlanders and high-altitude animals.

Conclusion

Tibetan pig populations have experienced substantial genetic differentiation. Historically, Tibetan pigs likely had admixture with neighboring lowland breeds. During the long history of colonization in the Plateau, Tibetan pigs have developed a complex biological adaptation mechanism that could be different from that of Tibetans and other animals. Different Tibetan pig populations appear to have both distinct and convergent adaptive loci for the harsh environment of the Plateau.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-834) contains supplementary material, which is available to authorized users.

11.

Background

Nelore and Gir are the two most important indicine cattle breeds for production of beef and milk in Brazil. Historical records state that these breeds were introduced in Brazil from the Indian subcontinent, crossed to local taurine cattle in order to quickly increase the population size, and then backcrossed to the original breeds to recover indicine adaptive and productive traits. Previous investigations based on sparse DNA markers detected taurine admixture in these breeds. High-density genome-wide analyses can provide high-resolution information on the genetic composition of current Nelore and Gir populations, estimate more precisely the levels and nature of taurine introgression, and shed light on their history and the strategies that were used to expand these breeds.

Results

We used the high-density Illumina BovineHD BeadChip with more than 777 K single nucleotide polymorphisms (SNPs) that were reduced to 697 115 after quality control filtering to investigate the structure of Nelore and Gir populations and seven other worldwide populations for comparison. Multidimensional scaling and model-based ancestry estimation clearly separated the indicine, European taurine and African taurine ancestries. The average level of taurine introgression in the autosomal genome of Nelore and Gir breeds was less than 1% but was 9% for the Brahman breed. Analyses based on the mitochondrial SNPs present in the Illumina BovineHD BeadChip did not clearly differentiate taurine and indicine haplotype groupings.

Conclusions

The low level of taurine ancestry observed for both Nelore and Gir breeds confirms the historical records of crossbreeding and supports a strong directional selection against taurine haplotypes via backcrossing. Random sampling in production herds across the country and subsequent genotyping would be useful for a more complete view of the admixture levels in the commercial Nelore and Gir populations.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0109-5) contains supplementary material, which is available to authorized users.

12.

Background

While the possible sources underlying the so-called ‘missing heritability’ evident in current genome-wide association studies (GWAS) of complex traits have been actively pursued in recent years, resolving this mystery remains a challenging task. Studying heritability of genome-wide gene expression traits can shed light on the goal of understanding the relationship between phenotype and genotype. Here we used microarray gene expression measurements of lymphoblastoid cell lines and genome-wide SNP genotype data from 210 HapMap individuals to examine the heritability of gene expression traits.

Results

Heritability levels for expression of 10,720 genes were estimated by applying variance component model analyses and 1,043 expression quantitative trait loci (eQTLs) were detected. Our results indicate that gene expression traits display a bimodal distribution of heritability, with one peak close to 0% and the other approaching 100%. Such a pattern of the within-population variability of gene expression heritability is common among different HapMap populations of unrelated individuals but different from that obtained in the CEU and YRI trio samples. Higher heritability levels are shown by housekeeping genes and genes associated with cis eQTLs. Both cis and trans eQTLs make comparable cumulative contributions to the heritability. Finally, we modelled gene-gene interactions (epistasis) for genes with multiple eQTLs and revealed that epistasis was not prevalent in all genes but made a substantial contribution in explaining total heritability for some genes analysed.

Conclusions

We utilised a mixed effect model analysis for estimating genetic components from population based samples. On the basis of analyses of genome-wide gene expression from four HapMap populations, we provided a detailed exploration of the distribution of genetic heritabilities for expression traits from different populations, and highlighted the importance of studying interaction at the gene expression level as an important source of variation underlying missing heritability.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-13) contains supplementary material, which is available to authorized users.

13.

Background

In invertebrates, genes belonging to dynamically regulated functional categories appear to be less methylated than “housekeeping” genes, suggesting that DNA methylation may modulate gene expression plasticity. To date, however, experimental evidence to support this hypothesis across different natural habitats has been lacking.

Results

Gene expression profiles were generated from 30 pairs of genetically identical fragments of coral Acropora millepora reciprocally transplanted between distinct natural habitats for 3 months. Gene expression was analyzed in the context of normalized CpG content, a well-established signature of historical germline DNA methylation. Genes with weak methylation signatures were more likely to demonstrate differential expression based on both transplant environment and population of origin than genes with strong methylation signatures. Moreover, the magnitude of expression differences due to environment and population were greater for genes with weak methylation signatures.
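Normalized CpG content is usually expressed as an observed/expected ratio (CpG O/E). The Python sketch below uses one common form of that ratio, observed CpG count times sequence length divided by the product of the C and G counts. Whether the authors used exactly this variant is not stated in the abstract, so the formula choice and the function name are assumptions.

```python
def cpg_oe(sequence):
    """Observed/expected CpG ratio for one gene sequence.

    Assumed form: (number of CG dinucleotides * length) / (count of C * count of G).
    Returns 0.0 when the sequence contains no C or no G, where the ratio
    is undefined.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)
```

Because methylated cytosines in CpG context are prone to mutate away over evolutionary time, a low ratio is read as a strong historical methylation signature and a ratio near or above 1 as a weak one.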

Conclusions

Our results support a connection between differential germline methylation and gene expression flexibility across environments and populations. Studies of phylogenetically basal invertebrates such as corals will further elucidate the fundamental functional aspects of gene body methylation in Metazoa.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1109) contains supplementary material, which is available to authorized users.

14.

Background

A minor but significant fraction of samples subjected to next-generation sequencing methods are either mixed-up or cross-contaminated. These events can lead to false or inconclusive results. We have therefore developed SASI-Seq; a process whereby a set of uniquely barcoded DNA fragments are added to samples destined for sequencing. From the final sequencing data, one can verify that all the reads derive from the original sample(s) and not from contaminants or other samples.

Results

By adding a mixture of three uniquely barcoded amplicons, of different sizes spanning the range of insert sizes one would normally use for Illumina sequencing, at a spike-in level of approximately 0.1%, we demonstrate that these fragments remain intimately associated with the sample. They can be detected following even the tightest size selection regimes or exome enrichment and can report the occurrence of sample mix-ups and cross-contamination. As a consequence of this work, we have designed a set of 384 eleven-base Illumina barcode sequences that are at least 5 changes apart from each other, allowing for single-error correction and very low levels of barcode misallocation due to sequencing error.
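The barcode property described above can be checked with a short Python sketch. It assumes that "changes" means substitutions (Hamming distance); if insertions and deletions were also counted, an edit distance would be needed instead. The three eleven-base barcodes in the usage note are made up for illustration and are not from the published set.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))

def assign(read, barcodes, max_errors=1):
    """Return the single barcode within max_errors of the read, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_errors]
    return hits[0] if len(hits) == 1 else None
```

For example, with the hypothetical set `["AAAAAAAAAAA", "CCCCCAAAAAA", "AAAAAACCCCC"]`, `min_pairwise_distance` returns 5 and `assign("AAAAAAAAAAT", ...)` recovers the first barcode. With a minimum distance of 5, a read carrying one substitution is still at least 4 changes away from every other barcode, which is why single-error correction leaves a wide margin against misallocation.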

Conclusion

SASI-Seq is a simple, inexpensive and flexible tool that enables sample assurance, allows deconvolution of sample mix-ups and reports levels of cross-contamination between samples throughout NGS workflows.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-110) contains supplementary material, which is available to authorized users.

15.
16.
17.
18.
19.
The cellular composition of heterogeneous samples can be predicted using an expression deconvolution algorithm to decompose their gene expression profiles based on pre-defined, reference gene expression profiles of the constituent populations in these samples. However, the expression profiles of the actual constituent populations are often perturbed from those of the reference profiles due to gene expression changes in cells associated with microenvironmental or developmental effects. Existing deconvolution algorithms do not account for these changes and give incorrect results when benchmarked against those measured by well-established flow cytometry, even after batch correction was applied. We introduce PERT, a new probabilistic expression deconvolution method that detects and accounts for a shared, multiplicative perturbation in the reference profiles when performing expression deconvolution. We applied PERT and three other state-of-the-art expression deconvolution methods to predict cell frequencies within heterogeneous human blood samples that were collected under several conditions (uncultured mono-nucleated and lineage-depleted cells, and culture-derived lineage-depleted cells). Only PERT's predicted proportions of the constituent populations matched those assigned by flow cytometry. Genes associated with cell cycle processes were highly enriched among those with the largest predicted expression changes between the cultured and uncultured conditions. We anticipate that PERT will be widely applicable to expression deconvolution strategies that use profiles from reference populations that vary from the corresponding constituent populations in cellular state but not cellular phenotypic identity.
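For orientation, the baseline reference-based model that PERT builds on can be sketched in Python as a non-negative least-squares fit: the mixed profile is modelled as a weighted sum of reference profiles, and the weights are the cell-type proportions. This sketch is not PERT and does not model the perturbation described above; the projected-gradient solver, the function name and the toy numbers in the usage note are assumptions chosen to keep the example dependency-free.

```python
def deconvolve(mixture, references, iterations=2000):
    """Estimate cell-type proportions for one mixed expression profile.

    mixture:    expression values, one per gene.
    references: one expression profile per cell type (same gene order).
    Minimises the squared error of sum_k p_k * references[k] against the
    mixture under p_k >= 0, then rescales p to sum to 1.
    """
    k = len(references)
    n_genes = len(mixture)
    p = [1.0 / k] * k
    # Safe step size: the sum of squared entries bounds the largest
    # eigenvalue of R^T R, so 1 / that sum cannot overshoot.
    step = 1.0 / sum(x * x for ref in references for x in ref)
    for _ in range(iterations):
        residual = [
            sum(p[j] * references[j][g] for j in range(k)) - mixture[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(residual[g] * references[j][g] for g in range(n_genes))
            for j in range(k)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[j] - step * gradient[j]) for j in range(k)]
    total = sum(p)
    return [x / total for x in p] if total > 0 else p
```

With toy references `[[10, 1, 5], [1, 10, 5]]` and a mixture `[7.3, 3.7, 5.0]` built from proportions 0.7 and 0.3, the function recovers those proportions. When the true profiles are shifted away from the references, as in cultured cells, this baseline would be expected to misestimate proportions, which is the gap the abstract says PERT addresses.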

20.

Background

Understanding the genetic mechanisms that underlie meat quality traits is essential to improve pork quality. To date, most quantitative trait loci (QTL) analyses have been performed on F2 crosses between outbred pig strains and have led to the identification of numerous QTL. However, because linkage disequilibrium is high in such crosses, QTL mapping precision is unsatisfactory and only a few QTL have been found to segregate within outbred strains, which limits their use to improve animal performance. To detect QTL in outbred pig populations of Chinese and Western origins, we performed genome-wide association studies (GWAS) for meat quality traits in Chinese purebred Erhualian pigs and a Western Duroc × (Landrace × Yorkshire) (DLY) commercial population.

Methods

Three hundred and thirty six Chinese Erhualian and 610 DLY pigs were genotyped using the Illumina PorcineSNP60K Beadchip and evaluated for 20 meat quality traits. After quality control, 35 985 and 56 216 single nucleotide polymorphisms (SNPs) were available for the Chinese Erhualian and DLY datasets, respectively, and were used to perform two separate GWAS. We also performed a meta-analysis that combined P-values and effects of 29 516 SNPs that were common to Erhualian, DLY, F2 and Sutai pig populations.

Results

We detected 28 and nine suggestive SNPs that surpassed the significance level for meat quality in Erhualian and DLY pigs, respectively. Among these SNPs, ss131261254 on pig chromosome 4 (SSC4) was the most significant (P = 7.97E-09) and was associated with drip loss in Erhualian pigs. Our results suggested that at least two QTL on SSC12 and on SSC15 may have pleiotropic effects on several related traits. All the QTL that were detected by GWAS were population-specific, including 12 novel regions. However, the meta-analysis revealed seven novel QTL for meat characteristics, which suggests the existence of common underlying variants that may differ in frequency across populations. These QTL regions contain several relevant candidate genes.

Conclusions

These findings provide valuable insights into the molecular basis of convergent evolution of meat quality traits in Chinese and Western breeds that show divergent phenotypes. They may contribute to genetic improvement of purebreds for crossbred performance.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0120-x) contains supplementary material, which is available to authorized users.
