Similar articles
 20 similar articles found (search time: 46 ms)
1.

Background

A number of methods are available to scan a genome for selection signatures by evaluating patterns of diversity within and between breeds. Among these, “extended haplotype homozygosity” (EHH) is a reliable approach to detect genome regions under recent selective pressure. The objective of this study was to use this approach to identify regions that are under recent positive selection and shared by the most representative Italian dairy and beef cattle breeds.

Results

A total of 3220 animals from Italian Holstein (2179), Italian Brown (775), Simmental (493), Marchigiana (485) and Piedmontese (379) breeds were genotyped with the Illumina BovineSNP50 BeadChip v.1. After standard quality control procedures, genotypes were phased and core haplotypes were identified. The decay of linkage disequilibrium (LD) for each core haplotype was assessed by measuring the EHH. Since accurate estimates of local recombination rates were not available, relative EHH (rEHH) was calculated for each core haplotype. Genomic regions that carried frequent core haplotypes with significant rEHH values were considered as candidates for recent positive selection. Candidate regions were aligned across breeds to identify signals shared by dairy or beef cattle breeds. Overall, 82 and 87 common regions were detected among dairy and beef cattle breeds, respectively. Bioinformatic analysis identified 244 and 232 genes in these common genomic regions. Gene annotation and pathway analysis showed that these genes are involved in molecular functions that are biologically related to milk or meat production.
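The EHH and rEHH statistics mentioned in this abstract can be illustrated with a minimal sketch. It assumes phased haplotypes stored as equal-length strings of marker alleles and uses the pairwise-identity definition of EHH; the function names, the half-open `core_slice` convention and the toy data are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter, defaultdict
from math import comb


def _span(core_slice, extend_to):
    """Marker range covering the core plus the extension marker."""
    start, stop = core_slice
    return min(start, extend_to), max(stop, extend_to + 1)


def ehh(haplotypes, core_slice, core, extend_to):
    """EHH of one core haplotype, extended out to marker index `extend_to`.

    EHH is the probability that two randomly chosen chromosomes carrying
    the core haplotype are identical over the whole extended region.
    """
    start, stop = core_slice
    lo, hi = _span(core_slice, extend_to)
    carriers = [h[lo:hi] for h in haplotypes if h[start:stop] == core]
    if len(carriers) < 2:
        return 0.0
    counts = Counter(carriers)
    return sum(comb(n, 2) for n in counts.values()) / comb(len(carriers), 2)


def relative_ehh(haplotypes, core_slice, core, extend_to):
    """rEHH: EHH of `core` divided by the pooled EHH of all other cores.

    Returns None when the pooled EHH of the other cores is zero or undefined.
    """
    start, stop = core_slice
    lo, hi = _span(core_slice, extend_to)
    others = defaultdict(list)
    for h in haplotypes:
        if h[start:stop] != core:
            others[h[start:stop]].append(h[lo:hi])
    num = sum(comb(n, 2) for ext in others.values() for n in Counter(ext).values())
    den = sum(comb(len(ext), 2) for ext in others.values())
    if num == 0 or den == 0:
        return None
    return ehh(haplotypes, core_slice, core, extend_to) / (num / den)


# Hypothetical toy data: six phased chromosomes, four markers each.
toy = ["0110", "0110", "0111", "0100", "1010", "1011"]
print(ehh(toy, (0, 2), "01", 2), relative_ehh(toy, (0, 2), "01", 2))
```

Dividing by the pooled EHH of the other cores at the same locus is what lets the comparison proceed without a local recombination map, which is the reason the abstract gives for using rEHH.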

Conclusions

Our results suggest that a multi-breed approach can lead to the identification of genomic signatures in breeds of cattle that are selected for the same production goal and thus to the localisation of genomic regions of interest in dairy and beef production.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0113-9) contains supplementary material, which is available to authorized users.

2.

Background

Cattle populations are characterized by regular outbreaks of genetic defects as a result of the extensive use of elite sires. The causative genes and mutations can nowadays be rapidly identified by means of genome-wide association studies combined with next generation DNA sequencing, provided that the causative mutations are conventional loss-of-function variants. We show in this work how the combined use of next generation DNA and RNA sequencing allows for the rapid identification of splice-site variants that are otherwise difficult to identify.

Results

We report the use of haplotype-based association mapping to identify a locus on bovine chromosome 10 that underlies autosomal recessive arthrogryposis in Belgian Blue cattle. We identify 31 candidate mutations by resequencing the genome of four cases and 15 controls at ~10-fold depth. By analyzing RNA-Seq data from a carrier fetus, we observe skipping of the second exon of the PIGH gene, which we confirm by RT-PCR to be fully penetrant in tissues from affected calves. Among the 31 candidate variants, we identify a C-to-G transversion in the first intron of the PIGH gene (c.211-10C>G) that is predicted to affect its acceptor splice-site. The resulting PIGH protein is likely to be non-functional as it lacks essential domains, and hence to cause arthrogryposis.

Conclusions

This work illustrates how the growing arsenal of genome exploration tools continues to accelerate the identification of an even broader range of disease causing mutations, therefore improving the management and control of genetic defects in livestock.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1528-y) contains supplementary material, which is available to authorized users.

3.

Background

It has been an abiding belief among geneticists that multicellular organisms’ genomes can be analyzed under the assumption that a single individual has a uniform genome in all its cells. Despite some evidence to the contrary, this belief has been used as an axiomatic assumption in most genome analysis software packages. In this paper we present observations in human whole genome data, human whole exome data and in mouse whole genome data to challenge this assumption. We show that heterogeneity is in fact ubiquitous and readily observable in ordinary Next Generation Sequencing (NGS) data.

Results

Starting with the assumption that a single NGS read (or read pair) must come from one haplotype, we built a procedure for directly observing haplotypes at a local level by examining 2 or 3 adjacent single nucleotide polymorphisms (SNPs) which are close enough on the genome to be spanned by individual reads. We applied this procedure to NGS data from three different sources: whole genome data from a Central European trio from the 1000 Genomes Project, whole genome data from laboratory-bred strains of mouse, and whole exome data from a set of patients with head and neck tumors. Thousands of loci were found in each genome where reads spanning 2 or 3 SNPs displayed more than two haplotypes, indicating that the locus is heterogeneous. We show that such loci are ubiquitous in the genome and cannot be explained by segmental duplications. We explain them on the basis of cellular heterogeneity at the genomic level. Such heterogeneous loci were found in all normal and tumor genomes examined.
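The read-level haplotype procedure described in this abstract can be sketched as follows. This is a simplified illustration that assumes each read has already been reduced to a mapping from SNP position to observed base; the `min_support` filter against sequencing errors, the function names and the example reads are assumptions added here, not details given in the abstract.

```python
from collections import Counter


def local_haplotypes(reads, snp_positions, min_support=2):
    """Count haplotypes seen on reads that span every SNP in `snp_positions`.

    `reads` is a list of dicts mapping SNP position -> observed base.
    Haplotypes supported by fewer than `min_support` reads are dropped
    (an assumed, simplistic guard against sequencing errors).
    """
    counts = Counter(
        tuple(read[p] for p in snp_positions)
        for read in reads
        if all(p in read for p in snp_positions)
    )
    return {hap: n for hap, n in counts.items() if n >= min_support}


def is_heterogeneous(reads, snp_positions, min_support=2):
    """A diploid, uniform genome allows at most two haplotypes per locus."""
    return len(local_haplotypes(reads, snp_positions, min_support)) > 2


# Hypothetical reads covering two SNPs at positions 100 and 130.
reads = [
    {100: "A", 130: "C"}, {100: "A", 130: "C"},
    {100: "G", 130: "T"}, {100: "G", 130: "T"},
    {100: "A", 130: "T"}, {100: "A", 130: "T"},
    {100: "G"},  # does not span both SNPs, so it is ignored
]
print(local_haplotypes(reads, (100, 130)))
print(is_heterogeneous(reads, (100, 130)))
```

Because each read comes from a single molecule, the phase between the SNPs is observed directly rather than inferred, which is why a third well-supported haplotype is informative.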

Conclusions

Our results highlight the need for new methods to analyze genomic variation because existing ones do not systematically consider local haplotypes. Identification of cancer somatic mutations is complicated because of tumor heterogeneity. It is further complicated if, as we show, normal tissues are also heterogeneous. Methods for biomarker discovery must consider contextual haplotype information rather than just whether a variant “is present”.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-418) contains supplementary material, which is available to authorized users.

4.

Background

Belgian Blue cattle are famous for their exceptional muscular development or “double-muscling”. This defining feature emerged following the fixation of a loss-of-function variant in the myostatin gene in the eighties. Since then, sustained selection has further increased muscle mass of Belgian Blue animals to a comparable extent. In the present paper, we study the genetic determinants of this second wave of muscle growth.

Results

A scan for selective sweeps did not reveal the recent fixation of another allele with major effect on muscularity. However, a genome-wide association study identified two genome-wide significant and three suggestive quantitative trait loci (QTL) affecting specific muscle groups and jointly explaining 8-21% of the heritability. The top two QTL are caused by presumably recent mutations on unique haplotypes that have rapidly risen in frequency in the population. While one appears on its way to fixation, the ascent of the other is compromised as the likely underlying MRC2 mutation causes crooked tail syndrome in homozygotes. Genomic prediction models indicate that the residual additive variance is largely polygenic.

Conclusions

Contrary to complex traits in humans, which have a near-exclusive polygenic architecture, muscle mass in beef cattle (like other production traits under directional selection) appears to be controlled by (i) a handful of recent mutations with large effect that rapidly sweep through the population, and (ii) a large number of presumably older variants with very small effects that rise slowly in the population (polygenic adaptation).

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-796) contains supplementary material, which is available to authorized users.

5.

Background

Mutation(s) in proteins are a natural byproduct of evolution but can also cause serious diseases. Aminoacyl-tRNA synthetases (aaRSs) are indispensable components of all cellular protein translational machineries, and in humans they drive translation in both cytoplasm and mitochondria. Mutations in aaRSs have been implicated in a plethora of diseases including neurological conditions, metabolic disorders and cancer.

Results

We have developed an algorithmic approach for genome-wide analyses of sequence substitutions that combines evolutionary, structural and functional information. This pipeline enabled us to super-annotate human aaRS mutations and analyze their linkage to health disorders. Our data suggest that in some but not all cases, aaRS mutations occur in functional and structural sectors where they can manifest their pathological effects by altering enzyme activity or causing structural instability. Further, mutations appear in both solvent-exposed and buried regions of aaRSs, indicating that these alterations could lead to dysfunctional enzymes and abnormal protein translation by affecting inter-molecular interactions or by disrupting non-bonded interactions. Overall, the prevalence of mutations is much higher in mitochondrial aaRSs, and the two most often mutated aaRSs are mitochondrial glutamyl-tRNA synthetase and dual-localized glycyl-tRNA synthetase. Out of 63 mutations annotated in this work, only 12 (~20%) were observed in regions that could directly affect aminoacylation activity via binding to ATP/amino acid or tRNA, or by involvement in dimerization. Mutations in structural cores or at potential biomolecular interfaces account for ~55% of mutations, while the remaining mutations (~25%) are structurally un-annotated.

Conclusion

This work provides a comprehensive structural framework within which most defective human aaRSs have been structurally analyzed. The methodology described here could be employed to annotate mutations in other protein families in a high-throughput manner.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1063) contains supplementary material, which is available to authorized users.

6.

Background

Targeting Induced Local Lesions in Genomes (TILLING) is a powerful reverse genetics approach for functional genomics studies. We used high-throughput sequencing, combined with a two-dimensional pooling strategy, with either minimum read percentage with non-reference nucleotide or minimum variance multiplier as mutation prediction parameters, to detect genes related to abiotic and biotic stress resistances. In peanut, lipoxygenase genes were reported to be highly induced in mature seeds infected with Aspergillus spp., indicating their importance in plant-fungus interactions. Recent studies showed that phospholipase D (PLD) expression was elevated more quickly in drought sensitive lines than in drought tolerant lines of peanut. A newly discovered lipoxygenase (LOX) gene in peanut, along with two peanut PLD genes from previous publications were selected for TILLING. Additionally, two major allergen genes Ara h 1 and Ara h 2, and fatty acid desaturase AhFAD2, a gene which controls the ratio of oleic to linoleic acid in the seed, were also used in our study. The objectives of this research were to develop a suitable TILLING by sequencing method for this allotetraploid, and use this method to identify mutations induced in stress related genes.

Results

We screened a peanut root cDNA library and identified three candidate LOX genes. The gene AhLOX7 was selected for TILLING due to its high expression in seeds and roots. By screening 768 M2 lines from the TILLING population, four missense mutations were identified for AhLOX7, three missense mutations were identified for AhPLD, one missense and two silent mutations were identified for Ara h 1.01, three silent and five missense mutations were identified for Ara h 1.02, one missense mutation was identified for AhFAD2B, and one silent mutation was identified for Ara h 2.02. The overall mutation frequency was 1 SNP/1,066 kb. The SNP detection frequency for single copy genes was 1 SNP/344 kb and 1 SNP/3,028 kb for multiple copy genes.

Conclusions

Our TILLING by sequencing approach is efficient at identifying mutations in single and multi-copy genes. The mutations identified in our study can be used to further study gene function and are potentially useful in breeding programs.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1348-0) contains supplementary material, which is available to authorized users.

7.

Background

Cattle breeding populations are susceptible to the propagation of recessive diseases. Individual sires generate tens of thousands of progeny via artificial insemination. The frequency of deleterious alleles carried by such sires may increase considerably within a few generations. Deleterious alleles often manifest themselves as missing homozygosity resulting from embryonic/fetal, perinatal or juvenile lethality of homozygotes.

Results

A scan for homozygous haplotype deficiency in 25,544 Fleckvieh cattle uncovered four haplotypes affecting reproductive and rearing success. Exploiting whole-genome resequencing data from 263 animals made it possible to pinpoint putatively causal mutations in two of these haplotypes. A mutation causing an evolutionarily unlikely substitution in SUGT1 was perfectly associated with a haplotype compromising insemination success. The mutation was not found in the homozygous state in 10,363 animals (P = 1.79 × 10⁻⁵) and is thus likely to cause lethality of homozygous embryos. A frameshift mutation in SLC2A2, encoding glucose transporter 2 (GLUT2), compromises calf survival. The mutation leads to premature termination of translation and activates cryptic splice sites, resulting in multiple exon variants that also terminate translation prematurely. The affected calves exhibit stunted growth, resembling the phenotypic appearance of Fanconi-Bickel syndrome in humans (OMIM 227810), which is also caused by mutations in SLC2A2.
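The logic behind a test for missing homozygotes can be sketched in a few lines. This is a deliberately simplified illustration that assumes random mating, so that a haplotype at frequency q is expected to be homozygous with probability q², and a binomial model for the number of homozygotes. Real scans may derive expectations from pedigree or carrier-mating information instead; the function names and the example numbers are hypothetical and are not taken from the study.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space for large n."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1.0 - p)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def homozygote_deficiency_p(n_animals, hap_freq, observed_homozygotes):
    """Probability of seeing this few homozygotes under random mating.

    Assumes Hardy-Weinberg proportions: P(homozygous) = hap_freq ** 2.
    """
    return binom_cdf(observed_homozygotes, n_animals, hap_freq ** 2)


# Hypothetical example: 1,000 genotyped animals, haplotype frequency 5%,
# so about 2.5 homozygotes are expected, but none are observed.
print(homozygote_deficiency_p(1000, 0.05, 0))
```

A small probability indicates that the absence of homozygous animals is unlikely to be a sampling accident, which is the signal such scans look for.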

Conclusions

Exploiting comprehensive genotype and sequence data enabled us to reveal two deleterious alleles in SLC2A2 and SUGT1 that compromise pre- and postnatal survival in homozygous state. Our results provide the basis for genome-assisted approaches to avoiding inadvertent carrier matings and to improving reproductive and rearing success in Fleckvieh cattle.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1483-7) contains supplementary material, which is available to authorized users.

8.

Background

The epidermis forms a critical barrier that is maintained by orchestrated programs of proliferation, differentiation, and cell death. Gene mutations that disturb this turnover process may cause skin diseases. Human GASDERMIN A (GSDMA) is frequently silenced in gastric cancer cell lines and its overexpression has been reported to induce apoptosis. GSDMA has also been linked with airway hyperresponsiveness in genetic association studies. The function of GSDMA in the skin was deduced from dominant mutations in mouse gasdermin A3 (Gsdma3), which caused skin inflammation and hair loss. However, the mechanism underlying the autosomal dominance of Gsdma3 mutations and the mode of Gsdma3's action remain unknown.

Results

We demonstrated a novel function of Gsdma3 in modulating mitochondrial oxidative stress. We showed that Gsdma3 is regulated by intramolecular fold-back inhibition, which is disrupted by dominant mutations in the C-terminal domain. The unmasked N-terminal domain of Gsdma3 associates with Hsp90 and is delivered to mitochondria via the mitochondrial import receptor Tom70, where it interacts with the mitochondrial chaperone Trap1 and causes increased production of mitochondrial reactive oxygen species (ROS), dissipation of mitochondrial membrane potential, and mitochondrial permeability transition (MPT). Overexpression of the C-terminal domain of Gsdma3, as well as pharmacological inhibition of mitochondrial translocation, ROS production, and MPT pore opening, alleviates the cell death induced by Gsdma3 mutants.

Conclusions

Our results indicate that the genetic mutations in the C-terminal domain of Gsdma3 are gain-of-function mutations which unmask the N-terminal functional domain of Gsdma3. Gsdma3 regulates mitochondrial oxidative stress through mitochondrial targeting. Since mitochondrial ROS has been shown to promote epidermal differentiation, we hypothesize that Gsdma3 regulates context-dependent response of keratinocytes to differentiation and cell death signals by impinging on mitochondria.

Electronic supplementary material

The online version of this article (doi:10.1186/s12929-015-0152-0) contains supplementary material, which is available to authorized users.

9.

Background

Analysis of data from multiple sources has the potential to enhance knowledge discovery by capturing underlying structures, which are, otherwise, difficult to extract. Fusing data from multiple sources has already proved useful in many applications in social network analysis, signal processing and bioinformatics. However, data fusion is challenging since data from multiple sources are often (i) heterogeneous (i.e., in the form of higher-order tensors and matrices), (ii) incomplete, and (iii) have both shared and unshared components. In order to address these challenges, in this paper, we introduce a novel unsupervised data fusion model based on joint factorization of matrices and higher-order tensors.

Results

While the traditional formulation of coupled matrix and tensor factorizations modeling only shared factors fails to capture the underlying structures in the presence of both shared and unshared factors, the proposed data fusion model has the potential to automatically reveal shared and unshared components through modeling constraints. Using numerical experiments, we demonstrate the effectiveness of the proposed approach in terms of identifying shared and unshared components. Furthermore, we measure a set of mixtures with known chemical composition using both LC-MS (Liquid Chromatography - Mass Spectrometry) and NMR (Nuclear Magnetic Resonance) and demonstrate that the structure-revealing data fusion model can (i) successfully capture the chemicals in the mixtures and extract the relative concentrations of the chemicals accurately, (ii) provide promising results in terms of identifying shared and unshared chemicals, and (iii) reveal the relevant patterns in LC-MS by coupling with the diffusion NMR data.

Conclusions

We have proposed a structure-revealing data fusion model that can jointly analyze heterogeneous, incomplete data sets with shared and unshared components and demonstrated its promising performance as well as potential limitations on both simulated and real data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-239) contains supplementary material, which is available to authorized users.

10.

Background

Idiopathic epilepsy is a common neurological disease in humans and domestic dogs, but relatively few risk genes have been identified to date. The seizure characteristics, including focal and generalised seizures, are similar between the two species, with gene discovery facilitated by the reduced genetic heterogeneity of purebred dogs. We have recently identified a risk locus for idiopathic epilepsy in the Belgian Shepherd breed in a 4.4 megabase region on CFA37.

Results

We have expanded a previous study, replicating the association with a combined analysis of 157 cases and 179 controls in three additional breeds: Schipperke, Finnish Spitz and Beagle (p_c = 2.9 × 10⁻⁷, p_GWAS = 1.74 × 10⁻²). A targeted resequencing of the 4.4 megabase region in twelve Belgian Shepherd cases and twelve controls with opposite haplotypes identified 37 case-specific variants within the ADAM23 gene. Twenty-seven variants were validated in 285 cases and 355 controls from four breeds, resulting in a strong replication of the ADAM23 locus (p_raw = 2.76 × 10⁻¹⁵) and the identification of a common 28 kb risk haplotype in all four breeds. The risk haplotype was present at frequencies of 0.49–0.7 in the breeds, suggesting that ADAM23 is a low penetrance risk gene for canine epilepsy.

Conclusions

These results implicate ADAM23 in common canine idiopathic epilepsy, although the causative variant remains to be identified. ADAM23 plays a role in synaptic transmission and interacts with the known epilepsy genes LGI1 and LGI2, and should be considered as a candidate gene for human epilepsies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1651-9) contains supplementary material, which is available to authorized users.

11.
12.

Background

Dominance effects may play an important role in the genetic variation of complex traits. Full-featured and easy-to-use computing tools for genomic prediction and variance component estimation of additive and dominance effects using genome-wide single nucleotide polymorphism (SNP) markers are necessary to understand the contribution of dominance to a complex trait and to utilize dominance for selecting individuals with favorable genetic potential.

Results

The GVCBLUP package is a shared-memory parallel computing tool for genomic prediction and variance component estimation of additive and dominance effects using genome-wide SNP markers. This package currently has three main programs (GREML_CE, GREML_QM, and GCORRMX) and a graphical user interface (GUI) that integrates the three main programs with an existing program for the graphical viewing of SNP additive and dominance effects (GVCeasy). The GREML_CE and GREML_QM programs offer complementary computing advantages with identical results for genomic prediction of breeding values, dominance deviations and genotypic values, and for genomic estimation of additive and dominance variances and heritabilities using a combination of the expectation-maximization (EM) algorithm and the average information restricted maximum likelihood (AI-REML) algorithm. GREML_CE is designed for large numbers of SNP markers and GREML_QM for large numbers of individuals. Test results showed that GREML_CE could analyze 50,000 individuals with 400K SNP markers and GREML_QM could analyze 100,000 individuals with 50K SNP markers. GCORRMX calculates genomic additive and dominance relationship matrices using SNP markers. GVCeasy is the GUI for GVCBLUP, integrated with an existing software tool for the graphical viewing of SNP effects and a function for editing the parameter files for the three main programs.
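To make the idea of a genomic relationship matrix concrete, the sketch below computes an additive matrix from SNP genotypes coded as 0/1/2 allele counts. It uses one widely used definition (centring by twice the allele frequency and scaling by 2Σp(1−p)); this is an assumption chosen for illustration, and the definitions implemented in GCORRMX may differ. The dominance matrix is not shown, and the function name and toy genotypes are hypothetical.

```python
def additive_grm(genotypes):
    """Additive genomic relationship matrix G = Z Z' / (2 * sum p(1 - p)).

    `genotypes` is a list of individuals, each a list of allele counts
    (0, 1 or 2) per SNP. Allele frequencies are estimated from the data.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    p = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    # Centre each genotype by its expected value 2p.
    z = [[ind[j] - 2 * p[j] for j in range(m)] for ind in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


# Hypothetical toy data: four individuals, two SNPs.
toy_genotypes = [[0, 1], [2, 1], [1, 2], [1, 0]]
for row in additive_grm(toy_genotypes):
    print(row)
```

The matrix is n × n regardless of the number of markers, which helps explain why tools of this kind are often split into variants suited to many markers and variants suited to many individuals.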

Conclusion

The GVCBLUP package is a powerful and versatile computing tool for assessing the type and magnitude of genetic effects affecting a phenotype by estimating whole-genome additive and dominance heritabilities, for genomic prediction of breeding values, dominance deviations and genotypic values, for calculating genomic relationships, and for research and education in genomic prediction and estimation.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-270) contains supplementary material, which is available to authorized users.

13.

Background

Driver mutations are positively selected during the evolution of cancers. The relative frequency of a particular mutation within a gene is typically used as a criterion for identifying a driver mutation. However, driver mutations may occur with relative infrequency at a particular site, but cluster within a region of the gene. When analyzing across different cancers, particular mutation sites or mutations within a particular region of the gene may be of relatively low frequency in some cancers, but still provide selective growth advantage.

Results

This paper presents a method that allows rapid and easy visualization of mutation data sets and identification of potential gene mutation hotspot sites and/or regions. As an example, we identified hotspot regions in the NFE2L2 gene that are potentially functionally relevant in endometrial cancer, but would be missed using other analyses.

Conclusions

HotSpotter is a quick, easy-to-use visualization tool that delivers gene identities with associated mutation locations and frequencies overlaid upon a large cancer mutation reference set. This allows the user to identify potential driver mutations that are less frequent in a cancer or are localized in a hotspot region of relatively infrequent mutations.
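The notion of a hotspot region, in which individual sites are rarely mutated but mutations cluster together, can be illustrated with a generic sliding-window count. This sketch is not HotSpotter's algorithm, which the abstract does not describe; the window width, the count threshold, the function name and the example positions are all assumptions made for illustration.

```python
from collections import Counter


def hotspot_windows(positions, window=10, min_count=3):
    """Find windows of `window` residues holding at least `min_count` mutations.

    `positions` lists the residue position of every observed mutation
    (repeats allowed). Windows are anchored at each mutated site and
    returned as (start, end, count) tuples with inclusive coordinates.
    """
    counts = Counter(positions)
    sites = sorted(counts)
    hits = []
    for start in sites:
        total = sum(counts[s] for s in sites if start <= s < start + window)
        if total >= min_count:
            hits.append((start, start + window - 1, total))
    return hits


# Hypothetical mutation positions within one gene.
observed = [29, 29, 30, 34, 77, 80, 150]
print(hotspot_windows(observed))
```

In this toy input no single site carries more than two mutations, yet the window starting at residue 29 holds four, which is the kind of regional signal a per-site frequency criterion would miss.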

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1044) contains supplementary material, which is available to authorized users.

14.
15.
The in vivo validation of cancer mutations and genes identified in cancer genomics is resource-intensive because of the low throughput of animal experiments. We describe a mouse model that allows multiple cancer mutations to be validated in each animal line. Animal lines are generated with multiple candidate cancer mutations using transposons. The candidate cancer genes are tagged and randomly expressed in somatic cells, allowing easy identification of the cancer genes involved in the generated tumours. This system presents a useful, generalised and efficient means for animal validation of cancer genes.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0455-6) contains supplementary material, which is available to authorized users.

16.

Background

Expression quantitative trait loci (eQTL) play an important role in the regulation of gene expression. Gene expression levels and eQTLs are expected to vary from tissue to tissue, and therefore multi-tissue analyses are necessary to fully understand complex genetic conditions in humans. Dura mater tissue likely interacts with cranial bone growth and thus may play a role in the etiology of Chiari Type I Malformation (CMI) and related conditions, but it is often inaccessible and its gene expression has not been well studied. A genetic basis to CMI has been established; however, the specific genetic risk factors are not well characterized.

Results

We present an assessment of eQTLs for whole blood and dura mater tissue from individuals with CMI. A joint-tissue analysis identified 239 eQTLs in either dura or blood, with 79% of these eQTLs shared by both tissues. Several identified eQTLs were novel and these implicate genes involved in bone development (IPO8, XYLT1, and PRKAR1A), and ribosomal pathways related to marrow and bone dysfunction, as potential candidates in the development of CMI.

Conclusions

Despite strong overall heterogeneity in expression levels between blood and dura, the majority of cis-eQTLs are shared by both tissues. The power to detect shared eQTLs was improved by using an integrative statistical approach. The identified tissue-specific and shared eQTLs provide new insight into the genetic basis for CMI and related conditions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-014-1211-8) contains supplementary material, which is available to authorized users.

17.
18.
19.
20.

Background

Molecular mechanisms associated with frequent relapse of diffuse large B-cell lymphoma (DLBCL) are poorly defined. It is especially unclear how primary tumor clonal heterogeneity contributes to relapse. Here, we explore unique features of B-cell lymphomas - VDJ recombination and somatic hypermutation - to address this question.

Results

We performed high-throughput sequencing of rearranged VDJ junctions in 14 pairs of matched diagnosis-relapse tumors, among which 7 pairs were further characterized by exome sequencing. We identify two distinctive modes of clonal evolution of DLBCL relapse: an early-divergent mode in which clonally related diagnosis and relapse tumors diverged early and developed in parallel; and a late-divergent mode in which relapse tumors developed directly from diagnosis tumors with minor divergence. By examining mutation patterns in the context of phylogenetic information provided by VDJ junctions, we identified mutations in epigenetic modifiers such as KMT2D as potential early driving events in lymphomagenesis and immune escape alterations as relapse-associated events.

Conclusions

Altogether, our study for the first time provides important evidence that DLBCL relapse may result from multiple, distinct tumor evolutionary mechanisms, providing rationale for therapies for each mechanism. Moreover, this study highlights the urgent need to understand the driving roles of epigenetic modifier mutations in lymphomagenesis, and immune surveillance factor genetic lesions in relapse.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0432-0) contains supplementary material, which is available to authorized users.
