Similar literature
 20 similar articles found
1.
Low initial response to alcohol has been shown to be among the best predictors of development of alcoholism. A similar phenotypic measure, difference in initial sensitivity to ethanol, has been used for the genetic selection of two mouse strains, the Inbred Long-Sleep (ILS) and Inbred Short-Sleep (ISS) mice, and for the subsequent identification of four quantitative trait loci (QTLs) for alcohol sensitivity. We now report the application of high-throughput comparative gene sequencing in the search for genes underlying these four QTLs. To carry out this search, over 1.7 million bases of comparative DNA sequence were generated from 68 candidate genes within the QTL intervals, corresponding to a survey of over 36,000 amino acids. Eight central nervous system genes, located within these QTLs, were identified that contain a total of 36 changes in protein coding sequence. Some of these coding variants are likely to contribute to the phenotypic variation between ILS/ISS animals, including sensitivity to alcohol, providing specific new genetic targets potentially important to the neuronal actions of alcohol.

2.
3.
The Inbred Long- and Short-Sleep (ILS, ISS) mouse lines were selected for differences in acute ethanol sensitivity using the loss of righting response (LORR) as the selection trait. The lines show an over tenfold difference in LORR and, along with a recombinant inbred panel derived from them (the LXS), have been widely used to dissect the genetic underpinnings of acute ethanol sensitivity. Here we have sequenced the genomes of the ILS and ISS to investigate the DNA variants that contribute to their sensitivity difference. We identified ~2.7 million high-confidence SNPs and small indels and ~7000 structural variants between the lines; variants were found to occur in 6382 annotated genes. Using a hidden Markov model, we were able to reconstruct the genome-wide ancestry patterns of the eight inbred progenitor strains from which the ILS and ISS were derived, and found that quantitative trait loci that have been mapped for LORR were slightly enriched for DNA variants. Finally, by mapping and quantifying RNA-seq reads from the ILS and ISS to their strain-specific genomes rather than to the reference genome, we found a substantial improvement in a differential expression analysis between the lines. This work will help in identifying and characterizing the DNA sequence variants that contribute to the difference in ethanol sensitivity between the ILS and ISS and will also aid in accurate quantification of RNA-seq data generated from the LXS RIs.

4.
Copy number variants (CNVs) are pervasive in several animal and plant genomes and contribute to shaping genetic diversity. In barley, there is evidence that changes in gene copy number underlie important agronomic traits. The recently released reference sequence of barley represents a valuable genomic resource for unveiling the incidence of CNVs that affect gene content and for identifying sequence features associated with CNV formation. Using exome sequencing and read count data, we detected 16 605 deletions and duplications that affect barley gene content by surveying a diverse panel of 172 cultivars, 171 landraces, 22 wild relatives and 32 other uncategorized domesticated accessions. A search for segmental duplications (SDs) in the reference sequence revealed many low‐copy repeats, most of which overlap predicted coding sequences. Statistical analyses revealed that the incidence of CNVs increases significantly in SD‐rich regions, indicating that these sequence elements act as hot spots for the formation of CNVs. The present study delivers a comprehensive genome‐wide survey of CNVs affecting barley gene content and implicates SDs in the molecular mechanisms that lead to the formation of this class of CNVs.

5.
Although large-scale copy-number variation is an important contributor to conspecific genomic diversity, whether these variants frequently contribute to human phenotype differences remains unknown. If they have few functional consequences, then copy-number variants (CNVs) might be expected both to be distributed uniformly throughout the human genome and to encode genes that are characteristic of the genome as a whole. We find that human CNVs are significantly overrepresented close to telomeres and centromeres and in simple tandem repeat sequences. Additionally, human CNVs were observed to be unusually enriched in those protein-coding genes that have experienced significantly elevated synonymous and nonsynonymous nucleotide substitution rates, estimated between single human and mouse orthologues. CNV genes encode disproportionately large numbers of secreted, olfactory, and immunity proteins, although they contain fewer than expected genes associated with Mendelian disease. Despite mouse CNVs also exhibiting a significant elevation in synonymous substitution rates, in most other respects they do not differ significantly from the genomic background. Nevertheless, they encode proteins that are depleted in olfactory function, and they exhibit significantly decreased amino acid sequence divergence. Natural selection appears to have acted discriminately among human CNV genes. The significant overabundance, within human CNVs, of genes associated with olfaction, immunity, protein secretion, and elevated coding sequence divergence, indicates that a subset may have been retained in the human population due to the adaptive benefit of increased gene dosage. By contrast, the functional characteristics of mouse CNVs either suggest that advantageous gene copies have been depleted during recent selective breeding of laboratory mouse strains or suggest that they were preferentially fixed as a consequence of the larger effective population size of wild mice. It thus appears that CNV differences among mouse strains do not provide an appropriate model for large-scale sequence variations in the human population.

6.
High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but detecting and characterizing CNV from exome sequencing is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for whole exome sequencing data. The Poisson latent factor model in CODEX includes terms that specifically remove biases due to GC content, exon capture and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data. CODEX is compared to existing methods on a population analysis of HapMap samples from the 1000 Genomes Project, and shown to be more accurate on three microarray-based validation data sets. We further evaluate performance on 222 neuroblastoma samples with matched normals and focus on a well-studied rare somatic CNV within the ATRX gene. We show that the cross-sample normalization procedure of CODEX removes more noise than normalizing the tumor against the matched normal and that the segmentation procedure performs well in detecting CNVs with nested structures.

7.
Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene.
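
The PCA normalization idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not XHMM's actual code: the function name `pca_normalize`, the parameter `n_remove`, and the simulated depth matrix are hypothetical, the input is assumed to be a samples-by-targets matrix of mean read depth, and the HMM calling step is omitted entirely.

```python
import numpy as np

def pca_normalize(depth, n_remove):
    """Remove the strongest principal components from a read-depth matrix.

    depth    : samples x targets array of mean read depth (assumed layout)
    n_remove : number of top components treated as batch/capture noise
               (hypothetical parameter; a real tool would choose it from the data)
    Returns per-sample z-scores of the residual depth.
    """
    depth = np.asarray(depth, dtype=float)
    # Center each target across samples so the SVD captures variation, not the mean.
    centered = depth - depth.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Zero out the largest singular values, i.e. the dominant shared components.
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0
    residual = (u * s_kept) @ vt
    # Standardize each sample (row) so values are comparable across samples.
    row_mean = residual.mean(axis=1, keepdims=True)
    row_std = residual.std(axis=1, keepdims=True)
    return (residual - row_mean) / row_std

# Simulated example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n_samples, n_targets = 20, 200
# One strong batch effect shared by all targets, plus small random noise.
batch = np.outer(rng.normal(0, 5, n_samples), rng.normal(0, 1, n_targets))
depth = 100 + batch + rng.normal(0, 1, (n_samples, n_targets))
# A heterozygous deletion: sample 3 loses half its depth over five targets.
depth[3, 10:15] -= 50

z = pca_normalize(depth, n_remove=1)
print(z[3, 8:17].round(1))  # the five deleted targets stand out as strongly negative
```

In this sketch the batch effect is the largest component and is removed, while the deletion affects one sample and a few targets and so survives as a run of strongly negative z-scores. A caller in the spirit of the abstract would then segment such z-scores with an HMM rather than thresholding them directly.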

8.
It is well established that genomic alterations play an essential role in oncogenesis, disease progression, and response of tumors to therapeutic intervention. Advances in next-generation sequencing (NGS) technologies provide unprecedented capabilities to scan genomes for changes such as mutations, deletions, and alterations of chromosomal copy number. However, the cost of full-genome sequencing still prevents the routine application of NGS in many areas. Capturing and sequencing the coding exons of genes (the "exome") can be a cost-effective approach for identifying changes that result in alteration of protein sequences. We applied an exome-sequencing technology (Roche Nimblegen capture paired with 454 sequencing) to identify sequence variation and mutations in eight commonly used cancer cell lines from a variety of tissue origins (A2780, A549, Colo205, GTL16, NCI-H661, MDA-MB468, PC3, and RD). We showed that this technology can accurately identify sequence variation, providing ~95% concordance with Affymetrix SNP Array 6.0 performed on the same cell lines. Furthermore, we detected 19 of the 21 mutations reported in the Sanger COSMIC database for these cell lines. We identified an average of 2,779 potential novel sequence variations/mutations per cell line, of which 1,904 were non-synonymous. Many non-synonymous changes were identified in kinases and known cancer-related genes. In addition, we confirmed that the read depth of exome sequence data can be used to estimate high-level gene amplifications and identify homologous deletions. In summary, we demonstrate that exome sequencing can be a reliable and cost-effective way of identifying alterations in cancer genomes, and we have generated a comprehensive catalogue of genomic alterations in coding regions of eight cancer cell lines. These findings could provide important insights into cancer pathways and mechanisms of resistance to anti-cancer therapies.

9.
Copy number variations (CNVs) are gains and losses of genomic sequence greater than 50 bp between two individuals of a species. While single nucleotide polymorphisms (SNPs) are more frequent, CNVs impact a higher percentage of genomic sequence and have potentially greater effects, including the changing of gene structure and dosage, altering gene regulation and exposing recessive alleles. In particular, segmental duplications (SDs) were shown to be one of the catalysts and hotspots for CNV formation. Substantial progress has been made in understanding CNVs in mammals, especially in humans and rodents. CNVs have been shown to be important in both normal phenotypic variability and disease susceptibility. Recently, interest in CNV study has extended into domesticated animals, including cattle. Multiple genome-wide cattle CNV studies have been carried out using both microarray and next generation sequencing technologies. Integration of SD and CNV results with SNP and other datasets is beginning to reveal impacts of CNVs on cattle domestication, health, and production traits.

10.
11.
12.
The melanophilin (MLPH) gene has been characterized as the candidate gene for dilute coat color in some species, but little is known about it in the goat. In this study, part of the genomic DNA sequence (19,289 bp) containing the whole coding region of the MLPH gene from goat, as well as from sheep, was determined. We found 16 exons and 15 introns; the coding region was 1767 bp distributed in 15 exons (2–16). In sheep, the length of part of the genomic DNA sequence was 16,988 bp, with 16 exons and 15 introns, and the coding region was 1833 bp, distributed in 15 exons (2–16). Dozens of SNPs as well as some noticeable motifs in the goat MLPH gene were found during the process of sequencing and polymorphism screening. Based on the SSR Tool, three simple sequence repeat motifs were detected in the goat and sheep DNA sequences. Compared with cattle, we found insertions of 4 amino acids in goats and 26 amino acids in sheep.

13.

Background

The domestic pig (Sus scrofa) is both an important livestock species and a model for biomedical research. Exome sequencing has accelerated identification of protein-coding variants underlying phenotypic traits in human and mouse. We aimed to develop and validate a similar resource for the pig.

Results

We developed probe sets to capture pig exonic sequences based upon the current Ensembl pig gene annotation supplemented with mapped expressed sequence tags (ESTs) and demonstrated proof-of-principle capture and sequencing of the pig exome in 96 pigs, encompassing 24 capture experiments. For most of the samples at least 10x sequence coverage was achieved for more than 90% of the target bases. Bioinformatic analysis of the data revealed over 236,000 high confidence predicted SNPs and over 28,000 predicted indels.

Conclusions

We have achieved coverage statistics similar to those seen with commercially available human and mouse exome kits. Exome capture in pigs provides a tool to identify coding region variation associated with production traits, including loss of function mutations which may explain embryonic and neonatal losses, and to improve genomic assemblies in the vicinity of protein coding genes in the pig.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-550) contains supplementary material, which is available to authorized users.

14.
Splicing is a cellular mechanism that dictates eukaryotic gene expression by removing the noncoding introns and ligating the coding exons in the form of a messenger RNA molecule. Alternative splicing (AS) adds a major level of complexity to this mechanism and thus to the regulation of gene expression. This widespread cellular phenomenon generates multiple messenger RNA isoforms from a single gene, by utilizing alternative splice sites and promoting different exon-intron inclusions and exclusions. AS greatly increases the coding potential of eukaryotic genomes and hence contributes to the diversity of eukaryotic proteomes. Mutations that lead to disruptions of either constitutive splicing or AS cause several diseases, among which are myotonic dystrophy and cystic fibrosis. Aberrant splicing is also well established in cancer states. Identification of rare novel mutations associated with splice-site recognition, and splicing regulation in general, could provide further insight into genetic mechanisms of rare diseases. Here, disease relevance of aberrant splicing is reviewed, and the new methodological approach of starting from disease phenotype, employing exome sequencing and identifying rare mutations affecting splicing regulation is described. Exome sequencing has emerged as a reliable method for finding sequence variations associated with various disease states. To date, genetic studies using exome sequencing to find disease-causing mutations have focused on the discovery of nonsynonymous single nucleotide polymorphisms that alter amino acids or introduce early stop codons, or on the use of exome sequencing as a means to genotype known single nucleotide polymorphisms. The involvement of splicing mutations in inherited diseases has received little attention, so such mutations likely occur more frequently than currently estimated. Studies of exome sequencing followed by molecular and bioinformatic analyses have great potential to reveal the high impact of splicing mutations underlying human disease.

15.
16.
Genomic rearrangements involving the peripheral myelin protein gene (PMP22) in human chromosome 17p12 are associated with neuropathy: duplications cause Charcot-Marie-Tooth disease type 1A (CMT1A), whereas deletions lead to hereditary neuropathy with liability to pressure palsies (HNPP). Our previous studies showed that >99% of these rearrangements are recurrent and mediated by nonallelic homologous recombination (NAHR). Rare copy number variations (CNVs) generated by nonrecurrent rearrangements also exist in 17p12, but their underlying mechanisms are not well understood. We investigated 21 subjects with rare CNVs associated with CMT1A or HNPP by oligonucleotide-based comparative genomic hybridization microarrays and breakpoint sequence analyses, and we identified 17 unique CNVs, including two genomic deletions, ten genomic duplications, two complex rearrangements, and three small exonic deletions. Each of these CNVs includes either the entire PMP22 gene, or exon(s) only, or ultraconserved potential regulatory sequences upstream of PMP22, further supporting the contention that PMP22 is the critical gene mediating the neuropathy phenotypes associated with 17p12 rearrangements. Breakpoint sequence analysis reveals that, different from the predominant NAHR mechanism in recurrent rearrangement, various molecular mechanisms, including nonhomologous end joining, Alu-Alu-mediated recombination, and replication-based mechanisms (e.g., FoSTeS and/or MMBIR), can generate nonrecurrent 17p12 rearrangements associated with neuropathy. We document a multitude of ways in which gene function can be altered by CNVs. Given the characteristics, including small size, structural complexity, and location outside of coding regions, of selected rare CNVs, their identification remains a challenge for genome analysis. Rare CNVs may potentially represent an important portion of “missing heritability” for human diseases.

17.
Identifying copy number variants (CNVs) can provide diagnoses to patients and provide important biological insights into human health and disease. Current exome and targeted sequencing approaches cannot detect clinically and biologically-relevant CNVs outside their target area. We present SavvyCNV, a tool which uses off-target read data from exome and targeted sequencing data to call germline CNVs genome-wide. Up to 70% of sequencing reads from exome and targeted sequencing fall outside the targeted regions. We have developed a new tool, SavvyCNV, to exploit this ‘free data’ to call CNVs across the genome. We benchmarked SavvyCNV against five state-of-the-art CNV callers using truth sets generated from genome sequencing data and Multiplex Ligation-dependent Probe Amplification assays. SavvyCNV called CNVs with high precision and recall, outperforming the five other tools at calling CNVs genome-wide, using off-target or on-target reads from targeted panel and exome sequencing. We then applied SavvyCNV to clinical samples sequenced using a targeted panel and were able to call previously undetected clinically-relevant CNVs, highlighting the utility of this tool within the diagnostic setting. SavvyCNV outperforms existing tools for calling CNVs from off-target reads. It can call CNVs genome-wide from targeted panel and exome data, increasing the utility and diagnostic yield of these tests. SavvyCNV is freely available at https://github.com/rdemolgen/SavvySuite.
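
The notion of using off-target reads, as described in this abstract, can be sketched in a few lines of Python. This is only an illustration of the general idea of binning reads that fall outside capture targets; it is not SavvyCNV's implementation, and the function name `off_target_bin_counts`, the interval convention (sorted, non-overlapping, half-open), and the toy coordinates are all assumptions.

```python
from bisect import bisect_right
from collections import Counter

def off_target_bin_counts(read_starts, targets, bin_size):
    """Count off-target reads per fixed-size genomic bin on one chromosome.

    read_starts : iterable of read start positions (integers)
    targets     : sorted, non-overlapping half-open (start, end) capture intervals
    bin_size    : bin width in bases (hypothetical parameter)
    Returns a Counter mapping bin index -> number of off-target reads.
    """
    target_starts = [start for start, _ in targets]
    counts = Counter()
    for pos in read_starts:
        # Find the last target that starts at or before this read.
        i = bisect_right(target_starts, pos) - 1
        if i >= 0 and pos < targets[i][1]:
            continue  # read starts inside a target: on-target, so skip it
        counts[pos // bin_size] += 1
    return counts

# Toy data: two capture targets and eight read start positions.
targets = [(100, 200), (500, 600)]
reads = [50, 150, 199, 200, 450, 550, 999, 1000]
counts = off_target_bin_counts(reads, targets, bin_size=500)
print(dict(counts))
```

Per-bin counts like these would still need normalization against other samples before any copy-number inference. Large bins are a natural choice here because off-target coverage is sparse.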

18.
Hanwoo, the Korean native cattle breed, is indigenous to the Korean peninsula. Hanwoo have been used mainly as draft animals for about 5,000 years; however, in the last 30 years, their main role has shifted to meat production through selective breeding, which has led to substantial increases in their productivity. Massively parallel sequencing technology has recently made possible the systematic identification of structural variations in cattle genomes. In particular, copy number variation (CNV) has been recognized as an important genetic variation complementary to single-nucleotide polymorphisms that can be used to account for variations of economically important traits in cattle. Here we report genome-wide copy number variation regions (CNVRs) in Hanwoo cattle obtained by comparing the whole genome sequence of Hanwoo with Black Angus and Holstein sequence datasets. We identified 1,173 and 963 putative CNVRs representing 16.7 and 7.8 Mbp from comparisons between Black Angus and Hanwoo and between Holstein and Hanwoo, respectively. The potential functional roles of the CNVRs were assessed by Gene Ontology enrichment analysis. The results showed that response to stimulus, immune system process, and cellular component organization were highly enriched in the genic-CNVRs that overlapped with annotated cattle genes. Of the 11 CNVRs that were selected for validation by quantitative real-time PCR, 9 exhibited the expected copy number differences. The results reported in this study show that genome-wide CNVs were detected successfully using massively parallel sequencing technology. The CNVs may be a valuable resource for further studies to correlate CNVs and economically important traits in cattle.

19.
Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are not cost-efficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we lack efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of the genome termed Selected Target Regions (SeTRs). In our experiments, 99.73% to 99.95% of the SeTRs are covered with sufficient depth. Our bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

20.
Mouse phenome research: implications of genetic background   (total citations: 4; self-citations: 0; citations by others: 4)
Now that sequencing of the mouse genome has been completed, the function of each gene remains to be elucidated through phenotypic analysis. The "genetic background" (in which each gene functions) is defined as the genotype of all other related genes that may interact with the gene of interest, and therefore potentially influences the specific phenotype. To understand the nature and importance of genetic background on phenotypic expression of specific genes, it is necessary to know the origin and evolutionary history of the laboratory mouse genome. Molecular analysis has indicated that the fancy mice of Japan and Europe contributed significantly to the origin of today's laboratory mice. The genetic background of present-day laboratory mice varies by mouse strain, but is mainly derived from the European domesticus subspecies group and to a lesser degree from Asian mice, probably Japanese fancy mice, which belong to the musculus subspecies group. Inbred laboratory mouse strains are genetically uniform due to extensive inbreeding, and they have greatly contributed to the genetic analysis of many Mendelian traits. Meanwhile, for a variety of practical reasons, many transgenic and targeted mutant mice have been created in mice of mixed genetic backgrounds to elucidate the function of the genes, although efforts have been made to create inbred transgenic mice and targeted mutant mice with coisogenic embryonic stem cell lines. Inbred mouse strains have provided a uniform genetic background for accurate evaluation of the phenotypes of specific genes, thus eliminating the phenotypic variations caused by mixed genetic backgrounds. However, the process of inbreeding and selection of various inbred strain characteristics has resulted in inadvertent selection of other undesirable genetic characteristics and mutations that may influence the genotype and preclude effective phenotypic analysis. Because many of the common inbred mouse strains have been established from relatively small gene pools, common inbred strains have limitations in their genetic polymorphisms and phenotypic variations. Wild-derived mouse strains can complement deficiencies of common inbred mouse strains, providing novel allelic variants and phenotypes. Although wild-derived strains are not as tame as the common laboratory strains, their genetic characteristics are attractive for the future study of gene function.
