Similar literature
20 similar articles found
1.
An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000 annotated exons and boundaries from over 900 genes in 41 recessive mutant mouse lines that were isolated in an N-ethyl-N-nitrosourea (ENU) mutation screen targeted to mouse Chromosome 11. Fifty-nine sequence variants were identified in 55 genes from 31 mutant lines. 39% of the lesions lie in coding sequences and create primarily missense mutations. The other 61% lie in noncoding regions, many of them in highly conserved sequences. A lesion in the perinatal lethal line l11Jus13 alters a consensus splice site of nucleoredoxin (Nxn), inserting 10 amino acids into the resulting protein. We conclude that point mutations can be accurately and sensitively recovered by large-scale sequencing, and that conserved noncoding regions should be included for disease mutation identification. Only seven of the candidate genes we report have been previously targeted by mutation in mice or rats, showing that despite ongoing efforts to functionally annotate genes in the mammalian genome, an enormous gap remains between phenotype and function. Our data show that the classical positional mapping approach of disease mutation identification can be extended to large target regions using high-throughput sequencing.

2.
In a previous study we determined that BcA86 mice, a strain belonging to a panel of AcB/BcA recombinant congenic strains, have an airway responsiveness phenotype resembling that of mice from the airway hyperresponsive A/J strain. The majority of the BcA86 genome is, however, from the hyporesponsive C57BL/6J strain. The aim of this study was to identify candidate regions and genes associated with airway hyperresponsiveness (AHR) by quantitative trait locus (QTL) analysis using the BcA86 strain. Airway responsiveness of 205 F2 mice generated by backcrossing the BcA86 strain to the C57BL/6J strain was measured and used for QTL analysis to identify genomic regions in linkage with AHR. Consomic mice for the QTL-containing chromosomes were phenotyped to study the contribution of each chromosome to lung responsiveness. Candidate genes within the QTL were selected based on expression differences in mRNA from whole lungs, and on the presence of coding non-synonymous mutations that were predicted to have a functional effect by amino acid substitution prediction tools. One QTL for AHR was identified on Chromosome 12 with its 95% confidence interval ranging from 54.6 to 82.6 Mbp and a maximum LOD score of 5.11 (p = 3.68×10⁻³). We confirmed that the genotype of mouse Chromosome 12 is an important determinant of lung responsiveness using a Chromosome 12 substitution strain. Mice with an A/J Chromosome 12 on a C57BL/6J background have an AHR phenotype similar to hyperresponsive strains A/J and BcA86. Within the QTL, genes with deleterious coding variants, such as Foxa1, and genes with expression differences, such as Mettl21d and Snapc1, were selected as possible candidates for the AHR phenotype. Overall, through QTL analysis of a recombinant congenic strain, microarray analysis and coding variant analysis we identified Chromosome 12 and three potential candidate genes to be in linkage with airway responsiveness.
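
The LOD score quoted in this abstract compares the likelihood of the trait data with and without a QTL at a given position. The following is a minimal sketch, assuming single-marker regression with normally distributed residuals; the abstract does not say which mapping method was used, and the function name and genotype coding are hypothetical.

```python
import math


def lod_single_marker(genotypes, phenotypes):
    """LOD score for one marker under a normal linear model.

    genotypes: allele counts coded 0/1/2 (hypothetical coding).
    phenotypes: trait values, one per animal.
    LOD = (n / 2) * log10(RSS_null / RSS_marker).
    Assumes the marker varies and the fit is not perfect.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    mean_g = sum(genotypes) / n
    # Null model: the phenotype is explained by its mean only.
    rss_null = sum((y - mean_y) ** 2 for y in phenotypes)
    # Marker model: least-squares regression of phenotype on genotype.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    slope = sxy / sxx
    rss_marker = rss_null - slope * sxy
    return (n / 2) * math.log10(rss_null / rss_marker)
```

A marker unrelated to the trait gives a LOD near zero; the larger the share of trait variance the marker explains, the higher the score.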

3.

Background

The study of structural genomic variation, together with the development of microarray technology, has provided many genomic resources on the architecture of the human genome and has shown that human genome structure is far more complicated than previously thought.

Methodology/Principal Findings

In the International HapMap Project, Epstein-Barr virus-immortalized cell lines were used in preference to blood in order to obtain larger amounts of genomic DNA. However, genomic aberrations stemming from the immortalization process, biased representation of the donor tissue, and the culture process may influence the accuracy of SNP genotypes. To identify chromosome aberrations, including loss of heterozygosity (LOH) and large-scale and small-scale copy number variations (CNVs), we used the Illumina HumanHap500 BeadChip (555,352 markers) on Korean HapMap individuals (n = 90) to obtain Log R ratio and B allele frequency information, and then analyzed the data with several programs, including Illumina ChromoZone, cnvPartition and PennCNV. As a result, we identified 28 LOHs (>3 Mb) and 35 large-scale CNVs (>1 Mb), with 4 samples having a completely duplicated chromosome. In addition, after checking sample quality (standard deviation of Log R ratio <0.30), we selected 79 samples and used signal intensity and B allele frequency simultaneously to identify small-scale CNVs (<1 Mb), discovering 4,989 small-scale CNVs. The CNVs identified in this study were successfully validated by visual examination of the genoplot images, overlap analysis with previously reported CNVs in DGV, and quantitative PCR.
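
The sample quality check described above (standard deviation of Log R ratio below 0.30) can be sketched as follows. This is an illustration only: the data layout and function names are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not specify the estimator.

```python
import statistics

LRR_SD_THRESHOLD = 0.30  # cutoff quoted in the abstract


def passes_lrr_qc(log_r_ratios, threshold=LRR_SD_THRESHOLD):
    """Return True when the SD of a sample's Log R ratios is below the cutoff."""
    # statistics.stdev is the sample SD (an assumption, see lead-in).
    return statistics.stdev(log_r_ratios) < threshold


def select_samples(samples):
    """Keep sample IDs that pass QC.

    samples: dict mapping sample ID -> list of per-marker Log R ratios
    (hypothetical layout).
    """
    return [sid for sid, lrr in samples.items() if passes_lrr_qc(lrr)]
```

Noisy samples inflate the spread of Log R ratio and produce spurious small CNV calls, which is why a filter of this kind is applied before small-scale CNV detection.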

Conclusion/Significance

In this study, we describe the chromosome aberrations identified in Korean HapMap individuals, and we expect that these findings will provide more meaningful information on the human genome.

4.
Identifying genomic locations that have experienced selective sweeps is an important first step toward understanding the molecular basis of adaptive evolution. Using statistical methods that account for the confounding effects of population demography, recombination rate variation, and single-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American, and Chinese samples. We identify 101 regions of the human genome with very strong evidence (p < 10⁻⁵) of a recent selective sweep and where our estimate of the position of the selective sweep falls within 100 kb of a known gene. Within these regions, genes of biological interest include genes in pigmentation pathways, components of the dystrophin protein complex, clusters of olfactory receptors, genes involved in nervous system development and function, immune system genes, and heat shock genes. We also observe consistent evidence of selective sweeps in centromeric regions. In general, we find that recent adaptation is strikingly pervasive in the human genome, with as much as 10% of the genome affected by linkage to a selective sweep.

5.
Genome annotation in differently evolved organisms presents challenges because the lack of sequence-based homology limits the ability to determine the function of putative coding regions. To provide an alternative to annotation by sequence homology, we developed a method that takes advantage of unusual trypanosomatid biology and skews in nucleotide composition between coding regions and upstream regions to rank putative open reading frames based on the likelihood of coding. The method is 93% accurate when tested on known genes. We have applied our method to the full complement of open reading frames on Chromosome I of Trypanosoma brucei, and we can predict with high confidence that 226 putative coding regions are likely to be functional. Methods such as the one described here for discriminating true coding regions are critical for genome annotation when other sources of evidence for function are limited.

6.
Multiple disease resistance has important implications for plant fitness, given the selection pressure that many pathogens exert directly on natural plant populations and indirectly via crop improvement programs. Evidence of a locus conditioning resistance to multiple pathogens was found in bin 1.06 of the maize genome with the allele from inbred line “Tx303” conditioning quantitative resistance to northern leaf blight (NLB) and qualitative resistance to Stewart’s wilt. To dissect the genetic basis of resistance in this region and to refine candidate gene hypotheses, we mapped resistance to the two diseases. Both resistance phenotypes were localized to overlapping regions, with the Stewart’s wilt interval refined to a 95.9-kb segment containing three genes and the NLB interval to a 3.60-Mb segment containing 117 genes. Regions of the introgression showed little to no recombination, suggesting structural differences between the inbred lines Tx303 and “B73,” the parents of the fine-mapping population. We examined copy number variation across the region using next-generation sequencing data, and found large variation in read depth in Tx303 across the region relative to the reference genome of B73. In the fine-mapping region, association mapping for NLB implicated candidate genes, including a putative zinc finger and pan1. We tested mutant alleles and found that pan1 is a susceptibility gene for NLB and Stewart’s wilt. Our data strongly suggest that structural variation plays an important role in resistance conditioned by this region, and pan1, a gene conditioning susceptibility for NLB, may underlie the QTL.

7.
8.
Olfactory receptors (OR), responsible for detection of odor molecules, belong to the largest family of genes and are highly polymorphic in nature, with distinct polymorphisms associated with specific regions around the globe. Since there are no reports on the presence of copy number variations in the OR repertoire of the Indian population, the present investigation in 43 Indians along with 270 HapMap and 31 Tibetan samples was undertaken to study genome variability and evolution. Analysis was performed using the Affymetrix Genome-Wide Human SNP Array 6.0 chip, the Affymetrix CytoScan® High-Density array, HD-CNV, and the MAFFT program. We observed a total of 1527 OR genes in 503 CNV events from 81.3% of the study group, comprising 67.6% duplications and 32.4% deletions and encompassing more genes than pseudogenes. We report human genotypic variation in functional OR repertoire size across populations, and we found that the combinatorial effect of both “orthologous obtained from closely related species” and “paralogous derived sequences” provides the complexity of the continuously occurring OR CNVs.

9.
10.
Detecting recently selected ‘genomic footprints’ applies directly to the discovery of disease genes and to the imputation of the formative events that molded modern population genetic structure. The imprints of historic selection/adaptation episodes left in human and animal genomes allow one to interpret modern and ancestral gene origins and modifications. Current approaches to reveal selected regions applied in genome-wide selection scans (GWSSs) fall into eight principal categories: (I) phylogenetic footprinting, (II) detecting increased rates of functional mutations, (III) evaluating divergence versus polymorphism, (IV) detecting extended segments of linkage disequilibrium, (V) evaluating local reduction in genetic variation, (VI) detecting changes in the shape of the frequency distribution (spectrum) of genetic variation, (VII) assessing differentiation between populations (FST), and (VIII) detecting excess or decrease in admixture contribution from one population. Here, we review and compare these approaches using available human genome-wide datasets to provide independent verification (or not) of regions found by different methods and using different populations. The lessons learned from GWSSs will be applied to identify genome signatures of historic selective pressures on genes and gene regions in other species with emerging genome sequences. This would offer considerable potential for genome annotation in functional, developmental and evolutionary contexts.
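
Category (VII) above relies on FST, which measures how much of the total genetic variation is due to differences between populations. The sketch below assumes the simplest textbook case, one biallelic locus and two equally weighted populations, with FST = (HT − HS) / HT; the function name is hypothetical, and the review itself may compare other estimators.

```python
def fst_two_populations(p1, p2):
    """Wright's FST at one biallelic locus for two equally weighted populations.

    p1, p2: frequency of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity in the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Locus is monomorphic overall; there is no variation to partition.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t
```

Identical allele frequencies give 0 and fixed differences give 1, so loci with unusually high values relative to the genome-wide distribution are candidates for population-specific selection.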

11.

Background

The detection and functional characterization of genomic structural variations are important for understanding the landscape of genetic variation in the chicken. A recently recognized aspect of genomic structural variation, called copy number variation (CNV), is gaining interest in chicken genomic studies. The aim of the present study was to investigate the pattern and functional characterization of CNVs in five characteristic chicken breeds, which will be important for future studies associating phenotype with chicken genome architecture.

Results

Using a commercial 385 K array-based comparative genomic hybridization (aCGH) genome array, we performed CNV discovery using 10 chicken samples from four local Chinese breeds and the French breed Houdan chicken. The female Anka broiler was used as a reference. A total of 281 copy number variation regions (CNVRs) were identified, covering 12.8 Mb of polymorphic sequences or 1.07% of the entire chicken genome. The functional annotation of CNVRs indicated that these regions completely or partially overlapped with 231 genes and 1032 quantitative trait loci, suggesting that these CNVs have important functions and might be promising resources for exploring differences among various breeds. In addition, we employed quantitative PCR (qPCR) to further validate several copy number variable genes with important functions, such as prolactin receptor, endothelin 3 (EDN3), suppressor of cytokine signaling 2, and CD8a molecule, and the results suggested that EDN3 might be a molecular marker for the selection of dark skin color in poultry production. Moreover, we also identified a new CNVR (chr24: 3484617–3512275), encoding the sortilin-related receptor gene, with copy number changes only in black-bone chicken.
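
CNV regions (CNVRs) are commonly built by merging overlapping CNV calls across samples, after which genome coverage can be computed. The following is a minimal sketch under stated assumptions: the abstract does not give the merging rule, so "any overlap" is assumed here, intervals are treated as half-open, and all names are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into CNV regions.

    cnvs: iterable of (chrom, start, end) tuples, half-open coordinates.
    Returns merged regions sorted by chromosome name and start.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps (or touches) the previous region: extend it.
            prev = regions[-1]
            regions[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


def coverage_percent(regions, genome_size):
    """Percentage of the genome covered by non-overlapping regions."""
    covered = sum(end - start for _, start, end in regions)
    return 100 * covered / genome_size
```

Merging first matters because summing raw calls would count shared sequence once per sample and overstate the polymorphic fraction of the genome.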

Conclusions

Here, we report a genome-wide analysis of the CNVs in five chicken breeds using aCGH. The association between EDN3 and melanoblast proliferation was further confirmed using qPCR. These results provide additional information for understanding genomic variation and related phenotypic characteristics.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-934) contains supplementary material, which is available to authorized users.

12.
DNA base composition is a fundamental genome feature. However, the evolutionary pattern of base composition and its potential causes have not been well understood. Here, we report findings from comparative analysis of base composition at the whole-genome level across 2210 species, the polymorphic-site level across eight population comparison sets, and the mutation-site level in 12 mutation-tracking experiments. We first demonstrate that base composition follows the individual-strand base equality rule at the genome, chromosome and polymorphic-site levels. More intriguingly, clear separation of base-composition values calculated across polymorphic sites was consistently observed between basal and derived groups, suggesting common underlying mechanisms. Individuals in the derived groups show an A&T-increase/G&C-decrease pattern compared with the basal groups. Spontaneous and induced mutation experiments indicated these patterns of base composition change can emerge across mutation sites. With base-composition across polymorphic sites as a genome phenotype, genome scans with human 1000 Genomes and HapMap3 data identified a set of significant genomic regions enriched with Gene Ontology terms for DNA repair. For three DNA repair genes (BRIP1, PMS2P3 and TTDN), ENCODE data provided evidence for interaction between genomic regions containing these genes and regions containing the significant SNPs. Our findings provide insights into the mechanisms of genome evolution.
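
The individual-strand base equality rule states that, within a single strand, A occurs about as often as T and G about as often as C. A small sketch of how one might check this on a sequence; the function name and returned keys are hypothetical, and the sequence is assumed to contain at least one T and one C.

```python
from collections import Counter


def strand_base_ratios(sequence):
    """Single-strand base ratios for a DNA sequence.

    Returns the A/T ratio, the G/C ratio, and the GC fraction.
    Ratios near 1.0 are consistent with the base equality rule.
    """
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    return {
        "A/T": a / t,
        "G/C": g / c,
        "GC": (g + c) / (a + t + g + c),
    }
```

The rule is statistical, so it is expected to hold on long sequences such as whole chromosomes rather than on short fragments.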

13.
Power to detect risk alleles using genome-wide tag SNP panels
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ~ 1.8–2.0). Relative risks as low as λ ~ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
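
To make the power statements above concrete, here is a rough sketch of a power calculation for an allelic case-control test. It is not the authors' method: it assumes a multiplicative risk model, a rare disease (so controls have the population allele frequency), direct genotyping of the causal SNP (no loss from imperfect tagging), and a normal approximation. All function names and parameters are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(p, rr):
    """Risk-allele frequency in cases under a multiplicative model.

    p: population risk-allele frequency; rr: per-allele relative risk.
    """
    return p * rr / (1 - p + p * rr)


def allelic_test_power(p, rr, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allelic test at significance alpha."""
    p_case = case_allele_freq(p, rr)
    # Each individual contributes two alleles.
    se = (p_case * (1 - p_case) / (2 * n_cases)
          + p * (1 - p) / (2 * n_controls)) ** 0.5
    z_effect = abs(p_case - p) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The far tail of the two-sided test is ignored as negligible.
    return NormalDist().cdf(z_effect - z_crit)


def power_at_least_one(power, n_loci):
    """Power to detect at least one of n independent loci of equal power."""
    return 1 - (1 - power) ** n_loci
```

The last function illustrates why the abstract reports higher power when several independent loci are involved: the chance of missing all of them shrinks geometrically with their number.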

14.
A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera’s database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000–20 000 NotI sites, of which 6000–9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
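
NotI recognizes the 8-bp palindromic sequence GCGGCCGC, so candidate NotI sites and their flanks can be located in an assembled sequence by a plain string search. A minimal sketch; the function name and the default flank length are hypothetical choices for illustration, not parameters from the study.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence (palindromic)


def find_noti_flanks(sequence, flank=50):
    """Locate NotI sites and return their flanking sequences.

    Returns a list of (position, left_flank, right_flank) tuples, with
    0-based positions. Because the site is palindromic, scanning one
    strand is sufficient.
    """
    seq = sequence.upper()
    results = []
    pos = seq.find(NOTI_SITE)
    while pos != -1:
        left = seq[max(0, pos - flank):pos]
        site_end = pos + len(NOTI_SITE)
        right = seq[site_end:site_end + flank]
        results.append((pos, left, right))
        # Advance by one base so overlapping sites are not missed.
        pos = seq.find(NOTI_SITE, pos + 1)
    return results
```

A sequence search of this kind finds every potential site; it says nothing about methylation, which determines whether a site is actually cut in a given cell.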

15.
16.
17.
18.
We describe methods for rapid sequencing of the entire human mitochondrial genome (mtgenome), which involve long-range PCR for specific amplification of the mtgenome, pyrosequencing, quantitative mapping of sequence reads to identify sequence variants and heteroplasmy, as well as de novo sequence assembly. These methods have been used to study 40 publicly available HapMap samples of European (CEU) and African (YRI) ancestry to demonstrate a sequencing error rate <5.63×10⁻⁴, nucleotide diversity of 1.6×10⁻³ for CEU and 3.7×10⁻³ for YRI, patterns of sequence variation consistent with earlier studies, but a higher rate of heteroplasmy varying between 10% and 50%. These results demonstrate that next-generation sequencing technologies allow interrogation of the mitochondrial genome in greater depth than previously possible, which may be of value in biology and medicine.
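
Nucleotide diversity, as reported above for CEU and YRI, is the average number of differences per site between pairs of sequences. A minimal sketch, assuming pre-aligned sequences of equal length and counting every mismatching column (including gaps or ambiguity codes) as a difference; the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site among aligned sequences.

    sequences: list of at least two equal-length aligned strings.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching columns for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)
```

Because the value is normalized per site and per pair, it can be compared directly between sample sets of different sizes, as the abstract does for the two populations.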

19.
20.

Background

While the possible sources underlying the so-called ‘missing heritability’ evident in current genome-wide association studies (GWAS) of complex traits have been actively pursued in recent years, resolving this mystery remains a challenging task. Studying heritability of genome-wide gene expression traits can shed light on the goal of understanding the relationship between phenotype and genotype. Here we used microarray gene expression measurements of lymphoblastoid cell lines and genome-wide SNP genotype data from 210 HapMap individuals to examine the heritability of gene expression traits.

Results

Heritability levels for expression of 10,720 genes were estimated by applying variance component model analyses, and 1,043 expression quantitative trait loci (eQTLs) were detected. Our results indicate that gene expression traits display a bimodal distribution of heritability, with one peak close to 0% and the other approaching 100%. Such a pattern of the within-population variability of gene expression heritability is common among different HapMap populations of unrelated individuals but different from that obtained in the CEU and YRI trio samples. Higher heritability levels are shown by housekeeping genes and genes associated with cis eQTLs. Both cis and trans eQTLs make comparable cumulative contributions to the heritability. Finally, we modelled gene-gene interactions (epistasis) for genes with multiple eQTLs and revealed that epistasis was not prevalent in all genes but made a substantial contribution in explaining total heritability for some genes analysed.

Conclusions

We utilised a mixed effect model analysis for estimating genetic components from population-based samples. On the basis of analyses of genome-wide gene expression from four HapMap populations, we demonstrated detailed exploitation of the distribution of genetic heritabilities for expression traits from different populations, and highlighted the importance of studying interaction at the gene expression level as an important source of variation underlying missing heritability.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-13) contains supplementary material, which is available to authorized users.
