Similar literature
20 similar articles found (search time: 31 ms)
1.
Domestication and selection for important performance traits can impact the genome, which is most often reflected by reduced heterozygosity in and surrounding genes related to traits affected by selection. In this study, analysis of the genomic impact caused by domestication and artificial selection was conducted by investigating the signatures of selection using single nucleotide polymorphisms (SNPs) in channel catfish (Ictalurus punctatus). A total of 8.4 million candidate SNPs were identified by using next generation sequencing. On average, the channel catfish genome harbors one SNP per 116 bp. Approximately 6.6 million, 5.3 million, 4.9 million, 7.1 million and 6.7 million SNPs were detected in the Marion, Thompson, USDA103, Hatchery strain, and wild population, respectively. The allele frequencies of 407,861 SNPs differed significantly between the domestic and wild populations. With these SNPs, 23 genomic regions with putative selective sweeps were identified that included 11 genes. Although the function of the majority of the genes remains unknown in catfish, several genes with known function related to aquaculture performance traits were included in the regions with selective sweeps. These included hypoxia-inducible factor 1β (HIF1β) and the transporter gene ATP-binding cassette sub-family B member 5 (ABCB5). HIF1β is important for the response to hypoxia, and tolerance to low oxygen levels is a critical aquaculture trait. The large numbers of SNPs identified from this study are valuable for the development of high-density SNP arrays for genetic and genomic studies of performance traits in catfish.

2.
Identifying genomic locations that have experienced selective sweeps is an important first step toward understanding the molecular basis of adaptive evolution. Using statistical methods that account for the confounding effects of population demography, recombination rate variation, and single-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American, and Chinese samples. We identify 101 regions of the human genome with very strong evidence (p < 10⁻⁵) of a recent selective sweep and where our estimate of the position of the selective sweep falls within 100 kb of a known gene. Within these regions, genes of biological interest include genes in pigmentation pathways, components of the dystrophin protein complex, clusters of olfactory receptors, genes involved in nervous system development and function, immune system genes, and heat shock genes. We also observe consistent evidence of selective sweeps in centromeric regions. In general, we find that recent adaptation is strikingly pervasive in the human genome, with as much as 10% of the genome affected by linkage to a selective sweep.

3.
4.

Background

Vulnerabilities to dependence on addictive substances are substantially heritable complex disorders whose underlying genetic architecture is likely to be polygenic, with modest contributions from variants in many individual genes. “Nontemplate” genome wide association (GWA) approaches can identify groups of chromosomal regions and genes that, taken together, are much more likely to contain allelic variants that alter vulnerability to substance dependence than expected by chance.

Methodology/Principal Findings

We report pooled “nontemplate” genome-wide association studies of two independent samples of substance dependent vs control research volunteers (n = 1620), one European-American and the other African-American using 1 million SNP (single nucleotide polymorphism) Affymetrix genotyping arrays. We assess convergence between results from these two samples using two related methods that seek clustering of nominally-positive results and assess significance levels with Monte Carlo and permutation approaches. Both “converge then cluster” and “cluster then converge” analyses document convergence between the results obtained from these two independent datasets in ways that are virtually never found by chance. The genes identified in this fashion are also identified by individually-genotyped dbGAP data that compare allele frequencies in cocaine dependent vs control individuals.

Conclusions/Significance

These overlapping results identify small chromosomal regions that are also identified by genome wide data from studies of other relevant samples to extents much greater than chance. These chromosomal regions contain more genes related to “cell adhesion” processes than expected by chance. They also contain a number of genes that encode potential targets for anti-addiction pharmacotherapeutics. “Nontemplate” GWA approaches that seek chromosomal regions in which nominally-positive associations are found in multiple independent samples are likely to complement classical, “template” GWA approaches in which “genome wide” levels of significance are sought for SNP data from single case vs control comparisons.

5.
Bicuspid Aortic Valve (BAV) is a highly heritable congenital heart defect. The low frequency of BAV (1% of general population) limits our ability to perform genome-wide association studies. We present the application of four a priori SNP selection techniques, reducing the multiple-testing penalty by restricting analysis to SNPs relevant to BAV in a genome-wide SNP dataset from a cohort of 68 BAV probands and 830 control subjects. Two knowledge-based approaches, CANDID and STRING, were used to systematically identify BAV genes, and their SNPs, from the published literature, microarray expression studies and a genome scan. We additionally tested Functionally Interpolating SNPs (fitSNPs) present on the array; the fourth consisted of SNPs selected by Random Forests, a machine learning approach. These approaches reduced the multiple testing penalty by lowering the fraction of the genome probed to 0.19% of the total, while increasing the likelihood of studying SNPs within relevant BAV genes and pathways. Three loci were identified by CANDID, STRING, and fitSNPs. A haplotype within the AXIN1-PDIA2 locus (p-value of 2.926×10⁻⁶) and a haplotype within the Endoglin gene (p-value of 5.881×10⁻⁴) were found to be strongly associated with BAV. The Random Forests approach identified a SNP on chromosome 3 in association with BAV (p-value 5.061×10⁻⁶). The results presented here support an important role for genetic variants in BAV and provide support for additional studies in well-powered cohorts. Further, these studies demonstrate that leveraging existing expression and genomic data in the context of GWAS studies can identify biologically relevant genes and pathways associated with a congenital heart defect.

6.

Background

A large single nucleotide polymorphism (SNP) dataset was used to analyze genome-wide diversity in a diverse collection of watermelon cultivars representing globally cultivated, watermelon genetic diversity. The marker density required for conducting successful association mapping depends on the extent of linkage disequilibrium (LD) within a population. Use of genotyping by sequencing reveals large numbers of SNPs that in turn generate opportunities in genome-wide association mapping and marker-assisted selection, even in crops such as watermelon for which few genomic resources are available. In this paper, we used genome-wide genetic diversity to study LD, selective sweeps, and pairwise FST distributions among worldwide cultivated watermelons to track signals of domestication.

Results

We examined 183 Citrullus lanatus var. lanatus accessions representing domesticated watermelon and generated a set of 11,485 SNP markers using genotyping by sequencing. With a diverse panel of worldwide cultivated watermelons, we identified a set of 5,254 SNPs with a minor allele frequency of ≥ 0.05, distributed across the genome. All ancestries were traced to Africa and an admixture of various ancestries constituted secondary gene pools across various continents. A sliding window analysis using pairwise FST values was used to resolve selective sweeps. We identified strong selection on chromosomes 3 and 9 that might have contributed to the domestication process. Pairwise analysis of adjacent SNPs within a chromosome as well as within a haplotype allowed us to estimate genome-wide LD decay. LD was also detected within individual genes on various chromosomes. Principal component and ancestry analyses were used to account for population structure in a genome-wide association study. We further mapped important genes for soluble solid content using a mixed linear model.

Conclusions

Information concerning the SNP resources, population structure, and LD developed in this study will help in identifying agronomically important candidate genes from the genomic regions underlying selection and for mapping quantitative trait loci using a genome-wide association study in sweet watermelon.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-767) contains supplementary material, which is available to authorized users.

7.

Background

Bacteriophages that infect the opportunistic pathogen Pseudomonas aeruginosa have been classified into several groups. One of them, which includes temperate phage particles with icosahedral heads and long flexible tails, bears genomes whose architecture and replication mechanism, but not their nucleotide sequences, are like those of coliphage Mu. By comparing the genomic sequences of this group of P. aeruginosa phages one could draw conclusions about their ontogeny and evolution.

Results

Two newly isolated Mu-like phages of P. aeruginosa are described and their genomes sequenced and compared with those available in the public data banks. The genome sequences of the two phages are similar to each other and to those of a group of P. aeruginosa transposable phages. Comparing twelve of these genomes revealed a common genomic architecture in the group. Each phage genome had numerous genes with homologues in all the other genomes and a set of variable genes specific for each genome. The first group, which comprised most of the genes with assigned functions, was named “core genome”, and the second group, containing mostly short ORFs without assigned functions was called “accessory genome”. Like in other phage groups, variable genes are confined to specific regions in the genome.

Conclusion

Based on the known and inferred functions for some of the variable genes of the phages analyzed here, they appear to confer selective advantages for the phage survival under particular host conditions. We speculate that phages have developed a mechanism for horizontally acquiring genes to incorporate them at specific loci in the genome that help phage adaptation to the selective pressures imposed by the host.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1146) contains supplementary material, which is available to authorized users.

8.
Ironside JE, Filatov DA. Genetics, 2005, 171(2): 705-713
Previous studies have demonstrated that the diversity of Y-linked genes is substantially lower than that of their X-linked homologs in the plant Silene latifolia. This difference has been attributed to selective sweeps, Muller's ratchet, and background selection, processes that are predicted to severely affect the evolution of the nonrecombining Y chromosome. We studied the DNA diversity of a noncoding region of the homologous genes DD44Y and DD44X, sampling S. latifolia populations from a wide geographical area and also including the closely related species S. dioica, S. diclinis, and S. heuffelii. On the Y chromosome of S. latifolia, we found substantial DNA diversity. Geographical population structure was far higher than on the X chromosome and differentiation between the species was also higher for the Y than for the X chromosome. Our findings indicate that the loss of genetic diversity on the Y chromosome in Silene occurs within local populations rather than within entire species. These results are compatible with background selection, Muller's ratchet, and local selective sweeps, but not with species-wide selective sweeps. The higher interspecific divergence of DD44Y, compared to DD44X, supports the hypothesis that Y chromosome differentiation between incipient species precedes reproductive isolation of the entire genome, forming an early stage in the process of speciation.

9.
The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor “silent” germline micronuclear genome by a process of “unscrambling” and fragmentation. The tiny macronuclear “nanochromosomes” typically encode single, protein-coding genes (a small portion, 10%, encode 2–8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing studies of rearrangements arising during evolution and disease.

10.
Alternative synonymous codons are often used at unequal frequencies. Classically, studies of such codon usage bias (CUB) attempted to separate the impact of neutral from selective forces by assuming that deviations from a predicted neutral equilibrium capture selection. However, GC-biased gene conversion (gBGC) can also cause deviation from a neutral null. Alternatively, selection has been inferred from CUB in highly expressed genes, but the accuracy of this approach has not been extensively tested, and gBGC can interfere with such extrapolations (e.g., if expression and gene conversion rates covary). It is therefore critical to examine deviations from a mutational null in a species with no gBGC. To achieve this goal, we implement such an analysis in the highly AT rich genome of Dictyostelium discoideum, where we find no evidence of gBGC. We infer neutral CUB under mutational equilibrium to quantify “adaptive codon preference,” a nontautologous genome wide quantitative measure of the relative selection strength driving CUB. We observe signatures of purifying selection consistent with selection favoring adaptive codon preference. Preferred codons are not GC rich, underscoring the independence from gBGC. Expression-associated “preference” largely matches adaptive codon preference but does not wholly capture the influence of selection shaping patterns across all genes, suggesting selective constraints associated specifically with high expression. We observe patterns consistent with effects on mRNA translation and stability shaping adaptive codon preference. Thus, our approach to quantifying adaptive codon preference provides a framework for inferring the sources of selection that shape CUB across different contexts within the genome.

11.
Here, we report the genome of one gammaproteobacterial member of the gut microbiota, for which we propose the name “Candidatus Schmidhempelia bombi,” that was inadvertently sequenced alongside the genome of its host, the bumble bee, Bombus impatiens. This symbiont is a member of the recently described bacterial order Orbales, which has been collected from the guts of diverse insect species; however, “Ca. Schmidhempelia” has been identified exclusively with bumble bees. Metabolic reconstruction reveals that “Ca. Schmidhempelia” lacks many genes for a functioning NADH dehydrogenase I, all genes for the high-oxygen cytochrome o, and most genes in the tricarboxylic acid (TCA) cycle. “Ca. Schmidhempelia” has retained NADH dehydrogenase II, the low-oxygen specific cytochrome bd, anaerobic nitrate respiration, mixed-acid fermentation pathways, and citrate fermentation, which may be important for survival in low-oxygen or anaerobic environments found in the bee hindgut. Additionally, a type 6 secretion system, a Flp pilus, and many antibiotic/multidrug transporters suggest complex interactions with its host and other gut commensals or pathogens. This genome has signatures of reduction (2.0 megabase pairs) and rearrangement, as previously observed for genomes of host-associated bacteria. A survey of wild and laboratory B. impatiens revealed that “Ca. Schmidhempelia” is present in 90% of individuals and, therefore, may provide benefits to its host.

12.

Background

There are several studies describing loss of genes through reductive evolution in microbes, but how selective forces are associated with genome expansion due to horizontal gene transfer (HGT) has not received similar attention. The aim of this study was therefore to examine how selective pressures influence genome expansion in 53 fully sequenced and assembled Escherichia coli strains. We also explored potential connections between genome expansion and the attainment of virulence factors. This was performed using estimations of several genomic parameters such as AT content, genomic drift (measured using relative entropy), genome size and estimated HGT size, which were subsequently compared to analogous parameters computed from the core genome consisting of 1729 genes common to the 53 E. coli strains. Moreover, we analyzed how selective pressures (quantified using relative entropy and dN/dS), acting on the E. coli core genome, influenced lineage and phylogroup formation.

Results

Hierarchical clustering of dS and dN estimations from the E. coli core genome resulted in phylogenetic trees with topologies in agreement with known E. coli taxonomy and phylogroups. High values of dS, compared to dN, indicate that the E. coli core genome has been subjected to substantial purifying selection over time; significantly more than the non-core part of the genome (p<0.001). This is further supported by a linear association between strain-wise dS and dN values (β = 26.94 ± 0.44, R² ≈ 0.98, p<0.001). The non-core part of the genome was also significantly more AT-rich (p<0.001) than the core genome and E. coli genome size correlated with estimated HGT size (p<0.001). In addition, genome size (p<0.001), AT content (p<0.001) as well as estimated HGT size (p<0.005) were all associated with the presence of virulence factors, suggesting that pathogenicity traits in E. coli are largely attained through HGT. No associations were found between selective pressures operating on the E. coli core genome, as estimated using relative entropy, and genome size (p ≈ 0.98).

Conclusions

On a larger time frame, genome expansion in E. coli, which is significantly associated with the acquisition of virulence factors, appears to be independent of selective forces operating on the core genome.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-882) contains supplementary material, which is available to authorized users.

13.
14.
15.
Ammonia-oxidizing archaea (AOA) are ubiquitous and abundant and contribute significantly to the carbon and nitrogen cycles in the ocean. In this study, we assembled AOA draft genomes from two deep marine sediments from Donghae, South Korea, and Svalbard, Arctic region, by sequencing the enriched metagenomes. Three major microorganism clusters belonging to Thaumarchaeota, Epsilonproteobacteria, and Gammaproteobacteria were deduced from their 16S rRNA genes, GC contents, and oligonucleotide frequencies. Three archaeal genomes were identified, two of which were distinct and were designated Ca. “Nitrosopumilus koreensis” AR1 and “Nitrosopumilus sediminis” AR2. AR1 and AR2 exhibited average nucleotide identities of 85.2% and 79.5% to N. maritimus, respectively. The AR1 and AR2 genomes contained genes pertaining to energy metabolism and carbon fixation as conserved in other AOA, but, conversely, had fewer heme-containing proteins and more copper-containing proteins than other AOA. Most of the distinctive AR1 and AR2 genes were located in genomic islands (GIs) that were not present in other AOA genomes or in a reference water-column metagenome from the Sargasso Sea. A putative gene cluster involved in urea utilization was found in the AR2 genome, but not the AR1 genome, suggesting niche specialization in marine AOA. Co-cultured bacterial genome analysis suggested that bacterial sulfur and nitrogen metabolism could be involved in interactions with AOA. Our results provide fundamental information concerning the metabolic potential of deep marine sedimentary AOA.

16.
The aim of this investigation was to exploit the vast comparative data generated by comparative genome hybridization (CGH) studies of Campylobacter jejuni in developing a genotyping method. We examined genes in C. jejuni that exhibit binary status (present or absent between strains) within known plasticity regions, in order to identify a minimal subset of gene targets that provide high-resolution genetic fingerprints. Using CGH data from three studies as input, binary gene sets were identified with “Minimum SNPs” software. “Minimum SNPs” selects for the minimum number of targets required to obtain a predefined resolution, based on Simpson's index of diversity (D). After implementation of stringent criteria for gene presence/absence, eight binary genes were found that provided 100% resolution (D = 1) of 20 C. jejuni strains. A real-time PCR assay was developed and tested on 181 C. jejuni and Campylobacter coli isolates, a subset of which have previously been characterized by multilocus sequence typing, flaA short variable region sequencing, and pulsed-field gel electrophoresis. In addition to the binary gene real-time PCR assay, we refined the seven-member single nucleotide polymorphism (SNP) real-time PCR assay previously described for C. jejuni and C. coli. By normalizing the SNP assay with the respective C. jejuni and C. coli ubiquitous genes, mapA and ceuE, the polymorphisms at each SNP could be determined without separate reactions for every polymorphism. We have developed and refined a rapid, highly discriminatory genotyping method for C. jejuni and C. coli that uses generic technology and is amenable to high-throughput analyses.
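
As an illustration of the resolution measure mentioned in this abstract, the following is a minimal sketch of Simpson's index of diversity in the form commonly used for typing methods, D = 1 − Σ nⱼ(nⱼ − 1) / (N(N − 1)), where N is the number of isolates and nⱼ the number of isolates sharing type j. It is an assumption that the “Minimum SNPs” software uses exactly this form; the function name and the example presence/absence profiles are hypothetical and are not taken from the study.

```python
from collections import Counter


def simpson_diversity(profiles):
    """Simpson's index of diversity D for a list of hashable typing profiles.

    D is the probability that two isolates drawn at random (without
    replacement) have different profiles.
    """
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(profiles)
    # Number of ordered pairs of isolates that share the same profile.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical presence (1) / absence (0) profiles over three binary genes.
isolates = [(1, 0, 1), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
print(simpson_diversity(isolates))
```

Under this definition, D = 1 means every isolate has a unique profile, which is consistent with the abstract's statement that eight binary genes gave 100% resolution of 20 strains.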

17.
18.
In Europe, especially in Mediterranean areas, the sheep has been traditionally exploited as a dual purpose species, with income from both meat and milk. Modernization of husbandry methods and the establishment of breeding schemes focused on milk production have led to the development of “dairy breeds.” This study investigated selective sweeps specifically related to dairy production in sheep by searching for regions commonly identified in different European dairy breeds. With this aim, genotypes from 44,545 SNP markers covering the sheep autosomes were analysed in both European dairy and non-dairy sheep breeds using two approaches: (i) identification of genomic regions showing extreme genetic differentiation between each dairy breed and a closely related non-dairy breed, and (ii) identification of regions with reduced variation (heterozygosity) in the dairy breeds using two methods. Regions detected in at least two breeds (breed pairs) by the two approaches (genetic differentiation and at least one of the heterozygosity-based analyses) were labeled as core candidate convergence regions and further investigated for candidate genes. Following this approach six regions were detected. For some of them, strong candidate genes have been proposed (e.g. ABCG2, SPP1), whereas some other genes designated as candidates based on their association with sheep and cattle dairy traits (e.g. LALBA, DGAT1A) were not associated with a detectable sweep signal. Few of the identified regions were coincident with QTL previously reported in sheep, although many of them corresponded to orthologous regions in cattle where QTL for dairy traits have been identified. Due to the limited number of QTL studies reported in sheep compared with cattle, the results illustrate the potential value of selection mapping to identify genomic regions associated with dairy traits in sheep.

19.
Association studies in candidate genes have been widely used to search for common low penetrance susceptibility alleles, but few definite associations have been established. We have conducted association studies in breast cancer using an empirical single nucleotide polymorphism (SNP) tagging approach to capture common genetic variation in genes that are candidates for breast cancer based on their known function. We genotyped 710 SNPs in 120 candidate genes in up to 4,400 breast cancer cases and 4,400 controls using a staged design. Correction for population stratification was done using the genomic control method, on the basis of data from 280 genomic control SNPs. Evidence for association with each SNP was assessed using a Cochran–Armitage trend test (p-trend) and a two-degrees of freedom χ² test for heterogeneity (p-het). The most significant single SNP (p-trend = 8 × 10⁻⁵) was not significant at a nominal 5% level after adjusting for population stratification and multiple testing. To evaluate the overall evidence for an excess of positive associations over the proportion expected by chance, we applied two global tests: the admixture maximum likelihood (AML) test and the rank truncated product (RTP) test corrected for population stratification. The admixture maximum likelihood experiment-wise test for association was significant for both the heterogeneity test (p = 0.0031) and the trend test (p = 0.017), but no association was observed using the rank truncated product method for either the heterogeneity test or the trend test (p = 0.12 and p = 0.24, respectively). Genes in the cell-cycle control pathway and genes involved in steroid hormone metabolism and signalling were the main contributors to the association. These results suggest that a proportion of SNPs in these candidate genes are associated with breast cancer risk, but that the effects of individual SNPs are likely to be small. Large sample sizes from multicentre collaboration will be needed to identify associated SNPs with certainty.
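
For readers unfamiliar with the per-SNP test named in this abstract, the following is a minimal sketch of a Cochran–Armitage trend test on a 2 × 3 table of genotype counts. It assumes additive weights (0, 1, 2) for the three genotypes and a normal approximation for the p-value; the abstract does not state the weights or any implementation details, and the function name and example counts are hypothetical. No correction for population stratification is included.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Return (z, two-sided p) for a trend in case proportion across genotypes.

    cases, controls: genotype counts in the same genotype order as weights.
    """
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have equal length")
    n_cases, n_controls = sum(cases), sum(controls)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    total = n_cases + n_controls
    cols = [r + s for r, s in zip(cases, controls)]  # genotype totals

    # Trend statistic: T = sum_i t_i * (r_i * S - s_i * R)
    t_stat = sum(t * (r * n_controls - s * n_cases)
                 for t, r, s in zip(weights, cases, controls))

    # Variance of T under the null hypothesis of no association.
    k = len(weights)
    squares = sum(t * t * n * (total - n) for t, n in zip(weights, cols))
    cross = 2 * sum(weights[i] * weights[j] * cols[i] * cols[j]
                    for i in range(k) for j in range(i + 1, k))
    variance = n_cases * n_controls / total * (squares - cross)
    if variance <= 0:
        raise ValueError("variance is zero; the table carries no trend information")

    z = t_stat / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Hypothetical genotype counts (AA, Aa, aa) for one SNP.
z, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(z, p)
```

The trend test has one degree of freedom and is most sensitive to a per-allele effect, whereas the two-degrees-of-freedom heterogeneity test mentioned alongside it makes no assumption about how risk changes across genotypes.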

20.
mRNA localization is a widespread mode of delivering proteins to their site of function. The embryonic axes in Drosophila are determined in the oocyte, through Dynein-dependent transport of gurken/TGF-α mRNA, containing a small localization signal that assigns its destination. A signal with a similar secondary structure, but lacking significant sequence similarity, is present in the I factor retrotransposon mRNA, also transported by Dynein. It is currently unclear whether other mRNAs exist that are localized to the same site using similar signals. Moreover, searches for other genes containing similar elements have not been possible due to a lack of suitable bioinformatics methods for searches of secondary structure elements and the difficulty of experimentally testing all the possible candidates. We have developed a bioinformatics approach for searching across the genome for small RNA elements that are similar to the secondary structures of particular localization signals. We have uncovered 48 candidates, of which we were able to test 22 for their localization potential using injection assays for Dynein mediated RNA localization. We found that G2 and Jockey transposons each contain a gurken/I factor-like RNA stem–loop required for Dynein-dependent localization to the anterior and dorso–anterior corner of the oocyte. We conclude that I factor, G2, and Jockey are members of a “family” of transposable elements sharing a gurken-like mRNA localization signal and Dynein-dependent mechanism of transport. The bioinformatics pipeline we have developed will have broader utility in fields where small RNA signals play important roles.
