Similar literature
 Found 20 similar articles (search time: 242 ms)
1.
2.
3.
4.
As large-scale re-sequencing of genomes reveals many protein mutations, especially in human cancer tissues, prediction of their likely functional impact becomes an important practical goal. Here, we introduce a new functional impact score (FIS) for amino acid residue changes using evolutionary conservation patterns. The information in these patterns is derived from aligned families and sub-families of sequence homologs within and between species using a combinatorial entropy formalism. The score performs well on a large set of human protein mutations in separating disease-associated variants (∼19 200), assumed to be strongly functional, from common polymorphisms (∼35 600), assumed to be weakly functional (area under the receiver operating characteristic curve of ∼0.86). In cancer, using recurrence, multiplicity and annotation for ∼10 000 mutations in the COSMIC database, the method does well in assigning higher scores to more likely functional mutations (‘drivers’). To guide experimental prioritization, we report a list of about 1000 top human cancer genes frequently mutated in one or more cancer types, ranked by likely functional impact, and an additional 1000 candidate cancer genes with rare but likely functional mutations. In addition, we estimate that at least 5% of cancer-relevant mutations involve a switch of function, rather than simply loss or gain of function.
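
The area under the ROC curve quoted above can be read as the probability that a randomly chosen disease-associated variant scores higher than a randomly chosen polymorphism. The sketch below computes that quantity directly from two lists of scores. The function name and the toy scores are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive scores higher, counting ties as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical impact scores, for illustration only.
disease_scores = [3.1, 2.4, 1.9, 0.7]
polymorphism_scores = [0.2, 1.1, 0.5, 2.0]
auc = roc_auc(disease_scores, polymorphism_scores)
```

The pairwise loop is quadratic in the number of variants; for sets of the size reported in the abstract, a sort-based rank-sum computation would be the more practical choice.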

5.
Pathway analysis of genome-wide association studies (GWAS) offers a unique opportunity to collectively evaluate genetic variants with effects that are too small to be detected individually. We applied a pathway analysis to a bladder cancer GWAS containing data from 3,532 cases and 5,120 controls of European background (n = 5 studies). A total of 1399 pathways were drawn from five publicly available resources (Biocarta, Kegg, NCI-PID, HumanCyc, and Reactome), and we constructed 22 additional candidate pathways previously hypothesized to be related to bladder cancer. In total, 1421 pathways, 5647 genes and ∼90,000 SNPs were included in our study. A logistic regression model adjusting for age, sex, study, DNA source, and smoking status was used to assess the marginal trend effect of SNPs on bladder cancer risk. Two complementary pathway-based methods (gene-set enrichment analysis [GSEA], and adapted rank-truncated product [ARTP]) were used to assess the enrichment of association signals within each pathway. Eighteen pathways were detected by either GSEA or ARTP at P ≤ 0.01. To minimize false positives, we used the I² statistic to identify SNPs displaying heterogeneous effects across the five studies. After removing these SNPs, seven pathways (‘Aromatic amine metabolism’ [P_GSEA = 0.0100, P_ARTP = 0.0020], ‘NAD biosynthesis’ [P_GSEA = 0.0018, P_ARTP = 0.0086], ‘NAD salvage’ [P_ARTP = 0.0068], ‘Clathrin derived vesicle budding’ [P_ARTP = 0.0018], ‘Lysosome vesicle biogenesis’ [P_GSEA = 0.0023, P_ARTP < 0.00012], ‘Retrograde neurotrophin signaling’ [P_GSEA = 0.00840], and ‘Mitotic metaphase/anaphase transition’ [P_GSEA = 0.0040]) remained. These pathways seem to belong to three fundamental cellular processes (metabolic detoxification, mitosis, and clathrin-mediated vesicles). Identification of the aromatic amine metabolism pathway provides support for the ability of this approach to identify pathways with established relevance to bladder carcinogenesis.
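
The I² statistic used above to flag heterogeneous SNPs is conventionally derived from Cochran's Q, computed from per-study effect estimates and their standard errors. The sketch below follows that standard definition. The function name, inputs and example values are assumptions made for illustration, and the abstract does not state which I² cut-off the authors applied.

```python
def i_squared(effects, std_errors):
    """I² (in percent) from per-study effect estimates and standard errors.

    Uses inverse-variance weights, Cochran's Q, and
    I² = max(0, (Q - df) / Q) * 100 with df = number of studies - 1.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical log odds ratios for one SNP across five studies.
log_odds = [0.10, 0.12, 0.08, 0.11, 0.09]
standard_errors = [0.05, 0.06, 0.05, 0.07, 0.06]
heterogeneity = i_squared(log_odds, standard_errors)
```

Here the five estimates agree closely, so Q falls below its degrees of freedom and I² is reported as zero.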

6.
7.

Background

Knowledge about how frugivory and seed deposition are spatially distributed is valuable to understand the role of dispersers on the structure and dynamics of plant populations. This may be particularly important within anthropogenic areas, where either the patchy distribution of wild plants or the presence of cultivated fleshy-fruits may influence plant-disperser interactions.

Methodology/Principal Findings

We investigated frugivory and spatial patterns of seed deposition by carnivorous mammals in anthropogenic landscapes considering two spatial scales: ‘landscape’ (∼10 km²) and ‘habitat type’ (∼1–2 km²). We sampled carnivore faeces and plant abundance at three contrasting habitats (chestnut woods, mosaics and scrublands), each replicated within three different landscapes. Sixty-five percent of faeces collected (n = 1077) contained seeds, among which wild and cultivated seeds appeared in similar proportions (58% and 53%) even though cultivated fruiting plants were much less abundant. Seed deposition was spatially structured at both spatial scales, and the structure differed between fruit types. Whereas the most important source of spatial variation in deposition of wild seeds was the landscape scale, it was the habitat scale for cultivated seeds. At the habitat scale, seeds of wild species were mostly deposited within mosaics, while seeds of cultivated species were mostly deposited within chestnut woods and scrublands. Spatial concordance between seed deposition and plant abundance was found only for wild species.

Conclusions/Significance

Spatial patterns of seed deposition by carnivores differed between fruit types and seemed to be modulated by the fleshy-fruited plant assemblages and the behaviour of dispersers. Our results suggest that a strong preference for cultivated fruits by carnivores may influence their spatial foraging behaviour and lower their dispersal services to wild species. However, the large number of seeds removed within and between habitats suggests that carnivores must play an important (and often overlooked) role as ‘restorers’ and ‘habitat shapers’ in anthropogenic areas.

8.
9.
An important step in ‘metagenomics’ analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of a single-genome assembler for de novo metagenome assembly is that sequences of highly abundant species are likely to be misidentified as repeats in a single genome, resulting in a number of small fragmented scaffolds. We extended a single-genome assembler for short reads, known as ‘Velvet’, to metagenome assembly of mixed short reads from multiple species; we call the result ‘MetaVelvet’. Our fundamental concept was first to decompose a de Bruijn graph constructed from mixed short reads into individual sub-graphs, and second to build scaffolds from each decomposed de Bruijn sub-graph as if it were an isolate species genome. We made use of two features, the coverage (abundance) difference and graph connectivity, for the decomposition of the de Bruijn graph. For simulated datasets, MetaVelvet succeeded in generating significantly higher N50 scores than any single-genome assembler. MetaVelvet also reconstructed relatively low-coverage genome sequences as scaffolds. On real datasets of human gut microbial read data, MetaVelvet produced longer scaffolds and increased the number of predicted genes.
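
The N50 score used above to compare assemblers is the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the total assembly length. The sketch below implements that common definition. The function name and the example lengths are assumptions for illustration and are unrelated to the MetaVelvet results.

```python
def n50(scaffold_lengths):
    """Return the N50 of a list of scaffold lengths (0 for an empty list).

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    total = sum(scaffold_lengths)
    running = 0
    for length in sorted(scaffold_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Hypothetical scaffold lengths in base pairs.
fragmented = [2000, 3000, 4000, 5000, 6000]
contiguous = [12000, 5000, 3000]
```

A more fragmented assembly of the same total length yields a lower N50, which is why the measure is often used as a rough proxy for contiguity.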

10.
The elucidation of principles governing the evolution of gene regulatory sequences is critical to the study of metazoan diversification. We are therefore exploring the structure and organizational constraints of regulatory sequences by studying functionally equivalent cis-regulatory modules (CRMs) that have been evolving in parallel across several loci. Such an independent dataset allows a multi-locus study that is not hampered by nonfunctional or constrained homology. The neurogenic ectoderm enhancers (NEEs) of Drosophila melanogaster are one such class of coordinately regulated CRMs. The NEEs share a common organization of binding sites and, as a set, are useful for studying the relationship between CRM organization and CRM activity across evolving lineages. We used the D. melanogaster transgenic system to screen for functional adaptations in the NEEs from divergent drosophilid species. We show that the individual NEE modules across a genome in any one lineage have independently evolved adaptations to compensate for lineage-specific developmental and/or genomic changes. Specifically, we show that both the site composition and the site organization of NEEs have been finely tuned by distinct, lineage-specific selection pressures in each of the three divergent species that we have examined: D. melanogaster, D. pseudoobscura, and D. virilis. Furthermore, by precisely altering the organization of NEEs with different morphogen gradient threshold readouts, we show that CRM organizational evolution is sufficient to explain changes in enhancer activity. Thus, evolution can act on CRM organization to fine-tune morphogen gradient threshold readouts over a wide dynamic range. Our study demonstrates that equivalence classes of CRMs are powerful tools for detecting lineage-specific adaptations by gene regulatory sequences.

11.

Background and Aims

The production of triploid banana and plantain (Musa spp.) cultivars with improved characteristics (e.g. greater disease resistance or higher yield), while still preserving the main features of current popular cultivars (e.g. taste and cooking quality), remains a major challenge for Musa breeders. In this regard, breeders require a sound knowledge of the lineage of the current sterile triploid cultivars, to select diploid parents that are able to transmit desirable traits, together with a breeding strategy ensuring final triploidization and sterility. Highly polymorphic simple sequence repeats (SSRs) are valuable markers for investigating phylogenetic relationships.

Methods

Here, the allelic distribution of each of 22 SSR loci across 561 Musa accessions is analysed.

Key Results and Conclusions

We determine the closest diploid progenitors of the triploid ‘Cavendish’ and ‘Gros Michel’ subgroups, which is valuable information for breeding programmes. Moreover, in establishing the likely monoclonal origin of the main edible triploid banana subgroups (i.e. ‘Cavendish’, ‘Plantain’ and ‘Mutika-Lujugira’), we postulate that the huge phenotypic diversity observed within these subgroups did not result from gamete recombination, but rather from epigenetic regulation. This emphasizes the need to investigate the regulatory mechanisms of genome expression in a model that is unique in the plant kingdom. We also propose experimental standards so that additional, independent genotyping data can be compared against this reference.

12.

Background

The feasibility of genotyping hundreds of thousands of single nucleotide polymorphisms (SNPs) in thousands of study subjects has triggered the need for fast, powerful, and reliable methods for genome-wide association analysis. Here we consider a situation in which study participants are genetically related (e.g. due to systematic sampling of families or because a study was performed in a genetically isolated population). Of the available methods that account for relatedness, the Measured Genotype (MG) approach is considered the ‘gold standard’. However, MG is not efficient with respect to the time taken to analyse genome-wide data. In this context we proposed a fast two-step method called Genome-wide Association using Mixed Model and Regression (GRAMMAR) for the analysis of pedigree-based quantitative traits. This method overcomes the time limitation of the MG approach, but at a cost in power. One of the major drawbacks of both MG and GRAMMAR is that they depend crucially on complete and correct pedigree data, which are rarely available.

Methodology

In this study we first explore the type 1 error and relative power of the MG, GRAMMAR, and Genomic Control (GC) approaches for genetic association analysis. Secondly, we propose an extension to GRAMMAR, namely GRAMMAR-GC. Finally, we propose applying GRAMMAR-GC with the kinship matrix estimated from genomic marker data, instead of from (possibly missing and/or incorrect) genealogy.

Conclusion

Through simulations we show that the MG approach maintains high power across a range of heritabilities and possible pedigree structures, and always outperforms other contemporary methods. We also show that the power of our proposed GRAMMAR-GC approaches that of the ‘gold standard’ MG for all models and pedigrees studied. We show that this method is both feasible and powerful and has correct type 1 error in the context of genome-wide association analysis in related individuals.
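
The genomic control step referred to above is usually described as rescaling association test statistics by an inflation (or deflation) factor λ, estimated as the median observed 1-df chi-square statistic divided by the median of the chi-square distribution with one degree of freedom (about 0.4549). The sketch below shows that generic correction only. It is not the authors' GRAMMAR-GC implementation, and the function name, the optional flooring of λ at 1, and the example values are all assumptions.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats, floor_at_one=False):
    """Return (lambda, corrected statistics) for 1-df chi-square tests.

    lambda is the observed median statistic divided by the expected median
    under the null. With floor_at_one=True, lambda values below 1 are
    raised to 1 so that statistics are never inflated by the correction.
    """
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    divisor = max(lam, 1.0) if floor_at_one else lam
    if divisor <= 0.0:
        raise ValueError("lambda must be positive to rescale statistics")
    return lam, [stat / divisor for stat in chi2_stats]


# Hypothetical test statistics from a deflated (conservative) scan.
observed = [0.10, 0.20, 0.30, 0.25, 3.0]
lam, corrected = genomic_control(observed)
```

When the first-stage test is conservative, λ falls below 1 and dividing by it scales the statistics up; whether that is desirable depends on the method, which is why the floor is left as an explicit option here.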

13.

Background

Habitat loss and overexploitation are among the primary factors threatening populations of many mammal species. Recently, aquatic mammals have been highlighted as particularly vulnerable. Here we test (1) if aquatic mammals emerge as more phylogenetically urgent conservation priorities than their terrestrial relatives, and (2) if high priority species are receiving sufficient conservation effort. We also compare results among some phylogenetic conservation methods.

Methodology/Principal Findings

We present a phylogenetic analysis of conservation priorities for all 620 species of Cetartiodactyla and Carnivora, including most aquatic mammals. The conservation priority ranking of aquatic versus terrestrial species is approximately proportional to their diversity. However, nearly all obligate freshwater cetartiodactylans are among the top conservation priority species. Further, ∼74% and 40% of fully aquatic cetartiodactylans and carnivores, respectively, are either threatened or data deficient, more so than their terrestrial relatives. Strikingly, only 3% of all ‘high priority’ species are thought to be stable. An overwhelming 97% of these species thus either show decreasing population trends (87%) or are insufficiently known (10%). Furthermore, a disproportionate number of highly evolutionarily distinct species are experiencing population decline; such species should therefore be closely monitored even if not currently threatened. Comparison among methods reveals that exact species ranking differs considerably among methods; nevertheless, most top priority species consistently rank high under any method. While we here favor one approach, we also suggest that a consensus approach may be useful when methods disagree.

Conclusions/Significance

These results reinforce prior findings, suggesting that there is an urgent need to gather basic conservation data for aquatic mammals and that a special conservation focus is needed on those confined to freshwater. That evolutionarily distinct (and thus ‘biodiverse’) species are faring relatively poorly is alarming and requires further study. Our results offer a detailed guide to phylogeny-based conservation prioritization for these two orders.

14.
Parallel analysis of RNA ends (PARE) is a technique utilizing high-throughput sequencing to profile uncapped mRNA cleavage or decay products on a genome-wide basis. Tools currently available to validate miRNA targets using PARE data employ only annotated genes, whereas important targets may be found in unannotated genomic regions. To handle such cases and to scale to the growing availability of PARE data and genomes, we developed a new tool, ‘sPARTA’ (small RNA-PARE target analyzer), that utilizes a built-in, plant-focused target prediction module (aka ‘miRferno’). sPARTA not only exhibits an unprecedented gain in speed but also shows greater predictive power by validating more targets than a popular alternative. In addition, the novel ‘seed-free’ mode, optimized to find targets irrespective of complementarity in the seed region, identifies novel intergenic targets. To fully capitalize on the novelty and strengths of sPARTA, we developed a web resource, ‘comPARE’, for plant miRNA target analysis; this facilitates the systematic identification and analysis of miRNA-target interactions across multiple species, integrated with visualization tools. This collation of high-throughput small RNA and PARE datasets from different genomes further facilitates re-evaluation of existing miRNA annotations, resulting in a ‘cleaner’ set of microRNAs.

15.
In this paper, I review the relevance of the niche to biogeography, and what biogeography may tell us about the niche. The niche is defined as the combination of abiotic and biotic conditions where a species can persist. I argue that most biogeographic patterns are created by niche differences over space, and that even ‘geographic barriers’ must have an ecological basis. However, we know little about specific ecological factors underlying most biogeographic patterns. Some evidence supports the importance of abiotic factors, whereas few examples exist of large-scale patterns created by biotic interactions. I also show how incorporating biogeography may offer new perspectives on resource-related niches and species interactions. Several examples demonstrate that even after a major evolutionary radiation within a region, the region can still be invaded by ecologically similar species from another clade, countering the long-standing idea that communities and regions are generally ‘saturated’ with species. I also describe the somewhat paradoxical situation where competition seems to limit trait evolution in a group, but does not prevent co-occurrence of species with similar values for that trait (called here the ‘competition–divergence–co-occurrence conundrum’). In general, the interface of biogeography and ecology could be a major area for research in both fields.

16.
Jin G, Zhang S, Zhang XS, Chen L. PLoS ONE 2007, 2(11): e1207

Background

It has been recognized that modular organization pervades biological complexity. Based on network analysis, ‘party hubs’ and ‘date hubs’ were proposed to explain the basic principle of modular organization in biomolecular networks. However, a recent study on hubs has suggested that there is no clear evidence for the coexistence of ‘party hubs’ and ‘date hubs’. Thus, an open question has been raised as to whether or not ‘party hubs’ and ‘date hubs’ truly exist in the yeast interactome.

Methodology

In contrast to previous studies focusing on the partners of a hub or the individual proteins around the hub, our work studies the network motifs of a hub, that is, the interactions among individual proteins including the hub and its neighbors. Depending on the relationship between a hub's network motifs and protein complexes, we define two new types of hubs, ‘motif party hubs’ and ‘motif date hubs’, which have the same characteristics as the original ‘party hubs’ and ‘date hubs’ respectively. The network motifs of these two types of hubs display significantly different features in spatial distribution (or cellular localization), co-expression in microarray data, control of the topological structure of the network, and organization of modularity.

Conclusion

By virtue of network motifs, we have largely resolved the open question about ‘party hubs’ and ‘date hubs’ raised by previous studies. Specifically, at the level of network motifs instead of individual proteins, we found two types of hubs, motif party hubs (mPHs) and motif date hubs (mDHs), whose network motifs display distinct characteristics in their biological functions. In addition, in this paper we studied network motifs from a different viewpoint: we show that a network motif should not be considered merely as an interaction pattern but as an essential functional unit in organizing the modules of networks.

17.

Background

Developmental instability of shelled gastropods is measured as deviations from a perfect equiangular (logarithmic) spiral. We studied six species of gastropods at ‘Evolution Canyons I and II’ in Carmel and the Galilee Mountains, Israel, respectively. The xeric, south-facing, ‘African’ slopes and the mesic, north-facing, ‘European’ slopes have dramatically different microclimates and plant communities. Moreover, ‘Evolution Canyon II’ receives more rainfall than ‘Evolution Canyon I.’

Methodology/Principal Findings

We examined fluctuating asymmetry, rate of whorl expansion, shell height, and number of rotations of the body suture in six species of terrestrial snails from the two ‘Evolution Canyons.’ The xeric ‘African’ slope should be more stressful to land snails than the ‘European’ slope, and ‘Evolution Canyon I’ should be more stressful than ‘Evolution Canyon II.’ Only Eopolita protensa jebusitica showed marginally significant differences in fluctuating helical asymmetry between the two slopes. Contrary to expectations, asymmetry was marginally greater on the ‘European’ slope. Shells of Levantina spiriplana caesareana at ‘Evolution Canyon I’ were smaller and more asymmetric than those at ‘Evolution Canyon II.’ Moreover, shell height and number of rotations of the suture were greater on the north-facing slopes of both canyons.

Conclusions/Significance

Our data are consistent with a trade-off between drought resistance and thermoregulation in snails; Levantina was significantly smaller on the ‘African’ slope, for increasing surface area and thermoregulation, while Eopolita was larger on the ‘African’ slope, for reducing water evaporation. In addition, ‘Evolution Canyon I’ was more stressful than ‘Evolution Canyon II’ for Levantina.

18.
19.

Background

Deep sequencing has enabled the identification of large numbers of miRNAs and siRNAs, making high-throughput target identification a main limiting factor in defining their function. In plants, several tools have been developed to predict targets, the majority of them trained on Arabidopsis datasets. No extensive and systematic evaluation has been made of their suitability for predicting targets in species other than Arabidopsis, nor have they been evaluated for their suitability for high-throughput target prediction at the genome level.

Results

We evaluated the performance of 11 computational tools in identifying genome-wide targets in Arabidopsis and other plants with procedures that optimized score cutoffs for estimating targets. Targetfinder was the most efficient [89% ‘precision’ (accuracy of prediction), 97% ‘recall’ (sensitivity)] in predicting ‘true-positive’ targets in Arabidopsis miRNA-mRNA interactions. In contrast, only 46% of true positive interactions from non-Arabidopsis species were detected, indicating low ‘recall’ values. Score optimizations increased the ‘recall’ to only 70% (corresponding ‘precision’: 65%) for datasets of true miRNA-mRNA interactions in species other than Arabidopsis. Combining the results of Targetfinder and psRNATarget delivers high true positive coverage, whereas the intersection of psRNATarget and Tapirhybrid outputs delivers highly ‘precise’ predictions. The large number of ‘false negative’ predictions delivered from non-Arabidopsis datasets by all the available tools indicates the diversity in miRNA-mRNA interaction features between Arabidopsis and other species. A subset of miRNA-mRNA interactions differed significantly in seed-region features as well as in the total number of matches/mismatches.

Conclusion

Although many plant miRNA target prediction tools may be optimized to predict targets with high specificity in Arabidopsis, such optimized thresholds may not be suitable for many targets in non-Arabidopsis species. More importantly, non-conventional features of miRNA-mRNA interaction may exist in plants, indicating an alternate mode of miRNA target recognition. Incorporating these divergent features would enable the next generation of algorithms to better identify target interactions.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-348) contains supplementary material, which is available to authorized users.
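
The ‘precision’ and ‘recall’ figures reported in this abstract can be reproduced for any tool by comparing its predicted miRNA-target pairs against a set of validated pairs. The sketch below shows that calculation, together with the union and intersection of two tools' outputs mentioned in the Results. The function name and all pair identifiers are hypothetical and only serve as an illustration.

```python
def precision_recall(predicted_pairs, validated_pairs):
    """Return (precision, recall) for predicted versus validated pairs.

    precision = true positives / all predicted pairs
    recall    = true positives / all validated pairs
    """
    predicted = set(predicted_pairs)
    validated = set(validated_pairs)
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Hypothetical (miRNA, target) pairs, for illustration only.
validated = {("miR-a", "gene1"), ("miR-a", "gene2"), ("miR-b", "gene3")}
tool_one = {("miR-a", "gene1"), ("miR-a", "gene2"), ("miR-c", "gene9")}
tool_two = {("miR-a", "gene1"), ("miR-b", "gene3"), ("miR-d", "gene7")}

union_scores = precision_recall(tool_one | tool_two, validated)
intersection_scores = precision_recall(tool_one & tool_two, validated)
```

In this toy example the union recovers every validated pair at lower precision, while the intersection keeps only the pair both tools agree on, which mirrors the coverage-versus-precision trade-off described in the Results.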

20.
DNA methylation plays a central role in genomic regulation and disease. Sodium bisulfite treatment (SBT) causes unmethylated cytosines to be sequenced as thymine, which allows methylation levels to be reflected in the number of ‘C’-‘C’ alignments covering reference cytosines. Di-base color reads produced by Life Technologies’ SOLiD sequencer provide unreliable results when translated to bases because single sequencing errors affect the downstream sequence. We describe FadE, an algorithm to accurately determine genome-wide methylation rates directly in color or nucleotide space. FadE uses SBT unmethylated and untreated data to determine background error rates and incorporates them into a model that uses Newton–Raphson optimization to estimate the methylation rate and provide a credible interval describing its distribution at every reference cytosine. We sequenced two slides of a human fibroblast cell-line bisulfite-converted fragment library with the SOLiD sequencer to investigate genome-wide methylation levels. FadE reported widespread differences in methylation levels across CpG islands and a large number of differentially methylated regions adjacent to genes. These results compare favorably with those of an investigation of the same cell line using nucleotide-space reads at higher coverage, suggesting that FadE is an accurate method to estimate genome-wide methylation with color or nucleotide reads. http://code.google.com/p/fade/.
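
To make the idea of Newton–Raphson estimation with background error rates concrete, the sketch below fits a deliberately simplified binomial model at a single reference cytosine. It assumes that a read shows ‘C’ with probability e1 + m(1 - e1 - e2), where m is the methylation rate, e1 is the rate at which unmethylated cytosines still read as ‘C’, and e2 is the rate at which methylated cytosines are misread. This model, the function name and the parameter values are assumptions for illustration; this is not FadE's actual model, and it does not compute a credible interval.

```python
def estimate_methylation(c_reads, total_reads, nonconversion=0.01,
                         miscall=0.01, tol=1e-10, max_iter=50):
    """Maximum-likelihood methylation rate at one cytosine, found by
    Newton-Raphson on a binomial log-likelihood and clamped to [0, 1]."""
    if total_reads <= 0 or not 0 <= c_reads <= total_reads:
        raise ValueError("need 0 <= c_reads <= total_reads and total_reads > 0")
    span = 1.0 - nonconversion - miscall
    if span <= 0.0:
        raise ValueError("error rates must sum to less than 1")
    eps = 1e-12
    other_reads = total_reads - c_reads
    m = 0.5  # starting guess
    for _ in range(max_iter):
        # Probability that a read shows 'C', kept strictly inside (0, 1).
        p = min(max(nonconversion + m * span, eps), 1.0 - eps)
        gradient = span * (c_reads / p - other_reads / (1.0 - p))
        hessian = -span ** 2 * (c_reads / p ** 2 + other_reads / (1.0 - p) ** 2)
        m_new = min(max(m - gradient / hessian, 0.0), 1.0)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


# Hypothetical counts: 30 of 40 reads show 'C' at this position.
rate = estimate_methylation(30, 40, nonconversion=0.02, miscall=0.01)
```

For this simple model the estimate also has a closed form, (c/n - e1) / (1 - e1 - e2) clamped to [0, 1], which makes a convenient check on the iteration; an iterative solver becomes more useful once the likelihood no longer has such a closed-form maximum.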
