Similar articles
 20 similar articles found (search time: 156 ms)
1.
The identification of functional gene modules that are derived from integration of information from different types of networks is a powerful strategy for interpreting the etiology of complex diseases such as rheumatoid arthritis (RA). Genetic variants are known to increase the risk of developing RA. Here, a novel method, the construction of a genetic network, was used to mine functional gene modules linked with RA. A polymorphism interaction analysis (PIA) algorithm was used to obtain cooperating single nucleotide polymorphisms (SNPs) that contribute to RA disease. The acquired SNP pairs were used to construct a SNP-SNP network. Sub-networks defined by hub SNPs were then extracted and turned into gene modules by mapping SNPs to genes using the dbSNP database. We performed Gene Ontology (GO) analysis on each gene module, and some GO terms enriched in the gene modules can be used to investigate clustered gene function for better understanding RA pathogenesis. This method was applied to the Genetic Analysis Workshop 15 (GAW 15) RA dataset. The results show that genes involved in functional gene modules, such as CD160 (rs744877) and RUNX1 (rs2051179), are especially relevant to RA, which is supported by previous reports. Furthermore, the 43 SNPs involved in the identified gene modules were found to be the best classifiers when used as variables for sample classification.
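The network step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the SNP pairs, the SNP-to-gene lookup, and the degree threshold used to call a SNP a "hub" are all hypothetical placeholders, since the abstract does not say how hubs were defined or how the PIA output is formatted.

```python
from collections import defaultdict

# Hypothetical interacting SNP pairs (placeholders, not real PIA output).
snp_pairs = [("snp1", "snp2"), ("snp1", "snp3"), ("snp1", "snp4"),
             ("snp5", "snp6"), ("snp4", "snp5")]

# Hypothetical SNP-to-gene lookup standing in for a dbSNP query.
snp_to_gene = {"snp1": "GENE_A", "snp2": "GENE_B", "snp3": "GENE_B",
               "snp4": "GENE_C", "snp5": "GENE_D", "snp6": "GENE_E"}


def build_network(pairs):
    """Build an undirected SNP-SNP network as an adjacency map."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def gene_modules(adj, mapping, min_degree=3):
    """Turn each hub SNP and its neighbours into a sorted list of genes.

    A hub is assumed here to be any SNP with at least `min_degree`
    neighbours; this threshold is an illustrative assumption.
    """
    modules = {}
    for snp, neighbours in adj.items():
        if len(neighbours) >= min_degree:
            members = {snp} | neighbours
            modules[snp] = sorted({mapping[s] for s in members if s in mapping})
    return modules


network = build_network(snp_pairs)
modules = gene_modules(network, snp_to_gene)
print(modules)
```

Each resulting gene list would then be the input to a GO enrichment test, which is outside the scope of this sketch.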

2.
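As population aging becomes increasingly severe, demand for medical and nursing-care robots will grow substantially, and robots driven by surface electromyography (sEMG) signals will be one focus of development. This paper proposes a multilayer perceptron artificial neural network with Bayesian regularization to extract the motion angle of the human elbow joint. The method addresses the weak generalization of ordinary neural networks on sEMG, a complex sub-Gaussian random signal, and helps move sEMG research toward practical application in the development of medical and nursing-care robots.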

3.
Taxon sampling and the accuracy of phylogenetic analyses   Total citations: 6, self-citations: 2, citations by others: 4
Appropriate and extensive taxon sampling is one of the most important determinants of accurate phylogenetic estimation. In addition, accuracy of inferences about evolutionary processes obtained from phylogenetic analyses is improved significantly by thorough taxon sampling efforts. Many recent efforts to improve phylogenetic estimates have focused instead on increasing sequence length or the number of overall characters in the analysis, and this often does have a beneficial effect on the accuracy of phylogenetic analyses. However, phylogenetic analyses of few taxa (but each represented by many characters) can be subject to strong systematic biases, which in turn produce high measures of repeatability (such as bootstrap proportions) in support of incorrect or misleading phylogenetic results. Thus, it is important for phylogeneticists to consider both the sampling of taxa and the sampling of characters in designing phylogenetic studies. Taxon sampling also improves estimates of evolutionary parameters derived from phylogenetic trees, and is thus important for improved applications of phylogenetic analyses. Analysis of sensitivity to taxon inclusion, the possible effects of long-branch attraction, and sensitivity of parameter estimation for model-based methods should be a part of any careful and thorough phylogenetic analysis. Furthermore, recent improvements in phylogenetic algorithms and in computational power have removed many constraints on analyzing large, thoroughly sampled data sets. Thorough taxon sampling is thus one of the most practical ways to improve the accuracy of phylogenetic estimates, as well as the accuracy of biological inferences that are based on these phylogenetic trees.

4.
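Phylogeny of Chinese catfishes (Siluriformes) inferred from cytochrome b gene sequence variation   Total citations: 18, self-citations: 0, citations by others: 18
Complete 1138-bp cytochrome b gene sequences were obtained by PCR from 27 representative species of Chinese catfishes (Siluriformes), covering 11 families and 24 genera. These were compared with sequences of the same gene from some North American and African catfishes, with Characiformes, Cypriniformes, and Clupeiformes as outgroups, and molecular phylogenetic trees were built by Bayesian inference and maximum parsimony (MP). The results show that: (1) relative to Characiformes, Cypriniformes, and Clupeiformes, the siluriform cytochrome b sequence has a 3-bp deletion; (2) the representative species of the siluriform families form a monophyletic group; (3) both tree-building methods support a monophyletic group formed by Sisoridae, Akysidae, and Amblycipitidae, whereas Clariidae, Schilbeidae, Ariidae, Ictaluridae, Cranoglanididae, 鲢科, Siluridae, 棘脂鲿科, and Bagridae form one large monophyletic group; the two methods disagree on the position of Plotosidae; Cranoglanididae is sister to the North American Ictaluridae; and Clariidae, Ictaluridae, Siluridae, Bagridae, and Sisoridae are each fairly clearly monophyletic.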

5.
The characterization of the interacting behaviors of complex biological systems is a primary objective in protein–protein network analysis and computational biology. In this paper we present FunMod, an innovative Cytoscape version 2.8 plugin that is able to mine undirected protein–protein networks and to infer sub-networks of interacting proteins intimately correlated with relevant biological pathways. This plugin may enable the discovery of new pathways involved in diseases. In order to describe the role of each protein within the relevant biological pathways, FunMod computes and scores three topological features of the identified sub-networks. By integrating the results from biological pathway clustering and topological network analysis, FunMod proved to be useful for the data interpretation and the generation of new hypotheses in two case studies.

6.
Identifying antimicrobial resistant (AMR) bacteria in metagenomics samples is essential for public health and food safety. Next-generation sequencing (NGS) technology has provided a powerful tool for identifying genetic variation and constructing correlations between genotype and phenotype in humans and other species. However, for complex bacterial samples, a powerful bioinformatic tool to identify genetic polymorphisms or copy number variations (CNVs) for given genes is lacking. Here we provide a Bayesian framework for genotype estimation for mixtures of multiple bacteria, named Genetic Polymorphisms Assignments (GPA). Simulation results showed that GPA reduced the false discovery rate (FDR) and mean absolute error (MAE) in CNV and single nucleotide variant (SNV) identification. The framework was validated with whole-genome sequencing and Pool-seq data from Klebsiella pneumoniae using multiple bacteria mixture models, and showed high accuracy in detecting the allele fractions of CNVs and SNVs in AMR genes between two populations. The quantitative study of changes in AMR gene fractions between two samples showed good consistency with the AMR pattern observed in the individual strains. The framework, together with genome annotation and population comparison tools, has also been integrated into an application, which could provide a complete solution for AMR gene identification and quantification in unculturable clinical samples. The GPA package is available at https://github.com/IID-DTH/GPA-package.

7.
Understanding how human cardiomyocytes mature is crucial to realizing stem cell-based heart regeneration, modeling adult heart diseases, and facilitating drug discovery. However, it is not feasible to analyze human samples for maturation due to inaccessibility to samples while cardiomyocytes mature during fetal development and childhood, as well as difficulty in avoiding variations among individuals. Using model animals such as mice can be a useful strategy; nonetheless, it is not well understood whether and to what degree gene expression profiles during maturation are shared between humans and mice. Therefore, we performed a comparative gene expression analysis of mouse and human samples. First, we examined two distinct mouse microarray platforms for shared gene expression profiles, aiming to increase reliability of the analysis. We identified a set of genes displaying progressive changes during maturation based on principal component analysis. Second, we demonstrated that the genes identified had a differential expression pattern between adult and earlier stages (e.g., fetus) common in mice and humans. Our findings provide a foundation for further genetic studies of cardiomyocyte maturation.

8.
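Physiological activities in the cell are regulated and carried out mainly through protein-protein interactions. A detailed map of the protein-protein interaction network is important for understanding the complex regulatory, metabolic, and signaling pathways in the cell. Prediction of new protein-protein interactions has advanced rapidly in recent years. Here, a Bayesian algorithm is combined with associated GO (Gene Ontology) terms to predict protein interactions. Non-redundant protein interaction data are used to examine the properties of GO term pairs and to obtain the probability of GO association. Positive and negative gold-standard control data sets confirm that the new method discriminates well between the two classes, with good sensitivity and a very low false-positive prediction rate. Compared with known high-throughput experimental data, the method offers high sensitivity and speed. It can also provide new information on protein interactions within the cell and a theoretical basis for further experiments.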

9.
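Phylogenetic study of the Oedogoniales based on 28S rDNA sequences   Total citations: 3, self-citations: 3, citations by others: 0
The authors collected samples extensively, isolated and purified cultures to obtain strains of multiple species of Oedogoniales, and used PCR to newly obtain partial 28S rDNA sequences from 8 species in 2 genera of the order. Together with two further sequences from GenBank, the species analyzed cover every genus in Oedogoniales. The same gene sequence was compared across 36 species of Chlorophyceae, including these 10 sequences, with Chlorella ellipsoidea and Fusochloris perforata of the Trebouxiophyceae as outgroups, and molecular phylogenetic trees were built with several methods: neighbor-joining, maximum parsimony, and Bayesian inference. The three methods gave very similar results. Oedogoniales, already sharply delimited morphologically within the green algae, was again shown at the molecular level to be of monophyletic origin. The trees also indicate to some extent that Bulbochaete is comparatively separated among the three genera of the order, whereas Oedocladium and Oedogonium show no clear boundary.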

10.
Co-regulation of genes has been extensively analyzed; however, rather limited knowledge is available on co-regulations within the miRNome. We investigated differential co-expression of microRNAs (miRNAs) based on miRNome profiles of whole blood from 540 individuals. These include patients suffering from different cancer and non-cancer diseases, and unaffected controls. Using hierarchical clustering, we found 9 significant clusters of co-expressed miRNAs containing 2-36 individual miRNAs. Through analyzing multiple sequencing alignments in the clusters, we found that co-expression of miRNAs is associated with both sequence similarity and genomic co-localization. We calculated correlations for all 371,953 pairs of miRNAs for all 540 individuals and identified 184 pairs of miRNAs with high correlation values. Out of these 184 pairs of miRNAs, 16 pairs (8.7%) were differentially co-expressed in unaffected controls, cancer patients and patients with non-cancer diseases. By computing correlated and anti-correlated miRNA pairs, we constructed a network with 184 putative co-regulations as edges and 100 miRNAs as nodes. Thereby, we detected specific clusters of miRNAs with high and low correlation values. Our approach represents the most comprehensive co-regulation analysis based on whole miRNome-wide expression profiling. Our findings further decrypt the interactions of miRNAs in normal and human pathological processes.
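The counts in this abstract can be cross-checked with a short calculation. The sketch below assumes that the 371,953 pairs are all unordered pairs of distinct miRNAs, i.e. n(n-1)/2 for n profiled miRNAs. The abstract does not state n, so the value recovered here is an inference, not a reported figure. The function name is hypothetical.

```python
import math


def n_items_from_pairs(n_pairs):
    """Solve n*(n-1)/2 == n_pairs for a whole number n.

    Returns None when no whole-number solution exists.
    """
    disc = 1 + 8 * n_pairs          # discriminant of n^2 - n - 2*n_pairs = 0
    root = math.isqrt(disc)
    if root * root != disc:
        return None
    return (1 + root) // 2


n_mirnas = n_items_from_pairs(371_953)   # implied number of miRNAs
share = 16 / 184                         # differentially co-expressed pairs
print(n_mirnas, f"{share:.1%}")          # prints: 863 8.7%
```

Under that assumption the pair count corresponds to 863 miRNAs, and 16 of 184 pairs matches the stated 8.7%.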

11.
JJ Wiens, J Tiu. PLoS ONE, 2012, 7(8): e42925

Background

Phylogenies are essential to many areas of biology, but phylogenetic methods may give incorrect estimates under some conditions. A potentially common scenario of this type is when few taxa are sampled and terminal branches for the sampled taxa are relatively long. However, the best solution in such cases (i.e., sampling more taxa versus more characters) has been highly controversial. A widespread assumption in this debate is that added taxa must be complete (no missing data) in order to save analyses from the negative impacts of limited taxon sampling. Here, we evaluate whether incomplete taxa can also rescue analyses under these conditions (empirically testing predictions from an earlier simulation study).

Methodology/Principal Findings

We utilize DNA sequence data from 16 vertebrate species with well-established phylogenetic relationships. In each replicate, we randomly sample 4 species, estimate their phylogeny (using Bayesian, likelihood, and parsimony methods), and then evaluate whether adding in the remaining 12 species (which have 50, 75, or 90% of their data replaced with missing data cells) can improve phylogenetic accuracy relative to analyzing the 4 complete taxa alone. We find that in those cases where sampling few taxa yields an incorrect estimate, adding taxa with 50% or 75% missing data can frequently (>75% of relevant replicates) rescue Bayesian and likelihood analyses, recovering accurate phylogenies for the original 4 taxa. Even taxa with 90% missing data can sometimes be beneficial.

Conclusions

We show that adding taxa that are highly incomplete can improve phylogenetic accuracy in cases where analyses are misled by limited taxon sampling. These surprising empirical results confirm those from simulations, and show that the benefits of adding taxa may be obtained with unexpectedly small amounts of data. These findings have important implications for the debate on sampling taxa versus characters, and for studies attempting to resolve difficult phylogenetic problems.
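The replicate design in the Methodology paragraph (4 complete species drawn at random from 16, with the other 12 added back after part of their data is replaced by missing cells) can be sketched as follows. This is only an illustration of the sampling scheme. The species labels, the alignment length `n_sites`, and the choice to blank cells at random positions are assumptions, and tree estimation itself is not shown.

```python
import math
import random


def make_replicate(species, n_complete=4, missing_fraction=0.5,
                   n_sites=1000, rng=None):
    """Pick the complete taxa and build a missing-data mask for the rest.

    Returns (complete, masks), where masks maps each added species to the
    set of site indices treated as missing. Random placement of missing
    cells is an assumption of this sketch.
    """
    rng = rng or random.Random()
    complete = rng.sample(species, n_complete)
    added = [s for s in species if s not in complete]
    n_missing = round(missing_fraction * n_sites)
    masks = {s: set(rng.sample(range(n_sites), n_missing)) for s in added}
    return complete, masks


# Hypothetical labels standing in for the 16 vertebrate species.
species = [f"sp{i:02d}" for i in range(1, 17)]
complete, masks = make_replicate(species, missing_fraction=0.75,
                                 rng=random.Random(0))

# Number of distinct 4-species subsets available to the random draw.
n_subsets = math.comb(16, 4)
print(n_subsets)  # prints: 1820
```

Running the same replicate at 50%, 75%, and 90% missing data mirrors the three incompleteness levels named in the abstract.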

12.
The problem of missing data is often considered to be the most important obstacle in reconstructing the phylogeny of fossil taxa and in combining data from diverse characters and taxa for phylogenetic analysis. Empirical and theoretical studies show that including highly incomplete taxa can lead to multiple equally parsimonious trees, poorly resolved consensus trees, and decreased phylogenetic accuracy. However, the mechanisms that cause incomplete taxa to be problematic have remained unclear. It has been widely assumed that incomplete taxa are problematic because of the proportion or amount of missing data that they bear. In this study, I use simulations to show that the reduced accuracy associated with including incomplete taxa is caused by these taxa bearing too few complete characters rather than too many missing data cells. This seemingly subtle distinction has a number of important implications. First, the so-called missing data problem for incomplete taxa is, paradoxically, not directly related to their amount or proportion of missing data. Thus, the level of completeness alone should not guide the exclusion of taxa (contrary to common practice), and these results may explain why empirical studies have sometimes found little relationship between the completeness of a taxon and its impact on an analysis. These results also (1) suggest a more effective strategy for dealing with incomplete taxa, (2) call into question a justification of the controversial phylogenetic supertree approach, and (3) show the potential for the accurate phylogenetic placement of highly incomplete taxa, both when combining diverse data sets and when analyzing relationships of fossil taxa.

13.
Missing data are a widely recognized nuisance factor in phylogenetic analyses, and the fear of missing data may deter systematists from including characters that are highly incomplete. In this paper, I used simulations to explore the consequences of including sets of characters that contain missing data. More specifically, I tested whether the benefits of increasing the number of characters outweigh the costs of adding missing data cells to a matrix. The results show that the addition of a set of characters with missing data is generally more likely to increase phylogenetic accuracy than decrease it, but the potential benefits of adding these characters quickly disappear as the proportion of missing data increases. Furthermore, despite the overall trend, adding characters with missing data does decrease accuracy in some cases. In these situations, the missing data entries are not themselves misleading, but their presence may mimic the effects of limited taxon sampling, which can positively mislead. Criteria are discussed for predicting whether adding characters with missing data may increase or decrease accuracy. The results of this study also suggest that accuracy can be increased to a surprising degree by (1) "filling the holes" in a data matrix as much as possible (even when relatively few taxa are missing data), and (2) adding fewer characters scored for all taxa rather than adding a larger number of characters known for fewer taxa. Missing data can also be eliminated from an analysis through the exclusion of incomplete taxa rather than incomplete characters, but this approach may reduce the usefulness of the analysis and (in some cases) the accuracy of the estimated trees.

14.
Missing data are commonly thought to impede a resolved or accurate reconstruction of phylogenetic relationships, and probabilistic analysis techniques are increasingly viewed as less vulnerable to the negative effects of data incompleteness than parsimony analyses. We test both assumptions empirically by conducting parsimony and Bayesian analyses on an approximately 1.5 × 10⁶-cell (27 965 characters × 52 species) mustelid–procyonid molecular supermatrix with 62.7% missing entries. Contrary to the first assumption, phylogenetic relationships inferred from our analyses are fully (Bayesian) or almost fully (parsimony) resolved topologically with mostly strong support and also largely in accord with prior molecular estimations of mustelid and procyonid phylogeny derived with parsimony, Bayesian, and other probabilistic analysis techniques from smaller but complete or nearly complete data sets. Contrary to the second assumption, we found no compelling evidence in support of a relationship between the inferior performance of parsimony and taxon incompleteness (i.e. the proportion of missing character data for a taxon), although we found evidence for a connection between the inferior performance of parsimony and character incompleteness (i.e. no overlap in character data between some taxa). The relatively good performance of our analyses may be related to the large number of sampled characters, so that most taxa (even highly incomplete ones) are represented by a sufficient number of characters allowing both approaches to resolve their relationships. © The Willi Hennig Society 2009.
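The supermatrix dimensions quoted in this abstract can be checked with simple arithmetic. The snippet below only multiplies the stated character and species counts and applies the stated 62.7% missing fraction. The variable names are illustrative, and the missing-cell count is an approximation because the percentage is rounded in the source.

```python
# Figures taken from the abstract.
n_characters = 27_965
n_species = 52
missing_fraction = 0.627

# Total cells in the character-by-species matrix.
total_cells = n_characters * n_species

# Approximate number of missing cells implied by the rounded percentage.
missing_cells = round(total_cells * missing_fraction)

print(total_cells)                      # prints: 1454180
print(f"{total_cells / 1e6:.1f} million cells")
print(f"about {missing_cells / 1e6:.1f} million missing")
```

The product is about 1.45 million cells, consistent with the "approximately 1.5 × 10⁶" stated in the abstract, of which roughly 0.9 million are missing.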

15.
Molecular data offer great potential to resolve the phylogeny of living taxa, but can molecular data improve our understanding of relationships of fossil taxa? Simulations suggest that this is possible, but few empirical examples have demonstrated the ability of molecular data to change the placement of fossil taxa. We offer such an example here. We analyze the placement of snakes among squamate reptiles, combining published morphological data (363 characters) and new DNA sequence data (15,794 characters, 22 nuclear loci) for 45 living and 19 fossil taxa. We find several intriguing results. First, some fossil taxa undergo major changes in their phylogenetic position when molecular data are added. Second, most fossil taxa are placed with strong support in the expected clades by the combined data Bayesian analyses, despite each having >98% missing cells and despite recent suggestions that extensive missing data are problematic for Bayesian phylogenetics. Third, morphological data can change the placement of living taxa in combined analyses, even when there is an overwhelming majority of molecular characters. Finally, we find strong but apparently misleading signal in the morphological data, seemingly associated with a burrowing lifestyle in snakes, amphisbaenians, and dibamids. Overall, our results suggest promise for an integrated and comprehensive Tree of Life by combining molecular and morphological data for living and fossil taxa.

16.
A stable phylogenetic hypothesis for families within jellyfish class Scyphozoa has been elusive. Reasons for the lack of resolution of scyphozoan familial relationships include a dearth of morphological characters that reliably distinguish taxa and incomplete taxonomic sampling in molecular studies. Here, we address the latter issue by using maximum likelihood and Bayesian methods to reconstruct the phylogenetic relationships among all 19 currently valid scyphozoan families, using sequence data from two nuclear genes: 18S and 28S rDNA. Consistent with prior morphological hypotheses, we find strong evidence for monophyly of subclass Discomedusae, order Coronatae, rhizostome suborder Kolpophorae and superfamilies Actinomyariae, Kampylomyariae, Krikomyariae, and Scapulatae. Eleven of the 19 currently recognized scyphozoan families are robustly monophyletic, and we suggest recognition of two new families pending further analyses. In contrast to long-standing morphological hypotheses, the phylogeny shows coronate family Nausithoidae, semaeostome family Cyaneidae, and rhizostome suborder Daktyliophorae to be nonmonophyletic. Our analyses neither strongly support nor strongly refute monophyly of order Rhizostomeae, superfamily Inscapulatae, and families Ulmaridae, Catostylidae, Lychnorhizidae, and Rhizostomatidae. These taxa, as well as familial relationships within Coronatae and within rhizostome superfamily Inscapulatae, remain unclear and may be resolved by additional genomic and taxonomic sampling. In addition to clarifying some historically difficult taxonomic questions and highlighting nodes in particular need of further attention, the molecular phylogeny presented here will facilitate more robust study of phenotypic evolution in the Scyphozoa, including the evolution of characters associated with mass occurrences of jellyfish.

17.
Many phylogenetic analyses that include numerous terminals but few genes show high resolution and branch support for relatively recently diverged clades, but lack of resolution and/or support for "basal" clades of the tree. The various benefits of increased taxon and character sampling have been widely discussed in the literature, albeit primarily based on simulations rather than empirical data. In this study, we used a well-sampled gene-tree analysis (based on 100 mitochondrial genomes of higher teleost fishes) to test empirically the efficiency of different methods of data sampling and phylogenetic inference to "correctly" resolve the basal clades of a tree (based on congruence with the reference tree constructed using all 100 taxa and 7990 characters). By itself, increased character sampling was an inefficient method by which to decrease the likelihood of "incorrect" resolution (i.e., incongruence with the reference tree) for parsimony analyses. Although increased taxon sampling was a powerful approach to alleviate "incorrect" resolution for parsimony analyses, it had the general effect of increasing the number of, and support for, "incorrectly" resolved clades in the Bayesian analyses. For both the parsimony and Bayesian analyses, increased taxon sampling, by itself, was insufficient to help resolve the basal clades, making this sampling strategy ineffective for that purpose. For this empirical study, the most efficient of the six approaches considered to resolve the basal clades when adding nucleotides to a dataset that consists of a single gene sampled for a small, but representative, number of taxa, is to increase character sampling and analyze the characters using the Bayesian method.

18.
The annual sunflowers (Helianthus sect. Helianthus) present a formidable challenge for phylogenetic inference because of ancient hybrid speciation, recent introgression, and suspected issues with deep coalescence. Here we analyze sequence data from 11 nuclear DNA (nDNA) genes for multiple genotypes of species within the section to (1) reconstruct the phylogeny of this group, (2) explore the utility of nDNA gene trees for detecting hybrid speciation and introgression, and (3) test an empirical method of hybrid identification based on the phylogenetic congruence of nDNA gene trees from tightly linked genes. We uncovered considerable topological heterogeneity among gene trees with or without three previously identified hybrid species included in the analyses, as well as a general lack of reciprocal monophyly of species. Nonetheless, partitioned Bayesian analyses provided strong support for the reciprocal monophyly of all species except H. annuus (0.89 PP), the most widespread and abundant annual sunflower. Previous hypotheses of relationships among taxa were generally strongly supported (1.0 PP), except among taxa typically associated with H. annuus, apparently due to the paraphyly of the latter in all gene trees. While the individual nDNA gene trees provided a useful means for detecting recent hybridization, identification of ancient hybridization was problematic for all ancient hybrid species, even when linkage was considered. We discuss biological factors that affect the efficacy of phylogenetic methods for hybrid identification.

19.
Taxon sampling may be critically important for phylogenetic accuracy because adding taxa can help to subdivide misleading long branches. Although the idea that added taxa can break up long branches was exemplified by a study of "incomplete" fossil taxa, the issue of taxon completeness (i.e., proportion of missing data) has been largely ignored in most subsequent discussions of taxon sampling and long-branch attraction. In this article, I use simulations to test the ability of incomplete taxa to subdivide long branches and improve phylogenetic accuracy in situations of potential long-branch attraction. The results show that for most methods and conditions examined, adding taxa that are only 50% complete may provide similar benefits to adding the same number of complete taxa (suggesting that the advantages of increased taxon sampling may be obtained with less data than previously considered). For parsimony, taxa that are less complete (5% to 25% complete) may often have limited ability to rescue analyses from long-branch attraction. In contrast, highly incomplete taxa can be surprisingly beneficial when using model-based methods. The results also suggest the importance of model-based methods in phylogenetic analyses that combine molecular and fossil data.

20.
Taxa missing large amounts of data pose challenges that may hinder the recovery of a well‐resolved, accurate phylogeny and leave questions surrounding their phylogenetic position. Systematists commonly have to contend with one or two species in a group for which there is little or no material available suitable for recovering molecular data. It is unclear whether these taxa can be better placed using analyses based on morphological data only, or should be included in broader analyses based on both morphological and molecular data. The extinct madtom catfish Noturus trautmani is known from few specimens for which molecular data are unavailable. We included this taxon in parsimony and Bayesian analyses of relationships of madtom catfishes based on a combination of morphological and molecular data. Results indicate that using a combination of morphological and molecular data does a better job at providing a phylogenetic placement for N. trautmani than morphology alone, even though it is missing all of its molecular characters. We provide a novel hypothesis of relationships among Noturus species and recommendations for classification within the group. © 2009 The Linnean Society of London, Zoological Journal of the Linnean Society, 2009, 155, 60–75.
