Similar Literature
 20 similar documents found (search time: 0 ms)
1.

Background

Phylogenetic trees have become increasingly essential across biology disciplines. Consequently, learning about phylogenetic trees has become an important component of biology education and an area of interest for biology education research. Construction tasks, in which students generate phylogenetic trees from some type of data, are often used for instruction. However, the impact of these exercises on student learning is uncertain, in part due to our fragmented knowledge of what students construct during the tasks. The goal of this project was to develop a more robust method for describing student-generated phylogenetic trees, which will support future investigations that attempt to link construction tasks with student learning.

Results

Through iterative examination of data from an introductory biology course, we developed a method for describing student-generated phylogenetic trees in terms of style, conventionality, and accuracy. Students used the diagonal style more often than the bracket style for construction tasks. The majority of phylogenetic trees were constructed conventionally, and variable orientation of branches was the most common unconventional feature. In addition, the majority of phylogenetic trees were generated correctly (no errors) or adequately (minor errors only) in terms of accuracy. Suggesting extant taxa are descended from other extant taxa was the most common major error, while empty branches and extra nodes were very common minor errors.

Conclusions

The method we developed to describe student-constructed phylogenetic trees uncovered several trends that warrant further investigation. For example, while diagonal and bracket phylogenetic trees contain equivalent information, student preference for using the diagonal style could impact comprehension. In addition, despite a lack of explicit instruction, students generated phylogenetic trees that were largely conventional and accurate. Surprisingly, accuracy and conventionality were also dependent on each other. Our method for describing phylogenetic trees constructed by students is based on data from one introductory biology course at one institution, and the results are likely limited. We encourage researchers to use our method as a baseline for developing a more generalizable tool, which will support future investigations that attempt to link construction tasks with student learning.

2.
3.
The amino acid sequences of 47 P-type ATPases from several eukaryotic and bacterial kingdoms were divided into three structural segments based on individual hydropathy profiles. Each homologous segment was (1) multiply aligned and functionally evaluated, (2) statistically analyzed to determine the degrees of sequence similarity, and (3) used for the construction of parsimonious phylogenetic trees. The results show that all of the P-type ATPases analyzed comprise a single family with four major clusters correlating with their cation specificities and biological sources as follows: cluster 1: Ca2+-transporting ATPases; cluster 2: Na+- and gastric H+-ATPases; cluster 3: plasma membrane H+-translocating ATPases of plants, fungi, and lower eukaryotes; and cluster 4: all but one of the bacterial P-type ATPases (specific for K+, Cd2+, Cu2+ and an unknown cation). The one bacterial exception to this general pattern was the Mg2+-ATPase of Salmonella typhimurium, which clustered with the eukaryotic sequences. Although exceptions were noted, the similarities of the phylogenetic trees derived from the three segments analyzed led to the probability that the N-terminal segments 1 and the centrally localized segments 2 evolved from a single primordial ATPase which existed prior to the divergence of eukaryotes from prokaryotes. By contrast, the C-terminal segments 3 appear to be eukaryotic specific, are not found in similar form in any of the prokaryotic enzymes, and are not all demonstrably homologous among the eukaryotic enzymes. These C-terminal domains may therefore have either arisen after the divergence of eukaryotes from prokaryotes or exhibited more rapid sequence divergence than either segment 1 or 2, thus masking their common origin. The relative rates of evolutionary divergence for the three segments were determined to be segment 2 < segment 1 < segment 3. 
Correlative functional analyses of the most conserved regions of these ATPases, based on published site-specific mutagenesis data, provided preliminary evidence for their functional roles in the transport mechanism. Our studies define the structural and evolutionary relationships among the P-type ATPases. They should provide a guide for the design of future studies of structure-function relationships employing molecular genetic, biochemical, and biophysical techniques. Correspondence to: M.H. Saier, Jr.

4.

Background  

Sequence-based phylogeny reconstruction is a fundamental task in bioinformatics. Practically all methods for phylogeny reconstruction are based on multiple alignments. The quality and stability of the underlying alignments are therefore crucial for phylogenetic analysis.

5.
Taxonomic names and phylogenetic trees   (Total citations: 2; self-citations: 0; citations by others: 2)
This paper addresses the philosophy of names within the context of biological taxonomy, more specifically how names refer. By contrasting two philosophies of names, one based on the idea that names can be defined and one based on the idea that they cannot, I point out some advantages of the latter within phylogenetic systematics. Due to the changing nature of phylogenetic hypotheses, the former approach tends to rob taxonomy of its unique communicative value, since a name that is defined refers to whatever fits the definition. This is particularly troublesome should the hypothesis of phylogenetic relationship change. I argue that, should we decide to accept a new phylogenetic hypothesis, it is also likely that our view of what to name may change. A system where names only refer acknowledges this, and accordingly leaves it open whether to keep a name (and accept the way it refers in the new hypothesis) or discard a name and introduce new names for the parts of the tree that we find scientifically interesting. One of the main differences between a phylogenetic system of definition (PSD) and a phylogenetic system of reference (PSR) is that the former is governed by laws of language while the latter is governed by the communicative needs of taxonomists. Thus, a PSR tends to give primacy to phylogenetic trees rather than phylogenetic definitions of names should our views of which phylogenetic hypothesis to accept change. © 1998 The Norwegian Academy of Sciences and Letters

6.
The problem of determining an optimal phylogenetic tree from a set of data is an example of the Steiner problem in graphs. There is no efficient algorithm for solving this problem with reasonably large data sets. In the present paper an approach is described that proves in some cases that a given tree is optimal without testing all possible trees. The method first uses a previously described heuristic algorithm to find a tree of relatively small total length. The second part of the method independently analyses subsets of sites to determine a lower bound on the length of any tree. We simultaneously attempt to reduce the total length of the tree and increase the lower bound. When these are equal, it is not possible to make a shorter tree with a given data set and given criterion. An example is given where the only two possible minimal trees are found for twelve different mammalian cytochrome c sequences. The criterion of finding the smallest number of minimum base changes was used. However, there is no general method of guaranteeing that a solution will be found in all cases, and in particular, better methods of improving the estimate of the lower bound need to be developed.
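The "upper bound meets lower bound" argument in the abstract above can be illustrated with a small sketch. It is not the authors' procedure: the lower bound used here is the simplest per-site bound (a site showing k distinct states needs at least k - 1 changes on any tree), and the tree length is computed with Fitch's small-parsimony algorithm on a rooted binary tree. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair (left subtree, right subtree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def lower_bound(seqs: Dict[str, str]) -> int:
    """Sum over sites of (number of distinct states - 1).

    Assumption: this simple per-site bound stands in for the more refined
    subset-of-sites analysis described in the abstract.
    """
    return sum(len(set(column)) - 1 for column in zip(*seqs.values()))


def _fitch_site(node: Tree, seqs: Dict[str, str], i: int) -> Tuple[Set[str], int]:
    """Return (candidate state set, minimum changes) for site i below node."""
    if isinstance(node, str):
        return {seqs[node][i]}, 0
    left_states, left_cost = _fitch_site(node[0], seqs, i)
    right_states, right_cost = _fitch_site(node[1], seqs, i)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    # Disjoint state sets: one extra change is needed at this node.
    return left_states | right_states, left_cost + right_cost + 1


def fitch_length(tree: Tree, seqs: Dict[str, str]) -> int:
    """Total parsimony length of the tree, summed over all sites."""
    n_sites = len(next(iter(seqs.values())))
    return sum(_fitch_site(tree, seqs, i)[1] for i in range(n_sites))


if __name__ == "__main__":
    toy = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
    candidate = (("A", "B"), ("C", "D"))
    # If the two numbers are equal, no shorter tree exists for these data.
    print(fitch_length(candidate, toy), lower_bound(toy))
```

When the length of a candidate tree equals the lower bound, optimality follows without enumerating other trees; when the two differ, nothing is proved either way, which matches the abstract's caveat that a solution is not guaranteed in all cases.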

7.
This paper poses the problem of estimating and validating phylogenetic trees in statistical terms. The problem is hard enough to warrant several tacks: we reason by analogy to rounding real numbers, and dealing with ranking data. These are both cases where, as in phylogeny, the parameters of interest are not real numbers. Then we pose the problem in geometrical terms, using distances and measures on a natural space of trees. We do not solve the problems of inference on tree space, but suggest some coherent ways of tackling them.

8.
Complete amino acid sequences of ferredoxin and rubredoxin from Butyribacterium methylotrophicum, a methylotrophic hetero-acetogen, were determined by a combination of protease digestion, Edman degradation, carboxypeptidase digestion, and/or partial acid hydrolysis. The ferredoxin was composed of 55 amino acids with a molecular weight of 5,732 excluding iron and sulfur atoms and showed a typical 2[4Fe-4S]-type ferredoxin sequence with an internal repeat at the 14-23 and 42-51 positions. The rubredoxin was composed of 53 amino acids with a molecular weight of 5,672 excluding the iron atom and showed a sequence similar to those of other anaerobic rubredoxins. The sequences were compared to those of corresponding proteins from six different bacteria to construct phylogenetic trees, which showed essentially the same topology. The relationships between the ferredoxin sequences from this bacterium and those of Clostridium thermoaceticum and Methanosarcina barkeri, both of which possess a carbonyl-dependent acetyl-CoA metabolic system, are also discussed.

9.
The most widely used evolutionary model for phylogenetic trees is the equal-rates Markov (ERM) model. A problem is that the ERM model predicts less imbalance than observed for trees inferred from real data; in fact, the observed imbalance tends to fall between the values predicted by the ERM model and those predicted by the proportional-to-distinguishable-arrangements (PDA) model. Here, a continuous multi-rate (MR) family of evolutionary models is presented which contains entire subfamilies corresponding to both the PDA and ERM models. Furthermore, this MR family covers an entire range from 'completely balanced' to 'completely unbalanced' models. In particular, the MR family contains other known evolutionary models. The MR family is very versatile and virtually free of assumptions on the character of evolution; yet it is highly susceptible to rigorous analyses. In particular, such analyses help to uncover adaptability, quasi-stabilization and prolonged stasis as major possible causes of the imbalance. However, the MR model is functionally simple and requires only three parameters to reproduce the observed imbalance.

10.
VOSTORG is a new, versatile package of programs for the inference and presentation of phylogenetic trees, as well as an efficient tool for nucleotide (nt) and amino acid (aa) sequence analysis (sequence input, verification, alignment, construction of consensus, etc.). On appropriately equipped systems, these data can be displayed on a video monitor or printed as required. The programs are implemented on IBM PC/XT/AT/PS-2 or compatible computers, and hardware graphic support is recommended. The package is designed to be easily handled by occasional computer users, yet it is powerful enough for experienced professionals.

11.
MOTIVATION: Despite substantial efforts to develop and populate the back-ends of biological databases, front-ends to these systems often rely on taxonomic expertise. This research applies techniques from human-computer interaction research to the biodiversity domain. RESULTS: We developed an interactive node-link tool, TaxonTree, illustrating the value of a carefully designed interaction model, animation, and integrated searching and browsing towards retrieval of biological names and other information. Users tested the tool using a new, large integrated dataset of animal names with phylogenetic-based and classification-based tree structures. These techniques also translated well for a tool, DoubleTree, to allow comparison of trees using coupled interaction. Our approaches will be useful not only for biological data but as general portal interfaces.

12.
Phylogenetic networks are models of evolution that go beyond trees, incorporating non-tree-like biological events such as recombination (or more generally reticulation), which occur either in a single species (meiotic recombination) or between species (reticulation due to lateral gene transfer and hybrid speciation). The central algorithmic problems are to reconstruct a plausible history of mutations and non-tree-like events, or to determine the minimum number of such events needed to derive a given set of binary sequences, allowing one mutation per site. Meiotic recombination, reticulation and recurrent mutation can cause conflict or incompatibility between pairs of sites (or characters) of the input. Previously, we used "conflict graphs" and "incompatibility graphs" to compute lower bounds on the minimum number of recombination nodes needed, and to efficiently solve constrained cases of the minimization problem. Those results exposed the structural and algorithmic importance of the non-trivial connected components of those two graphs. In this paper, we more fully develop the structural importance of non-trivial connected components of the incompatibility and conflict graphs, proving a general decomposition theorem (Gusfield and Bansal, 2005) for phylogenetic networks. The decomposition theorem depends only on the incompatibilities in the input sequences, and hence applies to many types of phylogenetic networks, and to any biological phenomenon that causes pairwise incompatibilities. More generally, the proof of the decomposition theorem exposes a maximal embedded tree structure that exists in the network when the sequences cannot be derived on a perfect phylogenetic tree. This extends the theory of perfect phylogeny in a natural and important way. The proof is constructive and leads to a polynomial-time algorithm to find the unique underlying maximal tree structure.
We next examine and fully solve the major open question from Gusfield and Bansal (2005): Is it true that for every input there must be a fully decomposed phylogenetic network that minimizes the number of recombination nodes used, over all phylogenetic networks for the input? We previously conjectured that the answer is yes. In this paper, we show that the answer is no, both for the case that only single-crossover recombination is allowed, and also for the case that unbounded multiple-crossover recombination is allowed. The latter case also resolves a conjecture recently stated in (Huson and Klopper, 2007) in the context of reticulation networks. Although the conjecture from Gusfield and Bansal (2005) is disproved in general, we show that the answer to the conjecture is yes in several natural special cases, and establish necessary combinatorial structure that counterexamples to the conjecture must possess. We also show that counterexamples to the conjecture are rare (for the case of single-crossover recombination) in simulated data.
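The incompatibility graph mentioned in the abstract above can be sketched for binary sequences as follows. This is only an illustration: it assumes the standard four-gamete test as the definition of pairwise incompatibility (two sites conflict when all four combinations 00, 01, 10, 11 appear among the sequences), and the function names are hypothetical. It does not implement the decomposition theorem itself.

```python
from itertools import combinations
from typing import Dict, List, Set


def incompatible(seqs: List[str], i: int, j: int) -> bool:
    """Four-gamete test: sites i and j are incompatible if all four
    state pairs occur (assumes binary '0'/'1' characters)."""
    return len({(s[i], s[j]) for s in seqs}) == 4


def incompatibility_components(seqs: List[str]) -> List[Set[int]]:
    """Connected components of the graph whose nodes are sites and whose
    edges join incompatible pairs of sites."""
    n_sites = len(seqs[0])
    adjacency: Dict[int, Set[int]] = {i: set() for i in range(n_sites)}
    for i, j in combinations(range(n_sites), 2):
        if incompatible(seqs, i, j):
            adjacency[i].add(j)
            adjacency[j].add(i)

    seen: Set[int] = set()
    components: List[Set[int]] = []
    for start in range(n_sites):
        if start in seen:
            continue
        component: Set[int] = set()
        stack = [start]
        while stack:  # iterative depth-first search
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        components.append(component)
    return components


def nontrivial_components(seqs: List[str]) -> List[Set[int]]:
    """Components with more than one site, the ones the abstract singles out."""
    return [c for c in incompatibility_components(seqs) if len(c) > 1]


if __name__ == "__main__":
    toy = ["000", "010", "100", "110"]
    print(nontrivial_components(toy))
```

If the list of non-trivial components is empty, every pair of sites passes the four-gamete test, which is the classical condition for the binary sequences to fit a perfect phylogeny.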

13.
Interior-branch and bootstrap tests of phylogenetic trees   (Total citations: 19; self-citations: 3; citations by others: 16)
We have compared statistical properties of the interior-branch and bootstrap tests of phylogenetic trees when the neighbor-joining tree-building method is used. For each interior branch of a predetermined topology, the interior-branch and bootstrap tests provide the confidence values, PC and PB, respectively, that indicate the extent of statistical support of the sequence cluster generated by the branch. In phylogenetic analysis these two values are often interpreted in the same way, and if PC and PB are high (say, ≥ 0.95), the sequence cluster is regarded as reliable. We have shown that PC is in fact the complement of the P-value used in the standard statistical test, but PB is not. Actually, the bootstrap test usually underestimates the extent of statistical support of species clusters. The relationship between the confidence values obtained by the two tests varies with both the topology and expected branch lengths of the true (model) tree. The most conspicuous difference between PC and PB is observed when the true tree is starlike, and there is a tendency for the difference to increase as the number of sequences in the tree increases. The reason for this is that the bootstrap test tends to become progressively more conservative as the number of sequences in the tree increases. Unlike the bootstrap, the interior-branch test has the same statistical properties irrespective of the number of sequences used when a predetermined tree is considered. Therefore, the interior-branch test appears to be preferable to the bootstrap test as long as unbiased estimators of evolutionary distances are used. However, when the interior-branch test is applied to a tree estimated from a given data set, PC may give an overestimate of statistical confidence. For this case, we developed a method for computing a modified version (P'C) of the PC value and showed that this P'C tends to give a conservative estimate of statistical confidence, though it is not as conservative as PB.
In this paper we have introduced a model in which evolutionary distances between sequences follow a multivariate normal distribution. This model allowed us to study the relationships between the two tests analytically.

14.
15.
Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html
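The link between displaying splits in few trees and graph coloring, as stated in the abstract above, can be sketched as follows. This is a hedged illustration, not the MPC package: splits are represented by one side of the bipartition, two splits are treated as compatible when at least one of the four side-by-side intersections is empty, and a simple greedy (first-fit) coloring is used, which does not guarantee the minimum number of poles. All names and the toy weights are hypothetical, and the threshold is applied as "weight >= threshold".

```python
from typing import Dict, FrozenSet, List

Split = FrozenSet[str]  # one side of a bipartition of the taxon set


def compatible(a: Split, b: Split, taxa: FrozenSet[str]) -> bool:
    """Two splits can coexist in one tree iff one of the four
    intersections of their sides is empty."""
    a_other, b_other = taxa - a, taxa - b
    return (not (a & b) or not (a & b_other)
            or not (a_other & b) or not (a_other & b_other))


def greedy_poles(weights: Dict[Split, float], taxa: FrozenSet[str],
                 threshold: float) -> List[List[Split]]:
    """Assign each sufficiently supported split to the first pole (color
    class) whose splits are all compatible with it.

    Assumption: first-fit coloring in order of decreasing weight; this is
    a heuristic, not an exact minimum coloring.
    """
    kept = sorted((s for s, w in weights.items() if w >= threshold),
                  key=lambda s: -weights[s])
    poles: List[List[Split]] = []
    for split in kept:
        for pole in poles:
            if all(compatible(split, other, taxa) for other in pole):
                pole.append(split)
                break
        else:  # no existing pole can hold this split
            poles.append([split])
    return poles


if __name__ == "__main__":
    taxa = frozenset("ABCDE")
    weights = {
        frozenset("AB"): 0.6,
        frozenset("AC"): 0.3,   # conflicts with AB
        frozenset("DE"): 0.5,
        frozenset("AD"): 0.05,  # below the threshold, dropped
    }
    for pole in greedy_poles(weights, taxa, threshold=0.1):
        print([sorted(s) for s in pole])
```

Each pole is a set of pairwise compatible splits, so it can be drawn as a single tree; conflicting signals such as AB versus AC end up in different poles instead of one being discarded.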

16.
17.
18.
T-REX (tree and reticulogram reconstruction) is an application to reconstruct phylogenetic trees and reticulation networks from distance matrices. The application includes a number of tree fitting methods like NJ, UNJ or ADDTREE which have been very popular in phylogenetic analysis. At the same time, the software comprises several new methods of phylogenetic analysis such as: tree reconstruction using weights, tree inference from incomplete distance matrices or modeling a reticulation network for a collection of objects or species. T-REX also allows the user to visualize the obtained tree or network structures using Hierarchical, Radial or Axial types of tree drawing and manipulate them interactively. AVAILABILITY: T-REX is a freeware package available online at: http://www.fas.umontreal.ca/biol/casgrain/en/labo/t-rex

19.
Deciphering the network of protein interactions that underlies cellular operations has become one of the main tasks of proteomics and computational biology. Recently, a set of bioinformatics approaches has emerged for the prediction of possible interactions by combining sequence and genomic information. Even though the initial results are very promising, the current methods are still far from perfect. We propose here a new way of discovering possible protein-protein interactions based on the comparison of the evolutionary distances between the sequences of the associated protein families, an idea based on previous observations of correspondence between the phylogenetic trees of associated proteins in systems such as ligands and receptors. Here, we extend the approach to different test sets, including the statistical evaluation of their capacity to predict protein interactions. To demonstrate the possibilities of the system to perform large-scale predictions of interactions, we present the application to a collection of more than 67,000 pairs of E. coli proteins, of which 2742 are predicted to correspond to interacting proteins.
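The comparison of evolutionary distances between two protein families, described in the abstract above, can be sketched as a correlation between their distance matrices. The abstract does not name the similarity measure, so the Pearson correlation used here is an assumption, as are the function names and the toy matrices. The sketch also assumes both matrices list the same species in the same order.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def upper_triangle(d: Matrix) -> List[float]:
    """Flatten the pairwise distances above the diagonal into a list."""
    return [d[i][j] for i, j in combinations(range(len(d)), 2)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input
    has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def distance_correlation_score(d1: Matrix, d2: Matrix) -> float:
    """Correlation between the two families' inter-species distances.

    Assumption: row/column k refers to the same species in both matrices.
    """
    return pearson(upper_triangle(d1), upper_triangle(d2))


if __name__ == "__main__":
    family_a = [[0.0, 0.1, 0.4],
                [0.1, 0.0, 0.5],
                [0.4, 0.5, 0.0]]
    # Hypothetical partner family evolving twice as fast along the same tree.
    family_b = [[2 * v for v in row] for row in family_a]
    print(distance_correlation_score(family_a, family_b))
```

A score close to 1 means the two families' trees mirror each other, which the abstract treats as evidence of a possible interaction; choosing a cutoff for calling a pair "interacting" would need the kind of statistical evaluation the authors describe.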

20.
Bayesian methods have become among the most popular methods in phylogenetics, but theoretical opposition to this methodology remains. After providing an introduction to Bayesian theory in this context, I attempt to tackle the problem mentioned most often in the literature, the “problem of the priors”: how to assign prior probabilities to tree hypotheses. I first argue that a recent objection, that an appropriate assignment of priors is impossible, is based on a misunderstanding of what ignorance and bias are. I then consider different methods of assigning prior probabilities to trees. I argue that priors need to be derived from an understanding of how distinct taxa have evolved and that the appropriate evolutionary model is captured by the Yule birth–death process. This process leads to a well-known statistical distribution over trees. Though further modifications may be necessary to model more complex aspects of the branching process, they must be modifications to parameters in an underlying Yule model. Ignoring these Yule priors commits a fallacy leading to mistaken inferences both about the trees themselves and about macroevolutionary processes more generally.
