Similar literature
1.
Multigene and genomic data sets have become commonplace in the field of phylogenetics, but many existing tools are not designed for such data sets, which often makes the analysis time‐consuming and tedious. Here, we present PhyloSuite, a user‐friendly workflow desktop platform (a cross‐platform, open‐source, stand‐alone Python graphical user interface) dedicated to streamlining molecular sequence data management and evolutionary phylogenetics studies. It uses a plugin‐based system that integrates several phylogenetic and bioinformatic tools, thereby streamlining the entire procedure, from data acquisition to phylogenetic tree annotation (in combination with iTOL). It has the following features: (a) point‐and‐click and drag‐and‐drop graphical user interface; (b) a workplace to manage and organize molecular sequence data and results of analyses; (c) GenBank entry extraction and comparative statistics; and (d) a phylogenetic workflow with batch processing capability, comprising sequence alignment (MAFFT and MACSE), alignment optimization (trimAl, HmmCleaner and Gblocks), data set concatenation, best partitioning scheme and best evolutionary model selection (PartitionFinder and ModelFinder), and phylogenetic inference (MrBayes and IQ‐TREE). PhyloSuite is designed for both beginners and experienced researchers, allowing the former to quick‐start their way into phylogenetic analysis, and the latter to conduct, store and manage their work in a streamlined way, and spend more time investigating scientific questions instead of wasting it on transferring files from one software program to another.
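
The data set concatenation step in the workflow above can be illustrated with a minimal sketch. This is not PhyloSuite's own code: the function name `concatenate`, the input layout (gene name mapped to a taxon-to-sequence dictionary) and the use of `?` for missing taxa are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def concatenate(
    alignments: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments into one supermatrix.

    alignments maps gene name -> {taxon: aligned sequence}.
    Taxa missing from a gene are padded with '?' (an assumed convention).
    Returns the supermatrix and 1-based, inclusive partition ranges.
    """
    taxa = sorted({t for aln in alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for t in taxa:
            pieces[t].append(aln.get(t, "?" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    return {t: "".join(parts) for t, parts in pieces.items()}, partitions


# Hypothetical toy data: two genes, taxon C lacks nad1.
genes = {
    "cox1": {"A": "ATGC", "B": "ATGA", "C": "ATGT"},
    "nad1": {"A": "GG-CA", "B": "GGTCA"},
}
supermatrix, partitions = concatenate(genes)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print(partitions)
```

The partition ranges are the kind of information a partitioning-scheme or model-selection tool needs as input, which is why the sketch returns them alongside the concatenated sequences.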

2.
Mortality site investigations of telemetered wildlife are important for cause‐specific survival analyses and understanding underlying causes of observed population dynamics. Yet, eroding ecoliteracy and a lack of quality control in data collection can lead researchers to make incorrect conclusions, which may negatively impact management decisions for wildlife populations. We reviewed a random sample of 50 peer‐reviewed studies published between 2000 and 2019 on survival and cause‐specific mortality of ungulates monitored with telemetry devices. This concise review revealed extensive variation in reporting of field procedures, with many studies omitting critical information for the cause of mortality inference. Field protocols used to investigate mortality sites and ascertain the cause of mortality are often minimally described and frequently fail to address how investigators dealt with uncertainty. We outline a step‐by‐step procedure for mortality site investigations of telemetered ungulates, including evidence that should be documented in the field. Specifically, we highlight data that can be useful to differentiate predation from scavenging and more conclusively identify the predator species that killed the ungulate. We also outline how uncertainty in identifying the cause of mortality could be acknowledged and reported. We demonstrate the importance of rigorous protocols and prompt site investigations using data from our 5‐year study on survival and cause‐specific mortality of telemetered mule deer (Odocoileus hemionus) in northern California. Over the course of our study, we visited mortality sites of neonates (n = 91) and adults (n = 23) to ascertain the cause of mortality. Rapid site visitations significantly improved the successful identification of the cause of mortality and confidence levels for neonates. We discuss the need for rigorous and standardized protocols that include measures of confidence for mortality site investigations. 
We invite reviewers and journal editors to encourage authors to provide supportive information associated with the identification of causes of mortality, including uncertainty.

3.
Large river valleys (LRVs) are heterogeneous in habitat and rich in biodiversity, but they are largely overlooked in policies that prioritize conservation. Here, we aimed to identify plant diversity hotspots along LRVs based on species richness and spatial phylogenetics, evaluate current conservation effectiveness, determine gaps in the conservation networks, and offer suggestions for prioritizing conservation. We divided the study region into 50 km × 50 km grid cells and determined the distribution patterns of seed plants by studying 124,927 occurrence points belonging to 14,481 species, using different algorithms. We generated phylogenies for the plants using the “V.PhyloMaker” R package, determined spatial phylogenetics, and conducted correlation analyses between different distribution patterns and spatial phylogenetics. We evaluated the effectiveness of current conservation practices and identified hotspot areas that fall outside the conservation networks. In the process, we identified 36 grid cells as hotspots (covering 10% of the total area) that contained 83.4% of the species. Fifty‐eight percent of the hotspot area falls under the protection of national nature reserves (NNRs) and 83% falls under national and provincial nature reserves (NRs), with 42% of the area identified as conservation gaps of NNRs and 17% of the area as gaps of NRs. The hotspots contained high proportions of endemic and threatened species, as did conservation gaps. Therefore, it is necessary to optimize the layout of current conservation networks, establish micro‐nature reserves, conduct targeted conservation priority planning focused on specific plant groups, and promote conservation awareness. Our results show that the conservation of three hotspots in Southwest China, in particular, is likely to positively affect the protection of biodiversity in the LRVs, especially with the participation of the neighboring countries, India, Myanmar, and Laos.
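
As a rough illustration of the gridding step described above, the sketch below bins occurrence points into 50 km × 50 km cells, counts species per cell, and takes the richest 10% of cells as candidate hotspots. The abstract says only that "different algorithms" were used, so ranking by raw richness is an assumption, as are the function names and the premise that coordinates are already projected to kilometres.

```python
import math
from collections import defaultdict

CELL_KM = 50.0  # grid cell edge length, as in the abstract


def cell_of(x_km, y_km, size=CELL_KM):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_km / size), math.floor(y_km / size))


def richness_by_cell(occurrences):
    """Count distinct species per cell from (species, x_km, y_km) records."""
    species_in_cell = defaultdict(set)
    for species, x_km, y_km in occurrences:
        species_in_cell[cell_of(x_km, y_km)].add(species)
    return {cell: len(names) for cell, names in species_in_cell.items()}


def hotspots(richness, fraction=0.10):
    """Return the richest `fraction` of occupied cells (at least one cell)."""
    n = max(1, math.ceil(len(richness) * fraction))
    ranked = sorted(richness, key=lambda cell: (-richness[cell], cell))
    return ranked[:n]


# Hypothetical occurrence records in projected kilometres.
occurrences = [
    ("sp1", 10.0, 10.0), ("sp2", 20.0, 45.0), ("sp1", 30.0, 5.0),
    ("sp3", 60.0, 10.0), ("sp1", 120.0, 80.0),
]
richness = richness_by_cell(occurrences)
top = hotspots(richness, fraction=0.10)
print(richness, top)
```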

4.
The mitochondrial genome is now widely used in the study of phylogenetics and molecular evolution due to its maternal inheritance, fast evolutionary rate, and highly conserved gene content. To explore the phylogenetic relationships of the tribe Aeromachini within the subfamily Hesperiinae at the mitochondrial genomic level, we sequenced and annotated the complete mitogenomes of 3 skippers: Ampittia virgata, Halpe nephele, and Onryza maga (new mitogenomes for 2 genera) with a total length of 15,333 bp, 15,291 bp, and 15,381 bp, respectively. The mitogenomes all contain 13 protein‐coding genes (PCGs), 22 transfer RNAs (tRNAs), 2 ribosomal RNAs (rRNAs), and a noncoding A + T‐rich region and are consistent with other lepidopterans in gene order and type. In addition, we reconstructed the phylogenetic trees of Hesperiinae using maximum likelihood (ML) and Bayesian inference (BI) methods based on mitogenomic data. Results show that the tribe Aeromachini in this study robustly constitutes a monophyletic group in the subfamily Hesperiinae, with the relationships Coeliadinae + (Euschemoninae + (Pyrginae + ((Eudaminae + Tagiadinae) + (Heteropterinae + ((Trapezitinae + Barcinae) + Hesperiinae))))). Moreover, our study supports the view that Apostictopterus fuliginosus and Barca bicolor should be placed outside the subfamily Hesperiinae.

5.
Zebrafish is a powerful vertebrate model system for studying development, modeling disease, and performing drug screening. Recently a variety of genetic tools have been introduced, including multiple strategies for inducing mutations and generating transgenic lines. However, large-scale screening is limited by traditional genotyping methods, which are time-consuming and labor-intensive. Here we describe a technique to analyze zebrafish genotypes by PCR combined with high-resolution melting analysis (HRMA). This approach is rapid, sensitive, and inexpensive, with lower risk of contamination artifacts. Genotyping by PCR with HRMA can be used for embryos or adult fish, including in high-throughput screening protocols.

6.
New methods for performing quantitative proteome analyses based on differential labeling protocols or label-free techniques are reported in the literature on an almost monthly basis. In parallel, a correspondingly vast number of software tools for the analysis of quantitative proteomics data has also been described in the literature and produced by private companies. In this article we focus on the review of some of the most popular techniques in the field and present a critical appraisal of several software packages available to process and analyze the data produced. We also describe the importance of community standards to support the wide range of software, which may assist researchers in the analysis of data using different platforms and protocols. It is intended that this review will serve bench scientists both as a useful reference and a guide to the selection and use of different pipelines to perform quantitative proteomics data analysis. We have produced a web-based tool (http://www.proteosuite.org/?q=other_resources) to help researchers find appropriate software for their local instrumentation, available file formats, and quantitative methodology.

7.
Peña C  Malm T 《PloS one》2012,7(6):e39071
There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

8.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of datapoints and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

9.
BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.
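
To make concrete what parsing a tree file format involves, here is a minimal standard-library sketch of a Newick reader. It is not Bio.Phylo's implementation or API: the `Node` class, `parse_newick` and `leaf_names` are hypothetical names, and the parser assumes simple input with no quoted labels or bracketed comments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    """Parse a simple Newick string (no quoted labels, no comments)."""
    s = text.strip()
    if not s.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node() -> Node:
        nonlocal pos
        node = Node()
        if s[pos] == "(":
            pos += 1  # consume '('
            while True:
                node.children.append(parse_node())
                if s[pos] == ",":
                    pos += 1
                    continue
                if s[pos] == ")":
                    pos += 1
                    break
                raise ValueError(f"unexpected character {s[pos]!r} at {pos}")
        start = pos  # optional label
        while s[pos] not in ",():;":
            pos += 1
        node.name = s[start:pos].strip()
        if s[pos] == ":":  # optional branch length
            pos += 1
            start = pos
            while s[pos] not in ",();":
                pos += 1
            node.length = float(s[start:pos])
        return node

    root = parse_node()
    if s[pos] != ";":
        raise ValueError("unexpected characters before ';'")
    return root


def leaf_names(node: Node) -> List[str]:
    """Collect terminal labels in left-to-right order."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaf_names(child)]


tree = parse_newick("((A:0.1,B:0.2):0.3,C:0.4);")
print(leaf_names(tree))
```

With Biopython installed, `Phylo.read(handle, "newick")` from `Bio.Phylo` returns a tree object from the same kind of input, which is the route the abstract describes; the sketch only shows the underlying idea.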

10.
The increased use of the automated external defibrillator (AED) contributes to the rising survival rate after sudden cardiac arrest in the Netherlands. When used, the AED records the unconscious person’s medical data (heart rhythm and information about cardiopulmonary resuscitation), which may be important for further diagnosis and treatment. In practice, ethical and legal questions arise about what can and should be done with these ‘AED data’. In this article, the authors advocate the development of national guidelines on the handling of AED data. These guidelines should serve two purposes: (1) to safeguard that data are handled carefully in accordance with data protection principles and the rules of medical confidentiality; and (2) to ensure nationwide availability of data for care of patients who survive resuscitation, as well as for quality monitoring of this care and for related scientific research. Given the medical ethical duties of beneficence and fairness, existing (sometimes lifesaving) information about AED use ought to be made available to clinicians and researchers on a structural basis. Creating a national AED data infrastructure, however, requires overcoming practical and organisational barriers. In addition, further legal study is warranted.

11.
Sequence capture across large phylogenetic scales is not easy because hybridization capture is only effective when the genetic distance between the bait and target is small. Here, we propose a simple but effective strategy to tackle this issue: pooling DNA from a number of selected representative species of different clades to prepare PCR‐generated baits to minimize the genetic distance between the bait and target. To demonstrate the utility of this strategy, we newly developed a set of universal nuclear markers (including 94 nuclear protein‐coding genes) for Lepidoptera, a superdiverse insect group. We used a DNA pool from six lepidopteran species (representing six superfamilies) to prepare PCR baits for the 94 markers. These homemade PCR baits were used to capture sequence data from 43 species of 17 lepidopteran families, and 94% of the target loci were recovered. We constructed two data sets from the obtained data (one containing ~90 kb target coding sequences and the other containing ~120 kb target + flanking coding sequences). Both data sets yielded highly similar and well‐resolved trees with 90% of nodes having >95% bootstrap support. Our capture experiment indicated that using DNA mixtures pooled from different clade‐representative species of Lepidoptera to prepare PCR baits can reliably capture a large number of targeted nuclear markers across different Lepidoptera lineages. We hope that this newly developed nuclear marker set will serve as a new phylogenetic tool for Lepidoptera phylogenetics, and the PCR bait preparation strategy can facilitate the application of sequence capture techniques by researchers to accelerate data collection.

12.
Rodrigue N  Lartillot N  Bryant D  Philippe H 《Gene》2005,347(2):207-217
Standard likelihood-based frameworks in phylogenetics consider the process of evolution of a sequence site by site. Assuming that sites evolve independently greatly simplifies the required calculations. However, this simplification is known to be incorrect in many cases. Here, a computational method that allows for general dependence between sites of a sequence is investigated. Using this method, measures acting as sequence fitness proxies can be considered over a phylogenetic tree. In this work, a set of statistically derived amino acid pairwise potentials, developed in the context of protein threading, is used to account for what we call the structural fitness of a sequence. We describe a model combining statistical potentials with an empirical amino acid substitution matrix. We propose such a combination as a useful way of capturing the complexity of protein evolution. Finally, we outline features of the model using three datasets and show the approach's sensitivity to different tree topologies.

13.
Several existing technologies enable short genomic alterations, including generating indels and short nucleotide variants; however, engineering more significant genomic changes is more challenging due to reduced efficiency and precision. Here, we developed RecT Editor via Designer-Cas9-Initiated Targeting (REDIT), which leverages the phage single-stranded DNA-annealing protein (SSAP) RecT for mammalian genome engineering. Relative to Cas9-mediated homology-directed repair (HDR), REDIT yielded up to a 5-fold increase of efficiency to insert kilobase-scale exogenous sequences at defined genomic regions. We validated our REDIT approach using different formats and lengths of knock-in templates. We further demonstrated that REDIT tools using Cas9 nickase have efficient gene-editing activities and reduced off-target errors, measured using a combination of targeted sequencing, genome-wide indel, and insertion mapping assays. Our experiments inhibiting repair enzyme activities suggested that REDIT has the potential to overcome limitations of endogenous DNA repair steps. Finally, our REDIT method is applicable across cell types including human stem cells, and is generalizable to different Cas9 enzymes.

14.
An experimental phylogeny was constructed using bacteriophage T7 and a propagation protocol, in the presence of the mutagen N-methyl-N′-nitro-N-nitrosoguanidine, based on Hillis et al. [Hillis, D.M., Bull, J.J., White, M.E., Badgett, M.R., Molineux, I.J., 1992. Experimental phylogenetics, generation of a known phylogeny. Science 255, 589–592]. The topology presented in this study has a considerable variation in branch lengths and is less symmetric than the one presented by Hillis et al. (1992). These features are known to present additional difficulties to phylogenetic inference methods. The performance of several phylogenetic methods (conventional and less conventional) was tested using restriction site and nucleotide data. Only methods that encompassed a molecular clock or those based on sequence signatures recovered the true phylogeny. Nevertheless, a likelihood ratio test rejected the hypothesis of the existence of a molecular clock when the whole sequence data set was considered. This fact or the particular substitution pattern (mainly G → A and C → T) may be related to the unexpected performance of distance methods based on sequence signatures. To test if the results could have been predicted by simulation studies, we estimated the evolution parameters from the real phylogeny and used them to simulate evolution along the same tree (parametric bootstrap). We found that simulation could predict most but not all of the problems encountered by phylogenetic inference methods in the real phylogeny. Short interior branches may be more prone to error than predicted by theoretical studies.

15.
Studies examining phylogenetic community structure have become increasingly prevalent, yet little attention has been given to the influence of the input phylogeny on metrics that describe phylogenetic patterns of co-occurrence. Here, we examine the influence of branch length, tree reconstruction method, and amount of sequence data on measures of phylogenetic community structure, as well as the phylogenetic signal (Pagel’s λ) in morphological traits, using Trichoptera larval communities from Churchill, Manitoba, Canada. We find that model-based tree reconstruction methods and the use of a backbone family-level phylogeny improve estimations of phylogenetic community structure. In addition, trees built using the barcode region of cytochrome c oxidase subunit I (COI) alone accurately predict metrics of phylogenetic community structure obtained from a multi-gene phylogeny. Input tree did not alter overall conclusions drawn for phylogenetic signal, as significant phylogenetic structure was detected in two body size traits across input trees. As the discipline of community phylogenetics continues to expand, it is important to investigate the best approaches to accurately estimate patterns. Our results suggest that emerging large datasets of DNA barcode sequences provide a vast resource for studying the structure of biological communities.

16.
Zheng Wei  Luo Arong  Shi Weifeng  Zheng Weimin  Zhu Chaodong 《Acta Entomologica Sinica》2013,56(10):1217-1228
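With the continuing development of biotechnology and deepening research in phylogenetics, researchers face growing challenges when reconstructing phylogenetic trees: (1) the number of samples (species or individuals) to be analyzed keeps increasing; and (2) the volume of data to be analyzed is expanding rapidly. Driven in particular by genome sequencing technology, phylogenetic reconstruction based on molecular information demands an enormous amount of computation, so mathematical methods, computer technology and other auxiliary tools play a crucial role in the efficiency and accuracy of phylogenetic reconstruction. Maximum parsimony is an important method of phylogenetic reconstruction, and improving its computational efficiency is of great significance for phylogenetic research; optimizing the algorithm requires the joint efforts of biologists and computer scientists. This paper describes the computational procedure of maximum parsimony in detail and analyzes how parameter choices affect computational efficiency, to help computer users who are unfamiliar with the foundations of phylogenetics devise better, faster and more accurate solutions to practical phylogenetic algorithm problems. It also explains the tree-building rationale and computational logic of maximum parsimony clearly for phylogenetics researchers, to promote the continued improvement and optimization of the method.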
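
The core scoring step of maximum parsimony can be sketched with Fitch's algorithm, which counts the minimum number of state changes a fixed rooted binary tree requires. The abstract does not say which scoring algorithm the paper covers, so using Fitch here is an assumption; the nested-tuple tree encoding and the function names are likewise illustrative.

```python
def fitch(tree, states):
    """Return (state_set, cost) for one site on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple (left, right) of subtrees.
    states: maps leaf name -> observed character state at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum Fitch costs over all sites of an alignment {taxon: sequence}."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical four-taxon alignment and two candidate topologies.
alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "GCT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
score_1 = parsimony_score(tree_1, alignment)
score_2 = parsimony_score(tree_2, alignment)
print(score_1, score_2)
```

A full parsimony search repeats this scoring over many candidate topologies and keeps the lowest-scoring tree, which is where most of the computational cost discussed in the abstract arises.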

17.
MOTIVATION: The determination of gene orthology is a prerequisite for mining and utilizing the rapidly increasing amount of sequence data for genome-scale phylogenetics and comparative genomic studies. Until now, most researchers have used pairwise distance comparison algorithms, such as BLAST, COG, RBH, RSD and INPARANOID, to determine gene orthology. In contrast, orthology determination within a character-based phylogenetic framework has not been utilized on a genomic scale owing to the lack of efficiency and automation. RESULTS: We have developed OrthologID, a Web application that automates the labor-intensive procedures of gene orthology determination within a character-based phylogenetic framework, thus making character-based orthology determination on a genomic scale possible. In addition to generating gene family trees and determining orthologous gene sets for complete genomes, OrthologID can also identify diagnostic characters that define each orthologous gene set, as well as diagnostic characters that are responsible for classifying query sequences from other genomes into specific orthology groups. The OrthologID database currently includes several complete plant genomes, including Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, as well as a unicellular outgroup, Chlamydomonas reinhardtii. To improve the general utility of OrthologID beyond plant species, we plan to expand our sequence database to include the fully sequenced genomes of prokaryotes and other non-plant eukaryotes. AVAILABILITY: http://nypg.bio.nyu.edu/orthologid/
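
For contrast with the character-based approach that OrthologID automates, the reciprocal best hit (RBH) idea named among the pairwise methods can be sketched as follows. This is not OrthologID's method; the score dictionaries stand in for similarity scores from a pairwise search, and all names and numbers are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores maps (query, subject) -> similarity score; higher is better.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores between genes of genome A and genome B.
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90,
          ("a2", "b2"): 180, ("a3", "b2"): 200}
b_vs_a = {("b1", "a1"): 245, ("b2", "a3"): 198, ("b2", "a2"): 170}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Gene `a2` finds `b2` as its best hit but is not the best hit of `b2`, so it is left unpaired; pairwise methods decide such cases from scores alone, without the gene family tree that a character-based framework would use.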

18.
We utilize the secondary structural properties of the 28S rRNA D2–D10 expansion segments to hypothesize a multiple sequence alignment for major lineages of the hymenopteran superfamily Ichneumonoidea (Braconidae, Ichneumonidae). The alignment consists of 290 sequences (originally analyzed in Belshaw and Quicke, Syst Biol 51:450–477, 2002) and provides the first global alignment template for this diverse group of insects. Predicted structures for these expansion segments as well as for over half of the 18S rRNA are given, with highly variable regions characterized and isolated within conserved structures. We demonstrate several pitfalls of optimization alignment and illustrate how these are potentially addressed with structure-based alignments. Our global alignment is presented online (http://hymenoptera.tamu.edu/rna) with summary statistics, such as basepair frequency tables, along with novel tools for parsing structure-based alignments into input files for most commonly used phylogenetic software. These resources will be valuable for hymenopteran systematists, as well as researchers utilizing rRNA sequences for phylogeny estimation in any taxon. We explore the phylogenetic utility of our structure-based alignment by examining a subset of the data under a variety of optimality criteria using results from Belshaw and Quicke (2002) as a benchmark. Access to on-line data: http://hymenoptera.tamu.edu/rna; username, ichs; password, ichzzz

19.
20.
A highly interoperable informatics infrastructure rapidly emerged to handle genomic data used for phylogenetics and was instrumental in the growth of molecular systematics. Parallel growth in software and databases to address needs peculiar to phylophenomics has been relatively slow and fragmented. Systematists currently face the challenge that Earth may hold tens of millions of species (living and fossil) to be described and classified. Grappling with research on this scale has increasingly resulted in work by teams, many constructing large phenomic supermatrices. Until now, phylogeneticists have managed data in single‐user, file‐based desktop software wholly unsuitable for real‐time, team‐based collaborative work. Furthermore, phenomic data often differ from genomic data in readily lending themselves to media representation (e.g. 2D and 3D images, video, sound). Phenomic data are a growing component of phylogenetics, and thus teams require the ability to record homology hypotheses using media and to share and archive these data. Here we describe MorphoBank, a web application and database leveraging software as a service methodology compatible with “cloud” computing technology for the construction of matrices of phenomic data. In its tenth year, and fully available to the scientific community at‐large since inception, MorphoBank enables interactive collaboration not possible with desktop software, permitting self‐assembling teams to develop matrices, in real time, with linked media in a secure web environment. MorphoBank also provides any user with tools to build character and media ontologies (rule sets) within matrices, and to display these as directed acyclic graphs. These rule sets record the phylogenetic interrelatedness of characters (e.g. if X is absent, Y is inapplicable, or X–Z characters share a media view). 
MorphoBank has enabled an order of magnitude increase in phylophenomic data collection: a recent collaboration by more than 25 researchers has produced a database of > 4500 phenomic characters supported by > 10 000 media.
© The Willi Hennig Society 2011.

