Similar articles
1.
DNA sequences from orthologous loci can provide universal characters for taxonomic identification. Molecular taxonomy is of particular value for groups in which distinctive morphological features are difficult to observe or compare. To assist in species identification for the little-known family Ziphiidae (beaked whales), we compiled a reference database of mitochondrial DNA (mtDNA) control region (437 bp) and cytochrome b (384 bp) sequences for all 21 described species in this group. This mtDNA database is complemented by a nuclear database of actin intron sequences (925 bp) for 17 of the 21 species. All reference sequences were derived from specimens validated by diagnostic skeletal material or other documentation, and included four holotypes. Phylogenetic analyses of mtDNA sequences confirmed the genetic distinctiveness of all beaked whale species currently recognized. Both mitochondrial loci were well suited for species identification, with reference sequences for all known ziphiids forming robust species-specific clades in phylogenetic reconstructions. The majority of species were also distinguished by nuclear alleles. Phylogenetic comparison of sequence data from "test" specimens to these reference databases resulted in three major taxonomic discoveries involving animals previously misclassified from morphology. Based on our experience with this family and the order Cetacea as a whole, we suggest that a molecular taxonomy should consider the following components: comprehensiveness, validation, locus sensitivity, genetic distinctiveness and exclusivity, concordance, and universal accessibility and curation.

2.
The accuracy and reliability of DNA metabarcoding analyses depend on the breadth and quality of the reference libraries that underpin them. However, there are limited options available to obtain and curate the huge volumes of sequence data that are available on public repositories such as NCBI and BOLD. Here, we provide a pipeline to download, clean and annotate mitochondrial DNA sequence data for a given list of fish species. Features of this pipeline include (a) support for multiple metabarcode markers; (b) searches on species synonyms and taxonomic name validation; (c) phylogeny-assisted quality control for identification and removal of misannotated sequences; (d) automatically generated coverage reports for each new GenBank release update; and (e) citable, versioned DOIs. As an example, we provide a ready-to-use curated reference library for the marine and freshwater fishes of the U.K. To augment this reference library for environmental DNA metabarcoding specifically, we generated 241 new MiFish-12S sequences for 88 U.K. marine species, and make available new primer sets useful for sequencing these. This brings the coverage of common U.K. species for the MiFish-12S fragment to 93%, opening new avenues for scaling up fish metabarcoding across wide spatial gradients. The Meta-Fish-Lib reference library and pipeline is hosted at https://github.com/genner-lab/meta-fish-lib .

3.
The campaign to DNA barcode all fishes, FISH-BOL
FISH-BOL, the Fish Barcode of Life campaign, is an international research collaboration that is assembling a standardized reference DNA sequence library for all fishes. Analysis is targeting a 648 base pair region of the mitochondrial cytochrome c oxidase I (COI) gene. More than 5000 species have already been DNA barcoded, with an average of five specimens per species, typically vouchers with authoritative identifications. The barcode sequence from any fish, fillet, fin, egg or larva can be matched against these reference sequences using BOLD, the Barcode of Life Data System ( http://www.barcodinglife.org ). The benefits of barcoding fishes include facilitating species identification, highlighting cases of range expansion for known species, flagging previously overlooked species and enabling identifications where traditional methods cannot be applied. Results thus far indicate that barcodes separate c. 98% and 93% of already described marine and freshwater fish species, respectively. Several specimens with divergent barcode sequences have been confirmed by integrative taxonomic analysis as new species. Past concerns in relation to the use of fish barcoding for species discrimination are discussed. These include hybridization, recent radiations, regional differentiation in barcode sequences and nuclear copies of the barcode region. However, current results indicate these issues are of little concern for the great majority of specimens.

4.
Recovery of evolutionary history and delimiting species boundaries in widely distributed, poorly known groups requires extensive geographic sampling, but sampling regimes are difficult to design a priori because evolutionary diversity is often "hidden" by inadequate taxonomy. Large data sets are needed, and these provide unique challenges for analysis when they span intra- and interspecific levels of divergence. However, protocols have been designed to combine methods of analysis for DNA sequences that exhibit both very shallow and relatively deeper divergences. In this study, we combined several tree-based phylogeny reconstruction methods with nested-clade analysis to extract maximum historical signal at various levels in the poorly known Liolaemus elongatus-kriegi lizard complex in temperate South America. We implemented a recently described tree-based protocol for DNA sequences to test for species boundaries, and we propose modifications to accommodate large data sets and gene regions with heterogeneous substitution rates. Combining haplotype trees with nested-clade analyses allowed testing of species boundaries on the basis of a priori defined criteria. The results obtained suggest that the number of putative species in the L. elongatus-kriegi complex could be doubled. We discuss these findings in the context of the advantages and limitations of a combined approach for retrieval of maximum historical information in large data sets and with reference to the yet formidable unresolved issues of sampling strategies.

5.
Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
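The 1-NN behaviour described in this abstract can be illustrated with a minimal sketch. It assumes pre-aligned, equal-length sequences and uses simple per-site identity in place of a BLAST search; the function names and the toy sequences are hypothetical and are not taken from the paper.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def one_nn(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Assign the taxon of the single most similar reference (1-NN).

    `references` maps a taxon name to its reference sequence (toy data here).
    """
    best_taxon = max(references, key=lambda taxon: identity(query, references[taxon]))
    return best_taxon, identity(query, references[best_taxon])


# Hypothetical reference library with two taxa.
references = {
    "Species A": "ACGTACGTAC",
    "Species B": "ACGTTCGTAA",
}

# A query whose species is in the library is matched exactly.
print(one_nn("ACGTACGTAC", references))

# A query from an unrepresented species still receives a label,
# which is the misidentification risk the abstract describes.
print(one_nn("TTTTACGTAC", references))
```

Because 1-NN always returns its nearest neighbour, a label is produced even when the similarity is low; the second call shows why incomplete reference coverage leads to misidentification under this scheme.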

6.
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified as highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.

7.
DNA barcoding as a method for species identification is rapidly increasing in popularity. However, there are still relatively few rigorous methodological tests of DNA barcoding. Current distance-based methods are frequently criticized for treating the nearest neighbor as the closest relative via a raw similarity score, lacking an objective set of criteria to delineate taxa, or for being incongruent with classical character-based taxonomy. Here, we propose an artificial intelligence-based approach, inferring species membership via DNA barcoding with back-propagation neural networks (named BP-based species identification), as a new advance to the spectrum of available methods. We demonstrate the value of this approach with simulated data sets representing different levels of sequence variation under coalescent simulations with various evolutionary models, as well as with two empirical data sets of COI sequences from East Asian ground beetles (Carabidae) and Costa Rican skipper butterflies. With a 630- to 690-bp fragment of the COI gene, we identified 97.50% of 80 unknown sequences of ground beetles, and 95.63%, 96.10%, and 100% of 275, 205, and 9 unknown sequences of the neotropical skipper butterfly, respectively, to their correct species. Our simulation studies indicate that the success rates of species identification depend on the divergence of sequences, the length of sequences, and the number of reference sequences. Particularly in cases involving incomplete lineage sorting, this new BP-based method appears to be superior to commonly used methods for DNA-based species identification.

8.
This study examines the utility of morphology and DNA barcoding in species identification of freshwater fishes from north-central Nigeria. We compared molecular data (mitochondrial cytochrome c oxidase subunit I (COI) sequences) of 136 de novo samples from 53 morphologically identified species alongside others in GenBank and BOLD databases. Using a DNA sequence similarity-based identification technique (≥97% cutoff), 50 (94.30%) and 24 (45.30%) species were identified to species level using GenBank and BOLD databases, respectively. Furthermore, we identified cases of taxonomic problems in 26 (49.00%) morphologically identified species. There were also four (7.10%) cases of mismatch in DNA barcoding in which our query sequence in GenBank and BOLD showed a sequence match with different species names. Using DNA barcode reference data, we also identified four unknown fish samples collected from fishermen to species level. Our Neighbor-joining (NJ) tree analysis recovers several intraspecific species clusters with strong bootstrap support (≥95%). Analysis uncovers two well-supported lineages within Schilbe intermedius. The Bayesian phylogenetic analyses of Nigerian S. intermedius with others from GenBank recover four lineages. Evidence of genetic structuring is consistent with geographic regions of sub-Saharan Africa. Thus, cryptic lineage diversity may illustrate species' adaptive responses to local environmental conditions. Finally, our study underscores the importance of incorporating morphology and DNA barcoding in species identification. Although developing a complete DNA barcode reference library for Nigerian ichthyofauna will facilitate species identification and diversity studies, taxonomic revisions of DNA sequences submitted in databases alongside voucher specimens are necessary for a reliable taxonomic and diversity inventory.
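A similarity cutoff of the kind mentioned above (≥97%) could be applied to a list of database hits roughly as follows. This is only a sketch under stated assumptions: the hit list format, the function name `assign_species`, and the tie-handling rule are hypothetical and do not reproduce the authors' workflow or the BOLD or GenBank interfaces.

```python
from typing import Optional


def assign_species(hits: list[tuple[str, float]], cutoff: float = 97.0) -> Optional[str]:
    """Return a species name if the best hit meets the cutoff and is unambiguous.

    `hits` is a list of (species_name, percent_identity) pairs, for example
    parsed from a similarity search. Returns None when no hit reaches the
    cutoff or when equally good top hits carry different species names.
    """
    passing = [(name, pid) for name, pid in hits if pid >= cutoff]
    if not passing:
        return None
    best = max(pid for _, pid in passing)
    top_names = {name for name, pid in passing if pid == best}
    return top_names.pop() if len(top_names) == 1 else None


# Toy hit lists with placeholder names and made-up identities.
print(assign_species([("Species A", 99.2), ("Species B", 95.0)]))
print(assign_species([("Species A", 96.9)]))
print(assign_species([("Species A", 98.5), ("Species B", 98.5)]))
```

Returning None for tied top hits is one way to flag the kind of mismatch the abstract reports, where a query matches sequences filed under different species names.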

9.
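The main purposes of DNA barcoding are species identification and the discovery of new or cryptic species, and a DNA barcode reference database is an essential foundation for rapid species identification. A DNA barcode reference database for the vascular plants of China is currently under construction. Drawing on the public database (NCBI) and the preliminary reference database of Chinese plant DNA barcodes, we used DNA barcode data to verify the identification of plant specimens: (1) we compared DNA sequence information with specimen identification records to find misidentified specimens at the family, genus, and species levels; (2) we identified unknown specimens using well-studied DNA barcode reference databases; and (3) based on a summary of the specimen verification, we offer several suggestions for building DNA barcode reference databases.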

10.
The ends of chromosomes in higher eukaryotes are termed telomeres, and the DNA present there is called telomeric DNA. Telomeric DNA consists of tandemly repeated DNA sequences. Replication of chromosome ends is not carried out by conventional DNA polymerases; instead, a special enzyme, a ribonucleoprotein known as telomerase, is involved. Cells in the senescence stage face a telomeric crisis that leads to loss of telomeric ends. Cells that survive this crisis may later become pro-cancer cells with increased telomerase activity. Based on these facts, a key diagnostic approach has been developed for tumour detection, and a novel therapy for tumour repression has been developed using telomerase inhibitors. However, these inhibitors are effective mainly for solid tumour therapy and conceptually will not work on hematological malignancies.

11.

Background

DNA barcoding enhances the prospects for species-level identifications globally using a standardized and authenticated DNA-based approach. Reference libraries comprising validated DNA barcodes (COI) constitute robust datasets for testing query sequences, providing considerable utility to identify marine fish and other organisms. Here we test the feasibility of using DNA barcoding to assign species to tissue samples from fish collected in the central Mediterranean Sea, a major contributor to the European marine ichthyofaunal diversity.

Methodology/Principal Findings

A dataset of 1278 DNA barcodes, representing 218 marine fish species, was used to test the utility of DNA barcodes to assign species from query sequences. We tested query sequences against 1) a reference library of ranked DNA barcodes from the neighbouring North East Atlantic, and 2) the public databases BOLD and GenBank. In the first case, a reference library comprising DNA barcodes with reliability grades for 146 fish species was used as diagnostic dataset to screen 486 query DNA sequences from fish specimens collected in the central basin of the Mediterranean Sea. Of all query sequences suitable for comparisons 98% were unambiguously confirmed through complete match with reference DNA barcodes. In the second case, it was possible to assign species to 83% (BOLD-IDS) and 72% (GenBank) of the sequences from the Mediterranean. Relatively high intraspecific genetic distances were found in 7 species (2.2%–18.74%), most of them of high commercial relevance, suggesting possible cryptic species.

Conclusion/Significance

We emphasize the discriminatory power of COI barcodes and their application to cases requiring species level resolution starting from query sequences. Results highlight the value of public reference libraries of reliability grade-annotated DNA barcodes, to identify species from different geographical origins. The ability to assign species with high precision from DNA samples of disparate quality and origin has major utility in several fields, from fisheries and conservation programs to control of fish products authenticity.
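The intraspecific genetic distances reported above can be illustrated with a small sketch. The abstract does not state which distance model was used, so the uncorrected p-distance below is an assumption chosen for simplicity; the function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: proportion of differing sites in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def max_intraspecific_distance(sequences: list[str]) -> float:
    """Largest pairwise p-distance among barcodes assigned to one species.

    Returns 0.0 when fewer than two sequences are available.
    """
    return max((p_distance(a, b) for a, b in combinations(sequences, 2)), default=0.0)


# Toy barcodes filed under a single species name (made-up data).
barcodes = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTTT"]
print(f"{max_intraspecific_distance(barcodes):.1%}")
```

A species whose maximum within-species distance is unusually high compared with the rest of the data set would be a candidate for closer inspection as a possible cryptic species.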

12.
The Mediterranean species complex of Senecio serves to illustrate evolutionary processes that are likely to confound phylogenetic inference, including rapid diversification, gene tree-species tree discordance, reticulation, interlocus concerted evolution, and lack of complete lineage sorting. Phylogeographic patterns of chloroplast DNA (cpDNA) haplotype variation were studied by sampling 156 populations (502 individuals) across 18 species of the complex, and a species phylogeny was reconstructed based on sequences from the internal transcribed spacer (ITS) regions of nuclear ribosomal DNA. For a subset of species, randomly amplified polymorphic DNAs (RAPDs) provided reference points for comparison with the cpDNA and ITS datasets. Two classes of cpDNA haplotypes were identified, with each predominating in certain parts of the Mediterranean region. However, with the exception of S. gallicus, intraspecific phylogeographic structure is limited, and only a few haplotypes detected were species-specific. Nuclear sequence divergence is low, and several unresolved phylogenetic groupings are suggestive of near simultaneous diversification. Two well-supported ITS clades contain the majority of species, amongst which there is a pronounced sharing of cpDNA haplotypes. Our data are not capable of diagnosing the relative impact of reticulation versus insufficient lineage sorting for the entire complex. However, there is firm evidence that S. flavus subsp. breviflorus and S. rupestris have acquired cpDNA haplotypes and ITS sequences from co-occurring species by reticulation. In contrast, insufficient lineage sorting is a viable hypothesis for cpDNA haplotypes shared between S. gallicus and its close relatives. We estimated the minimum coalescent times for these haplotypes by utilizing the inferred species phylogeny and associated divergence times. Our data suggest that ancestral cpDNA polymorphisms may have survived for ca. 0.4–1.0 million years, depending on molecular clock calibrations.

13.
14.

Public molecular databases are fundamental tools for modern taxonomic studies whose usefulness relies on the soundness of the data within them. Here, we study potential errors that can arise along the data pipeline from sampling, specimen identification and molecular processing (digestion, amplification and sequencing) to the submission of sequences to these databases by using the DNA sequences of Hydrachnidia (Acari, Parasitengona) as a case study. Our results indicate that molecular information is available for only about 3% of the Hydrachnidia species known to date; yet, within this small percentage, errors are present in almost 5% of the species analyzed (0.5% of the sequences and almost 11% of the genera). This study underscores the scarcity of genetic data available for Hydrachnidia, but also shows that the proportion of errors in DNA sequences is relatively small. Even so, it highlights the danger associated with using DNA sequences from public databases, particularly for species identification, and reinforces the need for greater quality control measures and/or protocols to avoid an intensification of errors in the (post) genomics era. Finally, our study emphasizes that potential errors may also reveal cryptic diversity within a species.


15.
Using less stringent hybridization conditions and cloned viral DNA probes representing the avian sarcoma virus gag, pol, env, and long terminal repeat (LTR) gene sequences, we detected related sequences in two avian species purportedly lacking all endogenous avian leukosis viruses, the ev- chicken and the Japanese quail. The blot hybridization patterns obtained with the various probes suggest the presence of between 40 and 100 copies of retrovirus-related sequences in the genomes of these two species. An ev- chicken genomic DNA library was prepared and screened with gag-specific and pol-specific DNA probes. Several different clones were obtained from this library and characterized. Analysis of these clones revealed that the retrovirus-related gene sequences are linked in the order LTR-gag-pol-env-LTR, a structure indicative of a complete provirus. These data indicate the presence of previously unidentified endogenous retrovirus species in avian cells, suggesting that under the appropriate conditions of hybridization additional, more distantly evolved families of endogenous retrovirus genes may be identified in vertebrate species.

16.
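Environmental DNA (eDNA) technology is a new approach to ecological and biodiversity monitoring and assessment, and a complete and accurate reference sequence library is the foundation for applying eDNA to surveys of aquatic biodiversity. At present, eDNA reference sequences for aquatic organisms still have many problems: different groups use different marker genes and the resources are scattered, some reference sequences are taxonomically inaccurate, and few reference sequences are available for aquatic organisms from China's various water bodies. To address these problems, this study built the aquatic eDNA database (AeDNA, http://aedna.ihb.ac.cn/). AeDNA integrates two types of reference sequences: DNA barcodes and genomes. The DNA barcodes comprise more than 600,000 sequences of 18S, 28S, ITS, COI, 12S, rbcL and other markers, covering more than 20,000 fish species, more than 10,000 aquatic plant species, more than 10,000 benthic animal species, more than 10,000 zooplankton species, and more than 10,000 phytoplankton species. The genomes include 6199 organelle genomes (mitochondrial, chloroplast, and others) as well as the species genomes generated by the 10,000 Fish Genomes Project and the 10,000 Protist Genomes Project. The habitats covered include rivers, lakes, seas, glaciers, hot springs, and other aquatic environments; in particular, the more than 60,000 reference sequences contributed by the database team carry rich habitat information from China's various water bodies. Overall, AeDNA is a comprehensive eDNA reference sequence library with a large data volume, full coverage of taxonomic groups, high accuracy, and features specific to China's aquatic life, and it is an important basic resource for monitoring aquatic ecology and aquatic biodiversity.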

17.
A DNA marker that distinguishes plant-associated bacteria at the species level and below was derived by comparing six sequenced genomes of Xanthomonas, a genus that contains many important phytopathogens. This DNA marker comprises a portion of the dnaA replication initiation factor (RIF). Unlike the rRNA genes, dnaA is a single copy gene in the vast majority of sequenced bacterial genomes, and amplification of RIF requires genus-specific primers. In silico analysis revealed that RIF has equal or greater ability to differentiate closely related species of Xanthomonas than the widely used ribosomal intergenic spacer region (ITS). Furthermore, in a set of 263 Xanthomonas, Ralstonia and Clavibacter strains, the RIF marker was directly sequenced in both directions with a success rate approximately 16% higher than that for ITS. RIF frameworks for Xanthomonas, Ralstonia and Clavibacter were constructed using 682 reference strains representing different species, subspecies, pathovars, races, hosts and geographic regions, and contain a total of 109 different RIF sequences. RIF sequences showed subspecific groupings but did not place strains of X. campestris or X. axonopodis into currently named pathovars nor R. solanacearum strains into their respective races, confirming previous conclusions that pathovar and race designations do not necessarily reflect genetic relationships. The RIF marker also was sequenced for 24 reference strains from three genera in the Enterobacteriaceae: Pectobacterium, Pantoea and Dickeya. RIF sequences of 70 previously uncharacterized strains of Ralstonia, Clavibacter, Pectobacterium and Dickeya matched, or were similar to, those of known reference strains, illustrating the utility of the frameworks to classify bacteria below the species level and rapidly match unknown isolates to reference strains. The RIF sequence frameworks are available at the online RIF database, RIFdb, and can be queried for diagnostic purposes with RIF sequences obtained from unknown strains in both chromatogram and FASTA format.

18.
Nucleic Acid Homologies Among Species of Saccharomyces
Evolutionary divergence among species of the yeast genus Saccharomyces was estimated from measurements of deoxyribonucleic acid (DNA)/DNA and ribosomal ribonucleic acid (RNA)/DNA homology. Much diversity was found in the DNA base sequences with several species showing little or no homology to the three reference species, S. cerevisiae, S. lactis, and S. fragilis. These three reference species also showed little or no homology to each other. On the other hand the diversity among ribosomal RNA base sequences was small since most species showed a high degree of homology to the reference species. The arrangement of species based on ribosomal RNA homologies agrees in most cases with current taxonomic groupings. A yeast hybrid (S. fragilis × S. lactis) was shown to contain two nonhomologous genomes. A minimum genome size of 9.2 × 10^9 daltons for S. cerevisiae was calculated from the rate of DNA renaturation.

19.
Correct species identifications are of tremendous importance for invasion ecology, as mistakes could lead to misdirecting limited resources against harmless species or inaction against problematic ones. DNA barcoding is becoming a promising and reliable tool for species identifications; however, the efficacy of such molecular taxonomy depends on gene region(s) that provide a unique sequence to differentiate among species and on availability of reference sequences in existing genetic databases. Here, we assembled a list of aquatic and terrestrial non-indigenous species (NIS) and checked two leading genetic databases for corresponding sequences of six genome regions used for DNA barcoding. The genetic databases were checked in 2010, 2012, and 2016. All four aquatic kingdoms (Animalia, Chromista, Plantae and Protozoa) were initially equally represented in the genetic databases, with 64%, 65%, 69%, and 61% of NIS included, respectively. Sequences for terrestrial NIS were present at rates of 58% and 78% for Animalia and Plantae, respectively. Six years later, the proportion of aquatic NIS with sequences had increased to 75%, 75%, 74%, and 63%, respectively, while that of terrestrial NIS had increased to 74% and 88%, respectively. Genetic databases are marginally better populated with sequences of terrestrial NIS of plants compared to aquatic NIS and terrestrial NIS of animals. The rate at which sequences are added to databases is not equal among taxa. Though some groups of NIS, mostly aquatic ones, are not detectable at all based on available data, encouragingly, current availability of sequences of taxa with environmental and/or economic impact is relatively good and continues to increase with time.

20.