Similar literature
 20 similar records found
1.
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.  相似文献   

2.
Characterization of molecular markers and the development of better assays for precise and rapid detection of domestic species are always in demand, particularly because of recent food scares and the biodiversity crisis resulting from the extensive ongoing illegal traffic in endangered species. The aim of this study was to develop a new and easy method for identifying domestic species (river buffalo, cattle, sheep and goat) based on the analysis of a specific mitochondrial nucleotide sequence. To this end, a specific fragment of the Egyptian buffalo mitochondrial 16S rRNA gene (422 bp) was amplified by PCR using two universal primers. The sequence of this fragment is completely conserved among all tested Egyptian buffaloes and river buffaloes from other parts of the world. The homologous fragments were one nucleotide shorter (421 bp) in goats and two nucleotides shorter (420 bp) in both cattle and sheep. The species-specific variable sites detected within this fragment were sufficient to identify the biological origin of the samples. This was achieved by aligning the unknown homologous sequence with the reference sequences deposited in the GenBank database (accession numbers FJ748599–FJ748607). Multiple alignment of 16S rRNA homologous sequences obtained from GenBank with the reference sequence showed that definite nucleotides are specific for each of the four studied species of the family Bovidae. In addition, other nucleotides were detected that allow discrimination between two groups of animals belonging to two subfamilies of the family Bovidae: group one (closely related species such as cattle and buffalo, subfamily Bovinae) and group two (closely related species such as sheep and goat, subfamily Caprinae). This 16S DNA barcode character-based approach could be used to complement cytochrome c oxidase I (COI) in DNA barcoding. It is also a good tool for quickly and easily identifying an unknown sample belonging to one of the four domestic animal species of the family Bovidae.
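
The character-based idea in this abstract, finding alignment positions whose nucleotide is fixed within one species and absent from all the others, can be illustrated with a short sketch. This is not the authors' pipeline: the function name `diagnostic_sites`, the toy sequences and the five-column alignment are all hypothetical, and the input is assumed to be already aligned to equal length.

```python
def diagnostic_sites(aligned):
    """Map each species to the alignment columns (0-based) where its base
    is fixed across its own sequences and absent from every other species.

    `aligned` is {species: [aligned sequences of equal length]} (assumed input).
    """
    species = list(aligned)
    length = len(aligned[species[0]][0])
    result = {s: [] for s in species}
    for pos in range(length):
        # Set of bases observed at this column, per species.
        bases = {s: {seq[pos] for seq in aligned[s]} for s in species}
        for s in species:
            if len(bases[s]) != 1:
                continue  # polymorphic within the species: not diagnostic
            (base,) = bases[s]
            others = set().union(*(bases[o] for o in species if o != s))
            if base not in others:
                result[s].append(pos)
    return result


# Hypothetical toy alignment, not real 16S rRNA data.
TOY_ALIGNMENT = {
    "buffalo": ["ACGGA", "ACGGA"],
    "cattle": ["ACCTA", "ACCTA"],
    "sheep": ["TCGTG", "TCGTG"],
    "goat": ["TCGTC", "TCGAC"],
}

print(diagnostic_sites(TOY_ALIGNMENT))
```

With real data the alignment would come from a multiple-alignment tool, and gap characters would need an explicit policy (counted as a character state or skipped), which this sketch does not address.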

3.
Human activities impact all ecosystems on Earth, which urges scientists to better understand biodiversity changes across temporal and spatial scales. Environmental DNA (eDNA) metabarcoding is a promising non-invasive method to assess species composition in a wide range of ecosystems. Yet, this method requires a complete reference database, i.e. a list of DNA sequences attached to each species of the regional pool, a requirement that is rarely met. As an alternative, molecular operational taxonomic units (MOTUs) can be extracted as clusters of sequences. However, the extent to which the diversity of MOTUs can predict the diversity of species across spatial scales is unknown. Here, we used 196 samples along the Rhone river (France) for which the reference database is complete to assess whether a blind eDNA approach can reliably predict the ground-truth number of species at different spatial scales. Using the 12S rDNA teleo primer, we curated and clustered 60 million sequences into MOTUs using a newly assembled bioinformatic pipeline. We show that stringent quality filters were necessary to remove artefact noise, notably MOTUs present in a single PCR replicate, which represented 55% of MOTUs (103). Post-clustering cleaning also removed 19 additional erroneous MOTUs and only discarded one truly present species. We then show that the diversity of retained fish MOTUs accurately predicted the local (α, r = 0.98) and regional (γ) ground-truth species diversity (67 MOTUs versus 63 species), but also the species dissimilarity between samples (β-diversity, r = 0.98). This work paves the way towards extending the use of eDNA metabarcoding in community ecology and biogeography despite major gaps in genetic reference databases.
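
The replicate-based quality filter mentioned in this abstract (discarding MOTUs seen in only one PCR replicate) can be sketched as follows. This is an illustration under assumptions, not the published pipeline: the function name, the `min_replicates` parameter, the data layout and the read counts are hypothetical, and the abstract does not say whether replicates are counted per sample or across the whole dataset.

```python
def filter_single_replicate_motus(counts, min_replicates=2):
    """Keep MOTUs detected (read count > 0) in at least `min_replicates`
    PCR replicates.

    `counts` is {motu_id: {replicate_id: read_count}} (assumed layout).
    """
    kept = {}
    for motu, per_replicate in counts.items():
        n_detected = sum(1 for n in per_replicate.values() if n > 0)
        if n_detected >= min_replicates:
            kept[motu] = per_replicate
    return kept


# Hypothetical read counts for three MOTUs across three PCR replicates.
TOY_COUNTS = {
    "MOTU_1": {"pcr1": 120, "pcr2": 98, "pcr3": 0},
    "MOTU_2": {"pcr1": 0, "pcr2": 7, "pcr3": 0},  # single replicate: dropped
    "MOTU_3": {"pcr1": 3, "pcr2": 5, "pcr3": 11},
}

print(sorted(filter_single_replicate_motus(TOY_COUNTS)))
```

Raising `min_replicates` makes the filter stricter at the cost of possibly discarding rare but real taxa, which is the trade-off the abstract reports (one truly present species was lost).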

4.
Metabarcoding has improved the way we understand plants within our environment, from their ecology and conservation to invasive species management. Identifying plant taxa within environmental samples relies on the ability to match unknown sequences to known reference libraries. Without comprehensive reference databases, species can go undetected or be incorrectly assigned, leading to false-positive and false-negative detections. To improve our ability to generate reference sequence databases, we developed a targeted capture approach using the OZBaits_CP V1.0 set, designed to capture chloroplast gene regions across the entirety of flowering plant diversity. We focused on generating a reference database for coastal temperate plant species given the lack of reference sequences for these taxa. Our approach was successful across all specimens with a target gene recovery rate of 92%, which was achieved in a single assay (i.e., samples were pooled), thus making this approach much faster and more efficient than standard barcoding. Further testing of this database showed that 80% of all samples could be discriminated to family level across all gene regions, with some genes achieving greater resolution than others, depending also on the taxon of interest. Thus, we demonstrate the importance of generating reference sequences across multiple chloroplast gene regions, as no single locus is sufficient to discriminate across all plant groups. The targeted capture approach outlined in this study provides a way forward to achieve this.

5.
DNA barcoding approaches have greatly increased our understanding of biodiversity on the planet, and metabarcoding is widely used for classifying members of the phylum Nematoda. However, loci typically utilized in metabarcoding studies are often unable to resolve closely related species or are unable to recover all taxa present in a sample due to inadequate PCR primer binding. Mitochondrial metagenomics (mtMG) is an alternative approach utilizing shotgun sequencing of total DNA to recover the mitochondrial genomes of all species present in samples. However, this approach requires a comprehensive reference database for identification, and currently available mitochondrial sequences for nematodes are dominated by sequences from the order Rhabditida and exclude many clades entirely. Here, we analysed the efficacy of mtMG for the recovery of nematode taxa and the generation of mitochondrial genomes. We first developed a curated reference database of nematode mitochondrial sequences and expanded it with 40 newly sequenced taxa. We then tested the mito-metagenomics approach using a series of nematode mock communities consisting of morphologically identified nematode species representing various feeding traits, life stages, and phylogenetic relationships. We were able to identify all but two species through the de novo assembly of COX1 genes. We were also able to recover additional mitochondrial protein coding genes (PCGs) for 23 of the 24 detected species, including a full array of 12 PCGs from five of the species. We conclude that mtMG offers potential for the effective recovery of nematode biodiversity but remains limited by the breadth of the reference database.

6.
DNA barcoding is based on the use of short DNA sequences to provide taxonomic tags for rapid, efficient identification of biological specimens. Currently, reference databases are being compiled. In the future, it will be important to facilitate access to these databases, especially for nonspecialist users. The method described here provides a rapid, web-based, user-friendly link between the DNA sequence from an unidentified biological specimen and various types of biological information, including the species name. Specifically, we use a customized, Google-type search algorithm to quickly match an unknown DNA sequence to a list of verified DNA barcodes in the reference database. In addition to retrieving the species name, our web tool also provides automatic links to a range of other information about that species. As the DNA barcode database becomes more populated, it will become increasingly important for the broader user community to be able to exploit it for the rapid identification of unknown specimens and to easily obtain relevant biological information about these species. The application presented here meets that need.  相似文献   
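
The abstract does not describe the "Google-type" search algorithm itself, so the sketch below swaps in a generic, well-known technique, a k-mer inverted index, purely to illustrate how an unknown sequence can be matched quickly against a list of reference barcodes. The function names, the value of `k`, the species labels and the sequences are all hypothetical.

```python
from collections import Counter, defaultdict


def build_index(barcodes, k=4):
    """Build an inverted index mapping each k-mer to the set of reference
    names whose barcode contains it. `barcodes` is {name: sequence}."""
    index = defaultdict(set)
    for name, seq in barcodes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def best_match(query, index, k=4):
    """Return the reference name sharing the most k-mers with `query`,
    or None when no k-mer of the query occurs in the index."""
    hits = Counter()
    for i in range(len(query) - k + 1):
        for name in index.get(query[i:i + k], ()):
            hits[name] += 1
    return hits.most_common(1)[0][0] if hits else None


# Hypothetical reference barcodes, far shorter than real ones.
TOY_BARCODES = {
    "Species A": "ACGTACGGTTCAGGCT",
    "Species B": "TTGACCATGGCATTAG",
}
TOY_INDEX = build_index(TOY_BARCODES)

print(best_match("ACGGTTCAGG", TOY_INDEX))
```

A k-mer lookup of this kind trades alignment accuracy for speed; a production tool would typically confirm the top candidates with a proper alignment before reporting a species name.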

7.
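[Background] For studying the diversity of ammonia-oxidizing archaea (AOA) in environmental samples, the amoA functional gene offers greater specificity and higher resolution as a molecular marker than the 16S rRNA gene, and reflects the population structure and distribution of AOA in environmental samples more accurately. However, the analysis of high-throughput amoA amplicon sequencing data currently faces two major limitations: first, there is no corresponding amoA gene reference database; second, the species-level similarity threshold of the AOA amoA gene is unknown, so there is no clear threshold for delineating species-level operational taxonomic units (OTUs) during analysis. [Objective] To establish a method for analysing AOA diversity based on amoA functional gene sequences and to provide a reference for high-throughput sequencing-based analyses of functional microbial diversity. [Methods] An AOA amoA gene reference database was constructed from the 34 AOA strains obtained so far by isolation and purification or by enrichment culture, together with environmental amoA gene sequences held in functional gene databases. The species-level similarity threshold of the amoA gene was determined by correlating the pairwise amoA gene similarities between strains with their 16S rRNA gene similarities. Using the MOTHUR software platform, the reference database and the threshold were then applied to a diversity analysis of amoA gene sequences from samples of a vertical water-column profile in the South China Sea. [Results] An archaeal amoA gene reference database containing 26,091 sequences was constructed, and 89% was determined as the threshold for delineating species-level archaeal amoA OTUs. The diversity analysis of the South China Sea water samples clearly showed the population structure and phylogenetic relationships of AOA in water layers at different depths, and effectively revealed differences in their vertical distribution. [Conclusion] A method for analysing AOA diversity based on high-throughput sequencing of the amoA gene was established; it can effectively analyse AOA diversity in environmental samples.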
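
To make the role of the 89% species-level threshold concrete, the sketch below groups pre-aligned sequences into OTUs by pairwise identity. It is a simplified greedy centroid clustering written for illustration and is not the algorithm MOTHUR uses; the function names and toy sequences are hypothetical, and only the 0.89 threshold comes from the abstract.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences
    of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.89):
    """Assign each sequence to the first OTU whose representative it
    matches at >= threshold identity; otherwise start a new OTU.
    Returns a list of OTUs, each a list of indices into `seqs`."""
    representatives = []
    otus = []
    for i, seq in enumerate(seqs):
        for k, rep in enumerate(representatives):
            if identity(seq, rep) >= threshold:
                otus[k].append(i)
                break
        else:
            representatives.append(seq)
            otus.append([i])
    return otus


# Hypothetical 20-base aligned fragments, not real amoA sequences.
TOY_SEQS = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACAA",  # 2 mismatches vs. the first: identity 0.90
    "TTTTACGTACGTACGTACGT",  # 3 mismatches vs. the first: identity 0.85
]

print(greedy_otus(TOY_SEQS))
```

Greedy clustering depends on input order, since the first sequence of each OTU becomes its representative; this is one reason dedicated tools offer several linkage rules instead.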

8.
DNA sequences from orthologous loci can provide universal characters for taxonomic identification. Molecular taxonomy is of particular value for groups in which distinctive morphological features are difficult to observe or compare. To assist in species identification for the little known family Ziphiidae (beaked whales), we compiled a reference database of mitochondrial DNA (mtDNA) control region (437 bp) and cytochrome b (384 bp) sequences for all 21 described species in this group. This mtDNA database is complemented by a nuclear database of actin intron sequences (925 bp) for 17 of the 21 species. All reference sequences were derived from specimens validated by diagnostic skeletal material or other documentation, and included four holotypes. Phylogenetic analyses of mtDNA sequences confirmed the genetic distinctiveness of all beaked whale species currently recognized. Both mitochondrial loci were well suited for species identification, with reference sequences for all known ziphiids forming robust species-specific clades in phylogenetic reconstructions. The majority of species were also distinguished by nuclear alleles. Phylogenetic comparison of sequence data from "test" specimens to these reference databases resulted in three major taxonomic discoveries involving animals previously misclassified from morphology. Based on our experience with this family and the order Cetacea as a whole, we suggest that a molecular taxonomy should consider the following components: comprehensiveness, validation, locus sensitivity, genetic distinctiveness and exclusivity, concordance, and universal accessibility and curation.  相似文献   

9.
A DNA marker that distinguishes plant-associated bacteria at the species level and below was derived by comparing six sequenced genomes of Xanthomonas, a genus that contains many important phytopathogens. This DNA marker comprises a portion of the dnaA replication initiation factor (RIF). Unlike the rRNA genes, dnaA is a single copy gene in the vast majority of sequenced bacterial genomes, and amplification of RIF requires genus-specific primers. In silico analysis revealed that RIF has equal or greater ability to differentiate closely related species of Xanthomonas than the widely used ribosomal intergenic spacer region (ITS). Furthermore, in a set of 263 Xanthomonas, Ralstonia and Clavibacter strains, the RIF marker was directly sequenced in both directions with a success rate approximately 16% higher than that for ITS. RIF frameworks for Xanthomonas, Ralstonia and Clavibacter were constructed using 682 reference strains representing different species, subspecies, pathovars, races, hosts and geographic regions, and contain a total of 109 different RIF sequences. RIF sequences showed subspecific groupings but did not place strains of X. campestris or X. axonopodis into currently named pathovars nor R. solanacearum strains into their respective races, confirming previous conclusions that pathovar and race designations do not necessarily reflect genetic relationships. The RIF marker was also sequenced for 24 reference strains from three genera in the Enterobacteriaceae: Pectobacterium, Pantoea and Dickeya. RIF sequences of 70 previously uncharacterized strains of Ralstonia, Clavibacter, Pectobacterium and Dickeya matched, or were similar to, those of known reference strains, illustrating the utility of the frameworks to classify bacteria below the species level and rapidly match unknown isolates to reference strains. The RIF sequence frameworks are available at the online RIF database, RIFdb, and can be queried for diagnostic purposes with RIF sequences obtained from unknown strains in both chromatogram and FASTA format.

10.
A number of Antarctic fish species are affected by an unusual gill condition known as X-cell disease, named in reference to morphologically similar lesions of unknown aetiology reported from northern hemisphere fishes. Despite the disease being first recorded in Antarctic fishes over 25 years ago, no progress has been made in identifying its cause or in confirming any possible relationship with northern fishes. Although once thought to be a neoplasm, observations of lesions in non-Antarctic fishes point towards a parasitic origin. The life cycle of the proposed causal organism is unknown, however, and the only stages identified are those of the eponymous cells in the lesions. Here, we show X-cells in diseased gills of the Antarctic nototheniid Trematomus bernacchii represent multinucleate cysts of an unknown parasitic organism. Furthermore, we use molecular genetic methodology to show that the organism responsible is closely related to that identified in X-cell lesions of the common European dab, Limanda limanda and that the disease thus has a global distribution. Phylogenetic tree construction based on 18S rDNA sequences confirms that X-cell organisms form a group of closely related parasites, but robust positioning of the X-cell clade in the tree awaits more extensive genetic sequencing.  相似文献   

11.
The genome sequences completed so far contain more than 20 000 genes with unknown function and no similarity to genes in other genomes. The origin and evolution of the orphan genes is an enigma. Here, we discuss the suggestion that some orphan genes may represent pseudogenes or short fragments of genes that were functional in the genome of a common ancestor. These may be the remains of unsuccessful duplication or horizontal gene transfer events, in which the acquired sequences have entered the fragmentation process and thereby lost their similarity to genes in other species. This scenario is supported by a recent case study of orphan genes in several closely related species of Rickettsia, where full-length ancestral genes were reconstructed from sets of short, overlapping orphan genes. One of these was found to display similarity to genes encoding proteins with ankyrin-repeat domains.  相似文献   

12.
The diversity of methanogenic archaea associated with different species of ciliated protozoa in the rumen was analysed. Partial fragments of archaeal SSU rRNA genes were amplified from DNA isolated from single cells from the rumen protozoal species Metadinium medium, Entodinium furca, Ophryoscolex caudatus and Diplodinium dentatum. Sequence analysis of these fragments indicated that although all of the new isolates clustered with sequences previously described for methanogens, there was a difference in the relative distribution of sequences detected here as compared to that of previous work. In addition, many of the novel sequences, although clearly of archaeal origin, have relatively low identity to the most closely related sequences in the database.

13.
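The main purposes of DNA barcoding are species identification and the discovery of new or cryptic species, and a DNA barcode reference database is an essential foundation for rapid species identification. A DNA barcode reference database for Chinese vascular plants is currently under construction. Drawing on the public database (NCBI) and the preliminary Chinese plant DNA barcode reference database, DNA barcode data were used to check the identification of plant specimens: (1) DNA sequence information was compared with specimen identification records to find misidentified specimens at the family, genus and species levels; (2) unknown specimens were identified using DNA barcode reference databases built on a sound research basis; (3) based on a summary of the specimen checks, several recommendations are made for the construction of DNA barcode reference databases.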

14.
The Archaea present in salt marsh sediment samples from a tidal creek and from an adjacent area of vegetative marshland, both of which showed active methanogenesis and sulfate reduction, were sampled by using 16S rRNA gene libraries created with Archaea-specific primers. None of the sequences were the same as reference sequences from cultured taxa, although some were closely related to sequences from methanogens previously isolated from marine sediments. A wide range of Euryarchaeota sequences were recovered, but no sequences from Methanococcus, Methanobacterium, or the Crenarchaeota were recovered. Clusters of closely related sequences were common and generally contained sequences from both sites, suggesting that some related organisms were present in both samples. Recovery of sequences closely related to those of methanogens such as Methanococcoides and Methanolobus, which can use substrates other than hydrogen, provides support for published hypotheses that such methanogens are probably important in sulfate-rich sediments and identifies some likely candidates. Sequences closely related to those of methanogens such as Methanoculleus and Methanogenium, which are capable of using hydrogen, were also discovered, in agreement with previous inhibitor and process measurements suggesting that these taxa are present at low levels of activity. More surprisingly, we recovered a variety of sequences closely related to those from different halophilic Archaea and a cluster of divergent sequences specifically related to the marine group II archaeal sequences recently shown by PCR and probing to have a cosmopolitan distribution in marine samples.  相似文献   

15.
Proteomics research is hampered in many organisms due to a lack of an appropriate reference genome sequence that can be used in the interpretation of tandem mass spectrometry data for the identification of proteins. Public DNA sequence repositories have grown to considerable size and can, in most cases, serve to provide at least partial interpretation of a large-scale proteomics dataset. However, when species-specific sequences or sequences from a closely related species are available, a boutique sequence database can provide considerable increases in specificity, confidence, and completeness of protein identification. Here, we describe the development of a protein database from a large-scale expressed sequence tag and full-length complementary DNA sequencing project in the economically and ecologically important spruce (Picea) genus.  相似文献   

16.
Earthworms are known for their important role within the functioning of an ecosystem, and their diversity can be used as an indicator of ecosystem health. To date, earthworm diversity has been investigated through conventional extraction methods such as handsorting, soil washing or the application of a mustard solution. Such techniques are time consuming and often difficult to apply. We showed that combining DNA metabarcoding and next-generation sequencing facilitates the identification of earthworm species from soil samples. The first step of our experiments was to create a reference database of mitochondrial DNA (mtDNA) 16S gene for 14 earthworm species found in the French Alps. Using this database, we designed two new primer pairs targeting very short and informative DNA sequences (about 30 and 70 bp) that allow unambiguous species identification. Finally, we analysed extracellular DNA taken from soil samples in two localities (two plots per locality and eight samples per plot). The two short metabarcode regions led to the identification of a total of eight earthworm species. The earthworm communities identified by the DNA-based approach appeared to be well differentiated between the two localities and are consistent with results derived from inventories collected using the handsorting method. The possibility of assessing earthworm communities from hundreds or even thousands of localities through the use of extracellular soil DNA will undoubtedly stimulate further ecological research on these organisms. Using the same DNA extracts, our study also illustrates the potential of environmental DNA as a tool to assess the diversity of other soil-dwelling animal taxa.  相似文献   

17.
18.
Paleogenomics is the nascent discipline concerned with sequencing and analysis of genome-scale information from historic, ancient, and even extinct samples. While once inconceivable due to the challenges of DNA damage, contamination, and the technical limitations of PCR-based Sanger sequencing, it has rapidly become a reality following the dawn of the second-generation sequencing revolution. However, a significant challenge facing ancient DNA studies on extinct species is the lack of closely related reference genomes against which to map the sequencing reads from ancient samples. Although bioinformatic efforts to improve the assemblies have focused mainly on mapping algorithms, in this article we explore the potential of an alternative approach, namely using a reconstructed ancestral genome as the reference for mapping DNA sequences of ancient samples. Specifically, we present a preliminary proof of concept for a general framework and demonstrate how, under certain evolutionary divergence thresholds, considerable mapping improvements can be easily obtained.

19.
We present an analysis of a chromosomal walk in the region of the euchromatin-heterochromatin transition at the base of the X chromosome of Drosophila melanogaster. This region is difficult to analyse because of the presence of repeated sequences, and we have used cosmids to walk from the last euchromatic gene, suppressor of forked, towards the pericentric heterochromatin. The proximal 30-kb sequence we have isolated consists of repetitive DNA, including four tandem copies of a 5.9-kb sequence. This tandem repeat is itself a mosaic of other, mostly repeated, sequences, including part of a retrotransposon without long terminal repeats, a simple-sequence region of TAA repeats and part of a retrotransposon with long terminal repeats that has not been previously described. Although sequences homologous to these components are found elsewhere in the genome, this arrangement of repeated sequences is only found at the base of the X chromosome. It is conserved in D. melanogaster strains of different geographic origin, but is not conserved in even closely related species.  相似文献   

20.
The origin and evolution of the thousands of species-specific genes with unknown functions, the so-called orphan genes, has been a mystery. Here, we have studied the rates and patterns of orphan sequence evolution, using the Rickettsia as our reference system. Of the Rickettsia conorii orphans examined in this study, 80% were found to be short gene fragments or fusions of short segments from neighboring genes. We reconstructed the putative sequences of the full-length genes from which the short orphan fragments are thought to have originated. One of the genes thus reconstructed displays weak similarity to the ankyrin-repeat protein family, an identification that is strongly supported by comparative molecular modeling. Studies of the patterns of gene fragmentation underscore the importance of short repeated sequences as targets for recombination events that result in sequence loss and the formation of short, transient open reading frames. Our analysis demonstrates that gene sequences present in the common ancestor can be inferred even in cases when no full-length open reading frame is present in any of the contemporary species. Such reconstructions support the identification of lost protein functions and hint at important lifestyle changes.  相似文献   
