Similar literature
1.
2.
We provide a general overview of the features and technical specifications of an online, interactive tool for identifying scale insects of concern at U.S. ports of entry. We provide full lists of the terminal taxa included in the four keys, a list of the features used in them, and a discussion of the tool's structure. We also briefly discuss the advantages of interactive keys for identifying potential scale insect pests. The interactive key is freely accessible at http://idtools.org/id/scales/index.php

3.
Clustering is a popular technique for exploratory data analysis, as it can reveal subgroupings and similarities in data in an unsupervised manner. While clustering is routinely applied to gene expression data, appropriate general methodology is lacking for clustering sequence-level genomic and epigenomic data, e.g. ChIP-based data. Here we introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering of genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks" and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.
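The idea of pairing a similarity measure with a standard clustering algorithm can be illustrated with a small sketch. This is not ClusTrack's implementation: the base-pair Jaccard similarity, the single-linkage grouping, the function names, and the toy tracks are all assumptions chosen for illustration.

```python
from itertools import combinations


def covered_positions(track):
    """Return the set of base positions covered by (start, end) half-open intervals."""
    positions = set()
    for start, end in track:
        positions.update(range(start, end))
    return positions


def jaccard(track_a, track_b):
    """Base-pair Jaccard similarity: shared positions divided by positions in either track."""
    a, b = covered_positions(track_a), covered_positions(track_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def single_linkage(tracks, threshold):
    """Group tracks connected by a chain of pairwise similarities >= threshold."""
    names = list(tracks)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group root, compressing the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in combinations(names, 2):
        if jaccard(tracks[x], tracks[y]) >= threshold:
            parent[find(x)] = find(y)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy tracks on a single chromosome.
tracks = {
    "cell_a": [(100, 200), (500, 600)],
    "cell_b": [(110, 210), (500, 590)],
    "cell_c": [(900, 1000)],
}
print(single_linkage(tracks, threshold=0.5))
```

Materializing every covered position keeps the sketch short but would not scale to whole genomes; interval arithmetic would be the natural replacement there.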

4.

Background

DNA barcoding has been advanced as a promising tool to aid species identification and discovery through the use of short, standardized gene targets. Despite extensive taxonomic studies, for a variety of reasons the identification of fishes can be problematic, even for experts. DNA barcoding is proving to be a useful tool in this context. However, its broad application is impeded by the need to construct a comprehensive reference sequence library for all fish species. Here, we make a regional contribution to this grand challenge by calibrating the species discrimination efficiency of barcoding among 125 Argentine fish species, representing nearly one third of the known fauna, and examine the utility of these data to address several key taxonomic uncertainties pertaining to species in this region.

Methodology/Principal Findings

Specimens were collected and morphologically identified during cruises conducted between 2005 and 2008. The standard BARCODE fragment of COI was amplified and bi-directionally sequenced from 577 specimens (mean of 5 specimens/species), and all specimens and sequence data were archived and interrogated using analytical tools available on the Barcode of Life Data System (BOLD; www.barcodinglife.org). Nearly all species exhibited discrete clusters of closely related haplogroups, which permitted the discrimination of 95% of the species examined (i.e. 119/125), while cases of shared haplotypes were detected among just three species-pairs. Notably, barcoding aided the identification of a new species of skate, Dipturus argentinensis, permitted the recognition of Genypterus brasiliensis as a valid species, and called into question the generic assignment of Paralichthys isosceles.

Conclusions/Significance

This study constitutes a significant contribution to the global barcode reference sequence library for fishes and demonstrates the utility of barcoding for regional species identification. As an independent assessment of alpha taxonomy, barcodes provide robust support for most morphologically based taxon concepts and also highlight key areas of taxonomic uncertainty worthy of reappraisal.

5.
Molecular dynamics simulation is a practical and powerful technique for analyzing protein structure. Several programs have been developed to facilitate such investigations; among them, Visual Molecular Dynamics (VMD) is the most frequently used. One of VMD's beneficial properties is that it can be extended with new plug-ins. Here we introduce a new VMD facility for analyzing distances and the radius of gyration of biopolymers such as proteins and DNA.

Availability

The database is available for free at http://trc.ajums.ac.ir/HomePage.aspx/?TabID/=12618/&Site/=trc.ajums.ac/&Lang/=fa-IR
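The two quantities named in this abstract have standard definitions, sketched below in plain Python. This is not the plug-in's code (VMD plug-ins are typically written in Tcl); the function names and the toy coordinates are assumptions for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def radius_of_gyration(coords, masses=None):
    """Mass-weighted root-mean-square distance of atoms from their center of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights when no masses are given
    total = sum(masses)
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Hypothetical two-atom system placed symmetrically about the origin.
atoms = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(distance(atoms[0], atoms[1]), radius_of_gyration(atoms))
```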

6.
Identifying orthologs is an important task in understanding a novel genome, because it helps to transfer functional annotations from one organism to another. The Reciprocal Best BLAST Hit (RBBH) method is known to be an efficient approach for identifying putative orthologs. OrFin uses this approach to identify pairs of orthologous proteins for a given set of sequences from two species. It is a user-friendly web tool that searches for RBBHs with user-defined parameters. Results are produced in both HTML and text formats.

Availability

This web tool is freely available at http://bifl.uohyd.ac.in/orfin
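The reciprocal-best-hit rule itself is simple to express once the hits are in hand. The sketch below is not OrFin's code and does not run BLAST; it assumes the hits have already been reduced to hypothetical (query, subject, bit score) tuples, and the identifiers are invented.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject from (query, subject, score) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical hit tables for species A against B and B against A.
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b2", 90), ("a3", "b1", 80)]
b_vs_a = [("b1", "a1", 210), ("b2", "a1", 140), ("b2", "a2", 100)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here a2's best hit is b2, but b2's best hit is a1, so only the pair (a1, b1) is reciprocal.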

7.
The availability of genomic sequences for many organisms has opened new challenges, particularly in genome analysis. Sequence extraction is a vital step, and many tools have been developed for it. These tools are publicly available but are limited in how sequences can be extracted, in the length of the sequence that can be extracted, and in organism specificity, and they lack a user-friendly interface. We have developed a Java-based software package with three modules that can be used independently or sequentially. The tool extracts sequences from large datasets in a few simple steps and can efficiently extract multiple sequences of any desired length from the genome of any organism. The results were cross-checked against published data.

Availability

URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx
URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx
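Coordinate-based sequence extraction of the kind described above can be sketched as follows. The package itself is Java-based and its interface is not given here, so this Python function, its 1-based inclusive coordinate convention, and the toy genome are assumptions for illustration only.

```python
# Translation table used to complement DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(genome, start, end, strand="+"):
    """Return genome[start..end] using 1-based inclusive coordinates.

    For strand "-", the reverse complement of that region is returned.
    """
    if not 1 <= start <= end <= len(genome):
        raise ValueError("coordinates fall outside the genome")
    region = genome[start - 1:end]
    if strand == "-":
        region = region.translate(_COMPLEMENT)[::-1]
    return region


# Hypothetical genome and a batch of regions to pull out in one pass.
genome = "ATGCGTAC"
regions = [(1, 3, "+"), (1, 3, "-"), (4, 8, "+")]
print([extract(genome, s, e, strand) for s, e, strand in regions])
```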

8.
PHProteomicDB is a module written in PHP that helps proteomics researchers share two-dimensional gel electrophoresis data through personal web sites. No technical or PHP knowledge is necessary beyond a few basics of web site management. PHProteomicDB has a user-friendly administration interface for entering and updating data. It creates web pages on the fly that display gel characteristics, gel pictures, and numbered gel spots, with each spot's identification linked to its reference page in protein databanks. The module is freely available at http://www.huvec.com/index.php3?rub=Download.

9.
The Prp43 DExD/H-box protein is required for progression of the biochemically distinct pre-messenger RNA and ribosomal RNA (rRNA) maturation pathways. In Saccharomyces cerevisiae, the Spp382/Ntr1, Sqs1/Pfa1, and Pxr1/Gno1 proteins are implicated as cofactors necessary for Prp43 helicase activation during spliceosome dissociation (Spp382) and rRNA processing (Sqs1 and Pxr1). While otherwise dissimilar in primary sequence, these Prp43-binding proteins each contain a short glycine-rich G-patch motif required for function and thought to act in protein or nucleic acid recognition. Here yeast two-hybrid, domain-swap, and site-directed mutagenesis approaches are used to investigate G-patch domain activity and portability. Our results reveal that the Spp382, Sqs1, and Pxr1 G-patches differ in Prp43 two-hybrid response and in the ability to reconstitute the Spp382 and Pxr1 RNA processing factors. G-patch protein reconstitution did not correlate with the apparent strength of the Prp43 two-hybrid response, suggesting that this domain has function beyond that of a Prp43 tether. Indeed, while critical for Pxr1 activity, the Pxr1 G-patch appears to contribute little to the yeast two-hybrid interaction. Conversely, deletion of the primary Prp43 binding site within Pxr1 (amino acids 102–149) does not impede rRNA processing but affects small nucleolar RNA (snoRNA) biogenesis, resulting in the accumulation of slightly extended forms of select snoRNAs, a phenotype unexpectedly shared by the prp43 loss-of-function mutant. These and related observations reveal differences in how the Spp382, Sqs1, and Pxr1 proteins interact with Prp43 and provide evidence linking G-patch identity with pathway-specific DExD/H-box helicase activity.

10.
Chen YC, Aguan K, Yang CW, Wang YT, Pal NR, Chung IF. PLoS ONE. 2011;6(5):e20025

Background

Efficient algorithms for uncovering biologically relevant phosphorylation motifs have become increasingly important with the rapid expansion of proteomic sequence databases and a plethora of new information on phosphorylation sites. Here we present a novel unsupervised method, called Motif Finder (F-Motif for short), for the identification of phosphorylation motifs. F-Motif clusters sequence information represented by numerical features that exploit the statistical information hidden in foreground data. The identified motifs are then filtered to find “actual” motifs with statistically significant motif scores.

Results and Discussion

We have applied F-Motif to several new and existing data sets and compared its performance with two well-known state-of-the-art methods. In almost all cases F-Motif could identify all statistically significant motifs extracted by the state-of-the-art methods. More importantly, F-Motif also uncovers several novel motifs. Using clues from the literature, we have demonstrated that most of the new motifs discovered by F-Motif are indeed novel. We have also found some interesting phenomena. For example, for CK2 kinase, the conserved sites appear only on the right side of S. However, for CDK kinase, the adjacent site on the right of S is conserved with residue P. In addition, three different encoding methods are used, including a novel position contrast matrix (PCM) and the simplest binary coding, and the ability of F-Motif to discover motifs remains quite robust across encoding schemes.

Conclusions

The iterative algorithm proposed here uses exploratory data analysis to discover motifs from phosphorylation data. The effectiveness of F-Motif has been demonstrated using several real data sets as well as a synthetic data set. The method is quite general in nature and can also be used to find other types of motifs. We have also provided a server for F-Motif at http://f-motif.classcloud.org/, http://bio.classcloud.org/f-motif/ or http://ymu.classcloud.org/f-motif/.
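Of the encodings mentioned in this entry, binary coding is the easiest to show concretely. The sketch below is a generic one-hot encoding of a peptide window, not F-Motif's code; the residue ordering, the all-zero treatment of unknown residues, and the example window are assumptions.

```python
# Assumed ordering of the 20 standard amino acids (alphabetical by one-letter code).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_encode(window):
    """Encode each residue as 20 bits with a single 1; unknown residues become all zeros."""
    vector = []
    for residue in window:
        bits = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            bits[index] = 1
        vector.extend(bits)
    return vector


# Hypothetical window centered on a phosphorylated serine (S).
window = "RRASPVE"
features = binary_encode(window)
print(len(features), sum(features))
```

Each window becomes a fixed-length numeric vector, which is the form a standard clustering algorithm expects as input.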

11.
Alistipes senegalensis strain JC50T is the type strain of A. senegalensis sp. nov., a new species within the Alistipes genus. This strain, whose genome is described here, was isolated from the fecal flora of an asymptomatic patient. A. senegalensis is an anaerobic Gram-negative rod-shaped bacterium. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,017,609 bp long genome (1 chromosome, but no plasmid) contains 3,113 protein-coding and 50 RNA genes, including 5 rRNA genes.

12.
Liu X, Liu B, Huang Z, Shi T, Chen Y, Zhang J. PLoS ONE. 2012;7(1):e30938

Background

The molecular network sustained by different types of interactions among proteins is widely regarded as the fundamental driving force of cellular operations. Many biological functions are determined by the crosstalk between proteins rather than by the characteristics of their individual components. Thus, searching for protein partners in global networks is imperative when attempting to address the principles of biology.

Results

We have developed a web-based tool, “Sequence-based Protein Partners Search” (SPPS), to explore the interacting partners of proteins by searching a large repertoire of proteins across many species. SPPS provides a database containing more than 60,000 annotated protein sequences and a protein-partner search engine with two modes (Single Query and Multiple Query). In this study, two proteins that interact with human FBXO6 were found using the service. In addition, users can refine potential protein partner hits by using the annotations and the possible interaction network in the SPPS web server.

Conclusions

SPPS provides a new type of tool to facilitate the identification of direct or indirect protein partners, which may guide scientists in investigating new signaling pathways. The SPPS server is available to the public at http://mdl.shsmu.edu.cn/SPPS/.

13.
Plant promoter prediction with confidence estimation (cited 10 times: 0 self-citations, 10 citations by others)

14.
15.
We report the properties of a draft genome sequence of the bacterium Anaerococcus vaginalis strain PH9, a species within the Anaerococcus genus. This strain, whose genome is described here, was isolated from the fecal flora of a 26-year-old woman suffering from morbid obesity. A. vaginalis is an obligate anaerobic coccus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,048,125-bp-long genome (one chromosome but no plasmid) contains 2,095 protein-coding and 38 RNA genes, including three rRNA genes.
Key words: Anaerococcus vaginalis, genome

16.
17.
Brevibacterium senegalense strain JC43T is the type strain of Brevibacterium senegalense sp. nov., a new species within the Brevibacterium genus. This strain, whose genome is described here, was isolated from the fecal flora of a healthy Senegalese patient. B. senegalense is an aerobic rod-shaped Gram-positive bacterium. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,425,960 bp long genome (1 chromosome but no plasmid) contains 3,064 protein-coding and 49 RNA genes.

18.
Li W, Wooley JC, Godzik A. PLoS ONE. 2008;3(10):e3375

Background

The scale and diversity of metagenomic sequencing projects challenge both our technical and conceptual approaches in gene and genome annotations. The recent Sorcerer II Global Ocean Sampling (GOS) expedition yielded millions of predicted protein sequences, which significantly altered the landscape of known protein space by more than doubling its size and adding thousands of new families (Yooseph et al., 2007 PLoS Biol 5, e16). Such datasets, not only by their sheer size, but also by many other features, defy conventional analysis and annotation methods.

Methodology/Principal Findings

In this study, we describe an approach for rapid analysis of the sequence diversity and the internal structure of such very large datasets by advanced clustering strategies using the newly modified CD-HIT algorithm. We performed a hierarchical clustering analysis on the 17.4 million Open Reading Frames (ORFs) identified from the GOS study and found over 33 thousand large predicted protein clusters comprising nearly 6 million sequences. Twenty percent of these clusters did not match known protein families by sequence similarity search and might represent novel protein families. Distributions of the large clusters were illustrated on organism composition, functional class, and sample locations.

Conclusion/Significance

Our clustering took about two orders of magnitude less computational effort than the similar protein family analysis in the original GOS study. This approach will help to analyze other large metagenomic datasets in the future. A Web server with our clustering results and annotations of predicted protein clusters is available online at http://tools.camera.calit2.net/gos under the CAMERA project.
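CD-HIT is known for a greedy incremental scheme: sequences are processed from longest to shortest, and each one either joins the first cluster whose representative it resembles closely enough or becomes a new representative. The sketch below shows only that basic scheme, not the modified algorithm or the hierarchical analysis used in this study; the `identity` function is a deliberately crude stand-in for CD-HIT's alignment-based identity, and the toy sequences are hypothetical.

```python
def identity(seq, rep):
    """Stand-in similarity: fraction of matching positions over the shorter length.

    Compares the sequences ungapped from the first residue. CD-HIT itself uses
    short-word filtering and alignment, which this sketch does not attempt.
    """
    shorter = min(len(seq), len(rep))
    if shorter == 0:
        return 0.0
    matches = sum(1 for x, y in zip(seq, rep) if x == y)
    return matches / shorter


def greedy_cluster(sequences, threshold=0.8):
    """Greedy incremental clustering; returns a list of (representative, members)."""
    clusters = []
    for seq in sorted(sequences, key=len, reverse=True):
        for rep, members in clusters:
            if identity(seq, rep) >= threshold:
                members.append(seq)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append((seq, [seq]))
    return clusters


# Hypothetical protein fragments.
sequences = ["MKTAYIAKQ", "GGGGGGGG", "MKTAYIAKQR", "MKTAYIAKQK"]
for rep, members in greedy_cluster(sequences):
    print(rep, len(members))
```

Because each sequence is compared only against cluster representatives rather than against every other sequence, the number of comparisons stays far below that of all-against-all clustering.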

19.
20.