Similar articles
 20 similar articles found (search time: 31 ms)
1.

Background

BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.

Results

We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI's Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.
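The ROC5000 figures quoted above are ROC_n scores with n = 5000. As a rough illustration, the sketch below uses one common definition of ROC_n: for each of the first n false positives in a ranked hit list, count the true positives ranked above it, sum those counts, and divide by n times the total number of true positives. The function name `roc_n`, the toy ranking, and the handling of lists with fewer than n false positives are assumptions made for this example; the abstract does not describe how the authors computed their scores.

```python
def roc_n(ranked_is_true, total_true, n):
    """ROC_n for a ranked hit list (best hit first).

    ranked_is_true: iterable of booleans, True for a true positive.
    total_true:     number of true positives that could have been found.
    n:              number of false positives to integrate over.
    """
    if total_true <= 0 or n <= 0:
        raise ValueError("total_true and n must be positive")
    true_seen = 0
    false_seen = 0
    area = 0
    for is_true in ranked_is_true:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
            area += true_seen  # true positives ranked above this false positive
            if false_seen == n:
                break
    # Assumption: if the list ends before n false positives were seen,
    # the missing false positives are treated as ranked after every hit.
    area += (n - false_seen) * true_seen
    return area / (n * total_true)


# Toy ranking: 3 of 4 true positives retrieved, interleaved with false hits.
ranked = [True, True, False, True, False, False]
score = roc_n(ranked, total_true=4, n=2)
```

A score of 1.0 means every true positive is ranked above the first false positive; lower values indicate that false positives appear earlier in the ranking.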

Conclusions

DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the "Protein BLAST" link at http://blast.ncbi.nlm.nih.gov.

Reviewers

This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.

2.
3.
4.
5.

Background

The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment.

Results

We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser.

Conclusion

ArrayCGHbase is a web-based, platform-independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

6.

Background

The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to quickly search a set of reads for near-exact text matches.

Methods

A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads.
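To make the idea of a near-exact text search over reads concrete, here is a minimal sketch that scans every read for windows differing from a pattern in at most a given number of positions. It is not the authors' tool: the function names, the toy reads, and the use of substitution-only (Hamming) mismatches are all assumptions for illustration. Homopolymer-length errors typical of pyrophosphate sequencing are insertions and deletions, which this simple version does not handle.

```python
def hamming_within(a, b, max_mismatches):
    """Return True if equal-length strings differ in at most max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def find_near_matches(reads, pattern, max_mismatches=1):
    """Yield (read_index, offset) for each read window that nearly matches pattern."""
    k = len(pattern)
    for i, read in enumerate(reads):
        for offset in range(len(read) - k + 1):
            if hamming_within(read[offset:offset + k], pattern, max_mismatches):
                yield i, offset


# Hypothetical reads, invented for this example.
reads = ["ACGTACGTAA", "TTGACCTACG", "ACGAACGTTT"]
hits = list(find_near_matches(reads, "ACGTAC", max_mismatches=1))
```

The brute-force scan costs time proportional to the total read length times the pattern length, which is acceptable for a sketch; a real tool over a large read set would more likely use an index.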

Results

Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension.

Conclusion

The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.

7.
8.
9.

Background

Cytokines are small proteins that regulate immunity in vertebrate species. Marsupial and eutherian mammals last shared a common ancestor more than 180 million years ago, so it is not surprising that attempts to isolate many key marsupial cytokines using traditional laboratory techniques have been unsuccessful. This paucity of molecular data has led some authors to suggest that the marsupial immune system is 'primitive' and not on par with the sophisticated immune system of eutherian (placental) mammals.

Results

The sequencing of the first marsupial genome has allowed us to identify highly divergent immune genes. We used gene prediction methods that incorporate the identification of gene location using BLAST, SYNTENY + BLAST and HMMER to identify 23 key marsupial immune genes, including IFN-γ, IL-2, IL-4, IL-6, IL-12 and IL-13, in the genome of the grey short-tailed opossum (Monodelphis domestica). Many of these genes were not predicted in the publicly available automated annotations.

Conclusion

The power of this approach was demonstrated by the identification of orthologous cytokines between marsupials and eutherians that share only 30% identity at the amino acid level. Furthermore, the presence of key immunological genes suggests that marsupials do indeed possess a sophisticated immune system, whose function may parallel that of eutherian mammals.

10.

Background

The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set.

Results

In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results.
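The scoring idea described above (a mean of absolute weighted t-scores, with larger weights for genes that occur in fewer gene sets) can be sketched as follows. The abstract does not state the weight formula, so the square-root form used here, which maps gene frequencies to weights between 1 and 2, is an assumption chosen for illustration. The function names, gene identifiers, and t-scores are likewise hypothetical. The sketch also omits any standardization or permutation step that a full method would need to assess significance.

```python
from math import sqrt


def gene_weights(gene_sets):
    """Assign each gene a weight in [1, 2]; genes in fewer sets weigh more.

    Assumed form: w(g) = 1 + sqrt((max_f - f(g)) / (max_f - min_f)),
    where f(g) is the number of gene sets containing g.
    """
    freq = {}
    for genes in gene_sets.values():
        for g in set(genes):
            freq[g] = freq.get(g, 0) + 1
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        # All genes are equally frequent, so no gene is down-weighted.
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}


def gene_set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over the genes of one set."""
    return sum(abs(t_scores[g]) * weights[g] for g in genes) / len(genes)


# Hypothetical gene sets and moderated t-scores.
gene_sets = {"setA": ["g1", "g2"], "setB": ["g2", "g3"], "setC": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": -1.0}

weights = gene_weights(gene_sets)
scores = {name: gene_set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
```

In this toy data, gene g2 appears in all three sets and receives the minimum weight of 1, so its large t-score contributes less to each set score than it would under equal weighting.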

Conclusions

PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.

11.
12.
13.

Background

The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways.

Results

annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools.
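The parse-and-look-up step described above can be illustrated with a short sketch. It assumes BLAST's standard 12-column tabular output, in which the e-value is the eleventh column, and an in-memory dictionary standing in for the reference database. The function names, the e-value cutoff, the sequence identifiers, and the annotation terms are hypothetical. The abstract does not say which BLAST output format or selection rule annot8r itself uses.

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Map each query ID to its lowest-e-value subject in tabular BLAST output."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def annotate(best, reference):
    """Look up the annotations of each query's best subject sequence."""
    return {query: reference.get(subject, [])
            for query, (subject, _evalue) in best.items()}


# Hypothetical BLAST rows and reference annotations.
blast_lines = [
    "est1\tsubjA\t85.0\t100\t15\t0\t1\t100\t1\t100\t1e-40\t150",
    "est1\tsubjB\t60.0\t90\t36\t0\t1\t90\t5\t94\t1e-10\t70",
    "est2\tsubjC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
]
reference = {"subjA": ["GO:term1"], "subjB": ["GO:term2"]}

annotations = annotate(best_hits(blast_lines), reference)
```

Here est2 receives no annotation because its only hit fails the e-value cutoff; a stricter or looser cutoff trades annotation coverage against reliability.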

Conclusion

annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.

14.

Background

Arenaviruses are a family of rodent-borne viruses that cause several hemorrhagic fevers. These diseases can be devastating and are often lethal. Herein, to aid in the design and development of diagnostics, treatments and vaccines for arenavirus infections, we have developed a database containing protein sequences from the seven pathogenic arenaviruses (Junin, Guanarito, Sabia, Machupo, Whitewater Arroyo, Lassa and LCMV).

Results

The database currently contains a non-redundant set of 333 protein sequences which were manually annotated. All entries were linked to NCBI and cited PubMed references. The database has a convenient query interface including BLAST search. Sequence variability analyses were also performed and the results are hosted in the database.

Conclusion

The database is available at http://epitope.liai.org:8080/projects/arena and can be used to aid in studies that require proteomic information from pathogenic arenaviruses.

15.

Key message

A cytogenetic map of wheat was constructed using FISH with cDNA probes. FISH markers detected homoeology and chromosomal rearrangements of wild relatives, an important source of genes for wheat improvement.

Abstract

To transfer agronomically important genes from wild relatives to bread wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) by induced homoeologous recombination, it is important to know the chromosomal relationships of the species involved. Fluorescence in situ hybridization (FISH) can be used to study chromosome structure. The genomes of allohexaploid bread wheat and other species from the Triticeae tribe are colinear to some extent, i.e., composed of homoeoloci at similar positions along the chromosomes, and with genic regions being highly conserved. To develop cytogenetic markers specific for genic regions of wheat homoeologs, we selected more than 60 full-length wheat cDNAs using BLAST against mapped expressed sequence tags and used them as FISH probes. Most probes produced signals on all three homoeologous chromosomes at the expected positions. We developed a wheat physical map with several cDNA markers located on each of the 14 homoeologous chromosome arms. The FISH markers confirmed chromosome rearrangements within wheat genomes and were successfully used to study chromosome structure and homoeology in wild Triticeae species. FISH analysis detected a 1U-6U chromosome translocation in the genome of Aegilops umbellulata, and showed colinearity between chromosome A of Ae. caudata and group-1 wheat chromosomes, and between chromosome arm 7S#3L of Thinopyrum intermedium and the long arm of the group-7 wheat chromosomes.

16.

Background

Haitian migrants played an important role shaping Cuban culture and traditional ethnobotanical knowledge. An ethnobotanical investigation was conducted to collect information on medicinal plant use by Haitian immigrants and their descendants in the Province of Camagüey, Cuba.

Methods

Information was obtained from semi-structured interviews with Haitian immigrants and their descendants, direct observations, and by reviewing reports of traditional Haitian medicine in the literature.

Results

Informants reported using 123 plant species belonging to 112 genera in 63 families. Haitian immigrants and their descendants mainly decoct or infuse aerial parts and ingest them, but medicinal baths are also relevant. Some 22 herbal mixtures are reported, including formulas for a preparation obtained using the fruit of Crescentia cujete. Cultural aspects related to traditional plant posology are addressed, as well as changes and adaptation of Haitian medicinal knowledge with emigration and integration over time.

Conclusion

The rapid disappearance of Haitian migrants' traditional culture due to integration and urbanization suggests that unrecorded ethnomedicinal information may be lost forever. Given this, as well as the poor availability of ethnobotanical data relating to traditional Haitian medicine, there is an urgent need to record this knowledge.

17.

Background

Clinical studies are a necessity for new medications and therapies. Many studies, however, struggle to meet their recruitment numbers in time or have problems in meeting them at all. With increasing numbers of electronic health records (EHRs) in hospitals, huge databanks emerge that could be utilized to support research. The Innovative Medicine Initiative (IMI) funded project ‘Electronic Health Records for Clinical Research’ (EHR4CR) created a standardized and homogeneous inventory of data elements to support research by utilizing EHRs. Our aim was to develop a Data Inventory that contains elements required for site feasibility analysis.

Methods

The Data Inventory was created in an iterative, consensus-driven approach by a group of up to 30 people consisting of pharmaceutical experts and informatics specialists. An initial list was subsequently expanded by data elements of simplified eligibility criteria from clinical trial protocols. Each element was manually reviewed by pharmaceutical experts and standard definitions were identified and added. To verify their availability, data exports of the source systems at eleven university hospitals throughout Europe were conducted and evaluated.

Results

The Data Inventory consists of 75 data elements that, on the one hand, are frequently used in clinical studies and, on the other hand, are available in European EHR systems. Rankings of data elements were created from the results of the data exports. In addition, a sub-list was created with 21 data elements that were separated from the Data Inventory because of their low usage in routine documentation.

Conclusion

The data elements in the Data Inventory were identified with the knowledge of domain experts from pharmaceutical companies. Currently, not all information that is frequently used in site feasibility is documented in routine patient care.

18.

Background, aim, and scope

The primary aim of this paper is to indicate that partitioning allocation methods yield only a small subset of solutions to an ill-posed problem that has potentially infinitely many exact solutions. It will be shown that each of the existing partitioning methods arrives at just one particular solution from among infinitely many solutions of an underdetermined system of linear equations.

Materials and methods

Some life cycle inventories fall into a class of functions called estimable functions in linear model framework, in which case they are invariant to allocation assumptions. This class of functions unites results described by Heijungs and Frischknecht (Int J Life Cycle Assess 3:321–332, 1998) and Heijungs and Suh (2002, Conjecture 1, p. 91). The inventories for non-estimable functions obtained through allocation are, in fact, derived under a set of additional implicit equality constraints called side conditions, often resulting in inventory results which differ greatly from one allocation to the next.
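The notion of an estimable function referred to above can be made concrete with the standard linear-model argument. The notation below ($X$, $\beta$, $y$, $c$, $z$) is generic and assumed for this sketch; it is not the paper's own, and the two-unknown example is invented for illustration.

```latex
% Consistent but underdetermined system: X is m x n with rank(X) < n.
\[
  X\beta = y
  \quad\Longrightarrow\quad
  \beta = X^{+}y + (I - X^{+}X)\,z , \qquad z \in \mathbb{R}^{n} \text{ arbitrary},
\]
% where X^{+} is the Moore-Penrose pseudoinverse.
% A linear function c^T beta is estimable iff c lies in the row space of X:
\[
  c^{\top} = a^{\top}X \ \text{for some } a
  \quad\Longrightarrow\quad
  c^{\top}\beta = a^{\top}X\beta = a^{\top}y \ \text{for every solution } \beta .
\]
% Example: X = (1 \; 1), y = 2, so all solutions satisfy beta_1 + beta_2 = 2.
\[
  c = (1,1)^{\top}:\; c^{\top}\beta = 2 \ \text{(estimable)};
  \qquad
  c = (1,0)^{\top}:\; c^{\top}\beta = \beta_{1} \ \text{(not estimable)} .
\]
% Side conditions single out one solution: beta_1 = beta_2 gives beta_1 = 1,
% whereas beta_2 = 0 gives beta_1 = 2.
```

In the example, the estimable function takes the same value under either side condition, while the non-estimable one changes from 1 to 2, which mirrors how inventory results for non-estimable functions can differ from one allocation to the next.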

Results and discussions

This paper explicates (1) identification of all estimable functions from any given technology matrix and (2) recovery of side conditions imposed on non-estimable functions through partitioning. These methods are illustrated in a simple example, and their relation to least squares techniques for allocation explored by Marvuglia et al. (Int J Life Cycle Assess 15:1020–1040, 2010; Int J Agr Environ Inf Syst 3:51–71, 2012) is discussed.

Conclusions and outlook

Recommendations are made that may lead to more meaningful ways to obtain additional data or include additional information in life cycle inventories in the future.

19.
20.

Background

Although cardiac auscultation remains important to detect abnormal sounds and murmurs indicative of cardiac pathology, electronic methods remain seldom used in everyday clinical practice. In this report we provide preliminary data showing how the phonocardiogram can be analyzed using color spectrographic techniques and discuss how such information may be of future value for noninvasive cardiac monitoring.

Methods

We digitally recorded the phonocardiogram using a high-speed USB interface and the program Gold Wave (http://www.goldwave.com) in 55 infants and adults with cardiac structural disease as well as from normal individuals and individuals with innocent murmurs. Color spectrographic analysis of the signal was performed using Spectrogram (Version 16) as well as custom MATLAB code.

Results

Our preliminary data are presented as a series of seven cases.

Conclusions

We expect the application of spectrographic techniques to phonocardiography to grow substantially as ongoing research demonstrates its utility in various clinical settings. Our evaluation of a simple, low-cost phonocardiographic recording and analysis system to assist in determining the characteristic features of heart murmurs shows promise in helping distinguish innocent systolic murmurs from pathological murmurs in children and is expected to be useful in other clinical settings as well.
