Similar literature
 Found 20 similar documents; search took 15 ms
1.
The requirements for bioinformatics resources to support genome research in farm animals are reviewed. The resources developed to meet these needs are described. Resource databases and associated tools have been developed to handle experimental data. Several of these systems serve the needs of multinational collaborations. Genome databases have been established to provide contemporary summaries of the status of genome maps in a range of farm and domestic animals along with experimental details and citations. New resources and tools will be required to address the informatics needs of emerging technologies such as microarrays. However, continued investment is also required to maintain the currency and utility of the current systems, especially the genome databases.

2.
Bioinformatics is a central discipline in modern life sciences aimed at describing the complex properties of living organisms starting from large-scale data sets of cellular constituents such as genes and proteins. In order for this wealth of information to provide useful biological knowledge, databases and software tools for data collection, analysis and interpretation need to be developed. In this paper, we review recent advances in the design and implementation of bioinformatics resources devoted to the study of metals in biological systems, a research field traditionally at the heart of bioinorganic chemistry. We show how metalloproteomes can be extracted from genome sequences, how structural properties can be related to function, how databases can be implemented, and how hints on interactions can be obtained from bioinformatics.

3.
With several plant genomes sequenced, the power of comparative genome analysis can now be applied. However, genome-scale cross-species analyses are limited by the effort required for data integration. To develop an integrated cross-species plant genome resource, we maintain comprehensive databases for model plant genomes, including Arabidopsis (Arabidopsis thaliana), maize (Zea mays), Medicago truncatula, and rice (Oryza sativa). Integration of data and resources is emphasized, both in-house and with external partners and databases. Manual curation and state-of-the-art bioinformatic analysis are combined to achieve quality data. Easy access to the data is provided through Web interfaces and visualization tools, bulk downloads, and Web services for application-level access. This allows a consistent view of the model plant genomes for comparative and evolutionary studies, the transfer of knowledge between species, and the integration with functional genomics data.

4.
The availability of many genomic resources such as genome sequences, functional genomics resources including microarrays and RNA-seq, sufficient numbers of molecular markers, expressed sequence tags (ESTs) and high-density genetic maps is causing a rapid acceleration of genetics and genomic research of many fruit plants. This is leading to an increase in our knowledge of the genes that are linked to many horticultural and agronomically important traits. Recently, some progress has also been made on the identification and functional analysis of miRNAs in some fruit plants. This is one of the most active research fields in plant sciences. The last decade has witnessed development of genomic resources in many fruit plants such as apple, banana, citrus, grapes, papaya, pears and strawberry; however, many of them are still not being exploited. Furthermore, owing to lack of resources, infrastructure and research facilities in many lesser-developed countries, development of genomic resources in many underutilized or less-studied fruit crops, which grow in these countries, is limited. Thus, research emphasis should be given to those fruit crops for which genomic resources are relatively scarce. The development of genomic databases of these less-studied fruit crops will enable biotechnologists to identify target genes that underlie key horticultural and agronomical traits. This review presents an overview of the current status of the development of genomic resources in fruit plants with the main emphasis being on genome sequencing, EST resources, functional genomics resources including microarray and RNA-seq, identification of quantitative trait loci and construction of genetic maps as well as efforts made on the identification and functional analysis of miRNAs in fruit plants.

5.
Nucleic acid sequences from genome sequencing projects are submitted as raw data, from which biologists attempt to elucidate the function of the predicted gene products. The protein sequences are stored in public databases, such as the UniProt Knowledgebase (UniProtKB), where curators try to add predicted and experimental functional information. Protein function prediction can be done using sequence similarity searches, but an alternative approach is to use protein signatures, which classify proteins into families and domains. The major protein signature databases are available through the integrated InterPro database, which provides a classification of UniProtKB sequences. As well as characterization of proteins through protein families, many researchers are interested in analyzing the complete set of proteins from a genome (i.e. the proteome), and there are databases and resources that provide non-redundant proteome sets and analyses of proteins from organisms with completely sequenced genomes. This article reviews the tools and resources available on the web for single and large-scale protein characterization and whole proteome analysis.

6.
In the wake of the numerous now-fruitful genome projects, we have witnessed a 'tsunami' of sequence data and with it the birth of the field of bioinformatics. Bioinformatics involves the application of information technology to the management and analysis of biological data. For many of us, this means that databases and their search tools have become an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments, even for the cognoscenti. Moreover, increasing amounts of sequence information do not necessarily equate with an increase in knowledge, and in the panic to automate the route from raw data to biological insight, we may be generating and propagating innumerable errors in our precious databases. In the genome era upon us, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. For the pharmaceutical industry in particular, the Pandora's box of bioinformatics harbours an information-rich nugget, ripe with potential drug targets and possible new avenues for the development of therapeutic agents. This review outlines the current status of the major pattern databases now used routinely in the analysis of protein sequences. The review is divided into three main sections. In the first, commonly used terms are defined and the methods behind the databases are briefly described; in the second, the structure and content of the principal pattern databases are discussed; and in the final part, several alignment databases, which are frequently confused with pattern databases, are mentioned. For the new-comer, the array of resources, the range of methods behind them and the different tools required to search them can be confusing. 
The review therefore also briefly mentions a current international endeavour to integrate the diverse databases, which effort should facilitate sequence analysis in the future. This is particularly important for target-discovery programmes, where the challenge is to rationalise the enormous numbers of potential targets generated by sequence database searches. This problem may be addressed, at least in part, by reducing search outputs to the more focused and manageable subsets suggested by searches of integrated groups of family-specific pattern databases.

7.
A wealth of bioinformatics tools and databases has been created over the last decade and most are freely available to the general public. However, these valuable resources live a shadow existence compared to experimental results and methods that are widely published in journals and relatively easily found through publication databases such as PubMed. For the general scientist as well as bioinformaticists, these tools can deliver great value to the design and analysis of biological and medical experiments, but there is no inventory presenting an up-to-date and easily searchable index of all these resources. To remedy this, the BioWareDB search engine has been created. BioWareDB is an extensive and current catalog of software and databases of relevance to researchers in the fields of biology and medicine, and presently consists of 2800 validated entries. AVAILABILITY: BioWareDB is freely available over the Internet at http://www.biowaredb.org/

8.
Genetic maps are important tools in plant genomics and breeding. The present study reports the large-scale discovery of single nucleotide polymorphisms (SNPs) for genetic map construction in tea plant. We developed a total of 6,042 valid SNP markers using specific-locus amplified fragment sequencing (SLAF-seq), and subsequently mapped them into the previous framework map. The final map contained 6,448 molecular markers, distributed across fifteen linkage groups corresponding to the number of tea plant chromosomes. The total map length was 3,965 cM, with an average inter-locus distance of 1.0 cM. This map is the first SNP-based reference map of tea plant, as well as the most saturated one developed to date. The SNP markers and map resources generated in this study provide a wealth of genetic information that can serve as a foundation for downstream genetic analyses, such as the fine mapping of quantitative trait loci (QTL), map-based cloning, marker-assisted selection, and anchoring of scaffolds to facilitate the process of whole genome sequencing projects for tea plant.

9.
10.
Enormous amounts of data result from genome sequencing projects and new experimental methods. Within this tremendous amount of genomic data, 30-40 per cent of the genes identified in an organism remain unknown in terms of their biological function. As a consequence of this lack of information, the overall schema of all the biological functions occurring in a specific organism cannot be properly represented. To understand the functional properties of the genomic data, more experimental data must be collected. A pathway database is an effort to handle the current knowledge of biochemical pathways and in addition can be used for interpretation of sequence data. Some of the existing pathway databases can be interpreted as detailed functional annotations of genomes because they are tightly integrated with genomic information. However, experimental data are often lacking in these databases. This paper summarises a list of pathway databases and some of their corresponding biological databases, and also focuses on information about the content and the structure of these databases, the organisation of the data and the reliability of stored information from a biological point of view. Moreover, information about the representation of the pathway data and tools to work with the data is given. Advantages and disadvantages of the analysed databases are pointed out, and an overview for biological scientists of how to use these pathway databases is given.

11.
ABSTRACT: BACKGROUND: Multi-locus sequence typing (MLST) has become the gold standard for population analyses of bacterial pathogens. This method focuses on the sequences of a small number of loci (usually seven) to divide the population and is simple, robust and facilitates comparison of results between laboratories and over time. Over the last decade, researchers and population health specialists have invested substantial effort in building up public MLST databases for nearly 100 different bacterial species, and these databases contain a wealth of important information linked to MLST sequence types such as time and place of isolation, host or niche, serotype and even clinical or drug resistance profiles. Recent advances in sequencing technology mean it is increasingly feasible to perform bacterial population analysis at the whole genome level. This offers massive gains in resolving power and genetic profiling compared to MLST, and will eventually replace MLST for bacterial typing and population analysis. However, given the wealth of data currently available in MLST databases, it is crucial to maintain backwards compatibility with MLST schemes so that new genome analyses can be understood in their proper historical context. RESULTS: We present a software tool, SRST, for quick and accurate retrieval of sequence types from short read sets, using inputs easily downloaded from public databases. SRST assigns alleles using read mapping and an allele assignment score incorporating sequence coverage and variability, to determine the most likely allele. Analysis of over 3,500 loci in more than 500 publicly accessible Illumina read sets showed SRST to be highly accurate at allele assignment. SRST output is compatible with common analysis tools such as eBURST, Clonal Frame or PhyloViz, allowing easy comparison between novel genome data and MLST data. Alignment, fastq and pileup files can also be generated for novel alleles.
CONCLUSIONS: SRST is a novel software tool for accurate assignment of sequence types using short read data. Several uses for the tool are demonstrated, including quality control for high-throughput sequencing projects, plasmid MLST and analysis of genomic data during outbreak investigation. SRST is open-source, requires Python, BWA and SamTools, and is available from http://srst.sourceforge.net.
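
The general idea in the abstract above (pick the most likely allele at each locus, then look up the sequence type for the resulting allelic profile) can be sketched as follows. This is only an illustration under stated assumptions: the scoring rule (coverage minus a mismatch penalty), the locus names, the allele evidence and the profile table are all hypothetical and are not SRST's actual score, code or data.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical evidence per candidate allele at one locus:
# allele number -> (fraction of the allele covered by reads, mismatching bases)
Evidence = Dict[int, Tuple[float, int]]


def best_allele(evidence: Evidence, mismatch_penalty: float = 0.1) -> Optional[int]:
    """Return the allele with the highest toy score, or None if there is no evidence."""
    if not evidence:
        return None

    def score(allele: int) -> float:
        # Toy score (an assumption, not SRST's formula): reward coverage, penalise mismatches.
        coverage, mismatches = evidence[allele]
        return coverage - mismatch_penalty * mismatches

    return max(evidence, key=score)


def sequence_type(
    calls: Dict[str, Optional[int]],
    profiles: Dict[Tuple[int, ...], int],
    loci: Sequence[str],
) -> Optional[int]:
    """Look up the sequence type for a complete allelic profile; None if unknown."""
    profile = tuple(calls.get(locus) for locus in loci)
    if any(allele is None for allele in profile):
        return None
    return profiles.get(profile)


# Seven made-up loci and a made-up profile table (allelic profile -> sequence type).
LOCI = ("locA", "locB", "locC", "locD", "locE", "locF", "locG")
PROFILES = {(1, 1, 1, 1, 1, 1, 1): 1, (1, 2, 1, 1, 3, 1, 1): 2}

# Made-up read-mapping evidence: most loci clearly match allele 1.
evidence_by_locus = {locus: {1: (1.0, 0)} for locus in LOCI}
evidence_by_locus["locB"] = {1: (1.0, 3), 2: (1.0, 0)}
evidence_by_locus["locE"] = {1: (0.8, 0), 3: (1.0, 1)}

calls = {locus: best_allele(ev) for locus, ev in evidence_by_locus.items()}
st = sequence_type(calls, PROFILES, LOCI)
print(calls, st)
```

Keeping allele calling separate from the profile lookup means that an incomplete or unseen profile simply yields no sequence type instead of a wrong one.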

12.
13.
The role of pattern databases in sequence analysis   Total citations: 2, self-citations: 0, citations by others: 2
In the wake of the numerous now-fruitful genome projects, we are entering an era rich in biological data. The field of bioinformatics is poised to exploit this information in increasingly powerful ways, but the abundance and growing complexity both of the data and of the tools and resources required to analyse them are threatening to overwhelm us. Databases and their search tools are now an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments. In an age of information overload, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. But what are those tools? How do we access them? Which should we use? This review focuses on a particular type of database that is increasingly used in the task of routine sequence analysis--the so-called pattern database. The paper aims to provide an overview of the current status of pattern databases in common use, outlining the methods behind them and giving pointers on their diagnostic strengths and weaknesses.
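
As a rough illustration of what searching a sequence against a pattern can involve, the sketch below translates a small subset of PROSITE-style pattern syntax into a regular expression and scans a sequence with it. The abstract does not describe any particular method, so the function names, the supported syntax subset and the test sequence are assumptions made for this example; the pattern N-{P}-[ST]-{P} is the commonly quoted N-glycosylation site motif.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regular expression.

    Supported subset (an assumption of this sketch): single residues, x for any
    residue, [..] for allowed residues, {..} for forbidden residues, and repeat
    counts such as x(2) or x(2,4). Anchors (< and >) are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        body, low, high = match.groups()
        if body == "x":
            body = "."                      # any residue
        elif body.startswith("{") and body.endswith("}"):
            body = "[^" + body[1:-1] + "]"  # any residue except those listed
        if low is not None:
            body += "{" + low + ("," + high if high else "") + "}"
        parts.append(body)
    return "".join(parts)


def find_sites(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Made-up protein fragment used only to exercise the functions.
sites = find_sites("N-{P}-[ST]-{P}", "MKNGSAANPTQ")
print(prosite_to_regex("N-{P}-[ST]-{P}"), sites)
```

Short patterns like this match by chance fairly often, which is one reason a pattern hit is usually treated as a hint to be checked rather than as proof of function.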

14.
The wealth of databases containing genomic information and the easy access via the internet can be an invaluable tool in performing genetic studies and identifying important sequences. This article provides an overview and specific methods for using these resources in both positional cloning and identifying candidate genes for diseases and phenotypes. The ability to apply information across the mouse and human species is stressed. Useful internet sites and their contents are identified and described, and some understanding of their current limitations is provided. As additional genomic definition accelerates, the use of these tools will become more essential in cutting edge research linking sequence and function in whole mammalian organisms.

15.
An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of these data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field--the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

16.
Polyploidy events (polyploidization) followed by progressive loss of redundant genome components are a major feature of plant evolution, with new evidence suggesting that all flowering plants possess ancestral genome duplications. Furthermore, many of our most important crop plants have undergone additional, relatively recent, genome duplication events. Recent advances in DNA sequencing have made vast amounts of new genomic data available for many plants, including a range of important crop species with highly duplicated genomes. Along with assisting traditional forward genetics approaches to study gene function, this wealth of new sequence data facilitates extensive reverse genetics-based functional analyses. However, plants featuring high levels of genome duplication as a result of recent polyploidization pose additional challenges for reverse genetic analysis. Here we review reverse genetic analysis in such polyploid plants and highlight key challenges.

17.
18.

Background  

A main goal in understanding cell mechanisms is to explain the relationship among genes and related molecular processes through the combined use of technological platforms and bioinformatics analysis. High-throughput platforms, such as microarrays, enable the investigation of the whole genome in a single experiment. There are different kinds of microarray platforms, which produce different types of binary data (images and raw data). Moreover, even within a single vendor, different chips are available. The analysis of microarray data requires an initial preprocessing phase (i.e. normalization and summarization) of raw data that makes them suitable for use on existing platforms, such as the TIGR M4 Suite. Nevertheless, the annotation of data with additional information, such as gene function, is needed to perform more powerful analyses. Raw data preprocessing and annotation are often performed in a manual and error-prone way. Moreover, many available preprocessing tools do not support annotation. Thus, novel, platform-independent, and possibly open-source tools enabling the semi-automatic preprocessing and annotation of microarray data are needed.

19.
The Swiss-Prot protein knowledgebase provides manually annotated entries for all species, but concentrates on the annotation of entries from model organisms to ensure the presence of high quality annotation of representative members of all protein families. A specific Plant Protein Annotation Program (PPAP) was started to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Its main goal is the annotation of proteins from the model plant organism Arabidopsis thaliana. In addition to bibliographic references, experimental results, computed features and sometimes even contradictory conclusions, direct links to specialized databases connect amino acid sequences with the current knowledge in plant sciences. As protein families and groups of plant-specific proteins are regularly reviewed to keep up with current scientific findings, we hope that the wealth of information of Arabidopsis origin accumulated in our knowledgebase, and the numerous software tools provided on the Expert Protein Analysis System (ExPASy) web site might help to identify and reveal the function of proteins originating from other plants. Recently, a single, centralized, authoritative resource for protein sequences and functional information, UniProt, was created by joining the information contained in Swiss-Prot, Translation of the EMBL nucleotide sequence (TrEMBL), and the Protein Information Resource-Protein Sequence Database (PIR-PSD). A rising problem is that an increasing number of nucleotide sequences are not being submitted to the public databases, and thus the proteins inferred from such sequences will have difficulties finding their way to the Swiss-Prot or TrEMBL databases.

20.
Genome browsing with Ensembl: a practical overview.   Total citations: 1, self-citations: 0, citations by others: 1
A wealth of gene information is accruing in public databases. Genome browsers such as Ensembl are needed to organize and depict this information in the context of the genome. Ensembl provides an open source gene set based on experimental evidence for over 30 species, the majority of which are vertebrates. Genes and annotation are accessible through the Ensembl browser (http://www.ensembl.org), and through direct queries of its databases using the Perl API (Application Programme Interface), MySQL or BioMart.

