Similar literature
1.
When the Human Genome Project was conceived, its leaders wanted all researchers to have equal access to the data and associated research tools. Their vision of equal access provides an unprecedented teaching opportunity. Teachers and students have free access to the same databases that researchers are using. Furthermore, the recent movement to deliver scientific publications freely has presented a second source of current information for teaching. I have developed a genomics course that incorporates many of the public-domain databases, research tools, and peer-reviewed journals. These online resources provide students with an exciting entrée into the new fields of genomics, proteomics, and bioinformatics. In this essay, I outline how these fields are especially well suited for inclusion in the undergraduate curriculum. Assessment data indicate that my students were able to utilize online information to achieve the educational goals of the course and that the experience positively influenced their perceptions of how they might contribute to biology.

2.
Enormous amounts of data result from genome sequencing projects and new experimental methods. Within this tremendous amount of genomic data, 30-40 per cent of the genes identified in an organism remain unknown in terms of their biological function. As a consequence of this lack of information, the overall schema of all the biological functions occurring in a specific organism cannot be properly represented. To understand the functional properties of the genomic data, more experimental data must be collected. A pathway database is an effort to handle the current knowledge of biochemical pathways and can, in addition, be used for the interpretation of sequence data. Some of the existing pathway databases can be interpreted as detailed functional annotations of genomes because they are tightly integrated with genomic information. However, experimental data are often lacking in these databases. This paper summarises a list of pathway databases and some of their corresponding biological databases, and also focuses on the content and structure of these databases, the organisation of the data and the reliability of the stored information from a biological point of view. Moreover, information about the representation of the pathway data and tools to work with the data is given. Advantages and disadvantages of the analysed databases are pointed out, and an overview for biological scientists of how to use these pathway databases is given.

3.
Correct spelling of taxon names in vegetation databases is a fundamental prerequisite for many data processing steps. However, manual detection and correction of spelling mistakes is inefficient, prone to errors and non-reproducible, especially when scanning large databases. Here, I review six software tools that spell-check taxon names in vegetation databases: (1) the Global Names Resolver, (2) the Interim Register of Marine and Nonmarine Genera, (3) the Taxonomic Name Resolution Service and R packages (4) Plantminer, (5) Taxonstand and (6) tpl. In particular, I test their capacity to spell-check names across the taxonomic ranks and organism groups frequently encountered in vegetation data and challenge their ability to screen names from different geographic regions. Performance by software tools differed widely in these tests. Backed up by multiple reference lists, the Global Names Resolver emerged as the most versatile software tool. All software solutions currently suffer from some minor limitations, including an inability to spell-check names of hybrid taxa. Furthermore, some spelling mistakes, by their nature, cannot be resolved unambiguously. Given these limitations, taxon names should be spell-checked with software tools in a semi-automatic rather than an automatic way.
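The semi-automatic workflow recommended in this abstract can be illustrated with a minimal Python sketch. It does not use any of the six reviewed tools; it swaps in generic fuzzy string matching from the standard library's `difflib`, and the reference list, the function name `suggest_corrections` and the `cutoff` value are all hypothetical choices made for illustration. The function only proposes candidates and leaves the final decision to a person.

```python
from difflib import get_close_matches

# Hypothetical reference list; a real workflow would load an authoritative checklist.
REFERENCE_NAMES = [
    "Quercus robur",
    "Quercus petraea",
    "Fagus sylvatica",
    "Betula pendula",
]


def suggest_corrections(names, reference=REFERENCE_NAMES, cutoff=0.85):
    """Map each name that is absent from the reference to a list of close candidates.

    Names found verbatim in the reference are skipped. An empty candidate
    list means the name needs manual attention (assumed cutoff: 0.85).
    """
    reference_set = set(reference)
    suggestions = {}
    for name in names:
        if name in reference_set:
            continue  # exact match, nothing to review
        suggestions[name] = get_close_matches(name, reference, n=3, cutoff=cutoff)
    return suggestions


if __name__ == "__main__":
    for name, candidates in suggest_corrections(
        ["Quercus robor", "Fagus sylvatica", "Xyz abc"]
    ).items():
        print(name, "->", candidates or "no candidate, check manually")
```

Returning candidates rather than silently rewriting names keeps ambiguous cases, such as misspellings that are equally close to two valid names, visible to the reviewer.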

4.
Understanding alternative splicing is now one of the greatest challenges in biology, because the process is far more important than was thought at the time of its discovery. In this paper, we explain how the different available databases and software tools can be used to start a large-scale investigation of alternative splice forms. To collect information about alternative splicing, we investigated known data in the databases using different computational methods. The investigations proceeded from the genomic sequence data to structural protein data. We then interpreted those data to find the relationship between alternative splice forms and protein function and structure. We found some interesting features of alternative splicing, which are presented here. We discuss the results of one chosen example. They concern the coverage quality of the protein sequence of a known structure, an EST analysis, the validation of splice variants, the determination of the alternative splice type, and finally the link between alternative splicing and disease.

5.
6.
Plant genome databases play an important role in the archiving and dissemination of data arising from the international genome projects. Recent developments in bioinformatics, such as new software tools, programming languages and standards, have produced better access across the Internet to the data held within them. An increasing emphasis is placed on data analysis, and many resources now provide tools allied to the databases to aid in the analysis and interpretation of the data. However, a considerable wealth of information lies untapped when the databases are considered as single entities, and it will only be exploited by linking them with a wide range of data sources. Data from research programs such as comparative mapping and germplasm studies may be used as tools to gain additional knowledge without additional experimentation. To date, the plant genome databases are not linked comprehensively with each other or with these additional resources, although they are clearly moving toward this. Here, the current wealth of public plant genome databases is reviewed, together with an overview of initiatives underway to bind them to form a single plant genome infrastructure.

7.
During recent years, microarrays have been firmly established as valuable tools for the discovery of novel biological phenomena. Especially in combination with whole genome sequences, microarray data can help unravel the dynamics of the expressed genome. For filamentous fungi, microarray studies have already been performed with more than 20 different species; these investigations have explored a variety of different aspects of fungal biology. In this review, I will give an overview of some of the key questions that have been addressed using microarray hybridizations with filamentous fungi, with particular focus on the analysis of co-regulated pathways and physically clustered genes, as well as on the use of microarray data to determine a molecular phenotype. Additionally, a number of useful, freely available software tools for the analysis of fungal microarray data will be discussed.

8.
The development of tools for the analysis of global gene expression is vital for the optimal exploitation of the data on parasite genomes that are now being generated in abundance. Recent advances in two-dimensional electrophoresis (2-DE), mass spectrometry and bioinformatics have greatly enhanced the possibilities for mapping and characterisation of protein populations. We have employed these developments in a proteomics approach for the analysis of proteins expressed in the tachyzoite stage of Toxoplasma gondii. Over 1000 polypeptides were reproducibly separated by high-resolution 2-DE using the pH ranges 4-7 and 6-11. Further separations using narrow range gels suggest that at least 3000-4000 polypeptides should be resolvable by 2-DE using multiple single pH unit gels. Mass spectrometry was used to characterise a variety of protein spots on the 2-DE gels. Peptide mass fingerprints, acquired by matrix-assisted laser desorption/ionisation (MALDI) mass spectrometry, enabled unambiguous protein identifications to be made where full gene sequence information was available. However, interpretation of peptide mass fingerprint data using the T. gondii expressed sequence tag (EST) database was less reliable. Peptide fragmentation data, acquired by post-source decay mass spectrometry, proved a more successful strategy for the putative identification of proteins using the T. gondii EST database and protein databases from other organisms. In some instances, several protein spots appeared to be encoded by the same gene, indicating that post-translational modification and/or alternative splicing events may be a common feature of functional gene expression in T. gondii. The data demonstrate that proteomic analyses are now viable for T. gondii and other protozoa for which there are good EST databases, even in the absence of complete genome sequence. Moreover, proteomics is of great value in interpreting and annotating EST databases.

9.
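Ou Hongyu (欧竑宇). Microbiology China (《微生物学通报》), 2013, 40(10): 1909-1919
With advances in DNA sequencing technology, 12 Streptomyces genomes have been sequenced to date. Faced with massive omics data, bioinformatics methods are urgently needed to mine these important microbial resources deeply and at scale, so that the mining of Streptomyces resources and the release of their metabolic potential can advance together. This article addresses two common problems in the comparative analysis of Streptomyces genomes: the identification and functional analysis of strain-specific genomic islands, and of secondary metabolite biosynthetic gene clusters. It collects some recently developed, commonly used bioinformatics tools and secondary databases. It briefly introduces how to use these resources through three examples: the delineation of the core region and the two arms of the Streptomyces chromosome, the identification of genomic islands in Streptomyces coelicolor and Streptomyces lividans, and the identification of the megaplasmid of Streptomyces cattleya. In addition, it outlines some of our group's attempts to identify actinomycete-type integrative and conjugative elements and to develop a new tool for predicting thiopeptide biosynthetic gene clusters. Bioinformatics tools and secondary databases play an important role in the comparative analysis of Streptomyces genomes: they can quickly focus research on the mobile genetic elements and secondary metabolite biosynthetic gene clusters of a given strain, identify the corresponding strain-specific phenotypes, and help elucidate the biosynthesis and regulatory mechanisms of novel compounds.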

10.
11.
The requirements for bioinformatics resources to support genome research in farm animals are reviewed. The resources developed to meet these needs are described. Resource databases and associated tools have been developed to handle experimental data. Several of these systems serve the needs of multinational collaborations. Genome databases have been established to provide contemporary summaries of the status of genome maps in a range of farm and domestic animals, along with experimental details and citations. New resources and tools will be required to address the informatics needs of emerging technologies such as microarrays. However, continued investment is also required to maintain the currency and utility of the current systems, especially the genome databases.

12.
Nucleic acid sequences from genome sequencing projects are submitted as raw data, from which biologists attempt to elucidate the function of the predicted gene products. The protein sequences are stored in public databases, such as the UniProt Knowledgebase (UniProtKB), where curators try to add predicted and experimental functional information. Protein function prediction can be done using sequence similarity searches, but an alternative approach is to use protein signatures, which classify proteins into families and domains. The major protein signature databases are available through the integrated InterPro database, which provides a classification of UniProtKB sequences. As well as characterization of proteins through protein families, many researchers are interested in analyzing the complete set of proteins from a genome (i.e. the proteome), and there are databases and resources that provide non-redundant proteome sets and analyses of proteins from organisms with completely sequenced genomes. This article reviews the tools and resources available on the web for single and large-scale protein characterization and whole proteome analysis.

13.
Whereas genomics describes the study of the genome, mainly represented by its gene expression at the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments has increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful for interpreting genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.

14.
Using genomic databases for sequence-based biological discovery
The inherent potential underlying the sequence data produced by the International Human Genome Sequencing Consortium and other systematic sequencing projects is, obviously, tremendous. As such, it becomes increasingly important that all biologists have the ability to navigate through and cull important information from key publicly available databases. The continued rapid rise in available sequence information, particularly as model organism data are generated at breakneck speed, also underscores the necessity for all biologists to learn how to make their way effectively through the expanding "sequence information space." This review discusses some of the more commonly used tools that have been developed for the effective and efficient mining of sequence information. These include LocusLink, which provides a gene-centric view of sequence-based information, as well as the 3 major genome browsers: the National Center for Biotechnology Information Map Viewer, the University of California Santa Cruz Genome Browser, and the European Bioinformatics Institute's Ensembl system. An overview of the types of information available through each of these front-ends is given, as well as information on tutorials and other documentation intended to increase the reader's familiarity with these tools.

15.
16.
Hundreds of bacterial genomes including the genomes of dozens of plant pathogenic bacteria have been sequenced. These genomes represent an invaluable resource for molecular plant pathologists. In this review, we describe different approaches that can be used for mining bacterial genome sequences and examples of how some of these approaches have been used to analyse plant pathogen genomes so far. We review how genomes can be mined one by one and how comparative genomics of closely related genomes releases the true power of genomics. Databases and tools useful for genome mining that are publicly accessible on the Internet are also described. Finally, the need for new databases and tools to efficiently mine today's plant pathogen genomes and hundreds more in the near future is discussed.

17.
BACKGROUND: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers who need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.

18.
A web-based Geographic Information Systems (GIS) platform for forest fire control, named Virtual Fire, has been developed so that firefighting forces can share and use information and tools easily, reliably and promptly. This state-of-the-art system enables fire management professionals to take advantage of GIS capabilities without needing to locally install complex software components. Fire management professionals can locate fire service vehicles and other resources online and in real time. Fire patrol aircraft and vehicles may use tracking devices to send their coordinates directly to the platform. Cameras can augment these data by transmitting images of high-risk areas into the graphical interface of the system. Furthermore, the system provides a geographical representation of fire ignition probability and identifies high-risk areas in different local regions daily, based on a high performance computing (HPC) pilot application that runs on Windows HPC Server. Real-time data from remote automatic weather stations and weather maps based on a weather forecasting system provide vital weather data needed for fire prevention and early warning. By using these methods and a variety of fire management information and tools, end users can design an operational plan to encompass the forest fire, choosing the best ways to put the fire out within the available resources and time.

19.
MOTIVATION: As more whole genome sequences become available, comparing multiple genomes at the sequence level can provide insight into new biological discovery. However, there are significant challenges for genome comparison. One challenge is the need for computational resources owing to the large volume of genome data. More importantly, since the choice of genomes to be compared is entirely subjective, there are too many choices for genome comparison. For these reasons, there is a pressing need for bioinformatics systems for comparing multiple genomes in which users can freely choose the genomes to be compared. RESULTS: PLATCOM (Platform for Computational Comparative Genomics) is an integrated system for the comparative analysis of multiple genomes. The system is built on several public databases, and a suite of genome analysis applications is provided as exemplary genome data mining tools over these internal databases. Researchers are able to visually investigate genomic sequence similarities, conserved gene neighborhoods, conserved metabolic pathways and putative gene fusion events among a set of selected multiple genomes. AVAILABILITY: http://platcom.informatics.indiana.edu/platcom

20.
ORFans are open reading frames (ORFs) with no detectable sequence similarity to any other sequence in the databases. Each newly sequenced genome contains a significant number of ORFans. Therefore, ORFans entail interesting evolutionary puzzles. However, little can be learned about them using bioinformatics tools, and their study seems to have been underemphasized. Here we present some of the questions that the existence of so many ORFans has raised and review some of the studies aimed at understanding ORFans, their functions and their origins. These works have demonstrated that ORFans are an untapped source of research, requiring further computational and experimental studies.
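The definition in the first sentence of this abstract can be made concrete with a small Python sketch. It assumes that similarity-search results are already available as a hypothetical mapping from ORF identifier to the E-values of its hits against other sequences, with self-hits excluded; the function name `find_orfans` and the `evalue_cutoff` default are illustrative assumptions, not values taken from the reviewed studies.

```python
def find_orfans(hit_evalues, evalue_cutoff=1e-3):
    """Return the sorted IDs of ORFs with no hit at or below the E-value cutoff.

    hit_evalues: hypothetical dict {orf_id: [E-values of hits to other sequences]}.
    An ORF with an empty list, or with only weak hits, counts as an ORFan here.
    """
    return sorted(
        orf_id
        for orf_id, evalues in hit_evalues.items()
        if not any(e <= evalue_cutoff for e in evalues)
    )


if __name__ == "__main__":
    example = {
        "orf1": [1e-50, 2e-10],  # clear homologs, not an ORFan
        "orf2": [],              # no hits at all
        "orf3": [0.5, 2.0],      # only insignificant hits
    }
    print(find_orfans(example))
```

Because the result depends on the chosen cutoff and on the databases searched, an ORF flagged this way is an ORFan only relative to those choices.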
