Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
The Arabidopsis Information Resource (TAIR) is a web-based community database for the model plant Arabidopsis thaliana. It provides an integrated view of genes, sequences, proteins, germplasms, clones, metabolic pathways, gene expression, ecotypes, polymorphisms, publications, maps and community information. TAIR is developed and maintained by collaboration between software developers and biologists. Biologists provide specification and use cases for the system, acquire, analyse and curate data, interact with users and test the software. Software developers design, implement and test the database and software. In this review, we briefly describe how TAIR was built and is being maintained.

3.
AraCyc is a database containing biochemical pathways of Arabidopsis, developed at The Arabidopsis Information Resource (http://www.arabidopsis.org). The aim of AraCyc is to represent Arabidopsis metabolism as completely as possible with a user-friendly Web-based interface. It presently features more than 170 pathways that include information on compounds, intermediates, cofactors, reactions, genes, proteins, and protein subcellular locations. The database uses Pathway Tools software, which allows the users to visualize a bird's eye view of all pathways in the database down to the individual chemical structures of the compounds. The database was built using Pathway Tools' Pathologic module with MetaCyc, a collection of pathways from more than 150 species, as a reference database. This initial build was manually refined and annotated. More than 20 plant-specific pathways, including carotenoid, brassinosteroid, and gibberellin biosyntheses, have been added from the literature. A list of more than 40 plant pathways will be added in the coming months. The quality of the initial, automatic build of the database was compared with the manually improved version, and with EcoCyc, an Escherichia coli database using the same software system that has been manually annotated for many years. In addition, a Perl interface, PerlCyc, was developed that allows programmers to access Pathway Tools databases from the popular Perl language. AraCyc is available at the tools section of The Arabidopsis Information Resource Web site (http://www.arabidopsis.org/tools/aracyc).

4.
5.
SUMMARY: The Affymetrix GeneChip Arabidopsis genome array has proved to be a very powerful tool for the analysis of gene expression in Arabidopsis thaliana, the most commonly studied plant model organism. VIZARD is a Java program created at the University of California, Berkeley, to facilitate analysis of Arabidopsis GeneChip data. It includes several integrated tools for filtering, sorting, clustering and visualization of gene expression data as well as tools for the discovery of regulatory motifs in upstream sequences. VIZARD also includes annotation and upstream sequence databases for the majority of genes represented on the Affymetrix Arabidopsis GeneChip array. AVAILABILITY: VIZARD is available free of charge for educational, research, and not-for-profit purposes, and can be downloaded at http://www.anm.f2s.com/research/vizard/ CONTACT: moseyko@uclink4.berkeley.edu

6.
7.
8.
Large-scale single-pass sequencing of cDNAs from different plants has provided an extensive reservoir for the cloning of genes, the evaluation of tissue-specific gene expression, markers for map-based cloning, and the annotation of genomic sequences. Although as of January 2000 GenBank contained over 220,000 entries of expressed sequence tags (ESTs) from plants, most publicly available plant ESTs are derived from vegetative tissues and relatively few ESTs are specifically derived from developing seeds. However, important morphogenetic processes are exclusively associated with seed and embryo development and the metabolism of seeds is tailored toward the accumulation of economically valuable storage compounds such as oil. Here we describe a new set of ESTs from Arabidopsis, which has been derived from 5- to 13-d-old immature seeds. Close to 28,000 cDNAs have been screened by DNA/DNA hybridization and approximately 10,500 new Arabidopsis ESTs have been generated and analyzed using different bioinformatics tools. Approximately 40% of the ESTs currently have no match in dbEST, suggesting many represent mRNAs derived from genes that are specifically expressed in seeds. Although these data can be mined with many different biological questions in mind, this study emphasizes the import of photosynthate into developing embryos, its conversion into seed oil, and the regulation of this pathway.

9.
Sesame (Sesamum indicum) is an important oilseed crop that produces seeds with 50% oil, a distinct flavor, and antioxidant lignans. Because sesame lignans are known to have antioxidant and health-protecting properties, metabolic pathways for lignans have been of interest in developing sesame seeds. As an initial approach to identify genes involved in accumulation of storage products and in the biosynthesis of antioxidant lignans, 3328 expressed sequence tags (ESTs) were obtained from a cDNA library of immature seeds 5-25 days old. ESTs were clustered and analyzed by the BLASTX or FASTAX program against the GenBank NR and Arabidopsis proteome databases. To compare gene expression profiles during development of green and non-green seeds, a comparative analysis was carried out between developing sesame and Arabidopsis seed ESTs. Analyses of these two seed EST sets have helped to identify similar and different gene expression profiles during seed development, and to identify a large number of sesame seed-specific genes. In particular, we have identified EST candidates for genes possibly involved in biosynthesis of the sesame lignans sesamin and sesamolin, and also suggest a possible metabolic pathway for the generation of cofactors required for synthesis of storage lipid in non-green oilseeds. Seed-specific expression of several candidate genes has been confirmed by northern blot analysis.

10.
11.
12.
The development of efficient DNA sequencing methods has made it possible to determine the DNA sequences of entire genomes from (to date) 55 prokaryotes, 5 eukaryotic organisms and 10 eukaryotic chromosomes. Thus, an enormous amount of DNA sequence data is available and even more will be forthcoming in the near future. Analysis of this overwhelming amount of data requires bioinformatic tools in order to identify genes that encode functional proteins or RNA. This is an important task, considering that even in the well-studied Escherichia coli more than 30% of the identified open reading frames are hypothetical genes. Future challenges of genome sequence analysis will include the understanding of gene regulation and metabolic pathway reconstruction including DNA chip technology, which holds tremendous potential for biomedicine and the biotechnological production of valuable compounds. The overwhelming volume of information often confuses scientists. This review intends to provide a guide to choosing the most efficient way to analyze a new sequence or to collect information on a gene or protein of interest by applying current publicly available databases and Web services. Recently developed tools that allow functional assignment of genes, mainly based on sequence similarity of the deduced amino acid sequence, using the currently available and increasing biological databases will be discussed.

13.
Proanthocyanidins (PAs) are the main products of the flavonoid biosynthetic pathway in seeds, but their biological function during seed germination is still unclear. We observed that seed germination is delayed with the increase of exogenous PA concentration in Arabidopsis. A similar inhibitory effect occurred in peeled Brassica napus seeds, which was observed by measuring radicle elongation. Using abscisic acid (ABA), a biosynthetic and metabolic inhibitor, and gene expression analysis by real-time polymerase chain reaction, we found that the inhibitory effect of PAs on seed germination is due to their promotion of ABA via de novo biogenesis, rather than by any inhibition of its degradation. Consistent with the relationship between PA content and ABA accumulation in seeds, PA-deficient mutants maintain a lower level of ABA compared with wild-types during germination. Our data suggest that PA distribution in the seed coat can act as a doorkeeper to seed germination. PA regulation of seed germination is mediated by the ABA signaling pathway.

14.
A Geographical Information System (GIS) is used to analyse allelic information of 13 sequenced loci of natural populations of Arabidopsis thaliana and to identify geographical structures. GIS provides tools for visualization and analysis of geographical population structures using molecular data. The geographical distribution of the number of variable positions in the alignments, the distribution of recombinant sequence blocks, and the distribution of a newly defined measure, the differentiation index, are studied. The differentiation index is introduced to measure the sequence divergence among individual plants sampled from various geographical localities. The numbers of variable positions and the differentiation index are also used for a metadata analysis covering about 26 kb of the genome. This analysis reveals, for the first time, differences in DNA sequence structures of geographically different populations of A. thaliana. The broadly defined west Mediterranean region consists of accessions with the highest numbers of polymorphic positions followed by the west European region. The GIS technology Kriging is used to define Arabidopsis specific diversity zones in Europe. The highest genetic variability is observed along the Atlantic coast from the western Iberian Peninsula to southern Great Britain, while lowest variability is found in central Europe.

15.
We present a high‐resolution map of genomic transformation‐competent artificial chromosome (TAC) clones extending over all Arabidopsis thaliana (Arabidopsis) chromosomes. The Arabidopsis genomic TAC clones have been valuable genetic tools. Previously, we constructed an Arabidopsis genomic TAC library consisting of more than 10 000 TAC clones harboring large genomic DNA fragments extending over the whole Arabidopsis genome. Here, we determined 13 577 end sequences from 6987 Arabidopsis TAC clones and mapped 5937 TAC clones to precise locations, covering approximately 90% of the Arabidopsis chromosomes. We present the large‐scale data set of TAC clones with high‐resolution mapping information as a Java application tool, the Arabidopsis TAC Position Viewer, which provides ready‐to‐go transformable genomic DNA clones corresponding to certain loci on Arabidopsis chromosomes. The TAC clone resources will accelerate genomic DNA cloning, positional walking, complementation of mutants and DNA transformation for heterologous gene expression.
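
The clone counts quoted in the abstract above can be put in proportion with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the figures are taken directly from the abstract. Note that the fraction of end-sequenced clones that were mapped is a different quantity from the roughly 90% chromosome coverage the authors report.

```python
# Figures quoted in the abstract; the names are illustrative, not the authors'.
END_SEQUENCES = 13_577            # end sequences determined
CLONES_END_SEQUENCED = 6_987      # TAC clones that were end-sequenced
CLONES_MAPPED = 5_937             # TAC clones mapped to precise locations

# Average number of end sequences obtained per clone (at most 2: one per end).
ends_per_clone = END_SEQUENCES / CLONES_END_SEQUENCED

# Share of end-sequenced clones that could be placed on the chromosomes.
mapped_fraction = CLONES_MAPPED / CLONES_END_SEQUENCED

print(f"{ends_per_clone:.2f} end sequences per clone")
print(f"{mapped_fraction:.1%} of end-sequenced clones mapped")
```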

16.
MOTIVATION: DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. RESULTS: We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. AVAILABILITY: A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Contact :

17.
We analyzed the complete genome sequence of Arabidopsis thaliana and sequence data from 83 genes in the outcrossing A. lyrata, to better understand the role of gene expression on the strength of natural selection on synonymous and replacement sites in Arabidopsis. From data on tRNA gene abundance, we find a good concordance between codon preferences and the relative abundance of isoaccepting tRNAs in the complete A. thaliana genome, consistent with models of translational selection. Both EST-based and new quantitative measures of gene expression (MPSS) suggest that codon preferences derived from information on tRNA abundance are more strongly associated with gene expression than those obtained from multivariate analysis, which provides further support for the hypothesis that codon bias in Arabidopsis is under selection mediated by tRNA abundance. Consistent with previous results, analysis of protein evolution reveals a significant correlation between gene expression level and amino acid substitution rate. Analysis by MPSS estimates of gene expression suggests that this effect is primarily the result of a correlation between the number of tissues in which a gene is expressed and the rate of amino acid substitution, which indicates that the degree of tissue specialization may be an important determinant of the rate of protein evolution in Arabidopsis.

18.
MetaCyc (http://metacyc.org) contains experimentally determined biochemical pathways to be used as a reference database for metabolism. In conjunction with the Pathway Tools software, MetaCyc can be used to computationally predict the metabolic pathway complement of an annotated genome. To increase the breadth of pathways and enzymes, more than 60 plant-specific pathways have been added or updated in MetaCyc recently. In contrast to MetaCyc, which contains metabolic data for a wide range of organisms, AraCyc is a species-specific database containing only enzymes and pathways found in the model plant Arabidopsis (Arabidopsis thaliana). AraCyc (http://arabidopsis.org/tools/aracyc/) was the first computationally predicted plant metabolism database derived from MetaCyc. Since its initial computational build, AraCyc has been under continued curation to enhance data quality and to increase breadth of pathway coverage. Twenty-eight pathways have been manually curated from the literature recently. Pathway predictions in AraCyc have also been recently updated with the latest functional annotations of Arabidopsis genes that use controlled vocabulary and literature evidence. AraCyc currently features 1,418 unique genes mapped onto 204 pathways with 1,156 literature citations. The Omics Viewer, a user data visualization and analysis tool, allows a list of genes, enzymes, or metabolites with experimental values to be painted on a diagram of the full pathway map of AraCyc. Other recent enhancements to both MetaCyc and AraCyc include implementation of an evidence ontology, which has been used to provide information on data quality, expansion of the secondary metabolism node of the pathway ontology to accommodate curation of secondary metabolic pathways, and enhancement of the cellular component ontology for storing and displaying enzyme and pathway locations within subcellular compartments.

19.
The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee and the North American Arabidopsis Steering Committee. There are extensive tools and resources for information storage, curation, and retrieval of Arabidopsis data that have been developed over recent years primarily through the activities of The Arabidopsis Information Resource, the Nottingham Arabidopsis Stock Centre, and the Arabidopsis Biological Resource Center, among others. However, the rapid expansion in many data types, the international basis of the Arabidopsis community, and changing priorities of the funding agencies all suggest the need for changes in the way informatics infrastructure is developed and maintained. We propose that there is a need for a single core resource that is integrated into a larger international consortium of investigators. We envision this to consist of a distributed system of data, tools, and resources, accessed via a single information portal and funded by a variety of sources, under shared international management of an International Arabidopsis Informatics Consortium (IAIC). This article outlines the proposal for the development, management, operations, and continued funding for the IAIC.
The Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering Committee (NAASC) hosted workshops in Nottingham, UK (April 15 to 16, 2010) and Washington DC (May 10 to 11, 2010) to consider the future bioinformatics needs of the Arabidopsis community as well as other science communities that depend vitally on Arabidopsis resources. The outcomes of both workshops were presented and discussed at the International Conference on Arabidopsis Research (ICAR) in Yokohama, Japan.
The focus of the workshops was on Arabidopsis because of its unique and essential role as a reference organism for all seed plant species. The development of the highly annotated “gold standard” Arabidopsis genome sequence has been an invaluable resource for plant and crop sciences. This platform provides important information and working practices for other species and for comparative genomic and evolutionary studies. Arabidopsis tools and resources for information storage, curation, and retrieval have been developed over recent years primarily through the activities of The Arabidopsis Information Resource (TAIR), the Nottingham Arabidopsis Stock Centre (NASC), and the Arabidopsis Biological Resource Center, among others. However, the Arabidopsis community and funding agencies recognize the need for a single data management infrastructure. The key challenge is to develop and fund this resource in a sustainable and transparent manner.
Global challenges surrounding food and energy security require intelligent plant breeding strategies that will be dependent on a central Arabidopsis information resource to aid our understanding of gene function and associated phenotype in many different environments. The knowledge accrued in Arabidopsis informs our understanding of the genetic basis of plant processes and crop traits. To date, this has accumulated primarily through analysis of single genes. However, gene products do not act alone but rather in complex interacting networks. Thus, the challenge for the Arabidopsis community is to understand this higher level of complexity, to a significant extent through the application of new high volume, quantitative experimental techniques. The goals of these efforts are to develop gene/protein/metabolite networks that will enable systems-level modeling of plant processes and ultimately to translate these findings to crop plants.
To achieve these goals, we must develop novel approaches to data management, integration, and access.
The UK workshop addressed three principal issues: the types of data generated by the Arabidopsis community, the types of data used by the community, and future needs of the community. The objective was to produce recommendations for the type of infrastructure necessary to address the challenges and opportunities associated with the application of new technologies and recommendations for a sustainable funding model to support this infrastructure. These recommendations were considered and expanded upon at the US workshop with the ultimate goal of generating solutions to the issues discussed in the first meeting. It was recognized that cohesive, cooperative, and long-term international collaboration will be critical to successfully maintain an Arabidopsis database infrastructure that is essential for plant biology research worldwide.
The workshop participants concluded that there is a continued need for a central Arabidopsis information resource, based on the productivity of the Arabidopsis community and the critical importance of the findings generated by this community. For example, ∼3000 Arabidopsis publications are currently published in peer-reviewed journals each year, a nearly 10-fold increase since the early 1990s; and in 2009, TAIR was accessed by 335,692 unique visitors and had nearly 20 million page views. Furthermore, the importance of a current, well-organized, and carefully curated Arabidopsis genome to researchers studying other plants, including crops, cannot be overstated. In the future, this resource should be part of a larger infrastructure that would be dynamic and responsive to new directions in plant biology research.

20.
Complete structure of the chloroplast genome of Arabidopsis thaliana.   Cited 7 times in total (self-citations: 0; citations by others: 7)
The complete nucleotide sequence of the chloroplast genome of Arabidopsis thaliana has been determined. The genome is a circular DNA of 154,478 bp containing a pair of inverted repeats of 26,264 bp, which are separated by small and large single copy regions of 17,780 bp and 84,170 bp, respectively. A total of 87 potential protein-coding genes including 8 genes duplicated in the inverted repeat regions, 4 ribosomal RNA genes and 37 tRNA genes (30 gene species) representing 20 amino acid species were assigned to the genome on the basis of similarity to the chloroplast genes previously reported for other species. The translated amino acid sequences from respective potential protein-coding genes showed 63.9% to 100% sequence similarity to those of the corresponding genes in the chloroplast genome of Nicotiana tabacum, indicating the occurrence of significant diversity in the chloroplast genes between two dicot plants. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
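
The region lengths reported in the abstract above should add up to the total genome size, since the circular genome consists of two inverted repeats plus the small and large single copy regions. The following is a minimal consistency check under that assumption; the function and constant names are hypothetical and the figures are those quoted in the abstract.

```python
# Region lengths in base pairs, as quoted in the abstract.
INVERTED_REPEAT_BP = 26_264       # length of each of the two inverted repeats
SMALL_SINGLE_COPY_BP = 17_780
LARGE_SINGLE_COPY_BP = 84_170
REPORTED_GENOME_BP = 154_478


def genome_length(ir_bp: int, ssc_bp: int, lsc_bp: int) -> int:
    """Sum the regions of a circular genome with two inverted repeats."""
    return 2 * ir_bp + ssc_bp + lsc_bp


total_bp = genome_length(INVERTED_REPEAT_BP, SMALL_SINGLE_COPY_BP, LARGE_SINGLE_COPY_BP)
print(total_bp, total_bp == REPORTED_GENOME_BP)
```

With the quoted figures, 2 × 26,264 + 17,780 + 84,170 = 154,478 bp, which matches the reported total.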
