Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The microarray-based analysis of gene expression has become a workhorse for biomedical research. Managing the amount and diversity of data that such experiments produce is a task that must be supported by appropriate software tools, which led to the creation of literally hundreds of systems. In consequence, choosing the right tool for a given project is difficult even for the expert. We report on the results of a survey encompassing 78 such tools, of which 22 were inspected in detail and seven were tested hands-on. We report on our experiences with a focus on completeness of functionality, ease-of-use, and necessary effort for installation and maintenance. Thereby, our survey provides a valuable guideline for any project considering the use of a microarray data management system. It reveals which tasks are covered by mature tools and also shows that important requirements, especially in the area of integrated analysis of different experimental data, are not yet met satisfactorily by existing systems.

2.
This paper describes an open-source system for analyzing, storing, and validating proteomics information derived from tandem mass spectrometry. It is based on a combination of data analysis servers, a user interface, and a relational database. The database was designed to store the minimum amount of information necessary to search and retrieve data obtained from the publicly available data analysis servers. Collectively, this system was referred to as the Global Proteome Machine (GPM). The components of the system have been made available as open source development projects. A publicly available system has been established, comprised of a group of data analysis servers and one main database server.

3.
4.
Semple CA  Evans KL  Porteous DJ 《Genome biology》2001,2(3):comment2003.1-comment20035
Once thought to be impossible or a waste of resources, the initial high-volume stages of sequencing the human genome have been completed.

5.
We report the completely annotated genome sequence of Mycobacterium tuberculosis Erdman (TMC 107; ATCC 35801), which is a well-known laboratory strain of M. tuberculosis.

6.
Pigeonpea (Cajanus cajan) is an important grain legume of the Indian subcontinent, South-East Asia and East Africa. More than eighty-five percent of the world's pigeonpea is produced and consumed in India, where it is a key crop for food and nutritional security of the people. Here we present the first draft of the genome sequence of a popular pigeonpea variety, 'Asha'. The genome was assembled using long sequence reads of 454 GS-FLX sequencing chemistry with mean read lengths of >550 bp and >10-fold genome coverage, resulting in 510,809,477 bp of high quality sequence. A total of 47,004 protein coding genes and 12,511 transposable element-related genes were predicted. We identified 1,213 disease resistance/defense response genes and 152 abiotic stress tolerance genes in the pigeonpea genome that make it a hardy crop. In comparison to soybean, pigeonpea has relatively fewer genes for lipid biosynthesis and a larger number of genes for cellulose synthesis. The sequence contigs were arranged into 59,681 scaffolds, which were anchored to eleven chromosomes of pigeonpea with 347 genic-SNP markers of an intra-species reference genetic map. Eleven pigeonpea chromosomes showed low but significant synteny with the twenty chromosomes of soybean. The genome sequence was used to identify a large number of hypervariable 'Arhar' simple sequence repeat (HASSR) markers, 437 of which were experimentally validated for PCR amplification and high rate of polymorphism among pigeonpea varieties. These markers will be useful for fingerprinting and diversity analysis of pigeonpea germplasm and molecular breeding applications. This is the first plant genome sequence completed entirely through a network of Indian institutions led by the Indian Council of Agricultural Research and provides a valuable resource for pigeonpea variety improvement.

7.
Laboratories working with draft phase genomes have specific software needs, such as the unattended processing of hundreds of single scaffolds and subsequent sequence annotation. In addition, it is critical to follow the "movement" and the manual annotation of single open reading frames (ORFs) within the successive sequence updates. Even with finished genomes, regular database updates can lead to significant changes in the annotation of single ORFs. In functional genomics it is important to mine data and identify new genetic targets rapidly and easily. Often there is no need for sophisticated relational databases (RDB) that greatly reduce the system-independent access of the results. Another aspect is the internet dependency of most software packages. If users are working with confidential data, this dependency poses a security issue. GAMOLA was designed to handle the numerous scaffolds and changing contents of draft phase genomes in an automated process and stores the results for each predicted ORF in flatfile databases. In addition, annotation transfers, ORF designation tracking, Blast comparisons, and primer design for whole genome microarrays have been implemented. The software is available under the license of North Carolina State University. A website and a downloadable example are accessible at http://fsweb2.schaub.ncsu.edu/TRKwebsite/index.htm.

8.
9.
Debate exists over how to incorporate information from multipartite sequence data in phylogenetic analyses. Strict combined-data approaches argue for concatenation of all partitions and estimation of one evolutionary history, maximizing the explanatory power of the data. Consensus/independence approaches endorse a two-step procedure where partitions are analyzed independently and then a consensus is determined from the multiple results. Mixtures across the model space of a strict combined-data approach and a priori independent parameters are popular methods to integrate these methods. We propose an alternative middle ground by constructing a Bayesian hierarchical phylogenetic model. Our hierarchical framework enables researchers to pool information across data partitions to improve estimate precision in individual partitions while permitting estimation and testing of tendencies in across-partition quantities. Such across-partition quantities include the distribution from which individual topologies relating the sequences within a partition are drawn. We propose standard hierarchical priors on continuous evolutionary parameters across partitions, while the structure on topologies varies depending on the research problem. We illustrate our model with three examples. We first explore the evolutionary history of the guinea pig (Cavia porcellus) using alignments of 13 mitochondrial genes. The hierarchical model returns substantially more precise continuous parameter estimates than an independent parameter approach without losing the salient features of the data. Second, we analyze the frequency of horizontal gene transfer using 50 prokaryotic genes. We assume an unknown species-level topology and allow individual gene topologies to differ from this with a small estimable probability. Simultaneously inferring the species and individual gene topologies returns a transfer frequency of 17%. We also examine HIV sequences longitudinally sampled from HIV+ patients. We ask whether posttreatment development of CCR5 coreceptor virus represents concerted evolution from middisease CXCR4 virus or reemergence of initial infecting CCR5 virus. The hierarchical model pools partitions from multiple unrelated patients by assuming that the topology for each patient is drawn from a multinomial distribution with unknown probabilities. Preliminary results suggest evolution and not reemergence.

10.
The Japanese eel is a much appreciated research object and very important for Asian aquaculture; however, its genomic resources are still limited. We have used a streamlined bioinformatics pipeline for the de novo assembly of the genome sequence of the Japanese eel from raw Illumina sequence reads. The total assembled genome has a size of 1.15 Gbp, which is divided over 323,776 scaffolds with an N50 of 52,849 bp, a minimum scaffold size of 200 bp and a maximum scaffold size of 1.14 Mbp. Direct comparison of a representative set of scaffolds revealed that all the Hox genes and their intergenic distances are almost perfectly conserved between the European and the Japanese eel. The first draft genome sequence of an organism strongly catalyzes research progress in multiple fields. Therefore, the Japanese eel genome sequence will provide a rich resource of data for all scientists working on this important fish species.

11.
Vampirovibrio chlorellavorus is recognized as a pathogen of commercially‐relevant Chlorella species. Algal infection and total loss of productivity (biomass) often occur when susceptible algal hosts are cultivated in outdoor open pond systems. The pathogenic life cycle of this bacterium has been inferred from laboratory and field observations, and corroborated in part by the genomic analyses for two Arizona isolates recovered from an open algal reactor. V. chlorellavorus predation has been reported to occur in geographically‐ and environmentally‐diverse conditions. Genomic analyses of these and additional field isolates are expected to reveal new information about the extent of ecological diversity and genes involved in host‐pathogen interactions. The draft genome sequences for two isolates of the predatory V. chlorellavorus (Cyanobacteria; Ca. Melainabacteria) from an outdoor cultivation system located in the Arizona Sonoran Desert were assembled and annotated. The genomes were sequenced and analyzed to identify genes (proteins) with predicted involvement in predation, infection, and cell death of Chlorella host species prioritized for biofuel production at sites identified as highly suitable for algal production in the southwestern USA. Genomic analyses identified several predicted genes encoding secreted proteins that are potentially involved in pathogenicity, and at least three apparently complete sets of virulence (Vir) genes, characteristic of the VirB‐VirD type system encoding the canonical VirB1‐11 and VirD4 proteins, respectively. Additional protein functions were predicted suggesting their involvement in quorum sensing and motility. The genomes of two previously uncharacterized V. chlorellavorus isolates reveal nucleotide and protein level divergence between each other, and a previously sequenced V. chlorellavorus genome. This new knowledge will enhance the fundamental understanding of trans‐kingdom interactions between a unique cosmopolitan cyanobacterial pathogen and its green microalgal host, of broad interest as a source of harvestable biomass for biofuels or bioproducts.

12.
Taylor MS  Semple CA 《Genome biology》2002,3(9):reviews1025.1-reviews10256
The publication of the Fugu rubripes draft genome sequence will take this fish from culinary delicacy to potent tool in deciphering the mysteries of human genome function.

13.
14.
SUMMARY: WebBLAST is a suite of programs intended to assist in organizing sequencing data and to provide first-pass sequence analysis in an automated fashion. Data processing is fully automated, with end-users being presented both graphical and tabular summaries of data that can be viewed using any Web browser. AVAILABILITY: The program is free and available at http://genome.nhgri.nih.gov/webblast.

15.
The use of the Monsanto draft rice genome sequence in research   Cited by: 21 in total (self-citations: 0; citations by others: 21)
Barry GF 《Plant physiology》2001,125(3):1164-1165

16.
Bacteria of the genus Citricoccus have been isolated from ecological niches characterized by diverse abiotic stress conditions. Here we report the first genome draft of a strain of the genus Citricoccus isolated from the extremely oligotrophic Churince system in the Cuatro Ciénegas Basin (CCB) in Coahuila, Mexico.

17.
Glycine latifolia (Benth.) Newell & Hymowitz (2n = 40), one of the 27 wild perennial relatives of soybean, possesses genetic diversity and agronomically favorable traits that are lacking in soybean. Here, we report the 939‐Mb draft genome assembly of G. latifolia (PI 559298) using exclusively linked‐reads sequenced from a single Chromium library. We organized scaffolds into 20 chromosome‐scale pseudomolecules utilizing two genetic maps and the Glycine max (L.) Merr. genome sequence. High copy numbers of putative 91‐bp centromere‐specific tandem repeats were observed in consecutive blocks within predicted pericentromeric regions on several pseudomolecules. No 92‐bp putative centromeric repeats, which are abundant in G. max, were detected in G. latifolia or Glycine tomentella. Annotation of the assembled genome and subsequent filtering yielded a high confidence gene set of 54 475 protein‐coding loci. In comparative analysis with five legume species, genes related to defense responses were significantly overrepresented in Glycine‐specific orthologous gene families. A total of 304 putative nucleotide‐binding site (NBS)‐leucine‐rich‐repeat (LRR) genes were identified in this genome assembly. Different from other legume species, we observed a scarcity of TIR‐NBS‐LRR genes in G. latifolia. The G. latifolia genome was also predicted to contain genes encoding 367 LRR‐receptor‐like kinases, a family of proteins involved in basal defense responses and responses to abiotic stress. The genome sequence and annotation of G. latifolia provides a valuable source of alternative alleles and novel genes to facilitate soybean improvement. This study also highlights the efficacy and cost‐effectiveness of the application of Chromium linked‐reads in diploid plant genome de novo assembly.

18.
O'Brien HE  Gong Y  Fung P  Wang PW  Guttman DS 《PloS one》2011,6(11):e27199
Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.

19.
PlasmoDB (http://PlasmoDB.org) is the official database of the Plasmodium falciparum genome sequencing consortium. This resource incorporates finished and draft genome sequence data and annotation emerging from Plasmodium sequencing projects. PlasmoDB currently houses information from five parasite species and provides tools for cross-species comparisons. Sequence information is also integrated with other genomic-scale data emerging from the Plasmodium research community, including gene expression analysis from EST, SAGE and microarray projects. The relational schemas used to build PlasmoDB [Genomics Unified Schema (GUS) and RNA Abundance Database (RAD)] employ a highly structured format to accommodate the diverse data types generated by sequence and expression projects. A variety of tools allow researchers to formulate complex, biologically based queries of the database. A version of the database is also available on CD-ROM (Plasmodium GenePlot), facilitating access to the data in situations where Internet access is difficult (e.g. by malaria researchers working in the field). The goal of PlasmoDB is to enhance utilization of the vast quantities of data emerging from genome-scale projects by the global malaria research community.

20.
Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set is comprised of 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects.

