Similar articles
Found 20 similar articles (search time: 31 ms)
1.
2.
Using the transcriptome to annotate the genome   Total citations: 35 (self-citations: 0, citations by others: 35)
A remaining challenge for the human genome project involves the identification and annotation of expressed genes. The public and private sequencing efforts have identified approximately 15,000 sequences that meet stringent criteria for genes, such as correspondence with known genes from humans or other species, and have made another approximately 10,000-20,000 gene predictions of lower confidence, supported by various types of in silico evidence, including homology studies, domain searches, and ab initio gene predictions. These computational methods have limitations, both because they are unable to identify a significant fraction of genes and exons and because they are unable to provide definitive evidence about whether a hypothetical gene is actually expressed. As the in silico approaches identified a smaller number of genes than anticipated, we wondered whether high-throughput experimental analyses could be used to provide evidence for the expression of hypothetical genes and to reveal previously undiscovered genes. We describe here the development of such a method--called long serial analysis of gene expression (LongSAGE), an adaptation of the original SAGE approach--that can be used to rapidly identify novel genes and exons.

3.
4.
5.
The recent completion of a first draft of the human genome has allowed "in silico" genome browsing to become routine. Such computer-based research is now a useful adjunct to experiments based at the bench, and is accelerating gene discovery and the analysis and understanding of genes in their genomic contexts. This review summarises recent findings on genes encoding proteins of the troponin complex. We describe the organization of the three pairs of genes which encode isoforms of troponins I and T, and discuss how this relates to their evolution and regulation. Detailed analysis of the chromosomal context of the cardiac troponin I and slow skeletal troponin T genes reveals a region of densely packed differentially expressed genes, including new genes identified by automatic genome annotation. This information is discussed within the context of detailed analysis of the best-studied gene in this region, cardiac troponin I. In this way, we illustrate the uses to which a combination of conventional bench experiments and "in silico" analyses may be put in understanding the relationship between structure and function within the genome.

6.
In recent years, in silico analysis of common laboratory mice has been introduced and subsequently applied, in slightly different ways, as a methodology for gene mapping. Previously we have demonstrated some limitations of the methodology due to sporadic genetic correlations across the genome. Here, we revisit the three main aspects that affect in silico analysis. First, we report on the use of marker maps: we compared our existing 20,000 SNP map to the newly released 140,000 SNP map. Second, we investigated the effect of varying strain numbers on power to map QTL. Third, we introduced a novel statistical approach: a cladistic analysis, which is well suited for mouse genetics and has increased flexibility over existing in silico approaches. We have found that in our examples of complex traits, in silico analysis by itself fails to uniquely identify quantitative trait gene (QTG)-containing regions. However, when combined with additional information, it may significantly help to prioritize candidate genes. We therefore recommend using an integrated workflow that uses other genomic information such as linkage regions, regions of shared ancestry, and gene expression information to obtain a list of candidate genes from the genome.

7.
Accumulated knowledge of genomic information, systems biology, and disease mechanisms provides an unprecedented opportunity to elucidate the genetic basis of diseases, and to discover novel therapeutic targets from the wealth of genomic data. With hundreds to a few thousand potential targets available in the human genome alone, target selection and validation has become a critical component of the drug discovery process. Exploring the quantitative characteristics of the currently explored targets (those without any marketed drug) and successful targets (targeted by at least one marketed drug) could help discern simple rules for selecting a putative successful target. Here we use integrative in silico (computational) approaches to quantitatively analyze the characteristics of 133 targets with FDA approved drugs and 3120 human disease genes (therapeutic targets) not targeted by FDA approved drugs. This is the first attempt to comparatively analyze targets with FDA approved drugs and targets with no FDA approved drug or no drugs available for them. Our results show that proteins with 5 or fewer homologs outside their own family, proteins with single-exon gene architecture and proteins interacting with more than 3 partners are more likely to be targetable. These quantitative characteristics could serve as criteria to search for promising targetable disease genes.

8.
Microsatellites, or tandem simple sequence repeats (SSRs), have become one of the most popular molecular markers in genome mapping because of their abundance across genomes and because of their high levels of polymorphism. However, information on which genes surround or flank them has remained very limited for most SSRs, especially in livestock species. In this study, an in silico comparative mapping approach was developed to link porcine SSRs to known genome regions by identifying their human orthologs. From a total of 1321 porcine microsatellites used in this study, 228 were found to have blocks in alignment with human genomic sequences. These 228 SSRs span about 1459 cM of the porcine genome, but with uneven distributions, ranging from 2 on SSC12 to 24 on SSC14. Linking these porcine SSRs to the known genome regions in the human genome also revealed 16 new putative synteny groups between these two species. Fifteen SSRs on SSC3 with identified human orthologs were typed on a pig-hamster radiation hybrid (RH) panel and used in a joint analysis with 80 known gene markers previously mapped on SSC3 using the same panel. The analysis revealed that they were all highly linked to either one or both adjacent markers. These results indicated that assigning the porcine SSRs to known genome regions by identifying their human orthologs is a reliable approach. The process will provide a foundation for positional cloning of causative genes for economically important traits.

9.
Liu P  Vikis H  Lu Y  Wang D  You M 《PloS one》2007,2(7):e651
Understanding the genetic basis of common disease and disease-related quantitative traits will aid in the development of diagnostics and therapeutics. The process of gene discovery can be sped up by rapid and effective integration of well-defined mouse genome and phenome data resources. We describe here an in silico gene-discovery strategy through genome-wide association (GWA) scans in inbred mice with a wide range of genetic variation. We identified 937 quantitative trait loci (QTLs) from a survey of 173 mouse phenotypes, which include models of human disease (atherosclerosis, cardiovascular disease, cancer and obesity) as well as behavioral, hematological, immunological, metabolic, and neurological traits. 67% of QTLs were refined into genomic regions <0.5 Mb with approximately 40-fold increase in mapping precision as compared with classical linkage analysis. This makes for more efficient identification of the genes that underlie disease. We have identified two QTL genes, Adam12 and Cdh2, as causal genetic variants for atherogenic diet-induced obesity. Our findings demonstrate that GWA analysis in mice has the potential to resolve multiple tightly linked QTLs and achieve single-gene resolution. These high-resolution QTL data can serve as a primary resource for positional cloning and gene identification in the research community.

10.
Complete genome sequences of several pathogenic bacteria have been determined, and many more such projects are currently under way. While these data potentially contain all the determinants of host-pathogen interactions and possible drug targets, computational tools for selecting suitable candidates for further experimental analyses are currently limited. Detection of bacterial genes that are non-homologous to human genes and are essential for the survival of the pathogen represents a promising means of identifying novel drug targets. We have used three-way genome comparisons to identify essential genes from Pseudomonas aeruginosa. Our approach identified 306 essential genes that may be considered as potential drug targets. The resultant analyses are in good agreement with the results of systematic gene deletion experiments. This approach enables rapid potential drug target identification, thereby greatly facilitating the search for new antibiotics. These results underscore the utility of large genomic databases for in silico systematic drug target identification in the post-genomic era.

11.
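Using an integrative analysis method based on functional genomic information and network topology information, atherosclerosis (AS) risk disease genes were mined from gene expression profile data and protein-protein interaction data, providing a new perspective for studying atherosclerosis at the genomic level. Two-stage screening, by differential expression analysis and then by a support vector machine (SVM) machine-learning method, can identify risk disease genes with a relatively high level of confidence. This offers a new approach for studying the topological properties of atherosclerosis disease genes in the network and for linking genes to the onset and progression of the disease. The analysis yielded 59 risk disease genes in macrophage samples and 61 risk disease genes in foam cells. These risk genes share most atherosclerotic lesion-related biological processes and signaling pathways with known disease genes. The method can also be applied to studying the pathogenic mechanisms of other complex diseases.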

12.
13.
Recent advances in genome sequencing techniques have improved our understanding of the genotype-phenotype relationship between genetic variants and human diseases. However, genetic variations uncovered from patient populations do not provide enough information to understand the mechanisms underlying the progression and clinical severity of human diseases. Moreover, building a high-resolution genotype-phenotype map is difficult due to the diverse genetic backgrounds of the human population. We built a cross-species genotype-phenotype map to explain the clinical severity of human genetic diseases. We developed a data-integrative framework to investigate network modules composed of human diseases mapped with gene essentiality measured from a model organism. Essential and nonessential genes connect diseases of different types which form clusters in the human disease network. In a large patient population study, we found that disease classes enriched with essential genes tended to show a higher mortality rate than disease classes enriched with nonessential genes. Moreover, high disease mortality rates are explained by the multiple comorbid relationships and the high pleiotropy of disease genes found in the essential gene-enriched diseases. Our results reveal that the genotype-phenotype map of a model organism can facilitate the identification of human disease-gene associations and predict human disease progression.

14.
We report the localization by linkage analysis in the rat genome of 148 new markers derived from 128 distinct known gene sequences, ESTs, and anonymous sequences selected from the GenBank database on the basis of the presence of a repeated element. The composite linkage map of the rat contributed by our group integrates mapping information on a total of 370 different known genes, ESTs, and anonymous mouse or human sequences, and provides a valuable tool for comparative genome analysis. 206 and 254 homologous loci were identified in the mouse and human genomes, respectively. Our linkage map, which combines both anonymous markers and gene markers, should facilitate the advancement of genetic studies for a wide variety of rat models characterized for complete phenotypes. The comparative genome mapping should define genetic regions in human likely to be homologous to susceptibility loci identified in rat and provide useful information for the identification of new potential candidates for genetic disorders. Received: 2 January 1999 / Accepted: 7 March 1999

15.
In this report, a genome-scale reconstruction of Bacillus subtilis metabolism and its iterative development based on the combination of genomic, biochemical, and physiological information and high-throughput phenotyping experiments is presented. The initial reconstruction was converted into an in silico model and expanded in a four-step iterative fashion. First, network gap analysis was used to identify 48 missing reactions that are needed for growth but were not found in the genome annotation. Second, the computed growth rates under aerobic conditions were compared with high-throughput phenotypic screen data, and the initial in silico model could predict the outcomes qualitatively in 140 of 271 cases considered. Detailed analysis of the incorrect predictions resulted in the addition of 75 reactions to the initial reconstruction, and 200 of 271 cases were correctly computed. Third, in silico computations of the growth phenotypes of knock-out strains were found to be consistent with experimental observations in 720 of 766 cases evaluated. Fourth, the integrated analysis of the large-scale substrate utilization and gene essentiality data with the genome-scale metabolic model revealed the requirement of 80 specific enzymes (transport, 53; intracellular reactions, 27) that were not in the genome annotation. Subsequent sequence analysis resulted in the identification of genes that could be putatively assigned to 13 intracellular enzymes. The final reconstruction accounted for 844 open reading frames and consisted of 1020 metabolic reactions and 988 metabolites. Hence, the in silico model can be used to obtain experimentally verifiable hypotheses on the metabolic functions of various genes.

16.
17.
S Blackshaw  R E Fraioli  T Furukawa  C L Cepko 《Cell》2001,107(5):579-589
To identify the full set of genes expressed by mammalian rods, we conducted serial analysis of gene expression (SAGE) by using libraries generated from mature and developing mouse retina. We identified 264 uncharacterized genes that were specific to or highly enriched in rods. Nearly half of all cloned human retinal disease genes are selectively expressed in rod photoreceptors. In silico mapping of the human orthologs of genes identified in our screen revealed that 86 map within intervals containing uncloned retinal disease genes, representing 37 different loci. We expect these data will allow identification of many disease genes, and that this approach may be useful for cloning genes involved in classes of disease where cell type-specific expression of disease genes is observed.

18.
Recent advances in DNA sequencing technology have enabled elucidation of whole genome information from a plethora of organisms. In parallel with this technology, various bioinformatics tools have driven the comparative analysis of the genome sequences between species and within isolates. While drawing meaningful conclusions from a large amount of raw material, computer-aided identification of suitable targets for further experimental analysis and characterization has also led to the prediction of non-human homologous essential genes in bacteria as promising candidates for novel drug discovery. Here, we present a comparative genomic analysis to identify essential genes in Burkholderia pseudomallei. Our in silico prediction has identified 312 essential genes which could also be potential drug candidates. These genes encode essential proteins to support the survival of B. pseudomallei including outer-inner membrane and surface structures, regulators, proteins involved in pathogenicity, adaptation, chaperones as well as degradation of small and macromolecules, energy metabolism, information transfer, central/intermediate/miscellaneous metabolism pathways and some conserved hypothetical proteins of unknown function. Therefore, our in silico approach has enabled rapid screening and identification of potential drug targets for further characterization in the laboratory.

19.
Nucleotide sequence databases: a gold mine for biologists.   Total citations: 5 (self-citations: 0, citations by others: 5)

20.