Similar articles
 Found 20 similar articles; search took 31 ms
1.
2.
Quantitative trait locus (QTL) analysis is a powerful method for localizing disease genes, but identifying the causal gene remains difficult. Rodent models of disease facilitate QTL gene identification, and causal genes underlying rodent QTLs are often associated with the corresponding human diseases. Recently developed bioinformatics methods, including comparative genomics, combined cross analysis, interval-specific and genome-wide haplotype analysis, followed by sequence and expression analysis, each facilitated by public databases, provide new tools for narrowing rodent QTLs. Here we discuss each tool, illustrate its application and propose a bioinformatics strategy for narrowing QTLs. Combining these bioinformatics tools with classical experimental methods should accelerate QTL gene identification.

3.
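Mining potential disease-causing genes from protein-protein interaction networks on the basis of functional consistency is crucial for understanding disease mechanisms and improving clinical treatment. Using the functional consistency of genes and their topological properties in the protein interaction network, associations were established between genes and diseases, the pathogenic risk of genes within disease risk loci was predicted, and the candidates were further filtered by GO and KEGG functional enrichment analysis to predict novel disease genes. In total, 51 novel coronary heart disease genes were predicted, and analysis showed that most of them participate in the pathogenic process of coronary heart disease. This work offers a new approach to disease gene mining and should aid research into the mechanisms of complex diseases.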

4.
The past decade has witnessed a rapid transition from the first positional cloning of an infectious disease susceptibility gene (Slc11a1, also called Nramp1) in the mouse to genome-wide scans in human multicase families and the identification of potential disease-causing genes by simple inspection of the public human genome databases. Pathogen genome projects have facilitated multilocus sequence typing of pathogen isolates and studies of ecological fitness and virulence patterns in disease-causing isolates. Comparative sequence analysis of pathogen strains and functional genomics studies are now underway, hopefully providing new insight into infectious disease susceptibility.

5.
6.
In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5'-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5' ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential.

7.
8.
The application of DNA microarray technology to the analysis of gene expression creates enormous opportunities to accelerate the understanding of living systems and the identification of target genes and pathways for drug development and therapeutic intervention. Parallel monitoring of the expression profiles of thousands of genes seems particularly promising for a deeper understanding of cancer biology and the identification of molecular signatures supporting the histological classification schemes of neoplastic specimens. However, the increasing volume of data generated by microarray experiments poses the challenge of developing equally efficient methods and analysis procedures to extract, interpret, and upgrade the information content of these databases. Herein, a computational procedure for pattern identification, feature extraction, and classification of gene expression data through the analysis of an autoassociative neural network model is described. The identified patterns and features contain critical information about gene-phenotype relationships observed during changes in cell physiology. They represent a rational and dimensionally reduced base for understanding the basic biology of the onset of diseases, defining targets of therapeutic intervention, and developing diagnostic tools for the identification and classification of pathological states. The proposed method has been tested on two different microarray datasets: Golub's analysis of acute human leukemia [Golub et al. (1999) Science 286:531-537], and the human colon adenocarcinoma study presented by Alon et al. [1999; Proc Natl Acad Sci USA 97:10101-10106]. The analysis of the neural network internal structure allows the identification of specific phenotype markers and the extraction of particular associations among genes and physiological states. At the same time, the neural network outputs provide assignment to multiple classes, such as different pathological conditions or tissue samples, for previously unseen instances.

9.
The identification and classification of genes and pseudogenes in duplicated regions remains a challenge for standard automated genome annotation procedures. Using an integrated homology and orthology analysis independent of current gene annotation, we have identified 9,484 and 9,017 gene duplicates in human and mouse, respectively. On the basis of the integrity of their coding regions, we have classified them into functional and inactive duplicates, allowing us to define the first consistent and comprehensive collection of 1,811 human and 1,581 mouse unprocessed pseudogenes. Furthermore, of the total of 14,172 human and mouse duplicates predicted to be functional genes, as many as 420 are not included in current reference gene databases and therefore correspond to likely novel mammalian genes. Some of these correspond to partial duplicates with less than half of the length of the original source genes, yet they are conserved and syntenic among different mammalian lineages. The genes and unprocessed pseudogenes obtained here will enable further studies on the mechanisms involved in gene duplication as well as on the fate of duplicated genes.

10.
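A stepwise search strategy for protein identification in human proteome expression profiling   Cited by: 3 (self-citations: 0; citations by others: 3)
吴松锋 (Wu Songfeng)  朱云平 (Zhu Yunping)  贺福初 (He Fuchu) 《遗传》2005,27(5):687-693
Protein identification in large-scale proteome expression profiling generally relies on database searching, so the choice of database and search strategy is critical. Existing human protein databases are far from complete, and protein databases from other species can fill only a small part of the gap, whereas human genome databases may offer considerable room for supplementation. After thoroughly surveying and comparing international human protein databases, we propose a stepwise search strategy. The strategy first uses a high-quality non-redundant database with relatively broad coverage for basic identification, then uses other protein and nucleic acid databases for supplementary identification and novel protein discovery. It identifies as many high-confidence proteins as possible and makes fuller use of the mass spectrometry data for supplementary identification and novel protein discovery, which is of considerable value for large-scale proteome expression profiling.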
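
The stepwise idea in this abstract can be sketched as follows. This is only an illustration under strong assumptions: each "database" is modeled as a plain dictionary from a spectrum identifier to a protein name, standing in for a real search engine run, and the function name `stepwise_identify` and all identifiers are hypothetical rather than taken from the paper.

```python
def stepwise_identify(spectra, primary_db, supplementary_dbs):
    """Search the primary database first, then pass only the
    still-unmatched spectra on to each supplementary database in turn.

    Databases are toy dicts {spectrum_id: protein}; this is an assumption
    made for illustration, not the authors' implementation.
    """
    identified = {}
    remaining = list(spectra)
    stages = [("primary", primary_db)] + list(supplementary_dbs)
    for db_name, db in stages:
        still_unmatched = []
        for spectrum in remaining:
            hit = db.get(spectrum)
            if hit is not None:
                # Record the protein and the stage that identified it.
                identified[spectrum] = (hit, db_name)
            else:
                still_unmatched.append(spectrum)
        remaining = still_unmatched
    return identified, remaining


# Hypothetical toy data.
spectra = ["s1", "s2", "s3"]
primary = {"s1": "P1"}
supplementary = [("genome", {"s2": "novelORF"})]
identified, remaining = stepwise_identify(spectra, primary, supplementary)
```

Recording the stage alongside each hit makes it easy to separate the high-confidence basic identifications from the supplementary ones afterwards.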

11.
Genome-wide screening of sequence databases for human endogenous retroviruses (HERVs) has led to the identification of 18 coding env genes, among which two, the syncytin genes, encode fusogenic ENV proteins possibly involved in placenta physiology. Here we show that a third ENV, originating from the most "recent" HERV-K(HML2) family, is functional. Immunofluorescence analysis of env-transduced cells demonstrates expression of the protein at the cell surface, and we show that the protein confers infectivity to simian immunodeficiency virus pseudotypes. Western blot analysis of the pseudotyped virions further discloses the expected specific cleavage of the ENV precursor protein. This functional ENV could play a role in the amplification, via infection of the germ line, of the HERV-K genomic copies, all the more as coding HERV-K gag and pol genes can similarly be found in the human genome, which could therefore generate infectious virions of a fully endogenous origin.

12.
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000-100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.

13.
14.
15.
16.
The number of online databases and web-tools for gene expression analysis in Arabidopsis thaliana has increased tremendously in recent years. These resources permit the database-assisted identification of putative cis-regulatory DNA sequences, their binding proteins, and the determination of common cis-regulatory motifs in coregulated genes. DNA binding proteins may be predicted by the type of cis-regulatory motif. Further questions of combinatorial control based on the interaction of DNA binding proteins and the colocalization of cis-regulatory motifs can be addressed. The database-assisted spatial and temporal expression analysis of DNA binding proteins and their target genes may help to further refine experimental approaches. Signal transduction pathways upstream of regulated genes are not yet fully accessible in databases, mainly because they need to be manually annotated. This review focuses on the use of the AthaMap and PathoPlant® databases for gene expression regulation analysis and discusses similar and complementary online databases and web-tools. Online databases are helpful for the development of working hypotheses and for designing subsequent experiments.

17.
Availability of genome sequences of pathogens has provided a tremendous amount of information that can be useful in drug target and vaccine target identification. One of the recently adopted strategies is based on a subtractive genomics approach, in which the subtraction dataset between the host and pathogen genome provides information for a set of genes that are likely to be essential to the pathogen but absent in the host. This approach has recently been used successfully to identify essential genes in Pseudomonas aeruginosa. We have used the same methodology to analyse the whole genome sequence of the human gastric pathogen Helicobacter pylori. Our analysis revealed that out of the 1590 coding sequences of the pathogen, 40 represent essential genes that have no human homolog. We have further analysed these 40 genes using protein sequence databases to list some 10 genes whose products are possibly exposed on the pathogen surface. The preliminary work reported here identifies a small subset of the Helicobacter proteome that might be investigated further for identifying potential drug and vaccine targets in this pathogen.
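
The subtraction step described in this abstract amounts to a set difference, which can be sketched as below. This is a toy illustration, not the authors' pipeline: the gene identifiers are made up, and the essential-gene and host-homolog sets are assumed to have been produced beforehand (in practice by homology searches against an essential-gene collection and the host proteome).

```python
def subtract_host_homologs(pathogen_genes, essential, host_homologs):
    """Keep pathogen genes that are essential and have no host homolog.

    All three inputs are collections of gene identifiers; how the
    `essential` and `host_homologs` sets are derived is outside this sketch.
    """
    return sorted(
        gene
        for gene in pathogen_genes
        if gene in essential and gene not in host_homologs
    )


# Hypothetical toy identifiers, not real H. pylori locus tags.
pathogen_genes = ["geneA", "geneB", "geneC", "geneD"]
essential = {"geneA", "geneB", "geneD"}
host_homologs = {"geneB"}

candidates = subtract_host_homologs(pathogen_genes, essential, host_homologs)
print(candidates)  # ['geneA', 'geneD']
```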

18.
With the rapid growth of DNA databases for human and other eukaryotic model organisms, a large number of genes need to be identified in these databases. Exact recognition of translation initiation sites (TISs) of eukaryotic genes is very important for understanding the translation initiation process, predicting the detailed structure of eukaryotic genes, and annotating uncharacterized sequences. The problem has not been solved satisfactorily, especially for recognizing TISs of eukaryotic genes with shorter first exons. Extracting new features and finding more powerful algorithms for recognizing TISs of eukaryotic genes is therefore an important task. In this paper, the important characteristics of shorter flanking fragments around TISs are extracted and an expectation-maximization (EM) algorithm based on incomplete data is used to recognize TISs of eukaryotic genes. The accuracy is up to 87.8% in a six-fold cross-validation test. The result shows that the identification variables are effectively extracted and that the EM algorithm is a powerful tool for predicting the TISs of eukaryotic genes. The algorithm can also be applied to other classification or clustering tasks in bioinformatics.

19.
Kutsche R  Brown CJ 《Genomics》2000,65(1):9-15
The large number of redundant sequences available in nucleotide databases provides a resource for the identification of polymorphisms. Expressed polymorphisms in X-linked genes can be used to determine the inactivation status of the genes, and polymorphisms in genes that are subject to inactivation can then be used as tools to examine X-chromosome inactivation status in heterozygous females. In this study, we have identified six new X-linked single-nucleotide polymorphisms and determined the inactivation status of these genes by examination of expression patterns in female cells previously demonstrated to have skewed inactivation, as well as by analysis of somatic cell hybrids retaining the inactive human X chromosome. Expression was seen from both alleles in females heterozygous for the RPS4X gene, confirming the previously reported expression from the inactive X chromosome. Expression of only a single allele was seen in females heterozygous for polymorphisms in the BGN, TM4SF2, ATP6S1, VBP1, and PDHA1 genes, suggesting that these genes are subject to X-chromosome inactivation.

20.