Similar literature
 Found 20 similar articles; search took 15 ms
1.
We present a biomedical text-mining system focused on four types of gene-related information: biological functions, associated diseases, related genes and gene-gene relations. The aim of this system is to provide researchers with an easy-to-use bio-information service that rapidly surveys the burgeoning biomedical literature. AVAILABILITY: http://iir.csie.ncku.edu.tw/~yuhc/gis/

2.
Motivation: Understanding the complexity in the gene–phenotype relationship is vital for revealing the genetic basis of common diseases. Recent studies on the basis of the human interactome and phenome not only uncover prevalent phenotypic overlap and genetic overlap between diseases, but also reveal a modular organization of the genetic landscape of human diseases, providing new opportunities to reduce the complexity in dissecting the gene–phenotype association. Results: We provide systematic and quantitative evidence that phenotypic overlap implies genetic overlap. With these results, we perform the first heterogeneous alignment of human interactome and phenome via a network alignment technique and identify 39 disease families with corresponding causative gene networks. Finally, we propose AlignPI, an alignment-based framework to predict disease genes, and identify plausible candidates for 70 diseases. Our method scales well to the whole genome, as demonstrated by prioritizing 6154 genes across 37 chromosome regions for Crohn's disease (CD). Results are consistent with a recent meta-analysis of genome-wide association studies for CD. Availability: Bi-modules and disease gene predictions are freely available at the URL http://bioinfo.au.tsinghua.edu.cn/alignpi/ Contact: ruijiang{at}tsinghua.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Trey Ideker

3.
Summary: ROBIN is a web server for analyzing genome rearrangement by block-interchanges between two chromosomal genomes. It takes two or more linear/circular chromosomes as its input, computes the minimum number of block-interchange rearrangements needed to transform one input chromosome into another for any pair of input chromosomes, and also determines an optimal scenario that uses this number of rearrangements. The input can be either bacterial-size sequence data or landmark-order data. If the input is sequence data, ROBIN will automatically search for the identical landmarks that are the homologous/conserved regions shared by all the input sequences. Availability: ROBIN is freely accessible at http://genome.life.nctu.edu.tw/ROBIN Contact: cllu{at}mail.nctu.edu.tw

4.
5.
Motivation: Differential detection of symptom-related pathogens (SRP) is critical for fast identification and accurate control of epidemic diseases. Conventional polymerase chain reaction (PCR) requires a large number of unique primers to amplify selected SRP target sequences. With multiple-use primers (mu-primers), multiple targets can be amplified and detected in one PCR experiment under standard reaction conditions and with reduced detection complexity. However, the time complexity of designing mu-primers with the best heuristic method available is too high. We have formulated the minimum-set mu-primer design problem as a set covering problem (SCP), and used a modified compact genetic algorithm (MCGA) to solve this problem optimally and efficiently. We have also proposed new strategies for a primer/probe design algorithm (PDA) that combines minimum-set (MS) mu-primers and unique (UniQ) probes. The primer/probe set designed by PDA-MS/UniQ can amplify multiple genes simultaneously upon physical presence with minimum-set mu-primer amplification (MMA) before intended differential detection with probes-array hybridization (PAH) on the selected target set of SRP. Results: The proposed PDA-MS/UniQ method yields a much smaller primer set than conventional PCR. In the simulation experiment for amplifying 12 669 target sequences, the performance of our method, with a 68% reduction in the required number of mu-primers, seems to be superior to the compared heuristic approaches in both computation efficiency and reduction percentage. Our integrated PDA-MS/UniQ method is applied to the differential detection of 9 plant viruses from 4 genera with MMA and PAH of 11 mu-primers instead of 18 unique ones in conventional PCR while amplifying overall 9 target sequences. The results of wet lab experiments with the integrated MMA-PAH system have successfully validated the specificity and sensitivity of the primers/probes designed with our integrated PDA-MS/UniQ method.
Contact: cykao{at}csie.ntu.edu.tw Supplementary information: http://www.csie.ntu.edu.tw/~cykao/pda/
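
As a rough illustration of the set covering formulation mentioned in this abstract, the sketch below picks a small set of primers so that every target is amplified by at least one of them. It uses the textbook greedy heuristic, not the modified compact genetic algorithm (MCGA) the authors describe, and it does not guarantee a minimum set. The function name, primer names and target names are all hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Greedy heuristic for set cover (illustrative only, not the MCGA).

    universe:   iterable of target identifiers that must all be covered
    candidates: dict mapping a primer name to the set of targets it amplifies
    Returns the list of chosen primer names in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the primer that covers the most still-uncovered targets.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some targets cannot be covered by any primer")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy data: five targets and four candidate mu-primers.
targets = {"t1", "t2", "t3", "t4", "t5"}
primers = {
    "p1": {"t1", "t2", "t3"},
    "p2": {"t3", "t4"},
    "p3": {"t4", "t5"},
    "p4": {"t1"},
}
selected = greedy_set_cover(targets, primers)
print(selected)
```

On this toy input the heuristic first takes `p1` (three new targets) and then `p3` (the two remaining targets), so two primers cover all five targets.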

6.
7.
A multivariate test of association   Total citations: 1, self-citations: 0, citations by others: 1
Summary: Although genetic association studies often test multiple, related phenotypes, few formal multivariate tests of association are available. We describe a test of association that can be efficiently applied to large population-based designs. Availability: A C++ implementation can be obtained from the authors. Contact: manuel.ferreira{at}qimr.edu.au Supplementary information: Supplementary figures are available at Bioinformatics online. Associate Editor: Alex Bateman

8.
Motivation: Genomes contain biologically significant information that extends beyond that encoded in genes. Some of this information relates to various short dispersed repeats distributed throughout the genome. The goal of this work was to combine tools for detection of statistically significant dispersed repeats in DNA sequences with tools to aid development of hypotheses regarding their possible physiological functions in an easy-to-use web-based environment. Results: Ab Initio Motif Identification Environment (AIMIE) was designed to facilitate investigations of dispersed sequence motifs in prokaryotic genomes. We used AIMIE to analyze the Escherichia coli and Haemophilus influenzae genomes in order to demonstrate the utility of the new environment. AIMIE detected repeated extragenic palindrome (REP) elements, CRISPR repeats, uptake signal sequences, intergenic dyad sequences and several other over-represented sequence motifs. Distributional patterns of these motifs were analyzed using the tools included in AIMIE. Availability: AIMIE and the related software can be accessed at our web site http://www.cmbl.uga.edu/software.html. Contact: mrazek{at}uga.edu Associate Editor: Alex Bateman

9.
Summary: EPIMHC is a relational database of MHC-binding peptides and T cell epitopes that are observed in real proteins. Currently, the database contains 4867 distinct peptide sequences from various sources, including 84 tumor-associated antigens. The EPIMHC database is accessible through a web server that has been designed to facilitate research in computational vaccinology. Importantly, peptides resulting from a query can be selected to derive specific motif-matrices. Subsequently, these motif-matrices can be used in combination with a dynamic algorithm for predicting MHC-binding peptides from user-provided protein queries. Availability: The EPIMHC database server is hosted by the Dana-Farber Cancer Institute at the site http://immunax.dfci.harvard.edu/bioinformatics/epimhc/ Contact: reche{at}research.dfci.harvard.edu

10.
Summary: We present In silico Biochemical Reaction Network Analysis (IBRENA), a software package which facilitates multiple functions including cellular reaction network simulation and sensitivity analysis (both forward and adjoint methods), coupled with principal component analysis, singular-value decomposition and model reduction. The software features a graphical user interface that aids simulation and plotting of in silico results. While the primary focus is to aid formulation, testing and reduction of theoretical biochemical reaction networks, the program can also be used for analysis of high-throughput genomic and proteomic data. Availability: The software package, manual and examples are available at http://www.eng.buffalo.edu/~neel/ibrena Contact: neel{at}eng.buffalo.edu Associate Editor: Limsoon Wong

11.
GENOME: a rapid coalescent-based whole genome simulator   Total citations: 1, self-citations: 0, citations by others: 1
Summary: GENOME proposes a rapid coalescent-based approach to simulate whole genome data. In addition to features of standard coalescent simulators, the program allows for recombination rates to vary along the genome and for flexible population histories. Within small regions, we have evaluated samples simulated by GENOME to verify that GENOME provides the expected LD patterns and frequency spectra. The program can be used to study the sampling properties of any statistic for a whole genome study. Availability: The program and C++ source code are available online at http://www.sph.umich.edu/csg/liang/genome/ Contact: lianglim{at}umich.edu Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

12.
13.
14.
15.
16.
Motivation: After 10 years of investigation, the folding mechanisms of β-hairpins are still under debate. Experiments strongly support the zip-out pathway, while most simulations prefer the hydrophobic collapse model (including middle-out and zip-in pathways). In this article, we show that all pathways can occur during the folding of β-hairpins but with different probabilities. The zip-out pathway is the most probable one. This is in agreement with the experimental results. We came to our conclusions by 38 100-ns room-temperature all-atom molecular dynamics simulations of the β-hairpin trpzip2. Our results may help to clarify the inconsistencies in the current pictures of β-hairpin folding mechanisms. Contact: yxiao{at}mail.hust.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Anna Tramontano

17.
Motivation: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance. Methods: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes. Results: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests when the group variances are equal. These results mean that MWT is more reliable than other existing tests over a wider range of data conditions. Availability: An R package to perform MWT is available at http://www.meb.ki.se/~yudpaw Contact: yudi.pawitan{at}ki.se Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop
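
For readers unfamiliar with the two baseline tests this abstract contrasts, the sketch below computes the standard pooled-variance t statistic and the standard Welch statistic (with Welch–Satterthwaite degrees of freedom) for a single gene. It is not the moderated Welch test: the abstract does not give the MWT weighting or the cross-gene variance estimate, so neither is attempted here. Function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Standard two-sample t statistic, assuming equal group variances."""
    n1, n2 = len(x), len(y)
    # Pooled sample variance, weighted by each group's degrees of freedom.
    s2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = sqrt(s2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def welch_t(x, y):
    """Welch statistic with unpooled standard error and Satterthwaite df."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical expression values for one gene in two small groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ; the Welch degrees of freedom are lower when the variances are unequal, which is one way to see the loss of sensitivity in small samples that the abstract mentions.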

18.
Motivation: The success of genome sequencing has resulted in many protein sequences without functional annotation. We present ConFunc, an automated Gene Ontology (GO)-based protein function prediction approach, which uses conserved residues to generate sequence profiles to infer function. ConFunc splits sets of sequences identified by PSI-BLAST into sub-alignments according to their GO annotations. Conserved residues are identified for each GO term sub-alignment for which a position specific scoring matrix is generated. This combination of steps produces a set of feature (GO annotation) derived profiles from which protein function is predicted. Results: We assess the ability of ConFunc, BLAST and PSI-BLAST to predict protein function in the twilight zone of sequence similarity. ConFunc significantly outperforms BLAST & PSI-BLAST, obtaining levels of recall and precision that are not obtained by either method and maximum precision 24% greater than BLAST. Further, for a large test set of sequences with homologues of low sequence identity, at high levels of precision, ConFunc obtains recall six times greater than BLAST. These results demonstrate the potential for ConFunc to form part of an automated genomics annotation pipeline. Availability: http://www.sbg.bio.ic.ac.uk/confunc Contact: m.sternberg{at}imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Dmitrij Frishman

19.
MOTIVATION: Research on roles of gene products in cells is accumulating and changing rapidly, but most of the results are still reported in text form and are not directly accessible by computers. To expedite the progress of functional bioinformatics, it is, therefore, important to efficiently process large amounts of biomedical literature and transform the knowledge extracted into a structured format usable by biologists and medical researchers. Our aim was to develop an intelligent text-mining system that will extract from biomedical documents knowledge about the functions of gene products and thus facilitate computing with function. RESULTS: We have developed an ontology-based text-mining system to efficiently extract from biomedical literature knowledge about the functions of gene products. We also propose methods of sentence alignment and sentence classification to discover the functions of gene products discussed in digital texts. AVAILABILITY: http://ismp.csie.ncku.edu.tw/~yuhc/meke/

20.
Motivation: Inferring population structures using genetic data sampled from a group of individuals is a challenging task. Many methods either consider a fixed population number or ignore the correlation between populations. As a result, they can lose sensitivity and specificity in detecting subtle stratifications. In addition, when a large number of genetic markers are used, many existing algorithms perform rather inefficiently. Result: We propose a new Bayesian method to infer population structures using multiple unlinked single nucleotide polymorphisms (SNPs). Our approach explicitly considers the population correlation through a tree hierarchy, and treats the population number as a random variable. Using both simulated and real datasets of worldwide samples, we demonstrate that an incorporated tree can consistently improve the power in detecting subtle population stratifications. A tree-based model often involves a large number of unknown parameters, and the corresponding estimation procedure can be highly inefficient. We further implement a partition method to analytically integrate out all nuisance parameters in the tree. As a result, our method can analyze large SNP datasets with a significantly improved convergence rate. Availability: http://www.stat.psu.edu/~yuzhang/tips.tar Contact: yuzhang{at}stat.psu.edu Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Keith Crandall
