Similar articles
 Found 20 similar articles; search time 15 ms
1.
GENOME: a rapid coalescent-based whole genome simulator   Total citations: 1, self-citations: 0, citations by others: 1
Summary: GENOME proposes a rapid coalescent-based approach to simulate whole genome data. In addition to features of standard coalescent simulators, the program allows for recombination rates to vary along the genome and for flexible population histories. Within small regions, we have evaluated samples simulated by GENOME to verify that GENOME provides the expected LD patterns and frequency spectra. The program can be used to study the sampling properties of any statistic for a whole genome study. Availability: The program and C++ source code are available online at http://www.sph.umich.edu/csg/liang/genome/ Contact: lianglim{at}umich.edu Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop
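
As a rough illustration of what a "standard coalescent" simulation involves, the sketch below draws the waiting times between coalescence events for a sample of lineages. It is not GENOME's algorithm: it assumes a single constant-size population with no recombination, and the function name `simulate_coalescent_times` is hypothetical. Time is in coalescent units, where the waiting time with k lineages is exponential with rate k(k-1)/2.

```python
import random


def simulate_coalescent_times(n_samples, rng):
    """Return the cumulative times of the n_samples - 1 coalescence events.

    Assumption (not from the abstract): constant population size, no
    recombination. With k lineages remaining, the waiting time to the next
    coalescence is exponential with rate k * (k - 1) / 2.
    """
    times = []
    t = 0.0
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)  # exponential waiting time with this rate
        times.append(t)
    return times


rng = random.Random(42)  # fixed seed so the run is repeatable
times = simulate_coalescent_times(10, rng)
tmrca = times[-1]  # time to the most recent common ancestor of the sample
```

A whole-genome simulator such as the one described has to go well beyond this, for example by tracking recombination along the sequence and changes in population size over time.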

2.
Motivation: Inferring population structures using genetic data sampled from a group of individuals is a challenging task. Many methods either consider a fixed population number or ignore the correlation between populations. As a result, they can lose sensitivity and specificity in detecting subtle stratifications. In addition, when a large number of genetic markers are used, many existing algorithms perform rather inefficiently. Result: We propose a new Bayesian method to infer population structures using multiple unlinked single nucleotide polymorphisms (SNPs). Our approach explicitly considers the population correlation through a tree hierarchy, and treats the population number as a random variable. Using both simulated and real datasets of worldwide samples, we demonstrate that an incorporated tree can consistently improve the power in detecting subtle population stratifications. A tree-based model often involves a large number of unknown parameters, and the corresponding estimation procedure can be highly inefficient. We further implement a partition method to analytically integrate out all nuisance parameters in the tree. As a result, our method can analyze large SNP datasets with significantly improved convergence rate. Availability: http://www.stat.psu.edu/~yuzhang/tips.tar Contact: yuzhang{at}stat.psu.edu Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Keith Crandall

3.
A multivariate test of association   Total citations: 1, self-citations: 0, citations by others: 1
Summary: Although genetic association studies often test multiple, related phenotypes, few formal multivariate tests of association are available. We describe a test of association that can be efficiently applied to large population-based designs. Availability: A C++ implementation can be obtained from the authors. Contact: manuel.ferreira{at}qimr.edu.au Supplementary information: Supplementary figures are available at Bioinformatics online. Associate Editor: Alex Bateman

4.
5.
MMG: a probabilistic tool to identify submodules of metabolic pathways   Total citations: 1, self-citations: 0, citations by others: 1
Motivation: A fundamental task in systems biology is the identification of groups of genes that are involved in the cellular response to particular signals. At its simplest level, this often reduces to identifying biological quantities (mRNA abundance, enzyme concentrations, etc.) which are differentially expressed in two different conditions. Popular approaches involve using t-test statistics, based on modelling the data as arising from a mixture distribution. A common assumption of these approaches is that the data are independent and identically distributed; however, biological quantities are usually related through a complex (weighted) network of interactions, and often the more pertinent question is which subnetworks are differentially expressed, rather than which genes. Furthermore, in many interesting cases (such as high-throughput proteomics and metabolomics), only very partial observations are available, resulting in the need for efficient imputation techniques. Results: We introduce Mixture Model on Graphs (MMG), a novel probabilistic model to identify differentially expressed submodules of biological networks and pathways. The method can easily incorporate information about weights in the network, is robust against missing data and can be easily generalized to directed networks. We propose an efficient sampling strategy to infer posterior probabilities of differential expression, as well as posterior probabilities over the model parameters. We assess our method on artificial data demonstrating significant improvements over standard mixture model clustering. Analysis of our model results on quantitative high-throughput proteomic data leads to the identification of biologically significant subnetworks, as well as the prediction of the expression level of a number of enzymes, some of which are then verified experimentally.
Availability: MATLAB code is available from http://www.dcs.shef.ac.uk/~guido/software.html Contact: guido{at}dcs.shef.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Jonathan Wren

6.
7.
Summary: Traditional two-dimensional (2D) software programs for drawing pedigrees are limited when dealing with extended pedigrees. In successive generations, the number of individuals grows exponentially, leading to an unworkable amount of space required in the horizontal direction for 2D displays. In addition, it is not always possible to place closely related individuals near each other due to the lack of space in 2D. To address these issues we have developed three-dimensional (3D) pedigree drawing techniques to enable clearer visualization of extended pedigrees. Currently no other methods are available for displaying extended pedigrees in 3D. We have made freely available a software tool, ‘Celestial3D’, that implements these novel techniques. Availability: Freely available to non-commercial users Contact: celestial3d{at}genepi.org.au Supplementary information: www.genepi.org.au/celestial3d Associate Editor: Martin Bishop 1 A more extensive list of software tools appears in the Supplementary Material.

8.
Summary: BLISS 2.0 is a web-based application for identifying conserved regulatory modules in distantly related orthologous sequences. Unlike existing approaches, it performs the cross-genome comparison at the binding site level. Experimental results on simulated and real world data indicate that BLISS 2.0 can identify conserved regulatory modules from sequences with little overall similarity at the DNA sequence level. Availability: http://www.blisstool.org/ Contact: leizhou{at}ufl.edu Associate Editor: Olga Troyanskaya

9.
Motivation: Understanding the complexity in gene–phenotype relationship is vital for revealing the genetic basis of common diseases. Recent studies on the basis of human interactome and phenome not only uncover prevalent phenotypic overlap and genetic overlap between diseases, but also reveal a modular organization of the genetic landscape of human diseases, providing new opportunities to reduce the complexity in dissecting the gene–phenotype association. Results: We provide systematic and quantitative evidence that phenotypic overlap implies genetic overlap. With these results, we perform the first heterogeneous alignment of human interactome and phenome via a network alignment technique and identify 39 disease families with corresponding causative gene networks. Finally, we propose AlignPI, an alignment-based framework to predict disease genes, and identify plausible candidates for 70 diseases. Our method scales well to the whole genome, as demonstrated by prioritizing 6154 genes across 37 chromosome regions for Crohn's disease (CD). Results are consistent with a recent meta-analysis of genome-wide association studies for CD. Availability: Bi-modules and disease gene predictions are freely available at the URL http://bioinfo.au.tsinghua.edu.cn/alignpi/ Contact: ruijiang{at}tsinghua.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Trey Ideker

10.
11.
Motivation: High-density DNA microarrays provide us with useful tools for analyzing DNA and RNA comprehensively. However, the background signal caused by the non-specific binding (NSB) between probe and target makes it difficult to obtain accurate measurements. To remove the background signal, there is a set of background probes on Affymetrix Exon arrays to represent the amount of non-specific signals, and an accurate estimation of non-specific signals using these background probes is desirable for improvement of microarray analyses. Results: We developed a thermodynamic model of NSB on short nucleotide microarrays in which the NSBs are modeled by duplex formation of probes and multiple hypothetical targets. We fitted the observed signal intensities of the background probes with those expected by the model to obtain the model parameters. As a result, we found that the presented model can improve the accuracy of prediction of non-specific signals in comparison with previously proposed methods. This result will provide a useful method to correct for the background signal in oligonucleotide microarray analysis. Availability: The software is implemented in the R language and can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/MSNS/). Contact: furusawa{at}ist.osaka-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors. Associate Editor: Trey Ideker

12.
Summary: We developed an interactive gene ontology (GO) browser named GOTreePlus that superimposes annotation information over GO structures. It can facilitate the identification of important GO terms through interactive visualization of them in the GO structure. The interactive pie chart summarizing an annotation distribution for a selected GO term provides users with a succinct context-sensitive overview of their experimental results. We tested our GOTreePlus using a proteome profiling dataset obtained on differentiation of retinal pigment epithelial cells where 399 proteins were quantified. Availability: http://bioinformatics.cnmcresearch.org/GOTreePlus/ Contact: jseo{at}cnmcresearch.org Associate Editor: John Quackenbush

13.
Summary: DeconMSn accurately determines the monoisotopic mass and charge state of parent ions from high-resolution tandem mass spectrometry data, offering significant improvement for LTQ_FT and LTQ_Orbitrap instruments over the commercially delivered Thermo Fisher Scientific's extract_msn tool. Optimal parent ion mass tolerance values can be determined using accurate mass information, thus improving peptide identifications for high-mass measurement accuracy experiments. For low-resolution data from LCQ and LTQ instruments, DeconMSn incorporates a support-vector-machine-based charge detection algorithm that identifies the most likely charge of a parent species through peak characteristics of its fragmentation pattern. Availability: http://ncrr.pnl.gov/software/ or http://www.proteomicsresource.org/ Contact: rds{at}pnl.gov Supplementary information: PowerPoint presentation/Poster on http://ncrr.pnl.gov/software/. Associate Editor: Alfonso Valencia

14.
15.
Motivation: Genomic methylation analysis is useful for typing bacteria that have a high number of expressed type II methyltransferases. Methyltransferases are usually committed to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes high pressure on the expression of the cognate methyltransferase, which hinders R-M system loss. Conventional cluster methods do not reflect this tendency. An algorithm was developed for dendrogram construction reflecting the propensity for conservation of R-M Type II systems. Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm works by first grouping strains that share a common minimum set of R-M systems and gradually adds strains according to the number of the R-M systems acquired. Dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa. Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip Contact: filipavale{at}fe.ucp.pt Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: Martin Bishop

16.
Motivation: As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. Results: We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of α = 0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. Availability: http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly) Contact: aasare{at}immunetolerance.org; stvjc{at}channing.harvard.edu The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors. Associate Editor: Trey Ideker

17.
Motivation: Differential detection on symptom-related pathogens (SRP) is critical for fast identification and accurate control against epidemic diseases. Conventional polymerase chain reaction (PCR) requires a large number of unique primers to amplify selected SRP target sequences. With multiple-use primers (mu-primers), multiple targets can be amplified and detected in one PCR experiment under standard reaction conditions and with reduced detection complexity. However, the time complexity of designing mu-primers with the best heuristic method available is too high. We have formulated the minimum-set mu-primer design problem as a set covering problem (SCP), and used a modified compact genetic algorithm (MCGA) to solve this problem optimally and efficiently. We have also proposed new strategies of primer/probe design algorithm (PDA) on combining both minimum-set (MS) mu-primers and unique (UniQ) probes. The primer/probe set designed by PDA-MS/UniQ can amplify multiple genes simultaneously upon physical presence with minimum-set mu-primer amplification (MMA) before intended differential detection with probes-array hybridization (PAH) on the selected target set of SRP. Results: The proposed PDA-MS/UniQ method requires a much smaller set of primers than conventional PCR. In the simulation experiment for amplifying 12 669 target sequences, the performance of our method, with a 68% reduction in the required number of mu-primers, seems to be superior to the compared heuristic approaches in both computation efficiency and reduction percentage. Our integrated PDA-MS/UniQ method is applied to the differential detection of 9 plant viruses from 4 genera with MMA and PAH, using 11 mu-primers instead of the 18 unique ones in conventional PCR while amplifying overall 9 target sequences. The results of wet lab experiments with the integrated MMA-PAH system have successfully validated the specificity and sensitivity of the primers/probes designed with our integrated PDA-MS/UniQ method.
Contact: cykao{at}csie.ntu.edu.tw Supplementary information: http://www.csie.ntu.edu.tw/~cykao/pda/
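
To make the set covering formulation concrete, here is a minimal sketch in which each candidate primer is the set of target sequences it can amplify, and the goal is a small set of primers that covers every target. It uses the textbook greedy heuristic, not the modified compact genetic algorithm named in the abstract. The function name `greedy_set_cover` and the primer and target labels are hypothetical, made up for illustration.

```python
def greedy_set_cover(universe, candidates):
    """Pick candidate names until every element of `universe` is covered.

    `candidates` maps a name to the set of elements it covers. This is the
    standard greedy heuristic (an assumption for illustration, not MCGA):
    repeatedly take the candidate covering the most uncovered elements.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Candidate that covers the most still-uncovered elements.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some targets cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy instance: five targets, four candidate mu-primers.
targets = {"t1", "t2", "t3", "t4", "t5"}
primers = {
    "p1": {"t1", "t2", "t3"},
    "p2": {"t2", "t4"},
    "p3": {"t3", "t4"},
    "p4": {"t4", "t5"},
}
selected = greedy_set_cover(targets, primers)
```

On this toy instance the greedy choice is p1 followed by p4, so two primers cover all five targets. The greedy heuristic does not guarantee a minimum cover in general, which is one reason to search the space with other methods such as the genetic algorithm the authors describe.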

18.
19.
Motivation: Genomes contain biologically significant information that extends beyond that encoded in genes. Some of this information relates to various short dispersed repeats distributed throughout the genome. The goal of this work was to combine tools for detection of statistically significant dispersed repeats in DNA sequences with tools to aid development of hypotheses regarding their possible physiological functions in an easy-to-use web-based environment. Results: Ab Initio Motif Identification Environment (AIMIE) was designed to facilitate investigations of dispersed sequence motifs in prokaryotic genomes. We used AIMIE to analyze the Escherichia coli and Haemophilus influenzae genomes in order to demonstrate the utility of the new environment. AIMIE detected repeated extragenic palindrome (REP) elements, CRISPR repeats, uptake signal sequences, intergenic dyad sequences and several other over-represented sequence motifs. Distributional patterns of these motifs were analyzed using the tools included in AIMIE. Availability: AIMIE and the related software can be accessed at our web site http://www.cmbl.uga.edu/software.html. Contact: mrazek{at}uga.edu Associate Editor: Alex Bateman

20.
The SGN comparative map viewer   Total citations: 1, self-citations: 0, citations by others: 1
Motivation: With the rapid accumulation of genetic data for a multitude of different species, the availability of intuitive comparative genomic tools becomes an important requirement for the research community. Here we describe a web-based comparative viewer for mapping data, including genetic, physical and cytological maps, that is part of the SGN website (http://sgn.cornell.edu/) but that can also be installed and adapted for other websites. In addition to viewing and comparing different maps stored in the SGN database, the viewer allows users to upload their own maps and compare them to other maps in the system. The viewer is implemented in object oriented Perl, with a simple extensible interface to write data adapters for other relational database schemas and flat file formats. Contact: lam87{at}cornell.edu Associate Editor: Alex Bateman

