Similar articles (20 results)
1.
MOTIVATION: Advances in the field of cheminformatics have been hindered by a lack of freely available tools. We have created Chembench, a publicly available cheminformatics portal for analyzing experimental chemical structure-activity data. Chembench provides a broad range of tools for data visualization and embeds a rigorous workflow for creating and validating predictive quantitative structure-activity relationship (QSAR) models and using them for virtual screening of chemical libraries to prioritize compound selection for drug discovery and/or chemical safety assessment. AVAILABILITY: Freely accessible at http://chembench.mml.unc.edu. CONTACT: alex_tropsha@unc.edu

2.
We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web (WWW). Open the following home page with a Web browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

3.
4.
SNPselector: a web tool for selecting SNPs for genetic association studies   Cited by: 7 (self-citations: 0, citations by others: 7)
SUMMARY: Single nucleotide polymorphisms (SNPs) are commonly used for association studies to find genes responsible for complex genetic diseases. With the recent advance of SNP technology, researchers are able to assay thousands of SNPs in a single experiment. But the process of manually choosing thousands of genotyping SNPs for tens or hundreds of genes is time-consuming. We have developed a web-based program, SNPselector, to automate the process. SNPselector takes a list of gene names or a list of genomic regions as input and searches the Ensembl genes or genomic regions for available SNPs. It prioritizes these SNPs by their tagging for linkage disequilibrium, SNP allele frequencies and source, function, regulatory potential and repeat status. SNPselector outputs its results as compressed Excel spreadsheet files for review by the user. AVAILABILITY: SNPselector is freely available at http://primer.duhs.duke.edu/

5.
We have created databases and software applications for the analysis of DNA mutations in the human p53 gene, the human hprt gene and the rodent transgenic lacZ locus. The databases themselves are stand-alone dBase files and the software for analysis of the databases runs on IBM-compatible computers. The software created for these databases permits filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web (WWW). Open the home page http://sunsite.unc.edu/dnam/mainpage.html with a WWW browser. Alternatively, the databases and programs are available via public FTP from anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found in the subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the WWW site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

6.
We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers with Microsoft Windows. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web. Open the following home page with a Web browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

7.
The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has recently received great attention. For these studies, it is essential to use a small subset of informative SNPs accurately representing the rest of the SNPs. Informative SNP selection can achieve (1) considerable budget savings by genotyping only a limited number of SNPs and computationally inferring all other SNPs or (2) necessary reduction of the huge SNP sets (obtained, e.g., from Affymetrix) for further fine haplotype analysis. A novel informative SNP selection method for unphased genotype data based on multiple linear regression (MLR) is implemented in the software package MLR-tagging. This software can be used for informative SNP (tag) selection and genotype prediction. The stepwise tag selection algorithm (STSA) selects positions of the given number of informative SNPs based on a genotype sample population. The MLR SNP prediction algorithm predicts a complete genotype based on the values of its informative SNPs, their positions among all SNPs, and a sample of complete genotypes. An extensive experimental study on various datasets including 10 regions from HapMap shows that the MLR prediction combined with stepwise tag selection uses fewer tags than the state-of-the-art method of Halperin et al. (2005). AVAILABILITY: The MLR-tagging software package is publicly available at http://alla.cs.gsu.edu/~software/tagging/tagging.html
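
The MLR prediction step described in this abstract can be illustrated with a small sketch: one untyped SNP is predicted from the tag SNPs by ordinary least squares fitted on a sample of complete genotypes. The 0/1/2 genotype coding, the names `solve_linear` and `predict_snp`, and the rounding of the estimate to the nearest genotype are all assumptions made for this example; the abstract does not describe how MLR-tagging encodes genotypes or resolves the regression output.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: tag SNPs are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict_snp(sample, tag_idx, target_idx, tag_values):
    """Predict one SNP (assumed 0/1/2 coding) from the values of the tag SNPs.

    sample     -- list of complete genotypes, one list of 0/1/2 values each
    tag_idx    -- positions of the tag SNPs
    target_idx -- position of the SNP to predict
    tag_values -- tag SNP values of the genotype being completed
    """
    # Design matrix rows: intercept followed by the tag SNP values.
    rows = [[1.0] + [g[j] for j in tag_idx] for g in sample]
    y = [g[target_idx] for g in sample]
    k = len(tag_idx) + 1
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    estimate = coef[0] + sum(c * v for c, v in zip(coef[1:], tag_values))
    # Hypothetical resolution rule: round to the nearest valid genotype.
    return min(2, max(0, round(estimate)))
```

Calling `predict_snp` once per non-tag position would complete a whole genotype; choosing which positions to use as tags (the STSA step) is not covered by this sketch.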

8.
SUMMARY: The interpretation of genome-wide association results is confounded by linkage disequilibrium between nearby alleles. We have developed a flexible bioinformatics query tool for single-nucleotide polymorphisms (SNPs) to identify and to annotate nearby SNPs in linkage disequilibrium (proxies) based on HapMap. By offering functionality to generate graphical plots for these data, the SNAP server will facilitate interpretation and comparison of genome-wide association study results, and the design of fine-mapping experiments (by delineating genomic regions harboring associated variants and their proxies). AVAILABILITY: SNAP server is available at http://www.broad.mit.edu/mpg/snap/.

9.
10.
In this report, we describe a simple correction for multiple testing of single-nucleotide polymorphisms (SNPs) in linkage disequilibrium (LD) with each other, on the basis of the spectral decomposition (SpD) of matrices of pairwise LD between SNPs. This method provides a useful alternative to more computationally intensive permutation tests. A user-friendly interface (SNPSpD) for performing this correction is available online (http://genepi.qimr.edu.au/general/daleN/SNPSpD/). Additionally, output from SNPSpD includes eigenvalues, principal-component coefficients, and factor "loadings" after varimax rotation, enabling the selection of a subset of SNPs that optimizes the information in a genomic region.
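
A minimal sketch of how an eigenvalue-based correction of this kind can be computed is given below. It assumes the input is a symmetric M x M correlation matrix of pairwise LD with a unit diagonal, and it uses the formula Meff = 1 + (M - 1)(1 - Var(λ)/M), which is the form commonly cited for this approach but is not stated in the abstract, so treat it as an assumption. The function names and the Bonferroni-style use of Meff are illustrative only and do not describe the SNPSpD interface.

```python
def effective_tests(ld):
    """Effective number of independent tests for an M x M LD correlation matrix.

    Assumed formula: Meff = 1 + (M - 1) * (1 - Var(eigenvalues) / M).
    """
    m = len(ld)
    if m == 1:
        return 1.0
    # For a symmetric matrix, sum(eigenvalues) equals the trace and
    # sum(eigenvalues ** 2) equals the sum of squared entries, so the
    # eigenvalue variance is available without an explicit decomposition.
    sum_lambda = sum(ld[i][i] for i in range(m))
    sum_lambda_sq = sum(v * v for row in ld for v in row)
    var_lambda = (sum_lambda_sq - sum_lambda ** 2 / m) / (m - 1)
    return 1.0 + (m - 1) * (1.0 - var_lambda / m)


def corrected_alpha(alpha, ld):
    """Illustrative Bonferroni-style threshold using Meff instead of M."""
    return alpha / effective_tests(ld)
```

With fully independent SNPs (identity matrix) the function returns M, and with perfectly correlated SNPs (all entries 1) it returns 1, which are the two limiting cases the correction is meant to interpolate between.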

11.
Scheet P, Stephens M. PLoS Genetics, 2008, 4(8): e1000147
Quality control (QC) is a critical step in large-scale studies of genetic variation. While, on average, high-throughput single nucleotide polymorphism (SNP) genotyping assays are now very accurate, the errors that remain tend to cluster into a small percentage of "problem" SNPs, which exhibit unusually high error rates. Because most large-scale studies of genetic variation are searching for phenomena that are rare (e.g., SNPs associated with a phenotype), even this small percentage of problem SNPs can cause important practical problems. Here we describe and illustrate how patterns of linkage disequilibrium (LD) can be used to improve QC in large-scale, population-based studies. This approach has the advantage over existing filters (e.g., HWE or call rate) that it can actually reduce genotyping error rates by automatically correcting some genotyping errors. Applying this LD-based QC procedure to data from The International HapMap Project, we identify over 1,500 SNPs that likely have high error rates in the CHB and JPT samples and estimate corrected genotypes. Our method is implemented in the software package fastPHASE, available from the Stephens Lab website (http://stephenslab.uchicago.edu/software.html).

12.

Genome-wide analysis of single nucleotide polymorphism (SNP) markers is an extremely efficient means for genetic mapping of mutations or traits in mice. However, this approach often defines a relatively large recombinant interval. To facilitate the refinement of this interval, we developed the program SNP2RFLP. This program can be used to identify region-specific SNPs in which the polymorphic nucleotide creates a restriction fragment length polymorphism (RFLP) that can be readily assayed at the benchtop using restriction enzyme digestion of SNP-containing PCR products. The program permits user-defined queries that maximize the informative markers for a particular application. This facilitates fine-mapping in a region containing a mutation of interest, which should prove valuable to the mouse genetics community. SNP2RFLP and further details are publicly available at http://genetics.bwh.harvard.edu/snp2rflp/.


13.
ABSTRACT: Bisulfite treatment of DNA followed by high-throughput sequencing (Bisulfite-seq) is an important method for studying DNA methylation and epigenetic gene regulation, yet current software tools do not adequately address single nucleotide polymorphisms (SNPs). Identifying SNPs is important for accurate quantification of methylation levels and for identification of allele-specific epigenetic events such as imprinting. We have developed a model-based bisulfite SNP caller, Bis-SNP, that results in substantially better SNP calls than existing methods, thereby improving methylation estimates. At an average 30× genomic coverage, Bis-SNP correctly identified 96% of SNPs using the default high-stringency settings. The open-source package is available at http://epigenome.usc.edu/publicationdata/bissnp2011.

14.
Genome-wide analysis of single nucleotide polymorphism (SNP) markers is an extremely efficient means for genetic mapping of mutations or traits in mice. However, this approach often defines a relatively large recombinant interval. To facilitate the refinement of this interval, we developed the program SNP2RFLP. This program can be used to identify region-specific SNPs in which the polymorphic nucleotide creates a restriction fragment length polymorphism (RFLP) that can be readily assayed at the benchtop using restriction enzyme digestion of SNP-containing PCR products. The program permits user-defined queries that maximize the informative markers for a particular application. This facilitates fine-mapping in a region containing a mutation of interest, which should prove valuable to the mouse genetics community. SNP2RFLP and further details are publicly available at http://genetics.bwh.harvard.edu/snp2rflp/.
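
The core idea in this abstract, a SNP whose alleles differ in whether a restriction site is present, can be sketched as follows. This is only an illustration and not how SNP2RFLP itself works: the function name `rflp_enzymes`, the three-enzyme table, and the flank/allele input format are hypothetical. The recognition sequences listed (EcoRI GAATTC, BamHI GGATCC, HindIII AAGCTT) are the standard ones, and because all three are palindromic the sketch scans one strand only.

```python
# Small illustrative enzyme table; a real tool would use a full enzyme database.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def rflp_enzymes(left_flank, alleles, right_flank, enzymes=ENZYMES):
    """Return enzymes whose number of sites differs between the SNP alleles.

    left_flank, right_flank -- sequence on either side of the SNP
    alleles                 -- the alternative nucleotides at the SNP
    Result maps enzyme name -> {allele: number of recognition sites}.
    """
    informative = {}
    for name, site in enzymes.items():
        counts = {}
        for allele in alleles:
            amplicon = (left_flank + allele + right_flank).upper()
            counts[allele] = amplicon.count(site)
        # The SNP is assayable as an RFLP only if the alleles cut differently.
        if len(set(counts.values())) > 1:
            informative[name] = counts
    return informative
```

For example, with `left_flank="ACGTGAAT"`, alleles `("T", "C")` and `right_flank="CAGGT"`, only the T allele completes a GAATTC site, so EcoRI digestion would distinguish the two alleles.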

15.
MOTIVATION: Using bioinformatic approaches, we aimed to characterize poorly understood abnormalities in splicing known as exon scrambling, exon repetition and trans-splicing. RESULTS: We developed a software package that allows large-scale comparison of all human expressed sequence tag (EST) sequences to the entire set of human gene sequences. Among 5,992,495 EST sequences, 401 cases of exon repetition and 416 cases of exon scrambling were found. The vast majority of identified ESTs contain fragments rather than full-length repeated or scrambled exons. Their structures suggest that the scrambled or repeated exon fragments may have arisen in the process of cDNA cloning and not from splicing abnormalities. Nevertheless, we found 11 cases of full-length exon repetition, showing that this phenomenon is real yet very rare. In searching for examples of trans-splicing, we looked only at reproducible events where at least two independent ESTs represent the same putative trans-splicing event. We found 15 ESTs representing five types of putative trans-splicing. However, all 15 cases were derived from human malignant tissues and could have resulted from genomic rearrangements. Our results provide support for a very rare but physiological occurrence of exon repetition, but suggest that apparent exon scrambling and trans-splicing result, respectively, from in vitro artifact and gene-level abnormalities. AVAILABILITY: Exon-Intron Database (EID) is available at http://www.meduohio.edu/bioinfo/eid. Programs are available at http://www.meduohio.edu/bioinfo/software.html. The Laboratory website is available at http://www.meduohio.edu/medicine/fedorov. SUPPLEMENTARY INFORMATION: Supplementary file is available at http://www.meduohio.edu/bioinfo/software.html.

16.
17.
A large number of new genomic features are being discovered using high-throughput techniques. The next challenge is to automatically map them to the reference genome for further analysis and functional annotation. We have developed a tool that can be used to map important genomic features to the latest version of the human genome and also to annotate new features. These genomic features could be of many different source types, including miRNAs, microarray primers or probes, ChIP-on-chip data, CpG islands and SNPs, to name a few. A standalone version and web interface for the tool can be accessed through http://populationhealth.qimr.edu.au/cgi-bin/webFOG/index.cgi. The project details and source code are also available at http://www.bioinformatics.org/webfog.

18.
Although tomato has been the subject of extensive quantitative trait locus (QTL) mapping experiments, most of this work has been conducted on transient populations (e.g., F2 or backcross) and few homozygous, permanent mapping populations are available. To help remedy this situation, we have developed a set of inbred backcross lines (IBLs) from the interspecific cross between Lycopersicon esculentum cv. E6203 and L. pimpinellifolium (LA1589). A total of 170 BC2F1 plants were selfed for five generations to create a set of homozygous BC2F6 lines by single-seed descent. These lines were then genotyped for 127 marker loci covering the entire tomato genome. These IBLs were evaluated for 22 quantitative traits. In all, 71 significant QTLs were identified, 15% (11/71) of which mapped to the same chromosomal positions as QTLs identified in earlier studies using the same cross. For 48% (34/71) of the detected QTLs, the wild allele was associated with improved agronomic performance. A number of new QTLs were identified, including several of significant agronomic importance for tomato production: fruit shape, firmness, fruit color, scar size, seed and flower number, leaf curliness, plant growth, fertility, and flowering time. To improve the utility of the IBL population, a subset of 100 lines giving the most uniform genome coverage and map resolution was selected using a randomized greedy algorithm as implemented in the software package MapPop (http://www.bio.unc.edu/faculty/vision/lab/mappop/). The map, phenotypic data, and seeds for the IBL population are publicly available (http://soldb.cit.cornell.edu) and will provide tomato geneticists and breeders with a genetic resource for mapping, gene discovery, and breeding.

19.
The Synergizer is a database and web service that provides translations of biological database identifiers. It is accessible both programmatically and interactively. AVAILABILITY: The Synergizer is freely available to all users interactively via a web application (http://llama.med.harvard.edu/synergizer/translate) and programmatically via a web service. Clients implementing the Synergizer application programming interface (API) are also freely available. Please visit http://llama.med.harvard.edu/synergizer/doc for details.

20.
MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets, imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but this technique is neither practical nor justifiable for large datasets. RESULTS: We describe a data structure that supports efficient KNN queries over arbitrarily sized, sliding haplotype windows, and evaluate its use for genotype imputation. The performance of our method enables exhaustive exploration over all window sizes and known sites in large (150K, 8.3M) SNP panels. We also compare the accuracy and performance of our methods with competing imputation approaches. AVAILABILITY: A free open source software package, NPUTE, is available at http://compgen.unc.edu/software for non-commercial use.
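
The KNN vote over a haplotype window mentioned in this abstract can be illustrated with a brute-force sketch. It recomputes every distance for each query, so it is the naive baseline and not the efficient data structure that NPUTE provides. The name `knn_impute`, the use of `None` for a missing call, and the mismatch-count distance are assumptions made for this example.

```python
from collections import Counter


def knn_impute(haplotypes, row, site, window, k=3):
    """Impute haplotypes[row][site] by a vote of the k nearest haplotypes.

    Distance is the number of mismatching alleles at the sites within
    `window` positions of `site` (the site itself excluded); positions
    where either haplotype is missing (None) are skipped.
    """
    target = haplotypes[row]
    lo = max(0, site - window)
    hi = min(len(target), site + window + 1)
    cols = [c for c in range(lo, hi) if c != site]
    scored = []
    for i, hap in enumerate(haplotypes):
        # Only haplotypes with a known allele at the site can vote.
        if i == row or hap[site] is None:
            continue
        dist = sum(
            1
            for c in cols
            if target[c] is not None and hap[c] is not None and target[c] != hap[c]
        )
        scored.append((dist, i))
    if not scored:
        raise ValueError("no haplotype has a known allele at this site")
    scored.sort()
    votes = Counter(haplotypes[i][site] for _, i in scored[:k])
    return votes.most_common(1)[0][0]
```

Each query here costs time proportional to the number of haplotypes times the window width, which is the cost that makes the naive approach impractical on large panels and motivates a dedicated data structure.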
