Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Sequence annotation is essential for genomics-based research. Investigators of a specific genomic region who have accumulated many local discoveries, such as genes and genetic markers, or who have collected annotations from multiple resources, can be overwhelmed by the difficulty of creating local annotation and the complexity of integrating all the annotations. Presenting such integrated data in a form suitable for data mining and high-throughput experimental design is even more daunting. DNannotator, a web application, was designed to perform batch annotation on a sizeable genomic region. It takes annotation source data, such as SNPs, genes, primers, and so on, prepared by the end-user and/or a specified target of genomic DNA, and performs de novo annotation. DNannotator can also robustly migrate existing annotations in GenBank format from one sequence to another. Annotation results are provided in GenBank format and in tab-delimited text, which can be imported and managed in a database or spreadsheet and combined with existing annotation as desired. Graphic viewers, such as Genome Browser or Artemis, can display the annotation results. Reference data (reports on the process) that help the user evaluate annotation quality are optionally provided. DNannotator can be accessed at http://sky.bsd.uchicago.edu/DNannotator.htm.

2.
We describe a rapid and cost-effective technique for the in vitro removal of introns and other unwanted regions from genomic DNA to generate a single sequence of continuous coding capacity, where tissues required for RNA extraction and complementary DNA synthesis are unavailable. Based on an overlapping fusion-PCR strategy, we name this procedure SPLICE (for swift PCR for ligating in vitro constructed exons). As proof-of-principle, we used SPLICE successfully to generate a single piece of DNA containing the coding region of a five-exon gene, the short-wavelength-sensitive 1 (SWS1) opsin gene, from genomic DNA extracted from the brown lemur, Eulemur fulvus, in only two short rounds of PCR. Where the genomic structure and sequence are known, this technique may be universally applied to any gene expressed in any organism to generate a practical unit for investigating the function of a particular gene of interest. In this report, we provide a detailed protocol, experimental considerations, and suggestions for troubleshooting.

3.
The Homeodomain Resource is a comprehensive collection of sequence, structure and genomic information on the homeodomain protein family. Available through the Resource are both full-length and domain-only sequence data, as well as X-ray and NMR structural data for proteins and protein-DNA complexes. Also available is information on human genetic diseases and disorders in which proteins from the homeodomain family play an important role; genomic information includes relevant gene symbols, cytogenetic map locations, and specific mutation data. Search engines are provided to allow users to easily query the component databases and assemble specialized data sets. The Homeodomain Resource is available through the World Wide Web at http://genome.nhgri.nih.gov/homeodomain

4.
5.
QGENE: software for marker-based genomic analysis and breeding   Cited by: 15 (self-citations: 0, citations by others: 15)
Efficient use of DNA markers for genomic research and crop improvement will depend as much on computational tools as on laboratory technology. The large size and multidimensional character of marker datasets invite novel approaches to data visualization. Described here is a software application embodying two design principles: conventional reduction of raw genetic marker data to numerical summary statistics, and fast, interactive graphical display of both data and statistics. The program performs various analyses for mapping quantitative-trait loci in real or simulated datasets and other analyses in aid of phenotypic and marker-assisted breeding. Functionality is described and some output is illustrated.

6.
PepLine is fully automated software that maps MS/MS fragmentation spectra of tryptic peptides to genomic DNA sequences. The approach is based on Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF MS/MS spectra (first module). PSTs are then mapped onto the six-frame translations of genomic sequences (second module), giving hits. Hits are then clustered to detect potential coding regions (third module). Our work aimed at optimizing the algorithms of each component to allow the whole pipeline to proceed in a fully automated manner using raw nucleic acid sequences (i.e., genomes that have not been "reduced" to a database of ORFs or putative exon sequences). The whole pipeline was tested on controlled MS/MS spectra sets from standard proteins and from Arabidopsis thaliana chloroplast envelope samples. Our results demonstrate that PepLine competed with protein database searching software and was fast enough to potentially tackle large data sets and/or large genomes. We also illustrate the potential of this approach for the detection of the intron/exon structure of genes.
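
The mapping step described above (tags matched against six-frame translations) can be illustrated with a minimal Python sketch. This is not PepLine's code: real PSTs also carry flanking mass information, whereas this sketch assumes a tag is a plain amino-acid string matched exactly, and the function names `six_frames` and `map_tag` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(dna):
    """Yield (strand, offset, protein) for all six reading frames."""
    dna = dna.upper()
    for strand, s in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            yield strand, offset, translate(s[offset:])


def map_tag(dna, tag):
    """Return (strand, offset, protein_position) for every exact tag match."""
    hits = []
    for strand, offset, protein in six_frames(dna):
        pos = protein.find(tag)
        while pos != -1:
            hits.append((strand, offset, pos))
            pos = protein.find(tag, pos + 1)
    return hits
```

For example, `map_tag("ATGGCCAAGTAA", "AK")` reports a single hit on the forward strand in frame 0. Clustering nearby hits into candidate coding regions (the third module) would be a separate step applied to these coordinates.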

7.
MOTIVATION: MethylCoder is a software program that generates per-base methylation data given a set of bisulfite-treated reads. It provides the option to use either of two existing short-read aligners, each with different strengths. It accounts for soft-masked alignments and overlapping paired-end reads. MethylCoder outputs data in text and binary formats in addition to the final alignment in SAM format, so that common high-throughput sequencing tools can be used on the resulting output. It is more flexible than existing software and competitive in terms of speed and memory use. AVAILABILITY: MethylCoder requires only a Python interpreter and a C compiler to run. Extensive documentation and the full source code are available under the MIT license at: https://github.com/brentp/methylcode. CONTACT: bpederse@gmail.com.
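
As a rough illustration of what "per-base methylation data" means, the sketch below counts, at each reference cytosine, how many aligned reads still show C (protected from bisulfite conversion, hence methylated) versus T (converted, hence unmethylated). It is an assumption-laden toy and not MethylCoder's implementation: alignments are taken to be ungapped, forward-strand `(start, read)` pairs, and the function name `per_base_methylation` is hypothetical.

```python
from collections import defaultdict


def per_base_methylation(reference, alignments):
    """Return {position: methylated fraction} for covered reference cytosines.

    `alignments` is an iterable of (start, read) pairs, where `read` aligns
    to `reference` without gaps beginning at 0-based position `start`.
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [methylated, total]
    for start, read in alignments:
        for i, base in enumerate(read):
            pos = start + i
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base == "C":      # unconverted cytosine: methylated
                counts[pos][0] += 1
                counts[pos][1] += 1
            elif base == "T":    # converted cytosine: unmethylated
                counts[pos][1] += 1
            # any other base is a mismatch and is ignored
    return {pos: m / t for pos, (m, t) in sorted(counts.items())}
```

With the reference `"ACGTCA"` and the two reads `"ACGTTA"` and `"ACGTCA"` both starting at position 0, the cytosine at position 1 is fully methylated and the one at position 4 is methylated in half of the reads. A real pipeline would also handle the reverse strand, indels, base qualities and overlapping read pairs.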

8.

Background  

Repeat-rich regions such as centromeres receive less attention than their gene-rich euchromatic counterparts because the former are difficult to assemble and analyze. Our objectives were to 1) map all ten centromeres onto the maize genetic map and 2) characterize the sequence features of maize centromeres, each of which spans several megabases of highly repetitive DNA. Repetitive sequences can be mapped using special molecular markers that are based on PCR with primers designed from two unique "repeat junctions". Efficient screening of large amounts of maize genome sequence data for repeat junctions, as well as key centromere sequence features, required the development of specific annotation software.

9.
MOTIVATION: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes. RESULTS: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software. AVAILABILITY: The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/. CONTACT: Pierre.Rouze@gengenp.rug.ac.be.

10.
Che D, Hasan MS, Wang H, Fazekas J, Huang J, Liu Q. Bioinformation 2011, 7(6):311-314
Genomic islands (GIs) are genomic regions that were originally transferred from other organisms. The detection of genomic islands in genomes can lead to many applications in industrial, medical and environmental contexts. Existing computational tools for GI detection suffer from either low recall or low precision, thus leaving room for improvement. In this paper, we report the development of our Ensemble algorithm for Genomic Island Detection (EGID). EGID takes the prediction results of existing computational tools, filters them, and generates consensus prediction results. Performance comparisons between our ensemble algorithm and existing programs have shown that our ensemble algorithm is better than any other program. EGID was implemented in Java, and was compiled and executed on Linux operating systems. EGID is freely available at http://www5.esu.edu/cpsc/bioinfo/software/EGID.
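
One simple way to build a consensus from several predictors is per-position majority voting, sketched below in Python. The abstract does not specify EGID's actual filtering or voting rules (and EGID itself is written in Java), so the threshold `min_votes`, the half-open interval convention, and the tool names used in the example are all assumptions made for illustration.

```python
def consensus_islands(predictions, genome_length, min_votes=2):
    """Return maximal half-open intervals supported by >= min_votes tools.

    `predictions` maps a tool name to a list of (start, end) intervals,
    0-based and half-open, on a genome of `genome_length` bases.
    """
    votes = [0] * genome_length
    for intervals in predictions.values():
        covered = set()  # a tool votes at most once per position
        for start, end in intervals:
            covered.update(range(max(0, start), min(end, genome_length)))
        for pos in covered:
            votes[pos] += 1

    islands, run_start = [], None
    for pos, v in enumerate(votes):
        if v >= min_votes and run_start is None:
            run_start = pos                      # a consensus run begins
        elif v < min_votes and run_start is not None:
            islands.append((run_start, pos))     # the run ends before pos
            run_start = None
    if run_start is not None:
        islands.append((run_start, genome_length))
    return islands
```

With three hypothetical tools predicting `(10, 50)`, `(30, 70)` and `(60, 90)` on a 100-base sequence, a two-vote threshold keeps only the regions where predictions overlap: `(30, 50)` and `(60, 70)`. Raising `min_votes` trades recall for precision; a per-base vote array is fine for a sketch but a sweep over interval endpoints would scale better to full genomes.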

11.
12.
The past decade has witnessed the construction of linkage and physical maps defining quantitative trait loci (QTL) in various domesticated species. Targeted chromosomal regions are being further characterized through the construction of bacterial artificial chromosome (BAC) contigs in order to isolate and characterize genes contributing towards phenotypic variation. Whole-genome BAC contigs are also being constructed that will serve as the tiling path for genomic sequencing. Harvesting this genetic information for biological gain requires either genetic selection or the production of genetically modified animals. This latter approach, when coupled with nuclear transfer (NT) technology, provides "clones" of genetically modified animals. However, to date, the production of genetically modified animals has been limited to either microinjection of small gene constructs into embryos with random insertion or complex gene constructs designed to knock out targeted gene expression. Neither of these approaches provides for introducing directed genetic manipulation allowing for allelic substitution [knock-in], subsequent analyses of gene expression, and cloning. An alternative approach utilizing genomic sequence information and recombineering to direct gene targeting of specific porcine BACs is presented here.

13.
We present BASIO, a software system that allows one to segment a sequence into regions with homogeneous nucleotide composition at a desired length scale. The system can work with an arbitrary alphabet and can therefore be applied to various (e.g. protein) sequences. Several sequences of complete genomes of eukaryotes are used to demonstrate the efficiency of the software. AVAILABILITY: The BASIO suite is available for non-commercial users free of charge as a set of executables and accompanying segmentation scenarios from http://www.imb.ac.ru/compbio/basio. To obtain the source code, contact the authors.

14.
Hasan MS, Liu Q, Wang H, Fazekas J, Chen B, Che D. Bioinformation 2012, 8(4):203-205
Genomic islands (GIs) are genomic regions that originate from other organisms and are acquired through a process known as horizontal gene transfer (HGT). Detection of GIs plays a significant role in biomedical research since such alien genomic regions usually contain important features, such as pathogenic genes. We have developed a user-friendly graphical user interface, Genomic Island Suite of Tools (GIST), which is a platform for scientific users to predict GIs. This software package includes five commonly used tools: AlienHunter, IslandPath, Colombo SIGI-HMM, INDeGenIUS and Pai-Ida. It also includes an optimization program, EGID, that combines the results of the existing tools for more accurate prediction. The tools in GIST can be used either separately or sequentially. GIST also includes a download feature that collects the input genomes automatically from the FTP server of the National Center for Biotechnology Information (NCBI). GIST was implemented in Java, and was compiled and executed on Linux/Unix operating systems. AVAILABILITY: GIST is available for free at http://www5.esu.edu/cpsc/bioinfo/software/GIST.

15.
We have developed software that applies ascertainment bias to simulated DNA sequences and calculates F(ST) on them, so that they can be used to generate neutral distributions appropriate for testing whether the genetic differentiation of a particular gene between populations is compatible with neutral evolution or, on the contrary, suggests local adaptation by natural selection. AVAILABILITY: FABSIM is available from http://www.snpator.com/public/downloads/aRamirez/FABSIM/.
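
For readers unfamiliar with F(ST), the sketch below computes it for a single biallelic locus using the textbook heterozygosity form, F(ST) = (H_T - H_S) / H_T. The abstract does not state which estimator FABSIM uses, so this formula, the equal weighting of populations, and the function name `fst` are assumptions chosen only to make the quantity concrete.

```python
def fst(freqs):
    """F_ST for one biallelic locus from per-population allele frequencies.

    `freqs` holds the frequency of the same allele in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                 # locus is monomorphic overall
    # mean expected heterozygosity within populations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

Identical frequencies in all populations give 0 (no differentiation), while populations fixed for different alleles give 1. A neutral distribution would then be obtained by computing this statistic over many simulated loci that passed the same ascertainment filter as the real markers, and comparing the observed value against that distribution.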

16.
Mass spectrometry is used to investigate global changes in protein abundance in cell lysates. Increasingly powerful methods of data collection have emerged over the past decade, but this has left researchers with the task of sifting through mountains of data for biologically significant results. Often, the end result is a list of proteins with no obvious quantitative relationships to define the larger context of changes in cell behavior. Researchers are often forced to perform a manual analysis from this list or to fall back on a range of disparate tools, which can hinder the communication of results and their reproducibility. To address these methodological problems, we developed Annotator, an application that filters validated mass spectrometry data and applies a battery of standardized heuristic and statistical tests to determine significance. To address systems-level interpretations, we incorporated UniProt and Gene Ontology keywords as statistical units of analysis, yielding quantitative information about changes in abundance for an entire functional category. This provides a consistent and quantitative method for formulating conclusions about cellular behavior, independent of network models or standard enrichment analyses. Annotator allows for "bottom-up" annotations that are based on experimental data and not inferred by comparison to external or hypothetical models. Annotator was developed as an independent postprocessing platform that runs on all common operating systems, thereby providing a useful tool for establishing the inherently dynamic nature of functional annotations, which depend on results from ongoing proteomic experiments. Annotator is available for download at http://people.cs.uchicago.edu/~tyler/annotator/annotator_desktop_0.1.tar.gz .

17.

Background  

We present Pegasys, a flexible, modular and customizable software system that facilitates the execution of heterogeneous biological sequence analysis tools and the integration of their data.

18.

Background  

The biological information in genomic expression data can be understood, and computationally extracted, in the context of systems of interacting molecules. The automation of this information extraction requires high throughput management and analysis of genomic expression data, and integration of these data with other data types.

19.
20.