Similar articles
20 similar articles found.
1.
A C++ class library is available to facilitate the implementation of software for genomics and sequence polymorphism analysis. The library implements methods for data manipulation and the calculation of several statistics commonly used to analyze SNP data. The object-oriented design of the library is intended to be extensible, allowing users to design custom classes for their own needs. In addition, routines are provided to process samples generated by a widely used coalescent simulation. AVAILABILITY: The source code (in C++) is available from http://www.molpopgen.org

2.
The NEXUS Class Library (NCL) is a collection of C++ classes designed to simplify interpreting data files written in the NEXUS format used by many computer programs for phylogenetic analyses. The NEXUS format allows different programs to share the same data files, even though none of the programs can interpret all of the data stored therein. Because users are not required to reformat the data file for each program, use of the NEXUS format prevents cut-and-paste errors as well as the proliferation of copies of the original data file. The purpose of making the NCL available is to encourage the use of the NEXUS format by making it relatively easy for programmers to add the ability to interpret NEXUS files in newly developed software. AVAILABILITY: The NCL is freely available under the GNU General Public License from http://hydrodictyon.eeb.uconn.edu/ncl/ Supplementary information: Documentation for the NCL (general information and source code documentation) is available in HTML format at http://hydrodictyon.eeb.uconn.edu/ncl/

3.
4.
ABSTRACT: BACKGROUND: Contact network models have become increasingly common in epidemiology, but we lack a flexible programming framework for the generation and analysis of epidemiological contact networks and for the simulation of disease transmission through such networks. RESULTS: Here we present EpiFire, an applications programming interface and graphical user interface implemented in C++, which includes a fast and efficient library for generating, analyzing and manipulating networks. Network-based percolation and chain-binomial simulations of susceptible-infected-recovered disease transmission, as well as traditional non-network mass-action simulations, can be performed using EpiFire. CONCLUSIONS: EpiFire provides an open-source programming interface for the rapid development of network models with a focus on contact network epidemiology. EpiFire also provides a point-and-click interface for generating networks, conducting epidemic simulations, and creating figures. This interface is particularly useful as a pedagogical tool.
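
To illustrate what a chain-binomial susceptible-infected-recovered simulation involves, the following is a minimal Python sketch of the classic Reed-Frost model under mass-action mixing. It is not EpiFire code and does not use EpiFire's C++ API; the function name `chain_binomial_sir` and its parameters are assumptions made for this example only.

```python
import random


def chain_binomial_sir(population, initial_infected, transmission_prob, seed=None):
    """Simulate a Reed-Frost chain-binomial SIR epidemic (illustrative sketch).

    Every infective contacts every susceptible once per generation and
    transmits with probability `transmission_prob`; infectives recover
    after one generation. Returns a list of (S, I, R) tuples, one per
    generation, ending when no infectives remain.
    """
    rng = random.Random(seed)
    susceptible = population - initial_infected
    infected = initial_infected
    recovered = 0
    history = [(susceptible, infected, recovered)]

    while infected > 0:
        # Probability that a susceptible is infected by at least one infective.
        p_infection = 1.0 - (1.0 - transmission_prob) ** infected
        # Binomial draw: each susceptible is infected independently.
        new_infected = sum(
            1 for _ in range(susceptible) if rng.random() < p_infection
        )
        recovered += infected
        susceptible -= new_infected
        infected = new_infected
        history.append((susceptible, infected, recovered))

    return history


# Hypothetical parameter values, chosen only for demonstration.
history = chain_binomial_sir(
    population=1000, initial_infected=1, transmission_prob=0.002, seed=42
)
final_size = history[-1][2]  # number recovered when the epidemic ends
print(f"generations: {len(history) - 1}, final size: {final_size}")
```

A network-based variant would restrict the contacts in each generation to a node's neighbours rather than the whole population; that step is omitted here.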

5.
6.
7.
Transfer RNAs (tRNAs) are important molecules that are involved in the protein translation machinery and act as a bridge between the ribosome and the codons of the mRNA. The study of tRNA is advancing considerably in bacteria, plants, and animals. However, a detailed genomic study of cyanobacterial tRNA is lacking. Therefore, we conducted a study of cyanobacterial tRNA from 61 species. The analysis revealed that cyanobacteria contain 36 to 78 tRNA genes per genome, encoding 20 tRNA isotypes. The number of iso-acceptors (anti-codons) ranged from 32 to 43 per genome. tRNAIle with anti-codon AAU, GAU, or UAU was absent from the genomes of Gloeocapsa PCC 73106 and Xenococcus sp. PCC 7305. Instead, these genomes contained the anti-codon CAU, which is common to both tRNAMet and tRNAIle. The iso-acceptors ACA (tRNACys), ACC (tRNAGly), AGA, ACU (tRNASer), AAA (tRNAPhe), AGG (tRNAPro), AAC (tRNAVal), GCG (tRNAArg), AUG (tRNAHis), and AUC (tRNAAsp) were absent from the genomes of the cyanobacterial lineages studied so far. A few of the cyanobacterial species encode suppressor tRNAs, whereas none of the species were found to encode a selenocysteine iso-acceptor. Cyanobacterial species encode a few putative novel tRNAs whose functions are yet to be elucidated.

8.

Background  

Repeat-rich regions such as centromeres receive less attention than their gene-rich euchromatic counterparts because the former are difficult to assemble and analyze. Our objectives were to 1) map all ten centromeres onto the maize genetic map and 2) characterize the sequence features of maize centromeres, each of which spans several megabases of highly repetitive DNA. Repetitive sequences can be mapped using special molecular markers that are based on PCR with primers designed from two unique "repeat junctions". Efficient screening of large amounts of maize genome sequence data for repeat junctions, as well as for key centromere sequence features, required the development of specific annotation software.

9.
10.
11.
12.
Non-coding variants have long been recognized as important contributors to common disease risks, but with the expansion of clinical whole genome sequencing, examples of rare, high-impact non-coding variants are also accumulating. Despite recent advances in the study of regulatory elements and the availability of specialized data collections, the systematic annotation of non-coding variants from genome sequencing remains challenging. Here, we propose a new framework for the prioritization of non-coding regulatory variants that integrates information about regulatory regions with prediction scores and HPO-based prioritization. Firstly, we created a comprehensive collection of annotations for regulatory regions including a database of 2.4 million regulatory elements (GREEN-DB) annotated with controlled gene(s), tissue(s) and associated phenotype(s) where available. Secondly, we calculated a variation constraint metric and showed that constrained regulatory regions associate with disease-associated genes and essential genes from mouse knock-outs. Thirdly, we compared 19 non-coding impact prediction scores providing suggestions for variant prioritization. Finally, we developed a VCF annotation tool (GREEN-VARAN) that can integrate all these elements to annotate variants for their potential regulatory impact. In our evaluation, we show that GREEN-DB can capture previously published disease-associated non-coding variants as well as identify additional candidate disease genes in trio analyses.

13.
The Xylella fastidiosa comparative genomic database is a scientific resource that aims to provide a user-friendly interface for accessing high-quality, manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high-quality genomic annotation, the functional and comparative genomic analysis, and the identification and mapping of prophage-like elements. It is available at http://www.xylella.lncc.br.

14.
SUMMARY: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the 'large p, small n' case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. AVAILABILITY: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html.

15.

Background

The genome-wide association (GWA) study has recently become a powerful approach for detecting genetic variants for common diseases without prior knowledge of the variant's location or function. Generally, in GWA studies, the most significant single-nucleotide polymorphisms (SNPs), those with top-ranked p values, are selected in stage one, with follow-up in stage two. The value of selecting SNPs based on statistically significant p values is obvious. However, when minor allele frequencies (MAFs) are relatively low, less-significant p values can still correspond to higher odds ratios (ORs), which might be more useful for prediction of disease status. Therefore, if SNPs are selected using an approach based only on significant p values, some important genetic variants might be missed. We proposed a hybrid approach for selecting candidate SNPs from the discovery stage of a GWA study, based on both p values and ORs, and conducted a simulation study to demonstrate the performance of our approach.

Results

The simulation results showed that our hybrid ranking approach was more powerful than the existing ranked p value approach for identifying relatively less-common SNPs. Meanwhile, the type I error probabilities of the hybrid approach are well controlled at the end of the second stage of the two-stage GWA study.

Conclusions

In GWA studies, SNPs should be considered for inclusion based not only on ranked p values but also on ranked ORs.
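
The abstract does not state how p values and ORs are combined, so the following Python sketch is only one possible illustration, not the authors' method. It assumes a simple rank-sum rule: each SNP is ranked once by p value and once by the magnitude of its log odds ratio, and the two ranks are added. The function names, the rank-sum rule, and the SNP identifiers and values are all hypothetical.

```python
import math


def odds_ratio(case_minor, case_major, control_minor, control_major):
    """Allelic odds ratio from a 2x2 table of allele counts.

    Assumes all four counts are non-zero (no continuity correction).
    """
    if min(case_minor, case_major, control_minor, control_major) <= 0:
        raise ValueError("all allele counts must be positive")
    return (case_minor * control_major) / (case_major * control_minor)


def hybrid_rank(snps, top_k):
    """Select `top_k` SNPs by the sum of two ranks (assumed rule).

    `snps` maps a SNP name to a (p_value, odds_ratio) pair. Rank 0 is the
    smallest p value, respectively the largest |log OR|. Ties in the
    summed rank are broken by the p-value rank.
    """
    by_p = sorted(snps, key=lambda name: snps[name][0])
    by_or = sorted(snps, key=lambda name: -abs(math.log(snps[name][1])))
    p_rank = {name: i for i, name in enumerate(by_p)}
    or_rank = {name: i for i, name in enumerate(by_or)}
    combined = sorted(
        snps, key=lambda name: (p_rank[name] + or_rank[name], p_rank[name])
    )
    return combined[:top_k]


# Made-up discovery-stage results: name -> (p value, odds ratio).
example = {
    "rs1": (1e-8, 1.2),
    "rs2": (1e-4, 3.0),  # less significant, but a large effect
    "rs3": (1e-6, 1.1),
    "rs4": (0.5, 1.0),
}
print(hybrid_rank(example, top_k=2))
```

In this made-up example, ranking by p value alone would carry rs1 and rs3 forward, whereas the rank-sum rule carries rs1 and rs2 forward, which mirrors the abstract's point that a less-significant SNP with a higher OR can be missed by p-value ranking.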

16.

Background  

Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes, which consists of detecting the position and structure of genes and inferring their function (as well as that of other genomic features). Structural and functional annotation both require the complex chaining of numerous different software tools, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage the huge amounts of data released by sequencing projects. Several pipelines already automate some of this complex chaining but still require substantial input from biologists to supervise and control the results at various steps.

17.
18.
Springs SL, Bass SE, McLendon GL. Biochemistry 2000, 39(20):6075-6082
A general understanding of how cytochromes evolve within a fixed structure to optimize redox potential for specific bioenergetic processes does not exist. Toward this end, a library approach is used to investigate the range and distribution of redox potential which occurs when all sequence space available through mutation at two positions is examined within a fixed structural motif. Random mutation of Phe61 and Phe65 of cytochrome b562 (E. coli), and subsequent examination of a statistically significant sampling of this library, demonstrates that the redox potential can vary over 100 mV (>25% of the known accessible potential in native proteins with axial His-Met ligation) through mutation at these two positions. The redox potential of the wild-type protein occurs at an extremum of the distribution observed, indicating that Phe61 and Phe65 were most likely naturally selected to differentially stabilize the reduced state of the protein. At the other extremum, a compositionally conservative set of mutations (F61I, F65Y) leads to a 100 mV shift in the redox equilibrium toward the oxidized state. NMR analyses indicate that a charge-dipole interaction which results from mutation of phenylalanine to tyrosine at position 65 may be responsible.

19.
20.

Background  

An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time-consuming task.
