Similar articles
1.
Peng J  Yang J  Jin Q 《PloS one》2011,6(4):e18509

Background

The completion of numerous genome sequences introduced an era of whole-genome study. However, many genes are missed during genome annotation, including small RNAs (sRNAs) and small open reading frames (sORFs). In order to improve genome annotation, we aimed to identify novel sRNAs and sORFs in Shigella, the principal etiologic agents of bacillary dysentery.

Methodology/Principal Findings

We identified 64 sRNAs in Shigella, which were experimentally validated in other bacteria based on sequence conservation. We employed computer-based and tiling array-based methods to search for sRNAs, followed by RT-PCR and northern blots, to identify nine sRNAs in Shigella flexneri strain 301 (Sf301) and 256 regions containing possible sRNA genes. We found 29 candidate sORFs using bioinformatic prediction, array hybridization and RT-PCR verification. We experimentally validated 557 (57.9%) DOOR operon predictions in the chromosomes of Sf301 and 46 (76.7%) in the virulence plasmid. We found 40 additional co-expressed gene pairs that were not predicted by DOOR.

Conclusions/Significance

We provide an updated and comprehensive annotation of the Shigella genome. Our study increased the expected numbers of sORFs and sRNAs, which will impact on future functional genomics and proteomics studies. Our method can be used for large scale reannotation of sRNAs and sORFs in any microbe with a known genome sequence.

2.

Background  

Complexity and noise in expression quantitative trait loci (eQTL) studies make it difficult to distinguish potential regulatory relationships among the many interactions. The predominant method of identifying eQTLs finds associations that are significant at a genome-wide level. The vast number of statistical tests carried out on these data makes false negatives very likely. Corrections for multiple testing error render genome-wide eQTL techniques unable to detect modest regulatory effects.
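The loss of power described above can be illustrated with a Bonferroni cutoff, one common multiple-testing correction. This is a minimal sketch only: the abstract does not say which correction is used, and the marker and transcript counts below are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical study size (not taken from the abstract).
n_markers = 100_000
n_transcripts = 20_000
n_tests = n_markers * n_transcripts  # every marker tested against every transcript

genome_wide_cutoff = bonferroni_threshold(0.05, n_tests)
# An analysis restricted to far fewer candidate pairs keeps a much looser cutoff.
targeted_cutoff = bonferroni_threshold(0.05, 1_000)

print(f"genome-wide cutoff: {genome_wide_cutoff:.1e}")
print(f"targeted cutoff:    {targeted_cutoff:.1e}")
```

With billions of tests the per-test cutoff becomes so strict that an association of modest effect size is unlikely to pass it, which is the false-negative problem the abstract refers to.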

3.

Background

Most cellular signal transduction mechanisms depend on a few molecular partners whose roles depend on their position and movement in relation to the input signal. This movement can follow various rules and take place in different compartments. Additionally, the molecules can form transient complexes. Complexation and signal transduction depend on the specific states partners and complexes adopt. Several spatial simulators have been developed to date, but none are able to model reaction-diffusion of realistic multi-state transient complexes.

Results

Meredys allows for the simulation of multi-component, multi-feature state molecular species in two and three dimensions. Several compartments can be defined with different diffusion and boundary properties. The software employs a Brownian dynamics engine to simulate reaction-diffusion systems at the reactive particle level, based on compartment properties, complex structure, and hydrodynamic radii. Zeroth-, first-, and second-order reactions are supported. The molecular complexes have realistic geometries. Reactive species can contain user-defined feature states which can modify reaction rates and outcome. Models are defined in a versatile NeuroML input file. The simulation volume can be split into subvolumes to speed up run-time.

Conclusions

Meredys provides a powerful and versatile way to run accurate simulations of molecular and sub-cellular systems that complement existing multi-agent simulation systems. Meredys is free software and the source code is available at http://meredys.sourceforge.net/.
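As a rough illustration of how a Brownian dynamics engine can move a particle based on its hydrodynamic radius, the sketch below derives a diffusion coefficient from the Stokes-Einstein relation and applies Gaussian displacement steps. It is not Meredys code: the function names, the 3 nm radius, the water-like viscosity, and the time step are all assumptions, and reactions, boundaries, and rotational diffusion are ignored.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(radius_m: float, temp_k: float = 298.0,
                    viscosity: float = 1e-3) -> float:
    """Diffusion coefficient (m^2/s) of a sphere with the given hydrodynamic radius."""
    return K_B * temp_k / (6.0 * math.pi * viscosity * radius_m)


def brownian_step(pos, diff_coeff: float, dt: float, rng: random.Random):
    """Advance a 3D position by one free-diffusion step of duration dt."""
    sigma = math.sqrt(2.0 * diff_coeff * dt)  # per-axis standard deviation
    return tuple(x + rng.gauss(0.0, sigma) for x in pos)


rng = random.Random(0)  # fixed seed so the run is repeatable
d = stokes_einstein(radius_m=3e-9)  # hypothetical 3 nm particle in water
pos = (0.0, 0.0, 0.0)
for _ in range(1000):
    pos = brownian_step(pos, d, dt=1e-6, rng=rng)

print(f"D = {d:.3e} m^2/s, final position = {pos}")
```

A particle-level engine would repeat such a step for every molecule and then test for reactions between nearby particles; that part is omitted here.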

4.
5.
LCA is a system-wide assessment, and the LCIA phase is confronted with the difficulties of local and regional effects in a number of impact categories. We integrate three different environmental techniques to demonstrate how these effects can be addressed in an environmental assessment. The techniques are life cycle inventory, environmental fate models, and an ecological impact assessment using fuzzy expert systems. Results of the LCI are mass and energy flows. In the environmental fate modelling step these mass flows are transformed into concentration and immission values by dispersion-reaction models. A generalised fuzzy expert system for the environmental mechanisms compares calculated exposure with site-specific buffering capacities and formulates a generalised dose-response relationship. This generalised fuzzy expert system is used as a template for the assessment of local and regional environmental impacts. An application of this integrated approach is shown for a practical problem: production of magnesium car components. The environmental fate of nitrogen oxides which are released due to the major combustion source within that production system is simulated. Fuzzy expert models for crop damage, soil acidification and eutrophication determine the possible environmental impact of the immited nitrogen oxides. The important methodological extension of this integrated approach is a regionalised impact assessment depending on the spatial distribution of environmental characteristics.

6.
The problem of identifying significantly differentially expressed genes for replicated microarray experiments is accepted as significant and has been tackled by several researchers. Patterns from Gene Expression (PaGE) and q-values are two of the well-known approaches developed to handle this problem. This paper proposes a powerful approach to handle this problem. We first propose a method for estimating the prior probabilities used in the first version of the PaGE algorithm. This way, the problem definition of PaGE stays intact and we just estimate the needed prior probabilities. Our estimation method is similar to Storey's estimator without being its direct extension. Then, we modify the problem formulation to find significantly differentially expressed genes and present an efficient method for finding them. This formulation increases the power by directly incorporating Storey's estimator. We report the preliminary results on the BRCA data set to demonstrate the applicability and effectiveness of our approach.
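For readers unfamiliar with Storey's estimator, the sketch below shows its usual form: the proportion of true null hypotheses is estimated from the p-values that exceed a tuning threshold. It illustrates only the estimator the abstract refers to, not the authors' own estimation method; the function name, the default threshold of 0.5, and the p-values are assumptions.

```python
def storey_pi0(p_values, lam: float = 0.5) -> float:
    """Storey's estimate of the proportion of true null hypotheses.

    P-values above `lam` are assumed to come mostly from true nulls, which
    are uniform on [0, 1], so their count is rescaled by 1 / (1 - lam).
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(p_values)
    if m == 0:
        raise ValueError("p_values must not be empty")
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))  # cap at 1, since it is a proportion


# Made-up p-values for ten genes (not from the BRCA data set).
p_values = [0.001, 0.004, 0.01, 0.03, 0.2, 0.35, 0.55, 0.6, 0.8, 0.95]
pi0 = storey_pi0(p_values)
print(f"estimated proportion of non-differentially expressed genes: {pi0:.2f}")
```

One minus this estimate gives a prior probability that a gene is differentially expressed, which is the kind of quantity the abstract says the first version of PaGE requires.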

7.
We have developed an improved method for photofootprinting in vivo which utilizes the thermostable DNA polymerase from T. aquaticus (Taq) in a primer extension assay. UV light is used to introduce photoproducts into the genomic DNA of intact yeast cells. The photoproducts are then detected and mapped at the nucleotide level by multiple rounds of annealing and extension using Taq polymerase, which is blocked by photoproducts in the template DNA. The method is more rapid, sensitive, and reproducible than the previously described chemical photofootprinting procedure developed in this laboratory (Nature 325, 173-177), and detects photoproducts with a specificity which is similar, but not identical, to that of the previously described procedure. Binding of GAL4 protein to its binding sites within the GAL1-10 upstream activating sequence is demonstrated using the primer extension photofootprinting method. The primer extension assay can also be used to map DNA strand breakage generated by other footprinting methods, and to determine DNA sequence directly from the yeast genome.

8.
MORGAN is an integrated system for finding genes in vertebrate DNA sequences. MORGAN uses a variety of techniques to accomplish this task, the most distinctive of which is a decision tree classifier. The decision tree system is combined with new methods for identifying start codons, donor sites, and acceptor sites, and these are brought together in a frame-sensitive dynamic programming algorithm that finds the optimal segmentation of a DNA sequence into coding and noncoding regions (exons and introns). The optimal segmentation is dependent on a separate scoring function that takes a subsequence and assigns to it a score reflecting the probability that the sequence is an exon. The scoring functions in MORGAN are sets of decision trees that are combined to give a probability estimate. Experimental results on a database of 570 vertebrate DNA sequences show that MORGAN has excellent performance by many different measures. On a separate test set, it achieves an overall accuracy of 95%, with a correlation coefficient of 0.78, and a sensitivity and specificity for coding bases of 83% and 79%. In addition, MORGAN identifies 58% of coding exons exactly; i.e., both the beginning and end of the coding regions are predicted correctly. This paper describes the MORGAN system, including its decision tree routines and the algorithms for site recognition, and its performance on a benchmark database of vertebrate DNA.
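The base-level measures quoted above can be computed from a confusion matrix of coding versus noncoding bases. The sketch below is an assumption about how such figures are typically defined in gene-finding evaluations, where "specificity" usually means the fraction of predicted coding bases that are truly coding; the abstract does not give the formulas, and the counts are invented for illustration.

```python
import math


def base_level_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy measures for per-base coding/noncoding predictions.

    tp: coding bases predicted as coding      fp: noncoding predicted as coding
    tn: noncoding predicted as noncoding      fn: coding predicted as noncoding
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        # Gene-finding convention (assumed here): share of predicted coding
        # bases that are correct, i.e. what is elsewhere called precision.
        "specificity": tp / (tp + fp),
        "correlation": (tp * tn - fp * fn) / denom,
    }


# Invented counts for a 1000-base test sequence (not MORGAN's data).
metrics = base_level_metrics(tp=83, fp=22, tn=878, fn=17)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```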

9.
Deregulation of cell signaling pathways plays a crucial role in the development of tumors. The identification of such pathways requires effective analysis tools that facilitate the interpretation of expression differences. Here, we present a novel and highly efficient method for identifying deregulated subnetworks in a regulatory network. Given a score for each node that measures the degree of deregulation of the corresponding gene or protein, the algorithm computes the heaviest connected subnetwork of a specified size reachable from a designated root node. This root node can be interpreted as a molecular key player responsible for the observed deregulation. To demonstrate the potential of our approach, we analyzed three gene expression data sets. In one scenario, we compared expression profiles of non-malignant primary mammary epithelial cells derived from BRCA1 mutation carriers and of epithelial cells without BRCA1 mutation. Our results suggest that oxidative stress plays an important role in epithelial cells of BRCA1 mutation carriers and that the activation of stress proteins may result in avoidance of apoptosis leading to an increased overall survival of cells with genetic alterations. In summary, our approach opens new avenues for the elucidation of pathogenic mechanisms and for the detection of molecular key players.
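The optimization problem stated above (the heaviest subnetwork of a fixed size in which every node is reachable from the root) can be made concrete with an exhaustive search on a toy network. This is deliberately not the efficient algorithm the abstract describes, which is not specified here; it enumerates all candidate node sets and is only practical for very small graphs. The node names, scores, and edges are hypothetical.

```python
from itertools import combinations


def reachable_within(root, nodes, edges):
    """Nodes of `nodes` reachable from `root` using only edges inside `nodes`."""
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in edges.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def heaviest_rooted_subnetwork(root, k, scores, edges):
    """Brute-force search for the best-scoring k-node set rooted at `root`."""
    others = [n for n in scores if n != root]
    best_nodes, best_score = None, float("-inf")
    for combo in combinations(others, k - 1):
        nodes = {root, *combo}
        if reachable_within(root, nodes, edges) != nodes:
            continue  # some node cannot be reached from the root inside this set
        total = sum(scores[n] for n in nodes)
        if total > best_score:
            best_nodes, best_score = nodes, total
    return best_nodes, best_score


# Toy directed regulatory network with made-up deregulation scores.
scores = {"A": 1.0, "B": 0.2, "C": 2.5, "D": 3.0, "E": 0.1}
edges = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
best_nodes, best_score = heaviest_rooted_subnetwork("A", 3, scores, edges)
print(sorted(best_nodes), round(best_score, 2))
```

In this toy case the low-scoring node B is included because it is the only route from the root A to the high-scoring node D, which shows why connectivity and score have to be optimized together.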

10.
Heuristic approach to deriving models for gene finding.
Computer methods of accurate gene finding in DNA sequences require models of protein coding and non-coding regions derived either from experimentally validated training sets or from large amounts of anonymous DNA sequence. Here we propose a new, heuristic method producing fairly accurate inhomogeneous Markov models of protein coding regions. The new method needs such a small amount of DNA sequence data that the model can be built 'on the fly' by a web server for any DNA sequence >400 nt. Tests on 10 complete bacterial genomes performed with the GeneMark.hmm program demonstrated the ability of the new models to detect 93.1% of annotated genes on average, while models built by traditional training predict an average of 93.9% of genes. Models built by the heuristic approach could be used to find genes in small fragments of anonymous prokaryotic genomes and in genomes of organelles, viruses, phages and plasmids, as well as in highly inhomogeneous genomes where adjustment of models to local DNA composition is needed. The heuristic method also gives an insight into the mechanism of codon usage pattern evolution.

11.
12.
Physiological and morphological characters were recorded from 55 strains of 17 Phoma taxa and one Pyrenochaeta. The results were subjected to numerical analysis and UPGMA dendrograms were produced. The full results were compared with TLC profiles of secondary metabolites. Seven distinct clusters were recovered from dendrograms based on full and partial character sets, and the grouping of strains within each cluster is discussed. The new combination Phoma sambuci-nigrae (Sacc.) Monte, Bridge & Sutton is proposed for P. herbarum f. sambuci-nigrae Sacc.

13.
14.

Background  

Clustering techniques are routinely used in gene expression data analysis to organize the massive data. Clustering techniques arrange a large number of genes or assays into a few clusters while maximizing the intra-cluster similarity and inter-cluster separation. While clustering of genes facilitates learning the functions of un-characterized genes using their association with known genes, clustering of assays reveals the disease stages and subtypes. Many clustering algorithms require the user to specify the number of clusters a priori. A wrong specification of the number of clusters generally leads to either failure to detect novel clusters (disease subtypes) or unnecessary splitting of natural clusters.

15.
BACKGROUND: Most phenomena in developmental biology involve or depend upon cell migration. This article describes a comprehensive framework for the characterization and analysis of trajectories defined by cell movement. The following two perspectives are considered: (a) the behavior of each individual cell and (b) interactions between neighboring pairs of cells. METHODS: The measurements considered for individual trajectories include the velocity magnitude and orientation, maximum spatial dispersion, displacement effectiveness, and displacement entropies. Interactions between two trajectories are characterized by comparing the respective velocities. RESULTS: The potential of the overall framework is illustrated using data of moving cells in different biological environments. The work shows that it is possible to use the new algorithm presented here to characterize cell motility. CONCLUSIONS: The features of the algorithm were successful in determining the motility changes under different experimental conditions.
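Some of the per-trajectory measurements listed in METHODS can be sketched for a 2D track sampled at a fixed interval. The definitions below are assumptions, since the abstract does not give formulas: in particular, "displacement effectiveness" is taken here as net displacement divided by total path length, and the sample track is invented.

```python
import math


def trajectory_metrics(points, dt: float = 1.0) -> dict:
    """Speed, heading, and displacement effectiveness of a 2D trajectory.

    points: list of (x, y) positions sampled every `dt` time units.
    """
    if len(points) < 2:
        raise ValueError("need at least two positions")
    steps = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    speeds = [math.hypot(dx, dy) / dt for dx, dy in steps]    # velocity magnitude
    headings = [math.atan2(dy, dx) for dx, dy in steps]       # orientation, radians
    path_length = sum(math.hypot(dx, dy) for dx, dy in steps)
    net = math.dist(points[0], points[-1])                    # start-to-end distance
    # Assumed definition: 1.0 for a straight path, near 0 for a meandering one.
    effectiveness = net / path_length if path_length > 0 else 0.0
    return {"speeds": speeds, "headings": headings,
            "path_length": path_length, "effectiveness": effectiveness}


# Invented track of one cell over three time steps.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 3.0)]
result = trajectory_metrics(track)
print(result["speeds"], round(result["effectiveness"], 3))
```

Pairwise interactions could then be characterized by comparing the `speeds` and `headings` lists of two neighboring tracks step by step, in the spirit of the velocity comparison the abstract mentions.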

16.
Recent genomic projects reveal that about half of the gene repertoire in plant genomes is made up of multigene families. In this paper, a set of structural and phylogenetic analyses has been applied to compare the differently sized nicotianamine synthase (NAS) gene families in barley and rice. Nicotianamine acts as a chelator of iron and other heavy metals and plays a key role in uptake, phloem transport and cytoplasmic distribution of iron, challenging efforts for the breeding of iron-efficient crop plants. Nine barley NAS genes have been mapped, and co-linearity of flanking genes in barley and rice was determined. The combined analyses reveal that the NAS multigene family members in barley originated through at least one duplication event that occurred before the divergence of rice and barley. Additional duplications appear to have occurred within each of the species. Although we detected no evidence for positive selection of recently duplicated genes within species, codon-based tests revealed evidence for positive selection having contributed to the divergence of some amino acids. The integrated comparative and phylogenetic analysis improved our current view of NAS gene family evolution, might facilitate the functional characterization of individual members and is applicable to other multigene families. Electronic supplementary material: Supplementary material is available in the online version of this article and is accessible for authorized users.

17.
We developed a dynamic programming approach for computing common sequence structure patterns between two RNAs given their primary sequences and their secondary structures. Common patterns between two RNAs are defined to share the same local sequential and structural properties. The locality is based on the connections of nucleotides given by their phosphodiester and hydrogen bonds. The idea of interpreting secondary structures as chains of structure elements leads us to develop an efficient dynamic programming approach in time O(nm) and space O(nm), where n and m are the lengths of the RNAs. The biological motivation is given by detecting common, local regions of RNAs, although they do not necessarily share global sequential and structural properties. This might happen if RNAs fold into different structures but share a lot of local, stable regions. Here, we illustrate our algorithm on Hepatitis C virus internal ribosome entry sites. Our method is useful for detecting and describing local motifs as well. An implementation in C++ is available and can be obtained by contacting one of the authors.

18.
H Q Wang  X Y Zhang 《Génome》2006,49(2):181-189
High-molecular-weight glutenin subunits (HMW-GSs) play an important role in the breadmaking quality of wheat flour. In China, cultivars such as Triticum aestivum 'Xiaoyan No. 6' carrying the 1Bx14 and 1By15 glutenin subunits usually have attributes that result in high-quality bread and noodles. HMW-GS 1Bx14 and 1By15 were isolated by preparative sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) and used as an antigen to immunize BALB/c mice. A resulting monoclonal antibody belonging to the IgG1 subclass was shown to bind to all HMW-GSs of Triticum aestivum cultivars, but did not bind to other storage proteins of wheat seeds in a Western blot analysis. After screening a complementary DNA expression library from immature seeds of 'Xiaoyan No. 6' using the monoclonal antibody, the HMW-GS 1By15 gene was isolated and fully sequenced. The deduced amino acid sequence showed an extra stretch of 15 amino acid repeats consisting of a hexapeptide and a nonapeptide in the repetitive domain of this y-type HMW subunit. Bacterial expression of a modified 1By15 gene, in which the coding sequence for the signal peptide was removed and a BamHI site eliminated, gave rise to a protein with mobility identical to that of HMW-GSs extracted from seeds of 'Xiaoyan No. 6' via SDS-PAGE. This approach for isolating genes using specific monoclonal antibody against HMW-GS genes is a good alternative to the extensively used polymerase chain reaction (PCR) technology based on sequence homology of HMW-GSs in wheat and its relatives.

19.
Aspartic proteases play key roles in a variety of pathologies, including acquired immunodeficiency syndrome. Peptidomimetic inhibitors can act as drugs to combat these pathologies. We have developed an integrated methodology for preparing human immunodeficiency virus (HIV)-1 aspartic protease diaminodiol inhibitors, based on a computational method that predicts the potential inhibitory activity of the designed structures in terms of calculated enzyme-inhibitor complexation energies. This is combined with a versatile synthetic strategy that couples a high degree of stereochemical control in the central diaminodiol module with complete flexibility in the choice of side chains in the core and in flanking residues. A series of 23 tetrameric, pentameric and hexameric inhibitors, with a wide range of calculated relative complexation energies (-47.2 to +117 kJ·mol⁻¹) and predicted hydrophobicities (logPo/w = 1.8-8.4), was thus assembled from readily available amino acids and carboxylic acids. The IC50 values for these compounds ranged from 3.2 nM to 90 µM, allowing study of correlations between structure and activity, and individuation of factors other than calculated complexation energies that determine the inhibition potency. Multivariable regression analysis revealed the importance of side-chain bulkiness and rigidity at the P2, P2' positions, suggesting possible improvements for the prediction process used to select candidate structures.

20.