Similar articles
 Found 20 similar articles; search took 15 ms
1.
2.
MOTIVATION: The recent discovery of the first small modulatory RNA (smRNA) presents the challenge of finding other molecules of similar length and conservation level. Unlike short interfering RNA (siRNA) and micro-RNA (miRNA), effective computational and experimental screening methods are not currently known for this species of RNA molecule, and the discovery of the one known example was partly fortuitous because it happened to be complementary to a well-studied DNA binding motif (the Neuron Restrictive Silencer Element). RESULTS: The existing comparative genomics approaches (e.g., phylogenetic footprinting) rely on alignments of orthologous regions across multiple genomes. This approach, while extremely valuable, is not suitable for finding motifs with highly diverged "non-alignable" flanking regions. Here we show that several unusually long and well conserved motifs can be discovered de novo through a comparative genomics approach that does not require an alignment of orthologous upstream regions. These motifs, including Neuron Restrictive Silencer Element, were missed in recent comparative genomics studies that rely on phylogenetic footprinting. While the functions of these motifs remain unknown, we argue that some may represent biologically important sites. AVAILABILITY: Our comparative genomics software, a web-accessible database of our results and a compilation of experimentally validated binding sites for NRSE can be found at http://www.cse.ucsd.edu/groups/bioinformatics.

3.
SUMMARY: PRIMEX (PRImer Match EXtractor) can detect oligonucleotide sequences in whole genomes, allowing for mismatches. Using a word lookup table and server functionality, PRIMEX accepts queries from client software and returns matches rapidly. We find it faster and more sensitive than currently available tools. AVAILABILITY: Running applications and source code have been made available at http://bioinformatics.cribi.unipd.it/primex
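
The idea of finding oligonucleotides with mismatches through a word lookup table can be illustrated with a small sketch. This is not PRIMEX's actual implementation, which the abstract does not describe; it assumes a common pigeonhole seeding scheme, and the function names (`build_word_table`, `find_matches`) and the word length `w` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_word_table(genome: str, w: int) -> dict:
    """Map every length-w word to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(genome) - w + 1):
        table[genome[i:i + w]].append(i)
    return table


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(genome, table, w, query, max_mismatches):
    """Return sorted (start, mismatches) pairs for ungapped matches of query.

    Assumed pigeonhole seeding: cut the query into max_mismatches + 1
    non-overlapping words of length w; with at most max_mismatches
    mismatches, at least one of those words must match exactly.
    """
    n_seeds = max_mismatches + 1
    if n_seeds * w > len(query):
        raise ValueError("query too short for this w and mismatch budget")
    hits = {}
    for s in range(n_seeds):
        offset = s * w
        seed = query[offset:offset + w]
        for pos in table.get(seed, []):
            start = pos - offset
            if start < 0 or start + len(query) > len(genome):
                continue  # candidate would run off the genome
            if start in hits:
                continue  # already verified through another seed
            d = hamming(genome[start:start + len(query)], query)
            if d <= max_mismatches:
                hits[start] = d
    return sorted(hits.items())


genome = "ACGTACGTTTGACCA"
table = build_word_table(genome, 3)
print(find_matches(genome, table, 3, "ACGTAT", 1))
```

The table is built once and reused for every query, which is what makes a server that answers many client queries against the same genome plausible; only the candidate positions proposed by a seed are verified base by base.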

4.

Background  

Complete sequencing of bacterial genomes has become a common technique of present day microbiology. Thereafter, data mining in the complete sequence is an essential step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes.

5.
Frequent assimilation of mitochondrial DNA by grasshopper nuclear genomes   Total citations: 17, self-citations: 0, citations by others: 17
Multiple copies of mitochondrial-like DNA were found in the brown mountain grasshopper, Podisma pedestris (Orthoptera: Acrididae), paralogous to COI and ND5 regions. The same was discovered using the ND5 regions of nine other grasshopper species from four separate subfamilies (Podisminae, Calliptaminae, Cyrtacanthacridinae, and Gomphocerinae). The extra ND5-like sequences were shown to be nuclear in the desert locust, Schistocerca gregaria (Cyrtacanthacridinae), and probably so in P. pedestris and an Italopodisma sp. (Podisminae). Eighty-seven different ND5-like nuclear mitochondrial pseudogenes (Numts) were sequenced from 12 grasshopper individuals. Different nuclear mitochondrial pseudogenes, if descended from the same mitochondrial immigrant, will have diverged from each other under no selective constraints because of their loss of functionality. Evidence of selective constraints in the differences between any two Numt sequences (e.g., if most differences are at third positions of codons) implies that they have separate mitochondrial origins. Through pairwise comparisons of pseudogene sequences, it was established that there have been at least 12 separate mtDNA integrations into P. pedestris nuclear genomes. This is the highest reported rate of horizontal transfer between organellar and nuclear genomes within a single animal species. The occurrence of numerous mitochondrial pseudogenes in nuclear genomes derived from separate integration events appears to be a common phenomenon among grasshoppers. More than one type of mechanism appears to have been involved in generating the observed grasshopper Numts.

6.
The three genomes of Chlamydomonas   Total citations: 1, self-citations: 0, citations by others: 1
During the past 50 years, the green unicellular alga Chlamydomonas reinhardtii has played a key role as a model system for the study of photosynthesis and chloroplast biogenesis. This is due to its well-established nuclear and chloroplast genetics, its dispensable photosynthetic function in the presence of acetate, and its highly efficient nuclear and chloroplast transformation systems. Considerable progress has been achieved in our understanding of the structure, function, inheritance, and expression of nuclear, chloroplast, and mitochondrial genes and of the molecular cross-talk between the nuclear, chloroplast, and mitochondrial genetic systems. This revised version was published online in June 2006 with corrections to the Cover Date.

7.
Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size.

8.
A significant problem in biological motif analysis arises when the background symbol distribution is biased (e.g. high/low GC content in the case of DNA sequences). This can lead to overestimation of the amount of information encoded in a motif. A motif can be depicted as a signal using information theory (IT). We apply two concepts from IT, distortion and patterned interference (a type of noise), to model genomic and codon bias respectively. This modeling approach allows us to correct a raw signal to recover signals that are weakened by compositional bias. The corrected signal is more likely to be discriminated from a biased background by a macromolecule. We apply this correction technique to recover ribosome-binding site (RBS) signals from available sequenced and annotated prokaryotic genomes having diverse compositional biases. We observed that linear correction was sufficient for recovering signals even at the extremes of these biases. Further comparative genomics studies were made possible upon correction of these signals. We find that the average Euclidean distance between RBS signal frequency matrices of different genomes can be significantly reduced by using the correction technique. Within this reduced average distance, we can find examples of class-specific RBS signals. Our results have implications for motif-based prediction, particularly with regard to the estimation of reliable inter-genomic model parameters.
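
The overestimation described above can be made concrete with a short sketch. It does not reproduce the authors' distortion and interference model, which the abstract does not specify; it only assumes the standard relative-entropy definition of column information and a plain Euclidean distance between frequency matrices. The function names and the example frequencies are hypothetical.

```python
import math


def column_information(freqs, background=None):
    """Information (bits) of one motif column relative to a background.

    freqs and background map each base to a probability. When no
    background is given, a uniform one (0.25 per base) is assumed.
    """
    if background is None:
        background = {b: 0.25 for b in "ACGT"}
    return sum(
        p * math.log2(p / background[b]) for b, p in freqs.items() if p > 0
    )


def matrix_distance(m1, m2):
    """Euclidean distance between two equal-length frequency matrices,
    each given as a list of per-column base -> frequency dicts."""
    return math.sqrt(
        sum((c1[b] - c2[b]) ** 2 for c1, c2 in zip(m1, m2) for b in "ACGT")
    )


# Hypothetical A-rich motif column scored in a hypothetical AT-rich genome.
column = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
at_rich = {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}

print(column_information(column))           # assumes a uniform background
print(column_information(column, at_rich))  # accounts for the biased background
```

In this toy case the uniform-background score is the larger of the two, because part of the apparent signal is simply the genome's own A richness.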

9.
The bacterium Deinococcus radiodurans is one of the most resistant organisms to ionizing radiation and other DNA-damaging agents. Although, at present, 30 Deinococcus species have been identified, the whole-genome sequences of most species remain unknown, with the exception of D. radiodurans (DRD), D. geothermalis, and D. deserti. In this study, comparative genomic hybridization (CGH) microarray analysis of three Deinococcus species, D. radiopugnans (DRP), D. proteolyticus (DPL), and D. radiophilus (DRPH), was performed using oligonucleotide arrays based on DRD. Approximately 28%, 14%, and 15% of 3,128 open reading frames (ORFs) of DRD were absent in the genomes of DRP, DPL, and DRPH, respectively. In addition, 162 DRD ORFs were absent in all three species. The absence of 17 randomly selected ORFs was confirmed by a Southern blot. Functional classification showed that the absent genes spanned a variety of functional categories: some genes involved in amino acid biosynthesis, cell envelope, cellular processes, central intermediary metabolism, and DNA metabolism were not present in any of the three deinococcal species tested. Finally, comparative genomic data showed that 120 genes were Deinococcus-specific, not the 230 reported previously. Specifically, ddrD, ddrO, and ddrH genes, previously identified as Deinococcus-specific, were not present in DRP, DPL, or DRPH, suggesting that only a portion of ddr genes are shared by all members of the genus Deinococcus.

10.
We present a new computational method for solving a classical problem, the identification problem of cis-regulatory motifs in a given set of promoter sequences, based on one key new idea. Instead of scoring candidate motifs individually as in all the existing motif-finding programs, our method scores groups of candidate motifs with similar sequences, called motif closures, using a P-value, which has substantially improved the prediction reliability over the existing methods. Our new P-value scoring scheme is sequence length independent, hence allowing direct comparisons among predicted motifs with different lengths on the same footing. We have implemented this method as a Motif Recognition Computer (MREC) program, and have extensively tested MREC on both simulated and biological data from prokaryotic genomes. Our test results indicate that MREC can accurately pick out the actual motif with the correct length as the best scoring candidate for the vast majority of the cases in our test set. We compared our prediction results with two motif-finding programs, Cosmo and MEME, and found that MREC outperforms both programs across all the test cases by a large margin. The MREC program is available at http://csbl.bmb.uga.edu/~bingqiang/MREC1/.

11.
Currently 18 hereditary neurological diseases are known to be associated with such mutations as multiple insertions of a single amino acid into the protein sequence. Therefore, investigation of the functional purpose of simple amino acid motifs becomes an important biological task. In this work, we studied the frequencies of motifs consisting of six identical amino acids and of simple six-amino-acid motifs consisting of two randomly located amino acids. The investigation was conducted on three eukaryotic proteomes of the well-studied model organisms, Homo sapiens, Drosophila melanogaster, and Caenorhabditis elegans. We showed that many simple motifs occurred very frequently; the data on the frequency were presented at These results suggest such motifs to be responsible for common functions of non-homologous and unrelated proteins in different organisms.

12.
Wang Y, Leung FC. FEBS Letters, 2006, 580(5): 1277-1284
Inverted repeats are unstable motifs in a genome, having a causal relation to fragment rearrangements and recombination events. We have investigated long inverted repeats (LIR) of > 30 bp in length in eukaryotic genomes to assess their contribution to genome stability. An algorithm was first designed for searching for LIRs with < 2 kb internal spacers and > 85% identity (degree of homology between repeat copies of a LIR). There are far fewer LIRs in yeast, fruitfly, pufferfish and chicken than in Caenorhabditis elegans, zebrafish, frog and human. However, the high LIR frequencies do not necessarily imply high genome instability because of variant internal spacers and stem lengths and identities. From the collection of identified LIRs, we selected recombinogenic LIRs that had a short internal spacer and a high copy identity and were prone to induce high instability. We found that a relatively high proportion (5-9.8%) of the LIRs in C. elegans, zebrafish and frog were recombinogenic LIRs. In contrast, the proportions in human and mouse LIRs were quite low (0.4-1.1%), which is largely accounted for by long internal spacers. We suggest that C. elegans, zebrafish and frog genomes are unstable in terms of the LIR frequency and the proportion of recombinogenic LIRs. For the other genomes, LIRs most likely have a minor impact.
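
The search criteria quoted above (arms longer than 30 bp, spacer shorter than 2 kb, identity above 85%) can be sketched as a brute-force scan. This is an illustrative assumption, not the authors' algorithm: it uses a fixed arm length, ungapped identity, and quadratic time, and the names `find_inverted_repeats`, `arm_len`, `max_spacer` and `min_identity` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_inverted_repeats(seq, arm_len=31, max_spacer=2000, min_identity=0.85):
    """List (left_start, right_start, spacer_len, identity) for arm pairs.

    A pair is reported when the right arm matches the reverse complement
    of the left arm with identity above min_identity and the spacer
    between the arms is shorter than max_spacer.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * arm_len + 1):
        target = reverse_complement(seq[i:i + arm_len])
        first = i + arm_len                       # spacer length 0
        last = min(first + max_spacer - 1, n - arm_len)
        for j in range(first, last + 1):
            ident = identity(target, seq[j:j + arm_len])
            if ident > min_identity:
                hits.append((i, j, j - first, ident))
    return hits


# Toy example with 6 bp arms and a 4 bp spacer (thresholds scaled down).
toy = "AAACCC" + "TTTT" + "GGGTTT"
print(find_inverted_repeats(toy, arm_len=6, max_spacer=10, min_identity=0.99))
```

A genome-scale search would need seeding or indexing rather than this exhaustive comparison; the sketch only makes the three thresholds explicit.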

13.
14.
Comparative analysis of Acinetobacters: three genomes for three lifestyles   Total citations: 1, self-citations: 0, citations by others: 1
Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.

15.
Bacterial biodiversity at the species level, in terms of gene acquisition or loss, is so immense that it raises the question of how essential chromosomal regions are spared from uncontrolled rearrangements. Protection of the genome likely depends on specific DNA motifs that impose limits on the regions that undergo recombination. Although most such motifs remain unidentified, they are theoretically predictable based on their genomic distribution properties. We examined the distribution of the “crossover hotspot instigator,” or Chi, in Escherichia coli, and found that its exceptional distribution is restricted to the core genome common to three strains. We then formulated a set of criteria that were incorporated in a statistical model to search core genomes for motifs potentially involved in genome stability in other species. Our strategy led us to identify and biologically validate two distinct heptamers that possess Chi properties, one in Staphylococcus aureus, and the other in several streptococci. This strategy paves the way for wide-scale discovery of other important functional noncoding motifs that distinguish core genomes from the strain-variable regions.

16.
We developed a highly accurate method to predict polyketide (PK) and nonribosomal peptide (NRP) structures encoded in microbial genomes. PKs/NRPs are polymers of carbonyl/peptidyl chains synthesized by polyketide synthases (PKS) and nonribosomal peptide synthetases (NRPS). We analyzed domain sequences corresponding to specific substrates and physical interactions between PKSs/NRPSs in order to predict which substrates (carbonyl/peptidyl units) are selected and assembled into highly ordered chemical structures. The predicted PKs/NRPs were represented as the sequences of carbonyl/peptidyl units to extract the structural motifs efficiently. We applied our method to 4529 PKSs/NRPSs and found 619 PKs/NRPs. We also collected 1449 PKs/NRPs whose chemical structures have been determined experimentally. The structural sequences were compared using the Smith-Waterman algorithm, and clustered into 271 clusters. From the compound clusters, we extracted 33 structural motifs that are significantly related to their bioactivities. We used the structural motifs to infer functions of 13 novel PK/NRP clusters produced by Pseudomonas spp. and Burkholderia spp. and found a putative virulence factor. The integrative analysis of genomic and chemical information given here will provide a strategy to predict the chemical structures, the biosynthetic pathways, and the biological activities of PKs/NRPs, which is useful for the rational design of novel PKs/NRPs.
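
Comparing compounds represented as sequences of building-block units with the Smith-Waterman algorithm can be sketched as follows. The scoring parameters (`match`, `mismatch`, `gap`) and the unit names in the example are hypothetical placeholders, since the abstract gives neither the scoring scheme nor the unit alphabet; a linear gap penalty is assumed.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences of tokens.

    a and b may be strings or lists of unit names. Scores below zero
    are reset to zero, which is what makes the alignment local.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            score[i][j] = max(
                0,                        # start a new local alignment
                diagonal,                 # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )
            best = max(best, score[i][j])
    return best


# Hypothetical unit sequences for two compounds sharing a three-unit core.
compound_x = ["mal", "mmal", "mal", "Gly"]
compound_y = ["ac", "mal", "mmal", "mal"]
print(smith_waterman(compound_x, compound_y))
```

Pairwise scores of this kind could then feed a clustering step; how the scores were normalized and clustered in the study is not stated in the abstract.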

17.
Liriomyza trifolii (Burgess), Liriomyza huidobrensis (Blanchard), and Liriomyza bryoniae (Kaltenbach) are three closely related and economically important leafminer pests in the world. This study examined the complete mitochondrial genomes of L. trifolii, L. huidobrensis and L. bryoniae, which were 16141 bp, 16236 bp and 16183 bp in length, respectively. All of them displayed 37 typical animal mitochondrial genes and an A + T-rich region. The genomes were highly compact with only 60–68 bp of non-coding intergenic spacer. However, considerable differences in the A + T-rich region were detected among the three species. Results of this study also showed the two ribosomal RNA genes of the three species had very limited variable sites and thus should not provide much information in the study of population genetics of these species. Data generated from three leafminers' complete mitochondrial genomes should provide valuable information in studying phylogeny of Diptera, and developing genetic markers for species identification in leafminers.

18.
Shah K, Krishnamachari A. Bio Systems, 2012, 107(3): 142-144
Genomes of almost all organisms have been found to exhibit several periodicities, the most prominent of which is the three base periodicity. It is more pronounced in the gene coding regions and has been exploited to identify the segments of a genome that code for a protein. The reason for this three base periodicity in the gene-coding region has been attributed to inhomogeneous nucleotide compositions in the three codon positions. However, this reason cannot explain the three base periodicity present at the level of the whole genome where the codon concept is not applicable. Even though the distribution of each nucleotide is uniform at the positions 0(mod 3), 1(mod 3) and 2(mod 3) when the whole genome data is considered, our analysis reveals that the three base periodicity arises because of higher correlations among the nucleotides separated by three bases.
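
Correlation between nucleotides separated by a fixed number of bases can be measured with a simple match-rate statistic. The sketch below is one plausible way to do so and is not taken from the paper: it assumes that "correlation at separation k" means the excess of identical-base pairs at distance k over what independent positions would give, and the function names are hypothetical.

```python
from collections import Counter


def match_rate(seq: str, k: int) -> float:
    """Fraction of positions i for which seq[i] == seq[i + k]."""
    pairs = len(seq) - k
    return sum(seq[i] == seq[i + k] for i in range(pairs)) / pairs


def expected_match_rate(seq: str) -> float:
    """Match rate expected if positions were independent:
    the sum of squared base frequencies."""
    n = len(seq)
    return sum((count / n) ** 2 for count in Counter(seq).values())


def periodicity_profile(seq: str, max_k: int = 9) -> dict:
    """Excess match rate over the independent expectation for k = 1..max_k."""
    baseline = expected_match_rate(seq)
    return {k: match_rate(seq, k) - baseline for k in range(1, max_k + 1)}


# Toy sequence with an exact period of three, for illustration only.
profile = periodicity_profile("ACG" * 20)
for k, excess in profile.items():
    print(k, round(excess, 3))
```

In the toy input the excess peaks at separations 3, 6 and 9. Unlike the whole-genome case discussed in the abstract, this toy sequence also has a different base at each position modulo 3, so it shows how the statistic behaves but not the paper's finding itself.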

19.
20.