Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

2.
Considering that recombinations produce quasispecies in lentivirus spreading, we identified and localized highly conserved sequences that may play an important role in viral ontology. Comparison of entire genomes, including 237 human, simian and non-primate mammal lentiviruses and 103 negative control viruses, identified 28 Conserved Lentiviral Sequences (CLSs). They were located mainly in the structural genes, forming hot spots particularly in the gag and pol genes and, to a lesser extent, in LTRs and regulatory genes. The CLS pattern was the same throughout the different HIV-1 subtypes, except for some HIV-1-O strains. Only CLS 3 and 4 were detected in both negative control HTLV-1 oncornaviruses and D-particle-forming simian viruses, which are not immunodeficiency inducers and display genetic stability. CLSs divided the virus genomes into domains, allowing us to distinguish sequence families and leading to the notion of 'species self' besides that of 'lentiviral self'. Most of the acutely localized CLSs in HIV-1s (82%) corresponded to wide recombination segments currently being reported.

3.
4.
5.
6.
Microarray analysis of brassinosteroid-regulated genes in Arabidopsis   Cited by: 14 (self-citations: 0, citations by others: 14)

7.
Identifying potential tRNA genes in genomic DNA sequences.   Cited by: 16 (self-citations: 0, citations by others: 16)
We have developed an algorithm that automatically and reproducibly identifies potential tRNA genes in genomic DNA sequences, and we present a general strategy for testing the sensitivity of such algorithms. This algorithm is useful for the flagging and characterization of long genomic sequences that have not been experimentally analyzed for identification of functional regions, and for the scanning of nucleotide sequence databases for errors in the sequences and the functional assignments associated with them. In an exhaustive scan of the GenBank database, 97.5% of the 744 known tRNA genes were correctly identified (true-positives), and 42 previously unidentified sequences were predicted to be tRNAs. A detailed analysis of these latter predictions reveals that 16 of the 42 are very similar to known tRNA genes, and we predict that they do, in fact, code for tRNA, yielding a false-positive rate for the algorithm of 0.003%. The new algorithm and testing strategy are a considerable improvement over any previously described strategies for recognizing tRNA genes, and they allow detection of genes (including introns) embedded in long genomic sequences.

8.
9.
The DNA sequences of the Thermomonospora fusca genes encoding cellulases E2 and E5 and the N-terminal end of E4 were determined. Each sequence contains an identical 14-bp inverted repeat upstream of the initiation codon. There were no significant homologies between the coding regions of the three genes. The E2 gene is 73% identical to the celA gene from Microbispora bispora, but this was the only homology found with other cellulase genes. E2 belongs to a family of cellulases that includes celA from M. bispora, cenA from Cellulomonas fimi, casA from an alkalophilic Streptomyces strain, and cellobiohydrolase II from Trichoderma reesei. E4 shows 44% identity to an avocado cellulase, while E5 belongs to the Bacillus cellulase family. There were strong similarities between the amino acid sequences of the E2 and E5 cellulose binding domains, and these regions also showed homology with C. fimi and Pseudomonas fluorescens cellulose binding domains.

10.
Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
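
Two of the alignment-free features named in this abstract, word frequency and dinucleotide relative abundance, can be illustrated with a minimal sketch. The function names are hypothetical, and the single-strand formula rho_xy = f_xy / (f_x * f_y) is the commonly used definition; the abstract does not state which variant the authors computed, so this is an assumption.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def kmer_frequencies(seq, k):
    """Relative frequencies of overlapping k-mers ("words") in seq.

    Words containing characters other than A/C/G/T are skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= VALID_BASES
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def dinucleotide_relative_abundance(seq):
    """rho_xy = f_xy / (f_x * f_y) for each dinucleotide observed in seq.

    Assumption: single-strand version, without strand symmetrization.
    """
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return {xy: f / (mono[xy[0]] * mono[xy[1]]) for xy, f in di.items()}


if __name__ == "__main__":
    # Toy sequence only; real input would be a genomic region.
    print(dinucleotide_relative_abundance("AACC"))
```

Values of rho near 1 indicate that a dinucleotide occurs about as often as its base composition predicts; feature vectors built this way could then be passed to a principal component or discriminant analysis, which is not shown here.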

11.
Aims:  To detect antimicrobial resistance genes in Salmonella isolates from turkey flocks using microarray technology.
Methods and Results:  A 775 gene probe oligonucleotide microarray was used to detect antimicrobial resistance genes in 34 isolates. All tetracycline-resistant Salmonella harboured tet(A), tet(C) or tet(R), with the exception of one Salmonella serotype Heidelberg isolate. The sul1 gene was detected in 11 of 16 sulfisoxazole-resistant isolates. The aadA, aadA1, aadA2, strA or strB genes were found in aminoglycoside-resistant isolates of Salm. Heidelberg, Salmonella serotype Senftenberg and untypeable Salmonella. The prevalence of mobile genetic elements, such as class I integron and transposon genes, in drug-resistant Salmonella isolates suggested that these elements may contribute to the dissemination of antimicrobial resistance genes in the preharvest poultry environment. Hierarchical clustering analysis demonstrated a close relationship between drug-resistant phenotypes and the corresponding antimicrobial resistance gene profiles.
Conclusions:  Salmonella serotypes isolated from the poultry environment carry multiple genes that can render them resistant to several antimicrobials used in poultry and humans.
Significance and Impact of the Study:  Multiple antimicrobial resistance genes in environmental Salmonella isolates could be identified efficiently by microarray analysis. Hierarchical clustering analysis of the data was also found to be a useful tool for analysing emerging patterns of drug resistance.

12.
Many peptide antibiotics in prokaryotes and lower eukaryotes are produced non-ribosomally by multi-enzyme complexes. Analysis of gene-derived amino acid sequences of some peptide synthetases of bacterial and fungal origins revealed a high degree of conservation (35-50% identity). The genes encoding those peptide synthetases are clustered into large operons with repetitive domains (about 600 amino acids), in the case of synthetases activating more than one amino acid. We used two 35-mer oligonucleotides derived from two highly conserved regions of known peptide synthetases to identify the surfactin synthetase operon in Bacillus subtilis ATCC 21332, a strain not accessible to genetic manipulation. We show that the derived oligonucleotides can be used not only for the identification of unknown peptide synthetase genes by hybridization experiments but also in sequencing reactions as primers to identify internal domain sequences. Using this method, a 25.8-kb chromosomal DNA fragment bearing a part of the surfactin biosynthesis operon was cloned and partial sequences of two internal domains were obtained.

13.
Plants respond to day/night cycling in a number of physiological ways. At the mRNA level, the expression of some genes changes during the 24-hr period. To identify novel genes regulated in this way, we used microarrays containing 11,521 Arabidopsis expressed sequence tags, representing an estimated 7800 unique genes, to determine gene expression levels at 6-hr intervals throughout the day. Eleven percent of the genes, encompassing genes expressed at both high and low levels, showed a diurnal expression pattern. Approximately 2% cycled with a circadian rhythm. By clustering microarray data from 47 additional nonrelated experiments, we identified groups of genes regulated only by the circadian clock. These groups contained the already characterized clock-associated genes LHY, CCA1, and GI, suggesting that other key circadian clock genes might be found within these clusters.

14.
15.
A group of small peptides with a typical cysteine-rich domain (termed trefoil motif or P-domain) is abundantly expressed at mucosal surfaces of specific normal and neoplastic tissues. Their association with the maintenance of surface integrity has been suggested. The first known human trefoil peptide (pS2) was isolated from breast cancer cells (MCF7). Its oestrogen-inducible gene and the human homologue of the porcine spasmolytic peptide gene (hSP/SML1) appear to be synchronously expressed in healthy stomach mucosa and several carcinomas of the gastrointestinal tract. Both genes were shown to be localised at 21q22.3. A new trefoil peptide from human intestinal mucosa (hITF/hP1.B) and its gene were described recently. By using suitable oligonucleotide primers and PCR, and by isolating large (110–250 kb) genomic recombinants cloned in the bacterial artificial chromosome (BAC) system, we present a genomic region from chromosome band 21q22.3 cloned in contiguous sequences and encoding all three members of the human P-domain/trefoil peptides, proving a physical linkage of all three trefoil peptide genes. Such genomic sequences will provide useful experimental material for analysis of gene regulation, for gene modification experiments and for establishing transgenic cells or animals. Received: 2 January 1996 / Revised: 4 March 1996

16.
Three human small nucleolar RNAs (snoRNAs), E1, E2 and E3, were reported earlier that have unique sequences, interact directly with unique segments of pre-rRNA in vivo and are encoded in introns of protein genes. In the present report, human and frog E1, E2 and E3 RNAs injected into the cytoplasm of frog oocytes migrated to the nucleus and specifically to the nucleolus. This indicates that the nucleolar and nuclear localization signals of these snoRNAs reside within their evolutionarily conserved segments. Homologs of these snoRNAs from several vertebrates were sequenced and this information was used to develop RNA secondary structure models. These snoRNAs have unique phylogenetically conserved sequences.

17.
It has been shown that proteins encoded by linked genes have similar rates of evolution and that clusters of essential genes are found in regions with low recombination rates. We show here that proteins encoded by linked genes in two closely related bacterial species, namely Escherichia coli K12 and Salmonella typhimurium LT2, evolve more slowly than proteins encoded by genes that are not linked, as assessed by protein sequence similarity. The proteins encoded by the identified linked genes share an average sequence identity of 82.5%, compared with 46.5% for proteins encoded by genes that are not linked.
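
The average sequence identities quoted above rest on a per-pair identity calculation. A minimal sketch of one way to compute it follows, assuming the two protein sequences are already aligned and that columns containing a gap are excluded from the denominator; the abstract does not specify the authors' exact convention, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity(aligned_pairs):
    """Average percent identity over an iterable of (seq_a, seq_b) pairs."""
    values = [percent_identity(a, b) for a, b in aligned_pairs]
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Toy alignment, not real E. coli or Salmonella data.
    print(percent_identity("MKT-LV", "MKSALV"))
```

Other conventions, such as dividing by the alignment length or by the shorter sequence length, give different percentages, so the choice of denominator should be stated whenever such averages are compared.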

18.
ID sequences in the genes of three brain-specific proteins   Cited by: 2 (self-citations: 0, citations by others: 2)
We characterized the brain-specific gene coding for rat S-100 protein beta-subunit and found three "brain identifier (ID)" elements, which have been proposed to regulate gene expression in rat brain. The nucleotide sequences of these elements corresponded well with that of the consensus ID element and were clearly different from those of "ID-like" elements in rat beta B1-crystallin gene, etc. ID elements were also observed in the flanking regions of rat neuron-specific enolase and cholecystokinin genes, which were expressed in the neuronal cells. Direct repeats were observed in the regions flanking ID elements.

19.
Jung CH, Wong CE, Singh MB, Bhalla PL. PLoS ONE 2012, 7(6): e38250
Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant, Arabidopsis.

20.
In humans, acute myelomonocytic leukemia (AMML) with abnormal bone marrow eosinophilia is diagnosed by the presence of a pericentric inversion in chromosome 16, involving breakpoints p13;q23 [i.e., inv(16)(p13;q23)]. In a pericentric inversion, breaks occur on the p and q arms, and the segment in between is rotated 180° and reattaches. The recently developed “human micro-coatasome” painting probe for 16p contains unique DNA sequences that fluorescently label only the short arm of chromosome 16. This facilitates the identification of such inversions and makes the probe an ideal tool for analyzing the “divergence/convergence” of the equivalents of human chromosome 16 (PTR 18, GGO 17 and PPY 19) in the great apes: chimpanzee, gorilla and orangutan. When the probe is used on the type of pericentric inversion characteristic of AMML, signals are observed on the proximal portions (the regions closest to the centromere) of the long and short arms of chromosome 16. The probe hybridized to only the short arm of all three ape chromosomes and signals were not observed on the long arms, suggesting that a pericentric inversion similar to that seen in AMML has not occurred in any of these great apes. Received: 4 July 1996 / Accepted: 18 September 1996
