Similar literature
20 similar articles found (search time: 31 ms)
1.
Nearly 7000 Arabidopsis thaliana expressed sequence tags (ESTs) from 10 cDNA libraries have been sequenced, of which almost 5000 non-redundant tags have been submitted to the EMBL data bank. The quality of the cDNA libraries used is analysed. Similarity searches in international protein data banks have allowed the detection of significant similarities to a wide range of proteins from many organisms. Alignment with ESTs from the rice systematic sequencing project has allowed the detection of amino acid motifs which are conserved between the two organisms, thus identifying tags to genes encoding highly conserved proteins. These genes are candidates for a common framework in genome mapping projects in different plants.

2.
Exploring the plant transcriptome through phylogenetic profiling (cited 5 times: 0 self-citations, 5 by others)
Publicly available protein sequences represent only a small fraction of the full catalog of genes encoded by the genomes of different plants, such as green algae, mosses, gymnosperms, and angiosperms. By contrast, an enormous number of expressed sequence tags (ESTs) exists for a wide variety of plant species, representing a substantial part of all transcribed plant genes. Integrating protein and EST sequences in comparative and evolutionary analyses is not straightforward because of the heterogeneous nature of both types of sequence data. By combining information from publicly available EST and protein sequences for 32 different plant species, we identified more than 250,000 plant proteins organized in more than 12,000 gene families. Approximately 60% of the proteins are absent from current sequence databases but provide important new information about plant gene families. Analysis of the distribution of gene families over different plant species through phylogenetic profiling reveals interesting insights into plant gene evolution, and identifies species- and lineage-specific gene families, orphan genes, and conserved core genes across the green plant lineage. We counted a similar number of approximately 9,500 gene families in monocotyledonous and eudicotyledonous plants and found strong evidence for the existence of at least 33,700 genes in rice (Oryza sativa). Interestingly, the larger number of genes in rice compared to Arabidopsis (Arabidopsis thaliana) can partially be explained by a larger number of species-specific single-copy genes and species-specific gene families. In addition, a majority of large gene families, typically containing more than 50 genes, are bigger in rice than in Arabidopsis, whereas the opposite seems true for small gene families.
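The phylogenetic profiling idea in this abstract can be illustrated with a minimal sketch: each gene family is reduced to a presence/absence pattern across species, and the pattern is used to label the family as core, lineage-restricted, or species-specific. The family names, the four-species panel, and the three labels below are hypothetical toy choices, not the authors' data or pipeline.

```python
from typing import Dict, List, Set

# Hypothetical species panel and toy gene families (not data from the study).
SPECIES: List[str] = ["rice", "arabidopsis", "moss", "green_alga"]
FAMILIES: Dict[str, Set[str]] = {
    "fam_core": {"rice", "arabidopsis", "moss", "green_alga"},
    "fam_angiosperm": {"rice", "arabidopsis"},
    "fam_rice_only": {"rice"},
}


def profile(members: Set[str], species: List[str] = SPECIES) -> str:
    """Return the presence/absence profile as a string of 1s and 0s."""
    return "".join("1" if s in members else "0" for s in species)


def classify(members: Set[str], species: List[str] = SPECIES) -> str:
    """Label a family by how widely it is distributed across the panel."""
    present = [s for s in species if s in members]
    if len(present) == len(species):
        return "core"
    if len(present) == 1:
        return "species-specific"
    return "lineage-restricted"


if __name__ == "__main__":
    for name, members in FAMILIES.items():
        print(name, profile(members), classify(members))
```

In practice the presence calls would come from sequence-similarity clustering of protein and EST data, which this sketch takes as given.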

3.
Barakat A, Müller KF, Sáenz-de-Miera LE. Gene, 2007, 403(1-2): 143-150
Cytoplasmic ribosomal protein (r-protein) genes in Arabidopsis thaliana are encoded by 80 multigene families that contain between two and seven members. Gene family members are typically similar at the protein sequence level, with the most divergent members of any gene family retaining 94% identity, on average. However, three Arabidopsis r-protein families (S15a, L7 and P2) contain highly divergent family members. Here, we investigated the organization, structure, expression and molecular evolution of the L7 r-protein family. Phylogenetic analyses showed that L7 r-protein gene family members constitute two distinct phylogenetic groups. The first group, comprising RPL7B, RPL7C and RPL7D, has homologs in plants, animals and fungi. The second group, represented by RPL7A, is found in plants but has no orthologs in other fully sequenced eukaryotic genomes. These two groups may have derived from a duplication event prior to the divergence of animals and plants. All four L7 r-protein genes are expressed, and all are differentially expressed in inflorescences and flowers. RPL7A and RPL7B are less expressed than the other genes in all tissues analyzed. Molecular characterization of the nucleic acid and protein sequences of L7 r-protein genes and analysis of their codon usage did not indicate any functional divergence. The probable evolution of an extra-ribosomal function of group 2 genes is discussed.

4.
The Arabidopsis thaliana genome sequencing project has revealed that multigene families, such as those generated by genome duplications, are more abundant among plant genomes than among animal genomes. To gain insight into the evolutionary implications of the multigene families in higher plants, we examined the XTH gene family, a group of genes encoding xyloglucan endotransglucosylase/hydrolase, which are responsible for cell-wall construction in plants. Expression analysis of all members (33 genes) of this family, using quantitative real-time RT-PCR, revealed that most members exhibit distinct expression profiles in terms of tissue specificity and responses to hormonal signals, with some members exhibiting similar expression patterns. By comparing the flanking sequences of individual genes, we identified four sets of large-segment duplications and two sets of solitary gene duplications. In each set of gene duplicates, long nucleotide sequences, ranging from one to two hundred base pairs, are conserved. Furthermore, gene duplicates exhibit similar organ-specific expression profiles. These facts allowed us to predict putative cis-regulatory regions, particularly those responsible for cell-wall construction, and hence for morphogenesis, that are specific for certain organs or tissues in plants.

8.
The generation of large numbers of partial cDNA sequences, or expressed sequence tags (ESTs), has provided a method with which to sample a large number of genes from an organism. More than 25,000 Arabidopsis thaliana ESTs have been deposited in public databases, producing the largest collection of ESTs for any plant species. We describe here the application of a method of reducing redundancy and increasing information content in this collection by grouping overlapping ESTs representing the same gene into a "contig" or assembly. The increased information content of these assemblies allows more putative identifications to be assigned based on the results of similarity searches with nucleotide and protein databases. The results of this analysis indicate that sequence information is available for approximately 12,600 nonoverlapping ESTs from Arabidopsis. Comparison of the assemblies with 953 Arabidopsis coding sequences indicates that up to 57% of all Arabidopsis genes are represented by an EST. Clustering analysis of these sequences suggests that between 300 and 700 gene families are represented by between 700 and 2000 sequences in the EST database. A database of the assembled sequences, their putative identifications, and cellular roles is available through the World Wide Web.
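Grouping overlapping ESTs into assemblies, as described in this abstract, can be sketched in a simplified form: two reads are linked when the end of one exactly matches the start of the other, and linked reads are merged into one group with a union-find structure. This is a toy illustration under stated assumptions. The sequences, the `min_len` threshold, and the exact-match rule are hypothetical, and real assemblers also handle sequencing errors and reverse complements.

```python
from itertools import permutations
from typing import Dict, List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def cluster(ests: Dict[str, str], min_len: int = 5) -> List[List[str]]:
    """Group EST names whose sequences are connected by exact suffix-prefix overlaps."""
    parent = {name: name for name in ests}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in permutations(ests, 2):
        if overlap(ests[a], ests[b], min_len):
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in ests:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy reads: est1 and est2 share the 7-base overlap "ACGTTAG".
TOY_ESTS = {
    "est1": "ATGGCGTACGTTAG",
    "est2": "ACGTTAGCCGATA",
    "est3": "TTTTGGGGCCCCAAAA",
}

if __name__ == "__main__":
    print(cluster(TOY_ESTS))
```

Counting the resulting groups, rather than the raw reads, is what gives a less redundant estimate of how many genes the collection samples.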

11.
We have isolated cDNA clones specific for Arabidopsis thaliana cytosolic ribosomal protein S11 and plastid ribosomal protein CS17, both of which are encoded in the nuclear genome, through the use of the corresponding soybean and pea cDNAs as probes, respectively. The nucleotide sequences of all four cDNAs were determined. The amino acid sequences derived from these cDNA sequences show that the soybean and A. thaliana S11 cDNAs encode proteins that are homologous to rat ribosomal protein S11 and that the pea and A. thaliana CS17 cDNAs encode proteins that are homologous to Escherichia coli ribosomal protein S17. The plant S11 cytosolic ribosomal proteins also show significant sequence similarity to both E. coli ribosomal protein S17 and plastid CS17, indicating that these are all related proteins. Comparison of A. thaliana CS17 with A. thaliana S11 and with E. coli S17 suggests that CS17 is more closely related to S17 than it is to S11. These results support the idea that the gene encoding CS17 was derived from a prokaryotic endosymbiont and not from a duplication of the eukaryotic S11 gene.

12.
Heartwater is an economically important disease of ruminants caused by the tick-transmitted rickettsia Cowdria ruminantium. The disease is present in Africa and the Caribbean and there is a risk of spread to the Americas, particularly because of a clinically asymptomatic carrier state in infected livestock and imported wild animals. The causative agent is closely related taxonomically to the human and animal pathogens Ehrlichia chaffeensis and Ehrlichia canis. A dominant immune response of infected animals or people is directed against variable outer membrane proteins of these agents known, in E. chaffeensis and E. canis, to be encoded by polymorphic multigene families. We demonstrate, by sequence analysis, that map1 encoding the major outer membrane protein of C. ruminantium is also encoded by a polymorphic multigene family. Two members of the gene family are located in tandem in the genome. The upstream member, orf2, is conserved, encoding only 2 amino acid substitutions among six different rickettsial strains from diverse locations in Africa and the Caribbean. In contrast, the downstream member, map1, contains variable and conserved regions between strains. Interestingly, orf2 is more closely related in sequence to omp1b of E. chaffeensis than to map1 of C. ruminantium. The regions that differ among orf2, map1, and omp1b correspond to previously identified variable sequences in outer membrane protein genes of E. chaffeensis and E. canis. These data suggest that diversity in these outer membrane proteins may arise by recombination among gene family members and offer a potential mechanism for persistence of infection in carrier animals.

13.
P Hilson, K L Carroll, P H Masson. Plant Physiology, 1993, 103(2): 525-533
The poly(A) tail of eukaryotic mRNAs associates with poly(A)-binding (PAB) proteins whose role in mRNA translation and stability is being intensively investigated. Very little is known about the structure and function of the PAB genes in plants. We have cloned multiple PAB-related sequences from Arabidopsis thaliana. Results suggest that PAB proteins are encoded by a multigene family. One member of this family (PAB2) is expressed in root and shoot tissues. The complete nucleotide sequence of PAB2 was determined. Study of the predicted PAB2 protein reveals a similarity in structure among vertebrate, insect, yeast, and plant PAB proteins. All contain two highly conserved domains: an amino-terminal sequence formed by four RNA recognition motifs and an uncharacterized carboxyl-terminal region of 69 to 71 amino acids. Possible roles for the carboxyl-terminal conserved domain are discussed in view of recently published data concerning the structure and function of PAB proteins.

15.
The posttranslational modifier ubiquitin is encoded by a multigene family containing three primary members, which yield the precursor protein polyubiquitin and two ubiquitin moieties, Ub(L40) and Ub(S27), that are fused to the ribosomal proteins L40 and S27, respectively. The gene encoding polyubiquitin is highly conserved and, until now, those encoding Ub(L40) and Ub(S27) have been generally considered to be equally invariant. The evolution of the ribosomal ubiquitin moieties is, however, proving to be more dynamic. It seems that the genes encoding Ub(L40) and Ub(S27) are actively maintained by homologous recombination with the invariant polyubiquitin locus. Failure to recombine leads to deterioration of the sequence of the ribosomal ubiquitin moieties in several phyla, although this deterioration is evidently constrained by the structural requirements of the ubiquitin fold. Only a few amino acids in ubiquitin are vital for its function, and we propose that conservation of all three ubiquitin genes is driven not only by functional properties of the ubiquitin protein, but also by the propensity of the polyubiquitin locus to act as a 'selfish gene'.

17.
The protective barrier provided by stratified squamous epithelia relies on the cornified cell envelope (CE), a structure synthesized at late stages of keratinocyte differentiation. It is composed of structural proteins, including involucrin, loricrin, and the small proline-rich (SPRR) proteins, all encoded by genes localized at human chromosome 1q21. The genetic characterization of the SPRR locus reveals that the various members of this multigene family can be classified into two distinct groups with separate evolutionary histories. Whereas group 1 genes have diverged in protein structure and are composed of three different classes (SPRR1 (2x), SPRR3, and SPRR4), an active process of gene conversion has counteracted diversification of the protein sequences of group 2 genes (SPRR2 class, seven genes). Contrasting with this homogenization process, all individual members of the SPRR gene family show specific in vivo and in vitro expression patterns and react selectively to UV irradiation. Apparently, creation of regulatory rather than structural diversity has been the driving force behind the evolution of the SPRR gene family. Differential regulation of highly homologous genes underlines the importance of SPRR protein dosage in providing optimal barrier function to different epithelia, while allowing adaptation to diverse external insults.

19.
ABSTRACT. Parasitic dinoflagellates of the genus Amoebophrya play important roles in the ecology of estuaries and open ocean environments. Little is known of the cell and molecular biology of Amoebophrya, but the genus is intermediate on phylogenetic trees between apicomplexans and typical dinophycean dinoflagellates. Here, we constructed four cDNA libraries from different stages after infecting the host, Karlodinium veneficum, with Amoebophrya sp. These libraries were used to generate 898 expressed sequence tags (ESTs), with sequences attributed to either the host or parasite based on AT bias, codon usage, and occurrence during infection. Overall, 209 sequences were attributable to the parasite and 685 to the host. The 50 putative parasite sequences with good protein matches in GenBank were used to find the same protein from host ESTs. For 26 genes, both host and parasite sequences were identified, of which 20 encoded ribosomal proteins. PCR assays for seven predicted parasite genes and two host genes were used to confirm attributions. The most common host and parasite ESTs were compared to see if multiple gene copies were present. The host plastocyanin gene had multiple sequence variants, but parasite rps27a contained only one polymorphism, likely due to an amplification error. Amplification, cloning, and sequencing of five parasite protein-coding genes suggested that the parasite has a single sequence for each gene, but three host genes were found to have multiple variants. The genome of Amoebophrya sp. infecting K. veneficum appears to have an organization more similar to other eukaryotes than to the tandem gene arrangements found in dinoflagellates.
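One of the attribution signals named in this abstract, AT bias, can be computed as in the minimal sketch below. It measures the AT fraction of a whole sequence and of third codon positions only, where composition bias tends to be most visible. The function names and the example sequence are hypothetical. The abstract does not say which organism is more AT-rich or what cutoffs were used, so no classification threshold is assumed here.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T among unambiguous bases (A, C, G, T); 0.0 if none are present."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "AT" for base in counted) / len(counted)


def at3_fraction(cds: str) -> float:
    """AT fraction at third codon positions of an in-frame coding sequence."""
    third_positions = cds.upper()[2::3]
    return at_fraction(third_positions)


if __name__ == "__main__":
    example_cds = "ATGGCATTA"  # hypothetical in-frame fragment: ATG GCA TTA
    print(at_fraction(example_cds), at3_fraction(example_cds))
```

A composition score like this would be only one line of evidence, to be combined with codon usage and the timing of expression during infection, as the abstract describes.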
