Similar literature
 Found 20 similar articles; search took 31 ms
1.
2.
Development and annotation of perennial Triticeae ESTs and SSR markers   Total citations: 2, self-citations: 0, citations by others: 2
Triticeae contains hundreds of species of both annual and perennial types. Although substantial genomic tools are available for annual Triticeae cereals such as wheat and barley, the perennial Triticeae lack sufficient genomic resources for genetic mapping or diversity research. To increase the amount of sequence information available in the perennial Triticeae, three expressed sequence tag (EST) libraries were developed and annotated for Pseudoroegneria spicata, a mixture of both Elymus wawawaiensis and E. lanceolatus, and a Leymus cinereus x L. triticoides interspecific hybrid. The ESTs were combined into unigene sets of 8 780 unigenes for P. spicata, 11 281 unigenes for Leymus, and 7 212 unigenes for Elymus. Unigenes were annotated based on putative orthology to genes from rice, wheat, barley, other Poaceae, Arabidopsis, and the non-redundant database of the NCBI. Simple sequence repeat (SSR) markers were developed, tested for amplification and polymorphism, and aligned to the rice genome. Leymus EST markers homologous to rice chromosome 2 genes were syntenous on Leymus homeologous groups 6a and 6b (previously 1b), demonstrating promise for in silico comparative mapping. All ESTs and SSR markers are available on an EST information management and annotation database (http://titan.biotec.uiuc.edu/triticeae/).
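
The abstract above mentions developing SSR markers from EST unigenes but does not describe the detection step. As a rough illustration only, the sketch below scans a sequence for perfect di-, tri-, and tetranucleotide repeats with a regular expression. The function names and the minimum repeat counts are assumptions made for this example and are not taken from the paper.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for each perfect SSR in seq.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults are illustrative thresholds, not values from the abstract.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}  # assumed thresholds
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. "ACAC" is reported once, as the "AC" repeat
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    example = "TTT" + "AG" * 7 + "TCG" + "CAT" * 5 + "GGA"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Primers for amplification would then be designed in the flanking sequence on either side of each hit; that step is outside the scope of this sketch.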

3.
Triticeae species (including wheat, barley and rye) have huge and complex genomes due to polyploidization and a high content of transposable elements (TEs). TEs are known to play a major role in the structure and evolutionary dynamics of Triticeae genomes. During the last 5 years, substantial stretches of contiguous genomic sequence from various species of Triticeae have been generated, making it necessary to update and standardize TE annotations and nomenclature. In this study we propose standard procedures for these tasks, based on structure, nucleic acid and protein sequence homologies. We report statistical analyses of TE composition and distribution in large blocks of genomic sequences from wheat and barley. Altogether, 3.8 Mb of wheat sequence available in the databases was analyzed or re-analyzed, and compared with 1.3 Mb of re-annotated genomic sequences from barley. The wheat sequences were relatively gene-rich (one gene per 23.9 kb), although wheat gene-derived sequences represented only 7.8% (159 elements) of the total, while the remainder mainly comprised coding sequences found in TEs (54.7%, 751 elements). Class I elements [mainly long terminal repeat (LTR) retrotransposons] accounted for the major proportion of TEs, in terms of sequence length as well as element number (83.6% and 498, respectively). In addition, we show that the gene-rich sequences of wheat genome A seem to have a higher TE content than those of genomes B and D, or of barley gene-rich sequences. Moreover, among the various TE groups, MITEs were most often associated with genes: 43.1% of MITEs fell into this category. Finally, the TRIM and copia elements were shown to be the most active TEs in the wheat genome. The implications of these results for the evolution of diploid and polyploid wheat species are discussed. Electronic Supplementary Material: Supplementary material is available for this article at
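
The gene density quoted in the abstract above can be cross-checked against its other figures. The short sketch below assumes that the "159 elements" are the annotated genes in the 3.8 Mb of wheat sequence; under that reading the numbers appear consistent. The function names are made up for this example.

```python
def expected_gene_count(total_kb, kb_per_gene):
    """Number of genes implied by a sequence length and an average spacing."""
    return round(total_kb / kb_per_gene)


def gene_spacing_kb(total_kb, n_genes):
    """Average spacing (kb per gene) implied by a gene count."""
    return round(total_kb / n_genes, 1)


if __name__ == "__main__":
    wheat_kb = 3.8 * 1000  # 3.8 Mb of analyzed wheat sequence
    print(expected_gene_count(wheat_kb, 23.9))  # genes implied by 23.9 kb/gene
    print(gene_spacing_kb(wheat_kb, 159))       # spacing implied by 159 genes
```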

4.
A large number of wheat (Triticum aestivum) and barley (Hordeum vulgare) varieties have evolved in agricultural ecosystems since domestication. Because of the large, repetitive genomes of these Triticeae crops, sequence information is limited and molecular differences between modern varieties are poorly understood. To study intraspecies genomic diversity, we compared large genomic sequences at the Lr34 locus of the wheat varieties Chinese Spring, Renan, and Glenlea, and diploid wheat Aegilops tauschii. Additionally, we compared the barley loci Vrs1 and Rym4 of the varieties Morex, Cebada Capa, and Haruna Nijo. Molecular dating showed that the wheat D genome haplotypes diverged only a few thousand years ago, while some barley and Ae. tauschii haplotypes diverged more than 500,000 years ago. This suggests gene flow from wild barley relatives after domestication, whereas this was rare or absent in the D genome of hexaploid wheat. In some segments, the compared haplotypes were very similar to each other, but for two varieties each at the Rym4 and Lr34 loci, sequence conservation showed a breakpoint that separates a highly conserved from a less conserved segment. We interpret this as recombination breakpoints of two ancient haplotypes, indicating that the Triticeae genomes are a heterogeneous and variable mosaic of haplotype fragments. Analysis of insertions and deletions showed that large events caused by transposable element insertions, illegitimate recombination, or unequal crossing over were relatively rare. Most insertions and deletions were small and caused by template slippage in short homopolymers of only a few base pairs in size. Such frequent polymorphisms could be exploited for future molecular marker development.

5.
6.
MicroRNAs (miRNAs) and the mRNA targets of miRNAs were identified by sequence complementarity within a DNA sequence database for species of the Triticeae. Data screening identified 28 miRNA precursor sequences from 15 miRNA families that contained conserved mature miRNA sequences within predicted stem-loop structures. In addition, the identification of 337 target sequences among Triticeae genes provided further evidence of the existence of 26 miRNA families in the cereals. MicroRNA targets included genes that are homologous to known targets in diverse model species as well as novel targets. MicroRNA precursors and targets were identified in 10 related species, though the great majority of them were identified in bread wheat, Triticum aestivum, and barley, Hordeum vulgare, the two species with the largest EST data sets among the Triticeae.
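
The abstract above identifies miRNA targets "by sequence complementarity" without giving the scoring rules. A minimal sketch of the general idea follows: take the reverse complement of the mature miRNA and slide it along a transcript, allowing a small number of mismatches. The mismatch limit, the function names, and the toy sequence are assumptions for illustration, and real target predictors also weight mismatch position and G:U pairs, which this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(mirna, transcript, max_mismatches=3):
    """Return (position, mismatches) for near-complementary sites.

    The transcript may be given as DNA or RNA; T is converted to U.
    max_mismatches is an assumed cutoff, not one stated in the abstract.
    """
    site = reverse_complement(mirna)
    transcript = transcript.upper().replace("T", "U")
    hits = []
    for i in range(len(transcript) - len(site) + 1):
        window = transcript[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


if __name__ == "__main__":
    toy_mirna = "AUGGCUACGUUAGCCAUGGA"  # made-up 20-nt sequence, not a real miRNA
    toy_transcript = "GGG" + reverse_complement(toy_mirna) + "CCC"
    print(find_target_sites(toy_mirna, toy_transcript, max_mismatches=0))
```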

7.
The DNA sequence of an extracellular (EXC) domain of an oat (Avena sativa L.) receptor-like kinase (ALrk10) gene was amplified from 23 accessions of 15 Avena species (6 diploid, 6 tetraploid, and 3 hexaploid). Primers were designed from one partial oat ALrk10 clone that had been used to map the gene in hexaploid oat to linkage groups syntenic to Triticeae chromosomes 1 and 3. Cluster (phylogenetic) analyses showed that all of the oat DNA sequences amplified with these primers are orthologous to the wheat and barley sequences that are located on chromosome 1 of the Triticeae species. Triticeae chromosome 3 Lrk10 sequences were not amplified using these primers. Cluster analyses provided evidence for multiple copies at a locus. The analysis divided the ALrk EXC sequences into two groups, one of which included AA and AABB genome species and the other CC, AACC, and CCCC genome species. Both groups of sequences were found in hexaploid AACCDD genome species, but not in all accessions. The C genome group was divided into 3 subgroups: (i) the CC diploids and the perennial autotetraploid, Avena macrostachya (this supports other evidence for the presence of the C genome in this autotetraploid species); (ii) a sequence from Avena maroccana and Avena murphyi and several sequences from different accessions of A. sativa; and (iii) A. murphyi and sequences from A. sativa and Avena sterilis. This suggests a possible polyphyletic origin for A. sativa from the AACC progenitor tetraploids or an origin from a progenitor of the AACC tetraploids. The sequences of the A genome group were not as clearly divided into subgroups. Although a group of sequences from the accession 'SunII' and a sequence from line Pg3 are clearly different from the others, the A genome diploid sequences were interspersed with tetraploid and hexaploid sequences.

8.
Gramene, a tool for grass genomics   Total citations: 11, self-citations: 0, citations by others: 11
Gramene (http://www.gramene.org) is a comparative genome mapping database for grasses and a community resource for rice (Oryza sativa). It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, and publications, with a curated database of rice mutants (genes and alleles), molecular markers, and proteins. Gramene curators read and extract detailed information from published sources, summarize that information in a structured format, and establish links to related objects both inside and outside the database, providing seamless connections between independent sources of information. Genetic, physical, and sequence-based maps of rice serve as the fundamental organizing units and provide a common denominator for moving across species and genera within the grass family. Comparative maps of rice, maize (Zea mays), sorghum (Sorghum bicolor), barley (Hordeum vulgare), wheat (Triticum aestivum), and oat (Avena sativa) are anchored by a set of curated correspondences. In addition to sequence-based mappings found in comparative maps and rice genome displays, Gramene makes extensive use of controlled vocabularies to describe specific biological attributes in ways that permit users to query those domains and make comparisons across taxonomic groups. Proteins are annotated for functional significance using gene ontology terms that have been adopted by numerous model species databases. Genetic variants including phenotypes are annotated using plant ontology terms common to all plants and trait ontology terms that are specific to rice. In this paper, we present a brief overview of the search tools available to the plant research community in Gramene.

9.
With the increasing quantities of Brassica genomic data being entered into the public domain and in preparation for the complete Brassica genome sequencing effort, there is a growing requirement for the structuring and detailed bioinformatic analysis of Brassica genomic information within a user-friendly database. At the Plant Biotechnology Centre, Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data, to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA, a sequence processing pipeline incorporating annotation against GenBank, SwissProt and Arabidopsis Gene Ontology (GO) data and tools for molecular marker discovery and comparative genome analysis. All sequences are mined for simple sequence repeat (SSR) molecular markers using 'SSR primer' and mapped onto the complete Arabidopsis thaliana genome by sequence comparison. The database may be queried using a text-based search of sequence annotation or GO terms, BLAST comparison against resident sequences, or by the position of candidate orthologues within the Arabidopsis genome. Tools have also been developed and applied to the discovery of single nucleotide polymorphism (SNP) molecular markers and the in silico mapping of Brassica BAC end sequences onto the Arabidopsis genome. Planned extensions to this resource include the integration of gene expression data and the development of an EnsEMBL-based genome viewer.

10.
The recombinant plasmid dpTa1 has an insert of relic wheat DNA that represents a family of tandemly organized DNA sequences with a monomeric length of approximately 340 bp. This insert was used to investigate the structural organization of this element in the genomes of 58 species within the tribe Triticeae and in 7 species representing other tribes of the Poaceae. The main characteristic of the genomic organization of dpTa1 is a classical ladder-type pattern which is typical for tandemly organized sequences. The dpTa1 sequence is present in all of the genomes of the Triticeae species examined and in 1 species from a closely related tribe (Bromus inermis, Bromeae). DNA from Hordelymus europaeus (Triticeae) did not hybridize under the standard conditions used in this study. Prolonged exposure was necessary to obtain a weak signal. Our data suggest that the dpTa1 family is quite old in evolutionary terms, probably more ancient than the tribe Triticeae. The dpTa1 sequence is more abundant in the D-genome of wheat than in other genomes in Triticeae. DNA from several species also has bands in addition to the tandem repeats. The dpTa1 sequence contains short direct and inverted subrepeats and is homologous to a tandemly repeated DNA sequence from Hordeum chilense.

11.
12.
A family of dispersed repetitive sequences (Hch1) which is present in the genome of the wild barley Hordeum chilense was studied in detail. Hch1 sequences are found both as part of short tandem arrays and dispersed throughout the H. chilense chromosomes. Subcloning of sections of the sequence reveals that it is composed of unrelated classes of sequences which can also be found separately in other genomic locations. Analysis of these sequences in the genomes of wheat and two other wild barley species strongly suggests that specific amplifications and arrangements of the repeated sequences have taken place during speciation. Nucleotide sequence analysis fails to detect, in their entirety, the features shown by plant transposons.

13.
Use of wild relatives to improve salt tolerance in wheat   Total citations: 3, self-citations: 0, citations by others: 3
There is considerable variability in salt tolerance amongst members of the Triticeae, with the tribe even containing a number of halophytes. This is a review of what is known of the differences in salt tolerance of selected species in this tribe of grasses, and the potential to use wild species to improve salt tolerance in wheat. Most investigators have concentrated on differences in ion accumulation in leaves, describing a desirable phenotype with low leaf Na+ concentration and a high K+/Na+ ratio. Little information is available on other traits (such as "tissue tolerance" of accumulated Na+ and Cl-) that might also contribute to salt tolerance. The sources of Na+ "exclusion" amongst the various genomes that make up tetraploid (AABB) durum wheat (Triticum turgidum L. ssp. durum), hexaploid (AABBDD) bread wheat (Triticum aestivum L. ssp. aestivum), and wild relatives (e.g. Aegilops spp., Thinopyrum spp., Elytrigia elongata syn. Lophopyrum elongatum, Hordeum spp.) are described. The halophytes display a capacity for Na+ "exclusion", and in some cases Cl- "exclusion", even at relatively high salinity. Significantly, it is possible to hybridize several wild species in the Triticeae with durum and bread wheat. Progenitors have been used to make synthetic hexaploids. Halophytic relatives, such as tall wheatgrass spp., have been used to produce amphiploids, disomic chromosome addition and substitution lines, and recombinant lines in wheat. Examples of improved Na+ "exclusion" and enhanced salt tolerance in various derivatives from these various hybridization programmes are given. As several sources of improved Na+ "exclusion" are now known to reside on different chromosomes in various genomes of species in the Triticeae, further work to identify the underlying mechanisms and then to pyramid the controlling genes for the various traits, that could act additively or even synergistically, might enable substantial gains in salt tolerance to be achieved.

14.
During the initial phases of a wheat endosperm Expressed-Sequence-Tag (EST) project, several clones were determined to be related to wheat gliadin sequences, but not similar enough to be classified into any of the traditional gliadin families [α-, γ-, and ω-gliadins, low-molecular-weight (LMW) glutenins]. Complete sequences of these cDNA clones revealed four new classes of gliadin-related endosperm proteins, but lacking a prominent repeat domain which until now has been characteristic of the gliadins. Two of these classes are related to different minimally described groups of Triticeae endosperm proteins. One class of proteins, which has N-terminal amino-acid sequences matching members of a reported 25-kDa globulin family from wheat, is shown by amino-acid sequencing to match to a family of 25-kDa endosperm proteins, is encoded by a multigene family, and is most similar to the LMW-glutenins. A second new class shows N-terminal homologies to LMW secalins from rye, and has an amino-acid composition similar to wheat and barley LMW proteins with extraction properties similar to prolamins. The third class is most similar to α-gliadins, and the fourth class has no close association to previously described wheat endosperm proteins. Received: 20 October 2000 / Accepted: 20 November 2000

15.
Pathways database system: an integrated system for biological pathways   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: During the next phase of the Human Genome Project, research will focus on functional studies of attributing functions to genes, their regulatory elements, and other DNA sequences. To facilitate the use of genomic information in such studies, a new modeling perspective is needed to examine and study genome sequences in the context of many kinds of biological information. Pathways are the logical format for modeling and presenting such information in a manner that is familiar to biological researchers. RESULTS: In this paper we present an integrated system, called Pathways Database System, with a set of software tools for modeling, storing, analyzing, visualizing, and querying biological pathways data at different levels of genetic, molecular, biochemical and organismal detail. The novel features of the system include: (a) genomic information integrated with other biological data and presented from a pathway, rather than from the DNA sequence, perspective; (b) design for biologists who are possibly unfamiliar with genomics, but whose research is essential for annotating gene and genome sequences with biological functions; (c) database design, implementation and graphical tools which enable users to visualize pathways data in multiple abstraction levels, and to pose predetermined queries; and (d) an implementation that allows for web(XML)-based dissemination of query outputs (i.e. pathways data) to researchers in the community, giving them control on the use of pathways data. AVAILABILITY: Available on request from the authors.

16.
In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25,320 structural domains and a further 160,000 sequence relatives, has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153-165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.

17.
Brachypodium is well suited as a model system for temperate grasses because of its compact genome and a range of biological features. In an effort to develop resources for genome research in this emerging model species, we constructed 2 bacterial artificial chromosome (BAC) libraries from an inbred diploid Brachypodium distachyon line, Bd21, using restriction enzymes HindIII and BamHI. A total of 73,728 clones (36,864 per BAC library) were picked and arrayed in 192 384-well plates. The average insert size for the BamHI and HindIII libraries is estimated to be 100 and 105 kb, respectively, and inserts of chloroplast origin account for 4.4% and 2.4%, respectively. The libraries individually represent 9.4- and 9.9-fold haploid genome equivalents with combined 19.3-fold genome coverage, based on a genome size of 355 Mb reported for the diploid Brachypodium, implying a 99.99% probability that any given specific sequence will be present in each library. Hybridization of the libraries with 8 starch biosynthesis genes was used to empirically evaluate this theoretical genome coverage; the frequency at which these genes were present in the library clones gave an estimated coverage of 11.6- and 19.6-fold genome equivalents. To obtain a first view of the sequence composition of the Brachypodium genome, 2185 BAC end sequences (BES) representing 1.3 Mb of random genomic sequence were compared with the NCBI GenBank database and the GIRI repeat database. Using a cutoff expectation value of E < 10^-10, only 3.3% of the BESs showed similarity to repetitive sequences in the existing database, whereas 40.0% had matches to the sequences in the EST database, suggesting that a considerable portion of the Brachypodium genome is likely transcribed. When the BESs were compared with individual EST databases, more matches hit wheat than maize, although their EST collections are of a similar size, further supporting the close relationship between Brachypodium and the Triticeae. Moreover, 122 BESs have significant matches to wheat ESTs mapped to individual chromosome bin positions. These BACs represent colinear regions containing the mapped wheat ESTs and would be useful in identifying additional markers for specific wheat chromosome regions.
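
The abstract above links fold coverage to a 99.99% probability of recovering any given sequence but does not say how that probability was computed. The sketch below assumes the commonly used Clarke and Carbon estimate, P = 1 - (1 - I/G)^N for N clones of insert size I in a genome of size G, together with its Poisson approximation P ≈ 1 - e^(-c) for c-fold coverage. The function names are invented for this example. The raw clone counts are used without the abstract's chloroplast correction, so the fold values computed here need not match the quoted 9.4- and 9.9-fold figures.

```python
import math


def prob_from_fold(fold_coverage):
    """Poisson approximation: chance a given sequence is in a c-fold library."""
    return 1.0 - math.exp(-fold_coverage)


def prob_clarke_carbon(n_clones, insert_kb, genome_mb):
    """Clarke and Carbon estimate from clone number, insert size, genome size."""
    fraction = insert_kb / (genome_mb * 1000.0)  # share of genome per clone
    return 1.0 - (1.0 - fraction) ** n_clones


if __name__ == "__main__":
    # Fold coverages quoted in the abstract.
    for fold in (9.4, 9.9):
        print(fold, round(prob_from_fold(fold) * 100, 3))
    # Uncorrected clone count and insert size for one library, 355 Mb genome.
    print(round(prob_clarke_carbon(36864, 100, 355) * 100, 3))
```

Both routes give probabilities above 99.99%, which is in line with the figure stated in the abstract.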

18.
The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is one component of a community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology. MGD strives to provide an extensively integrated information resource with experimental details annotated from both literature and on-line genomic data sources. MGD curates and presents the consensus representation of genotype (sequence) to phenotype information including highly detailed information about genes and gene products. Primary foci of integration are through representations of relationships between genes, sequences and phenotypes. MGD collaborates with other bioinformatics groups to curate a definitive set of information about the laboratory mouse. Recent developments include a general implementation of database structures for controlled vocabularies and the integration of a phenotype classification system.

19.
Direct genomic DNA amplification with the primers recognizing the NBS–kinase sequence of the wheat gene Cre3 (GenBank accession AF052641) was used to obtain partial homologs of this gene in perennial and annual rye, wheat, and tall wheatgrass. The nucleotide sequences of the cloned fragments and their deduced amino acid sequences were compared to the already-known Cre3 homologs in other wheat, Aegilops, and barley genotypes. Within the tribe Triticeae, the extent of homology ranged from 86 to 94% for nucleotide sequences and from 74 to 96% for the deduced amino acid sequences, with the most variable region between the Kin3 and PR3 conserved motifs.

20.
B R Baum, L G Bailey. Génome, 2000, 43(1): 79-85
Fifty-three units of 5S rDNA sequences from five accessions of Kengyilia rigidula, a member of the tribe Triticeae that also includes wheat, barley, rye, and their wild relatives, have been amplified by the polymerase chain reaction (PCR), cloned, and sequenced. The genome of K. rigidula consists of three haplomes, St, P, and Y. An evaluation of the aligned sequences of the diverse 53 different 5S DNA units yielded three 5S-unit classes. One unit class, Long S1, was assignable to the St haplome, one unit class, the Long P1, was assignable to the P haplome, and a third unit class, Long H1, was assignable to the H haplome. The last was expected to be assignable to the Y haplome, based on previous knowledge. Evolutionary scenarios are put forward to explain this finding. Among those possibilities is that the number of copies of units assignable to the Y haplome is very small and difficult to detect. Short units, reported earlier in K. alatavica, were not found in K. rigidula.

