Similar articles
20 similar articles found.
1.
About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.
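
The coverage figures quoted in the abstract above can be checked with simple arithmetic. The following is a minimal Python sketch, not taken from the paper: the variable and function names are illustrative, and the genome size of 2,300 Mb is the approximate value the abstract itself quotes.

```python
# Figures as quoted in the abstract; names are illustrative assumptions.
N_CONTIGS = 66            # optical map contigs
MEAN_CONTIG_MB = 31.88    # average contig size in Mb
SPANNED_MB = 2103.93      # total span of the optical map in Mb
GENOME_MB = 2300.0        # approximate maize genome size in Mb


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Fraction of the genome spanned by the optical map.
coverage_pct = percent(SPANNED_MB, GENOME_MB)

# Total size implied by contig count times mean contig size.
approx_total_mb = N_CONTIGS * MEAN_CONTIG_MB

print(f"coverage: {coverage_pct:.1f}%")
print(f"contigs x mean size: {approx_total_mb:.2f} Mb")
```

Both values agree with the abstract to within rounding of the reported mean contig size.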

2.
Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we sequenced and carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using the maize optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most families are low-copy. We identified 544 gene models using multiple levels of evidence, as well as five miRNA genes. Gene fragments, many captured by TEs, are prevalent within this region. Elimination of gene redundancy from a tetraploid maize ancestor that originated a few million years ago is responsible in this region for most disruptions of synteny with sorghum and rice. Consistent with other sub-genomic analyses in maize, small RNA mapping showed that many small RNAs match TEs and that most TEs match small RNAs. These results, performed on ∼1% of the maize genome, demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such improvements, along with those of gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses.

3.
4.
Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with incoming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (∼1,993 Mb), were ordered and oriented. Finally, we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

5.
The Genomes of Oryza sativa: a history of duplications   Cited by: 6 (self-citations: 0; citations by others: 6)
Yu J  Wang J  Lin W  Li S  Li H  Zhou J  Ni P  Dong W  Hu S  Zeng C  Zhang J  Zhang Y  Li R  Xu Z  Li S  Li X  Zheng H  Cong L  Lin L  Yin J  Geng J  Li G  Shi J  Liu J  Lv H  Li J  Wang J  Deng Y  Ran L  Shi X  Wang X  Wu Q  Li C  Ren X  Wang J  Wang X  Li D  Liu D  Zhang X  Ji Z  Zhao W  Sun Y  Zhang Z  Bao J  Han Y  Dong L  Ji J  Chen P  Wu S  Liu J  Xiao Y  Bu D  Tan J  Yang L  Ye C  Zhang J  Xu J  Zhou Y  Yu Y  Zhang B  Zhuang S  Wei H  Liu B  Lei M  Yu H  Li Y  Xu H  Wei S  He X  Fang L  Zhang Z  Zhang Y  Huang X  Su Z  Tong W  Li J  Tong Z  Li S  Ye J  Wang L 《PLoS biology》2005,3(2):e38
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000–40,000. Only 2%–3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.

6.
With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

7.
The r1 and b1 genes of maize, each derived from the chromosomes of two progenitors that hybridized >4.8 million years ago (MYA), have been a rich source for studying transposition, recombination, genomic imprinting, and paramutation. To provide a phylogenetic context to the genetic studies, we sequenced orthologous regions from maize and sorghum (>600 kb) surrounding these genes and compared them with the rice genome. This comparison showed that the homologous regions underwent complete or partial gene deletions, selective retention of orthologous genes, and insertion of nonorthologous genes. Phylogenetic analyses of the r/b genes revealed that the ancestral gene was amplified independently in different grass lineages, that rice experienced an intragenomic gene movement and parallel duplication, that the maize r1 and b1 genes are descendants of two divergent progenitors, and that the two paralogous r genes of sorghum are almost as old as the sorghum lineage. Such sequence mobility also extends to linked genes. The cisZOG genes are characterized by gene amplification in an ancestral grass, parallel duplications and deletions in different grass lineages, and movement to a nonorthologous position in maize. In addition to gene mobility, both maize and rice regions experienced recent transposition (<3 MYA).

8.
Although copy number variation (CNV) has recently received much attention as a form of structural variation within the human genome, knowledge is still inadequate on fundamental CNV characteristics such as occurrence rate, genomic distribution and ethnic differentiation. In the present study, we used the Affymetrix GeneChip® Mapping 500K Array to discover and characterize CNVs in the human genome and to study ethnic differences of CNVs between Caucasians and Asians. Three thousand and nineteen CNVs, including 2381 CNVs in autosomes and 638 CNVs in the X chromosome, from 985 Caucasian and 692 Asian individuals were identified, with a mean length of 296 kb. Among these CNVs, 190 had frequencies greater than 1% in at least one ethnic group, and 109 showed significant ethnic differences in frequencies (p<0.01). After merging overlapping CNVs, 1135 copy number variation regions (CNVRs), covering approximately 439 Mb (14.3%) of the human genome, were obtained. Our findings of ethnic differentiation of CNVs, along with the newly constructed CNV genomic map, extend our knowledge on the structural variation in the human genome and may furnish a basis for understanding the genomic differentiation of complex traits across ethnic groups.
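
The abstract above mentions merging overlapping CNVs into CNV regions but does not state the exact overlap criterion. The following Python sketch shows one common way to do this, a sort-and-sweep interval merge per chromosome. It assumes that any shared position counts as an overlap; the function name `merge_cnvs` and the example coordinates are hypothetical and not from the study.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) intervals into regions.

    Assumption: intervals are closed, and two intervals on the same
    chromosome overlap when the next start is <= the current end.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        merged = []
        # Sorting by start lets a single left-to-right sweep find overlaps.
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Extend the current region if this interval reaches further.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions.extend((chrom, s, e) for s, e in merged)
    return regions


# Hypothetical coordinates, for illustration only.
example = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 1500, 1800),
    ("chrX", 50, 60),
]
regions = merge_cnvs(example)
print(regions)
```

In this toy input the first two chr1 intervals collapse into one region, so four CNVs yield three regions. The same effect explains why the number of CNVRs reported in the abstract is smaller than the number of CNVs.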

9.
In previous studies we reported the identification of several AFLP, RAPD and RFLP molecular markers linked to apospory in Paspalum notatum. The objective of this work was to sequence these markers, obtain their flanking regions by chromosome walking and perform an in silico mapping analysis in rice and maize. The methylation status of two apospory-related sequences was also assessed using methylation-sensitive RFLP experiments. Fourteen molecular markers were analyzed and several protein-coding sequences were identified. Copy number estimates and RFLP linkage analysis showed that the sequence PnMAI3 displayed 2–4 copies per genome and linkage to apospory. Extension of this marker by chromosome walking revealed an additional protein-coding sequence mapping in silico in the apospory-syntenic regions of rice and maize. Approximately 5 kb corresponding to different markers were characterized through the global sequencing procedure. A more refined analysis based on sequence information indicated synteny with segments of chromosomes 2 and 12 of rice and chromosomes 3 and 5 of maize. Two loci associated with the apomixis locus were tested in methylation-sensitive RFLP experiments using genomic DNA extracted from leaves. Although both target sequences were methylated, no methylation polymorphisms associated with the mode of reproduction were detected.

10.
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. The database also gives researchers who are mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.

11.
Handa H 《Nucleic acids research》2003,31(20):5907-5916
The entire mitochondrial genome of rapeseed (Brassica napus L.) was sequenced and compared with that of Arabidopsis thaliana. The 221 853 bp genome contains 34 protein-coding genes, three rRNA genes and 17 tRNA genes. This gene content is almost identical to that of Arabidopsis. However, the rps14 gene, which is a pseudo-gene in Arabidopsis, is intact in rapeseed. On the other hand, five tRNA genes are missing in rapeseed compared to Arabidopsis, although the set of mitochondrially encoded tRNA species is identical in the two Cruciferae. RNA editing events were systematically investigated on the basis of the sequence of the rapeseed mitochondrial genome. A total of 427 C to U conversions were identified in ORFs, which is nearly identical to the number in Arabidopsis (441 sites). The gene sequences and intron structures are mostly conserved (more than 99% similarity for protein-coding regions); however, only 358 editing sites (83% of total editings) are shared by rapeseed and Arabidopsis. Non-coding regions are mostly divergent between the two plants. One-third (about 78.7 kb) and two-thirds (about 223.8 kb) of the rapeseed and Arabidopsis mitochondrial genomes, respectively, cannot be aligned with each other and most of these regions do not show any homology to sequences registered in the DNA databases. The results of the comparative analysis between the rapeseed and Arabidopsis mitochondrial genomes suggest that higher plant mitochondria are extremely conservative with respect to coding sequences and somewhat conservative with respect to RNA editing, but that non-coding parts of plant mitochondrial DNA are extraordinarily dynamic with respect to structural changes, sequence acquisition and/or sequence loss.

12.

Background

The genus Cronobacter (formerly called Enterobacter sakazakii) is composed of five species: C. sakazakii, C. malonaticus, C. turicensis, C. muytjensii, and C. dublinensis. The genus includes opportunistic human pathogens, and the first three species have been associated with neonatal infections. The most severe diseases occur in neonates and include fatal necrotizing enterocolitis and meningitis. The genetic basis of the diversity within the genus is unknown, and few virulence traits have been identified.

Methodology/Principal Findings

We report here the first sequence of a member of this genus, C. sakazakii strain BAA-894. The genome of Cronobacter sakazakii strain BAA-894 comprises a 4.4 Mb chromosome (57% GC content) and two plasmids of 31 kb (51% GC) and 131 kb (56% GC). The genome was used to construct a 387,000-probe oligonucleotide tiling DNA microarray covering the whole genome. Comparative genomic hybridization (CGH) was undertaken on five other C. sakazakii strains, and representatives of the four other Cronobacter species. Among 4,382 annotated genes inspected in this study, about 55% of genes were common to all C. sakazakii strains and 43% were common to all Cronobacter strains, with 10–17% absence of genes.

Conclusions/Significance

CGH highlighted 15 clusters of genes in C. sakazakii BAA-894 that were divergent or absent in more than half of the tested strains; six of these are of probable prophage origin. Putative virulence factors were identified in these prophage and in other variable regions. A number of genes unique to Cronobacter species associated with neonatal infections (C. sakazakii, C. malonaticus and C. turicensis) were identified. These included a copper and silver resistance system known to be linked to invasion of the blood-brain barrier by neonatal meningitic strains of Escherichia coli. In addition, genes encoding multidrug efflux pumps and adhesins were identified that were unique to C. sakazakii strains from outbreaks in neonatal intensive care units.

13.
A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera’s database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000–20 000 NotI sites, of which 6000–9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.

14.
DNA double-strand breaks (DSBs), which are formed by the Spo11 protein, initiate meiotic recombination. Previous DSB-mapping studies have used rad50S or sae2Δ mutants, which are defective in break processing, to accumulate Spo11-linked DSBs, and report large (≥ 50 kb) “DSB-hot” regions that are separated by “DSB-cold” domains of similar size. Substantial recombination occurs in some DSB-cold regions, suggesting that DSB patterns are not normal in rad50S or sae2Δ mutants. We therefore developed a novel method to map genome-wide, single-strand DNA (ssDNA)–associated DSBs that accumulate in processing-capable, repair-defective dmc1Δ and dmc1Δ rad51Δ mutants. DSBs were observed at known hot spots, but also in most previously identified “DSB-cold” regions, including near centromeres and telomeres. Although approximately 40% of the genome is DSB-cold in rad50S mutants, analysis of meiotic ssDNA from dmc1Δ shows that most of these regions have substantial DSB activity. Southern blot assays of DSBs in selected regions in dmc1Δ, rad50S, and wild-type cells confirm these findings. Thus, DSBs are distributed much more uniformly than was previously believed. Comparisons of DSB signals in dmc1, dmc1 rad51, and dmc1 spo11 mutant strains identify Dmc1 as a critical strand-exchange activity genome-wide, and confirm previous conclusions that Spo11-induced lesions initiate all meiotic recombination.

15.

Background

SNPs are the most abundant polymorphism type, and have been explored in many crop genomic studies, including rice and maize. SNP discovery in allotetraploid cotton genomes has lagged behind that of other crops due to their complexity and polyploidy. In this study, genome-wide SNPs are detected systematically using next-generation sequencing and efficient SNP genotyping methods, and used to construct a linkage map and characterize the structural variations in polyploid cotton genomes.

Results

We construct an ultra-dense inter-specific genetic map comprising 4,999,048 SNP loci distributed unevenly in 26 allotetraploid cotton linkage groups and covering 4,042 cM. The map is used to order tetraploid cotton genome scaffolds for accurate assembly of G. hirsutum acc. TM-1. Recombination rates and hotspots are identified across the cotton genome by comparing the assembled draft sequence and the genetic map. Using this map, genome rearrangements and centromeric regions are identified in tetraploid cotton by combining information from the publicly-available G. raimondii genome with fluorescent in situ hybridization analysis.

Conclusions

We report the genotype-by-sequencing method used to identify millions of SNPs between G. hirsutum and G. barbadense. We construct and use an ultra-dense SNP map to correct sequence mis-assemblies, merge scaffolds into pseudomolecules corresponding to chromosomes, detect genome rearrangements, and identify centromeric regions in allotetraploid cottons. We find that the centromeric retro-element sequence of tetraploid cotton derived from the D subgenome progenitor might have invaded the A subgenome centromeres after allotetrapolyploid formation. This study serves as a valuable genomic resource for genetic research and breeding of cotton.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0678-1) contains supplementary material, which is available to authorized users.

16.
A fine physical map of the rice (Oryza sativa ssp. japonica var. Nipponbare) chromosome 5 with bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) clones was constructed through integration of 280 sequenced BAC/PAC clones and 232 sequence tagged site/expressed sequence tag markers with the use of fingerprinted contig data of the Nipponbare genome. This map consists of five contigs covering 99% of the estimated chromosome size (30.08 Mb). The four physical gaps were estimated at 30 and 20 kb for gaps 1–3 and gap 4, respectively. We have submitted 42.2-Mb sequences with 29.8 Mb of nonoverlapping sequences to public databases. BAC clones corresponding to telomere and centromere regions were confirmed by BAC-fluorescence in situ hybridization (FISH) on a pachytene chromosome. The genetically centromeric region at 54.6 cM was covered by a minimum tiling path spanning 2.1 Mb with no physical gaps. The precise position of the centromere was revealed by using three overlapping BAC/PACs for ~150 kb. In addition, FISH results revealed uneven chromatin condensation around the centromeric region at the pachytene stage. This map is of use for positional cloning and further characterization of the rice functional genomics.

Electronic supplementary material

Supplementary material is available in the online version of this article and is accessible to authorized users.

Chia-Hsiung Cheng and Mei-Chu Chung contributed equally.

17.
On the tetraploid origin of the maize genome   Cited by: 2 (self-citations: 0; citations by others: 2)
Data from cytological and genetic mapping studies suggest that maize arose as a tetraploid. Two previous studies investigating the most likely mode of maize origin arrived at different conclusions. Gaut and Doebley [7] proposed a segmental allotetraploid origin of the maize genome and estimated that the two maize progenitors diverged at 20.5 million years ago (mya). In a similar study, using a larger data set, Brendel and colleagues (quoted in [8]) suggested a single genome duplication at 16 mya. One of the key components of such analyses is to examine sequence divergence among strictly orthologous genes. In order to identify such genes, Lai and colleagues [10] sequenced five duplicated chromosomal regions from the maize genome and the orthologous counterparts from the sorghum genome. They also identified the orthologous regions in rice. Using positional information of genetic components, they identified 11 orthologous genes across the two duplicated regions of maize, and the sorghum and rice regions. Swigonova et al. [12] analyzed the 11 orthologues, and showed that all five maize chromosomal regions duplicated at the same time, supporting a tetraploid origin of maize, and that the two maize progenitors diverged from each other at about the same time as each of them diverged from sorghum, about 11.9 mya.

18.
Robertson HM 《Genetics》2009,181(1):323-325
Simple telomeres were identified in the genome assembly of the basal placozoan animal Trichoplax adhaerens. They have 1–2 kb of TTAGGG telomeric repeats, which are preceded by a subtelomeric region of 1.5–13 kb. Unlike subtelomeric regions in most animals examined, these subtelomeric regions are unique to each telomere.

19.
Three maize nuclear DNA fragments were isolated on the basis of their ability to confer replication on chimeric plasmids in yeast. These Eco RI fragments of 2.5, 2.8 and 5.5 kb are repeated elements within the maize genome. The 2.5 and 2.8 kb fragments represent a family of elements repeated 11 000 times in the maize haploid genome, while the 5.5 kb fragment is part of another family of 28 000 elements. These fragments were subcloned to further define the unique region of ARS activity. The sequence of each 550–650 bp ARS subclone is reported here, and compared to the flanking regions which do not show ARS activity. The ARS elements are 65–70% A+T as compared to 50–55% for the maize genome as a whole. There is approximately 15% sequence divergence, as well as variation of ARS efficiency, among family members. ARS subclones contain the proposed yeast consensus sequence.

20.
Purification and cDNA Cloning of Maize Poly(ADP)-Ribose Polymerase   Cited by: 1 (self-citations: 0; citations by others: 1)
Poly(ADP)-ribose polymerase (PADPRP) has been purified to apparent homogeneity from suspension cultures of the maize (Zea mays) callus line. The purified enzyme is a single polypeptide of approximately 115 kD, which appears to dimerize through an S-S linkage. The catalytic properties of the maize enzyme are very similar to those of its animal counterpart. The amino acid sequences of three tryptic peptides were obtained by microsequencing. Antibodies raised against peptides from maize PADPRP cross-reacted specifically with the maize enzyme but not with the enzyme from human cells, and vice versa. We have also characterized a 3.45-kb expressed-sequence-tag clone that contains a full-length cDNA for maize PADPRP. An open reading frame of 2943 bp within this clone encodes a protein of 980 amino acids. The deduced amino acid sequence of the maize PADPRP shows 40% to 42% identity and about 50% similarity to the known vertebrate PADPRP sequences. All important features of the modular structure of the PADPRP molecule, such as two zinc fingers, a putative nuclear localization signal, the automodification domain, and the NAD+-binding domain, are conserved in the maize enzyme. Northern-blot analysis indicated that the cDNA probe hybridizes to a message of about 4 kb.
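
The relation between the open reading frame length and the protein length quoted in the abstract above can be reproduced with a short calculation. This is a minimal Python sketch under two stated assumptions that are not from the paper: the 2943 bp figure includes the stop codon, and an average residue mass of about 110 Da (a common rule of thumb) is used for a rough mass estimate. All names are illustrative.

```python
ORF_BP = 2943             # open reading frame length quoted in the abstract
AVG_RESIDUE_DA = 110      # assumed average amino acid residue mass (rule of thumb)

# An ORF is read in triplets, so its length should be a multiple of 3.
assert ORF_BP % 3 == 0

codons = ORF_BP // 3      # total codons, including the stop codon (assumption)
residues = codons - 1     # the stop codon does not encode an amino acid

# Rough molecular mass estimate in kilodaltons.
approx_kda = residues * AVG_RESIDUE_DA / 1000

print(f"codons: {codons}, residues: {residues}")
print(f"approximate mass: {approx_kda:.1f} kDa")
```

Under these assumptions the calculation gives 980 residues, matching the abstract. The rough mass estimate is of the same order as the approximately 115 kD reported for the purified polypeptide, though a rule-of-thumb figure should not be read as a precise prediction.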
