Similar literature
1.
Reference sequences are sequences that are used for public consultation, and therefore must be of high quality. Using the whole‐genome shotgun/next‐generation sequencing approach, many genome sequences of complex higher plants have been generated in recent years, and are generally considered reference sequences. However, none of these sequences has been experimentally evaluated at the whole‐genome sequence assembly level. Rice has a relatively simple plant genome, and the genome sequences for its two sub‐species obtained using different sequencing approaches were published approximately 10 years ago. This provides a unique system for a case study to evaluate the qualities and utilities of published plant genome sequences. We constructed a robust BAC physical map embedding a large number of BAC end sequences for rice variety 93–11. Through BAC end sequence alignments and tri‐assembly comparisons of the 93–11 physical map and the two reference sequences, we found that the Nipponbare reference sequence generated using the clone‐by‐clone approach has a high quality but still contains small artifact inversions and missing sequences. In contrast, the 93–11 reference sequence generated using the whole‐genome shotgun approach contains many large and varied assembly errors, such as inversions, duplications and translocations, as well as missing sequences. The 93–11 physical map provides an invaluable resource for evaluation and improvements toward completion of both Nipponbare and 93–11 reference sequences.

2.
3.
4.
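Li Xiongwei, Jia Huijuan, Gao Zhongshan. 《遗传》 (Hereditas), 2013, 35(10): 1167-1178
Peach (Prunus persica [L.] Batsch) is an important stone fruit tree of the Rosaceae. It is highly adaptable and widely cultivated, and its fruit is popular with consumers for its flavor. Improving fruit quality and increasing resistance to diseases and insects have long been the focus of peach geneticists and breeders. This article reviews recent progress in peach on the construction of molecular-marker genetic linkage maps and physical maps, the development and application of molecular markers, and whole-genome and transcriptome sequencing. It also describes examples of genome-wide association studies that used high-density SNP array markers in peach and other crops, providing a theoretical basis for further genome-wide association analysis in peach, for mining QTLs for target traits, and for developing efficient selection markers for breeding.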

5.

Background  

Genome data from a number of completely sequenced eukaryotes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is available. However, SEGE is derived from GenBank. The redundant, incomplete and heterogeneous qualities of GenBank data are a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed.

6.
During the last ten years, Arabidopsis thaliana has become the most favoured plant system for the study of many aspects of development and adaptation to adverse conditions and diseases. The sequencing of the Arabidopsis thaliana genome is nearly completed, with more than 90% of the sequence released in public databases. This is the first plant genome to be analysed, and it has revealed a tremendous amount of information about the nature of the genes it contains and its largely duplicated organisation. French groups have been involved in Arabidopsis genomics at several steps: EST (expressed sequence tag) sequencing, construction and ordering (physical mapping of chromosomes) of a YAC (yeast artificial chromosome) library, and genomic sequencing. In parallel, an extensive programme of functional genomics is being undertaken through the systematic analysis of insertional mutants. This information provides support for analysing other, more economically important plant genomes such as the rice genome, constitutes the beginning of a systematic investigation of plant gene functions, and will promote new strategies for plant improvement.

7.
Phytoplasmas are a large group of plant‐pathogenic wall‐less, non‐helical, bacteria associated with diseases, collectively referred to as yellows diseases, in more than a thousand plant species worldwide. Many of these diseases are of great economic importance. Phytoplasmas are difficult to study, in particular because all attempts at culturing these plant pathogens under axenic conditions have failed. With the introduction of molecular methods into phytoplasmology about two decades ago, the genetic diversity of phytoplasmas could be elucidated and a system for their taxonomic classification based on phylogenetic traits established. In addition, a wealth of information was generated on phytoplasma ecology and genomics, phytoplasma–plant host interactions and phytoplasma–insect vector relationships. Taxonomically, phytoplasmas are placed in the class Mollicutes, closely related to acholeplasmas, and are currently classified within the provisional genus ‘Candidatus Phytoplasma’ based primarily on 16S rDNA sequence analysis. Phytoplasmas are characterised by a small genome. The sizes vary considerably, ranging from 530 to 1350 kilobases (kb), with overlapping values between the various taxonomic groups and subgroups, resembling in this respect the culturable mollicutes. The smallest chromosome, about 530 kb, is known to occur in the Bermuda grass white leaf agent ‘Ca. Phytoplasma cynodontis’. This value represents the smallest mollicute chromosome reported to date. In diseased plants, phytoplasmas reside almost exclusively in the phloem sieve tube elements and are transmitted from plant to plant by phloem‐feeding homopteran insects, mainly leafhoppers and planthoppers, and less frequently psyllids. Most of the phytoplasma host plants are angiosperms in which a wide range of specific and non‐specific symptoms are induced. 
Phytoplasmas have a unique and complex life cycle that involves colonisation of different environments, the plant phloem and various organs of the insect vectors. Furthermore, many phytoplasmas have an extremely wide plant host range. The dynamic architecture of phytoplasma genomes, due to the occurrence of repetitive elements of various types, may account for variation in their genome size and adaptation of phytoplasmas to the diverse environments of their plant and insect hosts. The availability of five complete phytoplasma genome sequences has made it possible to identify a considerable number of genes that are likely to play major roles in phytoplasma–host interactions. Among these, there are genes encoding surface membrane proteins and effector proteins. Also, it has been shown that phytoplasmas dramatically alter their gene expression upon switching between plant and insect hosts.

8.
9.
The species–area relationship (SAR) constitutes one of the most general ecological patterns globally. A number of different SAR models have been proposed. Recent work has shown that no single model universally provides the best fit to empirical SAR datasets: multiple models may be of practical and theoretical interest. However, there are no software packages available that a) allow users to fit the full range of published SAR models, or b) provide functions to undertake a range of additional SAR‐related analyses. To address these needs, we have developed the R package ‘sars’ that provides a wide variety of SAR‐related functionality. The package provides functions to: a) fit 20 SAR models using non‐linear and linear regression, b) calculate multi‐model averaged curves using various information criteria, and c) generate confidence intervals using bootstrapping. Plotting functions allow users to depict and scrutinize the fits of individual models and multi‐model averaged curves. The package also provides additional SAR functionality, including functions to fit, plot and evaluate the random placement model using a species–sites abundance matrix, and to fit the general dynamic model of oceanic island biogeography. The ‘sars’ R package will aid future SAR research by providing a comprehensive set of simple‐to‐use tools that enable in‐depth exploration of SARs and SAR‐related patterns. The package has been designed to allow other researchers to add new functions and models in the future and thus the package represents a resource for future SAR work that can be built on and expanded by workers in the field.
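As a rough illustration of what fitting a single SAR model by linear regression can look like, the sketch below fits the classic power model S = c·A^z by ordinary least squares on log-transformed data. It is not the ‘sars’ package or its API: the function names `fit_power_sar` and `predict_richness` are hypothetical, the choice of the power model is an assumption made for the example, and Python is used only for illustration.

```python
import math


def fit_power_sar(areas, richness):
    """Fit S = c * A**z by least squares on log(S) versus log(A).

    Hypothetical helper for illustration; all inputs must be positive.
    Returns the tuple (c, z).
    """
    if len(areas) != len(richness) or len(areas) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("areas must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx                        # slope of the log-log line
    c = math.exp(mean_y - z * mean_x)    # back-transformed intercept
    return c, z


def predict_richness(c, z, area):
    """Predicted species richness for a given area under the power model."""
    return c * area ** z
```

A call such as `fit_power_sar([1, 10, 100], [4, 7, 13])` returns the fitted `(c, z)` pair, which can then be passed to `predict_richness`. Comparing or averaging several such models, as the abstract describes, would additionally require an information criterion for each fit.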

10.
DNA microarrays were originally devised and described as a convenient technology for the global analysis of plant gene expression. Over the past decade, their use has expanded enormously to cover all kingdoms of living organisms. At the same time, the scope of applications of microarrays has increased beyond expression analyses, with plant genomics playing a leadership role in the on-going development of this technology. As the field has matured, the rate-limiting step has moved from that of the technical process of data generation to that of data analysis. We currently face major problems in dealing with the accumulating datasets, not simply with respect to how to archive, access, and process the huge amounts of data that have been and are being produced, but also in determining the relative quality of the different datasets. A major recognized concern is the appropriate use of statistical design in microarray experiments, without which the datasets are rendered useless. A vigorous area of current research involves the development of novel statistical tools specifically for microarray experiments. This article describes, in a necessarily selective manner, the types of platforms currently employed in microarray research and provides an overview of recent activities using these platforms in plant biology.

11.
Mathematical tools for quantifying plant–plant interactions are continuously improving, for example by attaining desirable statistical properties such as symmetry around zero (positive and negative effects have the same distribution). Standardisation is another such important property, making indices comparable between independent experiments, and can be achieved by standardisation for size. Using simulated data, here we show that an approach to standardisation by size that works well for indices of intensity is not appropriate for those of importance (intensity indices measure the absolute size of interaction effect, whilst importance indices quantify this effect as a proportion of the impact of the environment overall); our analyses also show that importance values can be overestimated in unproductive environments. These issues arise because importance indices use a reference value that is the “maximum growth on the gradient”. This causes problems when comparing the results from studies that examine different sections of an environmental gradient: the maximum growth of plants within these sections is different and so the indices are not easy to compare between different sections of a gradient. Although this may sound like an obvious point, such issues can often be overlooked and a general solution adopted. One such solution is to report raw data from separate studies so that values can be recomputed for combined datasets and thus standardised comparisons. Another solution is to use an off‐gradient reference that is the maximum growth measured under optimal conditions for a model target species (phytometer).

12.
Associating phenotypic traits and quantitative trait loci (QTL) to causative regions of the underlying genome is a key goal in agricultural research. InterStoreDB is a suite of integrated databases designed to assist in this process. The individual databases are species independent and generic in design, providing access to curated datasets relating to plant populations, phenotypic traits, genetic maps, marker loci and QTL, with links to functional gene annotation and genomic sequence data. Each component database provides access to associated metadata, including data provenance and parameters used in analyses, thus providing users with information to evaluate the relative worth of any associations identified. The databases include CropStoreDB, for management of population, genetic map, QTL and trait measurement data, SeqStoreDB for sequence-related data and AlignStoreDB, which stores sequence alignment information, and allows navigation between genetic and genomic datasets. Genetic maps are visualized and compared using the CMAP tool, and functional annotation from sequenced genomes is provided via an EnsEMBL-based genome browser. This framework facilitates navigation of the multiple biological domains involved in genetics and genomics research in a transparent manner within a single portal. We demonstrate the value of InterStoreDB as a tool for Brassica research. InterStoreDB is available from: http://www.interstoredb.org

13.
The concepts of coevolution and modularity have been studied separately for decades. Recent advances in genomics have led to the first systematic studies in each of these fields at the molecular level, resulting in several important discoveries. Both coevolution and modularity appear to be pervasive features of genomic data from all species studied to date, and their presence can be detected in many types of datasets, including genome sequences, gene expression data, and protein-protein interaction data. Moreover, the combination of these two ideas might have implications for our understanding of many aspects of biology, ranging from the general architecture of living systems to the causes of various human diseases.

14.
Vitis vinifera has been an emblematic plant for humans since the Neolithic period. Human civilization has been shaped by its domestication as both its medicinal and nutritional values were exploited. It is now cultivated on all habitable continents, and more than 5000 varieties have been developed. A global passion for the art of wine fuels innovation and a profound desire for knowledge on this plant. The genome sequence of a homozygotic cultivar and several RNA‐seq datasets on other varieties have been released, paving the way to gaining further insight into its biology and tailoring improvements to varieties. However, its genome annotation remains unpolished. In this issue of Proteomics, Chapman and Bellgard (Proteomics 2017, 17, 1700197) discuss how proteogenomics can help improve genome annotation. By mining shotgun proteomics data, they defined new protein‐coding genes, refined gene structures, and corrected numerous mRNA splicing events. This stimulating study shows how large international consortia could work together to improve plant and animal genome annotation on a large scale. To achieve this aim, time should be invested to generate comprehensive, high‐quality experimental datasets for a wide range of well‐defined lineages and exploit them with pipelines capable of handling giant datasets.

15.
Wang H  Sun D  Sun G 《Génome》2011,54(12):986-992
The phylogeny of diploid Hordeum species has been studied using both chloroplast and nuclear gene sequences. However, the studies of different nuclear datasets of Hordeum species often arrived at similar conclusions, whereas the studies of different chloroplast DNA data generally resulted in inconsistent conclusions. Although the monophyly of the genus is well supported by both morphological and molecular data, the intrageneric phylogeny is still a matter of controversy. To better understand the evolutionary history of Hordeum species, two chloroplast gene loci (trnD-trnT intergenic spacer and rps16 gene) and one nuclear marker (thioredoxin-like gene (HTL)) were used to explore the phylogeny of Hordeum species. Two obviously different types of trnD-trnT sequences were observed, with an approximately 210 base pair difference between these two types: one for American species, another for Eurasian species. The trnD-trnT data generally separated the diploid Hordeum species into Eurasian and American clades, with the exception of Hordeum marinum subsp. gussoneanum. The rps16 data also grouped most American species together and suggested that Hordeum flexuosum has a different plastid type from the remaining American species. The nuclear gene HTL data clearly divided Hordeum species into two clades: the Xu+H genome clade and the Xa+I genome clade. Within clades, H genome species were well separated from the Xu species, and the I genome species were well separated from the Xa genome species. Incongruence between the chloroplast and nuclear datasets was found and is discussed.

16.
Plants have evolved and diversified to reduce the damage imposed by infectious pathogens and herbivorous insects. Living a sedentary lifestyle, plants are constantly adapting to their environment. They employ various strategies to increase performance and fitness. Thus, plants developed cost‐effective strategies to defend against specific insects and pathogens. Plant defense, however, imposes selective pressure on insects and pathogens. This selective pressure provides incentives for pathogens and insects to diversify and develop strategies to counter plant defense. This results in an evolutionary arms race among plants, pathogens and insects. The ever‐changing adaptations and physiological alterations among these organisms make studying plant–vector–pathogen interactions a challenging and fascinating field. Studying plant defense and plant protection requires knowledge of the relationship among organisms and the adaptive strategies each organism utilizes. Therefore, this review focuses on the integral parts of plant–vector–pathogen interactions in order to understand the factors that affect plant defense and disease development. The review addresses plant–vector–pathogen co‐evolution, plant defense strategies, specificity of plant defenses and plant–vector–pathogen interactions. Improving the comprehension of these factors will provide a multi‐dimensional perspective for future research in pest and disease management.

17.
The analysis of cytosine methylation provides a new way to assess and describe epigenetic regulation at a whole-genome level in many eukaryotes. DNA methylation has a demonstrated role in genome stability and protection, regulation of gene expression and many other aspects of genome function and maintenance. BS-seq is a relatively unbiased method for profiling DNA methylation, with a resolution capable of measuring methylation at individual cytosines. Here we describe, as an example, a workflow to handle DNA methylation analysis, from BS-seq library preparation to data visualization. We describe some applications for the analysis and interpretation of these data. Our laboratory provides public access to plant DNA methylation data via visualization tools available at our “Next-Gen Sequence” websites (http://mpss.udel.edu), along with small RNA, RNA-seq and other data types.

18.
Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T‐DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene‐rich regions, resulting in direct gene knockout or activation of genes within 20–30 kb up‐ and downstream of the T‐DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T‐DNA‐tagged rice mutant population. We also discuss important features of T‐DNA activation‐ and knockout‐tagging and promoter‐trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high‐throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops.

19.
Transgene expression from the chloroplast (plastid) genome offers several attractions to plant biotechnologists, including high-level accumulation of foreign proteins, transgene stacking in operons and a lack of epigenetic interference with the stability of transgene expression. In addition, the technology provides an environmentally benign method of plant genetic engineering, because plastids and their genetic information are maternally inherited in most crops and thus are largely excluded from pollen transmission. During the past few years, researchers in both the public and private sectors have begun to explore possible areas of application of plastid transformation in plant biotechnology as a viable alternative to conventional nuclear transgenic technologies. Recent proof-of-concept studies highlight the potential of plastid genome engineering for the expression of resistance traits, the production of biopharmaceuticals and metabolic pathway engineering in plants.

20.
This paper studies the problem of building multiclass classifiers for tissue classification based on gene expression. The recent development of microarray technologies has enabled biologists to quantify gene expression of tens of thousands of genes in a single experiment. Biologists have begun collecting gene expression for a large number of samples. One of the urgent issues in the use of microarray data is to develop methods for characterizing samples based on their gene expression. The most basic step in this research direction is binary sample classification, which has been studied extensively over the past few years. This paper investigates the next step: multiclass classification of samples based on gene expression. The characteristics of expression data (e.g. a large number of genes with a small sample size) make the classification problem more challenging. The process of building multiclass classifiers is divided into two components: (i) selection of the features (i.e. genes) to be used for training and testing and (ii) selection of the classification method. This paper compares various feature selection methods as well as various state-of-the-art classification methods on various multiclass gene expression datasets. Our study indicates that the multiclass classification problem is much more difficult than the binary one for the gene expression datasets. The difficulty lies in the fact that the data are of high dimensionality and that the sample size is small. The classification accuracy appears to degrade very rapidly as the number of classes increases. In particular, the accuracy was very low regardless of the choice of methods for large-class datasets (e.g. NCI60 and GCM). While increasing the number of samples is a plausible solution to the problem of accuracy degradation, it is important to develop algorithms that are able to analyze multiple-class expression data effectively for these special datasets.
