Similar literature
 20 similar records found
1.
We have carried out a systematic analysis of how a set of selected features, including three new ones, contributes to the accuracy of operon prediction. Our analyses have led to a number of new insights about operon prediction, including that (i) different features have different levels of discerning power when used on adjacent gene pairs with different ranges of intergenic distance, (ii) certain features are universally useful for operon prediction while others are more genome-specific and (iii) the prediction reliability of operons is dependent on intergenic distances. Based on these new insights, our newly developed operon-prediction program achieves more accurate operon prediction than the previous ones, and it uses features that are most readily available from genomic sequences. Our prediction results indicate that our (non-linear) decision tree-based classifier can predict operons in a prokaryotic genome very accurately when a substantial number of operons in the genome are already known. For example, the prediction accuracy of our program can reach 90.2% and 93.7% on the Bacillus subtilis and Escherichia coli genomes, respectively. When no such information is available, our (linear) logistic function-based classifier can reach prediction accuracies of 84.6% and 83.3% for E. coli and B. subtilis, respectively.

2.
The resources available from Arabidopsis thaliana for interpreting functional attributes of wheat ESTs are reviewed. A focus of the review is a comparison between wheat EST sequences, generated from developing endosperm tissue, and the complete genomic sequence of Arabidopsis. The available information indicates that not only can tentative annotations be assigned to many wheat genes, but putative or unknown Arabidopsis gene annotations can also be improved by comparative genomics.

3.
Genomic variants such as single nucleotide polymorphisms, together with animal pedigree, are now used widely in routine genetic evaluations of livestock in many countries. Genomic information can be used not only to enhance the accuracy of prediction but also to verify pedigrees for animals that are extensively managed under natural mating, which enables multiple-sire mating groups to be used. By so doing, the rate of genetic gain is enhanced, and any bias associated with incorrect pedigrees is removed. This study used a set of 8 764 sheep genotypes to verify the pedigree based on both the conventional opposing homozygote method and a novel method that combines it with the genomic relationship matrix (GRM). The genomic relationship coefficients between verified pairs of animals showed on average a relationship of 0.50 with parent, 0.25 with grandparent, 0.13 with great grandparent, 0.50 with full-sibling and 0.27 with half-sibling. The minimum values obtained from these verified pairs were then used as thresholds to determine the pedigree for unverified pairs of animals and to detect potential errors in the pedigree. Using a case study from a partially genotyped population of UK sheep, the results from this study illustrate a powerful way to resolve parentage inconsistencies by combining the conventional ‘opposing homozygote’ method with the GRM for pedigree checking. In this way, previously undetected pedigree errors can be resolved.
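The conventional 'opposing homozygote' check mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes genotypes coded as 0/1/2 copies of one allele per SNP. The function names, the toy genotypes, and the `max_conflicts` tolerance are hypothetical and are not taken from the study.

```python
def count_opposing_homozygotes(parent, offspring):
    """Count loci where one animal is homozygous 0 and the other homozygous 2.

    Genotypes are lists of allele counts (0, 1 or 2) at the same SNPs.
    A true parent-offspring pair should show no such loci, apart from
    genotyping errors.
    """
    if len(parent) != len(offspring):
        raise ValueError("genotype vectors must cover the same SNPs")
    return sum(1 for p, o in zip(parent, offspring) if {p, o} == {0, 2})


def is_plausible_parent(parent, offspring, max_conflicts=0):
    """Accept the pair if conflicts stay within a tolerance.

    max_conflicts is an illustrative allowance for genotyping error.
    """
    return count_opposing_homozygotes(parent, offspring) <= max_conflicts


# Toy genotypes at six SNPs (hypothetical data).
sire = [0, 1, 2, 2, 0, 1]
lamb = [1, 1, 2, 1, 0, 0]
stranger = [2, 0, 0, 2, 2, 1]

print(count_opposing_homozygotes(sire, lamb))      # conflicts with the recorded sire
print(count_opposing_homozygotes(stranger, lamb))  # conflicts with an unrelated animal
```

In practice a small non-zero tolerance is usually needed, since a strict zero-conflict rule would reject true parents whenever a single SNP is miscalled.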

4.
Genomic imprinting is a conspicuous feature of the endosperm, a triploid tissue nurturing the embryo and synchronizing angiosperm seed development. An unknown subset of imprinted genes (IGs) is critical for successful seed development and should have highly conserved functions. Recent genome‐wide studies have found limited conservation of IGs among distantly related species, but there is a paucity of data from closely related lineages. Moreover, most studies focused on model plants with nuclear endosperm development, and comparisons with properties of IGs in cellular‐type endosperm development are lacking. Using laser‐assisted microdissection, we characterized parent‐specific expression in the cellular endosperm of three wild tomato lineages (Solanum section Lycopersicon). We identified 1025 candidate IGs and 167 with putative homologs previously identified as imprinted in distantly related taxa with nuclear‐type endosperm. Forty‐two maternally expressed genes (MEGs) and 17 paternally expressed genes (PEGs) exhibited conserved imprinting status across all three lineages, but differences in power to assess imprinted expression imply that the actual degree of conservation might be higher than that directly estimated (20.7% for PEGs and 10.4% for MEGs). Regardless, the level of shared imprinting status was higher for PEGs than for MEGs, indicating dissimilar evolutionary trajectories. Expression‐level data suggest distinct epigenetic modulation of MEGs and PEGs, and gene ontology analyses revealed MEGs and PEGs to be enriched for different functions. Importantly, our data provide evidence that MEGs and PEGs interact in modulating both gene expression and the endosperm cell cycle, and uncovered conserved cellular functions of IGs uniting taxa with cellular‐ and nuclear‐type endosperm.

5.
6.
7.
The compact disc (CD) is an ideal tool for reading, writing, and storing numeric information. It was used in this work as a support for constructing DNA microarrays suited for genomic analysis. The CD was divided into two functional areas: the external ring of the CD was used for multiparametric DNA analysis on arrays, and the inner portion was used for storing numeric information. Because polycarbonate and CD resins autofluoresce, a colorimetric method for DNA microarray detection was used that is well adapted for the fast detection necessary when using a CD reader. A double-sided CD reader was developed for the simultaneous analysis of both array and numeric data. The numeric data are engraved as pits in the CD tracks and produce a succession of 0/1 values through the modulation of the laser reflection when the edges of the pits are read. Another diffraction-based laser was placed above the CD for the detection of the DNA targets on the microarrays. Both readers fit easily in a PC tower. Both numeric and genomic information data were simultaneously acquired, and each array was reconstituted, analyzed, and processed for quantification by the appropriate software.

8.
9.
Cultivated peanut (Arachis hypogaea L.) is an important grain legume providing high‐quality cooking oil, rich proteins and other nutrients. Shelling percentage (SP) is the second most important agronomic trait after pod yield, and it significantly affects the economic value of peanut in the market. Deployment of diagnostic markers through genomics‐assisted breeding (GAB) can accelerate the process of developing improved varieties with enhanced SP. In this context, we deployed the QTL‐seq approach to identify genomic regions and candidate genes controlling SP in a recombinant inbred line population (Yuanza 9102 × Xuzhou 68‐4). Four libraries (two parents and two extreme bulks) were constructed and sequenced, generating 456.89–790.32 million reads and achieving 91.85%–93.18% genome coverage and 14.04–21.37 mean read depth. Comprehensive analysis of two sets of data (Yuanza 9102/two bulks and Xuzhou 68‐4/two bulks) using the QTL‐seq pipeline resulted in the discovery of two overlapping genomic regions (2.75 Mb on A09 and 1.1 Mb on B02). Nine candidate genes affected by 10 SNPs with non‐synonymous effects or in UTRs were identified in these regions for SP. Cost‐effective KASP (Kompetitive Allele‐Specific PCR) markers were developed for one SNP from the A09 chromosome and three SNPs from the B02 chromosome. Genotyping of the mapping population with these newly developed KASP markers confirmed the major control and stable expression of these genomic regions across five environments. The identified candidate genomic regions and genes for SP further provide an opportunity for gene cloning and deployment of diagnostic markers in molecular breeding for achieving high SP in improved varieties.

10.
Spliceosomal intron numbers and boundary sequences vary dramatically in eukaryotes. We found a striking correspondence between low intron number and strong sequence conservation of 5' splice sites (5'ss) across eukaryotic genomes. The phylogenetic pattern suggests that ancestral 5'ss were relatively weakly conserved, but that some lineages independently underwent both major intron loss and 5'ss strengthening. It seems that eukaryotic ancestors had relatively large intron numbers and 'weak' 5'ss, a pattern associated with frequent alternative splicing in modern organisms.

11.
Plants contain large mitochondrial genomes, which are several times as complex as those in animals, fungi or algae. However, genome size is not correlated with information content. The mitochondrial genome (mtDNA) of Arabidopsis specifies only 58 genes in 367 kb, whereas the 184 kb mtDNA in the liverwort Marchantia polymorpha codes for 66 genes, and the 58 kb genome in the green alga Prototheca wickerhamii encodes 63 genes. In the Arabidopsis mtDNA, genes for subunits of complex II, for several ribosomal proteins and for 16 tRNAs are missing, some of which have been transferred recently to the nuclear genome. Numerous integrated fragments originate from alien genomes, including 16 sequence stretches of plastid origin, 41 fragments of nuclear (retro)transposons and two fragments of fungal viruses. These immigrant sequences suggest that the large size of plant mitochondrial genomes is caused by secondary expansion as a result of integration and propagation, and is thus a derived trait established during the evolution of land plants.
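The claim that genome size does not track information content can be made concrete by working out gene density from the figures quoted in this abstract. The short Python sketch below does only that arithmetic. The dictionary layout and the function name are illustrative choices, and the gene counts and sizes are copied from the abstract as given.

```python
# (number of genes, genome size in kb), as quoted in the abstract above.
MT_GENOMES = {
    "Arabidopsis": (58, 367),
    "Marchantia polymorpha": (66, 184),
    "Prototheca wickerhamii": (63, 58),
}


def genes_per_100kb(genes, size_kb):
    """Gene density expressed as genes per 100 kb of mitochondrial DNA."""
    return 100.0 * genes / size_kb


densities = {name: genes_per_100kb(*entry) for name, entry in MT_GENOMES.items()}

# List the genomes from the lowest to the highest gene density.
for name, density in sorted(densities.items(), key=lambda item: item[1]):
    print(f"{name}: {density:.1f} genes per 100 kb")
```

On these numbers the largest genome has the lowest density, which is the pattern the abstract attributes to secondary expansion by integrated foreign sequences.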

12.
13.
We identified simple-sequence repeat polymorphisms in intron 8 of the RHD and RHCE genes, both of which contained the 5-bp repeat unit (AAAAT)n. We analyzed the polymorphisms of this short tandem repeat (STR) in 104 Japanese RhD-positive and 124 RhD-negative (87 RHD gene negative and 37 nonfunctional RHD gene positive) donors by the polymerase chain reaction (PCR) and subsequent typing by electrophoresis and silver staining. We found five alleles (10, 11, 12, 13, and 14 repeats) in the RHD gene and four (7, 8, 9, and 10 repeats) in the RHCE gene. The Rh phenotypes were closely associated with polymorphisms of the STR. The Ce allele had 12 repeats in the RHD gene and 9 repeats in the RHCE gene at high frequency. The cE allele frequently had 10–12 repeats in the RHD gene and 10 repeats in the RHCE gene. The 10 repeats in the RHCE gene were identified exclusively in the 87 RHD gene-negative donors and 9 repeats were identified only in those with the RhC antigen. These results indicate that both haplotypes of dce and dcE arose from single RHD gene deletion and recombination events, respectively. In the 37 RhD-negative donors with a nonfunctional RHD gene, 12 repeats in the RHD gene and 9 repeats in the RHCE gene were frequently observed. Thus, the RhD-negative with a nonfunctional RHD gene combination might have arisen from the DCe haplotype via a mutation that abolished RHD gene expression. These findings suggest that the STR polymorphisms might shed light upon the molecular evolution of RH haplotypes. Received: 30 November 1998 / Accepted: 8 February 1999

14.
MOTIVATION: Genome sequencing projects and high-throughput technologies such as DNA and protein arrays have produced a very large amount of information-rich data. Microarray experimental data are a valuable but limited source for inferring gene regulation mechanisms on a genomic scale. Additional information such as promoter sequences of genes/DNA binding motifs, gene ontologies, and location data, when combined with gene expression analysis, can increase the statistical significance of the findings. This paper introduces a machine learning approach to information fusion for combining heterogeneous genomic data. The algorithm uses an unsupervised joint learning mechanism that identifies clusters of genes using the combined data. RESULTS: The correlation between gene expression time-series patterns obtained from different experimental conditions and the presence of several distinct and repeated motifs in their upstream sequences is examined here using publicly available yeast cell-cycle data. The results show that the combined learning approach taken here identifies correlated genes effectively. The algorithm provides an automated clustering method, but allows the user to specify a priori the influence of each data type on the final clustering using probabilities. AVAILABILITY: Software code is available by request from the first author. CONTACT: jkasturi@cse.psu.edu.

15.
The ancestral form of the cultivated tomato was originally confined to the Peru-Ecuador area. After spreading north, possibly as a weed in pre-Columbian times, it was not extensively domesticated until it reached Mexico, and from there the cultivated forms were disseminated.

16.
In order to rapidly assign relatively large numbers of tomato genomic clones to specific chromosomes, we have developed the following approach: groups of five to eight clones from a single copy tomato library are pooled, nick-translated, and utilized as probes against Southern blots consisting of a panel of trisomic DNAs. Since the trisomic DNAs are digested with the same enzyme used to produce the genomic library (PstI), each hybridizing band can be related to a specific genomic clone and the relative intensities of the bands can be used to assign each to a specific chromosome. With this technique, we have assigned 52 clones to specific chromosomes and verified the assignment of 21 out of 23 by genetic mapping in a segregating F2 population. In addition to selecting clones according to chromosome, the idea of using multiple clones or "molecular darts" may have broader applications, such as screening for differences between the genomes of nearly isogenic lines.

17.

Background

The incorporation of genomic coefficients into the numerator relationship matrix allows estimation of breeding values using all phenotypic, pedigree and genomic information simultaneously. In such a single-step procedure, genomic and pedigree-based relationships have to be compatible. As there are many options to create genomic relationships, there is a question of which is optimal and what the effects of deviations from optimality are.

Methods

Data of litter size (total number born per litter) for 338,346 sows were analyzed. Illumina PorcineSNP60 BeadChip genotypes were available for 1,989. Analyses were carried out with the complete data set and with a subset of genotyped animals and three generations pedigree (5,090 animals). A single-trait animal model was used to estimate variance components and breeding values. Genomic relationship matrices were constructed using allele frequencies equal to 0.5 (G05), equal to the average minor allele frequency (GMF), or equal to observed frequencies (GOF). A genomic matrix considering random ascertainment of allele frequencies was also used (GOF*). A normalized matrix (GN) was obtained to have average diagonal coefficients equal to 1. The genomic matrices were combined with the numerator relationship matrix creating H matrices.

Results

In G05 and GMF, both diagonal and off-diagonal elements were on average greater than the pedigree-based coefficients. In GOF and GOF*, the average diagonal elements were smaller than pedigree-based coefficients. The mean of off-diagonal coefficients was zero in GOF and GOF*. Choices of G with average diagonal coefficients different from 1 led to greater estimates of additive variance in the smaller data set. The correlations between EBV and genomic EBV (n = 1,989) were: 0.79 using G05, 0.79 using GMF, 0.78 using GOF, 0.79 using GOF*, and 0.78 using GN. Accuracies calculated by inversion increased with all genomic matrices. The accuracies of genomic-assisted EBV were inflated in all cases except when GN was used.

Conclusions

Parameter estimates may be biased if the genomic relationship coefficients are on a different scale from the pedigree-based coefficients. A reasonable scaling may be obtained by using observed allele frequencies and re-scaling the genomic relationship matrix to obtain average diagonal elements of 1.
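The alternative genomic matrices compared in this abstract differ mainly in the allele frequencies used for centering and scaling. The following is a minimal pure-Python sketch under stated assumptions: genotypes are coded 0/1/2, G is built as ZZ' / (2 Σ p(1 − p)) with Z the genotypes centered on 2p (a widely used construction), and GN is taken to be G divided by its mean diagonal. The abstract does not give the exact formulas, so these choices, the function names, and the toy data are illustrative only.

```python
def observed_frequencies(genotypes):
    """Per-SNP frequency of the counted allele across all animals."""
    n = len(genotypes)
    return [sum(column) / (2.0 * n) for column in zip(*genotypes)]


def genomic_relationship(genotypes, freqs):
    """G = ZZ' / (2 * sum p(1 - p)), with Z = genotypes centered on 2p."""
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all loci are monomorphic at the given frequencies")
    centered = [[g - 2.0 * p for g, p in zip(row, freqs)] for row in genotypes]
    n = len(centered)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / scale for j in range(n)]
        for i in range(n)
    ]


def normalize_mean_diagonal(G):
    """Rescale G so that its average diagonal element equals 1."""
    mean_diag = sum(G[i][i] for i in range(len(G))) / len(G)
    return [[value / mean_diag for value in row] for row in G]


# Toy data: three animals genotyped at four SNPs (hypothetical).
genotypes = [
    [0, 1, 2, 1],
    [1, 1, 2, 0],
    [2, 0, 1, 1],
]

G05 = genomic_relationship(genotypes, [0.5] * 4)                     # all frequencies 0.5
GOF = genomic_relationship(genotypes, observed_frequencies(genotypes))  # observed frequencies
GN = normalize_mean_diagonal(GOF)                                    # mean diagonal of 1
```

Rescaling to a mean diagonal of 1 puts G on roughly the same scale as the pedigree-based matrix for non-inbred animals, which is the compatibility issue the Conclusions paragraph raises.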

18.
Summary: Genetic linkage maps were constructed for both maize and tomato, utilizing restriction fragment length polymorphisms (RFLPs) as the source of genetic markers. In order to detect these RFLPs, unique DNA sequence clones were prepared from either maize or tomato tissue and hybridized to Southern blots containing restriction enzyme-digested genomic DNA from different homozygous lines. A subsequent comparison of the RFLP inheritance patterns in F2 populations from tomato and maize permitted arrangement of the loci detected by these clones into genetic linkage groups for both species.

19.
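Seven cherry tomato cultivars were used as test material in a trial of different cultivation methods. For each cultivar under each cultivation method, growth vigor, single-fruit weight, soluble solids content, early and total yield per plant, pesticide residues, nitrite content, nitrate content of the irrigation drainage, and economic returns were compared and analyzed. The results show that substrate culture is a relatively ideal cultivation mode. Judged on overall agronomic traits, 圣女, 3688F1, and 韩3号 are the cultivars suitable for extension planting in the southern Fujian region.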

20.
Genetic control of branching in Arabidopsis and tomato
The patterns of axillary bud formation and the growth characteristics of side-shoots determine to a large extent the form of plants. Characterization of mutants in the monopodial plant Arabidopsis thaliana and in the sympodial tomato, as well as cloning of some of the respective genes, contributes to a better understanding of side-shoot development. Genes have been identified that influence the initiation of axillary meristems and the pattern of their subsequent development.
