Similar literature
 20 similar articles found (search time: 15 ms)
1.

Background:  

The presence of introns in protein-coding genes is a universal feature of eukaryotic genome organization, and the genes of multicellular eukaryotes, typically, contain multiple introns, a substantial fraction of which share position in distant taxa, such as plants and animals. Depending on the methods and data sets used, researchers have reached opposite conclusions on the causes of the high fraction of shared introns in orthologous genes from distant eukaryotes. Some studies conclude that shared intron positions reflect, almost entirely, a remarkable evolutionary conservation, whereas others attribute it to parallel gain of introns. To resolve these contradictions, it is crucial to analyze the evolution of introns by using a model that minimally relies on arbitrary assumptions.

2.
By comparing sequences of human, mouse and rat orthologous genes, we show that in 5′-untranslated regions (5′-UTRs) of mammalian cDNAs but not in 3′-UTRs or coding sequences, AUG is conserved to a significantly greater extent than any of the other 63 nt triplets. This effect is likely to reflect, primarily, bona fide evolutionary conservation, rather than cDNA annotation artifacts, because the excess of conserved upstream AUGs (uAUGs) is seen in 5′-UTRs containing stop codons in-frame with the start AUG and many of the conserved AUGs are found in different frames, consistent with the location in authentic non-coding sequences. Altogether, conserved uAUGs are present in at least 20–30% of mammalian genes. Qualitatively similar results were obtained by comparison of orthologous genes from different species of the yeast genus Saccharomyces. Together with the observation that mammalian and yeast 5′-UTRs are significantly depleted in overall AUG content, these findings suggest that AUG triplets in 5′-UTRs are subject to the pressure of purifying selection in two opposite directions: the uAUGs that have no specific function tend to be deleterious and get eliminated during evolution, whereas those uAUGs that do serve a function are conserved. Most probably, the principal role of the conserved uAUGs is attenuation of translation at the initiation stage, which is often additionally regulated by alternative splicing in the mammalian 5′-UTRs. Consistent with this hypothesis, we found that open reading frames starting from conserved uAUGs are significantly shorter than those starting from non-conserved uAUGs, possibly, owing to selection for optimization of the level of attenuation.

3.
A probabilistic measure for alignment-free sequence comparison   (Cited by: 3 total; 0 self-citations; 3 by others)
MOTIVATION: Alignment-free sequence comparison methods are still in the early stages of development compared to those of alignment-based sequence analysis. In this paper, we introduce a probabilistic measure of similarity between two biological sequences without alignment. The method is based on the concept of comparing the similarity/dissimilarity between two constructed Markov models. RESULTS: The method was tested on seven DNA sequences: the thrA, thrB and thrC genes of the threonine operons from Escherichia coli K-12 and from Shigella flexneri, and one random sequence having the same base composition as thrA from E. coli. These results were compared with those obtained from the CLUSTAL W algorithm (alignment-based) and the chaos game representation (alignment-free). The method was further tested on a more complex set of 40 DNA sequences and compared with other existing alignment-free sequence similarity measures. AVAILABILITY: All datasets and computer codes written in MATLAB are available upon request from the first author.
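The idea of comparing two sequences through the Markov models built from them can be illustrated with a minimal sketch. This is not the measure defined in the paper, whose formula the abstract does not give. It assumes first-order models over the alphabet ACGT, add-one pseudocounts, and a symmetrised difference of average log-likelihoods; the function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: plain DNA, no ambiguity codes


def fit_markov(seq, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) from one sequence."""
    counts = Counter(zip(seq, seq[1:]))
    model = {}
    for a in ALPHABET:
        total = sum(counts[(a, b)] for b in ALPHABET) + pseudocount * len(ALPHABET)
        for b in ALPHABET:
            model[(a, b)] = (counts[(a, b)] + pseudocount) / total
    return model


def avg_log_likelihood(seq, model):
    """Mean log-probability per transition of seq under the given model."""
    transitions = list(zip(seq, seq[1:]))
    return sum(math.log(model[t]) for t in transitions) / len(transitions)


def markov_dissimilarity(seq_a, seq_b):
    """Illustrative symmetric dissimilarity between two sequences (length >= 2).

    Each sequence is scored under its own model and under the other
    sequence's model; the two likelihood gaps are averaged.
    """
    m_a, m_b = fit_markov(seq_a), fit_markov(seq_b)
    d_ab = avg_log_likelihood(seq_a, m_a) - avg_log_likelihood(seq_a, m_b)
    d_ba = avg_log_likelihood(seq_b, m_b) - avg_log_likelihood(seq_b, m_a)
    return 0.5 * (d_ab + d_ba)
```

A sequence compared with itself scores zero, and the score grows as the transition statistics of the two sequences diverge. No alignment is computed at any point.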

4.
Y chromosomal fertility genes of Drosophila: a new type of eukaryotic genes   (Cited by: 2 total; 0 self-citations; 2 by others)
The Y chromosomal fertility genes of Drosophila are required for sperm differentiation. They are active only in primary spermatocytes where they form giant lampbrush loops. The molecular structure of these genes was investigated and revealed an unusual composition of DNA. Short, tandemly repeated sequence clusters are interrupted by longer and more heterogeneous sequences, which probably all represent transposable elements. No indication of the presence of protein-coding regions has been found within the fertility genes. However, the lampbrush loops bind site-specific proteins recognized by immunofluorescence techniques. This, together with other experimental data, led to the hypothesis that the Y chromosomal genes have a function in binding chromosomal proteins. The data and arguments in support of this gene model are summarized in this paper.

5.
6.
7.
The extent and nature of DNA polymorphism in the mutS-rpoS region of the Escherichia coli genome were assessed in 21 strains of enteropathogenic E. coli (EPEC) and enterohemorrhagic E. coli (EHEC) and in 6 strains originally isolated from natural populations. The intervening region between mutS and rpoS was amplified by long-range PCR, and the resulting amplicons varied substantially in length (7.8 to 14.2 kb) among pathogenic groups. Restriction maps based on five enzymes and sequence analysis showed that strains of the EPEC 1, EPEC 2, and EHEC 2 groups have a long mutS-rpoS region composed of an approximately 6.0-kb DNA segment found in strain K-12 and a novel DNA segment (approximately 2.9 kb) located at the 3' end of rpoS. The novel segment contains three genes (yclC, pad1, and slyA) that occur in E. coli O157:H7 and related strains but are not found in K-12 or members of the ECOR group A. Phylogenetic analysis of the common sequences indicates that the long intergenic region is ancestral and at least two separate deletion events gave rise to the shorter regions characteristic of the E. coli O157:H7 and K-12 lineages.

8.
Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at .

9.
MOTIVATION: A promising sliding-window method for the detection of interspecific recombination in DNA sequence alignments is based on the monitoring of changes in the posterior distribution of tree topologies with a probabilistic divergence measure. However, as the number of taxa in the alignment increases or the sliding-window size decreases, the posterior distribution becomes increasingly diffuse. This diffusion blurs the probabilistic divergence signal and adversely affects the detection accuracy. The present study investigates how this shortcoming can be redeemed with a pruning method based on post-processing clustering, using the Robinson-Foulds distance as a metric in tree topology space. RESULTS: An application of the proposed scheme to three synthetic and two real-world DNA sequence alignments illustrates the amount of improvement that can be obtained with the pruning method. The study also includes a comparison with two established recombination detection methods: Recpars and the DSS (difference of sum of squares) method. AVAILABILITY: Software, data and further supplementary material are available at the following website: http://www.bioss.sari.ac.uk/~dirk/Supplements/
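The Robinson-Foulds distance used above as a metric on tree topologies counts the bipartitions (splits) of the taxon set that appear in one tree but not the other. The following is a minimal sketch of that definition only, not the clustering or pruning procedure of the study. It assumes trees are given as nested tuples of leaf names and are compared as unrooted topologies; the function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions induced by the internal edges.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Skip trivial splits that separate a single leaf (or nothing).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must be defined on the same set of taxa")
    return len(splits(tree_a) ^ splits(tree_b))
```

Two trees that differ only in where the root is drawn have distance zero under this sketch, which matches the unrooted interpretation of the metric.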

10.
A large number of genomes have been sequenced, allowing a range of comparative studies. Here, we present the eukaryotic Gene Order Browser with information on the order of protein and non-coding RNA (ncRNA) genes of 74 different eukaryotic species. The browser is able to display a gene of interest together with its genomic context in all species where that gene is present. Thereby, questions related to the evolution of gene organization and non-random gene order may be examined. The browser also provides access to data collected on pairs of adjacent genes that are evolutionarily conserved. AVAILABILITY: eGOB as well as underlying data are freely available at http://egob.biomedicine.gu.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: tore.samuelsson@medkem.gu.se.

11.

The increased availability of genomic resources for many species has expanded perspectives on problems in conservation by helping to design management strategies for threatened species. Tasmanian devils (Sarcophilus harrisii) are an iconic and endangered marsupial with an intensively managed breeding program aimed at preventing extinction in the wild caused by devil facial tumour disease. Between 2015 and 2017, 85 devils from this program were released to three sites in Tasmania to support wild populations. Of these, 26 were known to have been killed by vehicles shortly after release. A previous analysis indicated that increased generations in captivity was a positive predictor of vehicle strike, with possible behavioural change hypothesised. Here we use 39 resequenced devil genomes to characterise diversity at 35 behaviour-associated genes, which contained 826 single nucleotide polymorphisms (24 were non-synonymous). We tested for a predictor of survival by examining three genes (AVPR1B, OXT and SLC6A4) in 62 released devils with known fates (survived, N = 39; died, N = 23), and genome-wide associations via reduced-representation sequencing (1727 single nucleotide polymorphisms [SNPs]), in 55 devils with known fates (survived, N = 38; died, N = 17). Overall, there was little evidence of an association between genetic profile and probability of being struck by a vehicle. Despite previous evidence of low genetic diversity in devils, the 35 behaviour-associated genes contained variation that may influence their functions. Our dataset can be used for future research into devil behavioural ecology, and adds to the increasing body of research applying genomics to conservation problems.


12.
Current efforts to build Sustainable Development Measurements have stumbled over problems of arbitrary structure, valuation, artificial ignorance suppression, and democratic illegitimacy. This paper proposes a new method to track and compare the Sustainable Development (SD) of countries, building an Interval of Sustainable Development (ISD). The ISD is capable of overcoming these problems by reporting all possible structures instead of only one, by relying on a variety of existing economic, social, and environmental variables, by embodying confidence levels in the measurement itself, and by facilitating democratic deliberation. In this way, the ISD shows, subject to a confidence level, how a country is performing with respect to SD. This paper also applies the method, specifying parameters and using available data for 180 countries during 1990–2011. Results over this 22-year period are presented for a selection of countries to illustrate the advantages and limitations of the proposal.

13.
Using oligonucleotide probes with defined sequences, we have selected clones from a human lymphocyte cDNA library which represent human leukocyte (HuIFN-α) and fibroblast (HuIFN-β) interferon gene sequences. Double-stranded f1 phage DNA was used as the vector for initial cloning of cDNA. Clones carrying interferon gene sequences were identified by hybridization with the oligonucleotide probes. The same oligonucleotide probes were used as primers for dideoxy chain termination sequencing of the clones. One HuIFN-α clone, 201, has a nucleotide sequence different from published HuIFN-α sequences. Under control of the lacUV5 promoter, the 201 gene has been used to express biologically active HuIFN-α in Escherichia coli.

14.
W M Hern. Social Biology, 1990, 37(1-2): 102-109
Fertility measurement in small preindustrial societies is hampered by small numbers and the lack of some essential data. Most measures of fertility are collective and require large enough populations to permit grouped data analysis. Existing individual measures of fertility are often unsatisfactory. This paper presents a new measure of individual fertility, the Individual Fertility Rate (IFR), which is constructed by dividing parity by reproductive span in years and multiplying the result by 100. The result is a number which may be used as a dependent individual or cumulative variable to study the effects of health and socioeconomic factors on fertility.
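The IFR as described above is a one-line calculation: parity divided by reproductive span in years, times 100. A minimal sketch follows; the function name and the input check are assumptions added for illustration, and the abstract does not say how the reproductive span itself is determined.

```python
def individual_fertility_rate(parity: int, reproductive_span_years: float) -> float:
    """IFR = (parity / reproductive span in years) * 100."""
    if reproductive_span_years <= 0:
        raise ValueError("reproductive span must be a positive number of years")
    return parity / reproductive_span_years * 100
```

For example, a woman with 5 live births over a 20-year reproductive span has an IFR of 25, which can be read as 25 births per 100 years of reproductive life.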

15.
Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.

16.
MOTIVATION: Despite increased availability of genome annotation data, a comprehensive resource for in-depth analysis of splice signal distributions and alternative splicing (AS) patterns in eukaryote genomes is still lacking. To meet this need, we have developed EuSplice, a unique splice-centric database which provides reliable splice signal and AS information for 23 eukaryotes. RESULTS: The EuSplice database contains 95,822 AS events and 2.1 million splice signals associated with over 270,000 protein-coding genes. The intuitive, user-friendly EuSplice web interface has powerful data mining and graphics capabilities for inter-genomic comparative analysis of splice signals, putative cryptic splice sites and AS events. Moreover, the seamless integration of splicing data to extensive gene-specific annotations, such as homolog annotations, functional information, mutations and sequence details makes EuSplice a powerful one-stop information resource for investigating the molecular mechanisms of complex splicing events, disease associations and the evolution of splicing in eukaryotes. AVAILABILITY: http://66.170.16.154/EuSplice. SUPPLEMENTARY INFORMATION: Supplementary tables and figures at Bioinfo online.

17.
We analyse optimal and heuristic place prioritization algorithms for biodiversity conservation area network design which can use probabilistic data on the distribution of surrogates for biodiversity. We show how an Expected Surrogate Set Covering Problem (ESSCP) and a Maximal Expected Surrogate Covering Problem (MESCP) can be linearized for computationally efficient solution. For the ESSCP, we study the performance of two optimization software packages (XPRESS and CPLEX) and five heuristic algorithms based on traditional measures of complementarity and rarity, as well as the Shannon and Simpson indices of α‐diversity, which are being used in this context for the first time. On small artificial data sets, the optimal place prioritization algorithms often produced more economical solutions than the heuristic algorithms, though not always ones guaranteed to be optimal. However, with large data sets, the optimal algorithms often required long computation times and produced no better results than heuristic ones. Thus there is generally little reason to prefer optimal to heuristic algorithms with probabilistic data sets.
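A complementarity-style heuristic of the kind compared above can be sketched as a greedy loop. This is an illustration under stated assumptions, not one of the five heuristics evaluated in the paper: it assumes each place carries a probability of presence for each surrogate, that a surrogate counts as covered once the summed probabilities over the selected places reach its target, and that ties are broken alphabetically. All names are hypothetical.

```python
def greedy_expected_cover(prob, targets):
    """Greedily pick places until every surrogate's expected-coverage target is met.

    prob[place][surrogate] is the probability that the surrogate occurs there.
    targets[surrogate] is the required sum of probabilities over chosen places.
    """
    remaining = dict(targets)   # unmet part of each target
    candidates = set(prob)      # places not yet selected
    chosen = []

    def gain(place):
        # Complementarity: only count what still helps an unmet target.
        return sum(min(prob[place].get(s, 0.0), r)
                   for s, r in remaining.items() if r > 0)

    while any(r > 1e-12 for r in remaining.values()):
        # sorted() makes the tie-break deterministic (first name wins).
        best = max(sorted(candidates), key=gain, default=None)
        if best is None or gain(best) <= 0:
            raise ValueError("targets cannot be met with the available places")
        chosen.append(best)
        candidates.remove(best)
        for s in remaining:
            remaining[s] = max(0.0, remaining[s] - prob[best].get(s, 0.0))
    return chosen
```

The greedy choice is fast but carries no optimality guarantee, which is the trade-off the abstract describes between heuristic algorithms and exact solvers.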

18.
19.
20.

Background  

Data for a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is available. However, SEGE is derived from GenBank. The redundant, incomplete and heterogeneous nature of GenBank data is a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed.
