Similar literature
 20 similar documents found (search time: 0 ms)
1.
MOTIVATION: Many computational methods for identifying regulatory elements use a likelihood ratio between motif and background models. Often, the methods use a background model of independent bases. At least two different Markov background models have been proposed with the aim of increasing the accuracy of predicting regulatory elements. Both Markov background models suffer theoretical drawbacks, so this article develops a third, context-dependent Markov background model from fundamental statistical principles. RESULTS: Datasets containing known regulatory elements in eukaryotes provided a basis for comparing the predictive accuracies of the different background models. Non-parametric statistical tests indicated that Markov models of order 3 constituted a statistically significant improvement over the background model of independent bases. Our model performed slightly better than the previous Markov background models. We also found that for discriminating between the predictive accuracies of competing background models, the correlation coefficient is a more sensitive measure than the performance coefficient. AVAILABILITY: Our C++ program is available at ftp://ftp.ncbi.nih.gov/pub/spouge/papers/archive/AGLAM/2006-07-19
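
The likelihood-ratio idea in the abstract above can be illustrated with a small sketch. It is not the article's context-dependent model or its C++ program; it assumes a plain order-k Markov background estimated from counts with a pseudocount, and a motif given as a list of per-position probability dictionaries. The names `train_markov_background` and `log_likelihood_ratio` are hypothetical.

```python
import math
from collections import defaultdict


def train_markov_background(seqs, order=3, pseudocount=1.0):
    """Estimate P(base | preceding `order` bases) from background sequences.

    Returns a function prob(context, base). Contexts never seen in training
    (including contexts shorter than `order`) fall back to a uniform 1/4.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0

    def prob(context, base):
        c = counts.get(context, {})
        total = sum(c.values()) + 4 * pseudocount
        return (c.get(base, 0.0) + pseudocount) / total

    return prob


def log_likelihood_ratio(seq, start, pwm, bg_prob, order=3):
    """Score seq[start:start+len(pwm)] as log P(window | motif) / P(window | background)."""
    score = 0.0
    for j, column in enumerate(pwm):
        i = start + j
        base = seq[i]
        context = seq[max(0, i - order):i]  # up to `order` preceding bases
        score += math.log(column[base]) - math.log(bg_prob(context, base))
    return score
```

A positive score means the window is better explained by the motif model than by the background. Setting `order=0` reduces the background to independent bases, the baseline the abstract compares against.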

2.
Identifying the geographic distribution of populations is a basic, yet crucial step in many fundamental and applied ecological projects, as it provides key information on which many subsequent analyses depend. However, this task is often costly and time consuming, especially where rare species are concerned and where most sampling designs generally prove inefficient. At the same time, rare species are those for which distribution data are most needed for their conservation to be effective. To enhance fieldwork sampling, model‐based sampling (MBS) uses predictions from species distribution models: when looking for the species in areas of high habitat suitability, the chances of finding them should be higher. We thoroughly tested the efficiency of MBS by conducting an important survey in the Swiss Alps, assessing the detection rate of three rare and five common plant species. For each species, habitat suitability maps were produced following an ensemble modeling framework combining two spatial resolutions and two modeling techniques. We tested the efficiency of MBS and the accuracy of our models by sampling 240 sites in the field (30 sites × 8 species). Across all species, the MBS approach proved to be effective. In particular, the MBS design strictly led to the discovery of six sites of presence of one rare plant, increasing the chances of finding this species from 0 to 50%. For common species, MBS doubled the new population discovery rates as compared to random sampling. Habitat suitability maps coming from the combination of four individual modeling methods predicted the species' distribution well, and more accurately than the individual models. In conclusion, using MBS for fieldwork could efficiently help in increasing our knowledge of rare species distribution. More generally, we recommend using habitat suitability models to support conservation plans.

3.
Building a metal-responsive promoter with synthetic regulatory elements.   Total citations: 37, self-citations: 13, citations by others: 24
A fusion gene consisting of the promoter region from the mouse metallothionein-I gene joined to the coding region of the herpes simplex virus thymidine kinase gene is efficiently regulated by zinc in a transient assay when transfected into baby hamster kidney cells. Analysis of similar plasmids in which the metallothionein-I promoter region was mutated indicated the presence of multiple metal regulatory elements (MREs) between -176 and -44 base pairs from the cap site. To further investigate the function of MREs, we inserted a synthetic DNA fragment containing the sequence of MRE-a (the element between -55 and -44 base pairs) into the nonresponsive promoter of the thymidine kinase gene in various positions and configurations. Little or no induction by zinc was observed with single insertions of the regulatory sequence, whereas many different constructions having two copies of MRE-a were inducible. The precise position of the two MREs relative to each other or to the thymidine kinase promoter elements had a relatively small effect on the efficiency of induction, but the inducibility could be further increased by the introduction of more MRE-a sequences. MRE-a can function synergistically with the thymidine kinase distal promoter elements, but in the presence of the TATA box alone it functions as a positive, zinc-dependent promoter element.

4.
5.
A Gibbs sampling approach to linkage analysis.   Total citations: 9, self-citations: 0, citations by others: 9
We present a Monte Carlo approach to estimation of the recombination fraction theta and the profile likelihood for a dichotomous trait and a single marker gene with 2 alleles. The method is an application of a technique known as 'Gibbs sampling', in which random samples of each of the unknowns (here genotypes, theta and nuisance parameters, including the allele frequencies and the penetrances) are drawn from their posterior distributions, given the data and the current values of all the other unknowns. Upon convergence, the resulting samples derive from the marginal distribution of all the unknowns, given only the data, so that the uncertainty in the specification of the nuisance parameters is reflected in the variance of the posterior distribution of theta. Prior knowledge about the distribution of theta and the nuisance parameters can be incorporated using a Bayesian approach, but adoption of a flat prior for theta and point priors for the nuisance parameters would correspond to the standard likelihood approach. The method is easy to program, runs quickly on a microcomputer, and could be generalized to multiple alleles, multipoint linkage, continuous phenotypes and more complex models of disease etiology. The basic approach is illustrated by application to data on cholesterol levels and a low-density lipoprotein receptor gene in a single large pedigree.

6.
The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggest that they play important functional roles.

7.
G Colwell, B Li, D Forrest, R Brackenbury. Genomics, 1992, 14(4):875-882
Genomic clones containing 5'-flanking sequences, the first exon, and the entire first intron from the chicken N-CAM gene were characterized by restriction mapping and DNA sequencing. A >600-bp segment that includes the first exon is very G + C-rich and contains a large proportion of CpG dinucleotides, suggesting that it represents a CpG island. SP-1 and AP-1 consensus elements are present, but no TATA- or CCAAT-like elements were found within 300 bp upstream of the first exon. Comparison of the chicken promoter region sequence with similar regions of the human, rat, and mouse N-CAM genes revealed that some potential regulatory elements including a "purine box" seen in mouse and rat N-CAM genes, one of two homeodomain binding regions seen in mammalian N-CAM genes, and several potential SP-1 sites are not conserved within this region. In contrast, high CpG content, a homeodomain binding sequence, an SP-1 element, an octamer element, and an AP-1 element are conserved in all four genes. The first intron of the chicken gene is 38 kb, substantially smaller than the corresponding intron from mammalian N-CAM genes. Together with previous studies, this work completes the cloning of the chicken N-CAM gene, which contains at least 26 exons distributed over 85 kb.
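
The two quantities behind a "CpG island" call, G + C content and the frequency of CpG dinucleotides, can be computed as in the sketch below. The abstract does not say which criterion the authors applied; the sketch assumes the commonly used observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). The function name `cpg_stats` is hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is (#C * #G) / length, so the ratio equals
    (#CpG * length) / (#C * #G). Returns 0.0 for the ratio when it is
    undefined (no C or no G in the sequence).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp
```

In practice such statistics are usually computed over a sliding window along the sequence rather than over the whole sequence at once; the window length and thresholds are a choice of the analyst.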

8.
9.
10.
The Gibbs sampling method has been widely used for sequence analysis after it was successfully applied to the problem of identifying regulatory motif sequences upstream of genes. Since then, numerous variants of the original idea have emerged: however, in all cases the application has been to finding short motifs in collections of short sequences (typically less than 100 nucleotides long). In this paper, we introduce a Gibbs sampling approach for identifying genes in multiple large genomic sequences up to hundreds of kilobases long. This approach leverages the evolutionary relationships between the sequences to improve the gene predictions, without explicitly aligning the sequences. We have applied our method to the analysis of genomic sequence from 14 genomic regions, totaling roughly 1.8 Mb of sequence in each organism. We show that our approach compares favorably with existing ab initio approaches to gene finding, including pairwise comparison based gene prediction methods which make explicit use of alignments. Furthermore, excellent performance can be obtained with as few as four organisms, and the method overcomes a number of difficulties of previous comparison based gene finding approaches: it is robust with respect to genomic rearrangements, can work with draft sequence, and is fast (linear in the number and length of the sequences). It can also be seamlessly integrated with Gibbs sampling motif detection methods.

11.

Background  

Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set.
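
For orientation, a basic Gibbs site sampler for DNA motifs can be sketched as follows. This is a generic textbook-style version, not the program discussed in the abstract above: it assumes exactly one motif site per sequence (so it does not address the multiple-instance problem the abstract raises), sequences over the alphabet A, C, G, T only, and a background of independent bases. The function name `gibbs_motif_sampler` and all parameters are hypothetical.

```python
import random


def gibbs_motif_sampler(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Sample one motif start position per sequence by Gibbs sampling.

    Each sweep holds out one sequence, builds a position-specific profile
    from the current sites in all other sequences, and resamples the
    held-out site in proportion to its motif/background likelihood ratio.
    """
    rng = random.Random(seed)
    alphabet = "ACGT"
    total = sum(len(s) for s in seqs)
    # Independent-base background frequencies, smoothed with a pseudocount.
    bg = {
        b: (sum(s.count(b) for s in seqs) + pseudocount) / (total + 4 * pseudocount)
        for b in alphabet
    }
    # Start from random sites.
    positions = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for held_out in range(len(seqs)):
            # Profile counts from every sequence except the held-out one.
            counts = [{b: pseudocount for b in alphabet} for _ in range(width)]
            for k, (other, p) in enumerate(zip(seqs, positions)):
                if k == held_out:
                    continue
                for j in range(width):
                    counts[j][other[p + j]] += 1.0
            denom = (len(seqs) - 1) + 4 * pseudocount
            target = seqs[held_out]
            weights = []
            for start in range(len(target) - width + 1):
                w = 1.0
                for j in range(width):
                    b = target[start + j]
                    w *= (counts[j][b] / denom) / bg[b]
                weights.append(w)
            # Draw the new site with probability proportional to its weight.
            positions[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions
```

Every sweep rescans each held-out sequence in full, so the cost per sweep grows with the total sequence length times the motif width. The result is a random sample, so different seeds can return different sites.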

12.
13.
14.
15.
E May, F Omilli, J Borde, P Scieller. Journal of Virology, 1992, 66(6):3347-3354
Late promoter activity measured before viral DNA replication results from a complex involvement of negative and positive cis-acting elements located both in the enhancer and in the 21-bp repeats. GC motifs located within the 21-bp repeats act in cooperation with sequences overlapping the early TATA box to down-regulate the late promoter activity. Analysis of insertion mutants indicates that the late promoter might be negatively regulated at least partially by the early promoter machinery. The GTI motif located within the enhancer as well as the GC motifs lose the ability to down-regulate the late promoter in the presence of T antigen. Results obtained with tsA58 protein indicate that two different domains of T antigen are involved in the negative autoregulation of the early promoter activity and in the release of the down-regulation of the late promoter by the GC motifs.

16.
17.
18.
19.
20.