Similar articles
Found 20 similar articles; search took 15 ms
1.
It has been proposed that sequence homology should exist between the short arms of the human sex chromosomes, in the regions pairing at meiosis. Out of 40 clones picked at random from a collection of non-repetitive DNA sequences derived from the human Y chromosome, we have found nine sequences which show very high homology with sequences located on the X chromosome. All nine probes originate from the euchromatic part of the Y chromosome. All the homologous sequences are located within the Xq12-Xq22-24 region. None of them map to the short arm of the X chromosome. We conclude that an important part of the euchromatic region of the Y chromosome is homologous to the middle of the X chromosome long arm, possibly as a result of recent translation event(s).

2.
We explore model-based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log–Det distance measure. We take as our primary tool group representation theory, and show that it provides a general framework for analyzing Markov processes on trees. From this algebraic perspective, the inherent symmetries of these processes become apparent, and focusing on plethysms, we are able to define Markov invariants and give existence proofs. We give an explicit technique for constructing the invariants, valid for any number of character states and taxa. For phylogenetic trees with three and four leaves, we demonstrate that the corresponding Markov invariants can be fruitfully exploited in applied phylogenetic studies.

3.
In this study, we examined whether evolutionarily driven differences in primary sequences correlate with, and thus predict, the genetic diversity of related marker loci, an important criterion for assessing the quality of any DNA marker. We adopted a new approach to quantitative symbolic DNA sequence analysis, the DNA random walk representation, to study multiallelic marker loci from Begonia × tuberhybrida Voss. We describe a significant correlation between random walk-derived digital invariants and the genetic diversity of the marker loci. Specifically, on the 3D contour plot of a multivariate principal component analysis (PCA), we found a statistical correlation between the first two PCA factors and the number of alleles per marker locus. Based on that correlation, we suggest that the DNA walk representation may predict allele-rich loci solely from their primary sequences, which would improve the design of new DNA germplasm identifiers.
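
As a rough illustration of what a DNA random walk is, the sketch below maps each base to a step and accumulates the position. The purine/pyrimidine rule (+1 for A or G, -1 for C or T) is one common convention and is an assumption here; the abstract does not say which mapping or which invariants the authors used, and `walk_summary` is a hypothetical helper.

```python
from itertools import accumulate

# Assumed step rule: purines (A, G) step up, pyrimidines (C, T) step down.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(seq):
    """Return the cumulative walk positions for a DNA sequence."""
    return list(accumulate(STEP[base] for base in seq.upper()))


def walk_summary(seq):
    """A few simple descriptors of the walk (illustrative, not the paper's invariants)."""
    walk = dna_walk(seq)
    return {"end": walk[-1], "max": max(walk), "min": min(walk)}
```

Descriptors of this kind, computed per locus, are the sort of input one could feed into a PCA alongside allele counts.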

4.
The cytochrome b561 family is characterized by the presence of a "b561 core domain" that forms a transmembrane four-helix bundle containing four totally conserved His residues, which might coordinate two heme b groups. We conducted BLAST and PSI-BLAST searches to obtain insights into the structure and functions of this protein family. Analyses with CLUSTAL W on b561 sequences from various organisms showed that the members could be classified into 7 subfamilies based on characteristic motifs: groups A (animals/neuroendocrine), B (plants), C (insects), D (fungi), E (animals/TSF), F (plants+DoH), and G (SDR2). In group A, both motif 1, {FN(X)HP(X)2M(X)2G(X)5G(X)ALLVYR}, and motif 2, {YSLHSW(X)G}, were identified. These two motifs were also conserved in group B. There were no significant features characteristic of groups C and D. A modified version of motif 1, {LFSWHP(X)2M(X)3F(X)3M(X)EAIL(X)SP(X)2SS}, was found in group E with a high degree of conservation. Both motif 3, {DP(X)WFY(L)H(X)3Q}, and motif 4, {K(X)R(X)YWN(X)YHH(X)2G(R/Y)}, were found in group F at different regions from those of motifs 1 and 2. The "DoH" domain common to the NH2-terminal region of dopamine beta-hydroxylase was found to form fusion proteins with the b561 core domains in groups F and G. Based on these results, we proposed a hypothesis regarding the structures and functions of the 7 subfamilies of cytochrome b561.

5.
An exact expression for the variance of the random frequency that a given word has in text generated by a Markov chain is presented. The result is applied to periodic Markov chains, which describe the protein-coding DNA sequences better than simple Markov chains. A new solution to the problem of word overlap is proposed. It was found that the expected frequency and overlapping properties determine most of the variance. The expectation and variance of counts for triplets are compared with experimental counts in Escherichia coli coding sequences.

6.
Sixty-four eucaryotic nuclear DNA sequences, half of them coding and half noncoding, have been examined as expressions of first-, second-, or third-order Markov chains. Standard statistical tests found that most of the sequences required at least second-order Markov chains for their representation, and some required chains of third order. For all 64 sequences the observed one-step second-order transition count matrices were effective in predicting the two-step transition count matrices, and 56 of 64 were effective in predicting the three-step transition count matrices. The departure from random expectation of the observed first- and second-order transition count matrices meant that a considerable sample of eucaryotic nuclear DNA sequences, both protein coding and noncoding, have significant local structure over subsequences of three to five contiguous bases, and that this structure occurs throughout the total length of the sequence. These results suggested that present DNA sequences may have arisen from the duplication, concatenation, and gradual modification of very early short sequences.
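
To make the idea of a k-th order transition count matrix concrete, here is a minimal sketch that tallies, for each context of `order` bases, how often each base follows it. The function names are hypothetical, and the statistical tests the abstract mentions are not reproduced; this only shows the counting step those tests would start from.

```python
from collections import Counter, defaultdict


def transition_counts(seq, order):
    """Count how often each length-`order` context is followed by each base."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        context = seq[i:i + order]
        counts[context][seq[i + order]] += 1
    return counts


def transition_probs(counts):
    """Normalise each row of counts into maximum-likelihood probabilities."""
    return {
        ctx: {base: c / sum(row.values()) for base, c in row.items()}
        for ctx, row in counts.items()
    }
```

Calling `transition_counts(seq, 2)` gives the one-step second-order counts; comparing models of different `order` on the same sequence is where a formal test would come in.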

7.
Sixty-four eucaryotic nuclear DNA sequences, half of them coding and half noncoding, have been examined as expressions of first-, second-, or third-order Markov chains. Standard statistical tests found that most of the sequences required at least second-order Markov chains for their representation, and some required chains of third order. For all 64 sequences the observed one-step second-order transition count matrices were effective in predicting the two-step transition count matrices, and 56 of 64 were effective in predicting the three-step transition count matrices. The departure from random expectation of the observed first- and second-order transition count matrices meant that a considerable sample of eucaryotic nuclear DNA sequences, both protein coding and noncoding, have significant local structure over subsequences of three to five contiguous bases, and that this structure occurs throughout the total length of the sequence. These results suggested that present DNA sequences may have arisen from the duplication, concatenation, and gradual modification of very early short sequences.

8.
Let $X_1, \dots, X_n$ be a sequence of i.i.d. positive or negative integer-valued random variables, and let $H_n = \max_{0 \le i \le j \le n}(X_i + \cdots + X_j)$ be the local score of the sequence. The exact distribution of $H_n$ is obtained using a simple Markov chain. This result is applied to the scoring of DNA and protein sequences in molecular biology.
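
One way to picture the Markov-chain argument is sketched below. It assumes the usual convention that the local score is non-negative and equals the running maximum of the reflected walk $U_k = \max(0, U_{k-1} + X_k)$ with $U_0 = 0$; under that assumption, $P(H_n < a)$ is the probability that a chain on states $0, \dots, a-1$ has not yet been absorbed at level $a$ after $n$ steps. This is an illustrative reconstruction, not necessarily the construction used in the paper, and the function names are hypothetical.

```python
def local_score_cdf(step_probs, n, a):
    """P(H_n < a) for i.i.d. integer steps (requires a >= 1).

    step_probs maps each possible step value to its probability.
    """
    # state[u] = P(reflected walk is at u and level a has never been reached)
    state = [0.0] * a
    state[0] = 1.0
    for _ in range(n):
        new = [0.0] * a
        for u, pu in enumerate(state):
            if pu == 0.0:
                continue
            for x, px in step_probs.items():
                v = max(0, u + x)
                if v < a:  # paths reaching level a are absorbed and dropped
                    new[v] += pu * px
        state = new
    return sum(state)


def local_score(xs):
    """Local score of one concrete score sequence, via the same recursion."""
    best = cur = 0
    for x in xs:
        cur = max(0, cur + x)
        best = max(best, cur)
    return best
```

The cost is O(n · a · m) for m distinct step values, so exact tail probabilities stay cheap for the score ranges typical of sequence comparison.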

9.
By in situ hybridization, Y-specific DNA sequences were localized on Xp22.3-Xpter of one of the two X chromosomes in all of eleven XX males studied. In nine of the cases the presence of the Y-specific DNA did not affect random X inactivation in fibroblasts. Fibroblasts of the other two cases showed a preferential inactivation of the Y DNA-carrying X chromosome. In only one of these two exceptions blood lymphocytes could also be studied, and here, random inactivation of the Y DNA-carrying X chromosome occurred. Furthermore, the gene dosage of steroid sulfatase (STS) was examined by Southern blot analysis. In ten of the cases including the one showing random X-inactivation in lymphocytes but not in fibroblasts, a double dosage of the STS gene is present. The remaining case with non-random inactivation shows a single STS gene dosage. This case was reported previously to have STS enzyme activity in the male range. It is assumed that, as a consequence of an unequal X-Y interchange, a deletion of X-specific DNA sequences may result in the preferential inactivation of the Y DNA-carrying X chromosome.

10.
Four cloned unique sequences from the human Y chromosome, two of which are found only on the Y chromosome and two of which are on both the X and Y chromosomes, were hybridized to restriction enzyme-treated DNA samples of a male and a female chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), and pig-tailed macaque (Macaca nemestrina); and a male orangutan (Pongo pygmaeus) and gibbon (Hylobates lar). One of the human Y-specific probes hybridized only to male DNA among the humans and great apes, and thus its Y linkage and sequence similarities are conserved. The other human Y-specific clone hybridized to male and female DNA from the humans, great apes, and gibbon, indicating its presence on the X chromosome or autosomes. Two human sequences present on both the X and Y chromosomes also demonstrated conservation as indicated by hybridization to genomic DNAs of distantly related species and by partial conservation of restriction enzyme sites. Although conservation of Y linkage can only be demonstrated for one of these four sequences, these results suggest that Y-chromosomal unique sequence genes do not diverge markedly more rapidly than unique sequences located on other chromosomes. However, this sequence conservation may in part be due to evolution while part of other chromosomes.

11.
In this paper, we study the problem of computing the similarity of two protein structures by measuring their contact-map overlap. Contact-map overlap abstracts the problem of computing the similarity of two polygonal chains as a graph-theoretic problem. In $\mathbb{R}^3$, we present the first polynomial-time algorithm with any guarantee on the approximation ratio for the 3-dimensional problem. More precisely, we give an algorithm for the contact-map overlap problem with an approximation ratio of $\sigma$, where $\sigma = \min\{\sigma(P_1), \sigma(P_2)\}$ [...] 0, is hard.

12.
It is well known that the periodic cycle $\{x_n\}$ of a periodically forced nonlinear difference equation is attenuant (resonant) if $\mathrm{av}(x_n) < \mathrm{av}(K_n)$ ($\mathrm{av}(x_n) > \mathrm{av}(K_n)$), where $\{K_n\}$ is the carrying capacity of the environment and $\mathrm{av}(t_n) = \frac{1}{p}\sum_{i=0}^{p-1} t_i$ is the arithmetic mean of the $p$-periodic cycle $\{t_n\}$. In this article, we extend the concept of attenuance and resonance of periodic cycles using the geometric mean for the average of a periodic cycle. We study the properties of the periodically forced nonautonomous delay Beverton-Holt model $x_{n+1} = \dfrac{r_n x_n}{1 + (r_{n-l} - 1)\,x_{n-k}/K_{n-k}}$, $n = 0, 1, \dots$, where $\{K_n\}$ and $\{r_n\}$ are positive $p$-periodic sequences ($K_n > 0$, $r_n > 1$) and $k$ and $l$ are nonnegative integers. We will show that for all positive solutions $\{x_n\}$ of the previous equation, $\limsup_{n\to\infty}\left(\prod_{i=0}^{n-1} x_i\right)^{1/n} \le \left(\left(\prod_{i=0}^{p-1} r_i\right)^{1/p} - 1\right)\left(\prod_{i=0}^{p-1}(r_i - 1)\right)^{-1/p}\left(\prod_{i=0}^{p-1} K_i\right)^{1/p}$. In particular, in the case where $\{x_n\}$ is a $p$-periodic solution of the above equation (assuming that such a solution exists) and $r_n = r > 1$, the periodic cycle is g-attenuant, that is, $\left(\prod_{i=0}^{p-1} x_i\right)^{1/p} < \left(\prod_{i=0}^{p-1} K_i\right)^{1/p}$. Surprisingly, the obtained results show that the delays $k$ and $l$ do not play any role.
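
The delay Beverton-Holt recursion can be explored numerically with a short simulation. The sketch below reads the model as a fraction with denominator $1 + (r_{n-l} - 1)\,x_{n-k}/K_{n-k}$; the constant initial history, the default parameter values, and the function names are all assumptions made for illustration.

```python
import math


def simulate(K, r, k=0, l=0, x0=0.5, steps=2000):
    """Iterate x[n+1] = r[n]*x[n] / (1 + (r[n-l] - 1) * x[n-k] / K[n-k]).

    K and r are lists holding one period of the p-periodic sequences.
    """
    p = len(K)
    d = max(k, l)
    xs = [x0] * (d + 1)  # constant initial history (an assumption)
    for n in range(d, d + steps):
        num = r[n % p] * xs[n]
        den = 1 + (r[(n - l) % p] - 1) * xs[n - k] / K[(n - k) % p]
        xs.append(num / den)
    return xs


def geometric_mean(values):
    """Geometric mean of positive values, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

For example, with `K = [1.0, 4.0]`, `r = [2.0, 2.0]` and no delays, the orbit settles on the 2-cycle (2, 4/3), whose geometric mean is below that of `K`, in line with g-attenuance.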

13.
Relationship between curved DNA conformations and slow gel migration   Total citations: 2, self-citations: 0, citations by others: 2
We propose some specific DNA conformations that explain, in terms of molecular conformations, the anomalous gel electrophoretic behavior of the sequences (VA4T4X)i and (V2A3T3X2)i, where V and X are either G or C. Previously (J. Biomole. Struct. Dyn. 4, 41, 1986) we considered hydrophobic interactions among aliphatic hydrocarbon groups in A/T sequences. In the sequences (T)n·(A)n, the T's are slightly bent to yield structures with tightly stacked methyl groups along one side of the major groove. By folding together the two pairs of stacked methyls on the opposite sides of the major groove, TTAA might yield a relatively sharp bend. On this basis, we show below that the sequences (VT4A4X)i might form a very tightly coiled super-helix whereas the sequences (VA4T4X)i form a broad super-helix of radius approximately 120 Å for i = 25. The sequence (V2A3T3X2)i forms a slightly smaller radius super-helix. The time of passage through the gel has been taken to be inversely proportional to the smallest dimension of the molecule. Specifically, we are taking the ratio of the apparent molecular weight to the actual molecular weight to be related to the moment of inertia $I_1$ about the smallest principal axis of the molecular conformation. We find a good fit to the experimental gel mobility data of Hagerman (2) if we assume this ratio to be proportional to $(I_1)^{1/5}$.

14.
The K group of human endogenous retroviruses (HERV-K) has been suggested to have a role in disease and has recently been shown to include long terminal repeat (LTR) elements that are human specific. Here we investigated the presence of HERV-K LTRs on the human X and Y chromosomes with the use of PCR on a monochromosomal somatic cell hybrid DNA panel. We report twelve such sequences on the X chromosome and ten sequences on the Y chromosome. Phylogenetic analysis reveals that clones X2, 4, 5, 6, 7, 11, 15 from the X chromosome and clones Y4, 5, 7, 10 from the Y chromosome are closely related to the human-specific members of Medstrand and Mager's cluster 9. The sequence of clone Y7 from the Y chromosome is identical with human-specific HERV-K LTR element (AC002350) from chromosome 12q24. The findings suggest recent proliferation and transposition of HERV-K LTR elements on these chromosomes. Such events may have contributed to structural change and genetic variation in the human genome. We draw attention to evolutionarily recent changes in homologies between X and Y chromosomes as a method of further investigating such transpositions.

15.
The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge about the possible grouping of the nucleotides makes it possible to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented, as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes highlights differences between intronic sequences and gene untranslated region sequences.

16.
New 3D graphical representation of DNA sequence based on dual nucleotides   Total citations: 2, self-citations: 2, citations by others: 0
We introduce a 3D graphical representation of DNA sequences based on the pairs of dual nucleotides (DNs). Based on this representation, we consider some mathematical invariants and construct two 16-component vectors associated with these invariants. The vectors are used to characterize and compare the complete coding sequence part of beta globin gene of nine different species. The examination of similarities/dissimilarities illustrates the utility of the approach.

17.
The karyotype of the spiny eel (Mastacembelus aculeatus) has highly evolved heteromorphic sex chromosomes. X and Y chromosomes differ from each other in the distribution of heterochromatin blocks. To characterize the repetitive sequences in these heterochromatic regions, we microdissected the X chromosome, constructed an X chromosome library, amplified the genomic DNA using PCR and isolated a repetitive sequence DNA family by screening the library. All family members were clusters of two simple repetitive monomers, MaSRS1 and MaSRS2. We detected a conserved 5S rDNA gene sequence within monomer MaSRS2; thus, tandem-arranged MaSRS1s and MaSRS2s may co-compose 5S rDNA multigenes and NTSs in M. aculeatus. FISH analysis revealed that MaSRS1 and MaSRS2 were the main components of the heterochromatic regions of the X and Y chromosomes. This finding contributes additional data about differentiation of heteromorphic sex chromosomes in lower vertebrates.

18.
Prevalence of quadruplexes in the human genome   Total citations: 28, self-citations: 17, citations by others: 11
Guanine-rich DNA sequences of a particular form have the ability to fold into four-stranded structures called G-quadruplexes. In this paper, we present a working rule to predict which primary sequences can form this structure, and describe a search algorithm to identify such sequences in genomic DNA. We count the number of quadruplexes found in the human genome and compare that with the figure predicted by modelling DNA as a Bernoulli stream or as a Markov chain, using windows of various sizes. We demonstrate that the distribution of loop lengths is significantly different from what would be expected in a random case, providing an indication of the number of potentially relevant quadruplex-forming sequences. In particular, we show that there is a significant repression of quadruplexes in the coding strand of exonic regions, which suggests that quadruplex-forming patterns are disfavoured in sequences that will form RNA.
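
A search of this kind can be sketched with a regular expression. The abstract does not spell out its working rule, so the pattern below uses a widely cited form as an assumption: four runs of at least three guanines separated by loops of one to seven bases. The names `G4_PATTERN` and `find_quadruplex_candidates` are hypothetical, and only non-overlapping matches on the given strand are reported.

```python
import re

# Assumed rule: four runs of >= 3 guanines separated by loops of 1-7 bases.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_quadruplex_candidates(seq):
    """Return (start, end, motif) for each non-overlapping match in seq."""
    return [
        (m.start(), m.end(), m.group())
        for m in G4_PATTERN.finditer(seq.upper())
    ]
```

Counting candidates in real sequence and in sequence simulated from a Bernoulli or Markov model would give the kind of observed-versus-expected comparison the abstract describes; scanning the reverse complement as well would cover the other strand.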

19.
In this article, we introduce the drifting Markov models (DMMs) which are inhomogeneous Markov models designed for modeling the heterogeneities of sequences (in our case DNA or protein sequences) in a more flexible way than homogeneous Markov chains or even hidden Markov models (HMMs). We focus here on the polynomial drift: the transition matrix varies in a polynomial way. To show the reliability of our models on DNA, we exhibit high similarities between the probability distributions of nucleotides obtained by our models and the frequencies of these nucleotides computed by using a sliding window. In a further step, these DMMs can be used as the states of an HMM: on each of its segments, the observed process can be modeled by a drifting Markov model. Search of rare words in DNA sequences remains possible with DMMs and according to the fits provided, DMMs turn out to be a powerful tool for this purpose. The software is available on request from the author. It will soon be integrated on seq++ library (http://stat.genopole.cnrs.fr/seqpp/).

20.
Organization of DNA sequences and replication origins at yeast telomeres   Total citations: 50, self-citations: 0, citations by others: 50
C. S. Chan, B. K. Tye. Cell, 1983, 33(2): 563-573
We have shown that the DNA sequences adjacent to the telomeres of Saccharomyces cerevisiae chromosomes are highly conserved and contain a high density of replication origins. The salient features of these telomeres can be summarized as follows. There are three moderately repetitive elements present at the telomeres: the 131 sequence (1 to 1.5 kb), the highly conserved Y sequence (5.2 kb), and the less conserved X sequence (0.3 to 3.75 kb). There is a high density of replication origins spaced about 6.7 kb apart at the telomeres. These replication origins are part of the X or the Y sequences. Some of the 131-Y repetitive units are tandemly arranged. The terminal sequence T (about 0.33 to 0.6 kb) is different from the 131, X, or Y sequences and is heterogeneous in length. The order of these sequences from the telomeric end towards the centromere is T-(Y-131)n-X-, where n ranges from 1 to no more than 4. Although these telomeric sequences are conserved among S. cerevisiae strains, they show striking divergence in certain closely related yeast species.
