Similar literature
1.
Contaminated observations (e.g., outliers) and heavy tails in the underlying distribution influence the standard deviation as a measure of dispersion even more than they influence, e.g., the mean. Other measures of dispersion, namely the absolute deviation, the (α, β)-trimmed standard deviation, the interquartile range, and the median absolute deviation (MAD), are defined for the population; their properties, especially robustness, are explained; and estimators are given, discussed, and computed for a medical example. It is investigated how these measures of dispersion can be used to estimate a scale parameter of the underlying distribution more robustly. In numerical comparisons and simulations, the robustness of these measures is demonstrated for heavy-tailed distributions and contaminated distributions. Among other proposals, it is recommended to use the (α, β)-trimmed standard deviation and, if possible, to transform it to the ordinary standard deviation for easier interpretation.
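
A minimal sketch of two of the measures named above, the MAD and the interquartile range, may make the robustness claim concrete. It is not the authors' code: the sample values are invented for illustration, and the function names are hypothetical. The rescaling constants 1.4826 and 1.349 are the usual ones that make the MAD and IQR comparable to the standard deviation under a normal distribution.

```python
import statistics


def median_absolute_deviation(xs):
    """Unscaled MAD: median of absolute deviations from the sample median."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def interquartile_range(xs):
    """Distance between the first and third sample quartiles."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return q3 - q1


def robust_sd_from_mad(xs):
    """Rescale the MAD so it estimates sigma when the data are normal."""
    return 1.4826 * median_absolute_deviation(xs)


def robust_sd_from_iqr(xs):
    """Rescale the IQR so it estimates sigma when the data are normal."""
    return interquartile_range(xs) / 1.349


# Invented illustration data: eight clean readings plus one gross outlier.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
contaminated = clean + [25.0]

for name, sample in (("clean", clean), ("contaminated", contaminated)):
    print(name,
          "SD =", round(statistics.stdev(sample), 3),
          "MAD-based =", round(robust_sd_from_mad(sample), 3),
          "IQR-based =", round(robust_sd_from_iqr(sample), 3))
```

In this toy sample, the single outlier inflates the ordinary standard deviation many times over, while the MAD-based estimate moves only slightly.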

2.
The kappa index is usually used for measuring the agreement between two observers when the scale is nominal. A modification of Cohen's kappa index was given by Krauth. The new estimator was biased, and its large-sample variance was obtained. An alternative estimator is developed here. It is a ratio estimator, and its mean square error is derived. It is compared with Cohen's estimator and Krauth's estimator using the examples from Krauth's paper.
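
For orientation, the sketch below computes the standard Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. It does not implement Krauth's modification or the ratio estimator developed in the paper, neither of which is specified in the abstract. The function name and the ratings are made up for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Standard Cohen's kappa for two raters on a nominal scale."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the product of the raters' marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Invented 2x2 example: 20 yes/yes, 5 yes/no, 10 no/yes, 15 no/no.
rater_a = ["yes"] * 25 + ["no"] * 25
rater_b = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Here p_o = 0.7 and p_e = 0.5, so κ = 0.4.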

3.
Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, in which two single DNA strands combine into a single new strand. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
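
The Hamming-distance constraint that this abstract uses as its baseline can be sketched as follows. This shows only the plain combinatorial check; the thermodynamic constraint, the free energy gap, and the genetic algorithm from the paper are not reproduced. The function names and the three sequences are hypothetical.

```python
from itertools import combinations


def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def min_pairwise_hamming(sequences):
    """Smallest Hamming distance over all pairs in the set."""
    return min(hamming_distance(a, b) for a, b in combinations(sequences, 2))


def satisfies_hamming_constraint(sequences, d):
    """True if every pair of sequences differs in at least d positions."""
    return min_pairwise_hamming(sequences) >= d


# Hypothetical candidate set of 8-mers.
candidates = ["ACGTACGT", "ACGTTGCA", "TGCATGCA"]
print(min_pairwise_hamming(candidates))
print(satisfies_hamming_constraint(candidates, 4))
```

A design procedure of this kind accepts a candidate set only if the check passes. As the abstract points out, such a check says nothing about the thermodynamics of hybridization.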

4.
5.
6.
Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly difficult and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the “information profile”, which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals of the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing users to zoom in on specific details or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance.

7.
A model experimental system based on SV40-transformed Chinese hamster embryo cells and a highly sensitive in situ hybridization procedure was designed. Exposure of the cells to different categories of chemical and physical carcinogens resulted in the induction of SV40 DNA synthesis in the treated cells. Although the carcinogen-mediated amplification of SV40 DNA sequences is regulated by the viral “A” gene, neither infectious virus nor complete viral DNA molecules were rescued from the treated cells. A heterogeneous collection of DNA molecules containing SV40 sequences was generated following treatment with DMBA. Restriction enzyme analysis of the amplified DNA molecules in the Hirt supernatant revealed that not all sequences in the integrated SV40 inserts are present. The possibility that the amplification of SV40 sequences is a reflection of a general gene amplification phenomenon mediated by carcinogens is discussed.

8.
Denatured DNA from leukemic myeloblasts or uninfected chicken embryos, immobilized on nitrocellulose filters, was hybridized to a vast excess of [³H]70S RNA from purified avian myeloblastosis virus. The viral RNA was eluted from the RNA-DNA hybrids, purified, and then rehybridized in solution to an excess of either leukemic or normal chicken embryonic DNA. This study revealed that all the slow and the fast hybridizing viral RNA sequences detectable by liquid hybridization in DNA excess had hybridized to the filter-bound DNA. Both techniques also gave similar values for the number of 28S ribosomal RNA genes contained in a chicken cell genome: 210 by the liquid hybridization procedure and 218 by the filter hybridization technique. Therefore, filter hybridization can accurately detect DNA sequences present in relatively low copy numbers in the genome of higher organisms.

9.
10.
11.
12.
Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD) requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: One carries the entire Rasgrf1 cluster (RC); the second carries only the DMD and repeats (DR) from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

13.

Background

Comparative DNA sequence analysis provides insight into evolution and helps construct a natural classification reflecting the Tree of Life. The growing numbers of organisms represented in DNA databases challenge tree-building techniques and the vertical hierarchical classification may obscure relationships among some groups. Approaches that can incorporate sequence data from large numbers of taxa and enable visualization of affinities across groups are desirable.

Methodology/Principal Findings

Toward this end, we developed a procedure for extracting diagnostic patterns in the form of indicator vectors from DNA sequences of taxonomic groups. In the present instance the indicator vectors were derived from mitochondrial cytochrome c oxidase I (COI) sequences of those groups and further analyzed on this basis. In the first example, indicator vectors for birds, fish, and butterflies were constructed from a training set of COI sequences, then correlations with test sequences not used to construct the indicator vector were determined. In all cases, correlation with the indicator vector correctly assigned test sequences to their proper group. In the second example, this approach was explored at the species level within the bird grouping; this also gave correct assignment, suggesting the possibility of automated procedures for classification at various taxonomic levels. A false-color matrix of vector correlations displayed affinities among species consistent with higher-order taxonomy.

Conclusions/Significance

The indicator vectors preserved DNA character information and provided quantitative measures of correlations among taxonomic groups. This method is scalable to the largest datasets envisioned in this field, provides a visually-intuitive display that captures relational affinities derived from sequence data across a diversity of life forms, and is potentially a useful complement to current tree-building techniques for studying evolutionary processes based on DNA sequence data.

14.
We report here a novel method for predicting melting temperatures of DNA sequences based on a molecular-level hypothesis on the phenomena underlying the thermal denaturation of DNA. The model presented here attempts to quantify the energetic components stabilizing the structure of DNA such as base pairing, stacking, and ionic environment which are partially disrupted during the process of thermal denaturation. The model gives a Pearson product-moment correlation coefficient (r) of ∼0.98 between experimental and predicted melting temperatures for over 300 sequences of varying lengths ranging from 15-mers to genomic level and at different salt concentrations. The approach is implemented as a web tool (www.scfbio-iitd.res.in/chemgenome/Tm_predictor.jsp) for the prediction of melting temperatures of DNA sequences.

15.
The n-rule of Schrödinger in his discussion of DNA is based on normal statistics and equilibrium physics. Herein the kurtosis is used to measure the deviation from normality of the statistics of non-equilibrium DNA sequences. A pattern for this deviation from normality is identified and this signature is found in prokaryotes. The signature is explained by a theory of DNA sequences that involves finite length DNA walks with dynamically generated long-range correlations.
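
As a rough illustration of how kurtosis can quantify deviation from normality, the sketch below computes the excess kurtosis (fourth central moment divided by the squared variance, minus 3, which is zero for a normal distribution) of window displacements of a DNA walk. The abstract does not say how the walk is defined, so the purine/pyrimidine step rule (A, G → +1; C, T → −1), the non-overlapping windows, and all function names are assumptions made for this example only.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (zero for a normal law)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2) - 3.0


# Assumed step rule for the walk: purines step up, pyrimidines step down.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def window_displacements(sequence, window):
    """Net walk displacement over consecutive non-overlapping windows."""
    steps = [STEP[base] for base in sequence]
    return [sum(steps[i:i + window])
            for i in range(0, len(steps) - window + 1, window)]


print(window_displacements("AGCTAAGG", 4))
print(excess_kurtosis([1, 2, 3, 4, 5]))
```

For a real genome, one would compute the displacements over many windows and compare their excess kurtosis with zero, the value expected under normal statistics.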

16.
Noncanonical parallel-stranded DNA double helices (ps-DNA) of natural nucleotide sequences are usually less stable than the canonical antiparallel-stranded DNA structures, which ensures reliable cell functioning. However, recent data indicate a possible role of ps-DNA in DNA loops or in regions of trinucleotide repeats connected with neurodegenerative diseases. The review surveys recent studies on the effect of nucleotide sequence on the preference for one or the other type of DNA duplex. (1) Ps-DNA of mixed AT/GC composition was found to have conformational and thermodynamic properties drastically different from those of a Watson–Crick double helix. Its stability depends strongly on the specific sequence in a manner peculiar to the ps double helix, because of the energy disadvantage of the AT/GC contacts. The AT/GC boundary facilitated flipping of A and T out of the ps double helix. Proton acceptor groups of bases are exposed in both grooves of the ps-DNA and are accessible to solvent and ligands, including proteins. (2) DNA regions containing the natural minor bases isoguanine and isomethylisocytosine were shown to form ps-DNA with trans-AT, trans-isoGC, and trans-iso5meCG pairs exceeding in stability a related canonical duplex. (3) The nucleotide sequence dG(GT)4G from yeast telomeres and microsatellites was demonstrated to form novel ps-DNA with GG and TT base pairing. Unlike d(GT)n and d(GnTm) sequences, which are able to form quadruplexes, the dG(GT)4G sequence formed no alternative double- or multistranded structures in a wide range of experimental conditions, thus suggesting that the nucleotide context governs the observed structural polymorphism of the d(GT)n sequence. The possible biological role of ps-DNA and the prospects of its study are discussed.

17.
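This review surveys the repetitive DNA sequences commonly found in genomes and describes their possible mechanisms of origin, their distribution, and their biological functions.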

18.
A 2-sperminoguanosine nucleotide has been synthesized and incorporated into oligonucleotides, which showed increased duplex melting temperature.

19.
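The centromere is an important chromosomal structure that is responsible for chromosome segregation during cell division in eukaryotes. In recent years, centromere research has become a hot topic in genetics. This article briefly introduces the repetitive sequences of centromeric DNA, the genes in centromeric regions, and the mechanisms of centromere formation.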

20.