Similar Literature
20 similar documents found (search time: 15 ms)
1.
A sticker-based model for DNA computation.   Cited by: 19 (self-citations: 0, by others: 19)
We introduce a new model of molecular computation that we call the stickers model. Like many previous proposals it makes use of DNA strands as the physical substrate in which information is represented and of separation by hybridization as a central mechanism. However, unlike previous models, the stickers model has a random access memory that requires no strand extension and uses no enzymes; also (at least in theory), its materials are reusable. The paper describes computation under the stickers model and discusses possible means for physically implementing each operation. Finally, we go on to propose a specific machine architecture for implementing the stickers model as a microprocessor-controlled parallel robotic workstation. In the course of this development a number of previous general concerns about molecular computation (Smith, 1996; Hartmanis, 1995; Linial et al., 1995) are addressed. First, it is clear that general-purpose algorithms can be implemented by DNA-based computers, potentially solving a wide class of search problems. Second, we find that there are challenging problems for which only modest volumes of DNA should suffice. Third, we demonstrate that the formation and breaking of covalent bonds is not intrinsic to DNA-based computation. Fourth, we show that a single essential biotechnology, sequence-specific separation, suffices for constructing a general-purpose molecular computer. Concerns about errors in this separation operation and means to reduce them are addressed elsewhere (Karp et al., 1995; Roweis and Winfree, 1999). Despite these encouraging theoretical advances, we emphasize that substantial engineering challenges remain at almost all stages and that the ultimate success or failure of DNA computing will certainly depend on whether these challenges can be met in laboratory investigations.
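The stickers model's operations are abstract enough to sketch in software. Below, a tube is modeled as a Python set of memory complexes, each complex a tuple of bits indicating which stickers are annealed; this is an illustration of the model's logic only, not of any laboratory implementation, and the set-of-tuples representation is an assumption of this sketch:

```python
def combine(tube_a, tube_b):
    """Pour two tubes together (union of their complexes)."""
    return tube_a | tube_b

def separate(tube, i):
    """Split a tube by whether sticker i is annealed (bit i set)."""
    on = {s for s in tube if s[i] == 1}
    return on, tube - on

def set_bit(tube, i):
    """Anneal sticker i onto every complex in the tube."""
    return {s[:i] + (1,) + s[i + 1:] for s in tube}

def clear_bit(tube, i):
    """Strip sticker i from every complex in the tube."""
    return {s[:i] + (0,) + s[i + 1:] for s in tube}

# A toy 3-bit combinatorial library: every possible sticker pattern.
library = {(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

Note that separation on a bit is the model's only "read" operation, which is why a single biotechnology, sequence-specific separation, can suffice for general-purpose computation.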

2.
Detection and Integration of Genotyping Errors in Statistical Genetics   Cited by: 15 (self-citations: 0, by others: 15)
Detection of genotyping errors and integration of such errors in statistical analysis are relatively neglected topics, given their importance in gene mapping. A few inopportunely placed errors, if ignored, can tremendously affect evidence for linkage. The present study takes a fresh look at the calculation of pedigree likelihoods in the presence of genotyping error. To accommodate genotyping error, we present extensions to the Lander-Green-Kruglyak deterministic algorithm for small pedigrees and to the Markov-chain Monte Carlo stochastic algorithm for large pedigrees. These extensions can accommodate a variety of error models and refrain from simplifying assumptions, such as allowing, at most, one error per pedigree. In principle, almost any statistical genetic analysis can be performed taking errors into account, without actually correcting or deleting suspect genotypes. Three examples illustrate the possibilities. These examples make use of the full pedigree data, multiple linked markers, and a prior error model. The first example is the estimation of genotyping error rates from pedigree data. The second, and currently most useful, example is the computation of posterior mistyping probabilities. These probabilities cover both Mendelian-consistent and Mendelian-inconsistent errors. The third example is the selection of the true pedigree structure connecting a group of people from among several competing pedigree structures. Paternity testing and twin zygosity testing are typical applications.

3.
Microsatellite genotyping errors will be present in all but the smallest data sets and have the potential to undermine the conclusions of most downstream analyses. Despite this, little rigorous effort has been made to quantify the size of the problem and to identify the commonest sources of error. Here, we use a large data set comprising almost 2000 Antarctic fur seals Arctocephalus gazella genotyped at nine hypervariable microsatellite loci to explore error detection methods, common sources of error and the consequences of errors on paternal exclusion. We found good concordance among a range of contrasting approaches to error-rate estimation, our range being 0.0013 to 0.0074 per single-locus PCR (polymerase chain reaction). The best approach probably involves blind repeat-genotyping, but this is also the most labour-intensive. We show that several other approaches are also effective at detecting errors, although the most convenient alternative, namely mother-offspring comparisons, yielded the lowest estimate of the error rate. In total, we found 75 errors, emphasizing their ubiquitous presence. The most common errors involved the misinterpretation of allele banding patterns (n = 60, 80%) and of these, over a third (n = 22, 36.7%) were due to confusion between homozygote and adjacent-allele heterozygote genotypes. A specific test for whether a data set contains the expected number of adjacent-allele heterozygotes could provide a useful tool with which workers can assess the likely size of the problem. Error rates are also positively correlated with both locus polymorphism and product size, again indicating aspects where extra effort at error reduction should be directed. Finally, we conducted simulations to explore the potential impact of genotyping errors on paternity exclusion. Error rates as low as 0.01 per allele resulted in a rate of false paternity exclusion exceeding 20%. Errors also led to reduced estimates of male reproductive skew and increases in the numbers of pups that matched more than one candidate male. Because even modest error rates can be strongly influential, we recommend that error rates should be routinely published and that researchers make an attempt to calculate how robust their analyses are to errors.
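The sensitivity of exclusion to small per-allele error rates can be reproduced qualitatively with a toy simulation. The equifrequent alleles, locus count, and strict exclude-on-any-mismatch rule below are assumptions of this sketch, not the authors' simulation design:

```python
import random

def simulate_false_exclusion(n_pairs=2000, n_loci=9, n_alleles=10,
                             err=0.01, seed=1):
    """Fraction of true father-offspring pairs falsely excluded when a
    per-allele genotyping error rate `err` is applied. Strict exclusion:
    a single locus where father and offspring share no observed allele
    excludes paternity."""
    rng = random.Random(seed)

    def allele():
        return rng.randrange(n_alleles)

    def observe(a):
        # With probability err, miscall the allele as a different one.
        if rng.random() < err:
            return (a + rng.randrange(1, n_alleles)) % n_alleles
        return a

    excluded = 0
    for _ in range(n_pairs):
        mismatch = False
        for _ in range(n_loci):
            father = (allele(), allele())
            child = (rng.choice(father), allele())  # paternal + maternal allele
            obs_f = {observe(father[0]), observe(father[1])}
            obs_c = {observe(child[0]), observe(child[1])}
            if not obs_f & obs_c:
                mismatch = True
        if mismatch:
            excluded += 1
    return excluded / n_pairs
```

With no errors the false-exclusion rate is exactly zero, since the child always carries one of the father's alleles; even a 1% per-allele error rate produces a clearly non-zero rate, illustrating why the authors recommend routinely publishing error rates.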

4.
Zou G, Pan D, Zhao H. Genetics 2003;164(3):1161-1173
The identification of genotyping errors is an important issue in mapping complex disease genes. Although it is common practice to genotype multiple markers in a candidate region in genetic studies, the potential benefit of jointly analyzing multiple markers to detect genotyping errors has not been investigated. In this article, we discuss genotyping error detection for a set of tightly linked markers in nuclear families, and the objective is to identify families likely to have genotyping errors at one or more markers. We make use of the fact that recombination is a very unlikely event among these markers. We first show that, with family trios, no extra information can be gained by jointly analyzing markers if no phase information is available, and error detection rates are usually low if Mendelian consistency is used as the only standard for checking errors. However, for nuclear families with more than one child, error detection rates can be greatly increased with the consideration of more markers. Error detection rates also increase with the number of children in each family. Because families displaying Mendelian consistency may still have genotyping errors, we calculate the probability that a family displaying Mendelian consistency has correct genotypes. These probabilities can help identify families that, although showing Mendelian consistency, may have genotyping errors. In addition, we examine the benefit of available haplotype frequencies in the general population on genotyping error detection. We show that both error detection rates and the probability that an observed family displaying Mendelian consistency has correct genotypes can be greatly increased when such additional information is available.

5.
We estimate DNA sequence error rates in GenBank records containing protein-coding and non-coding DNA sequences by comparing sequences of the inbred mouse strain C57BL/6J, sequenced as part of the mouse genome project and independently by other laboratories. C57BL/6J was produced by more than 100 generations of brother-sister mating, and can be assumed to be virtually free of residual polymorphism and mutational variation, so differences between independent sequences can be attributed to error. The estimated single nucleotide error rate for coding DNA is 0.10% (SE 0.012%), which is substantially lower than previous estimates for error rates in GenBank accessions. The estimated single nucleotide error rate for intronic DNA sequences (0.22%; SE 0.051%) is significantly higher than the rate for coding DNA. Since error rates for the mouse genome sequence are very low, the vast majority of the errors we detected are likely to be in individual GenBank accessions. The frequency of insertion-deletion (indel) errors in non-coding DNA approaches that of single nucleotide errors in non-coding DNA, whereas indel errors are uncommon in coding sequences.
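The comparison underlying these estimates reduces to a per-site mismatch proportion. A minimal sketch, assuming pre-aligned, equal-length sequences and ignoring indels; note that a mismatch reflects an error in either sequence, so the proportion estimates the sum of the two sequences' error rates:

```python
def error_rate(seq_a, seq_b):
    """Per-site mismatch rate between two independently determined
    sequences of the same (assumed polymorphism-free) template, with a
    binomial standard error. Sequences must be pre-aligned and of equal
    length; indel errors are not handled here."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    n = len(seq_a)
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    p = mismatches / n
    se = (p * (1 - p) / n) ** 0.5  # binomial standard error
    return p, se
```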

6.
Errors in genotype calling can have perverse effects on genetic analyses, confounding association studies, and obscuring rare variants. Analyses now routinely incorporate error rates to control for spurious findings. However, reliable estimates of the error rate can be difficult to obtain because of their variance between studies. Most studies also report only a single estimate of the error rate even though genotypes can be miscalled in more than one way. Here, we report a method for estimating the rates at which different types of genotyping errors occur at biallelic loci using pedigree information. Our method identifies potential genotyping errors by exploiting instances where the haplotypic phase has not been faithfully transmitted. The expected frequency of inconsistent phase depends on the combination of genotypes in a pedigree and the probability of miscalling each genotype. We develop a model that uses the differences in these frequencies to estimate rates for different types of genotype error. Simulations show that our method accurately estimates these error rates in a variety of scenarios. We apply this method to a dataset from the whole-genome sequencing of owl monkeys (Aotus nancymaae) in three-generation pedigrees. We find significant differences between estimates for different types of genotyping error, with the most common being homozygous reference sites miscalled as heterozygous and vice versa. The approach we describe is applicable to any set of genotypes where haplotypic phase can reliably be called and should prove useful in helping to control for false discoveries.

7.
Genotypic errors, whether due to mutation or laboratory error, can cause the genotypes of parents and their offspring to appear inconsistent with Mendelian inheritance. As a result, molecular parentage analyses are expected to benefit when allowances are made for the presence of genotypic errors. However, a cost of allowing for genotypic errors might also be expected under some analytical conditions, primarily because parentage analyses that assume nonzero genotypic error rates can neither assign nor exclude parentage with certainty. The goal of this work was therefore to determine whether or not such costs might be important under conditions relevant to parentage analyses, particularly in natural populations. Simulation results indicate that the costs may often outweigh the benefits of accounting for nonzero error rates, except in situations where data are available for many marker loci. Consequently, the most powerful approach to handling genotypic errors in parentage analyses might be to apply likelihood equations with error rates set to values substantially lower than the rates at which genotypic errors occur. When applying molecular parentage analyses to natural populations, we advocate an increased consideration of optimal strategies for handling genotypic errors. Currently available software packages contain procedures that can be used for this purpose.

8.
The concept of the reward prediction error—the difference between reward obtained and reward predicted—continues to be a focal point for much theoretical and experimental work in psychology, cognitive science, and neuroscience. Models that rely on reward prediction errors typically assume a single learning rate for positive and negative prediction errors. However, behavioral data indicate that better-than-expected and worse-than-expected outcomes often do not have symmetric impacts on learning and decision-making. Furthermore, distinct circuits within cortico-striatal loops appear to support learning from positive and negative prediction errors, respectively. Such differential learning rates would be expected to lead to biased reward predictions and therefore suboptimal choice performance. Contrary to this intuition, we show that on static “bandit” choice tasks, differential learning rates can be adaptive. This occurs because asymmetric learning enables a better separation of learned reward probabilities. We show analytically how the optimal learning rate asymmetry depends on the reward distribution and implement a biologically plausible algorithm that adapts the balance of positive and negative learning rates from experience. These results suggest specific adaptive advantages for separate, differential learning rates in simple reinforcement learning settings and provide a novel, normative perspective on the interpretation of associated neural data.
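A minimal sketch of the differential-learning-rate idea on a static Bernoulli bandit (the eps-greedy action selection and the initial value of 0.5 are assumptions of this sketch, not details from the paper):

```python
import random

def run_bandit(p_arms, alpha_pos, alpha_neg, n_trials=5000, eps=0.1, seed=0):
    """eps-greedy agent on a static Bernoulli bandit with separate
    learning rates for positive and negative reward prediction errors.
    Returns the final value estimate for each arm."""
    rng = random.Random(seed)
    q = [0.5] * len(p_arms)  # initial value estimates
    for _ in range(n_trials):
        if rng.random() < eps:
            a = rng.randrange(len(p_arms))           # explore
        else:
            a = max(range(len(p_arms)), key=q.__getitem__)  # exploit
        r = 1.0 if rng.random() < p_arms[a] else 0.0
        delta = r - q[a]                             # reward prediction error
        alpha = alpha_pos if delta > 0 else alpha_neg
        q[a] += alpha * delta
    return q
```

In expectation the update has fixed point q* = alpha_pos*p / (alpha_pos*p + alpha_neg*(1-p)), so unequal rates bias the learned value above or below the true reward probability p, consistent with the biased predictions the abstract discusses.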

9.
Estimates of missense error rates (misreading) during protein synthesis vary from 10(-3) to 10(-4) per codon. The experiments reporting these rates have measured several distinct errors using several methods and reporter systems. Variation in reported rates may reflect real differences in rates among the errors tested or in sensitivity of the reporter systems. To develop a more accurate understanding of the range of error rates, we developed a system to quantify the frequency of every possible misreading error at a defined codon in Escherichia coli. This system uses an essential lysine in the active site of firefly luciferase. Mutations in Lys529 result in up to a 1600-fold reduction in activity, but the phenotype varies with amino acid. We hypothesized that residual activity of some of the mutant genes might result from misreading of the mutant codons by tRNA(Lys)(UUU), the cognate tRNA for the lysine codons AAA and AAG. Our data validate this hypothesis and reveal details about relative missense error rates of near-cognate codons. The error rates in E. coli do, in fact, vary widely. One source of variation is the effect of competition by cognate tRNAs for the mutant codons; higher error frequencies result from lower competition from low-abundance tRNAs. We also used the system to study the effect of ribosomal protein mutations known to affect error rates and the effect of error-inducing antibiotics, finding that they affect misreading on only a subset of near-cognate codons and that their effect may be less general than previously thought.

10.
Lisewski AM. PLoS ONE 2008;3(9):e3110
The transmission of genomic information from coding sequence to protein structure during protein synthesis is subject to stochastic errors. To analyze transmission limits in the presence of spurious errors, Shannon's noisy channel theorem is applied to a communication channel between amino acid sequences and their structures established from a large-scale statistical analysis of protein atomic coordinates. While Shannon's theorem confirms that in close to native conformations information is transmitted with limited error probability, additional random errors in sequence (amino acid substitutions) and in structure (structural defects) trigger a decrease in communication capacity toward a Shannon limit at 0.010 bits per amino acid symbol at which communication breaks down. In several controls, simulated error rates above a critical threshold and models of unfolded structures always produce capacities below this limiting value. Thus an essential biological system can be realistically modeled as a digital communication channel that is (a) sensitive to random errors and (b) restricted by a Shannon error limit. This forms a novel basis for predictions consistent with observed rates of defective ribosomal products during protein synthesis, and with the estimated excess of mutual information in protein contact potentials.

11.
In noninvasive genetic sampling, when genotyping error rates are high and recapture rates are low, misidentification of individuals can lead to overestimation of population size. Thus, estimating genotyping errors is imperative. Nonetheless, conducting multiple polymerase chain reactions (PCRs) at multiple loci is time-consuming and costly. To address the controversy regarding the minimum number of PCRs required for obtaining a consensus genotype, we compared the performance of two genotyping protocols (multiple-tubes and 'comparative method') in respect to genotyping success and error rates. Our results from 48 faecal samples of river otters (Lontra canadensis) collected in Wyoming in 2003, and from blood samples of five captive river otters amplified with four different primers, suggest that use of the comparative genotyping protocol can minimize the number of PCRs per locus. For all but five samples at one locus, the same consensus genotypes were reached with fewer PCRs and with reduced error rates with this protocol compared to the multiple-tubes method. This finding is reassuring because genotyping errors can occur at relatively high rates even in tissues such as blood and hair. In addition, we found that loci that amplify readily and yield consensus genotypes may still exhibit high error rates (7-32%) and that amplification with different primers resulted in different types and rates of error. Thus, assigning a genotype based on a single PCR for several loci could result in misidentification of individuals. We recommend that programs designed to statistically assign consensus genotypes should be modified to allow the different treatment of heterozygotes and homozygotes intrinsic to the comparative method.

12.
Geller F, Ziegler A. Human Heredity 2002;54(3):111-117
One well-known approach for the analysis of transmission-disequilibrium is the investigation of single nucleotide polymorphisms (SNPs) in trios consisting of an affected child and its parents. Results may be biased by erroneously given genotypes. Several problems, among them sample swaps or incorrect pedigree structures, are possible sources of bias. Because these can largely be ruled out by good study conditions together with checks of pedigree structure using a series of independent markers, the main remaining cause is genotyping error. Some of the errors can be detected by Mendelian checks whilst others are compatible with the pedigree structure. The extent of genotyping errors can be estimated by investigating the rate of genotyping errors detected by Mendelian checks. In many studies only one SNP of a specific genomic region is investigated by the TDT, which leaves Mendelian checks as the only tool to control genotyping errors. From the rate of detected errors the true error rate can be estimated. Gordon et al. [Hum Hered 1999;49:65-70] considered the case of genotyping errors that occur randomly and independently with some fixed probability of an allele being wrongly ascertained. In practice, instead of single alleles, SNP genotypes are determined. Therefore, we study the proportion of detected errors (detection rate) based on genotypes. In contrast to Gordon et al., who reported detection rates between 25 and 30%, we obtain higher detection rates ranging from 39 up to 61% considering likely error structures in the data. We conclude that detection rates are probably substantially higher than those reported by Gordon et al.
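The genotype-level detection rate discussed here can be explored with a small trio simulation. The random-replacement error model and Hardy-Weinberg genotypes at allele frequency 0.3 below are assumptions of this sketch, not the error structures the authors consider:

```python
import random

def gametes(g):
    """Possible gamete allele counts from a biallelic genotype
    (g = 0, 1, or 2 copies of the B allele)."""
    return {0} if g == 0 else {1} if g == 2 else {0, 1}

def consistent(gf, gm, gc):
    """Mendelian consistency of a father/mother/child SNP trio."""
    return any(pf + pm == gc for pf in gametes(gf) for pm in gametes(gm))

def detection_rate(p=0.3, n=20000, seed=2):
    """Fraction of single random genotype errors in a trio that are
    detectable as Mendelian inconsistencies. The erroneous genotype is
    replaced by one of the other two genotypes uniformly at random."""
    rng = random.Random(seed)
    geno = lambda: sum(rng.random() < p for _ in range(2))
    gamete = lambda g: rng.randrange(2) if g == 1 else g // 2
    detected = 0
    for _ in range(n):
        trio = [geno(), geno()]                          # parents
        trio.append(gamete(trio[0]) + gamete(trio[1]))   # child
        who = rng.randrange(3)                           # member with the error
        trio[who] = rng.choice([g for g in (0, 1, 2) if g != trio[who]])
        if not consistent(*trio):
            detected += 1
    return detected / n
```

As the abstract emphasizes, a substantial fraction of errors remains Mendelian-consistent and therefore undetectable by consistency checks alone.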

13.
There are several sources from which human embryonic stem cell (hESC) lines can be generated: surplus embryos after in vitro fertilization procedures, one- and three-pronuclear zygotes, early arrested or highly fragmented embryos that have reached the blastocyst stage, or otherwise chromosomally or genetically abnormal embryos after preimplantation genetic diagnosis (PGD). We report on the efficiency of establishing hESC lines from blastocysts with proven meiotic or mitotic errors after sequential testing of both polar bodies and blastomere analysis on day 3. The success rate of establishing hESC lines originating from blastocysts carrying a meiotic error was as low as 2.4% and differed significantly from the success rate of establishing hESC lines originating from blastocysts with balanced meiotic errors (21.6%) or mitotic errors (after sequential testing (9.1%) and after blastomere testing alone (12.2%)). This suggests that it may be reasonable to apply sequential PGD prior to the initiation of hESC culture. Information about the karyotype may in the future help refine the methods and possibly improve the efficiency by which hESC lines are derived from embryos with prezygotic abnormalities. Additionally, it may in general prove very difficult to obtain abnormal hESC lines for scientific study from aneuploid PGD embryos, which will limit our ability to study the biological consequences of chromosomal abnormalities. Furthermore, the success rates for generating aneuploid cell lines originating from fertilized oocytes carrying a prezygotic nondisjunction error seem to mirror the miscarriage rates during pregnancy of embryos carrying such errors.

14.
Determining population sizes can be difficult, but is essential for conservation. By counting distinct microsatellite genotypes, DNA from noninvasive samples (hair, faeces) allows estimation of population size. Problems arise because genotypes from noninvasive samples are error-prone, but genotyping errors can be reduced by multiple polymerase chain reaction (PCR). For faecal genotypes from wolves in Yellowstone National Park, error rates varied substantially among samples, often above the 'worst-case threshold' suggested by simulation. Consequently, a substantial proportion of multilocus genotypes held one or more errors, despite multiple PCR. These genotyping errors created several genotypes per individual and caused overestimation (up to 5.5-fold) of population size. We propose a 'matching approach' to eliminate this overestimation bias.
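The overestimation bias described here is easy to demonstrate in a toy model: counting distinct observed multilocus genotypes inflates the census whenever captures carry independent per-locus errors. The population and locus parameters below are illustrative assumptions, not the Yellowstone data:

```python
import random

def observed_population_size(n_ind=50, n_loci=8, n_alleles=10,
                             captures_per_ind=3, err=0.05, seed=3):
    """Number of distinct multilocus genotypes observed when each capture
    of each individual is genotyped with a per-locus miscall probability
    `err` (miscalls replace the allele with a random allele). With err=0
    this is at most the true population size; with err>0 it exceeds it."""
    rng = random.Random(seed)
    truth = [tuple(rng.randrange(n_alleles) for _ in range(n_loci))
             for _ in range(n_ind)]
    seen = set()
    for g in truth:
        for _ in range(captures_per_ind):
            obs = tuple(a if rng.random() > err else rng.randrange(n_alleles)
                        for a in g)
            seen.add(obs)
    return len(seen)
```

A "matching approach" in this setting would merge observed genotypes that differ at only one or two loci before counting, since such near-matches are far more likely to be error copies of one individual than distinct animals.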

15.
In this paper we describe a method for the statistical reconstruction of a large DNA sequence from a set of sequenced fragments. We assume that the fragments have been assembled and address the problem of determining the degree to which the reconstructed sequence is free from errors, i.e., its accuracy. A consensus distribution is derived from the assembled fragment configuration based upon the rates of sequencing errors in the individual fragments. The consensus distribution can be used to find a minimally redundant consensus sequence that meets a prespecified confidence level, either base by base or across any region of the sequence. A likelihood-based procedure for the estimation of the sequencing error rates, which utilizes an iterative EM algorithm, is described. Prior knowledge of the error rates is easily incorporated into the estimation procedure. The methods are applied to a set of assembled sequence fragments from the human G6PD locus. We close the paper with a brief discussion of the relevance and practical implications of this work.
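The column-wise consensus distribution can be sketched with a simple per-fragment error model: each fragment reports the true base with probability 1 - e and otherwise one of the other three bases uniformly. The uniform prior and this particular error model are assumptions of the sketch, not the paper's exact derivation:

```python
def consensus_posterior(calls, prior=None):
    """Posterior distribution over the true base at one alignment column,
    given (base, error_rate) pairs from the fragments covering it."""
    bases = "ACGT"
    prior = prior or {b: 0.25 for b in bases}
    post = dict(prior)
    for obs, e in calls:
        for b in bases:
            # Likelihood of observing `obs` when the true base is b.
            post[b] *= (1 - e) if b == obs else e / 3
    z = sum(post.values())
    return {b: p / z for b, p in post.items()}
```

Reporting max(post.values()) per column gives exactly the kind of base-by-base confidence statement the method is after: columns whose posterior falls below the prespecified confidence level need more fragment coverage.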

16.
Parallel computing is widely regarded as an efficient approach to digital terrain analysis (DTA) in geographic information systems. Stable and reliable service is essential in high-performance computing, especially when errors occur in large-scale scientific computations. In this paper, we propose a fault-tolerant approach to parallel DTA: fast parallel re-computation (FPR). Compared with other fault-tolerant methods, FPR offers fast self-recovery based on redundancy mechanisms. Once errors are detected at the application layer, the data block containing the erroneous computation is partitioned into several sub-blocks, which the surviving processes re-compute concurrently to speed up failure recovery. Error detection and re-computation are overlapped by decomposing each data block into several logical sub-blocks: as soon as a comparing thread detects an error in a logical sub-block, re-computation starts immediately to correct it. This overlap reduces the combined time for error detection and re-computation relative to traditional re-computation methods. Experiments show that the proposed FPR method achieves better efficiency with less overhead.
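The block/sub-block recovery strategy can be sketched in miniature. Here `f_fast` stands in for a possibly faulty DTA kernel and `f_check` for a trusted reference used by the comparing step; both names, the sub-block count, and the detect-then-recompute structure are assumptions of this sketch, not the paper's implementation (which overlaps detection with computation rather than running them sequentially):

```python
from concurrent.futures import ThreadPoolExecutor

def fpr(block, f_fast, f_check, n_sub=4):
    """Compute f_fast over a data block, detect erroneous logical
    sub-blocks by comparison against f_check, and re-compute only the
    failing sub-blocks concurrently."""
    result = [f_fast(x) for x in block]
    step = max(1, len(block) // n_sub)
    # Detection: find logical sub-blocks whose results disagree.
    bad = [i for i in range(0, len(block), step)
           if result[i:i + step] != [f_check(x) for x in block[i:i + step]]]
    # Recovery: re-compute only the failing sub-blocks, in parallel.
    with ThreadPoolExecutor() as pool:
        redo = lambda j: [f_check(x) for x in block[j:j + step]]
        for i, sub in zip(bad, pool.map(redo, bad)):
            result[i:i + step] = sub
    return result
```

The point of the partitioning is that recovery cost scales with the size of the failing sub-block rather than the whole block.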

17.
18.
19.
Johnson PC, Haydon DT. Genetics 2007;175(2):827-842
The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently, methods to estimate rates of allelic dropout and false alleles depend upon the availability of error-free reference genotypes or reliable pedigree data, which are often not available. We have developed a maximum-likelihood-based method for estimating these error rates from a single replication of a sample of genotypes. Simulations show it to be both accurate and robust to modest violations of its underlying assumptions. We have applied the method to estimating error rates in two microsatellite data sets. It is implemented in a computer program, Pedant, which estimates allelic dropout and false allele error rates with 95% confidence regions from microsatellite genotype data and performs power analysis. Pedant is freely available at http://www.stats.gla.ac.uk/~paulj/pedant.html.

20.
OBJECTIVE: In affected sib pair studies without genotyped parents the effect of genotyping error is generally to reduce the type I error rate and power of tests for linkage. The effect of genotyping error when parents have been genotyped is unknown. We investigated the type I error rate of the single-point Mean test for studies in which genotypes of both parents are available. METHODS: Datasets were simulated assuming no linkage and one of five models for genotyping error. In each dataset, Mendelian-inconsistent families were either excluded or regenotyped, and then the Mean test applied. RESULTS: We found that genotyping errors lead to an inflated type I error rate when inconsistent families are excluded. Depending on the genotyping-error model assumed, regenotyping inconsistent families has one of several effects. It may produce the same type I error rate as if inconsistent families are excluded; it may reduce the type I error, but still leave an anti-conservative test; or it may give a conservative test. Departures of the type I error rate from its nominal level increase with both the genotyping error rate and sample size. CONCLUSION: We recommend that markers with high error rates either be excluded from the analysis or be regenotyped in all families.
