1.

Mathematical Models for Analysis of Bacterial Endocarditis Data

H. B. Eisenberg R. R. M. Geoghagen J. E. Walsh 《Biometrical journal. Biometrische Zeitschrift》1968,10(4):248-256

2.

A New Mathematical Model of Nutrient Effects (total citations: 1; self-citations: 0; citations by others: 1)

Zheng Xuegao (郑学高) Zhang Ze (张泽) 《生物数学学报》 (Journal of Biomathematics) 1997,(Z1)

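Studies of nutrient effects mostly use a quadratic parabola model, and Cong Rong (1991) proposed an improved parabola model based on the Michaelis-Menten equation. Both are symmetric models, that is, they treat the rates of the positive and negative nutrient effects as equal, whereas most real cases do not behave this way. This paper establishes an asymmetric model that reflects the general behavior of nutrient effects. Two application examples show that the new model is clearly superior to the earlier models both in goodness of fit (residual sum of squares) and in biological meaning.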

3.

Mathematical Analysis of a Chlamydia Epidemic Model with Pulse Vaccination Strategy

G. P. Samanta 《Acta biotheoretica》2015,63(1):1-21

4.

Variant Association Tools for Quality Control and Analysis of Large-Scale Sequence and Genotyping Array Data

Gao T. Wang Bo Peng Suzanne M. Leal 《American journal of human genetics》2014,94(5):770-783

Currently there is great interest in detecting associations between complex traits and rare variants. In this report, we describe Variant Association Tools (VAT) and the VAT pipeline, which implements best practices for rare-variant association studies. Highlights of VAT include variant-site and call-level quality control (QC), summary statistics, phenotype- and genotype-based sample selection, variant annotation, selection of variants for association analysis, and a collection of rare-variant association methods for analyzing qualitative and quantitative traits. The association testing framework for VAT is regression based, which readily allows for flexible construction of association models with multiple covariates and weighting themes based on allele frequencies or predicted functionality. Additionally, pathway analyses, conditional analyses, and analyses of gene-gene and gene-environment interactions can be performed. VAT is capable of rapidly scanning through data by using multi-process computation, adaptive permutation, and simultaneously conducting association analysis via multiple methods. Results are available in text or graphic file formats and additionally can be output to relational databases for further annotation and filtering. An interface to R language also facilitates user implementation of novel association methods. The VAT's data QC and association-analysis pipeline can be applied to sequence, imputed, and genotyping array, e.g., “exome chip,” data, providing a reliable and reproducible computational environment in which to analyze small- to large-scale studies with data from the latest genotyping and sequencing technologies. Application of the VAT pipeline is demonstrated through analysis of data from the 1000 Genomes project.

5.

Variant Association Tools for Quality Control and Analysis of Large-Scale Sequence and Genotyping Array Data

Gao T. Wang Bo Peng Suzanne M. Leal 《American journal of human genetics》2014

6.

A New Variant of the Mouse Akp1 Locus

Lee CH Kim EH Won YS Choi YK Nam KH Kim HC Hyun BH Suh JG Oh YS 《Biochemical genetics》2005,43(11-12):597-602

A new electrophoretic migration type of alkaline phosphatase 1 (Akp1) was found by cellulose acetate electrophoresis of kidney and liver homogenates of the KWHM mouse, a newly established inbred strain derived from the Korean wild mouse (Mus musculus molossinus). This new type of alkaline phosphatase 1 was distinguished from the previously reported AKP1A and AKP1B types in the mouse, and tentatively named AKP1C. In genetic analysis by mating experiments between KWHM and C57BL/6J (AKP1A) or BALB/cA (AKP1B), the phenotypic segregation ratios of AKP1A : AKP1AC : AKP1C or AKP1B : AKP1BC : AKP1C were 1 : 2 : 1 in both groups of F2 generations. It was therefore concluded that the AKP1C type is controlled by the Akp1c allele, which is codominant with the Akp1a and Akp1b alleles.
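A 1 : 2 : 1 segregation claim of this kind is usually checked with a chi-square goodness-of-fit test. The sketch below shows the arithmetic only. The F2 counts are hypothetical (the abstract reports the ratio, not the raw numbers), and the function name and the 5% threshold are assumptions made for illustration.

```python
# Chi-square goodness-of-fit test against a 1:2:1 codominant segregation ratio.
# All counts used here are hypothetical; the abstract gives no raw F2 numbers.

# Critical value of the chi-square distribution, 2 degrees of freedom, alpha = 0.05.
CRITICAL_5PCT_DF2 = 5.991


def chi_square_1_2_1(n_first_homozygote, n_heterozygote, n_second_homozygote):
    """Return the chi-square statistic for observed counts versus 1:2:1."""
    observed = [n_first_homozygote, n_heterozygote, n_second_homozygote]
    total = sum(observed)
    expected = [total * 0.25, total * 0.50, total * 0.25]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical F2 offspring: 24 AKP1A, 52 AKP1AC, 24 AKP1C.
statistic = chi_square_1_2_1(24, 52, 24)
fits_1_2_1 = statistic < CRITICAL_5PCT_DF2
print(f"chi-square = {statistic:.2f}, consistent with 1:2:1: {fits_1_2_1}")
```

A statistic below the critical value means the counts do not contradict single-locus codominant inheritance. It does not prove it, and small samples have little power to reject alternatives.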

7.

FamSeq: A Variant Calling Program for Family-Based Sequencing Data Using Graphics Processing Units

Gang Peng Yu Fan Wenyi Wang 《PLoS computational biology》2014,10(10)

Various algorithms have been developed for variant calling using next-generation sequencing data, and various methods have been applied to reduce the associated false positive and false negative rates. Few variant calling programs, however, utilize the pedigree information when the family-based sequencing data are available. Here, we present a program, FamSeq, which reduces both false positive and false negative rates by incorporating the pedigree information from the Mendelian genetic model into variant calling. To accommodate variations in data complexity, FamSeq consists of four distinct implementations of the Mendelian genetic model: the Bayesian network algorithm, a graphics processing unit version of the Bayesian network algorithm, the Elston-Stewart algorithm and the Markov chain Monte Carlo algorithm. To make the software efficient and applicable to large families, we parallelized the Bayesian network algorithm that copes with pedigrees with inbreeding loops without losing calculation precision on an NVIDIA graphics processing unit. In order to compare the difference in the four methods, we applied FamSeq to pedigree sequencing data with family sizes that varied from 7 to 12. When there is no inbreeding loop in the pedigree, the Elston-Stewart algorithm gives analytical results in a short time. If there are inbreeding loops in the pedigree, we recommend the Bayesian network method, which provides exact answers. To improve the computing speed of the Bayesian network method, we parallelized the computation on a graphics processing unit. This allowed the Bayesian network method to process the whole genome sequencing data of a family of 12 individuals within two days, which was a 10-fold time reduction compared to the time required for this computation on a central processing unit.
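To make the idea of using pedigree information in variant calling concrete, here is a minimal sketch for a single parent-parent-child trio at one biallelic site. It is not FamSeq's implementation: it enumerates parental genotypes by brute force, ignores de novo mutation, and assumes Hardy-Weinberg priors for the two founders. All function names and numbers are hypothetical.

```python
# Trio-aware genotype posterior at one biallelic site (illustrative only).
# Genotypes are coded as the number of alternate alleles: 0, 1 or 2.
from itertools import product


def hwe_prior(alt_freq):
    """Hardy-Weinberg genotype prior for a founder (assumed, not from the source)."""
    ref_freq = 1.0 - alt_freq
    return [ref_freq ** 2, 2 * ref_freq * alt_freq, alt_freq ** 2]


def transmit_prob(child, mother, father):
    """Mendelian probability of the child genotype given both parents (no de novo)."""
    pm, pf = mother / 2, father / 2  # chance each parent passes on an alt allele
    return [(1 - pm) * (1 - pf), pm * (1 - pf) + (1 - pm) * pf, pm * pf][child]


def child_posterior(lik_child, lik_mother, lik_father, founder_prior):
    """Posterior over the child's genotype, summing over parental genotypes."""
    scores = []
    for c in range(3):
        parental_sum = 0.0
        for m, f in product(range(3), repeat=2):
            parental_sum += (founder_prior[m] * lik_mother[m]
                             * founder_prior[f] * lik_father[f]
                             * transmit_prob(c, m, f))
        scores.append(lik_child[c] * parental_sum)
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical read likelihoods: the child's data weakly favor a heterozygote,
# but both parents are confidently homozygous reference.
posterior = child_posterior([0.4, 0.6, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                            hwe_prior(0.01))
print(posterior)
```

In this toy case the family information overrides the child's ambiguous reads, which is the mechanism by which pedigree-aware calling can suppress false positives. Allowing a small de novo mutation rate would keep the heterozygous call possible.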

This is a PLOS Computational Biology Software Article

8.

A Model-Based Clustering Method for Genomic Structural Variant Prediction and Genotyping Using Paired-End Sequencing Data

Matthew Hayes Yoon Soo Pyon Jing Li 《PloS one》2012,7(12)

Structural variation (SV) has been reported to be associated with numerous diseases such as cancer. With the advent of next generation sequencing (NGS) technologies, various types of SV can potentially be identified. We propose a model-based clustering approach utilizing a set of features defined for each type of SV event. Our method, termed SVMiner, not only provides a probability score for each candidate, but also predicts the heterozygosity of genomic deletions. Extensive experiments on genome-wide deep sequencing data have demonstrated that SVMiner is robust against the variability of a single cluster feature, and it significantly outperforms several commonly used SV detection programs. SVMiner can be downloaded from http://cbc.case.edu/svminer/.

9.

A Mathematical Analysis of Multiple-Target Selex

Yeon-Jung Seo Shiliang Chen Marit Nilsen-Hamilton Howard A. Levine 《Bulletin of mathematical biology》2010,72(7):1623-1665

SELEX (Systematic Evolution of Ligands by Exponential Enrichment) is a procedure by which a mixture of nucleic acids can be fractionated with the goal of identifying those with specific biochemical activities.

10.

Evaluation of Multitype Mathematical Models for CFSE-Labeling Experiment Data

Miao H Jin X Perelson AS Wu H 《Bulletin of mathematical biology》2012,74(2):300-326

Carboxy-fluorescein diacetate succinimidyl ester (CFSE) labeling is an important experimental tool for measuring cell responses to extracellular signals in biomedical research. However, changes of the cell cycle (e.g., time to division) corresponding to different stimulations cannot be directly characterized from data collected in CFSE-labeling experiments. A number of independent studies have developed mathematical models as well as parameter estimation methods to better understand cell cycle kinetics based on CFSE data. However, when different models are applied to the same data set, notable discrepancies in the resulting parameter estimates have become an issue of great concern. It is therefore important to compare existing models and make recommendations for practical use. For this purpose, we derived the analytic form of an age-dependent multitype branching process model. We then compared the performance of different models, namely branching process, cyton, Smith–Martin, and a linear birth–death ordinary differential equation (ODE) model via simulation studies. For fairness of model comparison, simulated data sets were generated using an agent-based simulation tool which is independent of the four models that are compared. The simulation study results suggest that the branching process model significantly outperforms the other three models over a wide range of parameter values. This model was then employed to understand the proliferation pattern of CD4+ and CD8+ T cells under polyclonal stimulation.
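As an illustration of the simplest model class named above, the following sketch integrates a generation-structured linear birth-death ODE, in which cells of generation i divide at rate b into two cells of generation i+1 and die at rate d. This particular form, the parameter values, and the forward-Euler integrator are assumptions made for illustration; the abstract does not state the exact equations the authors compared.

```python
# Generation-structured linear birth-death model (illustrative assumption):
#   dN_0/dt = -(b + d) * N_0
#   dN_i/dt = 2 * b * N_{i-1} - (b + d) * N_i   for i >= 1
import math


def simulate_generations(n0, birth, death, t_end, max_gen=8, dt=0.001):
    """Forward-Euler integration; returns cell counts per division generation."""
    cells = [float(n0)] + [0.0] * max_gen
    steps = int(round(t_end / dt))
    for _ in range(steps):
        updated = cells[:]
        for i in range(max_gen + 1):
            inflow = 2.0 * birth * cells[i - 1] if i > 0 else 0.0
            updated[i] = cells[i] + dt * (inflow - (birth + death) * cells[i])
        cells = updated
    return cells


def undivided_exact(n0, birth, death, t):
    """Closed-form solution for generation 0, used as a sanity check."""
    return n0 * math.exp(-(birth + death) * t)


# Hypothetical parameters: 1000 cells, b = 0.5 and d = 0.1 per day, one day.
counts = simulate_generations(1000, birth=0.5, death=0.1, t_end=1.0)
print([round(c, 1) for c in counts[:4]])
```

The per-generation counts are what a CFSE histogram resolves, since each division halves the dye intensity. Cells that would divide beyond `max_gen` are dropped, so that cap needs to be large relative to the simulated time span.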

11.

A Mathematical Model of Interference for Use in Constructing Linkage Maps from Tetrad Data

J. S. King R. K. Mortimer 《Genetics》1991,129(2):597-602

In determining genetic map distances it is necessary to infer crossover frequencies from the ratios of recombinant and parental progeny. To do this accurately, in intervals where multiple crossovers may occur, a mathematical model of chiasma interference must be assumed when mapping in organisms displaying such interference. In Saccharomyces cerevisiae the model most frequently used is that of R.W. Barratt. An alternative to this model is presented. This new model is implemented using a microcomputer and standard numerical methods. It is demonstrated to fit ranked tetrad data from Saccharomyces more closely than the Barratt model and thus generates more accurate estimates of map distances when used with two-point data. A computer program implementing the model has been developed for use in calculating map distances from tetrad data in Saccharomyces.

12.

A New Approach for the Discrimination of Mixed Data

K.-D. Wernecke 《Biometrical journal. Biometrische Zeitschrift》1991,33(7):893-896

A new procedure for the discrimination of mixed data is presented. The method allows the application of well-known discriminators after defining new (continuous) variables from the a posteriori probabilities. Some outlooks and an example from medical diagnostics are given.

13.

A New Method for the Analysis of Soft Tissues with Data Acquired under Field Conditions

Ruth S. Sonnweber Nina Stobbe Olmo Zavala Romero Dennis E. Slice Martin Fieder Bernard Wallner 《PloS one》2013,8(6)

Analyzing soft-tissue structures is particularly challenging due to the lack of homologous landmarks that can be reliably identified across time and specimens. This is particularly true when data are to be collected under field conditions. Here, we present a method that combines photogrammetric techniques and geometric morphometrics methods (GMM) to quantify soft tissues for their subsequent volumetric analysis. We combine previously developed methods for landmark data acquisition and processing with a custom program for volumetric computations. Photogrammetric methods are a particularly powerful tool for field studies as they allow for image acquisition with minimal equipment requirements and for the acquisition of the spatial coordinates of points (anatomical landmarks or others) from these images. For our method, a limited number of homologous landmarks, i.e., points that can be found on any specimen independent of space and time, and further distinctive points, which may vary over time, space and subject, are identified on two-dimensional photographs and their three-dimensional coordinates estimated using photogrammetric methods. The three-dimensional configurations are oriented by the spatial principal components (PCs) of the homologous points. Crucially, this last step orients the configuration such that x and y-information (PC1 and PC2 coordinates) constitute an anatomically-defined plane with the z-values (PC3 coordinate) in the direction of interest for volume computation. The z-coordinates are then used to estimate the volume of the tissue. We validate our method using a physical, geometric model of known dimensions and physical (wax) models designed to approximate perineal swellings in female macaques. To demonstrate the usefulness and potential of our method, we use it to estimate the volumes of Barbary macaque sexual swellings recorded in the field with video images. 
By analyzing both the artificial data and real monkey swellings, we validate our method's accuracy and illustrate its potential for application in important areas of biological research.

14.

An Integrated Approach for Analyzing Clinical Genomic Variant Data from Next-Generation Sequencing

Erin L. Crowgey Deborah L. Stabley Chuming Chen Hongzhan Huang Katherine M. Robbins Shawn W. Polson Katia Sol-Church Cathy H. Wu 《Journal of biomolecular techniques》2015,26(1):19-28

Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.

15.

A Photostable Green Fluorescent Protein Variant for Analysis of Protein Localization in Candida albicans

Chengda Zhang James B. Konopka 《Eukaryotic cell》2010,9(1):224-226

Fusions to the green fluorescent protein (GFP) are an effective way to monitor protein localization. However, altered codon usage in Candida species has delayed implementation of new variants. Examination of three new GFP variants in Candida albicans showed that one has higher signal intensity and increased resistance to photobleaching.

The human fungal pathogen Candida albicans can cause severe infections, particularly in immunocompromised patients. Important insights into its pathogenesis have been obtained by analyzing fusions to green fluorescent protein (GFP). Although GFP tagging has been very successful, many fusion proteins are not easily detected. New GFP variants with improved fluorescence and protein folding properties have been identified by genetic approaches in other organisms. However, these GFP variants have not been assessed in C. albicans and related species, presumably because of the added difficulties of attempting heterologous expression in C. albicans.

To adapt GFP for effective use in C. albicans, Cormack et al. introduced three types of codon changes: the S65G S72A mutations to enhance fluorescence; the CTG codon 201 change to TTG, since CUG is translated as Ser instead of Leu in C. albicans; and the optimization of the other codons for translation in C. albicans. This variant, known as YeGFP3, was introduced into convenient vectors for creating gene fusions in C. albicans. Another version of eGFP known as mut2 (S65A V68L S72A Q80R) was adapted for C. albicans by changing the CTG codon but without further codon optimization. These obstacles to heterologous expression in C. albicans have presumably delayed implementation of newer versions of GFP. Therefore, in this study three different GFP variants were introduced into YeGFP3 and examined for function in C. albicans.

The GFP variants were constructed using standard methods to introduce changes in the coding sequence of YeGFP3.
In brief, mutagenic oligonucleotides were used to prime PCR synthesis of a plasmid carrying YeGFP3, the template DNA was then destroyed by digestion with DpnI, and then the resulting DNA was transformed into Escherichia coli. DNA sequencing (carried out by the Stony Brook University DNA Sequencing Facility) confirmed that the correct substitutions were present. The mutant GFP genes were then released as PstI-AscI fragments and then were subcloned to replace the corresponding GFP fragment of plasmid pFa-GFP-URA3, which carries a PCR cassette module for creating GFP fusions in C. albicans. Because of the large number of changes, the mutants were given the more convenient names of CaGFPα (F64L S65T F99S M153T V163A), CaGFPβ (F64L S65T N149K M153T I167T; also known as emerald), and CaGFPγ (F64L S65C V163A I167T). The CaGFPγ was also introduced into vectors that contain selectable markers HIS1 and ARG4. DNA sequences used to design primers for creating GFP fusions in C. albicans were as follows: forward primer, 5′ (region of homology)-GGTGCTGGCGCAGGTGCTTC-3′, and reverse primer, 5′ (region of homology)-TCTGATATCATCGATGAATTCGAG-3′.

CDC11-GFP fusion genes were created in C. albicans by homologous recombination, as described previously. In brief, long oligonucleotide primers with homology to the 3′ end of the CDC11 open reading frame were used to prime PCR synthesis of each of the corresponding GFP variant genes plus an adjacent selectable marker gene (URA3). These DNA elements were then introduced into C. albicans cells and allowed to recombine with the homologous region of the CDC11 gene in C. albicans to create the CDC11-GFP fusion genes. Sequences used for the design of PCR primers to amplify the pFa-GFP plasmids are shown above. Cells carrying the indicated CDC11-GFP fusion gene were grown overnight in log phase in synthetic medium (yeast nitrogen base plus amino acids and dextrose).
Cdc11-GFP fluorescence intensity was analyzed with an Olympus BH2 microscope equipped with a Zeiss AxioCam camera run by Openlab software. The relative GFP signal was determined by measuring the intensity of GFP fluorescence of the septin ring and then subtracting the fluorescence of an area immediately adjacent to each ring. All samples were visualized under the same conditions.

Samples were prepared for Western blot analysis by resuspending cells in TNE lysis buffer (10 mM Tris base, 1 mM EDTA, 100 mM NaCl) with 100× protease mix (40 mg/ml pepstatin A, 40 mg/ml aprotinin, 20 mg/ml leupeptin) and then agitating in the presence of glass beads. The supernatant was collected after low-speed centrifugation at 3,000 rpm for 1 min, protein concentrations were determined by the bicinchoninic acid (BCA) protein assay (Pierce), and then equal amounts of protein extract were separated by gel electrophoresis and transferred to a Protran nitrocellulose membrane (Whatman GmbH). The blots were incubated with mouse anti-GFP (Millipore), rabbit anti-glucose-6-phosphate dehydrogenase (anti-G6PD; Sigma), or rabbit anti-Cdc11 (Santa Cruz Biotechnology) primary antibodies; washed; and then incubated with either goat anti-mouse IRDye 800cw or goat anti-rabbit IRDye 680 (Li-Cor Biosciences, Lincoln, NE). The immunoreactive proteins were visualized with a Li-Cor fluorescence scanner run by Odyssey software.

Three new GFP variants based on YeGFP3 were constructed by introducing mutations predicted to improve either the fluorescence properties or protein folding. Because multiple changes were introduced into each variant, they were given the more convenient names of CaGFPα, CaGFPβ, and CaGFPγ (see above). The key mutations in CaGFPα and CaGFPβ have been described previously, but CaGFPγ represents a novel combination of mutations. The 3 new GFP variants plus the YeGFP3 and mut2 versions were compared by fusing them to the C terminus of the Cdc11 septin protein.
The Cdc11 protein was selected because its restricted localization to the bud neck facilitated microscopic analysis and comparison of fluorescence properties. CDC11-GFP fusion genes were constructed in strain BWP17 using PCR-generated modules with a URA3 selectable marker, as described previously.

Cells were grown in synthetic medium overnight to log phase at both 30°C and 37°C, temperatures that are commonly used to propagate C. albicans and that may affect the folding properties of GFP. GFP fluorescence was then analyzed by quantifying the intensity of the septin rings in digital images (Fig. 1A). Septin rings were analyzed only if they were obviously in focus and at the same stage of the cell cycle (large budded). CaGFPγ gave a slightly stronger signal than the other variants, which was most obvious at 30°C (Fig. 1A). At least two independent clones were analyzed for each CDC11-GFP variant, and the two gave similar results (data not shown).

FIG. 1. Properties of Cdc11-GFP fusion proteins. Cells were grown to log phase overnight at the indicated temperature, and then Cdc11-GFP fluorescence was analyzed. (A) Signal intensity for the different versions of Cdc11-GFP was compared in three independent assays in which 50 septin rings per assay were quantified for each different Cdc11-GFP. The average fluorescence intensity was normalized to 100 for Cdc11-YeGFP3. The Cdc11-CaGFPγ variant gave a significantly stronger signal than the other variants (P < 0.001). (B) Western blot analysis comparing the levels of Cdc11-GFP produced in the indicated strains. The lane labeled “neg” refers to the negative-control strain (BWP17) that lacks GFP.
Blots were probed with anti-GFP to detect Cdc11-GFP, anti-glucose-6-phosphate dehydrogenase (αG6PD) as a control, and anti-Cdc11 to detect the untagged version of Cdc11.

The levels of the Cdc11-GFP proteins at both 30°C and 37°C were compared on two independent Western blots using anti-GFP antibody (Fig. 1B). The relative levels of Cdc11-mut2GFP and Cdc11-CaGFPα were the lowest, consistent with their lower fluorescence intensity. The lower levels of Cdc11-mut2GFP are consistent with the fact that the codons in the mut2 version of GFP were not optimized for expression in C. albicans. The Cdc11-YeGFP3 and Cdc11-CaGFPγ were present at higher levels, and the Cdc11-CaGFPβ was produced at even slightly higher levels, consistent with reports that this latter version of GFP (also known as emerald) has improved folding properties. The Cdc11-GFP variants did not affect the production of the untagged Cdc11 protein (Fig. 1B).

Photobleaching is also an important factor for GFP, especially in time-lapse studies or Z-stack analysis of different optical sections of cells. Photostability of the GFP variants was examined by taking pictures at 4-s intervals during 1 min of continuous exposure to the fluorescence excitation lamp (Fig. 2A and B). The fluorescence of YeGFP3, mut2GFP, and CaGFPα fused to Cdc11 decayed to 50% of original intensity within 15 to 30 s, and the rate of photobleaching was even higher for CaGFPβ. In contrast, Cdc11-CaGFPγ showed extended photostability at both 30°C and 37°C (half-life [t_1/2] of ∼2 min). Similar results were also obtained for CaGFPγ fused to the Golgi protein Vrg4 (data not shown), although the standard deviations were larger because the mobile Golgi compartments frequently moved out of the focal plane during the time course (data not shown). On a practical level, the Cdc11-GFPγ fluorescence was readily detectable after several minutes of continuous exposure (Fig. 2C), demonstrating its clear advantage for allowing more time to observe protein localization before photobleaching becomes significant.

FIG. 2. Photostability of GFP variants. (A and B) Relative fluorescence intensity of the GFP variants at 4-s intervals over a time course of 1 min of continuous exposure to the fluorescence excitation lamp after growth at 30°C (A) and at 37°C (B). CaGFPγ showed the best photostability (t_1/2 of ∼2 min). The relative fluorescence was normalized to 100 for each Cdc11-GFP variant at the start of the time course. The results represent the average of three independent assays in which three septin rings were analyzed for each mutant. Error bars indicate standard deviations. (C) Cells carrying Cdc11 fused to YeGFP3 or CaGFPγ were continuously exposed to the fluorescence excitation lamp, and then images of septin rings were captured at the indicated times.

Altogether, Cdc11-CaGFPγ had the best overall properties based on protein levels, signal intensity, and photostability in C. albicans. The higher level of Cdc11-CaGFPβ production was apparently offset by increased photobleaching, resulting in no overall advantage for this variant. The Cdc11-CaGFPα was produced at relatively low levels, and it was less photostable compared to the other versions. Thus, CaGFPγ is a novel GFP variant that offers improved features for the study of protein localization in C. albicans and will likely also be useful for expression in other species.
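The photobleaching half-lives quoted in the article can be illustrated with a short calculation. The sketch below assumes single-exponential decay and fits a straight line to the logarithm of background-subtracted intensity; the article does not say how its t_1/2 values were derived, so this is one plausible approach, and the data points are synthetic.

```python
# Estimate a photobleaching half-life from an intensity time course,
# assuming single-exponential decay (an assumption, not the article's method).
import math


def half_life(times_s, intensities):
    """Least-squares fit of ln(intensity) against time; returns t_1/2 in seconds."""
    logs = [math.log(value) for value in intensities]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    covariance = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    variance = sum((t - mean_t) ** 2 for t in times_s)
    slope = covariance / variance  # equals -k for I(t) = I0 * exp(-k * t)
    return -math.log(2) / slope


# Synthetic time course sampled every 4 s for 1 min, normalized to 100 at t = 0,
# generated with a true half-life of 20 s.
times = list(range(0, 61, 4))
signal = [100.0 * math.exp(-math.log(2) * t / 20.0) for t in times]
print(f"estimated half-life: {half_life(times, signal):.1f} s")
```

Intensities must be positive after background subtraction for the logarithm to be defined, and a flat or rising trace has no finite half-life under this model. For a slowly bleaching variant, a 1-min window covers less than one half-life, so the estimate is an extrapolation.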

16.

A New Variant of Human Intersex with Discussion on the Developmental Aspects

Prabhaker N. Shah S. N. Naik D. K. Mahajan M. J. Dave J. C. Paymaster 《BMJ (Clinical research ed.)》1961,2(5250):474-477

17.

A Technique for the Analysis of Unbalanced Data

Kuan P. Singe Umed Singh 《Biometrical journal. Biometrische Zeitschrift》1989,31(1):1-17

Regression methods with dummy variables have been shown to be effective in preventing confusion in the analysis of linear models. In particular, this approach simplifies interpretation of parameters and clarifies hypothesis statements. All existing methods are shown to be special cases of the general linear hypothesis in a regression setting. Three regression-on-dummy-variables methods are examined critically to bring out the salient features of each method. The choice of a method should be based on how the parameters are to be defined. The linear models are considered in a regression model setting. This is done by defining appropriate dummy variables in a regression model, which is often desirable, if not mandatory, when dealing with unbalanced data involving two or more factors.

18.

VertNet: A New Model for Biodiversity Data Sharing

Heather Constable Robert Guralnick John Wieczorek Carol Spencer A. Townsend Peterson The VertNet Steering Committee 《PLoS biology》2010,8(2)

19.

A Genetic Analysis of Sectoring in Ultraviolet-Induced Variant Colonies of Yeast

James AP 《Genetics》1955,40(2):204-213

20.

Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

Shunichi Kosugi Satoshi Natsume Kentaro Yoshida Daniel MacLean Liliana Cano Sophien Kamoun Ryohei Terauchi 《PloS one》2013,8(10)

Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
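The general idea of discarding reads that still carry too many mismatches can be sketched in a few lines. This is not Coval's algorithm: it handles only ungapped alignments, skips realignment and error correction entirely, and the function names, data layout, and threshold are hypothetical.

```python
# Toy mismatch filter for ungapped read alignments (illustrative only; the
# names and the default threshold are hypothetical, not taken from Coval).

def count_mismatches(read, reference, pos):
    """Count positions where the read differs from the reference at offset pos."""
    window = reference[pos:pos + len(read)]
    return sum(1 for read_base, ref_base in zip(read, window) if read_base != ref_base)


def filter_reads(alignments, reference, max_mismatches=2):
    """Keep (read, position) pairs with at most max_mismatches mismatches."""
    return [(read, pos) for read, pos in alignments
            if count_mismatches(read, reference, pos) <= max_mismatches]


reference = "ACGTACGTAC"
alignments = [("ACGT", 0), ("ACGA", 0), ("TTTT", 4)]
kept = filter_reads(alignments, reference)
print(kept)
```

Bases that overhang the end of the reference are ignored here rather than counted as mismatches. A real pipeline would also need to distinguish sequencing errors from true variants, which is the role the abstract assigns to base quality and allele frequency.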