期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Influence functions and outlier detection under the common principal components model: A robust approach 总被引：4，自引：0，他引：4

Boente Graciela; Pires Ana M.; Rodrigues Isabel M. 《Biometrika》2002,89(4):861-875

相似文献

2.

PEDSTATS: descriptive statistics, graphics and quality assessment for gene mapping data 总被引：16，自引：0，他引：16

Wigginton JE Abecasis GR 《Bioinformatics (Oxford, England)》2005,21(16):3445-3447

We describe a tool that produces summary statistics and basic quality assessments for gene-mapping data, accommodating either pedigree or case-control datasets. Our tool can also produce graphic output in the PDF format. 相似文献

3.

Alternative approaches to the analysis of comparative data: compare and contrast

Susan J. Mazer 《American journal of botany》1998,85(8):1194-1199

相似文献

4.

RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data

Washietl S Findeiss S Müller SA Kalkhof S von Bergen M Hofacker IL Stadler PF Goldman N 《RNA (New York, N.Y.)》2011,17(4):578-594

相似文献

5.

A general, robust method for the quality control of intact proteins using LC-ESI-MS

Sundqvist G Stenvall M Berglund H Ottosson J Brumer H 《Journal of chromatography. B, Analytical technologies in the biomedical and life sciences》2007,852(1-2):188-194

A simple and robust method for the routine quality control of intact proteins based on liquid chromatography coupled to electrospray ionization mass spectrometry (LC-ESI-MS) is presented. A wide range of prokaryotic and eukaryotic proteins expressed recombinantly in Escherichia coli or Pichia pastoris has been analyzed with medium- to high-throughput with on-line desalting from multi-well sample plates. Particular advantages of the method include fast chromatography and short cycle times, the use of inexpensive trapping/desalting columns, low sample carryover, and the ability to analyze proteins with masses ranging from 5 to 100 kDa with greater than 50 ppm accuracy. Moreover, the method can be readily coupled with optimized chemical reduction and alkylation steps to facilitate the analysis of denatured or incorrectly folded proteins (e.g., recombinant proteins sequestered in E. coli inclusion bodies) bearing cysteine residues, which otherwise form intractable multimers and non-specific adducts by disulfide bond formation. 相似文献

6.

Quick and simple: quality control of microarray data

Sauer U Preininger C Hany-Schmatzberger R 《Bioinformatics (Oxford, England)》2005,21(8):1572-1578

MOTIVATION: Microarrays are high-throughput tools for parallel miniaturized detection of biomolecules. In contrast to experiments using ratios of signals in two channels, experiments with only one fluorescent dye cause special problems for data analysis. The present work compares algorithms for quality filtering on spot level as well as array/slide level. RESULTS: Methods for quantitative spot filtering are discussed and new sets of quality scores for data preprocessing are designed. As measures of spot quality also reflect the quality of protocols, they were employed to find the optimal print buffer in an optimization experiment. In order to determine problematic arrays within a set of replicates we tested methods of outlier detection which can suitably replace the visual inspection of slides. CONTACT: Ursula.Sauer@arcs.ac.at. 相似文献

7.

Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation 总被引：29，自引：1，他引：29

下载免费PDF全文

Yang YH Dudoit S Luu P Lin DM Peng V Ngai J Speed TP 《Nucleic acids research》2002,30(4):e15

There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensity-dependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides. 相似文献

8.

An automatic MRI quality control procedure: Multisite reports for slice thickness and geometric accuracy

A. Sewonu G. Hossu J. Felblinger R. Anxionnat C. Pasquier 《IRBM》2013,34(4-5):300-305

In this work, we report multi-scanner quality control monitoring using a standard and automated protocol. This magnetic resonance imaging quality control protocol, based on the American College of Radiology procedures, includes weekly scans of a dedicated phantom followed by specific measurements. The processing step commonly involves manually-performed operations which can be tedious and time-consuming hence motivating their automation. QC data were collected in four sites; data from one of them served for the validation of the automatic analysis tool. Designed as a package of MATLAB^® functions, this tool was successfully validated using Student's t-test and the correlation between automatic measurements and the manual ones. Besides, the multisite QC study enabled to compare the performances of these four MR facilities. In order to avoid misinterpretation or errors in multicenter clinical studies, such approach can be recommended as a preliminary step for including a site in the studies. 相似文献

9.

A systematic comparison of three structure determination methods from NMR data: Dependence upon quality and quantity of data

Yajun Liu Daqing Zhao Russ Altman Oleg Jardetzky 《Journal of biomolecular NMR》1992,2(4):373-388

Summary We have systematically examined how the quality of NMR protein structures depends on (1) the number of NOE distance constraints. (2) their assumed precision, (3) the method of structure calculation and (4) the size of the protein. The test sets of distance constraints have been derived from the crystal structures of crambin (5 kDa) and staphylococcal nuclease (17 kDa). Three methods of structure calculation have been compared: Distance Geometry (DGEOM), Restrained Molecular Dynamics (XPLOR) and the Double Iterated Kalman Filter (DIKF). All three methods can reproduce the general features of the starting structure under all conditions tested. In many instances the apparent precision of the calculated structure (as measured by the RMS dispersion from the average) is greater than its accuracy (as measured by the RMS deviation of the average structure from the starting crystal structure). The global RMS deviations from the reference structures decrease exponentially as the number of constraints is increased, and after using about 30% of all potential constraints, the crrors asymptotically approach a limiting value. Increasing the assumed precision of the constraints has the same qualitative effect as increasing the number of constraints. For comparable numbers of constraints/residue, the precision of the calculated structure is less for the larger than for the smaller protein, regardless of the method of calculation. The accuracy of the average structure calculated by Restrained Molecular Dynamics is greater than that of structures obtained by purely geometric methods (DGEOM and DIKF). 相似文献

10.

A comparison of three estimators of the population-scaled recombination rate: accuracy and robustness 总被引：5，自引：0，他引：5

下载免费PDF全文

Smith NG Fearnhead P 《Genetics》2005,171(4):2051-2062

We have performed simulations to assess the performance of three population genetics approximate-likelihood methods in estimating the population-scaled recombination rate from sequence data. We measured performance in two ways: accuracy when the sequence data were simulated according to the (simplistic) standard model underlying the methods and robustness to violations of many different aspects of the standard model. Although we found some differences between the methods, performance tended to be similar for all three methods. Despite the fact that the methods are not robust to violations of the underlying model, our simulations indicate that patterns of relative recombination rates should be inferred reasonably well even if the standard model does not hold. In addition, we assess various techniques for improving the performance of approximate-likelihood methods. In particular we find that the composite-likelihood method of Hudson (2001) can be improved by including log-likelihood contributions only for pairs of sites that are separated by some prespecified distance. 相似文献

11.

A comparison of hazard rate estimators for left truncated and right censored data 总被引：1，自引：0，他引：1

UZUNOG{macron}ULLARI ULKU; WANG JANE-LING 《Biometrika》1992,79(2):297-310

相似文献

12.

QoRTs: a comprehensive toolset for quality control and data processing of RNA-Seq experiments

Stephen W. Hartley James C. Mullikin 《BMC bioinformatics》2015,16(1)

相似文献

13.

Protocols for the assurance of microarray data quality and process control 总被引：3，自引：0，他引：3

Burgoon LD Eckel-Passow JE Gennings C Boverhof DR Burt JW Fong CJ Zacharewski TR 《Nucleic acids research》2005,33(19):e172

Microarrays represent a powerful technology that provides the ability to simultaneously measure the expression of thousands of genes. However, it is a multi-step process with numerous potential sources of variation that can compromise data analysis and interpretation if left uncontrolled, necessitating the development of quality control protocols to ensure assay consistency and high-quality data. In response to emerging standards, such as the minimum information about a microarray experiment standard, tools are required to ascertain the quality and reproducibility of results within and across studies. To this end, an intralaboratory quality control protocol for two color, spotted microarrays was developed using cDNA microarrays from in vivo and in vitro dose-response and time-course studies. The protocol combines: (i) diagnostic plots monitoring the degree of feature saturation, global feature and background intensities, and feature misalignments with (ii) plots monitoring the intensity distributions within arrays with (iii) a support vector machine (SVM) model. The protocol is applicable to any laboratory with sufficient datasets to establish historical high- and low-quality data. 相似文献

14.

Multiple observers, humidity, and choice of precision statistics: factors influencing craniometric data quality

C J Utermohle S L Zegura G M Heathcote 《American journal of physical anthropology》1983,61(1):85-95

This study investigates three topics: (1) interobserver measurement error in craniometry, (2) the effects of humidity on craniometric measurements, and (3) the current status of estimators of measurement precision in craniometry and anthropometry. The results of the three-observer error analysis based on 24 linear measurements taken on 47 crania indicate that minor idiosyncratic variations in measurement technique can lead to high levels of statistical discrimination among the data produced by the different observers. The results of the humidity experiment substantiate the contention that increasing levels of relative humidity are associated with cranial expansion. The results of the comparison of 11 univariate precision estimators suggest that the combination of percentage agreement, the mean absolute difference, and Fisher's nonparametric sign test can give an instructive picture of the frequency, magnitude, and directionality of measurement imprecision. Information on the comparability of technique and measurement precision can then be used in the variable selection process prior to the application of multivariate statistical procedures to strengthen the substantive interpretation of craniometric data. 相似文献

15.

Comparison of 'model-free' and 'model-based' linkage statistics in the presence of locus heterogeneity: single data set and multiple data set applications

Huang J Vieland VJ 《Human heredity》2001,51(4):217-225

Earlier work [Knapp et al.: Hum Hered 1994;44:44-51] focusing on affected sib pair (ASP) data established the equivalence between the mean test and a test based on a simple recessive lod score, as well as equivalences between certain forms of the maximum likelihood score (MLS) statistic [Risch: Am J Hum Genet 1990;46:242-253] and particular forms of the lod score. Here we extend the results of Knapp et al. [1994] by reconsidering these equivalences for ASP data, but in the presence of locus heterogeneity. We show that Risch's MLS statistic under the possible triangle constraints [Holmans: Am J Hum Genet 1993;52:362-374] is locally equivalent to the ordinary heterogeneity lod score assuming a simple recessive model (HLOD/R); while the one-parameter MLS assuming no dominance variance is locally equivalent to the (homogeneity) recessive lod. The companion paper (this issue, pp 199-208) showed that when considering multiple data sets in the presence of locus heterogeneity, the HLOD can suffer appreciable losses in power. We show here that in ASP data, these equivalences ensure that this same loss in power is incurred by both forms of the MLS statistic as well. The companion paper also introduced an adaptation of the lod, the compound lod score (HLOD/C). We confirm that the HLOD/C maintains higher power than these 'model-free' methods when applied to multiple heterogeneous data sets, even when it is calculated assuming the wrong genetic model. 相似文献

16.

BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing 总被引：11，自引：0，他引：11

Bock C Reither S Mikeska T Paulsen M Walter J Lengauer T 《Bioinformatics (Oxford, England)》2005,21(21):4067-4068

SUMMARY: Manual processing of DNA methylation data from bisulfite sequencing is a tedious and error-prone task. Here we present an interactive software tool that provides start-to-end support for this process. In an easy-to-use manner, the tool helps the user to import the sequence files from the sequencer, to align them, to exclude or correct critical sequences, to document the experiment, to perform basic statistics and to produce publication-quality diagrams.Emphasis is put on quality control: The program automatically assesses data quality and provides warnings and suggestions for dealing with critical sequences. The BiQ Analyzer program is implemented in the Java programming language and runs on any platform for which a recent Java virtual machine is available. AVAILABILITY: The program is available without charge for non-commercial users and can be downloaded from http://biq-analyzer.bioinf.mpi-inf.mpg.de/ 相似文献

17.

Statistical and graphical methods for quality control determination of high-throughput screening data

Gunter B Brideau C Pikounis B Liaw A 《Journal of biomolecular screening》2003,8(6):624-633

High-throughput screening (HTS) is used in modern drug discovery to screen hundreds of thousands to millions of compounds on selected protein targets. It is an industrial-scale process relying on sophisticated automation and state-of-the-art detection technologies. Quality control (QC) is an integral part of the process and is used to ensure good quality data and mini mize assay variability while maintaining assay sensitivity. The authors describe new QC methods and show numerous real examples from their biologist-friendly Stat Server HTS application, a custom-developed software tool built from the commercially available S-PLUS and Stat Server statistical analysis and server software. This system remotely processes HTS data using powerful and sophisticated statistical methodology but insulates users from the technical details by outputting results in a variety of readily interpretable graphs and tables. It allows users to visualize HTS data and examine assay performance during the HTS campaign to quickly react to or avoid quality problems. 相似文献

18.

NGS QC Toolkit: a toolkit for quality control of next generation sequencing data 总被引：4，自引：0，他引：4

Patel RK Jain M 《PloS one》2012,7(2):e30619

Next generation sequencing (NGS) technologies provide a high-throughput means to generate large amount of sequence data. However, quality control (QC) of sequence data generated from these technologies is extremely important for meaningful downstream analysis. Further, highly efficient and fast processing tools are required to handle the large volume of datasets. Here, we have developed an application, NGS QC Toolkit, for quality check and filtering of high-quality data. This toolkit is a standalone and open source application freely available at http://www.nipgr.res.in/ngsqctoolkit.html. All the tools in the application have been implemented in Perl programming language. The toolkit is comprised of user-friendly tools for QC of sequencing data generated using Roche 454 and Illumina platforms, and additional tools to aid QC (sequence format converter and trimming tools) and analysis (statistics tools). A variety of options have been provided to facilitate the QC at user-defined parameters. The toolkit is expected to be very useful for the QC of NGS data to facilitate better downstream analysis. 相似文献

19.

AfterQC: automatic filtering,trimming, error removing and quality control for fastq data

Shifu Chen Tanxiao Huang Yanqing Zhou Yue Han Mingyan Xu Jia Gu 《BMC bioinformatics》2017,18(3):80

Background

Some applications, especially those clinical applications requiring high accuracy of sequencing data, usually have to face the troubles caused by unavoidable sequencing errors. Several tools have been proposed to profile the sequencing quality, but few of them can quantify or correct the sequencing errors. This unmet requirement motivated us to develop AfterQC, a tool with functions to profile sequencing errors and correct most of them, plus highly automated quality control and data filtering features. Different from most tools, AfterQC analyses the overlapping of paired sequences for pair-end sequencing data. Based on overlapping analysis, AfterQC can detect and cut adapters, and furthermore it gives a novel function to correct wrong bases in the overlapping regions. Another new feature is to detect and visualise sequencing bubbles, which can be commonly found on the flowcell lanes and may raise sequencing errors. Besides normal per cycle quality and base content plotting, AfterQC also provides features like polyX (a long sub-sequence of a same base X) filtering, automatic trimming and K-MER based strand bias profiling.

Results

For each single or pair of FastQ files, AfterQC filters out bad reads, detects and eliminates sequencer’s bubble effects, trims reads at front and tail, detects the sequencing errors and corrects part of them, and finally outputs clean data and generates HTML reports with interactive figures. AfterQC can run in batch mode with multiprocess support, it can run with a single FastQ file, a single pair of FastQ files (for pair-end sequencing), or a folder for all included FastQ files to be processed automatically. Based on overlapping analysis, AfterQC can estimate the sequencing error rate and profile the error transform distribution. The results of our error profiling tests show that the error distribution is highly platform dependent.

Conclusion

Much more than just another new quality control (QC) tool, AfterQC is able to perform quality control, data filtering, error profiling and base correction automatically. Experimental results show that AfterQC can help to eliminate the sequencing errors for pair-end sequencing data to provide much cleaner outputs, and consequently help to reduce the false-positive variants, especially for the low-frequency somatic mutations. While providing rich configurable options, AfterQC can detect and set all the options automatically and require no argument in most cases.

相似文献

20.

Swine gene banking: a quality control perspective on collection, and analysis of samples for a national repository

Purdy PH 《Theriogenology》2008,70(8):1304-1309

The National Animal Germplasm Program (NAGP) is developing a national repository for germplasm (semen, oocytes, embryos, blood, DNA, tissue) for all agricultural species in the US. Currently, the swine collection consists of 127,479 samples from 886 boars representing 20 major, minor and composite populations. Cryopreservation per se is not an impediment to program success. Rather, the greatest difficulties encountered are in determining the quality of the samples pre- and post-thaw. Robust, broadly applicable, and cost effective quality control methodologies need to be developed and implemented. This overview of the NAGP will discuss the approaches used for cryopreserving boar semen samples, overcoming the challenges of assessing sample quality, and moving toward a quality control strategy. 相似文献