期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A Laboratory Information Management System (LIMS) for a high throughput genetic platform aimed at candidate gene mutation screening

Voegele C Tavtigian SV de Silva D Cuber S Thomas A Le Calvez-Kelm F 《Bioinformatics (Oxford, England)》2007,23(18):2504-2506

High throughput mutation screening in an automated environment generates large data sets that have to be organized and stored reliably. Complex multistep workflows require strict process management and careful data tracking. We have developed a Laboratory Information Management Systems (LIMS) tailored to high throughput candidate gene mutation scanning and resequencing that respects these requirements. Designed with a client/server architecture, our system is platform independent and based on open-source tools from the database to the web application development strategy. Flexible, expandable and secure, the LIMS, by communicating with most of the laboratory instruments and robots, tracks samples and laboratory information, capturing data at every step of our automated mutation screening workflow. An important feature of our LIMS is that it enables tracking of information through a laboratory workflow where the process at one step is contingent on results from a previous step. AVAILABILITY: Script for MySQL database table creation and source code of the whole JSP application are freely available on our website: http://www-gcs.iarc.fr/lims/. SUPPLEMENTARY INFORMATION: System server configuration, database structure and additional details on the LIMS and the mutation screening workflow are available on our website: http://www-gcs.iarc.fr/lims/ 相似文献

2.

TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

Robert Schmieder Yan Wei Lim Forest Rohwer Robert Edwards 《BMC bioinformatics》2010,11(1):1-14

Background

Shared-usage high throughput screening (HTS) facilities are becoming more common in academe as large-scale small molecule and genome-scale RNAi screening strategies are adopted for basic research purposes. These shared facilities require a unique informatics infrastructure that must not only provide access to and analysis of screening data, but must also manage the administrative and technical challenges associated with conducting numerous, interleaved screening efforts run by multiple independent research groups.

Results

We have developed Screensaver, a free, open source, web-based lab information management system (LIMS), to address the informatics needs of our small molecule and RNAi screening facility. Screensaver supports the storage and comparison of screening data sets, as well as the management of information about screens, screeners, libraries, and laboratory work requests. To our knowledge, Screensaver is one of the first applications to support the storage and analysis of data from both genome-scale RNAi screening projects and small molecule screening projects.

Conclusions

The informatics and administrative needs of an HTS facility may be best managed by a single, integrated, web-accessible application such as Screensaver. Screensaver has proven useful in meeting the requirements of the ICCB-Longwood/NSRB Screening Facility at Harvard Medical School, and has provided similar benefits to other HTS facilities. 相似文献

3.

MASTR-MS: a web-based collaborative laboratory information management system (LIMS) for metabolomics

Adam Hunter Saravanan Dayalan David De Souza Brad Power Rodney Lorrimar Tamas Szabo Thu Nguyen Sean O’Callaghan Jeremy Hack James Pyke Amsha Nahid Roberto Barrero Ute Roessner Vladimir Likic Dedreia Tull Antony Bacic Malcolm McConville Matthew Bellgard 《Metabolomics : Official journal of the Metabolomic Society》2017,13(2):14

Background

An increasing number of research laboratories and core analytical facilities around the world are developing high throughput metabolomic analytical and data processing pipelines that are capable of handling hundreds to thousands of individual samples per year, often over multiple projects, collaborations and sample types. At present, there are no Laboratory Information Management Systems (LIMS) that are specifically tailored for metabolomics laboratories that are capable of tracking samples and associated metadata from the beginning to the end of an experiment, including data processing and archiving, and which are also suitable for use in large institutional core facilities or multi-laboratory consortia as well as single laboratory environments.

Results

Here we present MASTR-MS, a downloadable and installable LIMS solution that can be deployed either within a single laboratory or used to link workflows across a multisite network. It comprises a Node Management System that can be used to link and manage projects across one or multiple collaborating laboratories; a User Management System which defines different user groups and privileges of users; a Quote Management System where client quotes are managed; a Project Management System in which metadata is stored and all aspects of project management, including experimental setup, sample tracking and instrument analysis, are defined, and a Data Management System that allows the automatic capture and storage of raw and processed data from the analytical instruments to the LIMS.

Conclusion

MASTR-MS is a comprehensive LIMS solution specifically designed for metabolomics. It captures the entire lifecycle of a sample starting from project and experiment design to sample analysis, data capture and storage. It acts as an electronic notebook, facilitating project management within a single laboratory or a multi-node collaborative environment. This software is being developed in close consultation with members of the metabolomics research community. It is freely available under the GNU GPL v3 licence and can be accessed from, https://muccg.github.io/mastr-ms/.

相似文献

4.

Predictive modeling of plant messenger RNA polyadenylation sites

Guoli Ji Jianti Zheng Yingjia Shen Xiaohui Wu Ronghan Jiang Yun Lin Johnny C Loke Kimberly M Davis Greg J Reese Qingshun Quinn Li 《BMC bioinformatics》2007,8(1):1-15

Background

Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable.

Results

We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions.

Conclusion

The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments. 相似文献

5.

ProteinTracker: an application for managing protein production and purification

Stefan C Ponko David Bienvenue 《BMC research notes》2012,5(1):224

相似文献

6.

Auto-validation of fluorescent primer extension genotyping assay using signal clustering and neural networks

Ching Yu Austin Huang Joel Studebaker Anton Yuryev Jianping Huang Kathryn E Scott Jennifer Kuebler Shobha Varde Steven Alfisi Craig A Gelfand Mark Pohl Michael T Boyce-Jacino 《BMC bioinformatics》2004,5(1):1-10

Background

SNP genotyping typically incorporates a review step to ensure that the genotype calls for a particular SNP are correct. For high-throughput genotyping, such as that provided by the GenomeLab SNPstream^® instrument from Beckman Coulter, Inc., the manual review used for low-volume genotyping becomes a major bottleneck. The work reported here describes the application of a neural network to automate the review of results.

Results

We describe an approach to reviewing the quality of primer extension 2-color fluorescent reactions by clustering optical signals obtained from multiple samples and a single reaction set-up. The method evaluates the quality of the signal clusters from the genotyping results. We developed 64 scores to measure the geometry and position of the signal clusters. The expected signal distribution was represented by a distribution of a 64-component parametric vector obtained by training the two-layer neural network onto a set of 10,968 manually reviewed 2D plots containing the signal clusters.

Conclusion

The neural network approach described in this paper may be used with results from the GenomeLab SNPstream instrument for high-throughput SNP genotyping. The overall correlation with manual revision was 0.844. The approach can be applied to a quality review of results from other high-throughput fluorescent-based biochemical assays in a high-throughput mode. 相似文献

7.

Genetic dissection of drought tolerance in chickpea (Cicer arietinum L.) 总被引：1，自引：0，他引：1

Rajeev K. Varshney Mahendar Thudi Spurthi N. Nayak Pooran M. Gaur Junichi Kashiwagi Lakshmanan Krishnamurthy Deepa Jaganathan Jahnavi Koppolu Abhishek Bohra Shailesh Tripathi Abhishek Rathore Aravind K. Jukanti Veera Jayalakshmi Anilkumar Vemula S. J. Singh Mohammad Yasin M. S. Sheshshayee K. P. Viswanatha 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2014,127(2):445-462

Key message

Analysis of phenotypic data for 20 drought tolerance traits in 1–7 seasons at 1–5 locations together with genetic mapping data for two mapping populations provided 9 QTL clusters of which one present on CaLG04 has a high potential to enhance drought tolerance in chickpea improvement.

Abstract

Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in the arid and semi-arid regions of the world. Drought is one of the major constraints leading up to 50 % production losses in chickpea. In order to dissect the complex nature of drought tolerance and to use genomics tools for enhancing yield of chickpea under drought conditions, two mapping populations—ICCRIL03 (ICC 4958 × ICC 1882) and ICCRIL04 (ICC 283 × ICC 8261) segregating for drought tolerance-related root traits were phenotyped for a total of 20 drought component traits in 1–7 seasons at 1–5 locations in India. Individual genetic maps comprising 241 loci and 168 loci for ICCRIL03 and ICCRIL04, respectively, and a consensus genetic map comprising 352 loci were constructed (http://cmap.icrisat.ac.in/cmap/sm/cp/varshney/). Analysis of extensive genotypic and precise phenotypic data revealed 45 robust main-effect QTLs (M-QTLs) explaining up to 58.20 % phenotypic variation and 973 epistatic QTLs (E-QTLs) explaining up to 92.19 % phenotypic variation for several target traits. Nine QTL clusters containing QTLs for several drought tolerance traits have been identified that can be targeted for molecular breeding. Among these clusters, one cluster harboring 48 % robust M-QTLs for 12 traits and explaining about 58.20 % phenotypic variation present on CaLG04 has been referred as “QTL-hotspot”. This genomic region contains seven SSR markers (ICCM0249, NCPGR127, TAA170, NCPGR21, TR11, GA24 and STMS11). Introgression of this region into elite cultivars is expected to enhance drought tolerance in chickpea. 相似文献

8.

High accuracy genotyping directly from genomic DNA using a rolling circle amplification based assay

Alsmadi OA Bornarth CJ Song W Wisniewski M Du J Brockman JP Faruqi AF Hosono S Sun Z Du Y Wu X Egholm M Abarzúa P Lasken RS Driscoll MD 《BMC genomics》2003,4(1):21-18

Background

Rolling circle amplification of ligated probes is a simple and sensitive means for genotyping directly from genomic DNA. SNPs and mutations are interrogated with open circle probes (OCP) that can be circularized by DNA ligase when the probe matches the genotype. An amplified detection signal is generated by exponential rolling circle amplification (ERCA) of the circularized probe. The low cost and scalability of ligation/ERCA genotyping makes it ideally suited for automated, high throughput methods.

Results

A retrospective study using human genomic DNA samples of known genotype was performed for four different clinically relevant mutations: Factor V Leiden, Factor II prothrombin, and two hemochromatosis mutations, C282Y and H63D. Greater than 99% accuracy was obtained genotyping genomic DNA samples from hundreds of different individuals. The combined process of ligation/ERCA was performed in a single tube and produced fluorescent signal directly from genomic DNA in less than an hour. In each assay, the probes for both normal and mutant alleles were combined in a single reaction. Multiple ERCA primers combined with a quenched-peptide nucleic acid (Q-PNA) fluorescent detection system greatly accellerated the appearance of signal. Probes designed with hairpin structures reduced misamplification. Genotyping accuracy was identical from either purified genomic DNA or genomic DNA generated using whole genome amplification (WGA). Fluorescent signal output was measured in real time and as an end point.

Conclusions

Combining the optimal elements for ligation/ERCA genotyping has resulted in a highly accurate single tube assay for genotyping directly from genomic DNA samples. Accuracy exceeded 99 % for four probe sets targeting clinically relevant mutations. No genotypes were called incorrectly using either genomic DNA or whole genome amplified sample. 相似文献

9.

KniMet: a pipeline for the processing of chromatography–mass spectrometry metabolomics data

Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need of multiple software packages for each step of the processing workflow.

Objectives

Merge in the same platform the steps required for metabolomics data processing.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

相似文献

10.

SNP high-throughput screening in grapevine using the SNPlex™ genotyping system

Massimo Pindo Silvia Vezzulli Giuseppina Coppola Dustin A Cartwright Andrey Zharkikh Riccardo Velasco Michela Troggio 《BMC plant biology》2008,8(1):1-6

Background

Until recently, only a small number of low- and mid-throughput methods have been used for single nucleotide polymorphism (SNP) discovery and genotyping in grapevine (Vitis vinifera L.). However, following completion of the sequence of the highly heterozygous genome of Pinot Noir, it has been possible to identify millions of electronic SNPs (eSNPs) thus providing a valuable source for high-throughput genotyping methods.

Results

Herein we report the first application of the SNPlex? genotyping system in grapevine aiming at the anchoring of an eukaryotic genome. This approach combines robust SNP detection with automated assay readout and data analysis. 813 candidate eSNPs were developed from non-repetitive contigs of the assembled genome of Pinot Noir and tested in 90 progeny of Syrah × Pinot Noir cross. 563 new SNP-based markers were obtained and mapped. The efficiency rate of 69% was enhanced to 80% when multiple displacement amplification (MDA) methods were used for preparation of genomic DNA for the SNPlex assay.

Conclusion

Unlike other SNP genotyping methods used to investigate thousands of SNPs in a few genotypes, or a few SNPs in around a thousand genotypes, the SNPlex genotyping system represents a good compromise to investigate several hundred SNPs in a hundred or more samples simultaneously. Therefore, the use of the SNPlex assay, coupled with whole genome amplification (WGA), is a good solution for future applications in well-equipped laboratories. 相似文献

11.

'PACLIMS': A component LIM system for high-throughput functional genomic analysis

Nicole Donofrio Ravi Rajagopalon Douglas Brown Stephen Diener Donald Windham Shelly Nolin Anna Floyd Thomas Mitchell Natalia Galadima Sara Tucker Marc J Orbach Gayatri Patel Mark Farman Vishal Pampanwar Cari Soderlund Yong-Hwan Lee Ralph A Dean 《BMC bioinformatics》2005,6(1):1-7

Background

Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the ~11,000 predicted genes in the rice blast fungus (Magnaporthe grisea), an effective platform for tracking and storing both the biological materials created and the data produced across several participating institutions was required.

Results

The platform designed, named PACLIMS, was built to support our high throughput pipeline for generating 50,000 random insertion mutants of Magnaporthe grisea. To be a useful tool for materials and data tracking and storage, PACLIMS was designed to be simple to use, modifiable to accommodate refinement of research protocols, and cost-efficient. Data entry into PACLIMS was simplified through the use of barcodes and scanners, thus reducing the potential human error, time constraints, and labor. This platform was designed in concert with our experimental protocol so that it leads the researchers through each step of the process from mutant generation through phenotypic assays, thus ensuring that every mutant produced is handled in an identical manner and all necessary data is captured.

Conclusion

Many sequenced eukaryotes have reached the point where computational analyses are no longer sufficient and require biological support for their predicted genes. Consequently, there is an increasing need for platforms that support high throughput genome-wide mutational analyses. While PACLIMS was designed specifically for this project, the source and ideas present in its implementation can be used as a model for other high throughput mutational endeavors. 相似文献

12.

Fourmidable: a database for ant genomics 总被引：1，自引：0，他引：1

Yannick Wurm Paolo Uva Frédéric Ricci John Wang Stephanie Jemielity Christian Iseli Laurent Falquet Laurent Keller 《BMC genomics》2009,10(1):1-5

相似文献

13.

Genotyping panel for assessing response to cancer chemotherapy

Zunyan Dai Audrey C Papp Danxin Wang Heather Hampel Wolfgang Sadee 《BMC medical genomics》2008,1(1):1-18

Background

Variants in numerous genes are thought to affect the success or failure of cancer chemotherapy. Interindividual variability can result from genes involved in drug metabolism and transport, drug targets (receptors, enzymes, etc), and proteins relevant to cell survival (e.g., cell cycle, DNA repair, and apoptosis). The purpose of the current study is to establish a flexible, cost-effective, high-throughput genotyping platform for candidate genes involved in chemoresistance and -sensitivity, and treatment outcomes.

Methods

We have adopted SNPlex for genotyping 432 single nucleotide polymorphisms (SNPs) in 160 candidate genes implicated in response to anticancer chemotherapy.

Results

The genotyping panels were applied to 39 patients with chronic lymphocytic leukemia undergoing flavopiridol chemotherapy, and 90 patients with colorectal cancer. 408 SNPs (94%) produced successful genotyping results. Additional genotyping methods were established for polymorphisms undetectable by SNPlex, including multiplexed SNaPshot for CYP2D6 SNPs, and PCR amplification with fluorescently labeled primers for the UGT1A1 promoter (TA)nTAA repeat polymorphism.

Conclusion

This genotyping panel is useful for supporting clinical anticancer drug trials to identify polymorphisms that contribute to interindividual variability in drug response. Availability of population genetic data across multiple studies has the potential to yield genetic biomarkers for optimizing anticancer therapy. 相似文献

14.

Transforming Microbial Genotyping: A Robotic Pipeline for Genotyping Bacterial Strains

Brian O’Farrell Jana K. Haase Vimalkumar Velayudhan Ronan A. Murphy Mark Achtman 《PloS one》2012,7(10)

相似文献

15.

Declarative platform for high-performance network traffic analytics

Harjot Gill Dong Lin Cam Nguyen Tanveer Gill Boon Thau Loo 《Cluster computing》2014,17(4):1121-1137

This paper presents Scalanytics, a declarative platform that supports high-performance application layer analysis of network traffic. Scalanytics uses (1) stateful network packet processing techniques for extracting application layer data from network packets, (2) a declarative rule-based language called Analog for compactly specifying analysis pipelines from reusable modules, and (3) a task-stealing architecture for processing network packets at high throughput within these pipelines, by leveraging multi-core processing capabilities in a load-balanced manner without the need for explicit performance profiling. In a cluster of machines, Scalanytics further improves throughput through the use of a consistent-hashing based load partitioning strategy. Our evaluation on a 16-core machine demonstrate that Scalanytics achieves up to 11.4 \(\times \) improvement in throughput compared with the best uniprocessor implementation. Moreover, Scalanytics outperforms the Bro intrusion detection system by an order of magnitude when used for analyzing SMTP traffic. We further observed increased throughput when running Scalanytics pipelines across multiple machines. 相似文献

16.

Quantitative trait loci of stripe rust resistance in wheat 总被引：1，自引：0，他引：1

G. M. Rosewarne S. A. Herrera-Foessel R. P. Singh J. Huerta-Espino C. X. Lan Z. H. He 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2013,126(10):2427-2449

Key message

Over 140 QTLs for resistance to stripe rust in wheat have been published and through mapping flanking markers on consensus maps, 49 chromosomal regions are identified.

Abstract

Over thirty publications during the last 10 years have identified more than 140 QTLs for stripe rust resistance in wheat. It is likely that many of these QTLs are identical genes that have been spread through plant breeding into diverse backgrounds through phenotypic selection under stripe rust epidemics. Allelism testing can be used to differentiate genes in similar locations but in different genetic backgrounds; however, this is problematic for QTL studies where multiple loci segregate from any one parent. This review utilizes consensus maps to illustrate important genomic regions that have had effects against stripe rust in wheat, and although this methodology cannot distinguish alleles from closely linked genes, it does highlight the extent of genetic diversity for this trait and identifies the most valuable loci and the parents possessing them for utilization in breeding programs. With the advent of cheaper, high throughput genotyping technologies, it is envisioned that there will be many more publications in the near future describing ever more QTLs. This review sets the scene for the coming influx of data and will quickly enable researchers to identify new loci in their given populations. 相似文献

17.

Investigating the utility of combining Φ29 whole genome amplification and highly multiplexed single nucleotide polymorphism BeadArray™ genotyping

Rebecca Pask Helen E Rance Bryan J Barratt Sarah Nutland Deborah J Smyth Meera Sebastian Rebecca CJ Twells Anne Smith Alex C Lam Luc J Smink Neil M Walker John A Todd 《BMC biotechnology》2004,4(1):1-8

Background

Sustainable DNA resources and reliable high-throughput genotyping methods are required for large-scale, long-term genetic association studies. In the genetic dissection of common disease it is now recognised that thousands of samples and hundreds of thousands of markers, mostly single nucleotide polymorphisms (SNPs), will have to be analysed. In order to achieve these aims, both an ability to boost quantities of archived DNA and to genotype at low costs are highly desirable. We have investigated Φ29 polymerase Multiple Displacement Amplification (MDA)-generated DNA product (MDA product), in combination with highly multiplexed BeadArray? genotyping technology. As part of a large-scale BeadArray genotyping experiment we made a direct comparison of genotyping data generated from MDA product with that from genomic DNA (gDNA) templates.

Results

Eighty-six MDA product and the corresponding 86 gDNA samples were genotyped at 345 SNPs and a concordance rate of 98.8% was achieved. The BeadArray sample exclusion rate, blind to sample type, was 10.5% for MDA product compared to 5.8% for gDNA.

Conclusions

We conclude that the BeadArray technology successfully produces high quality genotyping data from MDA product. The combination of these technologies improves the feasibility and efficiency of mapping common disease susceptibility genes despite limited stocks of gDNA samples. 相似文献

18.

A Perl toolkit for LIMS development

James A Morris Simon A Gayther Ian J Jacobs Christopher Jones 《Source code for biology and medicine》2008,3(1):4

Background

High throughput laboratory techniques generate huge quantities of scientific data. Laboratory Information Management Systems (LIMS) are a necessary requirement, dealing with sample tracking, data storage and data reporting. Commercial LIMS solutions are available, but these can be both costly and overly complex for the task. The development of bespoke LIMS solutions offers a number of advantages, including the flexibility to fulfil all a laboratory's requirements at a fraction of the price of a commercial system. The programming language Perl is a perfect development solution for LIMS applications because of Perl's powerful but simple to use database and web interaction, it is also well known for enabling rapid application development and deployment, and boasts a very active and helpful developer community. The development of an in house LIMS from scratch however can take considerable time and resources, so programming tools that enable the rapid development of LIMS applications are essential but there are currently no LIMS development tools for Perl. 相似文献

19.

Clinical utility of the low-density Infinium QC genotyping Array in a genomics-based diagnostics laboratory

Petr Ponomarenko Alex Ryutov Dennis T. Maglinte Ancha Baranova Tatiana V. Tatarinova Xiaowu Gai 《BMC medical genomics》2017,10(1):57

Background

With 15,949 markers, the low-density Infinium QC Array-24 BeadChip enables linkage analysis, HLA haplotyping, fingerprinting, ethnicity determination, mitochondrial genome variations, blood groups and pharmacogenomics. It represents an attractive independent QC option for NGS-based diagnostic laboratories, and provides cost-efficient means for determining gender, ethnic ancestry, and sample kinships, that are important for data interpretation of NGS-based genetic tests.

Methods

We evaluated accuracy and reproducibility of Infinium QC genotyping calls by comparing them with genotyping data of the same samples from other genotyping platforms, whole genome/exome sequencing. Accuracy and robustness of determining gender, provenance, and kinships were assessed.

Results

Concordance of genotype calls between Infinium QC and other platforms was above 99%. Here we show that the chip’s ancestry informative markers are sufficient for ethnicity determination at continental and sometimes subcontinental levels, with assignment accuracy varying with the coverage for a particular region and ethnic groups. Mean accuracies of provenance prediction at a regional level were varied from 81% for Asia, to 89% for Americas, 86% for Africa, 97% for Oceania, 98% for Europe, and 100% for India. Mean accuracy of ethnicity assignment predictions was 63%. Pairwise concordances of AFR samples with the samples from any other super populations were the lowest (0.39–0.43), while the concordances within the same population were relatively high (0.55–0.61). For all populations except African, cross-population comparisons were similar in their concordance ranges to the range of within-population concordances (0.54–0.57). Gender determination was correct in all tested cases.

Conclusions

Our results indicate that the Infinium QC Array-24 chip is suitable for cost-efficient, independent QC assaying in the settings of an NGS-based molecular diagnostic laboratory; hence, we recommend its integration into the standard laboratory workflow. Low-density chips can provide sample-specific measures for variant call accuracy, prevent sample mix-ups, validate self-reported ethnicities, and detect consanguineous cases. Integration of low-density chips into QC procedures aids proper interpretation of candidate sequence variants. To enhance utility of this low-density chip, we recommend expansion of ADME and mitochondrial markers. Inexpensive Infinium-like low-density human chips have a potential to become a “Swiss army knife” among genotyping assays suitable for many applications requiring high-throughput assays.

相似文献

20.

Assessment of algorithms for high throughput detection of genomic copy number variation in oligonucleotide microarray data

Ágnes Baross Allen D Delaney H Irene Li Tarun Nayar Stephane Flibotte Hong Qian Susanna Y Chan Jennifer Asano Adrian Ally Manqiu Cao Patricia Birch Mabel Brown-John Nicole Fernandes Anne Go Giulia Kennedy Sylvie Langlois Patrice Eydoux JM Friedman Marco A Marra 《BMC bioinformatics》2007,8(1):1-18

Background

Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.

Results

We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regards to overall suitability for high-throughput 100 K SNP array data analysis, as well as effectiveness of normalization, scaling with various reference sets and feature extraction, as well as true and false positive rates of genomic copy number variant (CNV) detection.

Conclusion

We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity. 相似文献