Similar literature
Found 20 similar documents; search took 437 ms.
1.

Background

The probe percent bound value, calculated using multi-state equilibrium models of solution hybridization, is shown to be useful in understanding the hybridization behavior of microarray probes having 50 nucleotides, with and without mismatches. These longer oligonucleotides are in widespread use on microarrays, but there are few controlled studies of their interactions with mismatched targets compared to 25-mer based platforms.

Principal Findings

50-mer oligonucleotides with centrally placed single, double and triple mismatches were spotted on an array. Over a range of target concentrations it was possible to discriminate binding to perfect matches and mismatches, and the type of mismatch could be predicted accurately in the concentration midrange (100 pM to 200 pM) using solution hybridization modeling methods. These results have implications for microarray design, optimization and analysis methods.
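The percent bound idea can be illustrated with a much simpler model than the multi-state one the abstract refers to. The sketch below assumes a two-state (Langmuir) equilibrium with target in excess of probe; the function name `percent_bound`, the temperature, and the free-energy values are illustrative assumptions, not values from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def percent_bound(delta_g_kcal, target_molar, temp_kelvin=318.15):
    """Percent of probe bound under a two-state (Langmuir) equilibrium.

    delta_g_kcal: hybridization free energy (negative = favorable); hypothetical input.
    target_molar: free target concentration in mol/L, assumed in excess of probe.
    temp_kelvin: hybridization temperature; the default is an arbitrary assumption.
    """
    k_eq = math.exp(-delta_g_kcal / (R_KCAL * temp_kelvin))
    return 100.0 * k_eq * target_molar / (1.0 + k_eq * target_molar)


# Illustrative free energies only; a mismatch is modeled as a less favorable delta G.
for label, dg in [("perfect match", -15.0), ("one mismatch", -14.0)]:
    for conc in (100e-12, 200e-12):  # the 100 pM to 200 pM midrange from the abstract
        print(label, conc, round(percent_bound(dg, conc), 1))
```

Under this simplification, discrimination between match and mismatch is greatest when neither probe is close to 0% or 100% bound, which is one plausible reading of why the concentration midrange was the most informative.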

Conclusions

Our results highlight the importance of incorporating biophysical factors in both the design and the analysis of microarrays. Use of the probe “percent bound” value predicted by equilibrium models of hybridization is confirmed to be important for predicting and interpreting the behavior of long oligonucleotide arrays, as has been shown for short oligonucleotide arrays.

2.

Background

The latest generations of single nucleotide polymorphism (SNP) arrays allow copy-number variations to be studied in addition to genotyping.

Results

MPAgenomics, standing for multi-patient analysis (MPA) of genomic markers, is an R package devoted to: (i) efficient segmentation and (ii) selection of genomic markers from multi-patient copy number and SNP data profiles. It provides wrappers for commonly used packages to streamline their repeated (sometimes difficult) manipulation, offering an easy-to-use pipeline for beginners in R. The segmentation of successive multiple profiles (finding losses and gains) is performed with an automatic choice of the parameters involved in the wrapped packages. Considering multiple profiles at the same time, MPAgenomics wraps efficient penalized regression methods to select relevant markers associated with a given outcome.

Conclusions

MPAgenomics provides an easy tool to analyze data from SNP arrays in R. The R-package MPAgenomics is available on CRAN.

3.

Background

The ability to accurately detect DNA copy number variation in both a sensitive and quantitative manner is important in many research areas. However, genome-wide DNA copy number analyses are complicated by variations in detection signal.

Results

While GC content has been used to correct for this, here we show that coverage biases are tissue-specific and independent of the detection method as demonstrated by next-generation sequencing and array CGH. Moreover, we show that DNA isolation stringency affects the degree of equimolar coverage and that the observed biases coincide with chromatin characteristics like gene expression, genomic isochores, and replication timing.

Conclusion

These results indicate that chromatin organization is a main determinant for differential DNA retrieval. These findings are highly relevant for germline and somatic DNA copy number variation analyses.

4.

Background

Single nucleotide polymorphisms (SNPs) have been used extensively in genetics and epidemiology studies. Traditionally, SNPs that did not pass the Hardy-Weinberg equilibrium (HWE) test were excluded from these analyses. Many investigators have addressed possible causes for departure from HWE, including genotyping errors, population admixture and segmental duplication. Recent large-scale surveys have revealed abundant structural variations in the human genome, including copy number variations (CNVs). This suggests that a significant number of SNPs must be within these regions, which may cause deviation from HWE.

Results

We performed a Bayesian analysis on the potential effect of copy number variation, segmental duplication and genotyping errors on the behavior of SNPs. Our results suggest that copy number variation is a major factor of HWE violation for SNPs with a small minor allele frequency, when the sample size is large and the genotyping error rate is 0∼1%.
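For context, the HWE test mentioned in the Background is often run as a chi-square goodness-of-fit test with one degree of freedom. The sketch below shows that common form only; it is not the Bayesian analysis described in this abstract, and the function name `hwe_chi_square` is an assumption made for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Takes observed genotype counts and returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts in exact HWE proportions, then counts with no heterozygotes at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

A hemizygous deletion makes a carrier of one allele look homozygous, so SNPs inside a deletion CNV tend to show a heterozygote deficit of the kind the second call exaggerates.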

Conclusions

Our study provides the posterior probability that a SNP falls in a CNV or a segmental duplication, given the observed allele frequency of the SNP, sample size and the significance level of HWE testing.

5.

Background

Next-generation sequencing techniques, such as genotyping-by-sequencing (GBS), provide alternatives to single nucleotide polymorphism (SNP) arrays. The aim of this work was to evaluate the potential of GBS compared to SNP array genotyping for genomic selection in livestock populations.

Methods

The value of GBS was quantified by simulation analyses in which three parameters were varied: (i) genome-wide sequence read depth (x) per individual from 0.01x to 20x or using SNP array genotyping; (ii) number of genotyped markers from 3000 to 300 000; and (iii) size of training and prediction sets from 500 to 50 000 individuals. The latter was achieved by distributing the total available x of 1000x, 5000x, or 10 000x per genotyped locus among the varying number of individuals. With SNP arrays, genotypes were called from sequence data directly. With GBS, genotypes were called from sequence reads that varied between loci and individuals according to a Poisson distribution with mean equal to x. Simulated data were analyzed with ridge regression and the accuracy and bias of genomic predictions and response to selection were quantified under the different scenarios.
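The read-sampling step described above can be sketched as follows. This is a minimal illustration, assuming each read at a heterozygous locus comes from either allele with probability 0.5 and ignoring sequencing error; the function names are hypothetical and the ridge regression step is not covered.

```python
import math
import random


def depth_per_individual(total_depth, n_individuals):
    """Spread a fixed total read depth per locus evenly over individuals."""
    return total_depth / n_individuals


def prob_both_alleles_seen(mean_depth):
    """Chance that a heterozygote shows both alleles at a locus.

    With Poisson(mean_depth) reads split 50/50, the read count for each allele
    is an independent Poisson(mean_depth / 2) variable.
    """
    return (1.0 - math.exp(-mean_depth / 2.0)) ** 2


def poisson_sample(mean, rng):
    """Draw one Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_het_detection(mean_depth, n_loci, seed=1):
    """Simulated fraction of heterozygous loci at which both alleles are read."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_loci):
        reads = poisson_sample(mean_depth, rng)
        ref_reads = sum(rng.random() < 0.5 for _ in range(reads))
        if 0 < ref_reads < reads:
            detected += 1
    return detected / n_loci


# 1000x total per locus over 500 individuals gives 2x each.
x = depth_per_individual(1000, 500)
print(x, prob_both_alleles_seen(x), simulate_het_detection(x, 20000))
```

The closed form makes the trade-off visible: at low x most heterozygotes are read as homozygotes, so any gain from spreading depth over more individuals has to come from the larger training set rather than from per-individual genotype quality.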

Results

Accuracies of genomic predictions using GBS data or SNP array data were comparable when large numbers of markers were used and x per individual was ~1x or higher. The bias of genomic predictions was very high at a very low x. When the total available x was distributed among the training individuals, the accuracy of prediction was maximized when a large number of individuals was used that had GBS data with low x for a large number of markers. Similarly, response to selection was maximized under the same conditions due to increasing both accuracy and selection intensity.

Conclusions

GBS offers great potential for developing genomic selection in livestock populations because it makes it possible to cover large fractions of the genome and to vary the sequence read depth per individual. Thus, the accuracy of predictions is improved by increasing the size of training populations and the intensity of selection is increased by genotyping a larger number of selection candidates.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0102-z) contains supplementary material, which is available to authorized users.

6.

Background

DNA sequence diversity within the human genome may be more greatly affected by copy number variations (CNVs) than by single nucleotide polymorphisms (SNPs). Although the importance of CNVs in genome-wide association studies (GWAS) is becoming widely accepted, the optimal methods for identifying these variants are still under evaluation. We have previously reported a comprehensive view of CNVs in the HapMap DNA collection using high density 500 K EA (Early Access) SNP genotyping arrays, which revealed more than 1,000 CNVs ranging in size from 1 kb to over 3 Mb. Although the arrays used most commonly for GWAS predominantly interrogate SNPs, CNV identification and detection do not necessarily require the use of DNA probes centered on polymorphic nucleotides and may even be hindered by the dependence on a successful SNP genotyping assay.

Results

In this study, we have designed and evaluated a high density array predicated on the use of non-polymorphic oligonucleotide probes for CNV detection. This approach effectively uncouples copy number detection from SNP genotyping and thus has the potential to significantly improve probe coverage for genome-wide CNV identification. This array, in conjunction with PCR-based, complexity-reduced DNA target, queries over 1.3 M independent NspI restriction enzyme fragments in the 200 bp to 1100 bp size range, which is a several fold increase in marker density as compared to the 500 K EA array. In addition, a novel algorithm was developed and validated to extract CNV regions and boundaries.

Conclusion

Using a well-characterized pair of DNA samples, close to 200 CNVs were identified, of which nearly 50% appear novel yet were independently validated using quantitative PCR. The results indicate that non-polymorphic probes provide a robust approach for CNV identification, and the increasing precision of CNV boundary delineation should allow a more complete analysis of their genomic organization.

7.
8.

Background

Array comparative genomic hybridization (aCGH) to detect copy number variants (CNVs) in mammalian genomes has led to a growing awareness of the potential importance of this category of sequence variation as a cause of phenotypic variation. Yet there are large discrepancies between studies, so that the extent of the genome affected by CNVs is unknown. We combined molecular and aCGH analyses of CNVs in inbred mouse strains to investigate this question.

Principal Findings

Using a 2.1 million probe array we identified 1,477 deletions and 499 gains in 7 inbred mouse strains. Molecular characterization indicated that approximately one third of the CNVs detected by the array were false positives and we estimate the false negative rate to be more than 50%. We show that low concordance between studies is largely due to the molecular nature of CNVs, many of which consist of a series of smaller deletions and gains interspersed by regions where the DNA copy number is normal.

Conclusions

Our results indicate that CNVs detected by arrays may be the coincidental co-localization of smaller CNVs, whose presence is more likely to perturb an aCGH hybridization profile than the effect of an isolated, small, copy number alteration. Our findings help explain the hitherto unexplored discrepancies between array-based studies of copy number variation in the mouse genome.

9.

Background

Numerous efforts have been made to elucidate the etiology and improve the treatment of lung cancer, but the overall five-year survival rate is still only 15%. Although cigarette smoking is the primary risk factor for lung cancer, only 7% of female lung cancer patients in Taiwan have a history of smoking. Since cancer results from progressive accumulation of genetic aberrations, genomic rearrangements may be early events in carcinogenesis.

Results

In order to identify biomarkers of early-stage adenocarcinoma, the genome-wide DNA aberrations of 60 pairs of lung adenocarcinoma and adjacent normal lung tissue in non-smoking women were examined using Affymetrix Genome-Wide Human SNP 6.0 arrays. Common copy number variation (CNV) regions were defined as stretches of at least 100 contiguous SNP loci at which ≥30% of patients had a copy number outside 2 ± 0.5. SNPs associated with lung adenocarcinoma were identified by McNemar’s test. Loss of heterozygosity (LOH) SNPs were those at which ≥18% of patients showed LOH. Aberration of SNP rs10248565 at HDAC9 in chromosome 7p21.1 was identified from concurrent analyses of CNVs, SNPs, and LOH.
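The CNV-region criterion in this paragraph (≥30% of patients, copy number beyond 2 ± 0.5, at least 100 contiguous SNP loci) can be sketched as a simple scan. This is one possible reading of the rule, not the authors' pipeline: it assumes gains and losses are pooled when counting patients, and the function name and data layout are hypothetical.

```python
def recurrent_cnv_runs(copy_numbers, min_fraction=0.30, tolerance=0.5, min_run=100):
    """Find runs of contiguous SNPs that are aberrant in enough patients.

    copy_numbers: one list per patient, holding a copy-number estimate per SNP,
    with SNPs ordered along the chromosome (assumed layout).
    Returns inclusive (start_index, end_index) pairs.
    """
    n_patients = len(copy_numbers)
    n_snps = len(copy_numbers[0])
    flagged = []
    for j in range(n_snps):
        # Count patients whose copy number lies outside 2 +/- tolerance at SNP j.
        aberrant = sum(abs(row[j] - 2.0) > tolerance for row in copy_numbers)
        flagged.append(aberrant / n_patients >= min_fraction)

    runs, start = [], None
    # A trailing False closes any run that reaches the last SNP.
    for j, flag in enumerate(flagged + [False]):
        if flag and start is None:
            start = j
        elif not flag and start is not None:
            if j - start >= min_run:
                runs.append((start, j - 1))
            start = None
    return runs


# Toy data: 3 patients, 6 SNPs, with the run length lowered to 3 for illustration.
toy = [
    [2.0, 3.0, 3.0, 3.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.1],
]
print(recurrent_cnv_runs(toy, min_run=3))
```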

Conclusion

The results elucidate the genetic etiology of lung adenocarcinoma by demonstrating that SNP rs10248565 may be a potential biomarker of cancer susceptibility.

10.

Background  

DNA copy number aberration (CNA) is one of the key characteristics of cancer cells. Recent studies demonstrated the feasibility of utilizing high density single nucleotide polymorphism (SNP) genotyping arrays to detect CNA. Compared with the two-color array-based comparative genomic hybridization (array-CGH), the SNP arrays offer much higher probe density and lower signal-to-noise ratio at the single SNP level. To accurately identify small segments of CNA from SNP array data, segmentation methods that are sensitive to CNA while resistant to noise are required.

11.

Introduction

In breast cancer, the basal-like subtype has high levels of genomic instability relative to other breast cancer subtypes with many basal-like-specific regions of aberration. There is evidence that this genomic instability extends to smaller scale genomic aberrations, as shown by a previously described micro-deletion event in the PTEN gene in the Basal-like SUM149 breast cancer cell line.

Methods

We sought to determine whether small regions of genomic DNA copy number change exist by using a high density, gene-centric comparative genomic hybridization (CGH) array on cell lines and primary tumors. A custom tiling array for CGH (244,000 probes, 200 bp tiling resolution) was created to identify small regions of genomic change; it focused on previously identified basal-like-specific genes and general cancer genes. Tumor genomic DNA from 94 patients and 2 breast cancer cell lines was labeled and hybridized to these arrays. Aberrations were called using SWITCHdna and the smallest 25% of SWITCHdna-defined genomic segments were called micro-aberrations (<64 contiguous probes, ∼15 kb).

Results

Our data showed that primary tumor breast cancer genomes frequently contained many small-scale copy number gains and losses, termed micro-aberrations, most of which are undetectable using typical-density genome-wide aCGH arrays. The basal-like subtype exhibited the highest incidence of these events. These micro-aberrations sometimes altered expression of the involved gene. We confirmed the presence of the PTEN micro-amplification in SUM149 and by mRNA-seq showed that this resulted in loss of expression of all exons downstream of this event. Micro-aberrations disproportionately affected the 5′ regions of the affected genes, including the promoter region, and high frequency of micro-aberrations was associated with poor survival.

Conclusion

Using a high-probe-density, gene-centric aCGH microarray, we present evidence of small-scale genomic aberrations that can contribute to gene inactivation. These events may contribute to tumor formation through mechanisms not detected using conventional DNA copy number analyses.

12.

Background

Gastric cancer (GC) is a common cancer. Discovering novel genetic biomarkers might help to identify high-risk individuals. Copy number variation (CNV) has recently been shown to influence risk for several cancers. The present study sought to test the association between copy number at a variant region and GC.

Methods

A total of 110 gastric cancer patients and 325 healthy volunteers were enrolled in this study. We searched for a CNV and found one (Variation 7468) containing part of the APC gene, the SRP19 gene and the REEP5 gene. We chose four probes targeting APC-intron8, APC-exon9, SRP19 and REEP5 to interrogate this CNV. Specific TaqMan probes labeled with different reporter fluorophores were used on a real-time PCR platform to obtain copy number. Both the original non-integer data and the transformed integer data on copy number were used for analyses.

Results

Gastric cancer patients had a lower non-integer copy number than controls for the APC-exon9 probe (adjusted p = 0.026) and the SRP19 probe (adjusted p = 0.002). The analysis of integer copy number yielded a similar pattern, although less significant (adjusted p = 0.07 for the APC-exon9 probe and adjusted p = 0.02 for the SRP19 probe).

Conclusions

Losses of a CNV at 5q22, especially in the DNA region surrounding APC-exon 9, may be associated with a higher risk of gastric cancer.

13.

Purpose

To determine how a single nucleotide polymorphism (SNP)- and informatics-based non-invasive prenatal aneuploidy test performs in detecting trisomy 13.

Methods

Seventeen trisomy 13 and 51 age-matched euploid samples, randomly selected from a larger cohort, were analyzed. Cell-free DNA was isolated from maternal plasma, amplified in a single multiplex polymerase chain reaction assay that interrogated 19,488 SNPs covering chromosomes 13, 18, 21, X, and Y, and sequenced. Analysis and copy number identification involved a Bayesian-based maximum likelihood statistical method that generated chromosome- and sample-specific calculated accuracies.

Results

Of the samples that passed a stringent DNA quality threshold (94.1%), the algorithm correctly identified 15/15 trisomy 13 and 49/49 euploid samples, for 320/320 correct copy number calls.
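The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch assumes one copy number call per interrogated chromosome per passing sample, which the abstract implies but does not state outright.

```python
# Sample counts taken from the Methods and Results paragraphs.
trisomy_13, euploid = 17, 51
total = trisomy_13 + euploid              # samples analyzed
passed = 15 + 49                          # samples that passed the quality threshold
pass_rate = round(100 * passed / total, 1)

# Assumption: one call per chromosome covered by the assay, per passing sample.
chromosomes = ["13", "18", "21", "X", "Y"]
calls = passed * len(chromosomes)

print(total, passed, pass_rate, calls)
```

Under that assumption the reported 94.1% pass rate and the 320 copy number calls are mutually consistent.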

Conclusions

This informatics- and SNP-based method accurately detects trisomy 13-affected fetuses non-invasively and with high calculated accuracy.

14.

Background

There have been conflicting reports in the literature on the association of gene copy number with disease, including CCL3L1 and HIV susceptibility, and β-defensins and Crohn's disease. Quantification of precise gene copy numbers is important in order to define any association of gene copy number with disease. At present, real-time quantitative PCR (QPCR) is the most commonly used method to determine gene copy number; however, the Paralogue Ratio Test (PRT) is being used in more and more laboratories.

Findings

In this study we compare a Pyrosequencing-based Paralogue Ratio Test (PPRT) for determining beta-defensin gene copy number with two currently used methods for gene copy number determination, QPCR and triplex PRT by typing five different cohorts (UK, Danish, Portuguese, Ghanaian and Czech) of DNA from a total of 576 healthy individuals. We found a systematic measurement bias between DNA cohorts revealed by QPCR, but not by the PRT-based methods. Using PRT, copy number ranged from 2 to 9 copies, with a modal copy number of 4 in all populations.

Conclusions

QPCR is very sensitive to the quality of the template DNA, generating systematic biases that could produce false-positive or false-negative disease associations. Neither triplex PRT nor PPRT shows this systematic bias, and both type copy number within the correct range, although triplex PRT appears to be a more precise and accurate method to type beta-defensin copy number.

15.

Background

Prognostic biomarkers are needed for superficial gastroesophageal adenocarcinoma (EAC) to predict clinical outcomes and select therapy. Although recurrent mutations have been characterized in EAC, little is known about their clinical and prognostic significance. Aneuploidy is predictive of clinical outcome in many malignancies but has not been evaluated in superficial EAC.

Methods

We quantified copy number changes in 41 superficial EAC using Affymetrix SNP 6.0 arrays. We identified recurrent chromosomal gains and losses and calculated the total copy number abnormality (CNA) count for each tumor as a measure of aneuploidy. We correlated CNA count with overall survival and time to first recurrence in univariate and multivariate analyses.

Results

Recurrent segmental gains and losses involved multiple genes, including: HER2, EGFR, MET, CDK6, KRAS (recurrent gains); and FHIT, WWOX, CDKN2A/B, SMAD4, RUNX1 (recurrent losses). There was a 40-fold variation in CNA count across all cases. Tumors with the lowest and highest quartile CNA count had significantly better overall survival (p = 0.032) and time to first recurrence (p = 0.010) compared to those with intermediate CNA counts. These associations persisted when controlling for other prognostic variables.

Significance

SNP arrays facilitate the assessment of recurrent chromosomal gain and loss and allow high resolution, quantitative assessment of segmental aneuploidy (total CNA count). The non-monotonic association of segmental aneuploidy with survival has been described in other tumors. The degree of aneuploidy is a promising prognostic biomarker in a potentially curable form of EAC.

16.

Background

The development of microarray-based genetic tests for diseases that are caused by known mutations is becoming increasingly important. The key obstacle to developing functional genotyping assays is that such mutations need to be genotyped regardless of their location in genomic regions. These regions include large variations in G+C content, and structural features like hairpins.

Methods/Findings

We describe a rational, stable method for screening and combining assay conditions for the genetic analysis of 42 Phenylketonuria-associated mutations in the phenylalanine hydroxylase gene. The mutations are located in regions with large variations in G+C content (20–75%). Custom-made microarrays with different lengths of complementary probe sequences and spacers were hybridized with pooled PCR products of 12 exons from each of 38 individual patient DNA samples. The arrays were washed with eight buffers with different stringencies in a custom-made microfluidic system. The data were used to assess which parameters play significant roles in assay development.

Conclusions

Several assay development methods found suitable probes and assay conditions for a functional test for all investigated mutation sites. Probe length, probe spacer length, and assay stringency sufficed as variable parameters in the search for a functional multiplex assay. We discuss the optimal assay development methods for several different scenarios.

17.

Background

Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.

Results

We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regard to overall suitability for high-throughput 100 K SNP array data analysis, effectiveness of normalization, scaling with various reference sets, feature extraction, and true and false positive rates of genomic copy number variant (CNV) detection.

Conclusion

We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.

18.

Background

In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there has been little or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information.

Results

Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion.

Conclusions

This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need of additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.

19.
Psifidi A, Dovas C, Banos G. PLoS ONE 2011, 6(1): e14560

Background

Single nucleotide polymorphisms (SNPs) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection, but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNPs in pooled DNA samples.

Methods

The development and evaluation of a novel Ligase Chain Reaction (LCR) protocol that uses a DNA-specific fluorescent dye to allow quantitative real-time analysis is described. Different reaction components and thermocycling parameters affecting the efficiency and specificity of LCR were examined. Several protocols, including gap-LCR modifications, were evaluated using plasmid standard and genomic DNA pools. A protocol of choice was identified and applied for the quantification of a polymorphism at codon 136 of the ovine PRNP gene that is associated with susceptibility to a transmissible spongiform encephalopathy in sheep.

Conclusions

The real-time LCR protocol developed in the present study showed high sensitivity, accuracy, reproducibility and a wide dynamic range of SNP quantification in different DNA pools. The limits of detection and quantification of SNP frequencies were 0.085% and 0.35%, respectively.

Significance

The proposed real-time LCR protocol is applicable when sensitive detection and accurate quantification of low copy number mutations in DNA pools is needed. Examples include oncogenes and tumour suppressor genes, infectious diseases, pathogenic bacteria, fungal species, viral mutants, drug resistance resulting from point mutations, and genetically modified organisms in food.

20.

Background

Genome-wide association studies (GWAS) of pooled DNA samples have been shown to be a valuable tool for identifying candidate SNPs associated with a phenotype. No such study has so far been applied to childhood allergic asthma, even though the very high complexity of asthma genetics makes it an appropriate field in which to explore the potential of the pooled GWAS approach.

Methodology/Principal Findings

We performed a pooled GWAS and individual genotyping in 269 children with allergic respiratory diseases, comparing allergic children with and without asthma. We used a modular approach to identify the most significant loci associated with asthma by combining silhouette statistics and a physical distance method with cluster-adapted thresholding. We found 97% concordance between pooled GWAS and individual genotyping, with 36 out of 37 top-scoring SNPs significant at the individual genotyping level. The most significant SNP is located inside the coding sequence of C5, an already identified asthma susceptibility gene, while the other loci regulate functions that are relevant to bronchial physiopathology, such as immune- or inflammation-mediated mechanisms and airway smooth muscle contraction. Integration with gene expression data showed that almost half of the putative susceptibility genes are differentially expressed in experimental asthma mouse models.

Conclusion/Significance

Combined silhouette statistics and cluster-adapted physical distance threshold analysis of pooled GWAS data is an efficient method to identify candidate SNPs associated with asthma development in an allergic pediatric population.
