Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Error-tolerant pooling designs with inhibitors.   (Total citations: 2; self-citations: 0; citations by others: 2)
Pooling designs are used in clone library screening to efficiently distinguish positive clones from negative clones. Mathematically, a pooling design is just a nonadaptive group testing scheme, which has been extensively studied in the literature. In some applications, there is a third category of clones called "inhibitors" whose effect is to neutralize positives. Specifically, the presence of an inhibitor in a pool dictates a negative outcome even though positives are present. Sequential group testing schemes, which can be modified to three-stage schemes, have been proposed for the inhibitor model, but it is unknown whether a pooling design (a one-stage scheme) exists. Another open question raised in the literature is whether the inhibitor model can treat unreliable pool outcomes. In this paper, we answer both open problems by giving a pooling design, as well as a two-stage scheme, for the inhibitor model with unreliable outcomes. The number of pools required by our schemes is quite comparable to that of the three-stage scheme.
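
The inhibitor rule stated in this abstract can be illustrated with a minimal sketch. The function name `pool_outcome` and the clone labels are hypothetical. The sketch only encodes the rule that a pool tests positive when it holds at least one positive clone and no inhibitor; it does not reproduce the paper's pooling design or its handling of unreliable outcomes.

```python
def pool_outcome(pool, positives, inhibitors):
    """Return True if the pool tests positive under the inhibitor model."""
    pool = set(pool)
    # Any inhibitor in the pool forces a negative outcome.
    if pool & set(inhibitors):
        return False
    # Otherwise the pool is positive iff it contains a positive clone.
    return bool(pool & set(positives))


# Hypothetical clone labels, for illustration only.
positives = {"c1"}
inhibitors = {"c4"}

print(pool_outcome({"c1", "c2"}, positives, inhibitors))  # True
print(pool_outcome({"c1", "c4"}, positives, inhibitors))  # False: inhibitor present
print(pool_outcome({"c2", "c3"}, positives, inhibitors))  # False: no positive clone
```

The second call shows why decoding is harder than in ordinary group testing: a negative pool no longer proves that all of its clones are negative.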

2.
Pooled Genomic Indexing (PGI) is a novel method for physical mapping of clones onto known sequences. PGI is carried out by pooling arrayed clones and generating shotgun sequence reads from the pools. The shotgun sequences are compared to a reference sequence. In the simplest case, clones are placed on an array and are pooled by rows and columns. If a shotgun sequence from a row pool and another shotgun sequence from a column pool match the reference sequence at a close distance, they are both assigned to the clone at the intersection of the two pools. Accordingly, the clone is mapped onto the region of the reference sequence between the two matches. A probabilistic model for PGI is developed, and several pooling designs are described and analyzed, including transversal designs and designs from linear codes. The probabilistic model and the pooling schemes are validated in simulated experiments where 625 rat bacterial artificial chromosome (BAC) clones and 207 mouse BAC clones are mapped onto homologous human sequence.
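
The row-and-column case described in this abstract can be sketched as follows. The data layout (lists of `(pool_index, reference_position)` hits), the function name `assign_hits`, and the `max_distance` threshold are assumptions made for illustration. The paper's probabilistic model and its other pooling designs are not reproduced here.

```python
def assign_hits(row_hits, col_hits, max_distance):
    """Pair row-pool and column-pool hits that lie close together on the reference.

    row_hits, col_hits: lists of (pool_index, reference_position).
    Returns a list of ((row, col), (start, end)) tuples, where (row, col) is the
    array position of the clone at the intersection of the two pools and
    (start, end) is the reference region between the two matches.
    """
    assignments = []
    for row, row_pos in row_hits:
        for col, col_pos in col_hits:
            if abs(row_pos - col_pos) <= max_distance:
                region = (min(row_pos, col_pos), max(row_pos, col_pos))
                assignments.append(((row, col), region))
    return assignments


# Hypothetical hits, for illustration only.
row_hits = [(2, 10_000), (5, 900_000)]
col_hits = [(7, 60_000), (1, 2_000_000)]

print(assign_hits(row_hits, col_hits, max_distance=150_000))
# [((2, 7), (10000, 60000))]
```

Only the hit from row pool 2 and the hit from column pool 7 are close enough, so both are assigned to the clone at array position (2, 7).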

3.
We consider nonadaptive pooling designs for unique-sequence screening of a 1530-clone map of Aspergillus nidulans. The map has the properties that the clones are, with possibly a few exceptions, ordered and no more than 2 of them cover any point on the genome. We propose two subdesigns of the Steiner system S(3, 5, 65), one with 65 pools and approximately 118 clones per pool, the other with 54 pools and about 142 clones per pool. Each design allows 1 or 2 positive clones to be detected, even in the presence of substantial experimental error rates. More efficient designs are possible if the overlap information in the map is exploited, if there is no constraint on the number of clones in a pool, and if no error tolerance is required. An information theory lower bound requires at least 12 pools to satisfy these minimal criteria, and an “interleaved binary” design can be constructed on 20 pools, with about 380 clones per pool. However, the designs with more pools have important properties of robustness to various possible errors and general applicability to a wider class of pooling experiments.
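
The pool sizes and the lower bound quoted in this abstract can be checked with a short calculation. Two assumptions are not stated in the abstract: that each clone is placed in 5 pools (one clone per block of S(3, 5, 65), whose blocks have size 5), and that the outcomes to be distinguished are a single positive clone or a pair of adjacent clones in the ordered map.

```python
import math

N_CLONES = 1530
POOLS_PER_CLONE = 5  # assumption: each clone corresponds to a block of size 5


def clones_per_pool(n_pools):
    """Average pool size when every clone appears in POOLS_PER_CLONE pools."""
    return N_CLONES * POOLS_PER_CLONE / n_pools


def min_pools(n_outcomes):
    """Binary pool results must be able to tell all outcomes apart."""
    return math.ceil(math.log2(n_outcomes))


# Assumption: one positive clone, or two adjacent clones in the ordered map.
outcomes = N_CLONES + (N_CLONES - 1)

print(round(clones_per_pool(65)))  # 118
print(round(clones_per_pool(54)))  # 142
print(min_pools(outcomes))         # 12
```

Under these assumptions the numbers agree with the 118 and 142 clones per pool and the 12-pool bound given in the abstract, although the authors' own derivation may differ.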

4.
Deconvolution of relationships between bacterial artificial chromosome (BAC) clones and genes is a crucial step in the selective sequencing of regions of interest in a genome. It often includes combinatorial pooling of unique probes obtained from the genes (unigenes), and screening of the BAC library using the pools in a hybridization experiment. Since several probes can hybridize to the same BAC, the pooling design has to be able to handle a large number of positives for the deconvolution to be achievable. As a consequence, smaller pools need to be designed, which in turn increases the number of hybridization experiments, possibly making the entire protocol infeasible. We propose a new algorithm that is capable of producing high-accuracy deconvolution even in the presence of a weak pooling design, i.e. when pools are rather large. The algorithm compensates for the decrease of information in the hybridization data by taking advantage of a physical map of the BAC clones. We show that the right combination of combinatorial pooling and our algorithm not only dramatically reduces the number of pools required, but also successfully deconvolutes the BAC-gene relationships with almost perfect accuracy. Software is available on request from the first author.

5.
A construction of pooling designs with some happy surprises.   (Total citations: 9; self-citations: 0; citations by others: 9)
The screening of data sets for "positive data objects" is essential to modern technology. A (group) test that indicates whether a positive data object is in a specific subset or pool of the dataset can greatly facilitate the identification of all the positive data objects. A collection of tested pools is called a pooling design. Pooling designs are standard experimental tools in many biotechnical applications. In this paper, we use the (linear) subspace relation coupled with the general concept of a "containment matrix" to construct pooling designs with surprisingly high degrees of error correction (detection). Error-correcting pooling designs are important to biotechnical applications where error rates often are as high as 15%. What is also surprising is that the rank of the pooling design containment matrix is independent of the number of positive data objects in the dataset.

6.
Oligonucleotide fingerprinting is a powerful DNA array-based method to characterize cDNA and ribosomal RNA gene (rDNA) libraries and has many applications including gene expression profiling and DNA clone classification. We are especially interested in the latter application. A key step in the method is the cluster analysis of fingerprint data obtained from DNA array hybridization experiments. Most of the existing approaches to clustering use (normalized) real intensity values and thus do not treat positive and negative hybridization signals equally (positive signals are much more emphasized). In this paper, we consider a discrete approach. Fingerprint data are first normalized and binarized using control DNA clones. Because there may exist unresolved (or missing) values in this binarization process, we formulate the clustering of (binary) oligonucleotide fingerprints as a combinatorial optimization problem that attempts to identify clusters and resolve the missing values in the fingerprints simultaneously. We study the computational complexity of this clustering problem and a natural parameterized version and present an efficient greedy algorithm based on MINIMUM CLIQUE PARTITION on graphs. The algorithm takes advantage of some unique properties of the graphs considered here, which allow us to efficiently find the maximum cliques as well as some special maximal cliques. Our preliminary experimental results on simulated and real data demonstrate that the algorithm runs faster and performs better than some popular hierarchical and graph-based clustering methods. The results on real data from DNA clone classification also suggest that this discrete approach is more accurate than clustering methods based on real intensity values in terms of separating clones that have different characteristics with respect to the given oligonucleotide probes.

7.
Functional screens, where large numbers of cDNA clones are assayed for certain biological activity, are a useful tool in elucidating gene function. In Xenopus, gain of function screens are performed by pool screening, whereby RNA transcribed in vitro from groups of cDNA clones, ranging from thousands to a hundred, is injected into early embryos. Once an activity is detected in a pool, the active clone is identified by sib-selection. Such screens are intrinsically biased towards potent genes, whose RNA is active at low quantities. To improve the sensitivity and efficiency of a gain of function screen we have bioinformatically processed an arrayed and EST sequenced set of 100,000 gastrula and neurula cDNA clones, to create a unique and full-length set of approximately 2500 clones. Reducing the redundancy and excluding truncated clones from the starting clone set reduced the total number of clones to be screened, in turn allowing us to reduce the pool size to just eight clones per pool. We report that the efficiency of screening this clone set is five-fold higher compared to a redundant set derived from the same libraries. We have screened 960 cDNA clones from this set, for genes that are involved in neurogenesis. We describe the overexpression phenotypes of 18 single clones, the majority of which show a previously uncharacterised phenotype and some of which are completely novel. In situ hybridisation analysis shows that a large number of these genes are specifically expressed in neural tissue. These results demonstrate the effectiveness of a unique full-length set of cDNA clones for uncovering players in a developmental pathway.

8.
We study the problem of selecting control clones in DNA array hybridization experiments. The problem arises in the OFRG method for analyzing microbial communities. The OFRG method performs classification of rRNA gene clones using binary fingerprints created from a series of hybridization experiments, where each experiment consists of hybridizing a collection of arrayed clones with a single oligonucleotide probe. This experiment produces analog signals, one for each clone, which then need to be classified, that is, converted into binary values 1 and 0 that represent hybridization and non-hybridization events. In addition to the sample rRNA gene clones, the array contains a number of control clones needed to calibrate the classification procedure of the hybridization signals. These control clones must be selected with care to optimize the classification process. We formulate this as a combinatorial optimization problem called Balanced Covering. We prove that the problem is NP-hard, and we show some results on hardness of approximation. We propose approximation algorithms based on randomized rounding, and we show that, with high probability, our algorithms approximate the optimum solution well. The experimental results confirm that the algorithms find high quality control clones. The algorithms have been implemented and are publicly available as part of the software package called CloneTools.

9.
Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most cases, have to be “domesticated” for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; unfortunately, they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7–15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

10.
MOTIVATION: The construction of physical maps based on bacterial clones [e.g. bacterial artificial chromosomes (BACs)] is valuable for a number of molecular genetics applications, including the high-resolution mapping of genomic regions of interest and the identification of clones suitable for systematic sequencing. A common approach for large-scale screening of bacterial clone libraries involves the hybridization of high-density arrays of immobilized, lysed colonies with collections of DNA probes. The use of a multiplex hybridization screening strategy, whereby pooled probes are analysed en masse, simplifies the effort by reducing the total number of parallel experiments required. However, this approach generates large amounts of hybridization-based data that must be analysed, assimilated, and disambiguated in a careful but efficient manner. RESULTS: To facilitate the screening of high-density clone arrays by a multiplex hybridization approach, we have written a program called ComboScreen. This program provides an organizational framework and analytical tools required for the high-throughput hybridization screening of clone arrays with pools of probes. We have used this program extensively for constructing mouse sequence-ready BAC contig maps.

11.
Complementary BAC and BIBAC libraries were constructed from nuclear DNA of sunflower cultivar HA 89. The BAC library, constructed with BamHI in the pECBAC1 vector, contains 107,136 clones and has an average insert size of 140 kb. The BIBAC library was constructed with HindIII in the plant-transformation-competent binary vector pCLD04541 and contains 84,864 clones, with an average insert size of 137 kb. The two libraries combined contain 192,000 clones and are equivalent to approximately 8.9 haploid genomes of sunflower (3,000 Mb/1C), and provide a greater than 99% probability of obtaining a clone of interest. The frequencies of BAC and BIBAC clones carrying chloroplast or mitochondrial DNA sequences were estimated to be 2.35 and 0.04%, respectively, and insert-empty clones were less than 0.5%. To facilitate chromosome engineering and anchor the sunflower genetic map to its chromosomes, one to three single- or low-copy RFLP markers from each linkage group of sunflower were used to design pairs of overlapping oligonucleotides (overgos). Thirty-six overgos were designed and pooled as probes to screen a subset (5.1×) of the BAC and BIBAC libraries. Of the 36 overgos, 33 (92%) gave at least one positive clone and 3 (8%) failed to hit any clone. As a result, 195 BAC and BIBAC clones representing 19 linkage groups were identified, including 76 BAC clones and 119 BIBAC clones, further verifying the genome coverage and utility of the libraries. These BAC and BIBAC libraries and linkage group-specific clones provide resources essential for comprehensive research of the sunflower genome.
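
The coverage figures in this abstract can be reproduced from the clone counts and insert sizes it reports. The abstract does not say which formula the authors used for the probability, so the Clarke-Carbon estimate below is an assumption. The sketch also ignores the organellar and insert-empty clones mentioned in the abstract.

```python
GENOME_KB = 3_000_000  # 3,000 Mb/1C, from the abstract

libraries = {
    "BAC": {"clones": 107_136, "insert_kb": 140},
    "BIBAC": {"clones": 84_864, "insert_kb": 137},
}

total_clones = sum(lib["clones"] for lib in libraries.values())
total_insert_kb = sum(lib["clones"] * lib["insert_kb"] for lib in libraries.values())

# Genome equivalents: total cloned DNA divided by the haploid genome size.
coverage = total_insert_kb / GENOME_KB

# Clarke-Carbon estimate (assumed): P = 1 - (1 - f)^N,
# where f is the mean insert size as a fraction of the genome.
mean_insert_kb = total_insert_kb / total_clones
p_hit = 1 - (1 - mean_insert_kb / GENOME_KB) ** total_clones

print(total_clones)        # 192000
print(round(coverage, 1))  # 8.9
print(p_hit > 0.99)        # True
```

Both results agree with the abstract's "approximately 8.9 haploid genomes" and "greater than 99% probability".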

12.
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

13.
'In vitro evolution' of ligands for HCV-specific serum antibodies   (Total citations: 2; self-citations: 0; citations by others: 2)
We developed a strategy to improve the properties of ligands selected from phage-displayed random peptide libraries. A site-directed mutagenesis protocol that introduces mutations and extends the size of a target sequence has been set up to generate diversity in a single clone or in a population of clones. The pool of mutants thus created is screened to identify variants with the desired properties. We refer to this strategy as 'in vitro evolution' of ligands. Here we report the application of this in vitro evolution protocol to the identification of improved ligands for HCV-specific serum antibodies. A single clone or a population of clones was processed to generate a secondary library. Screening of these libraries with sera from HCV-infected patients identified peptides with an enhanced and broadened ability to detect HCV-specific serum antibodies.

14.
We tested the accuracy of molecular analyses for recovering the species richness and structure of pooled fungal communities of known composition. We constructed replicate pools of 2-20 species and analysed these pools by two separate pooling-DNA extraction procedures and three different molecular analyses (Automated Ribosomal Intergenic Spacer Analysis (ARISA), terminal restriction fragment length polymorphism (T-RFLP) and clone library-sequencing). None of the methods correctly described the known communities. Only clone library-sequencing with high sequencing per pool (~100 clones) recovered reasonable estimates of richness. Frequency data were skewed with all procedures and analyses. These results indicate that the error introduced by pooling samples is significant and problematic for ecological studies of fungal communities.

15.
Peptide aptamers are combinatorial proteins that specifically bind intracellular proteins and modulate their function. They are powerful tools to study protein function within complex regulatory networks and to guide small-molecule drug discovery. Here we describe methodological improvements that enhance the yeast two-hybrid selection and characterization of large collections of peptide aptamers. We provide a detailed protocol to perform high-efficiency transformation of peptide aptamer libraries, in-depth validation experiments of the bait proteins, high-efficiency mating to screen large numbers of peptide aptamers and streamlined confirmation of the positive clones. We also describe yeast two-hybrid mating assays, which can be used to determine the specificity of the selected aptamers, map their binding sites on target proteins and provide structural insights on their target-binding surface. Overall, 12 weeks are required to perform the protocols. The improvements on the yeast two-hybrid method can also be usefully applied to the screening of cDNA libraries to identify protein interactions.

16.
A semiliquid medium was employed in an efficient method to screen highly complex DNA libraries for clones with known sequences. Unbiased clone pools are generated in 2 ml vials and screened by whole-cell PCR, and individual clones are obtained by a few additional rounds of dilution and PCR screening. To demonstrate the utility of this approach, the single positive clone present in a 400,000-member metagenomic fosmid library was isolated.

17.
MOTIVATION: Because of the high cost of sequencing, the bulk of gene discovery is performed using anonymous cDNA microarrays. Though the clones on such arrays are easier and cheaper to construct and utilize than unigene and oligonucleotide arrays, they are there in proportion to their corresponding gene expression activity in the tissue being examined. The associated redundancy will be there in any pool of possibly interesting differentially expressed clones identified in a microarray experiment for subsequent sequencing and investigation. An a posteriori sampling strategy is proposed to enhance gene discovery by reducing the impact of the redundancy in the identified pool. RESULTS: The proposed strategy exploits the fact that individual genes that are highly expressed in a tissue are more likely to be present as a number of spots in an anonymous library and, as a direct consequence, are also likely to give higher fluorescence intensity responses when present in a probe in a cDNA microarray experiment. Consequently, spots that respond with low intensities will have a lower redundancy and so should be sequenced in preference to those with the highest intensities. The proposed method, which formalizes how the fluorescence intensity of a spot should be assessed, is validated using actual microarray data, where the sequences of all the clones in the identified pool had been previously determined. For such validations, the concept of a repeat plot is introduced. It is also utilized to visualize and examine different measures for the characterization of fluorescence intensity. In addition, as confirmatory evidence, sequencing from the lowest to the highest intensities in a pool, with all the sequences known, is compared graphically with their random sequencing. 
The results establish that, in general, the opportunity for gene discovery is enhanced by avoiding the pooling of different biological libraries (because their construction will have involved different hybridization episodes) and concentrating on the clones with lower fluorescence intensities.

18.
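Three-dimensional PCR screening of a Chinese Merino fine-wool sheep BAC library   (Total citations: 1; self-citations: 0; citations by others: 1)
Using the whole-genome BAC library of Chinese Merino fine-wool sheep, this study constructed two levels of pools for rapid screening: primary pools and secondary pools. The primary pools were built from each 384-well plate and consist of three-dimensional plate, row, and column pools; the secondary pools were built from the entire BAC library. A rapid PCR-based screening method was designed in which the secondary pools are screened first and the corresponding primary pools are then screened according to the result. With this method, a single positive clone can be identified among the 74,000 clones of the BAC library in one step with a total of 66 PCR reactions, or multiple positive clones can be identified in three steps with fewer than 100 PCR reactions. Using the sheep genomic polymorphic molecular marker BF94-1 as the primer, one positive clone, 373D13, was successfully identified in one step with a total of 66 PCR reactions.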
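
The plate, row, and column pooling described in this abstract can be sketched as a small decoding step. The function name `candidate_clones` is hypothetical. The address format (plate number, row letter, column number) is an assumption modeled on the clone name 373D13, and the 16 × 24 layout is that of a standard 384-well plate. The construction of the secondary pools and the count of 66 reactions are not reproduced here.

```python
import string

ROWS = string.ascii_uppercase[:16]  # A..P: 16 rows of a 384-well plate
COLUMNS = range(1, 25)              # 24 columns


def candidate_clones(plate, positive_rows, positive_columns):
    """List the wells consistent with the positive row and column pools of one plate."""
    return [
        f"{plate}{row}{col}"
        for row in ROWS if row in positive_rows
        for col in COLUMNS if col in positive_columns
    ]


# One positive row pool and one positive column pool identify a single well.
print(candidate_clones(373, {"D"}, {13}))        # ['373D13']

# Two positives on one plate leave four candidates that need a further check.
print(candidate_clones(12, {"A", "C"}, {5, 9}))  # ['12A5', '12A9', '12C5', '12C9']
```

The second call illustrates why the abstract distinguishes the one-step case (a single positive clone) from the multi-step case (several positive clones).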

19.
20.
