Similar literature
 20 similar documents found (search time: 15 ms)
1.
In physical mapping, one orders a set of genetic landmarks or a library of cloned fragments of DNA according to their position in the genome. Our approach to physical mapping divides the problem into smaller and easier subproblems by partitioning the probe set into independent parts (probe contigs). For this purpose we introduce a new distance function between probes, the averaged rank distance (ARD), derived from bootstrap resampling of the raw data. The ARD measures the pairwise distances of probes within a contig and smooths the distances of probes across different contigs. It shows distinct jumps at contig borders. This makes it appropriate for contig selection by clustering. We have designed a physical mapping algorithm that makes use of these observations and seems to be particularly well suited to the delineation of reliable contigs. We evaluated our method on data sets from two physical mapping projects. On data from the recently sequenced bacterium Xylella fastidiosa, the probe contig set produced by the new method was evaluated using the probe order derived from the sequence information. Our approach yielded a basically correct contig set. On these data we also compared our method to an approach which uses the number of supporting clones to determine contigs. Our map is much more accurate. In comparison to a physical map of Pasteurella haemolytica that was computed using simulated annealing, the newly computed map is considerably cleaner. The results of our method have already proven helpful for the design of experiments aimed at further improving the quality of a map.

2.
The self-organizing map (SOM) is an unsupervised learning method based on neural computation that has found wide application. However, the learning process sometimes becomes multi-stable, and the map can be trapped in an undesirable disordered state that includes topological defects. These topological defects critically degrade the performance of the SOM. To overcome this problem, we propose introducing an asymmetric neighborhood function into the SOM algorithm. Compared with the conventional symmetric one, the asymmetric neighborhood function accelerates the ordering process even in the presence of a defect. However, this asymmetry tends to generate a distorted map. The distortion can be suppressed by an improved version of the asymmetric neighborhood function. For the one-dimensional SOM, numerical results show that the number of steps required for perfect ordering is reduced from O(N^3) to O(N^2). We also discuss the ordering process of a twisted state in the two-dimensional SOM, which cannot be rectified by the ordinary symmetric neighborhood function.
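As an illustration of the idea in this abstract, the following is a minimal sketch of a one-dimensional SOM whose neighborhood function can be made asymmetric. The abstract does not give the form of the asymmetric function, so the stretching rule controlled by `beta`, the function names, and all parameter values are assumptions made only for this sketch; it does not reproduce the authors' improved method or their O(N^2) result.

```python
import math
import random


def neighborhood(offset, sigma, beta=1.0):
    """Gaussian-like neighborhood; beta > 1 widens it on the positive side.

    offset is (unit index - winner index). With beta == 1 this is the
    conventional symmetric Gaussian. The stretching rule is an assumption
    made for illustration only.
    """
    width = sigma * beta if offset > 0 else sigma / beta
    return math.exp(-(offset ** 2) / (2.0 * width ** 2))


def train_som_1d(n_units, steps, sigma=2.0, beta=1.0, rate=0.1, seed=0):
    """Train a one-dimensional SOM on inputs drawn uniformly from [0, 1]."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for _ in range(steps):
        x = rng.random()
        # The winner is the unit whose weight is closest to the input.
        winner = min(range(n_units), key=lambda i: abs(weights[i] - x))
        for i in range(n_units):
            h = neighborhood(i - winner, sigma, beta)
            weights[i] += rate * h * (x - weights[i])
    return weights


def count_order_defects(weights):
    """Count interior units whose weight is a local extremum (a fold in the map)."""
    return sum(
        1
        for i in range(1, len(weights) - 1)
        if (weights[i] - weights[i - 1]) * (weights[i + 1] - weights[i]) < 0
    )
```

Calling `train_som_1d` with `beta=1.0` and with `beta > 1` and comparing `count_order_defects` on the results is one simple way to explore how asymmetry affects ordering in this toy setting.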

3.
An algorithm for searching restriction maps   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents an algorithm that searches a DNA restriction enzyme map for regions that approximately match a shorter 'probe' map. Both the map and the probe consist of a sequence of address-enzyme pairs denoting restriction sites, and the algorithm penalizes a potential match for undetected or missing sites and for discrepancies in the distance between adjacent sites. The algorithm was designed specifically for comparing relatively short DNA sequences with a long restriction map, a problem that will become increasingly common as large physical maps are generated. The algorithm has been used to extract information from a restriction map of the entire Escherichia coli genome. Received on October 28, 1989; accepted on February 2, 1990
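The data layout described above (sequences of address-enzyme pairs) can be illustrated with a deliberately simplified search. The sketch below is not the published algorithm: it assumes no undetected or missing sites, so it only slides the probe along the map and scores discrepancies in the distances between adjacent sites. The function name `search_map` and the parameter `mismatch_penalty` are hypothetical.

```python
def search_map(long_map, probe, mismatch_penalty=1.0):
    """Return (best_cost, start_index) for the window best matching the probe.

    Both maps are lists of (address, enzyme) pairs sorted by address.
    A window is considered only when its enzyme sequence equals the probe's;
    the cost is the summed absolute difference between corresponding
    inter-site distances, scaled by mismatch_penalty.
    Returns (None, None) when no window has a matching enzyme sequence.
    """
    k = len(probe)
    probe_enzymes = [enzyme for _, enzyme in probe]
    probe_gaps = [probe[i + 1][0] - probe[i][0] for i in range(k - 1)]
    best = (None, None)
    for start in range(len(long_map) - k + 1):
        window = long_map[start:start + k]
        if [enzyme for _, enzyme in window] != probe_enzymes:
            continue
        gaps = [window[i + 1][0] - window[i][0] for i in range(k - 1)]
        cost = mismatch_penalty * sum(abs(g - p) for g, p in zip(gaps, probe_gaps))
        if best[0] is None or cost < best[0]:
            best = (cost, start)
    return best


# Usage with made-up site addresses (illustrative data, not from the paper).
long_map = [(0, "EcoRI"), (100, "BamHI"), (250, "EcoRI"), (400, "BamHI"), (520, "HindIII")]
probe = [(0, "EcoRI"), (140, "BamHI"), (260, "HindIII")]
result = search_map(long_map, probe)  # (10.0, 2)
```

Handling missing or undetected sites, as the abstract describes, would require allowing sites to be skipped on either side with a penalty, which is typically done with dynamic programming rather than a fixed-width window.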

4.
This is the first of four papers that begin to explore the possibility of automated site-directed drug design. A general outline is given of the logical steps involved in approaching the problem. The starting point is the process of knowledge acquisition about the site. An algorithm is described here for the construction of a map of hydrogen-bonding regions at protein surfaces directly from the Brookhaven Protein Data Bank coordinates. Hydrogen-bonding atoms are located, intramolecular bonds are searched for, hydrogen-bonding atoms at the surface are found and hydrogen-bonding regions are computed at the accessible surface. A grid is placed within each region discovered and the probability of hydrogen bonding at each grid point is computed. The output of the program is a map of hydrogen-bonding regions displayed within a user-defined window. This information can be used as part of a knowledge base for the automatic construction of novel ligands to fit specified binding sites.

5.
We consider the problem of finding a subnetwork in a given biological network (i.e. target network) that is most similar to a given small query network. We aim to find the optimal solution (i.e. the subnetwork with the largest alignment score) with a provable confidence bound. There is no known polynomial time solution to this problem in the literature. Alon et al. have developed a state-of-the-art coloring method that reduces the cost of this problem. This method randomly colors the target network prior to alignment for many iterations until a user-supplied confidence is reached. Here we develop a novel coloring method, named k-hop coloring (k is a positive integer), that achieves a provable confidence value in a small number of iterations without sacrificing the optimality. Our method considers the color assignments already made in the neighborhood of each target network node while assigning a color to a node. This way, it preemptively avoids many color assignments that are guaranteed to fail to produce the optimal alignment. We also develop a filtering method that eliminates the nodes that cannot be aligned without reducing the alignment score after each coloring instance. We demonstrate both theoretically and experimentally that our coloring method outperforms that of Alon et al., which is also used by a number of network alignment methods, including QPath and QNet, by a factor of three without reducing the confidence in the optimality of the result. Our experiments also suggest that the resulting alignment method is capable of identifying functionally enriched regions in the target network successfully.
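For readers unfamiliar with the random coloring approach attributed to Alon et al., the sketch below shows the basic color-coding idea on a simpler task: deciding whether a graph contains a simple path on k vertices. It does not implement network alignment scoring or the k-hop coloring proposed in this abstract; the function names, the fixed iteration count, and the seed are assumptions for illustration.

```python
import random


def has_colorful_path(adj, colors, k):
    """Check for a path on k vertices whose vertices all have distinct colors.

    adj maps each node to an iterable of neighbors; colors maps each node to
    an integer in range(k). reachable[v] holds every color set (as a bitmask)
    used by some colorful path ending at v.
    """
    reachable = {v: {1 << colors[v]} for v in adj}
    for _ in range(k - 1):
        extended = {v: set() for v in adj}
        for u in adj:
            for mask in reachable[u]:
                for v in adj[u]:
                    bit = 1 << colors[v]
                    # Extend only with a color the path has not used yet.
                    if not mask & bit:
                        extended[v].add(mask | bit)
        reachable = extended
    return any(reachable[v] for v in adj)


def find_path_by_color_coding(adj, k, iterations=200, seed=0):
    """Randomly recolor the graph up to `iterations` times; True if a k-vertex path is found."""
    rng = random.Random(seed)
    for _ in range(iterations):
        colors = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, colors, k):
            return True
    return False
```

A single random coloring makes a given k-vertex path colorful with probability k!/k^k, which is why the coloring is repeated until the desired confidence is reached. A `False` result is therefore only probabilistic evidence of absence, whereas `True` is always correct, because a path with distinct colors cannot revisit a vertex.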

6.
Information processing in the human brain arises from both interactions between adjacent areas and from distant projections that form distributed brain systems. Here we map interactions across different spatial scales by estimating the degree of intrinsic functional connectivity for the local (≤14 mm) neighborhood directly surrounding brain regions as contrasted with distant (>14 mm) interactions. The balance between local and distant functional interactions measured at rest forms a map that separates sensorimotor cortices from heteromodal association areas and further identifies regions that possess both high local and distant cortical-cortical interactions. Map estimates of network measures demonstrate that high local connectivity is most often associated with a high clustering coefficient, long path length, and low physical cost. Task performance changed the balance between local and distant functional coupling in a subset of regions, particularly, increasing local functional coupling in regions engaged by the task. The observed properties suggest that the brain has evolved a balance that optimizes information-processing efficiency across different classes of specialized areas as well as mechanisms to modulate coupling in support of dynamically changing processing demands. We discuss the implications of these observations and applications of the present method for exploring normal and atypical brain function.

7.
In this paper, heuristic solution techniques for the multi-objective orienteering problem are developed. The motivation stems from the problem of planning individual tourist routes in a city. Each point of interest in a city provides different benefits for different categories (e.g., culture, shopping). Each tourist has different preferences for the different categories when selecting and visiting the points of interest (e.g., museums, churches). Hence, a multi-objective decision situation arises. To determine all the Pareto optimal solutions, two metaheuristic search techniques are developed and applied. We use the Pareto ant colony optimization algorithm and extend the design of the variable neighborhood search method to the multi-objective case. Both methods are hybridized with path relinking procedures. The performance of the two algorithms is tested on several benchmark instances as well as on real-world instances from different Austrian regions and the cities of Vienna and Padua. The computational results show that both implemented methods perform well in solving the multi-objective orienteering problem.

8.
Self-organizing maps: stationary states, metastability and convergence rate   Cited by: 1 (self-citations: 0, citations by others: 1)
We investigate the effect of various types of neighborhood function on the convergence rates and the presence or absence of metastable stationary states of Kohonen's self-organizing feature map algorithm in one dimension. We demonstrate that the time necessary to form a topographic representation of the unit interval [0, 1] may vary over several orders of magnitude depending on the range and also the shape of the neighborhood function, by which the weight changes of the neurons in the neighborhood of the winning neuron are scaled. We will prove that for neighborhood functions which are convex on an interval given by the length of the Kohonen chain there exist no metastable states. For all other neighborhood functions, metastable states are present and may trap the algorithm during the learning process. For the widely-used Gaussian function there exists a threshold for the width above which metastable states cannot exist. Due to the presence or absence of metastable states, convergence time is very sensitive to slight changes in the shape of the neighborhood function. Fastest convergence is achieved using neighborhood functions which are "convex" over a large range around the winner neuron and yet have large differences in value at neighboring neurons.

9.
Hotspots of meiotic recombination can change rapidly over time. This instability and the reported high level of inter-individual variation in meiotic recombination put in question the accuracy of the calculated hotspot map, which is based on the summation of past genetic crossovers. To estimate the accuracy of the computed recombination rate map, we have mapped genetic crossovers to a median resolution of 70 Kb in 10 CEPH pedigrees. We then compared the positions of crossovers with the hotspots computed from HapMap data and performed extensive computer simulations to compare the observed distributions of crossovers with the distributions expected from the calculated recombination rate maps. Here we show that a population-averaged hotspot map computed from linkage disequilibrium data predicts present-day genetic crossovers well. We find that computed hotspot maps accurately estimate both the strength and the position of meiotic hotspots. An in-depth examination of not-predicted crossovers shows that they are preferentially located in regions where hotspots are found in other populations. In summary, we find that by combining several computed population-specific maps we can capture the variation in individual hotspots to generate a hotspot map that can predict almost all present-day genetic crossovers.

10.
Cryo-electron microscopy (cryo-EM) can provide important structural information on large macromolecular assemblies in different conformational states. Recent years have seen an increase in structures deposited in the Protein Data Bank (PDB) by fitting a high-resolution structure into its low-resolution cryo-EM map. A commonly used protocol for accommodating the conformational changes between the X-ray structure and the cryo-EM map is rigid body fitting of individual domains. With the emergence of different flexible fitting approaches, there is a need to compare and revise these different protocols for the fitting. We have applied three diverse automated flexible fitting approaches on a protein dataset for which rigid domain fitting (RDF) models have been deposited in the PDB. In general, a consensus is observed in the conformations, which indicates a convergence from these theoretically different approaches to the most probable solution corresponding to the cryo-EM map. However, the result shows that the convergence might not be observed for proteins with complex conformational changes or with missing densities in the cryo-EM map. In contrast, RDF structures deposited in the PDB can represent conformations that differ not only from the consensus obtained by flexible fitting but also from X-ray crystallography. Thus, this study emphasizes that a "consensus" achieved by the use of several automated flexible fitting approaches can provide a higher level of confidence in the modeled configurations. Following this protocol not only increases the confidence level of fitting, but also highlights protein regions with uncertain fitting. Hence, this protocol can lead to better interpretation of cryo-EM data.

11.
12.
An algorithm for the solution of the Maximum Entropy problem is presented, for use when the data are considerably oversampled, so that the amount of independent information they contain is very much less than the actual number of data points. The application of general purpose entropy maximisation methods is then comparatively inefficient. In this algorithm the independent variables are in the singular space of the transform between map (or image or spectrum) and data. These variables are much fewer in number than either the data or the reconstructed map, resulting in a fast and accurate algorithm. The speed of this algorithm makes feasible the incorporation of recent ideas in maximum entropy theory (Skilling 1989 a; Gull 1989). This algorithm is particularly appropriate for the exponential decay problem, solution scattering, fibre diffraction, and similar applications.

13.
Bhandarkar SM, Machaka SA, Shete SS, Kota RN. Genetics 2001, 157(3):1021-1043
Reconstructing a physical map of a chromosome from a genomic library presents a central computational problem in genetics. Physical map reconstruction in the presence of errors is a problem of high computational complexity that provides the motivation for parallel computing. Parallelization strategies for a maximum-likelihood estimation-based approach to physical map reconstruction are presented. The estimation procedure entails a gradient descent search for determining the optimal spacings between probes for a given probe ordering. The optimal probe ordering is determined using a stochastic optimization algorithm such as simulated annealing or microcanonical annealing. A two-level parallelization strategy is proposed wherein the gradient descent search is parallelized at the lower level and the stochastic optimization algorithm is simultaneously parallelized at the higher level. Implementation and experimental results on a distributed-memory multiprocessor cluster running the parallel virtual machine (PVM) environment are presented using simulated and real hybridization data.

14.
We introduce an unsupervised competitive learning rule, called the extended Maximum Entropy learning Rule (eMER), for topographic map formation. Unlike Kohonen's Self-Organizing Map (SOM) algorithm, the presence of a neighborhood function is not a prerequisite for achieving topology-preserving mappings, but instead it is intended: (1) to speed up the learning process and (2) to perform nonparametric regression. We show that, when the neighborhood function vanishes, the neural weight density at convergence approaches a linear function of the input density so that the map can be regarded as a nonparametric model of the input density. We apply eMER to density estimation and compare its performance with that of the SOM algorithm and the variable kernel method. Finally, we apply the ‘batch’ version of eMER to nonparametric projection pursuit regression and compare its performance with that of back-propagation learning, projection pursuit learning, constrained topological mapping, and the Heskes and Kappen approach. Received: 12 August 1996 / Accepted in revised form: 9 April 1997

15.
Bennett GG, McNeill LH, Wolin KY, Duncan DT, Puleo E, Emmons KM. PLoS Medicine 2007, 4(10):1599-606; discussion 1607

Background

Despite its health benefits, physical inactivity is pervasive, particularly among those living in lower-income urban communities. In such settings, neighborhood safety may impact willingness to be regularly physically active. We examined the association of perceived neighborhood safety with pedometer-determined physical activity and physical activity self-efficacy.

Methods and Findings

Participants were 1,180 predominantly racial/ethnic minority adults recruited from 12 urban low-income housing complexes in metropolitan Boston. Participants completed a 5-d pedometer data-collection protocol and self-reported their perceptions of neighborhood safety and self-efficacy (i.e., confidence in the ability to be physically active). Gender-stratified bivariate and multivariable random effects models were estimated to account for within-site clustering. Most participants reported feeling safe during the day, while just over one-third (36%) felt safe at night. We found no association between daytime safety reports and physical activity among either men or women. There was also no association between night-time safety reports and physical activity among men (p = 0.23), but women who reported feeling unsafe (versus safe) at night showed significantly fewer steps per day (4,302 versus 5,178, p = 0.01). Perceiving one's neighborhood as unsafe during the day was associated with significantly lower odds of having high physical activity self-efficacy among both men (OR 0.40, p = 0.01) and women (OR 0.68, p = 0.02).

Conclusions

Residing in a neighborhood that is perceived to be unsafe at night is a barrier to regular physical activity among individuals, especially women, living in urban low-income housing. Feeling unsafe may also diminish confidence in the ability to be more physically active. Both of these factors may limit the effectiveness of physical activity promotion strategies delivered in similar settings.

16.
Post-processing long pairwise alignments   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: The local alignment problem for two sequences requires determining similar regions, one from each sequence, and aligning those regions. For alignments computed by dynamic programming, current approaches for selecting similar regions may have potential flaws. For instance, the criterion of Smith and Waterman can lead to inclusion of an arbitrarily poor internal segment. Other approaches can generate an alignment scoring less than some of its internal segments. RESULTS: We develop an algorithm that decomposes a long alignment into sub-alignments that avoid these potential imperfections. Our algorithm runs in time proportional to the original alignment's length. Practical applications to alignments of genomic DNA sequences are described.
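The first flaw mentioned above, a high-scoring alignment that contains a poor internal segment, can be made concrete with a small sketch. The code below is not the decomposition algorithm from this paper; it only locates the lowest-scoring contiguous run of columns with a linear-time scan (Kadane's method applied to the minimum). The function name and the per-column score representation are assumptions.

```python
def worst_internal_segment(column_scores):
    """Return (score, start, end) of the minimum-scoring contiguous run.

    column_scores holds the per-column scores of an alignment; end is
    exclusive. The scan visits each column once, so the running time is
    linear in the alignment length.
    """
    if not column_scores:
        raise ValueError("column_scores must not be empty")
    best_score, best_start, best_end = column_scores[0], 0, 1
    run_score, run_start = 0, 0
    for i, score in enumerate(column_scores):
        # A positive running total can only raise the sum, so start afresh.
        if run_score > 0:
            run_score, run_start = 0, i
        run_score += score
        if run_score < best_score:
            best_score, best_start, best_end = run_score, run_start, i + 1
    return best_score, best_start, best_end


# Made-up column scores: the whole alignment scores 8, yet columns 2..5 sum to -12.
scores = [5, 5, -3, -4, 1, -6, 5, 5]
segment = worst_internal_segment(scores)  # (-12, 2, 6)
```

In this toy example, splitting the alignment around the reported segment would leave two sub-alignments that each score 10, more than the original total of 8, which is the kind of imperfection the abstract sets out to avoid.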

17.
Genomics 1999, 55(1):78-87
We have developed an integrated physical mapping computer software package (IMP), originally designed to support the physical mapping of human chromosome 13 and expanded to support several gene-identification projects based on the positional candidate approach. IMP displays map data in a form that provides useful guidelines to the end users. An integrated map with high resolution and confidence is constructed from different types of mapping data, including hybridization experiments, STS-based PCR assays, genetic linkage mapping, cDNA localization, and FISH data. The map is also designed to provide suggestions for specific experiments that are required to obtain maps with even higher resolution and confidence. To this end, the optimization employs multiple constraints that take into account already established STS “scaffold” maps. This software thus serves as an important general tool kit for physical mapping, sequencing, and gene-hunting projects.

18.
Prediction of topological representations of proteins that are geometrically invariant can contribute towards the solution of fundamental open problems in structural genomics such as folding. In this paper we focus on coarse-grained protein contact maps, a representation that describes the spatial neighborhood relation between secondary structure elements such as helices, beta sheets, and random coils. Our methodology is based on searching the graph space. The search algorithm is guided by an adaptive evaluation function computed by a specialized noncausal recursive connectionist architecture. The neural network is trained using candidate graphs generated during examples of successful searches. Our results demonstrate the viability of the approach for predicting coarse contact maps.

19.
MOTIVATION: STS-content data for genomic mapping contain numerous errors and anomalies resulting in cross-links among distant regions of the genome. Identification of contigs within the data is an important and difficult problem. RESULTS: This paper introduces a graph algorithm which creates a simplified view of STS-content data. The shape of the resulting structure graph provides a quality check - coherent data produce a straight line, while anomalous data produce branches and loops. In the latter case, it is sometimes possible to disentangle the various paths into subsets of the data covering contiguous regions of the genome, i.e. contigs. These straight subgraphs can then be analyzed in standard ways to construct a physical map. A theoretical basis for the method is presented along with examples of its application to current STS data from human genome centers. AVAILABILITY: Freely available on request.

20.
A large class of neural network models have their units organized in a lattice with fixed topology or generate their topology during the learning process. These network models can be used as neighborhood-preserving maps of the input manifold, but such a structure is difficult to manage, since these maps are graphs with a number of nodes that is just one or two orders of magnitude less than the number of input points (i.e., the complexity of the map is comparable with the complexity of the manifold). Some hierarchical algorithms have been proposed in order to obtain a high-level abstraction of these structures. In this paper a general structure capable of extracting high-order information from the graph generated by a large class of self-organizing networks is presented. This algorithm makes it possible to build a two-layer hierarchical structure starting from the results obtained by using a neural network suitable for the distribution of the input data. Moreover, the proposed algorithm is also capable of building a topology-preserving map if it is trained using a graph that is also a topology-preserving map.
