Similar articles
 Found 20 similar articles; search took 15 ms
1.
Clustering multiple surfaces is a challenging task, particularly when the surfaces intersect. Available methods such as Isomap fail to capture the true shape of the surface near the intersection and therefore cluster incorrectly. The Isomap algorithm uses shortest paths between points. The main drawback of the shortest path algorithm is its lack of a curvature constraint, which allows a path to connect points on different surfaces. In this paper we tackle this problem by imposing a curvature constraint on the shortest path algorithm used in Isomap. The algorithm chooses several landmark nodes at random and then checks whether there is a curvature-constrained path between each landmark node and every other node in the neighborhood graph. We build a binary feature vector for each point, where each entry represents the connectivity of that point to a particular landmark. The binary feature vectors can then be used as input to a conventional clustering algorithm such as hierarchical clustering. We apply our method to simulated and some real datasets and show that it performs comparably to the best methods, such as K-manifold and spectral multi-manifold clustering.

2.
Multiple alignment is an important problem in computational biology. It is well known that it can be solved exactly by a dynamic programming algorithm which in turn can be interpreted as a shortest path computation in a directed acyclic graph. The A* algorithm (or goal-directed unidirectional search) is a technique that speeds up the computation of a shortest path by transforming the edge lengths without losing the optimality of the shortest path. We implemented the A* algorithm in a computer program similar to MSA (Gupta et al., 1995) and FMA (Shibuya and Imai, 1997). We incorporated in this program new bounding strategies for both lower and upper bounds and show that the A* algorithm, together with our improvements, can speed up computations considerably. Additionally, we show that the A* algorithm together with a standard bounding technique is superior to the well-known Carrillo-Lipman bounding since it excludes more nodes from consideration.
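The goal-directed search described in this abstract can be illustrated with a minimal A* sketch on a small graph. This is not the authors' alignment program: the graph, the heuristic values, and the function name `a_star` are hypothetical, and the sketch uses the usual priority form (distance so far plus a lower bound on the remaining distance) instead of explicitly transformed edge lengths.

```python
import heapq

def a_star(graph, source, target, heuristic):
    """Return the shortest-path length from source to target, or None.

    graph: dict mapping node -> list of (neighbor, edge_length) pairs.
    heuristic: function returning a lower bound on the distance to target.
    """
    best = {source: 0}
    frontier = [(heuristic(source), source)]
    while frontier:
        priority, node = heapq.heappop(frontier)
        if node == target:
            return best[node]
        if priority > best[node] + heuristic(node):
            continue  # stale queue entry: a shorter route was found later
        for neighbor, length in graph.get(node, []):
            candidate = best[node] + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(frontier, (candidate + heuristic(neighbor), neighbor))
    return None

# Hypothetical directed acyclic graph and lower bounds, for illustration only.
graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 1)], "t": []}
bounds = {"s": 3, "a": 3, "b": 1, "t": 0}
shortest = a_star(graph, "s", "t", bounds.get)
```

With a heuristic that never overestimates the remaining distance, the search returns the same length as an uninformed search while typically expanding fewer nodes.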

3.
Buzas JS, Wager CG, Lansky DM. Biometrics, 2011, 67(4): 1189-1196
This article explores effective implementation of split-plot designs in serial dilution bioassay using robots. We show that the shortest path for a robot to fill plate wells for a split-plot design is equivalent to the shortest common supersequence problem in combinatorics. We develop an algorithm for finding the shortest common supersequence, provide an R implementation, and explore the distribution of the number of steps required to implement split-plot designs for bioassay through simulation. We also show how to construct collections of split plots that can be filled in a minimal number of steps, thereby demonstrating that split-plot designs can be implemented with nearly the same effort as strip-plot designs. Finally, we provide guidelines for modeling data that result from these designs.
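For readers unfamiliar with the shortest common supersequence problem, the two-sequence case can be sketched with textbook dynamic programming. This is an illustrative assumption, not the algorithm or the R implementation from the article; the function name `shortest_common_supersequence` is hypothetical, and the general problem over many sequences is much harder than this two-sequence case.

```python
def shortest_common_supersequence(a, b):
    """Return one shortest string that contains both a and b as subsequences."""
    m, n = len(a), len(b)
    # length[i][j] = length of the shortest common supersequence of a[i:] and b[j:]
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i == m:
                length[i][j] = n - j
            elif j == n:
                length[i][j] = m - i
            elif a[i] == b[j]:
                length[i][j] = 1 + length[i + 1][j + 1]
            else:
                length[i][j] = 1 + min(length[i + 1][j], length[i][j + 1])
    # Walk the table from the start, always taking a step that stays optimal.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif length[i + 1][j] <= length[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return "".join(out)

result = shortest_common_supersequence("abac", "cab")
```

The length of the result equals len(a) + len(b) minus the length of a longest common subsequence of the two inputs.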

4.
Yang J, Chen Y. PLoS ONE, 2011, 6(7): e22557
Betweenness centrality is an essential index for analysis of complex networks. However, the calculation of betweenness centrality is quite time-consuming and the fastest known algorithm uses O(N(M + N log N)) time and O(N + M) space for weighted networks, where N and M are the number of nodes and edges in the network, respectively. By inserting virtual nodes into the weighted edges and transforming the shortest path problem into a breadth-first search (BFS) problem, we propose an algorithm that can compute the betweenness centrality in O(wDN^2) time for integer-weighted networks, where w is the average weight of edges and D is the average degree in the network. Considerable time can be saved with the proposed algorithm when w < log N/D + 1, indicating that it is suitable for lightly weighted large sparse networks. A similar concept of virtual node transformation can be used to calculate other shortest path based indices such as closeness centrality, graph centrality, stress centrality, and so on. Numerical simulations on various randomly generated networks reveal that it is feasible to use the proposed algorithm in large network analysis.
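The virtual-node idea in this abstract can be sketched as follows: an edge of integer weight w is replaced by a chain of w unit edges, after which plain BFS yields weighted shortest-path distances. The sketch covers only this distance step, not the betweenness accumulation, and it assumes an undirected simple graph with each edge listed once; the function names and the example edges are hypothetical.

```python
from collections import deque

def expand_with_virtual_nodes(edges):
    """Replace each edge (u, v, w), w a positive integer, by a chain of w unit edges."""
    adjacency = {}
    for u, v, w in edges:
        # w - 1 virtual nodes are inserted between u and v.
        chain = [u] + [("virtual", u, v, k) for k in range(1, w)] + [v]
        for a, b in zip(chain, chain[1:]):
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)
    return adjacency

def bfs_distances(adjacency, source):
    """Return hop counts from source; on the expanded graph these equal weighted distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Hypothetical integer-weighted network, for illustration only.
edges = [("a", "b", 3), ("b", "c", 1), ("a", "c", 5)]
distances = bfs_distances(expand_with_virtual_nodes(edges), "a")
```

The expanded graph grows with the total edge weight, which is consistent with the abstract's remark that the approach suits lightly weighted sparse networks.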

5.
Analysis of the pattern of the chromosomal localization of quantitative trait loci (QTLs) is necessary for comprehensively understanding their functions. The chromosomal localization of QTLs controlling milk production traits has been studied in cattle chromosomes. The distribution of QTLs between chromosomes has proved to be binomial. Their distribution along each chromosome was, in general, uniform, except for the QTLs controlling the somatic cell score (SCS), which tended towards telomeric location. However, there are chromosomes either enriched with or particularly poor in QTLs. The QTL distribution patterns are the most similar for the milk yield (M) and milk protein yield (P) and for milk fat yield (F) and milk fat content (%F). The pattern of the SCS QTLs stands out among those of other QTLs. The distance between the QTLs of contrasting traits is the shortest for M and P QTLs, longer for M and milk protein content (%P) QTLs, and still longer for M and %F QTLs, which may be explained by QTL pleiotropy, a common phenomenon in cattle.

6.
We report that for population data, where sequences are very similar to one another, it is often possible to use a two-pronged (MinMax Squeeze) approach to prove that a tree is the shortest possible under the parsimony criterion. Such population data can be in a range where parsimony is a maximum likelihood estimator. This is in sharp contrast to the case with species data, where sequences are much further apart and the problem of guaranteeing an optimal phylogenetic tree is known to be computationally prohibitive for realistic numbers of species, irrespective of whether likelihood or parsimony is the optimality criterion. The Squeeze uses both an upper bound (the length of the shortest tree known) and a lower bound derived from partitions of the columns (the length of the shortest tree possible). If the two bounds meet, the shortest known tree is thus proven to be a shortest possible tree. The implementation is first tested on simulated data sets and then applied to 53 complete human mitochondrial genomes. The shortest possible trees for those data have several significant improvements from the published tree. Namely, a pair of Australian lineages comes deeper in the tree (in agreement with archaeological data), and the non-African part of the tree shows greater agreement with the geographical distribution of lineages.

7.

Background

Somatic cell score (SCS) has been promoted as a selection criterion to improve mastitis resistance. However, SCS from healthy and infected animals may be considered as separate traits. Moreover, imperfect sensitivity and specificity could influence animals' classification and affect estimated variance components. This study was aimed at: (1) estimating the heritability of bacteria negative SCS, bacteria positive SCS, and infection status, (2) estimating phenotypic and genetic correlations between bacteria negative and bacteria positive SCS, and the genetic correlation between bacteria negative SCS and infection status, and (3) evaluating the impact of imperfect diagnosis of infection on variance component estimates.

Methods

Data on SCS and udder infection status for 1,120 ewes were collected from four Valle del Belice flocks. The pedigree file included 1,603 animals. The SCS dataset was split according to whether animals were infected or not at the time of sampling. A repeatability test-day animal model was used to estimate genetic parameters for SCS traits and the heritability of infection status. The genetic correlation between bacteria negative SCS and infection status was estimated using an MCMC threshold model, implemented by Gibbs Sampling.

Results

The heritability was 0.10 for bacteria negative SCS, 0.03 for bacteria positive SCS, and 0.09 for infection status, on the liability scale. The genetic correlation between bacteria negative and bacteria positive SCS was 0.62, suggesting that they may be genetically different traits. The genetic correlation between bacteria negative SCS and infection status was 0.51. We demonstrate that imperfect diagnosis of infection leads to underestimation of differences between bacteria negative and bacteria positive SCS, and we derive formulae to predict impacts on estimated genetic parameters.

Conclusions

The results suggest that bacteria negative and bacteria positive SCS are genetically different traits. A positive genetic correlation between bacteria negative SCS and liability to infection was found, suggesting that the approach of selecting animals for decreased SCS should help to reduce mastitis prevalence. However, the results show that imperfect diagnosis of infection has an impact on estimated genetic parameters, which may reduce the efficiency of selection strategies aiming at distinguishing between bacteria negative and bacteria positive SCS.

8.
The shortest common supersequence problem is a classical problem with many applications in different fields such as planning, Artificial Intelligence and especially Bioinformatics. Due to its NP-hardness, we cannot expect to solve this problem efficiently using conventional exact techniques. This paper presents a heuristic to tackle this problem based on the use at different levels of a probabilistic variant of a classical heuristic known as Beam Search. The proposed algorithm is empirically analysed and compared to current approaches in the literature. Experiments show that it provides better quality solutions in a reasonable time for medium and large instances of the problem. For very large instances, our heuristic also provides better solutions, but required execution times may increase considerably.

9.

Background  

Interaction graphs (signed directed graphs) provide an important qualitative modeling approach for Systems Biology. They enable the analysis of causal relationships in cellular networks and can even be useful for predicting qualitative aspects of systems dynamics. Fundamental issues in the analysis of interaction graphs are the enumeration of paths and cycles (feedback loops) and the calculation of shortest positive/negative paths. These computational problems have been discussed only to a minor extent in the context of Systems Biology and in particular the shortest signed paths problem requires algorithmic developments.

10.
Sorting by reciprocal translocations via reversals theory.   Total citations: 1; self-citations: 0; citations by others: 1
The understanding of genome rearrangements is an important endeavor in comparative genomics. A major computational problem in this field is finding a shortest sequence of genome rearrangements that transforms, or sorts, one genome into another. In this paper we focus on sorting a multi-chromosomal genome by translocations. We reveal new relationships between this problem and the well studied problem of sorting by reversals. Based on these relationships, we develop two new algorithms for sorting by reciprocal translocations, which mimic known algorithms for sorting by reversals: a score-based method building on Bergeron's algorithm, and a recursive procedure similar to the Berman-Hannenhalli method. Though their proofs are more involved, our procedures for reciprocal translocations match the complexities of the original ones for reversals.

11.
In increasingly competitive global markets, enterprises face challenges in responding to customer orders quickly, as well as producing customized products cost-effectively. This paper proposes a dynamic heuristic-based algorithm for the part input sequencing problem of flexible manufacturing systems (FMSs) in a mass customization (MC) environment. The FMS manufactures a variety of parts, and customer orders arrive dynamically with order size as small as one. Segmental set functions are established in the proposed algorithm to apply the strategy of dynamic workload balancing, and the shortest processing time (SPT) scheduling rule. Theoretical analysis is performed and the effectiveness of the algorithm in dynamic workload balancing under the complex and dynamic environment is proven. The application of the algorithm is illustrated by an example. The potential of its practical applications to the FMSs in make-to-order (MTO) supply chains is also discussed. Directions for further research are provided.

12.
A novel ACO algorithm for optimization via reinforcement and initial bias   Total citations: 1; self-citations: 0; citations by others: 1
In this paper, we introduce the MAF-ACO algorithm, which emulates the foraging behavior of ants found in nature. In addition to the usual pheromone model present in ACO algorithms, we introduce an incremental learning component. We view the components of the MAF-ACO algorithm as stochastic approximation algorithms and use the ordinary differential equation (o.d.e.) method to analyze their convergence. We examine how the local stigmergic interaction of the individual ants results in an emergent dynamic programming framework. The MAF-ACO algorithm is also applied to the multi-stage shortest path problem and the traveling salesman problem. Research of Prof. V.S. Borkar was supported in part by grant no. III.5(157)/99-ET and a J.C. Bose Fellowship from the Department of Science and Technology, Government of India.

13.
Comparing and computing distances between phylogenetic trees are important biological problems, especially for models where edge lengths play an important role. The geodesic distance measure between two phylogenetic trees with edge lengths is the length of the shortest path between them in the continuous tree space introduced by Billera, Holmes, and Vogtmann. This tree space provides a powerful tool for studying and comparing phylogenetic trees, both in exhibiting a natural distance measure and in providing a euclidean-like structure for solving optimization problems on trees. An important open problem is to find a polynomial time algorithm for finding geodesics in tree space. This paper gives such an algorithm, which starts with a simple initial path and moves through a series of successively shorter paths until the geodesic is attained.

14.
Minimization of the makespan of a printed circuit board assembly process is a complex problem. Decisions involved in this problem concern the specification of the order in which components are to be placed on the board and the assignment of component types to the feeder slots of the placement machine. If some component types are assigned to multiple feeder slots, an additional problem emerges: for each placement on the board, one must select the feeder slot from which the required component is to be retrieved. In this paper, we consider this component retrieval problem for placement machines of the Fuji CP type. We explain why simple forward dynamic programming schemes cannot provide a solution to this problem, invalidating the correctness of an algorithm proposed by Bard, Clayton, and Feo (1994). We then present a polynomial algorithm that solves the problem to optimality. The analysis of the component retrieval problem is facilitated by its reformulation as a PERT/CPM problem with design aspects: finding the minimal makespan of the assembly process amounts to identifying a design for which the longest path in the induced PERT/CPM network is shortest. The complexity of this network problem is analyzed, and we prove that the polynomial solvability of the component retrieval problem is caused by the specific structure it inflicts on the arc lengths of the network: in the absence of this structure, the network problem is shown to be NP-hard.

15.
An artificial neural network with a two-layer feedback topology and generalized recurrent neurons is developed for solving nonlinear discrete dynamic optimization problems. A direct method to assign the weights of neural networks is presented. The method is based on Bellman's Optimality Principle and on the interchange of information which occurs during the synaptic chemical processing among neurons. The neural network based algorithm is an advantageous approach for dynamic programming due to the inherent parallelism of the neural networks; further, it reduces the severity of computational problems that can occur in conventional methods. Some illustrative application examples, including the shortest path and fuzzy decision making problems, are presented to show how this approach works.

16.
A graphics processing unit (GPU) has been widely used to accelerate discrete optimization problems. In this paper, we introduce a novel hybrid parallel algorithm to generate a shortest addition chain for a positive integer e. The main idea of the proposed algorithm is to divide the search tree into a sequence of three subtrees: top, middle, and bottom. The top subtree works using a branch and bound depth first strategy. The middle subtree works using a branch and bound breadth first strategy, while the bottom subtree works using a parallel (GPU) branch and bound depth first strategy. Our experimental results show that, compared to the fastest sequential algorithm for generating a shortest addition chain, we improve the generation by about 70% using one GPU (NVIDIA GeForce GTX 770). For generating all shortest addition chains, the percentage of the improvement is about 50%.
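To make the search problem in this abstract concrete, the following is a minimal sequential sketch of a depth-first branch and bound search for one shortest addition chain. It is not the hybrid GPU algorithm of the paper: the function name `shortest_addition_chain`, the iterative deepening over the chain length, and the simple doubling bound are assumptions chosen for brevity.

```python
def shortest_addition_chain(e):
    """Return one shortest addition chain 1 = a0 < a1 < ... < ar = e."""
    if e == 1:
        return [1]

    def search(chain, limit):
        last = chain[-1]
        if last == e:
            return list(chain)
        steps_left = limit - (len(chain) - 1)
        # Bound: each step can at most double the largest element so far.
        if steps_left == 0 or last << steps_left < e:
            return None
        # Branch: every sum of two chain elements that extends the chain.
        sums = {a + b for a in chain for b in chain if last < a + b <= e}
        for value in sorted(sums, reverse=True):
            chain.append(value)
            found = search(chain, limit)
            chain.pop()
            if found:
                return found
        return None

    # Iterative deepening: floor(log2(e)) steps is a lower bound on the length.
    limit = e.bit_length() - 1
    while True:
        found = search([1], limit)
        if found:
            return found
        limit += 1

chain_for_15 = shortest_addition_chain(15)
```

The number of additions is the chain length minus one, so a chain for 15 with six elements uses five additions. The search is exponential in general, which is why parallel strategies such as the one in the abstract are of interest.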

17.
We propose new algorithms for computing pairwise rearrangement scenarios that conserve the combinatorial structure of genomes. More precisely, we investigate the problem of sorting signed permutations by reversals without breaking common intervals. We describe a combinatorial framework for this problem that allows us to characterize classes of signed permutations for which one can compute, in polynomial time, a shortest reversal scenario that conserves all common intervals. In particular, we define a class of permutations for which this computation can be done in linear time with a very simple algorithm that does not rely on the classical Hannenhalli-Pevzner theory for sorting by reversals. We apply these methods to the computation of rearrangement scenarios between permutations obtained from 16 synteny blocks of the X chromosomes of the human, mouse, and rat.
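The basic operation behind sorting by reversals can be sketched in a few lines: a reversal takes a segment of a signed permutation, reverses its order, and flips the sign of every element in it. The sketch shows only this operation, not the authors' algorithms or the common-interval constraint; the function name `reverse_signed` and the example permutation are hypothetical.

```python
def reverse_signed(perm, i, j):
    """Apply the reversal of positions i..j (inclusive, 0-based) to a signed permutation."""
    # Reverse the order of the segment and flip each sign.
    segment = [-x for x in reversed(perm[i:j + 1])]
    return perm[:i] + segment + perm[j + 1:]

# Hypothetical example: one reversal sorts this permutation to the identity.
start = [1, -3, -2, 4]
after = reverse_signed(start, 1, 2)
```

A reversal scenario is a sequence of such operations that transforms one signed permutation into another, and a shortest scenario uses as few of them as possible.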

18.
An analogy between the evolution of organisms and some complex computational problems (cryptosystem cracking, determination of the shortest path in a graph) is considered. It is shown that in the absence of a priori information about possible species of organisms such a problem is complex (is rated in the class NP) and cannot be solved in a polynomial number of steps. This conclusion suggests the need for re-examination of evolution mechanisms. Ideas of a deterministic approach to the evolution are discussed.

19.
Biomedical spectroscopic experiments generate large volumes of data. For accurate, robust diagnostic tools the data must be analyzed for only a few characteristic observations per subject, and a large number of subjects must be studied. We describe here two of the current data analytic approaches applied to this problem: SIMCA (principal component analysis, partial least squares), and the statistical classification strategy (SCS). We demonstrate the application of the SCS by three examples of its use in analyzing 1H NMR spectra: screening for colon cancer, characterization of thyroid cancer, and distinguishing cancer from cholangitis in the biliary tract.

20.
One ultimate goal of metabolic network modeling is the rational redesign of biochemical networks to optimize the production of certain compounds by cellular systems. Although several constraint-based optimization techniques have been developed for this purpose, methods for systematic enumeration of intervention strategies in genome-scale metabolic networks are still lacking. In principle, Minimal Cut Sets (MCSs; inclusion-minimal combinations of reaction or gene deletions that lead to the fulfilment of a given intervention goal) provide an exhaustive enumeration approach. However, their disadvantage is the combinatorial explosion in larger networks and the requirement to compute first the elementary modes (EMs), which itself is impractical in genome-scale networks. We present MCSEnumerator, a new method for effective enumeration of the smallest MCSs (with fewest interventions) in genome-scale metabolic network models. For this we combine two approaches, namely (i) the mapping of MCSs to EMs in a dual network, and (ii) a modified algorithm by which shortest EMs can be effectively determined in large networks. In this way, we can identify the smallest MCSs by calculating the shortest EMs in the dual network. Realistic application examples demonstrate that our algorithm is able to list thousands of the most efficient intervention strategies in genome-scale networks for various intervention problems. For instance, for the first time we could enumerate all synthetic lethals in E. coli with combinations of up to 5 reactions. We also applied the new algorithm exemplarily to compute strain designs for growth-coupled synthesis of different products (ethanol, fumarate, serine) by E. coli. We found numerous new engineering strategies, some requiring fewer knockouts and guaranteeing higher product yields (even without the assumption of optimal growth) than reported previously.
The strength of the presented approach is that smallest intervention strategies can be quickly calculated and screened with neither network size nor the number of required interventions posing major challenges.
