
1.

Sampling Solution Traces for the Problem of Sorting Permutations by Signed Reversals

Baudet C Dias Z Sagot MF 《Algorithms for molecular biology : AMB》2012,7(1):18

ABSTRACT: BACKGROUND: Traditional algorithms to solve the problem of sorting by signed reversals output just one optimal solution while the space of all optimal solutions can be huge. A so-called trace represents a group of solutions which share the same set of reversals that must be applied to sort the original permutation following a partial ordering. By using traces, we therefore can represent the set of optimal solutions in a more compact way. Algorithms for enumerating the complete set of traces of solutions were developed. However, due to their exponential complexity, their practical use is limited to small permutations. A partial enumeration of traces is a sampling of the complete set of traces and can be an alternative for the study of distinct evolutionary scenarios of big permutations. Ideally, the sampling should be done uniformly from the space of all optimal solutions. This is however conjectured to be #P-complete. RESULTS: We propose and evaluate three algorithms for producing a sampling of the complete set of traces that instead can be shown in practice to preserve some of the characteristics of the space of all solutions. The first algorithm (RA) performs the construction of traces through a random selection of reversals on the list of optimal 1-sequences. The second algorithm (DFALT) consists in a slight modification of an algorithm that performs the complete enumeration of traces. Finally, the third algorithm (SWA) is based on a sliding window strategy to improve the enumeration of traces. All proposed algorithms were able to enumerate traces for permutations with up to 200 elements. CONCLUSIONS: We analysed the distribution of the enumerated traces with respect to their height and average reversal length. Various works indicate that the reversal length can be an important aspect in genome rearrangements. The algorithms RA and SWA show a tendency to lose traces with high average reversal length. 
Such traces are however rare, and qualitatively our results show that, for testable-sized permutations, the algorithms DFALT and SWA produce distributions which approximate the reversal length distributions observed with a complete enumeration of the set of traces.

2.

Sorting by reciprocal translocations via reversals theory. Cited by: 1 (self-citations: 0, citations by others: 1)

Michal Ozery-Flato Ron Shamir 《Journal of computational biology》2007,14(4):408-422

The understanding of genome rearrangements is an important endeavor in comparative genomics. A major computational problem in this field is finding a shortest sequence of genome rearrangements that transforms, or sorts, one genome into another. In this paper we focus on sorting a multi-chromosomal genome by translocations. We reveal new relationships between this problem and the well studied problem of sorting by reversals. Based on these relationships, we develop two new algorithms for sorting by reciprocal translocations, which mimic known algorithms for sorting by reversals: a score-based method building on Bergeron's algorithm, and a recursive procedure similar to the Berman-Hannenhalli method. Though their proofs are more involved, our procedures for reciprocal translocations match the complexities of the original ones for reversals.

3.

Two notes on genome rearrangement

Ozery-Flato M Shamir R 《Journal of bioinformatics and computational biology》2003,1(1):71-94

A central problem in genome rearrangement is finding a most parsimonious rearrangement scenario using certain rearrangement operations. An important problem of this type is sorting a signed genome by reversals and translocations (SBRT). Hannenhalli and Pevzner presented a duality theorem for SBRT which leads to a polynomial time algorithm for sorting a multi-chromosomal genome using a minimum number of reversals and translocations. However, there is one case for which their theorem and algorithm fail. We describe that case and suggest a correction to the theorem and the polynomial algorithm. The solution of SBRT uses a reduction to the problem of sorting a signed permutation by reversals (SBR). The best extant algorithms for SBR require quadratic time. The common approach to solve SBR is by finding a safe reversal using the overlap graph or the interleaving graph of a permutation. We describe a family of signed permutations which proves a quadratic lower bound on the number of affected vertices in the overlap/interleaving graph during any optimal sorting scenario. This implies, in particular, an Omega(n^3) lower bound for Bergeron's algorithm.

4.

Perfect sorting by reversals is not always difficult

Bérard S Bergeron A Chauve C Paul C 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2007,4(1):4-16

We propose new algorithms for computing pairwise rearrangement scenarios that conserve the combinatorial structure of genomes. More precisely, we investigate the problem of sorting signed permutations by reversals without breaking common intervals. We describe a combinatorial framework for this problem that allows us to characterize classes of signed permutations for which one can compute, in polynomial time, a shortest reversal scenario that conserves all common intervals. In particular, we define a class of permutations for which this computation can be done in linear time with a very simple algorithm that does not rely on the classical Hannenhalli-Pevzner theory for sorting by reversals. We apply these methods to the computation of rearrangement scenarios between permutations obtained from 16 synteny blocks of the X chromosomes of the human, mouse, and rat.

5.

Common intervals and sorting by reversals: a marriage of necessity

Bergeron A Heber S Stoye J 《Bioinformatics (Oxford, England)》2002,18(Z2):S54-S63

This paper revisits the problem of sorting by reversals with tools developed in the context of detecting common intervals. Mixing the two approaches yields new definitions and algorithms for the reversal distance computations, that apply directly on the original permutation. Traditional constructions such as recasting the signed permutation as a positive permutation, or traversing the overlap graph to analyze its connected components, are replaced by elementary definitions in terms of intervals of the permutation. This yields simple linear time algorithms that identify the essential features in a single pass over the permutation and use only simple data structures like arrays and stacks.
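The notion of a common interval that this abstract builds on can be illustrated with a minimal sketch. It assumes the usual definition relative to the identity permutation: a stretch of the permutation is a common interval when the absolute values it contains form a run of consecutive integers. The function name `common_intervals` is hypothetical, and the quadratic scan below is a naive illustration, not the linear-time, stack-based method the abstract describes.

```python
def common_intervals(perm):
    """Return index pairs (i, j), i < j, such that perm[i..j] (inclusive),
    with signs ignored, holds a set of consecutive integers.

    Naive O(n^2) scan; singletons are omitted because they are trivial.
    """
    values = [abs(x) for x in perm]
    found = []
    for i in range(len(values)):
        lo = hi = values[i]
        for j in range(i + 1, len(values)):
            lo = min(lo, values[j])
            hi = max(hi, values[j])
            # j - i + 1 distinct values span exactly [lo, hi]
            # precisely when hi - lo == j - i.
            if hi - lo == j - i:
                found.append((i, j))
    return found


if __name__ == "__main__":
    print(common_intervals((3, -1, 2, 4)))
```

For the signed permutation (3, -1, 2, 4), the stretches at indices 0..2, 0..3 and 1..2 are reported, since {3, 1, 2}, {3, 1, 2, 4} and {1, 2} are each runs of consecutive integers.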

6.

An algorithm to enumerate sorting reversals for signed permutations. Cited by: 1 (self-citations: 0, citations by others: 1)

Adam C Siepel 《Journal of computational biology》2003,10(3-4):575-597

The rearrangement distance between single-chromosome genomes can be estimated as the minimum number of inversions required to transform the gene ordering observed in one into that observed in the other. This measure, known as "inversion distance," can be computed as the reversal distance between signed permutations. During the past decade, much progress has been made both on the problem of computing reversal distance and on the related problem of finding a minimum-length sequence of reversals, which is known as "sorting by reversals." For most problem instances, however, many minimum-length sequences of reversals exist, and in the absence of auxiliary information, no one is of greater value than the others. The problem of finding all minimum-length sequences of reversals is thus a natural generalization of sorting by reversals, yet it has received little attention. This problem reduces easily to the problem of finding all "sorting reversals" of one permutation with respect to another - that is, all reversals rho such that, if rho is applied to one permutation, then the reversal distance of that permutation from the other is decreased. In this paper, an efficient algorithm is derived to solve the problem of finding all sorting reversals, and experimental results are presented indicating that, while the new algorithm does not represent a significant improvement in asymptotic terms (it takes O(n^3) time, for permutations of size n; the problem can now be solved by brute force in Theta(n^3) time), it performs dramatically better in practice than the best known alternative. An implementation of the algorithm is available at www.cse.ucsc.edu/~acs.
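The definition of a "sorting reversal" given in this abstract can be made concrete with a small sketch. It assumes the target permutation is the identity and computes reversal distance by breadth-first search over all signed permutations, which is exponential and only usable for very small n. This is not the brute-force Theta(n^3) method the abstract mentions, which relies on a linear-time distance computation, and it is not the paper's algorithm. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations_with_replacement


def apply_reversal(perm, i, j):
    """Reverse perm[i..j] (inclusive) and flip the sign of every element in it."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def all_reversals(n):
    """All n(n+1)/2 index pairs (i, j) with 0 <= i <= j < n."""
    return list(combinations_with_replacement(range(n), 2))


def reversal_distance(perm):
    """Minimum number of signed reversals turning perm into the identity.

    Plain breadth-first search: an assumption made for illustration,
    feasible only for tiny permutations.
    """
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return dist[current]
        for i, j in all_reversals(len(current)):
            nxt = apply_reversal(current, i, j)
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    raise ValueError("identity not reachable; input is not a signed permutation")


def sorting_reversals(perm):
    """All reversals (i, j) that decrease the reversal distance to the identity."""
    perm = tuple(perm)
    d = reversal_distance(perm)
    return [(i, j) for i, j in all_reversals(len(perm))
            if reversal_distance(apply_reversal(perm, i, j)) == d - 1]


if __name__ == "__main__":
    print(reversal_distance((2, 1)))
    print(sorting_reversals((2, 1)))
```

For the permutation (+2, +1), every one of the three possible reversals is a sorting reversal, which gives a small instance of the abstract's point that many minimum-length sequences usually exist.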

7.

Reversal of tissue position after cell sorting. Cited by: 1 (self-citations: 0, citations by others: 1)

P B Armstrong R Niederman 《Developmental biology》1972,28(3):518-527

In most tissue combinations which have been studied, cell sorting results in highly reproducible tissue patterns. In a given combination of two cell types, one cell type always occupies the perimeter of the aggregate and the other always occupies the interior. In some tissue combinations, however, position reversals have been observed to occur. The present study deals with position reversals in chick embryo heart-pigmented retinal epithelium and heart-limb bud mesenchyme tissue combinations. Evidence is presented which implies that this reversal of tissue position results from changes in cellular adhesiveness during cell culture.

8.

A model for the minimum cost configuration problem in flexible manufacturing systems

Ulrich A. W. Tetzlaff 《Flexible Services and Manufacturing Journal》1995,7(2):127-146

This paper presents a mathematical programming model to help select equipment for a flexible manufacturing system, i.e., the selection of the types and numbers of CNC machines, washing stations, load/unload stations, transportation vehicles, and pallets. The objective is to minimize equipment costs and work-in-process inventory cost, while fulfilling production requirements for an average period. Queueing aspects and part flow interactions are considered with the help of a Jacksonian-type closed queueing network model in order to evaluate the system's performance. Since the related decision problem of our model can be shown to be NP-complete, the proposed solution procedure is based on implicit enumeration. Four bounds are provided, two lower and two upper bounds. A tight lower bound is obtained by linearizing the model through the application of asymptotic bound analysis. Furthermore, asymptotic bound analysis allows the calculation of a lower bound for the number of pallets in the system. The first upper bound is given by the best feasible solution and the second is based on the anti-starshaped form of the throughput function.

9.

The evolution of A-, F-, and V-type ATP synthases and ATPases: reversals in function and changes in the H+/ATP coupling ratio

Cross RL Müller V 《FEBS letters》2004,576(1-2):1-4

Members of the FoF1, AoA1 and VoV1 family of ATP synthases and ATPases have undergone at least two reversals in primary function. The first was from a progenitor proton-pumping ATPase to a proton-driven ATP synthase. The second involved transforming the synthase back into a proton-pumping ATPase. As proposed earlier [FEBS Lett. 259 (1990) 227], these reversals required changes in the H+/ATP coupling ratio from an optimal value of about 2 for an ATPase function to about 4 for an ATP synthase function. The doubling of the ratio that occurred at the ATPase-to-Synthase transition was accomplished by duplicating the gene that encodes the nucleotide-binding catalytic subunits followed by loss of function in one of the genes. The halving of the ratio that occurred at the Synthase-to-ATPase transition was achieved by a duplication/fusion of the gene that encodes the proton-binding transporter subunits, followed by a loss of function in one half of the double-sized protein. These events allowed conservation of quaternary structure, while maintaining a sufficient driving force to sustain an adequate phosphorylation potential or electrochemical gradient. Here, we describe intermediate evolutionary steps and a fine-tuning of the H+/ATP coupling ratio to optimize synthase function in response to different environments. In addition, we propose a third reversal of function, from an ATPase back to an ATP synthase. In contrast to the first two reversals which required a partial loss in function, the change in coupling ratio required for the third reversal is explained by a gain in function.

10.

Genome rearrangement based on reversals that preserve conserved intervals

Bernt M Merkle D Middendorf M 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2006,3(3):275-288

The order of genes in the genomes of species can change during evolution and can provide information about their phylogenetic relationship. An interesting method to infer the phylogenetic relationship from the gene orders is to use different types of rearrangement operations and to find possible rearrangement scenarios using these operations. One of the most common rearrangement operations is the reversal, which reverses the order of a contiguous block of neighboring genes. In this paper, we study the problem of finding the ancestral gene order for three species represented by their gene orders. The rearrangement scenario should use a minimal number of reversals and no other rearrangement operations. This problem is called the Median problem and is known to be NP-complete. In this paper, we describe a heuristic algorithm for finding solutions to the Median problem that searches for rearrangement scenarios with the additional property that gene groups should not be destroyed by reversal operations. The concept of conserved intervals for signed permutations is used to describe such gene groups. We show experimentally, for different types of test problems, that the proposed algorithm produces very good results compared to other algorithms for the Median problem. We also integrate our reversal selection procedure into the well-known MGR and GRAPPA algorithms and show that they achieve a significant speedup while obtaining solutions of the same quality as the original algorithms on the test problems.

11.

Efficient Record Linkage Algorithms Using Complete Linkage Clustering

Abdullah-Al Mamun Robert Aseltine Sanguthevar Rajasekaran 《PloS one》2016,11(4)

Datasets from different agencies often contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times.

12.

Exploring the solution space of sorting by reversals, with experiments and an application to evolution

Braga MD Sagot MF Scornavacca C Tannier E 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2008,5(3):348-356

In comparative genomics, algorithms that sort permutations by reversals are often used to propose evolutionary scenarios of rearrangements between species. One of the main problems of such methods is that they give one solution while the number of optimal solutions is huge, with no criteria to discriminate among them. Bergeron et al. started to give some structure to the set of optimal solutions, in order to be able to deliver more presentable results than only one solution or a complete list of all solutions. However, no algorithm exists so far to compute this structure except through the enumeration of all solutions, which takes too much time even for small permutations. Bergeron et al. state as an open problem the design of such an algorithm. We propose in this paper an answer to this problem, that is, an algorithm which gives all the classes of solutions and counts the number of solutions in each class, with a better theoretical and practical complexity than the complete enumeration method. We give an example of how to reduce the number of classes obtained, using further constraints. Finally, we apply our algorithm to analyse the possible scenarios of rearrangement between mammalian sex chromosomes.

13.

On the repeat-annotated phylogenetic tree reconstruction problem.

Firas Swidan Michal Ziv-Ukelson Ron Y Pinter 《Journal of computational biology》2006,13(8):1397-1418

A new problem in phylogenetic inference is presented, based on recent biological findings indicating a strong association between reversals (i.e., inversions) and repeats. These biological findings are formalized here in a new mathematical model, called repeat-annotated phylogenetic trees (RAPT). We show that, under RAPT, the evolutionary process, including both the tree topology and the internal node genome orders, is uniquely determined, a property that is of major significance both in theory and in practice. Furthermore, the repeats are employed to provide linear-time algorithms for reconstructing both the genomic orders and the phylogeny, which are NP-hard problems under the classical model of sorting by reversals (SBR).

14.

Reversal distance for partially ordered genomes

Zheng C Lenert A Sankoff D 《Bioinformatics (Oxford, England)》2005,21(Z1):i502-i508

MOTIVATION: The total order of the genes or markers on a chromosome inherent in its representation as a signed permutation must often be weakened to a partial order in the case of real data. This is due to lack of resolution (where several genes are mapped to the same chromosomal position), to missing data from some of the datasets used to compile a gene order, and to conflicts between these datasets. The available genome rearrangement algorithms, however, require total orders as input. A more general approach is needed to handle rearrangements of gene partial orders. RESULTS: We formalize the uncertainty in gene order data by representing a chromosome from each genome as a partial order, summarized by a directed acyclic graph (DAG). The rearrangement problem is then to infer a minimal sequence of reversals for transforming any topological sort of one DAG to any one of the other DAG. Each topological sort represents a possible linearization compatible with all the datasets on the chromosome. The set of all possible topological sorts is embedded in each DAG by appropriately augmenting the edge set, so that it becomes a general directed graph (DG). The DGs representing chromosomes of two genomes are combined to produce a bicoloured graph from which we extract a maximal decomposition into alternating coloured cycles, and from which, in turn, an optimal sequence of reversals can usually be identified. We test this approach on simulated incomplete comparative maps and on cereal chromosomal maps drawn from the Gramene browser.
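The statement that each topological sort of the DAG is one possible linearization of the gene order can be illustrated with a short sketch. It enumerates every topological sort of a small DAG by backtracking. The function name `topological_sorts` and the toy gene labels are assumptions for illustration only. The abstract's method avoids this explicit enumeration by embedding the set of sorts in an augmented graph, so the code shows the concept, not the paper's algorithm.

```python
def topological_sorts(nodes, edges):
    """Yield every linear order of `nodes` compatible with the partial order
    given by `edges`, a list of (before, after) pairs forming a DAG.

    Backtracking enumeration; the number of results can grow factorially.
    """
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    order = []
    placed = set()

    def extend():
        if len(order) == len(nodes):
            yield tuple(order)
            return
        for v in nodes:
            # A gene may come next only if all of its predecessors are placed.
            if v not in placed and indegree[v] == 0:
                placed.add(v)
                order.append(v)
                for w in successors[v]:
                    indegree[w] -= 1
                yield from extend()
                for w in successors[v]:
                    indegree[w] += 1
                order.pop()
                placed.remove(v)

    yield from extend()


if __name__ == "__main__":
    # Genes 2 and 3 are unresolved relative to each other (hypothetical data).
    genes = [1, 2, 3, 4]
    constraints = [(1, 2), (1, 3), (2, 4), (3, 4)]
    for linearization in topological_sorts(genes, constraints):
        print(linearization)
```

In the toy input, genes 2 and 3 are not ordered relative to each other, so the partial order admits exactly two linearizations.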

15.

Double cut and join with insertions and deletions

Braga MD Willing E Stoye J 《Journal of computational biology》2011,18(9):1167-1184

Many approaches to compute the genomic distance are still limited to genomes with the same content, without duplicated markers. However, differences in the gene content are frequently observed and can reflect important evolutionary aspects. While duplicated markers can hardly be handled by exact models, when duplicated markers are not allowed, a few polynomial time algorithms that include genome rearrangements, insertions and deletions were already proposed. In an attempt to improve these results, in the present work we give the first linear time algorithm to compute the distance between two multichromosomal genomes with unequal content, but without duplicated markers, considering insertions, deletions and double cut and join (DCJ) operations. We derive from this approach algorithms to sort one genome into another one also using DCJ operations, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions. We also show that, although the triangle inequality can be disrupted in the proposed genomic distance, it is possible to correct this problem adopting a surcharge on the number of non-common markers. We use our method to analyze six species of Rickettsia, a group of obligate intracellular parasites, and identify preliminary evidence of clusters of deletions.

16.

Computing the reversal distance between genomes in the presence of multi-gene families via binary integer programming

Suksawatchon J Lursinsap C Bodén M 《Journal of bioinformatics and computational biology》2007,5(1):117-133

Hannenhalli and Pevzner developed the first polynomial-time algorithm for the combinatorial problem of sorting signed genomic data. Their algorithm determines the minimum number of reversals required for rearranging a genome to another, but only in the absence of gene duplicates. However, duplicates often account for 40% of a genome. In this paper, we show how to extend Hannenhalli and Pevzner's approach to deal with genomes with multi-gene families. We propose a new heuristic algorithm to compute the nearest reversal distance between two genomes with multi-gene families via binary integer programming. The experimental results on both synthetic and real biological data demonstrate that the proposed algorithm is able to find the reversal distance with high accuracy.

17.

A single ubiquitin is sufficient for cargo protein entry into MVBs in the absence of ESCRT ubiquitination

Stringer DK Piper RC 《The Journal of cell biology》2011,192(2):229-242

ESCRTs (endosomal sorting complexes required for transport) bind and sequester ubiquitinated membrane proteins and usher them into multivesicular bodies (MVBs). As Ubiquitin (Ub)-binding proteins, ESCRTs themselves become ubiquitinated. However, it is unclear whether this regulates a critical aspect of their function or is a nonspecific consequence of their association with the Ub system. We investigated whether ubiquitination of the ESCRTs was required for their ability to sort cargo into the MVB lumen. Although we found that Rsp5 was the main Ub ligase responsible for ubiquitination of ESCRT-0, elimination of Rsp5 or elimination of the ubiquitinatable lysines within ESCRT-0 did not affect MVB sorting. Moreover, by fusing the catalytic domain of deubiquitinating peptidases onto ESCRTs, we could block ESCRT ubiquitination and the sorting of proteins that undergo Rsp5-dependent ubiquitination. Yet, proteins fused to a single Ub moiety were efficiently delivered to the MVB lumen, which strongly indicates that a single Ub is sufficient in sorting MVBs in the absence of ESCRT ubiquitination.

18.

Lower bounds on multiple sequence alignment using exact 3-way alignment

Charles J Colbourn Sudhir Kumar 《BMC bioinformatics》2007,8(1):140

Background

Multiple sequence alignment is fundamental. Exponential growth in computation time appears to be inevitable when an optimal alignment is required for many sequences. Exact costs of optimum alignments are therefore rarely computed. Consequently much effort has been invested in algorithms for alignment that are heuristic, or explore a restricted class of solutions. These give an upper bound on the alignment cost, but it is equally important to determine the quality of the solution obtained. In the absence of an optimal alignment with which to compare, lower bounds may be calculated to assess the quality of the alignment. As more effort is invested in improving upper bounds (alignment algorithms), it is therefore important to improve lower bounds as well. Although numerous cost metrics can be used to determine the quality of an alignment, many are based on sum-of-pairs (SP) measures and their generalizations.

19.

HAPLOFREQ--estimating haplotype frequencies efficiently.

Eran Halperin Elad Hazan 《Journal of computational biology》2006,13(2):481-500

A commonly used tool in disease association studies is the search for discrepancies between the haplotype distribution in the case and control populations. In order to find this discrepancy, the haplotype frequency in each of the populations is estimated from the genotypes. We present a new method HAPLOFREQ to estimate haplotype frequencies over a short genomic region given the genotypes or haplotypes with missing data or sequencing errors. Our approach incorporates a maximum likelihood model based on a simple random generative model which assumes that the genotypes are independently sampled from the population. We first show that if the phased haplotypes are given, possibly with missing data, we can estimate the frequency of the haplotypes in the population by finding the global optimum of the likelihood function in polynomial time. If the haplotypes are not phased, finding the maximum value of the likelihood function is NP-hard. In this case, we define an alternative likelihood function which can be thought of as a relaxed likelihood function. We show that the maximum relaxed likelihood can be found in polynomial time and that the optimal solution of the relaxed likelihood asymptotically approaches the haplotype frequencies in the population. In contrast to previous approaches, our algorithms are guaranteed to converge in polynomial time to a global maximum of the different likelihood functions. We compared the performance of our algorithm to the widely used program PHASE, and we found that our estimates are at least 10% more accurate than PHASE and about ten times faster than PHASE. Our techniques involve new algorithms in convex optimization. These algorithms may be of independent interest. Particularly, they may be helpful in other maximum likelihood problems arising from survey sampling.

20.

Intracranial and extracranial recordings of the auditory middle latency response

《Electronic Notes in Theoretical Computer Science》1988,71(4):296-303

Simultaneous epidural and cortical depth recordings of the auditory middle latency response (MLR) were obtained from 18 anesthetized guinea pigs. Microelectrodes were advanced at a right angle to the cortical surface at sites shown to be optimal for recording surface MLRs. Transcortical polarity reversals of waves A (14 msec) and B (24 msec) of the MLR were recorded in depth penetrations initiated at sites on the temporal lobe with large amplitude surface potentials. In 6 of 18 penetrations yielding phase inversions, wave polarities changed abruptly as microelectrodes were advanced into the cortex. In the remaining penetrations, the reversals were preceded by gradual decreases in wave latencies at progressively deep sites. As electrodes were advanced beyond the depth at which polarity reversals were encountered, decreases in amplitude and only minor changes in latency were observed. Surface and depth MLR activity were temporarily eliminated immediately after electrolytic lesions were made at polarity reversal sites. Recovery of responses occurred within 30–60 min. Lesions produced in penetrations initiated at sites with no surface MLR activity had no effect. Histologic examination confirmed the location of the phase reversal sites as being within grey matter of the temporal lobe. These results are consistent with previous investigations in experimental animals which demonstrated transcortical polarity reversals, and provide evidence for dipolar generating systems of the early components of the MLR at the cortical level.