Similar articles
1.
Genotype-phenotype mapping: genes as computer programs (cited 10 times: 0 self-citations, 10 by others)
The effects of genes on phenotype are mediated by processes that are typically unknown but whose determination is desirable. The conversion from gene to phenotype is not a simple function of individual genes, but involves the complex interactions of many genes; it is what is known as a nonlinear mapping problem. A computational method called genetic programming allows the representation of candidate nonlinear mappings in several possible trees. To find the best model, the trees are 'evolved' by processes akin to mutation and recombination, and the trees that more closely represent the actual data are preferentially selected. The result is an improved tree of rules that represent the nonlinear mapping directly. In this way, the encoding of cellular and higher-order activities by genes is seen as directly analogous to computer programs. This analogy is of utility in biological genetics and in problems of genotype-phenotype mapping.

2.
This paper describes two types of problems related to tree shapes, as well as algorithms that can be used to solve these problems. The first problem is that of comparing the similarity of the unlabelled shapes instead of merely their degree of balance, in a manner analogous to that routinely used to compare topologies for labelled trees. There are possible practical applications for this comparison, such as determining, based on tree shape similarity alone, whether the taxa in two phylogenies are likely to have a correspondence (e.g. hosts and parasites with high specificity). It is shown that tree balance is insufficient for this task and that standard measures of topological difference (Robinson–Foulds distances, SPR distances or retention indices of the matrices representing the trees, MRPs) can be easily adapted to the problem. The second type of problem is to determine whether taxa of uncertain matching unique to two different phylogenies could correspond to each other (e.g. the same species in larvae and adults of metamorphic animals, fossils known from different body parts). This second problem can be solved by either relabelling taxa in such a way that the number of consensus nodes is maximized, or relabelling taxa in such a way that the sum of the number of steps in the MRP of each tree mapped onto the other is minimum.

3.
In phylogenetics, a central problem is to infer the evolutionary relationships between a set of species X; these relationships are often depicted via a phylogenetic tree (a tree whose leaves are labeled bijectively by elements of X and that has no degree-2 nodes) called the “species tree.” One common approach for reconstructing a species tree consists in first constructing several phylogenetic trees from primary data (e.g., DNA sequences originating from some species in X), and then constructing a single phylogenetic tree maximizing the “concordance” with the input trees. The obtained tree is our estimation of the species tree and, when the input trees are defined on overlapping (but not identical) sets of labels, is called a “supertree.” In this paper, we focus on two problems that are central when combining phylogenetic trees into a supertree: the compatibility and the strict compatibility problems for unrooted phylogenetic trees. These problems are strongly related, respectively, to the notions of “containing as a minor” and “containing as a topological minor” in the graph community. Both problems are known to be fixed-parameter tractable in the number of input trees k, by using their expressibility in monadic second-order logic and a reduction to graphs of bounded treewidth. Motivated by the fact that the dependency on k of these algorithms is prohibitively large, we give the first explicit dynamic programming algorithms for solving these problems, both running in time \(2^{O(k^2)} \cdot n\), where n is the total size of the input.

4.
Joost P, Methner A. Genome Biology, 2002, 3(11): research0063.1-research006316

Background

G-protein-coupled receptors (GPCRs) are the largest and most diverse family of transmembrane receptors. They respond to a wide range of stimuli, including small peptides, lipid analogs, amino-acid derivatives, and sensory stimuli such as light, taste and odor, and transmit signals to the interior of the cell through interaction with heterotrimeric G proteins. A large number of putative GPCRs have no identified natural ligand. We hypothesized that a more complete knowledge of the phylogenetic relationship of these orphan receptors to receptors with known ligands could facilitate ligand identification, as related receptors often have ligands with similar structural features.

Results

A database search excluding olfactory and gustatory receptors was used to compile a list of accession numbers and synonyms of 81 orphan and 196 human GPCRs with known ligands. Of these, 241 sequences belonging to the rhodopsin receptor-like family A were aligned and a tentative phylogenetic tree constructed by neighbor joining. This tree and local alignment tools were used to define 19 subgroups of family A small enough for more accurate maximum-likelihood analyses. The secretin receptor-like family B and metabotropic glutamate receptor-like family C were directly subjected to these methods.

Conclusions

Our trees show the overall relationship of 277 GPCRs with emphasis on orphan receptors. Support values are given for each branch. This approach may prove valuable for identification of the natural ligands of orphan receptors as their relation to receptors with known ligands becomes more evident.

5.
The problem of determining an optimal phylogenetic tree from a set of data is an example of the Steiner problem in graphs. There is no efficient algorithm for solving this problem with reasonably large data sets. In the present paper an approach is described that proves in some cases that a given tree is optimal without testing all possible trees. The method first uses a previously described heuristic algorithm to find a tree of relatively small total length. The second part of the method independently analyses subsets of sites to determine a lower bound on the length of any tree. We simultaneously attempt to reduce the total length of the tree and increase the lower bound. When these are equal it is not possible to make a shorter tree with a given data set and given criterion. An example is given where the only two possible minimal trees are found for twelve different mammalian cytochrome c sequences. The criterion of finding the smallest number of minimum base changes was used. However, there is no general method of guaranteeing that a solution will be found in all cases and in particular better methods of improving the estimate of the lower bound need to be developed.
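
The lower-bound idea can be illustrated with the simplest bound of this kind: a site that shows s distinct states needs at least s - 1 changes on any tree, so summing s - 1 over all sites bounds the total tree length from below. This is only a sketch under that assumption, not the subset-of-sites analysis the abstract describes; the function name `site_lower_bound` and the toy alignment are hypothetical.

```python
def site_lower_bound(alignment):
    """Return a simple lower bound on the length of any tree.

    Assumption (not from the paper): each column with s distinct
    states contributes at least s - 1 changes, whatever the tree.
    """
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) iterates over the columns (sites) of the alignment.
    return sum(len(set(column)) - 1 for column in zip(*alignment))


# Hypothetical toy alignment: four sequences, four sites.
toy_alignment = ["ACGT", "ACGA", "TCGA", "TCCA"]
toy_bound = site_lower_bound(toy_alignment)
```

If a heuristic search finds a tree whose length equals such a bound, no shorter tree exists for that data set and criterion; tighter bounds, as the abstract notes, require analysing sites jointly.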

6.
Comparing and computing distances between phylogenetic trees are important biological problems, especially for models where edge lengths play an important role. The geodesic distance measure between two phylogenetic trees with edge lengths is the length of the shortest path between them in the continuous tree space introduced by Billera, Holmes, and Vogtmann. This tree space provides a powerful tool for studying and comparing phylogenetic trees, both in exhibiting a natural distance measure and in providing a Euclidean-like structure for solving optimization problems on trees. An important open problem is to find a polynomial-time algorithm for finding geodesics in tree space. This paper gives such an algorithm, which starts with a simple initial path and moves through a series of successively shorter paths until the geodesic is attained.

7.
The amino acid sequences of 369 human nonolfactory G-protein-coupled receptors (GPCRs) have been aligned over the seven transmembrane (TM) domains and used to extract the nature of 30 critical residues that are thought, on the basis of the X-ray structure of bovine rhodopsin bound to retinal, to line the TM binding cavity of ground-state receptors. Interestingly, the clustering of human GPCRs from these 30 residues mirrors the recently described phylogenetic tree of full-sequence human GPCRs (Fredriksson et al., Mol Pharmacol 2003;63:1256-1272) with few exceptions. A TM cavity could be found for all investigated GPCRs, with physicochemical properties matching those of their cognate ligands. The current approach allows a very fast comparison of most human GPCRs from the focused perspective of the predicted TM cavity and makes it easy to detect key residues that drive ligand selectivity or promiscuity.

8.
Tree structures are useful for describing and analyzing biological objects and processes. Consequently, there is a need to design metrics and algorithms to compare trees. A natural comparison metric is the "Tree Edit Distance," the number of simple edit (insert/delete) operations needed to transform one tree into the other. Rooted-ordered trees, where the order between the siblings is significant, can be compared in polynomial time. Rooted-unordered trees are used to describe processes or objects where the topology, rather than the order or the identity of each node, is important. For example, in immunology, rooted-unordered trees describe the process of immunoglobulin (antibody) gene diversification in the germinal center over time. Comparing such trees has been proven to be a difficult computational problem that belongs to the set of NP-Complete problems. Comparing two trees can be viewed as a search problem in graphs. A* is a search algorithm that explores the search space in an efficient order. Using a good lower bound estimation of the degree of difference between the two trees, A* can reduce search time dramatically. We have designed and implemented a variant of the A* search algorithm suitable for calculating tree edit distance. We show here that A* is able to perform an edit distance measurement in reasonable time for trees with dozens of nodes.

9.
The problem of determining the minimal phylogenetic tree is discussed in relation to graph theory. It is shown that this problem is an example of the Steiner problem in graphs, which is to connect a set of points by a minimal length network where new points can be added. There is no reported method of solving realistically sized Steiner problems in reasonable computing time. A heuristic method of approaching the phylogenetic problem is presented, together with a worked example with 7 mammalian cytochrome c sequences. It is shown in this case that the method develops a phylogenetic tree that has the smallest possible number of amino acid replacements. The potential and limitations of the method are discussed. It is stressed that objective methods must be used for comparing different trees. In particular it should be determined how close a given tree is to a mathematically determined lower bound. A theorem is proved which is used to establish a lower bound on the length of any tree; if a tree is found with a length equal to the lower bound, then no shorter tree can exist.

10.
Böcker and Dress (Adv Math 138:105–125, 1998) presented a 1-to-1 correspondence between symbolically dated rooted trees and symbolic ultrametrics. We consider the corresponding problem for unrooted trees. More precisely, given a tree T with leaf set X and a proper vertex coloring of its interior vertices, we can map every triple of distinct leaves to the color of its median vertex. We characterize all ternary maps that can be obtained in this way in terms of 4- and 5-point conditions, and we show that the corresponding tree and its coloring can be reconstructed from a ternary map that satisfies those conditions. Further, we give an additional condition that characterizes whether the tree is binary, and we describe an algorithm that reconstructs general trees in a bottom-up fashion.
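
The forward direction of this construction (tree plus coloring to ternary map) is easy to sketch: the median of three distinct leaves is the unique vertex lying on all three pairwise paths. The code below is an illustrative sketch only; the adjacency-dictionary representation, the function names, and the toy tree and colors are assumptions, and it does not attempt the reconstruction algorithm the abstract describes.

```python
from collections import deque


def tree_path(adj, src, dst):
    """Return the list of vertices on the unique src-dst path in a tree."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    # Walk back from dst to src through the recorded parents.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def median_vertex(adj, x, y, z):
    """Return the single vertex shared by the three pairwise paths."""
    common = (set(tree_path(adj, x, y))
              & set(tree_path(adj, y, z))
              & set(tree_path(adj, x, z)))
    (median,) = common  # exactly one vertex for distinct leaves of a tree
    return median


# Hypothetical toy tree: leaves a, b, c, d; interior vertices u, v.
toy_adj = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
    "u": ["a", "b", "v"], "v": ["c", "d", "u"],
}
toy_colors = {"u": "red", "v": "blue"}  # proper coloring: u and v differ


def ternary_map(x, y, z):
    """Map a triple of distinct leaves to the color of its median."""
    return toy_colors[median_vertex(toy_adj, x, y, z)]
```

Calling `ternary_map("a", "b", "c")` looks up the color of `u`, while `ternary_map("a", "c", "d")` looks up the color of `v`.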

11.
Deterministic and stochastic class-structured population models were used to simulate the life cycle of Avicennia bicolor on the Pacific coast of Costa Rica. The models were based on an extensive data set collected during a 6-year period in a 0.52 ha plot of monospecific A. bicolor. This data set included density, growth, mortality and transition rates of seedlings, saplings and trees of eight different diameter classes, as well as propagule production for the reproductive tree classes. Model simulations carried out over a 100-year period indicated a stable size class structure of the forest. Sensitivity analysis showed a significantly greater sensitivity of the model population to simulated changes in the mortality of seedlings, in comparison with the mortality of saplings and trees. An increase of 1% in the mortality of seedlings, for example, was sufficient to cause significant changes in the density of individual size classes. In contrast, neither a 10% increase in the mortality of saplings and trees nor a 20% decrease in the propagule production of fecund trees significantly affected the overall forest structure.

12.
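Based on survey data from a 1 hm2 plot of old-growth primary forest in the Bawangling Nature Reserve on Hainan Island, this study analyzed the composition, height structure, diameter-class structure, and related tree species diversity of the tropical montane rain forest community. The results show that the tropical montane rain forest at Bawangling is relatively rich in tree species and has relatively high species diversity indices. Both the number of tree species and the tree density decline with increasing height class and diameter class, following a negative exponential or negative power function. The number of tree species in different height classes, diameter classes, and quadrat patches is significantly and positively correlated with tree density. After the tropical montane rain forest reaches the old-growth climax community through natural succession, only a few individuals of a small number of tree species finally enter the main canopy layer.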

13.
Evolution of the nuclear receptor gene superfamily (cited 54 times: 6 self-citations, 48 by others)
V Laudet, C Hänni, J Coll, F Catzeflis, D Stéhelin. The EMBO Journal, 1992, 11(3): 1003-1013

14.
The field of plant molecular systematics is expanding rapidly, and with it new and refined methods are coming into use. This paper reviews recent advances in experimental methods and data analysis, as applied to the chloroplast genome. Restriction site mapping of the chloroplast genome has been used widely, but is limited in the range of taxonomic levels to which it can be applied. The upper limits (i.e., greatest divergence) of its application are being explored by mapping of the chloroplast inverted repeat region, where rates of nucleotide substitution are low. The lower limits of divergence amenable to restriction site study are being examined using restriction enzymes with 4-base recognition sites to analyze polymerase chain reaction (PCR)-amplified portions of the chloroplast genome that evolve rapidly. The comparison of DNA sequences is the area of molecular systematics in which the greatest advances are being made. PCR and methods for direct sequencing of PCR products have resulted in a mushrooming of sequence data. In theory, any degree of divergence is amenable to comparative sequencing studies. In practice, plant systematists have focused on two slowly evolving sequences (rbcL and rRNA genes). More rapidly evolving DNA sequences, including rapidly changing chloroplast genes, chloroplast introns, and intergenic spacers, and the noncoding portions of the nuclear ribosomal RNA repeat, also are being investigated for comparative purposes. The relative advantages and disadvantages of comparative restriction site mapping and DNA sequencing are reviewed. For both methods, the analysis of resulting data requires sufficient taxon and character sampling to achieve the best possible estimate of phylogenetic relationships. Parsimony analysis is particularly sensitive to the issue of taxon sampling because long branches tend to attract one another on a tree. However, data sets with many taxa present serious computational difficulties that may result in the inability to achieve maximum parsimony or to find all shortest trees.

15.

Background

In the absence of horizontal gene transfer it is possible to reconstruct the history of gene families from empirically determined orthology relations, which are equivalent to event-labeled gene trees. Knowledge of the event labels considerably simplifies the problem of reconciling a gene tree T with a species tree S, relative to the reconciliation problem without prior knowledge of the event types. It is well-known that optimal reconciliations in the unlabeled case may violate time-consistency and thus are not biologically feasible. Here we investigate the mathematical structure of the event-labeled reconciliation problem with horizontal transfer.

Results

We investigate the issue of time-consistency for the event-labeled version of the reconciliation problem, provide a convenient axiomatic framework, and derive a complete characterization of time-consistent reconciliations. This characterization depends on certain weak conditions on the event-labeled gene trees that reflect conditions under which evolutionary events are observable at least in principle. We give an \(\mathcal {O}(|V(T)|\log (|V(S)|))\)-time algorithm to decide whether a time-consistent reconciliation map exists. It does not require the construction of explicit timing maps, but relies entirely on the comparably easy task of checking whether a small auxiliary graph is acyclic. The algorithms are implemented in C++ using the boost graph library and are freely available at https://github.com/Nojgaard/tc-recon.
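
The final step mentioned above, checking whether a directed graph is acyclic, can be sketched with Kahn's topological-sort algorithm. This is an illustrative Python sketch, not the authors' C++ implementation, and it does not construct the auxiliary graph itself, whose definition is not given in the abstract; the function name and the integer node numbering are assumptions.

```python
from collections import deque


def is_acyclic(num_nodes, edges):
    """Return True if the directed graph on nodes 0..num_nodes-1 has no cycle.

    Kahn's algorithm: repeatedly remove nodes of in-degree zero; the
    graph is acyclic exactly when every node can be removed this way.
    """
    indegree = [0] * num_nodes
    successors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    queue = deque(node for node in range(num_nodes) if indegree[node] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == num_nodes
```

The check runs in time linear in the number of nodes and edges, which is why an acyclicity test on a small auxiliary graph is cheap compared with building explicit timing maps.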

Significance

The combinatorial characterization of time consistency and thus biologically feasible reconciliation is an important step towards the inference of gene family histories with horizontal transfer from orthology data, i.e., without presupposed gene and species trees. The fast algorithm to decide time consistency is useful in a broader context because it constitutes an attractive component for all tools that address tree reconciliation problems.

16.
Screening phage-displayed combinatorial peptide libraries (cited 3 times: 0 self-citations, 3 by others)
Among the many techniques available to investigators interested in mapping protein-protein interactions is phage display. With a modest amount of effort, time, and cost, one can select peptide ligands to a wide array of targets from phage-display combinatorial peptide libraries. In this article, protocols and examples are provided to guide scientists who wish to identify peptide ligands to their favorite proteins.

17.

Background

Horizontal gene transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them rely either on approximate, non-phylogenetic methods or on tree reconciliation, which is computationally intensive and sensitive to parameter values.

Results

We investigate the locus tree inference problem as a possible alternative that combines the advantages of both approaches. We present several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification.

Conclusions

Our approach allows for faster comparisons of gene and species trees and improves known algorithms for duplication inference in the presence of polytomies in the species trees. We have implemented our algorithms in a software tool available at https://github.com/mciach/LocusTreeInference.

18.
There are two main classes of natural killer (NK) cell receptors in mammals, the killer cell immunoglobulin-like receptors (KIR) and the structurally unrelated killer cell lectin-like receptors (KLR). While KIR represent the most diverse group of NK receptors in all primates studied to date, including humans, apes, and Old and New World monkeys, KLR represent the functional equivalent in rodents. Here, we report a first digression from this rule in lemurs, where the KLR (CD94/NKG2) rather than KIR constitute the most diverse group of NK cell receptors. We demonstrate that natural selection contributed to such diversification in lemurs and particularly targeted KLR residues interacting with the peptide presented by MHC class I ligands. We further show that lemurs lack a strict ortholog or functional equivalent of MHC-E, the ligands of non-polymorphic KLR in “higher” primates. Our data support the existence of a hitherto unknown system of polymorphic and diverse NK cell receptors in primates and of combinatorial diversity as a novel mechanism to increase NK cell receptor repertoire.

19.
The complex three-dimensional shapes of tree-like structures in biology are constrained by optimization principles, but the actual costs being minimized can be difficult to discern. We show that despite quite variable morphologies and functions, bifurcations in the scleractinian coral Madracis and in many different mammalian neuron types tend to be planar. We prove that in fact bifurcations embedded in a spatial tree that minimizes wiring cost should lie on planes. This biologically motivated generalization of the classical mathematical theory of Euclidean Steiner trees is compatible with many different assumptions about the type of cost function. Since the geometric proof does not require any correlation between consecutive planes, we predict that, in an environment without directional biases, consecutive planes would be oriented independently of each other. We confirm this is true for many branching corals and neuron types. We conclude that planar bifurcations are characteristic of wiring cost optimization in any type of biological spatial tree structure.

20.

Background

For a combination of reasons (including data generation protocols, approaches to taxon and gene sampling, and gene birth and loss), estimated gene trees are often incomplete, meaning that they do not contain all of the species of interest. As incomplete gene trees can impact downstream analyses, accurate completion of gene trees is desirable.

Results

We introduce the Optimal Tree Completion problem, a general optimization problem that involves completing an unrooted binary tree (i.e., adding missing leaves) so as to minimize its distance from a reference tree on a superset of the leaves. We present OCTAL, an algorithm that finds an optimal solution to this problem when the distance between trees is defined using the Robinson–Foulds (RF) distance, and we prove that OCTAL runs in \(O(n^2)\) time, where n is the total number of species. We report on a simulation study in which gene trees can differ from the species tree due to incomplete lineage sorting, and estimated gene trees are completed using OCTAL with a reference tree based on a species tree estimated from the multi-locus dataset. OCTAL produces completed gene trees that are closer to the true gene trees than an existing heuristic approach in ASTRAL-II, but the accuracy of a completed gene tree computed by OCTAL depends on how topologically similar the reference tree (typically an estimated species tree) is to the true gene tree.
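
The objective that OCTAL minimizes, the Robinson–Foulds distance, counts the nontrivial bipartitions (splits) of the leaf set that appear in exactly one of two trees. The sketch below computes that distance for two trees on the same leaf set; it is not the OCTAL completion algorithm. Representing a tree as nested tuples of leaf-name strings, and the function names, are assumptions made for illustration.

```python
def leaf_set(tree):
    """Return the frozenset of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def nontrivial_splits(tree):
    """Return the set of nontrivial bipartitions induced by internal edges.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if reference in below else below
        # Skip trivial splits (one side empty or a single leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: splits present in exactly one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have the same leaf set")
    return len(nontrivial_splits(tree_a) ^ nontrivial_splits(tree_b))


# Hypothetical five-leaf example trees.
tree_one = (("a", "b"), "c", ("d", "e"))
tree_two = (("a", "c"), "b", ("d", "e"))
example_distance = rf_distance(tree_one, tree_two)
```

Because the leaf sets must match, an incomplete gene tree cannot be compared with the reference tree directly, which is the gap that tree completion is meant to fill.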

Conclusions

OCTAL is a useful technique for adding missing taxa to incomplete gene trees and provides good accuracy under a wide range of model conditions. However, results show that OCTAL’s accuracy can be reduced when incomplete lineage sorting is high, as the reference tree can be far from the true gene tree. Hence, this study suggests that OCTAL would benefit from using other types of reference trees instead of species trees when there are large topological distances between true gene trees and species trees.
