Similar articles
 20 similar articles found
1.
Anonymous probes from the genome of Halobacterium salinarium GRB and 12 gene probes were hybridized to the cosmid clones representing the chromosome and plasmids of Halobacterium salinarium GRB and Haloferax volcanii DS2. The order of and pairwise distances between 35 loci uniquely cross-hybridizing to both chromosomes were analyzed in a search for conservation. No conservation between the genomes could be detected at the 15-kbp resolution used in this study. We found distinct sets of low-copy-number repeated sequences in the chromosome and plasmids of Halobacterium salinarium GRB, indicating some degree of partitioning between these replicons. We propose alternative courses for the evolution of the haloarchaeal genome: (i) that the majority of genomic differences that exist between genera came about at the inception of this group or (ii) that the differences have accumulated over the lifetime of the lineage. The strengths and limitations of investigating these models through comparative genomic studies are discussed.

2.
Various international efforts are underway to catalog the genomic similarities and variations in the human population. Some key discoveries such as inversions and transpositions within the members of the species have also been made over the years. The task of constructing a phylogeny tree of the members of the same species, given this knowledge and data, is an important problem. In this context, a key observation is that the "distance" between two members, or member and ancestor, within the species is small. In this paper, we pose the tree reconstruction problem exploiting some of these peculiarities. The central idea of the paper is based on the notion of minimal consensus PQ tree T of sequences. We use a modified PQ structure (termed oPQ) and show that both the number and size of each T is O(1). We further show that the tree reconstruction problem is statistically well-defined (Theorem 7) and give a simple scheme to construct the phylogeny tree and the common ancestors. Our preliminary experiments with simulated data look very promising.

3.
Computing genomic distances between whole genomes is a fundamental problem in comparative genomics. Recent research has produced several genomic distance definitions, for example, number of breakpoints, number of common intervals, number of conserved intervals, and Maximum Adjacency Disruption number. Unfortunately, it turns out that, in the presence of duplications, most of these problems are NP-hard, and hence several heuristics have recently been proposed. However, while it is relatively easy to compare heuristics with one another, until now very little has been known about their absolute accuracy. Therefore, there is a great need for algorithmic approaches that compute exact solutions for these genomic distances. In this paper, we present a novel generic pseudo-boolean approach for computing the exact genomic distance between two whole genomes in the presence of duplications, and put strong emphasis on common intervals under the maximum matching model. Of particular importance, we show three heuristics which provide very good results on a well-known public dataset of gamma-Proteobacteria.
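
To make one of the distance notions named in this abstract concrete, here is a minimal sketch that counts common intervals of two gene orders by brute force. It assumes both genomes are permutations of the same genes with no duplications, so it does not touch the NP-hard duplicated case the paper addresses. The function name and the example gene orders are hypothetical.

```python
def count_common_intervals(a, b):
    """Count gene sets of size >= 2 that are contiguous in both a and b.

    Assumes a and b are permutations of the same genes (no duplicates).
    Brute force over all windows of a: O(n^2) time.
    """
    pos_in_b = {gene: i for i, gene in enumerate(b)}
    count = 0
    for i in range(len(a)):
        lo = hi = pos_in_b[a[i]]
        for j in range(i + 1, len(a)):
            p = pos_in_b[a[j]]
            lo, hi = min(lo, p), max(hi, p)
            # The window a[i..j] is contiguous in b exactly when its
            # positions in b span as many slots as it has genes.
            if hi - lo == j - i:
                count += 1
    return count


if __name__ == "__main__":
    print(count_common_intervals([1, 2, 3, 4], [2, 1, 4, 3]))
```

A higher count means the two gene orders share more conserved blocks, so it reads as a similarity rather than a distance.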

4.
5.
Genome rearrangements have been studied in 30 gamma-proteobacterial complete genomes by comparing the order of a reduced set of genes on the chromosome. This set included those genes fulfilling several characteristics, the main ones being that an ortholog was present in every genome and that none of them had been acquired by horizontal gene transfer. Genome rearrangement distances were estimated based on either the number of breakpoints or the minimal number of inversions separating two genomes. Breakpoint and inversion distances were highly correlated, indicating that inversions were the main type of rearrangement event in gamma-Proteobacteria. In general, the progressive increase in sequence-based distances between genome pairs was associated with the increase in their rearrangement-based distances, but several groups of distances did not follow this pattern. Compared with free-living enteric bacteria, the lineages of Pasteurellaceae were evolving, on average, at relatively higher rates of between 2.02 and 1.64, while the endosymbiotic bacterial lineages of Buchnera aphidicola and Wigglesworthia glossinidia were evolving at moderately higher rates of 1.38 and 1.35, respectively. Because we know that the rearrangement rate in the Bu. aphidicola lineage was close to zero during the last 100-150 Myr of evolution, we deduced that a much higher rate took place in the first period of lineage evolution after the divergence of the Escherichia coli lineage. On the other hand, the lineage of the endosymbiont Blochmannia floridanus showed a rate almost identical to that of free-living enteric bacteria, indicating that the increase in the genome rearrangement rate is not a general change associated with bacterial endosymbiosis. Phylogenetic reconstruction based on rearrangement distances showed a different topology from the one inferred by sequence information. This topology broke the proposed monophyly of the three endosymbiotic lineages and placed Bl. floridanus as a closer relative to E. coli than Yersinia pestis. These results indicate that the phylogeny of these insect endosymbionts is still an open question that will require the development of specific phylogenetic methods to confirm whether the sisterhood of the three endosymbiotic lineages is real or a consequence of a long-branch attraction phenomenon.
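
The breakpoint distance mentioned in this abstract can be sketched in a few lines. This version assumes unsigned, linear gene orders over the same set of orthologs, each present once; the study's exact conventions (signed genes, circular chromosomes) are not stated here, so treat those choices and the function name as assumptions.

```python
def breakpoint_distance(a, b):
    """Count adjacencies of gene order a that are not adjacencies of b.

    Assumes unsigned, linear gene orders over the same genes, with
    artificial start/end caps so that changes at the ends are counted.
    """
    start, end = object(), object()  # sentinel caps shared by both orders
    ext_a = [start, *a, end]
    ext_b = [start, *b, end]
    adjacencies_b = {frozenset(pair) for pair in zip(ext_b, ext_b[1:])}
    return sum(
        frozenset(pair) not in adjacencies_b
        for pair in zip(ext_a, ext_a[1:])
    )


if __name__ == "__main__":
    # A single inversion of the block 2..4 creates two breakpoints.
    print(breakpoint_distance([1, 2, 3, 4, 5], [1, 4, 3, 2, 5]))
```

Because one inversion creates at most two breakpoints, a strong correlation between the two distances is what one would expect when inversions dominate.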

6.
7.
8.

Background  

A fundamental problem when trying to define the functional relationships between proteins is the difficulty in quantifying functional similarities, even when well-structured ontologies exist regarding the activity of proteins (i.e., the Gene Ontology, GO). However, functional metrics can overcome the problems in comparing and evaluating functional assignments and predictions. As a reference of proximity, previous approaches to compare GO terms considered linkage in terms of ontology weighted by a probability distribution that balances the non-uniform 'richness' of different parts of the Directed Acyclic Graph. Here, we have followed a different approach to quantify functional similarities between GO terms.

9.

Background  

Genome median and genome halving are combinatorial optimization problems that aim at reconstructing ancestral genomes as well as the evolutionary events leading from the ancestor to extant species. Exploring complexity issues is a first step towards devising efficient algorithms. The complexity of the median problem for unichromosomal genomes (permutations) has been settled for both the breakpoint distance and the reversal distance. Although the multichromosomal case has often been assumed to be a simple generalization of the unichromosomal case, it is also a relaxation so that complexity in this context does not follow from existing results, and is open for all distances.

10.
Meiotic recombination plays critical roles in the acquisition of genetic diversity and has been utilized for conventional breeding of livestock and crops. The frequency of meiotic recombination is normally low, and is extremely low in regions called "recombination cold domains". Here, we describe a new and highly efficient method to modulate yeast meiotic gene rearrangements using VDE (PI-SceI), an intein-encoded endonuclease that causes an efficient unidirectional meiotic gene conversion at its recognition sequence (VRS). We designed universal targeting vectors with which strains carrying the VRS at a desired site can be obtained. Meiotic induction of the strains provided unidirectional gene conversions and frequent genetic rearrangements of flanking genes with little impact on cell viability. This system thus opens the way for the designed modulation of meiotic gene rearrangements, regardless of the recombinational activity of chromosomal domains. Finally, the VDE-VRS system enabled us to conduct meiosis-specific conditional knockout of genes where VDE-initiated gene conversion disrupts the target gene during meiosis, serving as a novel approach to examine the functions of genes during germination of the resultant spores.

11.
Comparative genomic hybridization (CGH) is a modern genetic method which enables a genome-wide survey of chromosomal imbalances. For each chromosome region, one obtains the information whether there is a loss or gain of genetic material, or whether there is no change at that region. Usually it is not possible to evaluate all 46 chromosomes of a metaphase; therefore several (up to 20 or more) metaphases are analyzed per individual, and the results are expressed as an average. Mostly one does not study one individual alone but groups of 20-30 individuals. Therefore, large amounts of data quickly accumulate which must be put into a logical order. In this paper we present the application of a self-organizing map (Genecluster) as a tool for cluster analysis of data from pT2N0 prostate cancer cases studied by CGH. Self-organizing maps are artificial neural networks with the capability to form clusters on the basis of an unsupervised learning rule, i.e., in our examples the network receives the CGH data as its only information (no clinical data). We studied a group of 40 recent cases without follow-up, an older group of 20 cases with follow-up, and the data set obtained by pooling both groups. In all groups good clusterings were found in the sense that clinically similar cases were placed into the same clusters on the basis of the genetic information only. The data indicate that losses on chromosome arms 6q, 8p and 13q are all frequent in pT2N0 prostatic cancer, but the loss on 8p probably has the greatest prognostic importance.

12.
Suppose that a family of rooted phylogenetic trees T_i with different sets X_i of leaves is given. A supertree for the family is a single rooted tree T whose leaf set is the union of all the X_i, such that the branching information in T corresponds to the branching information in all the trees T_i. This paper proposes a polynomial-time method BUILD-WITH-DISTANCES that makes essential use of distance information provided by the trees T_i to construct a rooted tree S_0. When a supertree also containing the distance information exists, then S_0 is a supertree. The supertree S_0 often shows increased resolution over the trees found by methods that utilize only the topology of the input trees. When no supertree exists because the input trees are incompatible, several variants of the method are described which still produce trees with provable properties.

13.
Initial BRCA1 and BRCA2 analyses conducted in breast and ovarian cancer families focused on identifying mutations in the coding sequences and splicing sites of the genes. Large genomic rearrangements as well as mutations in promoter or untranslated regions have been missed by standard detection strategies. Nevertheless, in Western countries, a detailed study of families with strong linkage to BRCA1 identified large genomic deletions and rearrangements in this gene as early as 1997. To date, no such gene alteration has been described in Central and Eastern European populations. In our study of BRCA1/2 genes in the Czech population, we have detected a complex genomic rearrangement in BRCA1 using RNA-based analysis for mutation screening. This rearrangement involves exons 21 and 22 and results in a protein product lacking the BRCT domain, which is important for its function.

14.
Comparison of genomic maps is hampered by errors and ambiguities introduced by mapping technology, incorrectly resolved paralogy, small samples of markers, and extensive genome rearrangement. We design an analysis to remove or resolve most of these problems and to extract corrected data where markers occur in consecutive strips in both genomes. To do this, we introduce the notion of prestrip, an efficient way of generating these and a compatibility analysis culminating in a maximum weighted clique (MWC) search. The output can be directly analyzed with genome rearrangement algorithms, allowing the restoration of some of the data not incorporated into the clique solution. We investigate the trade-off between criteria for discarding excessive prestrips to make MWC feasible in terms of retaining as many markers as possible in the solution and producing an economical rearrangement analysis. We explore these questions through simulation and through comparison of the rice and sorghum genomes.

15.
A peak is a pair of real values (x,y), where x is the time when a peak of height y is registered. In the peak alignment problem, we are given two sequences of peaks, and our task is to align the sequences allowing some basic edit operations on the peaks. We study an instance of the peak alignment problem that arises in the analysis of Mass Spectrometry data in Systems Biology. There the measurement technique guarantees that two peaks (x,y), (x',y') can only be considered the same if x is close enough to x', and y is close enough to y'. We review some methods to do alignment under such restrictions on matches.
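
One simple way to align under this match restriction is a longest-common-subsequence style dynamic program in which two peaks may be paired only if both coordinates fall within a tolerance. This is a sketch under that assumption, not necessarily one of the methods the paper reviews. The tolerances `tol_x` and `tol_y`, the function name, and the scoring (maximize the number of matched peaks, with unmatched peaks treated as insertions or deletions) are all hypothetical choices.

```python
def align_peaks(p, q, tol_x, tol_y):
    """Return index pairs (i, j) of an order-preserving alignment of p and q
    that matches as many peaks as possible; peaks are (x, y) tuples."""

    def can_match(a, b):
        return abs(a[0] - b[0]) <= tol_x and abs(a[1] - b[1]) <= tol_y

    n, m = len(p), len(q)
    # best[i][j] = max number of matches between p[:i] and q[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            if can_match(p[i - 1], q[j - 1]):
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + 1)

    # Trace back from the corner to recover one optimal set of matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if can_match(p[i - 1], q[j - 1]) and best[i][j] == best[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i - 1][j] >= best[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


if __name__ == "__main__":
    run_a = [(1.0, 10.0), (2.0, 5.0), (3.0, 8.0)]
    run_b = [(1.05, 10.2), (3.02, 7.9)]
    print(align_peaks(run_a, run_b, tol_x=0.1, tol_y=0.5))
```

The table costs O(nm) time and space. Because only peaks with nearby x values can match, a banded version of the table would be a natural refinement.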

16.
Hundreds of mutants with defects in a variety of physiologically important functions, such as photosynthesis, respiration, flagellar motility, phototaxis, circadian rhythms and the cell cycle, have been isolated from cultures of Chlamydomonas reinhardtii. In only a few cases have the genes responsible for these mutations been cloned and sequenced. The development of efficient methods for transformation with nuclear genes [7] has allowed the recent demonstration of gene isolation through genomic complementation with a pooled library of C. reinhardtii DNA [9]. To improve the efficiency with which genes complementing a particular mutation can be isolated, we have established an indexed (ordered) cosmid library of 11,280 individual clones contained in the separate wells of 120 microtiter plates. The average insert size is ca. 38 kb. PCR analysis of five sequenced nuclear genes present in the Chlamydomonas library revealed a range from two copies for the 2 and 2 tubulin genes to at least seven copies for the argininosuccinate lyase gene. Overall, these five clones were represented an average of ≥3.4 times in the library. Thus, the probability that any one particular nuclear gene of < 1000 bp will be found in the library is ≥97%, and the probability that a gene of ca. 10 000 bp will be found in the library is ca. 92%. Rapid screening methods with cosmid DNAs pooled from individual microtiter dishes have been applied successfully. Bacteria containing clones of the argininosuccinate lyase gene have been identified through genomic complementation of a Chlamydomonas mutant bearing an inactive argininosuccinate lyase gene. We are using the nomenclature of indexed library versus ordered library to avoid confusion of this library with a library of ordered contigs.
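
One way to see how an average representation of roughly 3.4 copies relates to the quoted probabilities is a Poisson model of library coverage. This is an assumption for illustration and not necessarily the authors' calculation. In particular, scaling the mean copy number by the fraction of insert start positions that still contain a whole gene is a hypothetical simplification, and both function names are invented here.

```python
import math


def prob_in_library(mean_copies):
    """P(at least one clone contains the locus), assuming the number of
    clones covering it is Poisson-distributed with the given mean."""
    return 1.0 - math.exp(-mean_copies)


def copies_for_whole_gene(point_copies, insert_kb, gene_kb):
    """Scale the mean copy number of a point locus down for a gene that must
    fit entirely inside one insert (hypothetical simplification)."""
    usable = max(insert_kb - gene_kb, 0.0)
    return point_copies * usable / insert_kb


if __name__ == "__main__":
    short_gene = prob_in_library(3.4)
    long_gene = prob_in_library(copies_for_whole_gene(3.4, 38.0, 10.0))
    print(f"short gene: {short_gene:.3f}, 10 kb gene: {long_gene:.3f}")
```

Under these assumptions the model gives about 0.97 for a short gene and about 0.92 for a 10 kb gene, in line with the figures quoted in the abstract.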

17.
In this paper, we study the problem of computing the similarity of two protein structures by measuring their contact-map overlap. Contact-map overlap abstracts the problem of computing the similarity of two polygonal chains as a graph-theoretic problem. In R3, we present the first polynomial time algorithm with any guarantee on the approximation ratio for the 3-dimensional problem. More precisely, we give an algorithm for the contact-map overlap problem with an approximation ratio of sigma where sigma = min{sigma(P1), sigma(P2)} 0, is hard.
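
To illustrate the objective being approximated, the sketch below evaluates the contact-map overlap of one given alignment: the number of contacts in the first structure whose two endpoints are both aligned and map onto a contact in the second. It does not implement the paper's approximation algorithm, which searches over alignments. The function name and the data layout (contacts as index pairs, the alignment as a dict) are assumptions.

```python
def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that the alignment maps onto contacts of B.

    contacts_a, contacts_b: iterables of (i, j) residue-index pairs.
    alignment: dict from residue index in A to residue index in B; it must
    be order-preserving (increasing in A implies increasing in B).
    """
    items = sorted(alignment.items())
    if any(b1 >= b2 for (_, b1), (_, b2) in zip(items, items[1:])):
        raise ValueError("alignment must be order-preserving and one-to-one")

    edges_b = {frozenset(edge) for edge in contacts_b}
    shared = 0
    for u, v in contacts_a:
        if u in alignment and v in alignment:
            if frozenset((alignment[u], alignment[v])) in edges_b:
                shared += 1
    return shared


if __name__ == "__main__":
    a = [(0, 2), (1, 3), (0, 3)]
    b = [(0, 1), (1, 2)]
    print(contact_map_overlap(a, b, {0: 0, 2: 1, 3: 2}))
```

Finding the alignment that maximizes this count is the optimization problem the abstract refers to; the sketch only scores a candidate.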

18.
We have previously shown that a deletion of approximately 3 kilobases in the unique glycophorin C (GPC) gene, which encodes the human erythrocyte glycophorins C and D, is associated with the Gerbich (Ge) blood group deficiency (Ge-2,-3 and Ge-2,+3 types) (Le van Kim, C., Colin, Y., Blanchard, D., Dahr, W., London, J. & Cartron, J.P. (1987) Eur. J. Biochem. 165, 571-579). We have now isolated and characterized the structure of the GPC gene from the common Ge+2,+3 donors and from a Ge-2,-3 variant (Ge-2,-3 gene). The GPC gene is organized in four exons distributed over 13.5 kilobase pairs (kbp) of DNA and contains two directly repeated domains of 3.4 kbp in length which are likely derived from the recent duplication of a unique ancestral domain. Restriction mapping and sequence analysis indicate that a 3.4-kbp deletion within this gene, arising probably by unequal crossing over between the two repeated domains, is responsible for the formation of the Ge-2,-3 gene. The breakpoints of the deletion are located within introns 2 and 3, and therefore exon 3 is removed. The defective gene is transcribed as an mRNA with a continuous open reading frame extending over 300 nucleotides which is translated into an unusual sialoglycoprotein present on Ge-2,-3 red cells. The primary structure of this new glycoprotein has been deduced from nucleotide sequencing. It is proposed, in addition, that another 3.4-kbp deletion within the GPC gene eliminates exon 2 only by a similar mechanism and generates a defective gene encoding the abnormal glycoprotein present on Ge-2,+3 erythrocytes. Interestingly, the same deletion which led to the rare Ge-2,-3 genetic condition occurred spontaneously and frequently in the cloned GPC gene during the propagation of the recombinant phages in Escherichia coli. From these observations we suggest that the Ge-2,-3 and Ge-2,+3 genes might represent the two allelic forms of a unique ancestral form of the GPC gene, following successive internal duplication and deletion events.

19.
Gene prioritization through genomic data fusion   (cited 4 times: 0 self-citations, 4 by others)
The identification of genes involved in health and disease remains a challenge. We describe a bioinformatics approach, together with a freely accessible, interactive and flexible software tool termed Endeavour, to prioritize candidate genes underlying biological processes or diseases, based on their similarity to known genes involved in these phenomena. Unlike previous approaches, ours generates distinct prioritizations for multiple heterogeneous data sources, which are then integrated, or fused, into a global ranking using order statistics. In addition, it offers the flexibility of including additional data sources. Validation of our approach revealed it was able to efficiently prioritize 627 genes in disease data sets and 76 genes in biological pathway sets, identify candidates of 16 mono- or polygenic diseases, and discover regulatory genes of myeloid differentiation. Furthermore, the approach identified a novel gene involved in craniofacial development from a 2-Mb chromosomal region, deleted in some patients with DiGeorge-like birth defects. The approach described here offers an alternative integrative method for gene discovery.
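
The abstract does not spell out how the per-source rankings are fused, so the following is only a sketch of one standard order-statistics construction. For a candidate with rank ratios r_1 <= ... <= r_N (rank divided by the number of candidates, one per data source), it computes Q, the probability that N independent uniform values would all rank at least this well. The recursion, the function name, and the example values are assumptions, not a confirmed description of Endeavour.

```python
from math import factorial


def order_statistic_q(rank_ratios):
    """Joint CDF of uniform order statistics evaluated at the sorted ratios.

    Uses V_0 = 1, V_k = sum_{i=1..k} (-1)^(i-1) * V_(k-i) / i! * r_(N-k+1)^i
    and returns Q = N! * V_N. Smaller Q means a more consistently high rank.
    """
    r = sorted(rank_ratios)
    n = len(r)
    v = [1.0]  # v[k] holds V_k
    for k in range(1, n + 1):
        rk = r[n - k]  # r_(N-k+1) in 1-based notation
        v.append(sum(
            (-1) ** (i - 1) * v[k - i] / factorial(i) * rk ** i
            for i in range(1, k + 1)
        ))
    return factorial(n) * v[n]


def fuse_rankings(ratios_by_gene):
    """Order genes by ascending Q; input maps gene name -> list of ratios."""
    return sorted(ratios_by_gene, key=lambda g: order_statistic_q(ratios_by_gene[g]))


if __name__ == "__main__":
    print(order_statistic_q([0.2, 0.5]))
    print(fuse_rankings({"geneA": [0.9, 0.8], "geneB": [0.1, 0.2]}))
```

For two sources the recursion reduces to Q = 2*r_1*r_2 - r_1^2, which can be checked by hand. The direct recursion costs O(N^2) per gene, which is fine for a handful of data sources.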

20.