Similar articles
 Found 20 similar articles (search time: 175 ms)
1.
Efficient enumeration of phylogenetically informative substrings.   Cited by: 1 (self-citations: 0; citations by others: 1)
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore conserved) substrings that are shared between all mammals and not found in non-mammals. Such a collection of substrings may be used to identify conserved subsequences or to construct sets of identifying substrings for branches of a phylogenetic tree. For two disjoint sets of genomes on a phylogenetic tree, a substring is called a tag if it is found in all of the genomes of one set and none of the genomes of the other set. We present a near-linear time algorithm that finds all tags in a given phylogeny, and a sublinear space algorithm (at the expense of running time) that is better suited to very large data sets. Under a stochastic model of evolution, we show that a simple process of tag generation essentially captures all possible ways of generating tags. We use this insight to develop a faster tag discovery algorithm with a small chance of error. However, since tags are not guaranteed to exist in a given data set, we generalize the notion of a tag from a single substring to a set of substrings. We present a linear programming-based approach for finding approximate generalized tag sets. Finally, we use our tag enumeration algorithm to analyze a phylogeny containing 57 whole microbial genomes. We find tags for all nodes in the phylogeny except the root, for which we find generalized tag sets.
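
The tag definition in the abstract above can be illustrated with a small brute-force sketch. This is not the near-linear algorithm the authors describe; it only restates the definition in code. The fixed substring length `k`, the function names `kmers` and `find_tags`, and the toy sequences are all assumptions made for illustration.

```python
from typing import Iterable, Set


def kmers(genome: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of a genome."""
    return {genome[i:i + k] for i in range(len(genome) - k + 1)}


def find_tags(ingroup: Iterable[str], outgroup: Iterable[str], k: int) -> Set[str]:
    """Length-k substrings found in every ingroup genome and in no outgroup genome.

    Brute force: memory grows with the total number of distinct k-mers,
    so this is only suitable for toy inputs.
    """
    ingroup = list(ingroup)
    if not ingroup:
        return set()
    # Substrings shared by all genomes of the first set.
    shared = set.intersection(*(kmers(g, k) for g in ingroup))
    # Discard anything that also occurs in a genome of the second set.
    for g in outgroup:
        shared -= kmers(g, k)
    return shared


# Hypothetical toy "genomes", not real data.
ingroup_genomes = ["ACGTTGCA", "TTACGTTG", "GACGTTAA"]
outgroup_genomes = ["ACGAACGA", "TTGCATTG"]
tags = find_tags(ingroup_genomes, outgroup_genomes, 4)
print(sorted(tags))  # ['ACGT', 'CGTT']
```

Enumerating tags of every length, as the abstract describes, would need a more careful data structure than a set of fixed-length substrings; the sketch fixes `k` only to keep the example short.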

2.
The wood structure of 71 species representing 24 genera of the pantropical Lecythidaceae s.l., including the edible Brazil nuts (Bertholletia excelsa) and the spectacular cannon-ball tree (Couroupita guianensis), was investigated using light and scanning electron microscopy. This study focused on finding phylogenetically informative characters to help elucidate any obscure evolutionary patterns within the family. The earliest diverging subfamily Napoleonaeoideae has mixed simple/scalariform vessel perforations, scalariform vessel-ray pitting, and high multiseriate rays, all features that are also present in Scytopetaloideae. The wood structure of Napoleonaea is distinct, but its supposed close relative Crateranthus strongly resembles Scytopetaloideae. The isolated position of Foetidia (Foetidioideae) can be supported by a unique type of vessel-ray pitting that is similar in shape and size to intervessel pitting (distinctly bordered, <5 μm). The more derived Planchonioideae and Lecythidoideae share exclusively simple perforations and two types of vessel-ray pitting, but they can easily be distinguished from each other by the size of intervessel pitting, shape of body ray cells in multiseriate rays, and the type of crystalliferous axial parenchyma cells. The anatomical diversity observed is clearly correlated with differences in plant size (shrubs vs. tall trees): the percentage of scalariform perforations, as well as vessel density, and the length of vessel elements, fibers, and multiseriate rays are negatively correlated with increasing plant size, while the reverse is true for vessel diameter.

3.
Fascioles are important early-forming structures that play a key role in allowing irregular echinoids to burrow. They have traditionally been grouped into a small number of types according to their general position on the test, but this masks some significant differences that exist. The precise course that fasciole bands follow over the test plating has been mapped in detail for 89 species of spatangoid echinoids, representing the great majority of fasciole-bearing genera both living and fossil. Within each fasciole type, discrete and conserved patterns can be distinguished, differing both in which plates they are initiated on, and on whether they cross plate growth centres or are late-stage bands positioned towards the edge of the plate. Fasciole position is most highly conserved in the anterior and lateral interambulacral plates and on the earliest forming bands. The existence of different subanal fasciole patterns in the Micrasteridae and Brissidae suggests that these may have evolved independently. Schizasterid and hemiasterine spatangoids can each be subdivided into two major clades, and brissid spatangoids into three clades based on detailed patterns of their fascioles. Plotting fasciole pathways over test architecture provides a rich new source of phylogenetically informative characters.  © 2005 The Linnean Society of London, Zoological Journal of the Linnean Society, 2005, 144, 15−35.

4.
Belzile F, Lassner MW, Tong Y, Khush R, Yoder JI. Genetics, 1989, 123(1): 181-189
The transmission of transposed Ac elements in progeny derived by self-pollination of ten transformed tomato plants has been examined by Southern hybridization analysis. We show that six of these primary transformants have transmitted a transposed Ac to at least one progeny. One of the families was segregating for at least two different insertion events. In five of ten families, progeny were detected that contained a transposed Ac but no donor T-DNA sequences, indicating that a recombination event occurred between the original and new Ac insertion site. Somatic transposition of Ac as late as the R2 generation is evidenced. One family contained an empty donor site fragment but Ac was not detected in either the parent or progeny, indicating Ac was lost in this population early in regeneration. While four of ten families were segregating for aberrant phenotypes, there was no evidence that the mutated gene was linked to a transposed Ac.

5.

Background

Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.

Results

In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.

Conclusions

SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.

6.
7.
8.
Summary: Trichomes of Tremandra R.Br. ex DC., Platytheca Steetz and Tetratheca Sm. (Elaeocarpaceae, former Tremandraceae), together with two outgroup species of Elaeocarpus L., are illustrated using scanning electron microscopy, and their distribution on various plant organs is documented. Various trichome types were identified that relate taxa: simple hairs, stellate hairs, short glandular trichomes, long glandular trichomes, and three forms of tubercules. Both outgroup and ingroup taxa have simple hairs. Stellate hairs are confirmed as unique to Tremandra. Prominent and sculptured multicelled tubercules, some bearing a stout hair, are characteristic of Platytheca. Smaller multicelled tubercules occur in both Platytheca and Tetratheca, except for the Western Australian taxon Te. filiformis Benth. (possibly plesiomorphic). Unicellular tubercules (papilla) characterise two species of Tetratheca. Short glandular trichomes, usually found on the ovary, also occur in both of these genera but not in all species (possibly secondary losses), while long glandular trichomes, usually on stems and leaves, occur only in some groups of Tetratheca. Within Tetratheca, Western Australian taxa that have five-merous flowers fall into three ‘groups’: seven species (together with one from South Australia) that have short glandular trichomes but no long glandular trichomes; six species that have long glandular trichomes but no short glandular trichomes; and four species that have both trichome types. All other species of Tetratheca have four-merous flowers and form two ‘groups’: twelve eastern species (including one from South Australia) that have both short glandular trichomes and long glandular trichomes; and four western species and six eastern species that lack short glandular trichomes. On the basis of these characters, a phylogenetic hypothesis for the three genera is presented.

9.
Single nucleotide polymorphisms (SNPs) are abundant in genomes of all species and represent informative DNA markers extensively used to analyze phylogenetic relationships between strains. Medium- to high-throughput, open methodologies able to test many SNPs in a minimum time are therefore in great need. By using the versatile Luminex® xTAG technology, we developed an efficient multiplexed SNP genotyping assay to score 13 phylogenetically informative SNPs within the genome of Bacillus anthracis. The Multiplex Oligonucleotide Ligation-PCR procedure (MOL-PCR) described by Deshpande et al. (2010) has been modified and adapted for simultaneous interrogation of 13 biallelic canonical SNPs in a 13-plex assay. Changes made to the originally published method include the design of allele-specific dual-priming-oligonucleotides (DPOs) as competing detection probes (MOLigo probes) and the use of an asymmetric PCR reaction for signal amplification and labeling of ligation products carrying SNP targets. These innovations significantly reduce the cross-reactivity observed when initial MOLigo probes were used and enhance hybridization efficiency onto the microsphere array, respectively. When evaluated on 73 representative samples, the 13-plex assay yielded unambiguous SNP calls and lineage affiliation. The assay limit of detection was determined to be 2 ng of genomic DNA. The reproducibility, robustness and ease of use of the present method were validated by small-scale proficiency testing performed between four European laboratories. While cost-effective compared to other singleplex methods, the present MOL-PCR method offers a high degree of flexibility and scalability. It can easily accommodate newly identified SNPs to increase the resolving power of the canSNP typing of B. anthracis.

10.
Sporadic amplification of ID elements in rodents   Cited by: 8 (self-citations: 0; citations by others: 8)
ID sequences are members of a short interspersed element (SINE) repetitive DNA family within the rodent genome. The copy number of individual ID elements varies by up to three orders of magnitude between species. This amplification has been highly sporadic in the order Rodentia and does not follow any phylogenetic trend. Using library screening and dot-blot analysis, we estimate there are 25,000 copies of ID elements in the deer mouse, 1,500 copies in the gerbil (both cricetid rodents), and 60,000 copies of either ID or ID-like elements in a sciurid rodent (squirrel). By dot-blot analysis, we estimate there are 150,000, 4,000, 1,000, and 200 copies of ID elements in the rat, mouse, hamster, and guinea pig, respectively (which is consistent with previous reports) and 200 copies in the hystricognath rodent, nutria. Therefore, a rapid amplification took place not only after the divergence of rat and mouse but also following the deer mouse (Peromyscus) and hamster split, with no evidence of increased amplifications in hystricognath rodents. No notable variations of sequences from the BC1 genes of several myomorphic rodents were observed that would possibly explain the varied levels of ID amplification. We did observe subgenera and species-group-specific variation in the ID core sequence of the BC1 gene within the genus Peromyscus. Sequence analysis of cloned ID elements in Peromyscus shows most ID elements in this genus arose prior to Peromyscus subgenus divergence. Correspondence of the consensus sequence of individual ID elements in gerbil and deer mouse further confirms BC1 as a master gene in ID amplification.
Several possible mechanisms responsible for the quantitative variations are explored. The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers: U33850, U33851, U33852 (BC1 sequences); and U33853, U33854, U33855, U33856, U33857, U33858, U33859, U33860, U33861, U33862, U33863, U33864, U33865, U33866, U33867 (ID sequences). Correspondence to: D.H. Kass

11.
Within- and between-species variability was examined in a noncoding 238-bp segment of the HOX2 cluster. DNA of 4-26 individuals of four species (Pongo pygmaeus, Pan troglodytes, Gorilla gorilla, and Homo sapiens) was PCR amplified and electrophoresed in a denaturing gradient gel to screen for variability. Coupled amplification and sequencing was used to determine the complete sequence for each of the different alleles identified, one each in humans and orangutans, two in chimpanzees, and four in gorillas. Maximum-parsimony methods were used to construct a gene tree for these sequences. Alleles in all four species cluster into groups consisting of only one species (i.e., alleles within a species are monophyletic). The number of base-pair differences observed among alleles within P. troglodytes and within G. gorilla is larger than the number of base-pair substitutions that phylogenetically link Pan with Homo. Given these and other published data, it is premature to accept any particular phylogenetic tree that relates these three genera through two separate speciation events.

12.
We have developed software for fully automated tracking of vibrissae (whiskers) in high-speed videos (>500 Hz) of head-fixed, behaving rodents trimmed to a single row of whiskers. Performance was assessed against a manually curated dataset consisting of 1.32 million video frames comprising 4.5 million whisker traces. The current implementation detects whiskers with a recall of 99.998% and identifies individual whiskers with 99.997% accuracy. The average processing rate for these images was 8 Mpx/s/cpu (2.6 GHz Intel Core2, 2 GB RAM). This translates to 35 processed frames per second for a 640 px×352 px video of 4 whiskers. The speed and accuracy achieved enable quantitative behavioral studies where the analysis of millions of video frames is required. We used the software to analyze the evolving whisking strategies as mice learned a whisker-based detection task over the course of 6 days (8148 trials, 25 million frames) and measure the forces at the sensory follicle that most underlie haptic perception.
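
The throughput figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below assumes that "Mpx" means 10^6 pixels and that every pixel of each frame is processed; the function name `frames_per_second` is hypothetical and is not part of the software described.

```python
def frames_per_second(pixel_rate_px_per_s: float, width_px: int, height_px: int) -> float:
    """Frames processed per second at a given pixel throughput and frame size."""
    pixels_per_frame = width_px * height_px
    return pixel_rate_px_per_s / pixels_per_frame


# 8 Mpx/s per CPU and a 640 px x 352 px frame, as quoted in the abstract.
fps = frames_per_second(8e6, 640, 352)
print(f"{fps:.1f} frames/s")  # 35.5 frames/s
```

A 640 × 352 frame holds 225,280 pixels, so the quoted pixel rate corresponds to roughly 35.5 frames per second, in line with the figure of 35 given in the abstract.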

13.
Two gene segments coding for the variable region of human immunoglobulin light chains of the kappa type (VK genes, ref. 2) were found to have unusual structures. The two genes, which are called A6 and A22, are located in duplicated gene clusters. Their restriction maps are very similar. About 4 kb of the A22 gene region were sequenced. It turned out that the intron contains an insert with the characteristics of a transposed element. The inserted DNA of 1.2 kb length contains imperfect direct and inverted repeats at its ends; at the insertion site a duplication of five nucleotides was found. Within the inserted DNA one copy each of an Alu element and of the simple sequence motif (T-G)17 were identified. These two repetitive sequences are also themselves flanked by short direct repeats. The major inserted DNA has no significant homology to published human nucleic acid sequences. The whole structure is interpreted best by assuming a sequential insertion of the three elements. The coding region of the VK gene itself has several mutations which by themselves would render it a pseudogene; we assume that the insertion event(s) occurred prior to the mutations. According to mapping and hybridization data A6 is very similar to A22.

14.
15.
16.
To automate the acquisition of images from fluorescently stained gels, the power of the excitation laser(s) must be optimized for each sample to prevent spot saturation (or to allow unimportant spots to saturate) while still retaining sensitivity. In this work, we describe the implementation and effectiveness of a pre-scan function in a robotic solution for the automation of 2D gel scanning.

17.
We transposed Dissociation (Ds) elements from three start loci on chromosome 5 in Arabidopsis (Nossen ecotype) by using a local transposition system. We determined partial genomic sequences flanking the Ds elements and mapped the elements' insertion sites in 1,173 transposed lines by comparison with the published genomic sequence. Most of the lines contained a single copy of the Ds element. One-half of the lines contained Ds on chromosome 5; in particular, insertion "hot spots" near the three start loci were clearly observed. In the other lines, the Ds elements were transposed across chromosomes. We found other insertion hot spots at the tops of chromosomes 2 and 4, near nucleolus organizer regions 2 and 4, respectively. Another characteristic feature was that the Ds elements tended to transpose near the chromosome ends and rarely transposed near centromeres. The distribution patterns differed among the three start loci, even though they possessed the same Ds construct. More than one-half of the Ds elements were inserted irregularly into the genome; that is, they did not retain the perfect inverted repeat sequence of Ds nor leave perfect target site duplications. This precise analysis of distribution patterns will contribute to a comprehensive understanding of the transposing mechanism. From these Ds insertion sites, we have constructed a database for screening gene-knockout mutants in silico. In 583 of the 1,173 lines, the Ds elements were inserted into protein-coding genes, which suggests that these lines are gene-knockout mutants. The database and individual lines will be available freely for academic use from the RIKEN Bio-Resource Center (http://www.brc.riken.go.jp/Eng/index.html).

18.
Throughout evolution, eukaryotic genomes have been invaded by transposable elements (TEs). Little is known about the factors leading to genomic proliferation of TEs, their preferred integration sites and the molecular mechanisms underlying their insertion. We analyzed hundreds of thousands of nested TEs in the human genome, i.e. insertions of TEs into existing ones. We first discovered that most TEs insert within specific ‘hotspots’ along the targeted TE. In particular, retrotransposed Alu elements contain a non-canonical single nucleotide hotspot for insertion of other Alu sequences. We next devised a method for identification of integration sequence motifs of inserted TEs that are conserved within the targeted TEs. This method revealed novel sequence motifs characterizing insertions of various important TE families: Alu, hAT, ERV1 and MaLR. Finally, we performed a global assessment to determine the extent to which young TEs tend to nest within older transposed elements and identified a 4-fold higher tendency of TEs to insert into existing TEs than to insert within non-TE intergenic regions. Our analysis demonstrates that TEs are highly biased to insert within certain TEs, in specific orientations and within specific targeted TE positions. TE nesting events also reveal new characteristics of the molecular mechanisms underlying transposition.

19.
20.

Background  

The construction of robust and well resolved phylogenetic trees is important for our understanding of many, if not all, biological processes, including speciation and origin of higher taxa, genome evolution, metabolic diversification, multicellularity, origin of life styles, pathogenicity and so on. Many older phylogenies were not well supported due to insufficient phylogenetic signal present in the single or few genes used in phylogenetic reconstructions. Importantly, single gene phylogenies were not always found to be congruent. The phylogenetic signal may, therefore, be increased by enlarging the number of genes included in phylogenetic studies. Unfortunately, concatenation of many genes does not take into consideration the evolutionary history of each individual gene. Here, we describe an approach to select informative phylogenetic proteins to be used in the Tree of Life (TOL) and barcoding projects by comparing the cophenetic correlation coefficients (CCC) among the distance matrices of individual proteins, using the fungi as an example. The method demonstrated that the quality and number of concatenated proteins are important for a reliable estimation of TOL. Approximately 40–45 concatenated proteins seem to be needed to resolve fungal TOL.
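
The idea of comparing per-protein distance matrices can be sketched as follows. This is a simplified stand-in, not the authors' procedure: it computes a plain Pearson correlation between the off-diagonal entries of two distance matrices, whereas a cophenetic correlation coefficient is normally computed against the distances implied by a clustering tree. The function names and the toy matrices are assumptions made for illustration.

```python
from math import sqrt
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def upper_triangle(matrix: Matrix) -> List[float]:
    """Flatten the entries above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def matrix_correlation(a: Matrix, b: Matrix) -> float:
    """Correlation between the pairwise distances of two proteins (same taxon order)."""
    return pearson(upper_triangle(a), upper_triangle(b))


# Hypothetical pairwise distances among four taxa for three proteins.
protein_a = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
protein_b = [[2 * d for d in row] for row in protein_a]  # same signal, faster rate
protein_c = [[0, 5, 1, 2], [5, 0, 6, 1], [1, 6, 0, 4], [2, 1, 4, 0]]

print(round(matrix_correlation(protein_a, protein_b), 6))  # 1.0
print(matrix_correlation(protein_a, protein_c))
```

Under this simplification, proteins whose distance matrices correlate strongly with most others would be candidates for concatenation, while poorly correlated ones would be flagged as carrying a conflicting signal.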
