1.
2.
3.
4.
EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference (Total citations: 1; self-citations: 0; citations by others: 1)
EFICAz (Enzyme Function Inference by Combined Approach) is an automatic engine for large-scale enzyme function inference that combines predictions from four different methods developed and optimized to achieve high prediction accuracy: (i) recognition of functionally discriminating residues (FDRs) in enzyme families obtained by a Conservation-controlled HMM Iterative procedure for Enzyme Family classification (CHIEFc), (ii) pairwise sequence comparison using a family specific Sequence Identity Threshold, (iii) recognition of FDRs in Multiple Pfam enzyme families, and (iv) recognition of multiple Prosite patterns of high specificity. For FDR (i.e. conserved positions in an enzyme family that discriminate between true and false members of the family) identification, we have developed an Evolutionary Footprinting method that uses evolutionary information from homofunctional and heterofunctional multiple sequence alignments associated with an enzyme family. The FDRs show a significant correlation with annotated active site residues. In a jackknife test, EFICAz shows high accuracy (92%) and sensitivity (82%) for predicting four EC digits in testing sequences that are <40% identical to any member of the corresponding training set. Applied to the Escherichia coli genome, EFICAz assigns more detailed enzymatic function than KEGG, and generates numerous novel predictions.
5.
Somatic variant analysis of a tumour sample and its matched normal has been widely used in cancer research to distinguish germline polymorphisms from somatic mutations. However, due to the extensive intratumour heterogeneity of cancer, sequencing data from a single tumour sample may greatly underestimate the overall mutational landscape. In recent studies, multiple spatially or temporally separated tumour samples from the same patient were sequenced to identify the regional distribution of somatic mutations and study intratumour heterogeneity. There are a number of tools to perform somatic variant calling from matched tumour-normal next-generation sequencing (NGS) data; however none of these allow joint analysis of multiple same-patient samples. We discuss the benefits and challenges of multisample somatic variant calling and present multiSNV, a software package for calling single nucleotide variants (SNVs) using NGS data from multiple same-patient samples. Instead of performing multiple pairwise analyses of a single tumour sample and a matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. By leveraging information from all available samples, multiSNV is able to detect rare mutations with variant allele frequencies down to 3% from whole-exome sequencing experiments.
6.
F. Mertens, E. Pålsson, Anders Lindstrand, Sören Toksvig-Larsen, Sakari Knuutila, Marcelo L. Larramendy, Wa'el El-Rifai, Janusz Limon, Felix Mitelman, Nils Mandahl 《Human genetics》1996,98(6):651-656
We examined, cytogenetically and by in situ hybridization (ISH) techniques, the synovia, osteophytes, and articular cartilage
from 32 patients with pronounced osteoarthritis (OA), a prevalent form of arthropathy characterized by progressive reduction
of articular cartilage, and synovial samples from 17 control patients. In short-term cultures, clonal chromosome aberrations,
in particular the gain of chromosomes 7 (+7) and 5 (+5), were found to be strongly associated with OA. These aberrations were
found in almost 90% of the cultures from synovia and osteophytes, whereas only 1/11 synovial samples from joints unequivocally
unaffected by OA had cells with +5 or +7. The in vivo nature of trisomy 7 was demonstrated by ISH on uncultured cells, and
serial passaging showed that cells with +7 had a proliferative advantage in vitro. Thus, the combined data indicate that cells
with somatic mutations appear early and may be influential in the disease process leading to OA.
Received: 7 June 1996 / Revised: 9 August 1996
7.
It has become clear that hybridization between species is much more common than previously recognized. As a result, we now know that the genomes of many modern species, including our own, are a patchwork of regions derived from past hybridization events. Increasingly, researchers are interested in disentangling which regions of the genome originated from each parental species using local ancestry inference methods. Due to the diverse effects of admixture, this interest is shared across disparate fields, from human genetics to research in ecology and evolutionary biology. However, local ancestry inference methods are sensitive to a range of biological and technical parameters which can impact accuracy. Here we present paired simulation and ancestry inference pipelines, mixnmatch and ancestryinfer, to help researchers plan and execute local ancestry inference studies. mixnmatch can simulate arbitrarily complex demographic histories in the parental and hybrid populations, selection on hybrids, and technical variables such as coverage and contamination. ancestryinfer takes as input sequencing reads from simulated or real individuals, and implements an efficient local ancestry inference pipeline. We perform a series of simulations with mixnmatch to pinpoint factors that influence accuracy in local ancestry inference and highlight useful features of the two pipelines. mixnmatch is a powerful tool for simulations of hybridization while ancestryinfer facilitates local ancestry inference on real or simulated data.
8.
Jacob L. Steenwyk, Thomas J. Buida III, Yuanning Li, Xing-Xing Shen, Antonis Rokas 《PLoS biology》2020,18(12)
Highly divergent sites in multiple sequence alignments (MSAs), which can stem from erroneous inference of homology and saturation of substitutions, are thought to negatively impact phylogenetic inference. Thus, several different trimming strategies have been developed for identifying and removing these sites prior to phylogenetic inference. However, a recent study reported that doing so can worsen inference, underscoring the need for alternative alignment trimming strategies. Here, we introduce ClipKIT, an alignment trimming software that, rather than identifying and removing putatively phylogenetically uninformative sites, instead aims to identify and retain parsimony-informative sites, which are known to be phylogenetically informative. To test the efficacy of ClipKIT, we examined the accuracy and support of phylogenies inferred from 14 different alignment trimming strategies, including those implemented in ClipKIT, across nearly 140,000 alignments from a broad sampling of evolutionary histories. Phylogenies inferred from ClipKIT-trimmed alignments are accurate, robust, and time saving. Furthermore, ClipKIT consistently outperformed other trimming methods across diverse datasets, suggesting that strategies based on identifying and retaining parsimony-informative sites provide a robust framework for alignment trimming. Highly divergent sites in multiple sequence alignments are thought to negatively impact phylogenetic inference; trimming methods aim to remove these sites, but recent analysis suggests that doing so can worsen inference. This study introduces ClipKIT, a trimming method that instead aims to retain parsimony-informative sites; phylogenetic inference using ClipKIT-trimmed alignments is accurate, robust and time-saving.
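The idea of retaining parsimony-informative sites can be illustrated with a minimal sketch. It assumes the standard textbook definition (a column is parsimony-informative when at least two character states each occur in at least two sequences) and is not ClipKIT's actual implementation; the function names and the gap characters treated as missing data are hypothetical choices made for this example.

```python
from collections import Counter


def is_parsimony_informative(column, gap_chars="-?"):
    """True if at least two character states each occur at least twice.

    Gap characters are ignored (an assumption of this sketch).
    """
    counts = Counter(c.upper() for c in column if c not in gap_chars)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def keep_parsimony_informative(alignment):
    """Trim an alignment (dict: name -> aligned sequence) to informative columns.

    Returns the trimmed alignment and the list of retained column indices.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(length)
            if is_parsimony_informative([s[i] for s in seqs])]
    trimmed = {name: "".join(alignment[name][i] for i in keep) for name in names}
    return trimmed, keep


if __name__ == "__main__":
    toy = {"a": "ACGT-A", "b": "ACGTTA", "c": "ATGATC", "d": "ATCA-C"}
    trimmed, kept = keep_parsimony_informative(toy)
    print(kept)
    print(trimmed)
```

In the toy alignment, constant columns, singleton substitutions and gap-dominated columns are dropped, while columns that split the sequences into two groups of at least two are retained.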
9.
Ellegren H 《Trends in genetics : TIG》2000,16(12):165-558
Microsatellite DNA sequences mutate at rates several orders of magnitude higher than that of the bulk of DNA. Such high rates mean that spontaneous mutations that form new-length variants can realistically be seen in pedigree analysis. Data on observed mutation events from various organisms are now accumulating, allowing inferences on DNA sequence evolution to be made through an unusually direct approach. Here I discuss and integrate microsatellite mutation data in an evolutionary context. A striking feature of the mutation process is that it seems highly heterogeneous, with distinct differences between species, repeat types, loci and alleles. Age and sex also affect the mutation rate. Within genomes at equilibrium, the microsatellite-length distribution is a delicate balance between biased mutation processes and point mutations acting towards the decay of repetitive DNA. Indeed, simple repeats do not evolve simply.
10.
11.
MOTIVATION: Sequencing of a bi-allelic PCR product that contains an allele with a deletion/insertion mutation results in a superimposed trace file following the site of this shift mutation. A trace file of this type hampers the use of current computer programs for base calling. ShiftDetector analyses a sequencing trace file in order to discover whether it is a superimposed sequence of two molecules that differ by a shift mutation of 1 to 25 bases. The program calculates a probability score for the existence of such a shift and reconstructs the sequence of the original molecule. AVAILABILITY: ShiftDetector is available from http://cowry.agri.huji.ac.il
12.
13.
Bud-wood from seven rose cultivars exhibiting five different colors was exposed to 0, 3, 4, and 5 krad of gamma rays. A similar response was observed for all exposed cultivars; it included dose-response reductions in bud-take, number and height of shoots, survival, flowers, petal weight and pollen fertility. The LD50 for white- and mauve-flowered cultivars was found to be lower than that for the yellow-, red- and pink-flowered ones; the latter were more prone to mutations. Many phenotypically detectable variations in leaf, flower and growth habit were recorded in irradiated populations. Only three mutations, one in growth habit and two in flower color, were successfully isolated and propagated. The results suggest that the Floribunda rose, i.e., Pink Parfait, was more suitable for induction of mutations than the six Hybrid Teas.
14.
The explosive growth of genomic data provides an opportunity to make increased use of protein markers for phylogenetic inference.
We have developed an automated pipeline for phylogenomic analysis (AMPHORA) that overcomes the existing bottlenecks limiting
large-scale protein phylogenetic inference. We demonstrated its high throughput capabilities and high quality results by constructing
a genome tree of 578 bacterial species and by assigning phylotypes to 18,607 protein markers identified in metagenomic data
collected from the Sargasso Sea.
15.
16.
Konishi H, Lauring J, Garay JP, Karakas B, Abukhdeir AM, Gustin JP, Konishi Y, Park BH 《Nature protocols》2007,2(11):2865-2874
Here, we describe a method of systematic PCR screening with multiround sample pooling for the isolation of rare PCR-positive samples. As an example, we have applied this protocol to the recovery of gene-targeted clones in human somatic cells comprising only 0.02-0.17% of cells transduced with targeting vectors. Initially, cells infected with targeting vectors are seeded and grown in fourteen 96-well tissue culture plates. Samples are then collected from these plates and subjected to two rounds of pooling to yield twelve 'superpools' used for an initial PCR. After identifying PCR-positive samples, de-pooling is carried out with successive rounds of PCR screening, using samples of decreasing complexity. Single-cell cloning is subsequently performed to isolate gene-targeted clones. The entire protocol can be completed in 4-8 weeks depending on the proliferative capacity of the cell line.
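The pooling and de-pooling logic described above can be sketched as a hierarchical search. The plate count (fourteen), plate size (96 wells) and number of superpools (twelve) come from the abstract; everything else is an assumption made for illustration: wells are split evenly and contiguously into pools, each de-pooling round again splits a positive pool into up to twelve sub-pools, and a pool's PCR is treated as positive whenever it contains at least one positive well. The function names are hypothetical and this is not the published protocol's exact pooling layout.

```python
def split(items, n_groups):
    """Split items into up to n_groups contiguous, nearly equal, non-empty groups."""
    size, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        if end > start:
            groups.append(items[start:end])
        start = end
    return groups


def screen(pool, positives, fanout, counter):
    """Run one simulated PCR on a pool, then de-pool recursively if positive."""
    counter[0] += 1  # one PCR reaction for this pool
    if not any(well in positives for well in pool):
        return []
    if len(pool) == 1:
        return list(pool)
    hits = []
    for sub_pool in split(pool, fanout):
        hits.extend(screen(sub_pool, positives, fanout, counter))
    return hits


def run_screen(positives, n_plates=14, wells_per_plate=96,
               n_superpools=12, fanout=12):
    """Return (positive wells found, number of PCR reactions used)."""
    wells = [(p, w) for p in range(n_plates) for w in range(wells_per_plate)]
    counter = [0]
    found = []
    for superpool in split(wells, n_superpools):
        found.extend(screen(superpool, positives, fanout, counter))
    return found, counter[0]


if __name__ == "__main__":
    found, reactions = run_screen({(3, 40)})
    print(found, reactions, "of", 14 * 96, "wells")
```

Under these assumptions each superpool covers 112 of the 1,344 wells, and a single positive well is located with far fewer PCR reactions than testing every well individually, which is the point of screening samples of decreasing complexity.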
17.
18.
Emelyanova M. A., Amossenko F. A., Semyanikhina A. V., Aliev V. A., Barsukov Yu. A., Lyubchenko L. N., Nasedkina T. V. 《Molecular Biology》2015,49(4):550-559
Molecular Biology - Somatic mutations of KRAS, PIK3CA, and BRAF cause insensitivity of colorectal tumors to therapy with anti-EGFR monoclonal antibodies, necessitating genetic testing prior to...
19.