Similar literature
 20 similar articles found; search took 31 ms
1.
Oligo kernels for biological sequence classification have a high discriminative power. A new parameterization for the K-mer oligo kernel is presented, where all oligomers of length K are weighted individually. The task-specific choice of these parameters increases the classification performance and reveals information about discriminative features. The covariance matrix adaptation evolution strategy is proposed for adapting the multiple kernel parameters based on cross-validation. It is applied to optimize the trimer oligo kernels for the detection of bacterial gene starts. The resulting kernels lead to higher classification rates, and the adapted parameters reveal the importance of particular triplets for classification, for example of those occurring in the Shine-Dalgarno sequence.
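
The idea of weighting each oligomer individually can be illustrated with a minimal sketch. It assumes a simplified, position-independent (spectrum-style) kernel rather than the position-aware oligo kernel of the abstract; the function names, the default weight of 1.0, and the example weights are all hypothetical.

```python
from collections import Counter

def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def weighted_kmer_kernel(a, b, weights, k=3):
    """Sum over shared k-mers of weight * count_in_a * count_in_b.

    Simplification: oligomer positions are ignored, unlike the oligo
    kernel described in the abstract. k-mers missing from `weights`
    get weight 1.0 (an assumption made for illustration).
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(weights.get(m, 1.0) * n * cb[m] for m, n in ca.items() if m in cb)

# Hypothetical weights emphasizing two triplets found in the motif AGGAGG.
weights = {"AGG": 3.0, "GGA": 2.0}
k_val = weighted_kmer_kernel("TTAGGAGGT", "CAGGAAC", weights)
```

In the setting of the abstract, the entries of `weights` would be the parameters tuned by the evolution strategy against cross-validation performance; here they are fixed by hand.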

2.
Integer lattices are important theoretical landscapes for studying the consequences of dispersal and spatial population structure, but convenient dispersal kernels able to represent important features of dispersal in nature have been lacking for lattices. Because leptokurtic (centrally peaked and long-tailed) kernels are common in nature and have important effects in models, of particular interest are families of dispersal kernels in which the degree of leptokurtosis can be varied parametrically. Here we develop families of kernels on integer lattices with several important properties. The degree of leptokurtosis can be varied parametrically from near 0 (the Gaussian value) to infinity. These kernels are all asymptotically radially symmetric. (Exact radial symmetry is impossible on lattices except in one dimension.) They have separate parameters for shape and scale, and their lower order moments and Fourier transforms are given by simple formulae. In most cases, the kernel families that we develop are closed under convolution so that multiple steps of a kernel remain within the same family. Included in these families are kernels with asymptotic power function tails, which have provided good fits to some observations from nature. These kernel families are constructed by randomizing convolutions of stepping-stone kernels and have interpretations in terms of population heterogeneity and heterogeneous physical processes.

3.
Marginalized kernels for biological sequences   (cited by 1 in total: 0 self-citations, 1 by others)
MOTIVATION: Kernel methods such as support vector machines require a kernel function between objects to be defined a priori. Several studies have derived kernels from probability distributions, e.g., the Fisher kernel. However, a general methodology to design a kernel is not fully developed. RESULTS: We propose a reasonable way of designing a kernel when objects are generated from latent variable models (e.g., HMM). First of all, a joint kernel is designed for complete data which include both visible and hidden variables. Then a marginalized kernel for visible data is obtained by taking the expectation with respect to hidden variables. We will show that the Fisher kernel is a special case of marginalized kernels, which gives another viewpoint to the Fisher kernel theory. Although our approach can be applied to any object, we particularly derive several marginalized kernels useful for biological sequences (e.g., DNA and proteins). The effectiveness of marginalized kernels is illustrated in the task of classifying bacterial gyrase subunit B (gyrB) amino acid sequences.
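
The two-step construction in the RESULTS part can be written compactly. The notation below is an assumption, since the abstract names no symbols: x and x' are visible data, h and h' are hidden variables, z = (x, h) is the complete data, and K_z is the joint kernel.

```latex
% Marginalized kernel: expectation of the joint kernel K_z over the
% posterior distributions of the hidden variables h and h'.
K(x, x') \;=\; \sum_{h} \sum_{h'} p(h \mid x)\, p(h' \mid x')\,
               K_z\bigl(z, z'\bigr),
\qquad z = (x, h), \quad z' = (x', h').
```

For continuous hidden variables the sums would become integrals; the posteriors p(h | x) come from the latent variable model (for example, an HMM).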

4.
Using time-domain correlation techniques, the first- and second-order Wiener kernels have been calculated for the system mediating the human visual evoked response. The first-order kernels indicate the linear element is a resonant one, with a natural frequency near 20 Hz, and a memory of approximately 250 ms. The transport delay associated with this element is approximately 56 ms. The second-order kernels indicate a quadratic nonlinear element with a memory less than 20 ms. The analytic form of this element can be approximated by a parabola shifted to the right of the origin. A close correspondence between the spectrum of the first-order kernel and the spectrum of the main diagonal of the second-order kernel suggests the nonlinear element precedes the linear one. Tests of reproducibility on the first-order kernel and the main diagonal of the second-order kernel suggest they are reliable describing functions for the system mediating the human visual evoked response.

5.
A functional expansion was used to model the relationship between a Gaussian white noise stimulus current and the resulting action potential output in the single sensory neuron of the cockroach femoral tactile spine. A new precise procedure was used to measure the kernels of the functional expansion. Very similar kernel estimates were obtained from separate sections of the data produced by the same neuron with the same input noise power level, although some small time-varying effects were detectable in moving through the data. Similar kernel estimates were measured using different input noise power levels for a given cell, or when comparing different cells under similar stimulus conditions. The kernels were used to identify a model for sensory encoding in the neuron, comprising a cascade of dynamic linear, static nonlinear, and dynamic linear elements. Only a single slice of the estimated experimental second-order kernel was used in identifying the cascade model. However, the complete second-order kernel of the cascade model closely resembled the estimated experimental kernel. Moreover, the model could closely predict the experimental action potential train obtained with novel white noise inputs.

6.
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
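
Combining data sources as a weighted sum of kernels can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper name `combine_kernels`, the toy Gram matrices, and the weight values are hypothetical, and the evolutionary search over weights is not shown.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Return the weighted sum of precomputed kernel (Gram) matrices."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        # Non-negative weights keep the sum positive semidefinite.
        raise ValueError("weights must be non-negative")
    return sum(wi * K for wi, K in zip(w, kernels))

# Two toy 2x2 Gram matrices standing in for two data sources (made-up values).
K_seq = np.array([[1.0, 0.2], [0.2, 1.0]])
K_expr = np.array([[1.0, 0.8], [0.8, 1.0]])

K_equal = combine_kernels([K_seq, K_expr], [0.5, 0.5])  # equal-weight baseline
K_tuned = combine_kernels([K_seq, K_expr], [0.1, 0.9])  # hand-picked weights
```

An optimizer such as the evolutionary algorithm mentioned in the abstract would propose weight vectors, train a classifier on the combined matrix, and score each candidate by an AUC-based criterion.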

7.
A L Boyer. Radiation Research, 1988, 113(2):235-242
Dose-spread kernels can be used to calculate the dose distribution in a photon beam by convolving the kernel with the primary fluence distribution. The theoretical relationships between various types and components of dose-spread kernels relative to photon attenuation coefficients are explored. These relations can be valuable as checks on the conservation of energy by dose-spread kernels calculated by analytic or Monte Carlo methods.
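
The convolution step can be sketched in one dimension. This is only an illustration under simplifying assumptions: real dose calculations are three-dimensional, and the fluence profile and kernel values below are made up. The final line shows the kind of energy bookkeeping the abstract alludes to.

```python
import numpy as np

def dose_from_fluence(fluence, kernel):
    """Convolve a primary fluence profile with a dose-spread kernel (1-D)."""
    return np.convolve(fluence, kernel, mode="same")

# Hypothetical flat beam with zero fluence outside the field edges.
fluence = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
# Hypothetical symmetric kernel; it sums to 1, so it redistributes energy
# without creating or destroying any.
kernel = np.array([0.25, 0.5, 0.25])

dose = dose_from_fluence(fluence, kernel)

# Conservation check: total dose equals total fluence times the kernel sum
# (valid here because the beam does not touch the array boundary).
conserved = np.isclose(dose.sum(), fluence.sum() * kernel.sum())
```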

8.
We introduce novel profile-based string kernels for use with support vector machines (SVMs) for the problems of protein classification and remote homology detection. These kernels use probabilistic profiles, such as those produced by the PSI-BLAST algorithm, to define position-dependent mutation neighborhoods along protein sequences for inexact matching of k-length subsequences ("k-mers") in the data. By use of an efficient data structure, the kernels are fast to compute once the profiles have been obtained. For example, the time needed to run PSI-BLAST in order to build the profiles is significantly longer than both the kernel computation time and the SVM training time. We present remote homology detection experiments based on the SCOP database where we show that profile-based string kernels used with SVM classifiers strongly outperform all recently presented supervised SVM methods. We further examine how to incorporate predicted secondary structure information into the profile kernel to obtain a small but significant performance improvement. We also show how we can use the learned SVM classifier to extract "discriminative sequence motifs"--short regions of the original profile that contribute almost all the weight of the SVM classification score--and show that these discriminative motifs correspond to meaningful structural features in the protein data. The use of PSI-BLAST profiles can be seen as a semi-supervised learning technique, since PSI-BLAST leverages unlabeled data from a large sequence database to build more informative profiles. Recently presented "cluster kernels" give general semi-supervised methods for improving SVM protein classification performance. We show that our profile kernel results also outperform cluster kernels while providing much better scalability to large datasets.

9.
10.
The ionome, or elemental profile, of a maize kernel can be viewed in at least two distinct ways. First, the collection of elements within the kernel is food and feed for people and animals. Second, the ionome of the kernel represents a developmental end point that can summarize the life history of a plant, combining genetic programs and environmental interactions. We assert that single-kernel-based phenotyping of the ionome is an effective method of analysis, as it represents a reasonable compromise between precision, efficiency, and power. Here, we evaluate potential pitfalls of this sampling strategy using several field-grown maize sample sets. We demonstrate that there is enough genetically determined diversity in accumulation of many of the elements assayed to overcome potential artifacts. Further, we demonstrate that environmental signals are detectable through their influence on the kernel ionome. We conclude that using single kernels as the sampling unit is a valid approach for understanding genetic and environmental effects on the maize kernel ionome.

11.
We study how the speed of spread for an integrodifference equation depends on the dispersal pattern of individuals. When the dispersal kernel has finite variance, the central limit theorem states that convolutions of the kernel with itself will approach a suitably chosen Gaussian distribution. Despite this fact, the speed of spread cannot be obtained from the Gaussian approximation. We give several examples and explanations for this fact. We then use the kurtosis of the kernel to derive an improved approximation that shows a very good fit to all the kernels tested. We apply the theory to one well-studied data set of dispersal of Drosophila pseudoobscura and to two one-parameter families of theoretical dispersal kernels. In particular, we find kernels that, despite having compact support, have a faster speed of spread than the Gaussian kernel.
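
Why a Gaussian with matching variance can misjudge the speed is easy to check numerically. The sketch below assumes the standard linear spreading-speed formula for integrodifference equations, c = min over s > 0 of (1/s) ln(R0 M(s)), with M the moment generating function of the kernel; the abstract does not state this formula, and the growth rate R0, the variance, and the choice of a Laplace kernel for comparison are all illustrative assumptions.

```python
import numpy as np

def spread_speed(mgf, r0, s_grid):
    """Minimize (1/s) * ln(r0 * M(s)) over a grid of s > 0."""
    return np.min(np.log(r0 * mgf(s_grid)) / s_grid)

sigma2, r0 = 1.0, 2.0                      # hypothetical variance and growth rate
s = np.linspace(0.01, 5.0, 50001)

# Gaussian kernel: M(s) = exp(sigma^2 s^2 / 2).
gauss_mgf = lambda t: np.exp(0.5 * sigma2 * t**2)

# Laplace kernel with the same variance: M(s) = 1 / (1 - b^2 s^2), variance 2 b^2.
b = np.sqrt(sigma2 / 2.0)
s_lap = s[s < 0.999 / b]                   # the MGF only exists for s < 1/b
laplace_mgf = lambda t: 1.0 / (1.0 - (b * t) ** 2)

c_gauss = spread_speed(gauss_mgf, r0, s)
c_laplace = spread_speed(laplace_mgf, r0, s_lap)
```

For the Gaussian kernel the minimum has the closed form sqrt(2 sigma^2 ln R0). The Laplace kernel has the same variance but heavier tails and yields a larger speed, which is the kind of discrepancy a kurtosis-based correction is meant to capture.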

12.
Wheat plants (Triticum aestivum L. 'Lyallpur'), limited to a single culm, were grown at day/night temperatures of either 18/13 °C (moderate temperature), or 27/22 °C (chronic high temperature) from the time of anthesis. Plants were either non-droughted or subjected to two post-anthesis water stresses by withholding water from plants grown in different volumes of potting mix. In selected plants the demand for assimilates by the ear was reduced by removal of all but the five central spikelets. In non-droughted plants, it was confirmed that shading following anthesis (source limitation) reduced kernel dry weight at maturity, with a compensating increase in the dry weight of the remaining kernels when the total number of kernels was reduced (small sink). Reducing kernel number did not alter the effect of high temperature following anthesis on the dry weight of the remaining kernels at maturity, but reducing the number of kernels did result in a greater dry weight of the remaining kernels of droughted plants. However, the relationship between the response to drought and kernel number was confounded by a reduction in the extent of water stress associated with kernel removal. Data on the effect of water stress on kernel dry weight at maturity of plants with either the full complement or reduced numbers of kernels, and subjected to low and high temperatures following anthesis, indicate that the effect of drought on kernel dry weight may be reduced, in both absolute and relative terms, rather than enhanced, at high temperature. It is suggested that where high temperature and drought occur concurrently after anthesis there may be a degree of drought escape associated with chronic high temperature due to the reduction in the duration of kernel filling, even though the rate of water use may be enhanced by high temperature.

13.

Background

Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of biochemical reactions, in which the product (output) from one reaction serves as the substrate (input) to another reaction. Many pathways remain incompletely characterized. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This leads to error accumulation when the pathways are predicted from incorrectly annotated genes. Pairwise classification methods are supervised learning methods used to classify new pairs of entities. Some of these classification methods, e.g., Pairwise Support Vector Machines (SVMs), use pairwise kernels. Pairwise kernels describe similarity measures between two pairs of entities. Using pairwise kernels to handle sequence data requires long processing times and large storage. Rational kernels are kernels based on weighted finite-state transducers that represent similarity measures between sequences or automata. They have been effectively used in problems that handle large amounts of sequence information such as protein essentiality, natural language processing and machine translation.

Results

We create a new family of pairwise kernels using weighted finite-state transducers (called Pairwise Rational Kernel (PRK)) to predict metabolic pathways from a variety of biological data. PRKs take advantage of the simpler representations and faster algorithms of transducers. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We then developed several experiments with PRKs and Pairwise SVM to validate our methods using the metabolic network of Saccharomyces cerevisiae. As a result, when PRKs are used, our method executes faster in comparison with other pairwise kernels. Also, when we use PRKs combined with other simple kernels that include evolutionary information, the accuracy values have been improved, while maintaining lower construction and execution times.

Conclusions

The power of using kernels is that almost any sort of data can be represented using kernels. Therefore, completely disparate types of data can be combined to add power to kernel-based machine learning methods. When we compared our proposal using PRKs with other similar kernels, the execution times were decreased, with no compromise of accuracy. We also showed that by combining PRKs with other kernels that include evolutionary information, the accuracy can also be improved. As our proposal can use any type of sequence data, genes do not need to be properly annotated, avoiding the accumulation of errors caused by incorrect previous annotations.

14.
Glossy Black‐Cockatoos (Calyptorhynchus lathami) appear to maximize dry matter intake by selecting feeding trees that have the most profitable seed crops as indicated by Clout’s Index (seed weight/total cone weight). However, as it is unlikely that cockatoos can directly assess Clout’s Index, the mechanism for such selection is unclear. Moreover, as cockatoos consume only the kernels, and not all seeds contain kernels, better estimates of food value are required. Therefore, we examine seed and cone characteristics and establish that Seed Fill (percentage of seeds containing kernels) and Kernel Ratio (average kernel weight/average cone weight) contribute significantly to Food Value (weight of kernels/total cone weight). We propose that these factors can be rapidly assessed by cockatoos, and show that selection of feeding trees can be more accurately predicted using discriminant analysis with a combination of Seed Fill and Kernel Ratio than with either Food Value or Clout’s Index alone. Along with most other characteristics, Seed Fill and Kernel Ratio were consistent between the two years of study, enabling foraging cockatoos to return to profitable trees annually, without sampling. Where sampling is undertaken, rapid assessment of profitability by sampling cone ends is possible, as kernels are randomly distributed through the cone. Also, a decline in Food Value, Seed Fill and Kernel Ratio with cone age, means that cockatoos could also assess profitability on the basis of cone colour. We show that concentrations of individual nutrients are unlikely to contribute to tree selection, previous reports of such selection being caused by the predominance of protein and oils in the kernels, and of ash, fibre and carbohydrates in the samara. We therefore conclude that cockatoos select feeding trees primarily on the basis of optimizing kernel intake.

15.
Huang L, Massa L, Karle J. Biochemistry, 2005, 44(50):16747-16752
The kernel energy method (KEM) has been used in three recent papers (1-3) to calculate the quantum mechanical ab initio molecular energy of peptides and the protein insulin. It was found to have good accuracy. The computational difficulty of representing a molecule increases only modestly with the number of atoms. The calculations are simplified by adopting the approximation that a full biological molecule can be represented by smaller "kernels" of atoms. In this paper, the accuracy of the KEM is tested in the application to DNA, whose basic kernels, chemical bonding, and overall molecular structure are quite different from peptides and proteins. The basic kernel in the case of peptides and proteins is an amino acid. The basic kernel in the case of DNA is a nucleotide consisting of a phosphate-sugar-base. The molecular energy is calculated for all three basic types of DNA, i.e., B, A, and Z configurations of DNA. The results give an accuracy that is comparable to that achieved with peptides and proteins. Thus, the KEM is found to be applicable to major types of biological molecules.

16.
We have developed a method for detecting a transgene and its protein product in maize endosperm that allows the kernel to be germinated after analysis. This technique could be highly useful for several monocots and dicots. Our method involves first sampling the endosperm with a hand-held rotary grinder so that the embryo is preserved and capable of germination. This tissue is then serially extracted, first with SDS-PAGE sample buffer to extract proteins, then with an aqueous buffer to extract DNA. The product of the transgene can be detected in the first extract by SDS-PAGE with visualization by total protein staining or immuno-blot detection. The second extract can be purified and used as template DNA in PCR reactions to detect the transgene. This method is particularly useful for screening transgenic kernels in breeding experiments and testing for gene silencing in kernels.

17.
Temperature stress during kernel development affects maize (Zea mays L.) grain growth and yield stability. Maize kernels (hybrid A619 x W64A) were cultured in vitro at 3 d after pollination and either maintained at 25 °C or transferred to 35 °C for 4 or 8 d, then returned to 25 °C until physiological maturity. Kernel fresh and dry matter accumulation was severely disrupted by the long-term heat stress (8 d at 35 °C) and did not recover when transferred back to 25 °C, resulting in abortion of 97% of the kernels. Kernels exposed to 35 °C for 4 d (short-term heat stress) exhibited a recovery in kernel growth and water content at about 18 d after pollination and kernel abortion was reduced to about 23%. During the cell division phase, abscisic acid (ABA) levels showed a steady decline in the control but maintained a moderate level in the heat-stressed kernels. However, later in development heat-stressed kernels had significantly higher levels of ABA than the control. Cytokinin analysis confirmed a peak in zeatin riboside and zeatin levels in control kernels at 10 to 12 d after pollination. In contrast, kernels subjected to 4 d of heat stress had no detectable levels of zeatin and the zeatin riboside peak was reduced by 70% and delayed until 18 d after pollination. The long-term heat-stressed kernels showed low to nondetectable levels of either zeatin riboside or zeatin. Regression analysis of ABA level against cytokinin level during the endosperm cell division phase revealed a highly significant negative correlation in nonstressed kernels but no correlation in kernels exposed to short-term or long-term heat stress. Application of benzyladenine to heat-stressed, growth-chamber-grown plants increased thermotolerance in part by reducing kernel abortion at the tip and middle positions on the ear. These results confirm that a shift in the hormone balance of kernels is one mechanism by which heat stress disrupts maize kernel development. The maintenance of high levels of cytokinins in the kernels during heat stress appears to be important in increasing thermotolerance and providing yield stability of maize.

18.
Growth and development of plants are known to be affected by exposure to red and blue light. Mechanisms by which light quality influences gene expression in maize (Zea mays L.) embryos have not been explored. Maize kernels can be cultured in vitro allowing experimental manipulation of environmental factors during seed development. We used the in vitro kernel culture system to investigate the response of developing maize seeds, which normally develop without exposure to light, to controlled light quality. Kernels grown under red light accumulated more dry weight than those grown in darkness, whereas kernels grown under blue light accumulated less. Reciprocal color shift experiments showed that light quality during the first week in culture had more influence on kernel weight than during the subsequent three weeks in culture. Soluble sugars were higher in both light treatments than in darkness. Blue-grown kernels had higher amino acid and lower lipid levels than red- or dark-grown kernels. Embryo morphology was markedly affected by red light, under which the upper shoot axis was longer than under blue light or in darkness. Embryo morphology was influenced by light quality during the later stages of development rather than the first week. We suggest, based on these results, that gene expression in the embryo and endosperm of developing maize seeds is sensitive to light quality, and the mechanism and time dependence of this effect warrant further study. In vitro maize kernel culture affords a convenient system for such light quality experiments.

19.
MOTIVATION: Microarrays are capable of determining the expression levels of thousands of genes simultaneously. In combination with classification methods, this technology can be useful to support clinical management decisions for individual patients, e.g. in oncology. The aim of this paper is to systematically benchmark the role of non-linear versus linear techniques and dimensionality reduction methods. RESULTS: A systematic benchmarking study is performed by comparing linear versions of standard classification and dimensionality reduction techniques with their non-linear versions based on non-linear kernel functions with a radial basis function (RBF) kernel. A total of 9 binary cancer classification problems, derived from 7 publicly available microarray datasets, and 20 randomizations of each problem are examined. CONCLUSIONS: Three main conclusions can be formulated based on the performances on independent test sets. (1) When performing classification with least squares support vector machines (LS-SVMs) (without dimensionality reduction), RBF kernels can be used without risking too much overfitting. The results obtained with well-tuned RBF kernels are never worse and sometimes even statistically significantly better compared to results obtained with a linear kernel in terms of test set receiver operating characteristic and test set accuracy performances. (2) Even for classification with linear classifiers like LS-SVM with linear kernel, using regularization is very important. (3) When performing kernel principal component analysis (kernel PCA) before classification, using an RBF kernel for kernel PCA tends to result in overfitting, especially when using supervised feature selection. It has been observed that an optimal selection of a large number of features is often an indication of overfitting. Kernel PCA with linear kernel gives better results.
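
The two kernel functions being benchmarked can be written down in a few lines. This is a generic sketch of a linear and an RBF kernel, not the study's code; the function names, the toy data, and the bandwidth parameter `gamma` are assumptions.

```python
import numpy as np

def linear_kernel(X, Y):
    """Inner product between every row of X and every row of Y."""
    return X @ Y.T

def rbf_kernel(X, Y, gamma):
    """exp(-gamma * ||x - y||^2) for every pair of rows."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

# Three toy samples with two features each (made-up values).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_lin = linear_kernel(X, X)
K_rbf = rbf_kernel(X, X, gamma=0.5)
```

The RBF kernel has a free parameter (here `gamma`) that must be tuned along with the classifier's regularization, which is where the risk of overfitting discussed in the abstract enters.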

20.
Past research on kernel growth in wheat (Triticum aestivum) has shown that the kernel itself largely regulates the influx of sucrose for consequent starch synthesis in the endosperm of the grain. The first step in the conversion of sucrose to starch is catalyzed by sucrose synthase (EC 2.4.1.13). Sucrose synthase activity was assayed in developing endosperms from kernels differing in growth rate and in maximum dry weight accumulation. From 10 to 22 days after anthesis, sucrose synthase activity per wheat endosperm remained constant with respect to time in all grains. However, kernels which had higher rates of kernel growth and which achieved greatest maximum weight had consistently and significantly higher sucrose synthase activities at any point in time than did kernels with slower rates of dry matter accumulation and lower maximum weight. In addition, larger kernels had a significantly greater amount of water in which this activity could be expressed. Although the results do not implicate sucrose synthase as the “rate limiting” enzyme in wheat kernel growth, they do emphasize the importance of sucrose synthase activity in larger or more rapidly growing kernels, as compared to smaller, slower growing kernels.
