Similar Literature
20 similar documents found
1.
Wavelet transform has been widely applied to extract characteristic information in spike sorting. As the wavelet coefficients used to distinguish various spike shapes are often disorganized, effective unsupervised methods for selecting the most discriminative features are still lacking. In this paper, we propose an unsupervised feature selection method, employing kernel density estimation to select those wavelet coefficients with bimodal or multimodal distributions. This method is tested on a simulated spike data set, and the average misclassification rate after fuzzy C-means clustering is greatly reduced, demonstrating that this kernel density estimation-based feature selection approach is effective.
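A minimal sketch of the idea under illustrative assumptions (the mode-counting rule, grid size, and scipy's default bandwidth are stand-ins for the paper's exact procedure):

```python
import numpy as np
from scipy.stats import gaussian_kde

def count_modes(values, grid_size=256):
    """Count local maxima of a Gaussian KDE fitted to one coefficient."""
    kde = gaussian_kde(values)
    grid = np.linspace(values.min(), values.max(), grid_size)
    density = kde(grid)
    # interior grid points higher than both neighbours are modes
    return int(np.sum((density[1:-1] > density[:-2]) &
                      (density[1:-1] > density[2:])))

def select_multimodal_features(coeffs):
    """coeffs: (n_spikes, n_wavelet_coeffs); keep columns whose KDE shows
    two or more modes, i.e. coefficients that separate spike classes."""
    return [j for j in range(coeffs.shape[1])
            if count_modes(coeffs[:, j]) >= 2]

# toy example: column 0 is bimodal (two spike classes), column 1 is not
rng = np.random.default_rng(0)
x = np.column_stack([
    np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)]),
    rng.normal(0, 1, 400),
])
print(select_multimodal_features(x))  # -> [0]
```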

2.
3.
In this article, we compare Wald-type, logarithmic transformation, and Fieller-type statistics for the classical 2-sided equivalence testing of the rate ratio under matched-pair designs with a binary end point. These statistics can be implemented through sample-based, constrained least squares estimation and constrained maximum likelihood (CML) estimation methods. Sample size formulae based on the CML estimation method are developed. We consider formulae that control a prespecified power or confidence width. Our simulation studies show that statistics based on the CML estimation method generally outperform other statistics and methods with respect to actual type I error rate and average width of confidence intervals. Also, the corresponding sample size formulae are valid asymptotically in the sense that the exact power and actual coverage probability for the estimated sample size are generally close to their prespecified values. The methods are illustrated with a real example from a clinical laboratory study.
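A hedged sketch of a sample-based Wald-type interval on the log scale for the paired rate ratio (not the CML method the paper recommends); the table counts and equivalence margin below are invented:

```python
import numpy as np
from scipy.stats import norm

def paired_rr_log_ci(a, b, c, n, alpha=0.10):
    """Wald-type CI on the log scale for the ratio of marginal
    proportions p1/p2 in a matched-pair 2x2 table.
    a: both positive, b: only endpoint 1, c: only endpoint 2, n: pairs."""
    p1, p2 = (a + b) / n, (a + c) / n
    log_rr = np.log(p1 / p2)
    se = np.sqrt((b + c) / ((a + b) * (a + c)))  # delta-method std. error
    z = norm.ppf(1 - alpha / 2)
    return np.exp(log_rr - z * se), np.exp(log_rr + z * se)

# TOST criterion: declare equivalence at margin delta if the (1 - 2*0.05)
# = 90% CI lies entirely inside (1/delta, delta)
lo, hi = paired_rr_log_ci(a=40, b=8, c=10, n=70, alpha=0.10)
delta = 1.25
print(lo, hi, (1 / delta < lo) and (hi < delta))
```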

4.
State space methods have proven indispensable in neural data analysis. However, common methods for performing inference in state-space models with non-Gaussian observations rely on certain approximations which are not always accurate. Here we review direct optimization methods that avoid these approximations, but that nonetheless retain the computational efficiency of the approximate methods. We discuss a variety of examples, applying these direct optimization techniques to problems in spike train smoothing, stimulus decoding, parameter estimation, and inference of synaptic properties. Along the way, we point out connections to some related standard statistical methods, including spline smoothing and isotonic regression. Finally, we note that the computational methods reviewed here do not in fact depend on the state-space setting at all; instead, the key property we are exploiting involves the bandedness of certain matrices. We close by discussing some applications of this more general point of view, including Markov chain Monte Carlo methods for neural decoding and efficient estimation of spatially-varying firing rates.

5.
The statistical analysis of neuronal spike trains by models of point processes often relies on the assumption of constant process parameters. However, it is a well-known problem that the parameters of empirical spike trains can be highly variable, such as the firing rate. In order to test the null hypothesis of a constant rate and to estimate the change points, a Multiple Filter Test (MFT) and a corresponding algorithm (MFA) have been proposed that can be applied under the assumption of independent inter-spike intervals (ISIs). As empirical spike trains often show weak dependencies in the correlation structure of ISIs, we extend the MFT here to point processes with short-range dependencies. By specifically estimating serial dependencies in the test statistic, we show that the new MFT can be applied to a variety of empirical firing patterns, including positive and negative serial correlations as well as tonic and bursty firing. The new MFT is applied to a data set of empirical spike trains with serial correlations, and simulations show improved performance over methods that assume independence. In the case of positive correlations, the new MFT is necessary to reduce the number of false positives, which can be greatly inflated when independence is falsely assumed. For the frequent case of negative correlations, the new MFT shows an improved detection probability for change points and thus a higher potential for signal extraction from noisy spike trains.
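A toy filtered-derivative statistic conveying the flavor of the MFT; the Poisson-style plug-in variance below is a simplification — the paper's contribution is precisely to replace it with a variance estimate that absorbs serial ISI correlations, and to combine several window sizes h:

```python
import numpy as np

def filtered_derivative(spikes, T, h, step=0.05):
    """Standardized difference of spike counts in adjacent windows of
    half-width h -- the basic filter statistic behind the MFT."""
    ts = np.arange(h, T - h, step)
    g = np.empty_like(ts)
    for i, t in enumerate(ts):
        left = np.sum((spikes > t - h) & (spikes <= t))
        right = np.sum((spikes > t) & (spikes <= t + h))
        # Poisson plug-in variance; the MFT instead estimates a variance
        # that accounts for serial dependencies between ISIs
        var = left + right
        g[i] = (right - left) / np.sqrt(max(var, 1.0))
    return ts, g

# toy spike train: rate jumps from 5 Hz to 15 Hz at t = 10 s
rng = np.random.default_rng(1)
sp = np.sort(np.concatenate([rng.uniform(0, 10, 50),
                             rng.uniform(10, 20, 150)]))
ts, g = filtered_derivative(sp, T=20.0, h=2.0)
print(ts[np.argmax(np.abs(g))])  # estimated change point, near 10 s
```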

6.
Case-cohort and nested case-control sampling methods have recently been introduced as a means of reducing cost in large cohort studies. The asymptotic distribution theory for relative rate estimation based on Cox-type partial or pseudo-likelihoods in case-cohort and nested case-control studies has been established. However, many researchers use (stratified) frequency-table methods for a first or primary summarization of the most important evidence on exposure-disease or dose-response relationships, i.e., the classical Mantel-Haenszel analyses, trend tests, and tests for heterogeneity of relative rates. These can be followed by exponential failure-time regression methods on grouped or individual data to model relationships between several factors and the response. In this paper we present the adaptations needed to use these methods with case-cohort designs, illustrating their use with data from a recent case-cohort study on the relationship between diet, lifestyle, and cancer. We assume a very general setup allowing piecewise-constant failure rates, possibly recurrent events per individual, independent censoring, and left truncation.
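For orientation, a sketch of the classical (full-cohort) Mantel-Haenszel incidence rate ratio over strata — the starting point that the paper adapts to case-cohort sampling; the counts are invented:

```python
import numpy as np

def mh_rate_ratio(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """Classical Mantel-Haenszel incidence rate ratio over strata i:
    RR = sum_i(a_i * PT0_i / T_i) / sum_i(c_i * PT1_i / T_i),
    where a_i/c_i are exposed/unexposed cases, PT1_i/PT0_i the
    exposed/unexposed person-time, and T_i the stratum total."""
    a, pt1 = np.asarray(cases_exp, float), np.asarray(pt_exp, float)
    c, pt0 = np.asarray(cases_unexp, float), np.asarray(pt_unexp, float)
    t = pt1 + pt0
    return np.sum(a * pt0 / t) / np.sum(c * pt1 / t)

# two age strata: exposed vs unexposed cases and person-years
print(mh_rate_ratio([12, 30], [1000, 2500], [5, 20], [1200, 2400]))
```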

7.
We present an approach to construct a classification rule based on the mass spectrometry data provided by the organizers of the "Classification Competition on Clinical Mass Spectrometry Proteomic Diagnosis Data." Before constructing a classification rule, we attempted to pre-process the data and to select features of the spectra that were likely due to true biological signals (i.e., peptides/proteins). As a result, we selected a set of 92 features. To construct the classification rule, we considered eight methods for selecting a subset of the features, combined with seven classification methods. The performance of the resulting 56 combinations was evaluated by using a cross-validation procedure with 1000 re-sampled data sets. The best result, as indicated by the lowest overall misclassification rate, was obtained by using the whole set of 92 features as the input for a support-vector machine (SVM) with a linear kernel. This method was therefore used to construct the classification rule. For the training data set, the total error rate for the classification rule, as estimated by using leave-one-out cross-validation, was equal to 0.16, with the sensitivity and specificity equal to 0.87 and 0.82, respectively.
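A minimal sketch of the winning configuration's evaluation loop — a linear-kernel SVM scored by leave-one-out cross-validation — using random placeholder data in place of the 92 spectral features:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# placeholder data standing in for the 92 selected spectral features
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 92))
y = rng.integers(0, 2, size=60)

clf = SVC(kernel="linear", C=1.0)
pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
error = np.mean(pred != y)                  # total error rate
sens = np.mean(pred[y == 1] == 1)           # sensitivity
spec = np.mean(pred[y == 0] == 0)           # specificity
print(error, sens, spec)
```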

8.
Gerard PD, Schucany WR. Biometrics 1999, 55(3):769-773
Seber (1986, Biometrics 42, 267-292) suggested an approach to biological population density estimation using kernel estimates of the probability density of detection distances in line transect sampling. Chen (1996a, Applied Statistics 45, 135-150) and others have employed cross-validation to choose a global bandwidth for the kernel estimator or have suggested adaptive kernel estimation (Chen, 1996b, Biometrics 52, 1283-1294). Because estimation of the density is required at only a single point, we investigate a local bandwidth selection procedure that is a modification of the method of Schucany (1995, Journal of the American Statistical Association 90, 535-540) for nonparametric regression. We report on simulation results comparing the proposed method and a local normal scale rule with cross-validation and adaptive estimation. The local bandwidths and the normal scale rule produce estimates with mean squared errors that are half the size of the others in most cases. Consistency results are also provided.
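A sketch of the underlying estimator, D = n·f(0)/(2L), with f(0) from a boundary-reflected Gaussian KDE; scipy's global bandwidth rule is used here, whereas the paper's point is to choose a local bandwidth at the single evaluation point 0:

```python
import numpy as np
from scipy.stats import gaussian_kde

def transect_density(distances, total_line_length, bandwidth=None):
    """Line-transect density estimate D = n * f(0) / (2L), with f(0)
    estimated by a KDE reflected about zero (distances are >= 0)."""
    d = np.asarray(distances, float)
    reflected = np.concatenate([d, -d])   # reflection handles the boundary
    kde = gaussian_kde(reflected, bw_method=bandwidth)
    f0 = 2 * kde(0.0)[0]                  # fold reflected density onto [0, inf)
    return len(d) * f0 / (2 * total_line_length)

rng = np.random.default_rng(3)
dists = np.abs(rng.normal(0, 20.0, 120))  # half-normal detection distances
print(transect_density(dists, total_line_length=5000.0))
```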

9.
Estimation of the power spectrum is a common method for identifying oscillatory changes in neuronal activity. However, the stochastic nature of neuronal activity leads to severe biases in the estimation of these oscillations in single unit spike trains. Different biological and experimental factors cause the spike train to differentially reflect its underlying oscillatory rate function. We analyzed the effect of factors, such as the mean firing rate and the recording duration, on the detectability of oscillations and their significance, and tested these theoretical results on experimental data recorded in Parkinsonian non-human primates. The effect of these factors is dramatic, such that in some conditions, the detection of existing oscillations is impossible. Moreover, these biases impede the comparison of oscillations across brain regions, neuronal types, behavioral states and separate recordings with different underlying parameters, and lead inevitably to a gross misinterpretation of experimental results. We introduce a novel objective measure, the "modulation index", which overcomes these biases, and enables reliable detection of oscillations from spike trains and a direct estimation of the oscillation magnitude. The modulation index detects a high percentage of oscillations over a wide range of parameters, compared to classical spectral analysis methods, and enables an unbiased comparison between spike trains recorded from different neurons and using different experimental protocols.
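A small illustration of the detectability problem (not the paper's modulation index): an oscillatory rate is thinned into spikes, binned, and analyzed with Welch's method; the spectral peak must be judged against a noise floor that scales with the mean firing rate:

```python
import numpy as np
from scipy import signal

# bin a spike train and estimate its spectrum with Welch's method; the
# flat "noise floor" of a point process scales with its mean firing rate,
# which is the bias the paper's modulation index is designed to remove
rng = np.random.default_rng(4)
fs, T, rate, f_osc = 1000.0, 100.0, 20.0, 12.0
t = np.arange(0, T, 1 / fs)
lam = rate * (1 + 0.5 * np.sin(2 * np.pi * f_osc * t))   # oscillating rate
binned = (rng.random(t.size) < lam / fs).astype(float)   # Bernoulli spikes

freqs, pxx = signal.welch(binned - binned.mean(), fs=fs, nperseg=4096)
peak = pxx[np.argmin(np.abs(freqs - f_osc))]
# peak-to-floor ratio; grows with firing rate and recording length, so
# the same oscillation can be invisible at low rates or short recordings
print(peak / np.median(pxx))
```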

10.
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
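A hedged sketch of weighted kernel combination with AUC as the criterion, using random search over the weight simplex as a stand-in for the paper's evolutionary algorithm and a single holdout AUC instead of the full evaluation protocol:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel, polynomial_kernel

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.5, 200) > 0.5).astype(int)
tr, te = np.arange(140), np.arange(140, 200)

# three "data sources" represented as three kernels on the same samples
kernels = [linear_kernel(X, X), rbf_kernel(X, X, gamma=0.1),
           polynomial_kernel(X, X, degree=2)]

def auc_for_weights(w):
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))       # weighted kernel sum
    clf = SVC(kernel="precomputed").fit(K[np.ix_(tr, tr)], y[tr])
    scores = clf.decision_function(K[np.ix_(te, tr)])
    return roc_auc_score(y[te], scores)

# random search over the weight simplex -- a stand-in for the paper's
# evolutionary algorithm; equal weights are the baseline being beaten
best_w, best_auc = None, -1.0
for _ in range(200):
    w = rng.dirichlet(np.ones(3))
    a = auc_for_weights(w)
    if a > best_auc:
        best_w, best_auc = w, a
print(best_w, best_auc, auc_for_weights(np.ones(3) / 3))
```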

11.
Koyama S, Shinomoto S. Bio Systems 2007, 89(1-3):69-73
We have recently established an empirical Bayes method that extracts both the intrinsic irregularity and the time-dependent rate from a spike sequence [Koyama, S., Shinomoto, S., 2005. Empirical Bayes interpretations of random point events. J. Phys. A: Math. Gen. 38, L531-L537]. In the present paper, we examine an alternative method based on the more fundamental principle of minimizing the Kullback-Leibler information from the original distribution of spike sequences to a model distribution. Not only the empirical Bayes method but also the Kullback-Leibler information method exhibits a switch of the most plausible interpretation of the spikes between (I) being derived irregularly from a nearly constant rate, and (II) being derived rather regularly from a significantly fluctuating rate. The model distributions selected by both methods are similar for the same spike sequences derived from a given rate-fluctuating gamma process.
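As a stationary point of reference (not the paper's empirical Bayes or Kullback-Leibler machinery), fitting a gamma distribution to the ISIs already separates the two quantities at issue — irregularity (shape) and rate:

```python
from scipy.stats import gamma

# stationary special case of the rate-fluctuating gamma process: fit a
# gamma distribution to the ISIs, separating irregularity (shape kappa:
# ~1 Poisson-like, >1 regular) from the firing rate (1 / mean ISI)
isis = gamma.rvs(a=3.0, scale=1 / (3.0 * 20.0), size=1000, random_state=0)
kappa, _, scale = gamma.fit(isis, floc=0)
print(kappa, 1.0 / (kappa * scale))   # irregularity ~3, rate ~20 Hz
```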

12.
Nonparametric feature selection for high-dimensional data is an important and challenging problem in the fields of statistics and machine learning. Most of the existing methods for feature selection focus on parametric or additive models which may suffer from model misspecification. In this paper, we propose a new framework to perform nonparametric feature selection for both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space. The space is generated by a novel tensor product kernel, which depends on a set of parameters that determines the importance of the features. Computationally, we minimize the empirical risk with a penalty to estimate the prediction and kernel parameters simultaneously. The solution can be obtained by iteratively solving convex optimization problems. We study the theoretical property of the kernel feature space and prove the oracle selection property and Fisher consistency of our proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via extensive simulation studies and applications to two real studies.
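A crude, derivative-free sketch of the central ingredient — a product kernel whose per-feature parameters control feature importance — fitted by kernel ridge regression with a sparsity penalty on a validation split; the paper instead minimizes a penalized empirical risk by iteratively solving convex problems:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n, p = 120, 6
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, n)  # only 0,1 matter
tr, va = np.arange(80), np.arange(80, 120)

def weighted_kernel(A, B, theta):
    """Product Gaussian kernel exp(-sum_j theta_j (a_j - b_j)^2);
    driving theta_j to zero removes feature j from the model."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * theta).sum(-1)
    return np.exp(-d2)

def objective(log_theta, lam=1e-2, mu=0.05):
    theta = np.exp(log_theta)
    K = weighted_kernel(X[tr], X[tr], theta)
    alpha = np.linalg.solve(K + lam * np.eye(len(tr)), y[tr])  # kernel ridge
    pred = weighted_kernel(X[va], X[tr], theta) @ alpha
    return np.mean((y[va] - pred) ** 2) + mu * theta.sum()     # risk + penalty

res = minimize(objective, np.full(p, -1.0), method="Nelder-Mead",
               options={"maxiter": 4000})
print(np.round(np.exp(res.x), 3))   # weights on irrelevant features shrink
```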

13.
Precise spike coordination between the spiking activities of multiple neurons is suggested as an indication of coordinated network activity in active cell assemblies. Spike correlation analysis aims to identify such cooperative network activity by detecting excess spike synchrony in simultaneously recorded multiple neural spike sequences. Cooperative activity is expected to organize dynamically during behavior and cognition; therefore currently available analysis techniques must be extended to enable the estimation of multiple time-varying spike interactions between neurons simultaneously. In particular, new methods must take advantage of the simultaneous observations of multiple neurons by addressing their higher-order dependencies, which cannot be revealed by pairwise analyses alone. In this paper, we develop a method for estimating time-varying spike interactions by means of a state-space analysis. Discretized parallel spike sequences are modeled as multi-variate binary processes using a log-linear model that provides a well-defined measure of higher-order spike correlation in an information geometry framework. We construct a recursive Bayesian filter/smoother for the extraction of spike interaction parameters. This method can simultaneously estimate the dynamic pairwise spike interactions of multiple single neurons, thereby extending the Ising/spin-glass model analysis of multiple neural spike train data to a nonstationary analysis. Furthermore, the method can estimate dynamic higher-order spike interactions. To validate the inclusion of the higher-order terms in the model, we construct an approximation method to assess the goodness-of-fit to spike data. In addition, we formulate a test method for the presence of higher-order spike correlation even in nonstationary spike data, e.g., data from awake behaving animals. The utility of the proposed methods is tested using simulated spike data with known underlying correlation dynamics. Finally, we apply the methods to neural spike data simultaneously recorded from the motor cortex of an awake monkey and demonstrate that the higher-order spike correlation organizes dynamically in relation to a behavioral demand.
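A static two-neuron version of the log-linear interaction measure (the paper's state-space model lets such parameters vary in time and extends them to higher orders):

```python
import numpy as np

def pairwise_theta(words):
    """Static pairwise log-linear interaction for two binary neurons:
    theta = log(p11*p00 / (p10*p01)). The paper tracks such thetas (and
    higher-order ones) over time with a recursive filter/smoother."""
    eps = 1e-9
    p = {(a, b): np.mean((words[:, 0] == a) & (words[:, 1] == b)) + eps
         for a in (0, 1) for b in (0, 1)}
    return np.log(p[1, 1] * p[0, 0] / (p[1, 0] * p[0, 1]))

rng = np.random.default_rng(7)
indep = (rng.random((5000, 2)) < 0.2).astype(int)
drive = rng.random((5000, 1)) < 0.1            # shared input -> synchrony
corr = ((rng.random((5000, 2)) < 0.15) | drive).astype(int)
print(pairwise_theta(indep), pairwise_theta(corr))   # ~0 vs clearly > 0
```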

14.
龙依, 蒋馥根, 孙华, 王天宏, 邹琪, 陈川石. 生态学报 2022, 42(12):4933-4945
Vegetation carbon storage estimation is an important component of natural-resource monitoring. Remote sensing inversion combined with ground sample plots yields a spatially continuous distribution of vegetation carbon storage over a region, making up for the limitations of traditional field sampling surveys. However, most existing parametric and non-parametric remote sensing estimation models ignore the variability and spatial autocorrelation of the plot data. In this study, remote sensing variables were extracted from Landsat 8 OLI imagery and combined with field measurements of vegetation carbon storage. Optimal bandwidths were determined with the corrected Akaike information criterion (AICc), the maximum spatial autocorrelation distance (MSAD), and cross-validation (CV), and geographically weighted regression (GWR) models were built with Gaussian, Bi-square, and Exponential kernel functions to estimate the vegetation carbon storage of Shenzhen, with multiple linear regression (MLR) as the comparison; the best model was used to map the spatial distribution of vegetation carbon storage across Shenzhen. The results show that the GWR models outperformed the MLR model overall: the coefficient of determination (R²) of every GWR model was higher than that of the MLR model, and the root mean square error (RMSE) and mean absolute error (MAE) were markedly lower. The choice of bandwidth and kernel function significantly affected the GWR estimates. The GWR model built with the CV-selected bandwidth and the Exponential kernel performed best, with an R² of 0.697 and an RMSE of 10.437 Mg C/hm², an accuracy improvement of 13.87%–32... over the other models
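A bare-bones GWR sketch with an exponential kernel, fitting one weighted least-squares regression per prediction location; the coordinates, predictors, and bandwidth are synthetic stand-ins for the Landsat-derived variables and the CV-selected bandwidth:

```python
import numpy as np

def gwr_predict(coords, X, y, coords0, x0, bandwidth):
    """Geographically weighted regression: each location gets its own
    weighted least-squares fit, with weights decaying with distance
    (the paper selects the bandwidth by AICc, MSAD, or CV)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    d = np.linalg.norm(coords - coords0, axis=1)
    w = np.exp(-d / bandwidth)                       # exponential kernel
    W = np.diag(w)
    beta = np.linalg.solve(X1.T @ W @ X1, X1.T @ W @ y)
    return np.concatenate([[1.0], x0]) @ beta

rng = np.random.default_rng(8)
coords = rng.uniform(0, 100, (300, 2))
X = rng.normal(size=(300, 3))                  # e.g. remote sensing predictors
beta_local = 1 + coords[:, 0] / 100            # spatially varying effect
y = 20 + beta_local * X[:, 0] + rng.normal(0, 1, 300)
print(gwr_predict(coords, X, y, coords[0], X[0], bandwidth=15.0))
```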

15.
The study of biogeography has benefited from the exponential increase of DNA sequence data from recent molecular systematic studies, the development of analytical methods over the last decade for divergence time estimation and geographic area analysis, and the availability of large-scale distribution data for species in many groups of organisms. The underlying principle of divergence time estimation from DNA and protein data is that sequence divergence depends on the product of evolutionary rate and time. With their molecular clock hypothesis, Zuckerkandl and Pauling (1965) separated rates of molecular evolution from time by incorporating fossil evidence. Originally,
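The stated principle in a few lines, with invented numbers: calibrate the rate from a fossil-dated split, then convert another pairwise distance into a time (distance accumulates along both lineages, hence the factor of 2):

```python
# calibrate a substitution rate from a fossil-dated split, then date
# another pair; d is accumulated along BOTH lineages, so t = d / (2r)
d_calib, t_calib = 0.12, 60.0        # substitutions/site; split age in My
rate = d_calib / (2 * t_calib)       # per site per My
d_query = 0.05
print(d_query / (2 * rate))          # estimated divergence time: 25 My
```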

16.
Neuron models, in particular conductance-based compartmental models, often have numerous parameters that cannot be directly determined experimentally and must be constrained by an optimization procedure. A common practice in evaluating the utility of such procedures is using a previously developed model to generate surrogate data (e.g., traces of spikes following step current pulses) and then challenging the algorithm to recover the original parameters (e.g., the value of maximal ion channel conductances) that were used to generate the data. In this fashion, the success or failure of the model fitting procedure to find the original parameters can be easily determined. Here we show that some model fitting procedures that provide an excellent fit in the case of such model-to-model comparisons provide ill-balanced results when applied to experimental data. The main reason is that surrogate and experimental data test different aspects of the algorithm’s function. When considering model-generated surrogate data, the algorithm is required to locate a perfect solution that is known to exist. In contrast, when considering experimental target data, there is no guarantee that a perfect solution is part of the search space. In this case, the optimization procedure must rank all imperfect approximations and ultimately select the best approximation. This aspect is not tested at all when considering surrogate data since at least one perfect solution is known to exist (the original parameters) making all approximations unnecessary. Furthermore, we demonstrate that distance functions based on extracting a set of features from the target data (such as time-to-first-spike, spike width, spike frequency, etc.)—rather than using the original data (e.g., the whole spike trace) as the target for fitting—are capable of finding imperfect solutions that are good approximations of the experimental data.
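A minimal illustration of a feature-based distance: reduce each trace to summary features and compare those, rather than comparing traces point by point; the features, threshold, and scales below are illustrative choices:

```python
import numpy as np

def spike_features(t, v, thresh=0.0):
    """Reduce a voltage trace to summary features (time to first spike,
    spike count); ranking candidate fits by distances between such
    feature vectors is more forgiving of imperfect solutions than a
    point-by-point trace error."""
    crossings = np.flatnonzero((v[:-1] < thresh) & (v[1:] >= thresh))
    first = t[crossings[0]] if len(crossings) else np.inf
    return np.array([first, len(crossings)])

def feature_distance(fa, fb, scale):
    return np.linalg.norm((fa - fb) / scale)   # per-feature normalization

t = np.linspace(0, 1, 10001)
target = np.sin(2 * np.pi * 7 * t) - 0.2       # stand-in "experimental" trace
model = np.sin(2 * np.pi * 6 * t) - 0.2        # candidate model output
print(feature_distance(spike_features(t, target), spike_features(t, model),
                       scale=np.array([0.01, 1.0])))
```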

17.
Home-range estimators are commonly tested with simulated animal locational data in the laboratory before the estimators are used in practice. Although kernel density estimation (KDE) has performed well as a home-range estimator for simulated data, several recent studies have reported its poor performance when used with data collected in the field. This difference may be because KDE and other home-range estimators are generally tested with simulated point locations that follow known statistical distributions, such as bivariate normal mixtures, which may not represent well the space-use patterns of all wildlife species. We used simulated animal locational data of 5 point pattern shapes that represent a range of wildlife utilization distributions to test 4 methods of home-range estimation: 1) KDE with reference bandwidths, 2) KDE with least-squares cross-validation, 3) KDE with plug-in bandwidths, and 4) minimum convex polygon (MCP). For the point patterns we simulated, MCP tended to produce more accurate area estimates than KDE methods. However, MCP estimates were markedly unstable, with bias varying widely with both sample size and point pattern shape. The KDE methods performed best for concave distributions, which are similar to bivariate normal mixtures, but still overestimated home ranges by about 40–50% even in the best cases. For convex, linear, perforated, and disjoint point patterns, KDE methods overestimated home-range sizes by 50–300%, depending on sample size and method of bandwidth selection. These results indicate that KDE does not produce home-range estimates that are as accurate as the literature suggests, and we recommend exploring other techniques of home-range estimation.
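A sketch of the MCP estimator via a convex hull (the KDE alternatives would add a bandwidth choice on top); the percent-MCP outlier rule here is one common convention, not necessarily the paper's:

```python
import numpy as np
from scipy.spatial import ConvexHull

def mcp_area(points, percent=100):
    """Minimum convex polygon home-range estimate: optionally drop the
    most outlying fixes (e.g. 95% MCP), then take the hull area."""
    pts = np.asarray(points, float)
    if percent < 100:
        d = np.linalg.norm(pts - pts.mean(0), axis=1)
        pts = pts[d <= np.percentile(d, percent)]
    return ConvexHull(pts).volume   # in 2-D, .volume is the polygon area

rng = np.random.default_rng(9)
fixes = rng.normal(0, 100.0, (50, 2))   # 50 relocations of one animal
print(mcp_area(fixes), mcp_area(fixes, percent=95))
```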

18.
A number of important data analysis problems in neuroscience can be solved using state-space models. In this article, we describe fast methods for computing the exact maximum a posteriori (MAP) path of the hidden state variable in these models, given spike train observations. If the state transition density is log-concave and the observation model satisfies certain standard assumptions, then the optimization problem is strictly concave and can be solved rapidly with Newton–Raphson methods, because the Hessian of the loglikelihood is block tridiagonal. We can further exploit this block-tridiagonal structure to develop efficient parameter estimation methods for these models. We describe applications of this approach to neural decoding problems, with a focus on the classic integrate-and-fire model as a key example.
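A minimal sketch of the key computational point for a scalar state: with a Gaussian random-walk prior and Poisson spike-count observations, the negative Hessian is tridiagonal, so each Newton step is a banded solve costing O(T); the model and hyperparameters are illustrative:

```python
import numpy as np
from scipy.linalg import solveh_banded

def map_rate(y, dt=0.01, q=0.05, iters=20):
    """MAP path of the log-rate x under a random-walk prior and Poisson
    counts y_t ~ Poisson(exp(x_t)*dt). The negative Hessian is
    tridiagonal, so each Newton step costs O(T) via a banded solver."""
    T = len(y)
    x = np.full(T, np.log(max(y.mean(), 1e-3) / dt))
    # prior precision (1/q) * L, with L the second-difference matrix
    main_prior = np.full(T, 2.0 / q); main_prior[[0, -1]] = 1.0 / q
    off_prior = np.full(T - 1, -1.0 / q)
    for _ in range(iters):
        lam = np.exp(x) * dt
        grad = (y - lam
                - np.concatenate([[0.0], np.diff(x)]) / q
                + np.concatenate([np.diff(x), [0.0]]) / q)
        ab = np.zeros((2, T))            # banded storage, upper form
        ab[0, 1:] = off_prior            # superdiagonal of -Hessian
        ab[1] = lam + main_prior         # diagonal of -Hessian
        x = x + solveh_banded(ab, grad)  # Newton step in O(T)
    return np.exp(x)

rng = np.random.default_rng(10)
true_rate = 10 * np.exp(np.sin(np.linspace(0, 4 * np.pi, 500)))
y = rng.poisson(true_rate * 0.01)
print(map_rate(y)[:5])
```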

19.
One key problem in computational neuroscience and neural engineering is the identification and modeling of functional connectivity in the brain using spike train data. To reduce model complexity, alleviate overfitting, and thus facilitate model interpretation, sparse representation and estimation of functional connectivity is needed. Sparsities include global sparsity, which captures the sparse connectivities between neurons, and local sparsity, which reflects the active temporal ranges of the input-output dynamical interactions. In this paper, we formulate a generalized functional additive model (GFAM) and develop the associated penalized likelihood estimation methods for such a modeling problem. A GFAM consists of a set of basis functions convolving the input signals, and a link function generating the firing probability of the output neuron from the summation of the convolutions weighted by the sought model coefficients. Model sparsities are achieved by using various penalized likelihood estimations and basis functions. Specifically, we introduce two variations of the GFAM using a global basis (e.g., Laguerre basis) and group LASSO estimation, and a local basis (e.g., B-spline basis) and group bridge estimation, respectively. We further develop an optimization method based on quadratic approximation of the likelihood function for the estimation of these models. Simulation and experimental results show that both group-LASSO-Laguerre and group-bridge-B-spline can capture faithfully the global sparsities, while the latter can replicate accurately and simultaneously both global and local sparsities. The sparse models outperform the full models estimated with the standard maximum likelihood method in out-of-sample predictions.
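A hedged sketch of the GFAM pipeline, with plain exponential basis functions standing in for the Laguerre basis and an L1-penalized logistic regression standing in for group LASSO / group bridge (scikit-learn has no group penalty):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
T, M = 5000, 40
x = (rng.random(T) < 0.1).astype(float)          # input spike train

# decaying temporal basis functions standing in for the Laguerre basis
lags = np.arange(M)
basis = np.array([np.exp(-lags / tau) / tau for tau in (2.0, 5.0, 10.0, 20.0)])

# design matrix: each column is the input convolved with one basis function
X = np.column_stack([np.convolve(x, b)[:T] for b in basis])

# ground truth uses only the fastest basis function -> a sparse model
eta = -3.0 + 8.0 * X[:, 0]
y = (rng.random(T) < 1 / (1 + np.exp(-eta))).astype(int)

# L1-penalized logistic regression as a stand-in for the group penalties;
# nonzero coefficients indicate the active basis functions
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print(np.round(clf.coef_, 2))
```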

20.
Choosing an appropriate kernel is critical when classifying a new problem with a Support Vector Machine. So far, more attention has been paid to constructing new kernels and choosing suitable parameter values for a specific kernel function than to kernel selection. Furthermore, most current kernel selection methods focus on seeking the single best kernel with the highest classification accuracy via cross-validation; they are time-consuming and ignore the differences in the number of support vectors and in the CPU time of SVMs with different kernels. Considering the trade-off between classification success ratio and CPU time, multiple kernel functions may perform equally well on the same classification problem. To automatically select appropriate kernel functions for a given data set, we propose a multi-label-learning-based kernel recommendation method built on data characteristics. For each data set, a meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. A kernel recommendation model is then constructed on the generated meta-knowledge data base with a multi-label classification method. Finally, appropriate kernel functions are recommended for a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave, and Circular), and five multi-label classification methods demonstrate that, compared with existing kernel selection methods and the most widely used RBF kernel, SVMs with the kernel functions recommended by our method achieve the highest classification performance.
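A minimal sketch of the recommendation step on an invented meta-knowledge base: data-set characteristics as meta-features, one binary label per kernel for "performed acceptably", and an off-the-shelf multi-label classifier:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(12)
# meta-knowledge base: one row per data set -- simple data characteristics
# (e.g. n_samples, n_features, class balance, mean feature correlation)
meta_X = rng.random((120, 5))
# multi-label target: 1 if a kernel performed acceptably on that data set
# (columns might correspond to linear, RBF, polynomial, sigmoid, ...)
meta_Y = (rng.random((120, 4)) + 0.3 * meta_X[:, :4] > 0.7).astype(int)

recommender = MultiOutputClassifier(RandomForestClassifier(random_state=0))
recommender.fit(meta_X, meta_Y)

new_dataset_characteristics = rng.random((1, 5))
print(recommender.predict(new_dataset_characteristics))  # recommended kernels
```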

