Similar articles
 20 similar articles found; search took 15 ms
1.
The presence of missing values in gel-based proteomics data represents a real challenge if an objective statistical analysis is pursued. Different methods to handle missing values were evaluated, and their influence on the selection of important proteins through multivariate techniques is discussed. The evaluated methods consisted of dealing with missing values directly during the multivariate analysis with the nonlinear estimation by iterative partial least squares (NIPALS) algorithm, or imputing them by using either k-nearest neighbor or Bayesian principal component analysis (BPCA) before carrying out the multivariate analysis. These techniques were applied to data obtained from gels stained with classical postrunning dyes and from DIGE gels. Before applying the multivariate techniques, the normality and homoscedasticity assumptions on which parametric tests are based were tested in order to perform a sound statistical analysis. Of the three methods tested for handling missing values in our datasets, BPCA imputation proved to be the most consistent.
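The k-nearest neighbor imputation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_impute`, the root-mean-square distance over shared columns, and the toy `spots` matrix are all assumptions made for illustration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows that observe that column."""
    def distance(a, b):
        # Root-mean-square difference over the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]  # work on a copy, leave the input untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row in which column j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical spot intensities: rows are spots, columns are gels.
spots = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
]
imputed = knn_impute(spots, k=2)
```

Here the missing value in the second row is filled from the first and third rows, which are its closest neighbors; the distant fourth row is ignored.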

2.

Background

The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low density single nucleotide polymorphism (SNP) panels (i.e., 3K or 7K) to a high density panel with 50K SNPs. No pedigree information was used.

Methods

Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content.

Results

In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90% and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, the computing time required by the partial least squares regression method was on average around 10 times lower than that required by Beagle. Using the partial least squares regression method in the multi-breed approach resulted in lower imputation accuracies than using single-breed data. The impact of SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip.

Conclusions

Results of the present work suggested that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.

3.
In this paper, a nonlinear model depending on one modifying factor is used to study the dependence of the PAPOVA virus reduction factor on the irradiation dose. In this model, least squares estimates of the parameters are obtained using the linearization method with initial guess values. An extension of the initial exponential model is proposed for the case in which the viral material is influenced by several modifying factors.

4.
Investigation of protein-ligand interactions obtained from experiments plays a crucial part in the design of new and effective drugs. Analyzing the data extracted from known interactions could help scientists to predict the binding affinities of promising ligands before conducting experiments. The objective of this study is to advance the CIFAP (compressed images for affinity prediction) method, which is relevant to a protein-ligand model, identifying 2D electrostatic potential images by separating the binding site of protein-ligand complexes and using the images for predicting the computational affinity information represented by pIC50 values. The CIFAP method has 2 phases, namely, data modeling and prediction. In the data modeling phase, the separated 3D structure of the binding pocket with the ligand inside is fitted into an electrostatic potential grid box, which is then compressed through 3 orthogonal directions into three 2D images for each protein-ligand complex. The sequential floating forward selection technique is used to acquire prediction patterns from the images. In the prediction phase, support vector regression (SVR) and partial least squares regression are used for testing the quality of the CIFAP method for predicting the binding affinity of 45 CHK1 inhibitors derived from 2-aminothiazole-4-carboxamide. The results show that the CIFAP method using both support vector regression and partial least squares regression is very effective for predicting the binding affinities of CHK1-ligand complexes with low error values and high correlation. As future work, the results could be improved by working on the pose of the ligands inside the grid.

5.
A method is described for fitting a 'fraction labelled mitoses' (FLM) curve to a set of data points and for estimating the values of the best fitting parameters of the cell cycle. Estimates of the SE of the parameters are obtained. The method depends on the fact that when gamma distributions are used to describe the durations of the phases of the cell cycle, the Laplace transform of an FLM curve can be described by simple analytic functions, enabling a least squares fit to be made to a set of Laplace transforms of the experimental data. The method is easy to program and quick to execute.

6.
A discrete time cell cycle kinetics model is developed to account for the effects of cytotoxic chemotherapy, particularly including the existence of cells destined to die. A model structure is determined from related experiments, leaving key parameter values undetermined. These values are found by determining the best least squares fit of the predicted to the observed DNA distribution data at a series of time intervals. The numerical methods include separable least squares, linear inequality constrained least squares and the Gauss-Newton method. This approach is applied to an experiment in which the Ehrlich ascites tumour was given a single dose of bleomycin. The results include several different parameters, including the age response function and a time series of cell age and DNA distributions, which can be used as a basis for further treatment.

7.
Amino acid sequences, carbohydrate compositions and residue volumes are used to critically compare calculations of partial specific volumes v, neutron scattering matchpoints and 280-nm absorption coefficients with experimental v values for proteins and glycoproteins. The v values obtained from amino acid densitometry underestimate experimental v values by 0.01-0.02 ml/g, while the v values from crystallographic volumes overestimate the experimental v values by 0.04-0.05 ml/g. An intermediate consensus set of amino-acid-residue volumes is proposed in order to predict experimental v values using sequence information. The method is extended to carbohydrates and glycoproteins. Neutron scattering matchpoints can be calculated from crystallographic residue volumes on the basis of the non-exchange of 10% of the main-chain NH protons. Crystallographic results on protein-bound water are used to account for the experimental values of v and matchpoints. Finally, 280-nm absorption coefficients, A(1%,1cm)280, of 5-27 are found to be well predicted by the Wetlaufer procedure based on the totals of Trp, Tyr and Cys residues. Average errors are ± 0.7, and the experimental A(1%,1cm)280 values can be larger than the predicted values by 3%.

8.
MOTIVATION: Gene expression data often contain missing expression values. Effective missing value estimation methods are needed since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in gene expression data; they exploit local similarity structures in the data as well as a least squares optimization process. RESULTS: The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. The similar genes are chosen as the k nearest neighbors or as the k coherent genes that have large absolute values of Pearson correlation coefficients. A non-parametric missing value estimation variant of LLSimpute is designed by introducing an automatic k-value estimator. In our experiments, the proposed LLSimpute method shows competitive results when compared with other imputation methods for missing value estimation on various datasets and percentages of missing values in the data. AVAILABILITY: The software is available at http://www.cs.umn.edu/~hskim/tools.html CONTACT: hpark@cs.umn.edu

9.
At 20 °C, in a phosphate buffer, pH 5.8-8.0, methanol and aniline interactions with hemoglobin and cytochrome c were studied using the difference spectrophotometry method. The difference absorption spectra are characterized by the following values of λmax and λmin (nm): I, MeOH-hemoglobin (405 and 420); II, MeOH-cytochrome c (405-406 and 419-422); III, aniline-cytochrome c (421-410 and 401-396). The values of λmax and λmin for system III shift toward shorter wavelengths, from 421 to 410 nm and from 401 to 396 nm, respectively, within the pH range of 5.8-7.95. From the difference spectra for systems I, II and III, the dissociation constants, Ks, of the complexes obtained were calculated. Log Ks is linearly dependent on pH. System I is characterized by two values of Ks at all pH values. The Ks values were calculated in general form from the dependences obtained. The nature of the complexes is discussed.

10.
The Gauss-peak spectra (GPS) method represents individual pigment spectra as weighted sums of Gaussian functions, and uses these to model absorbance spectra of phytoplankton pigment mixtures. We here present several improvements for this type of methodology, including adaptation to plate reader technology and efficient model fitting by open source software. We use a one-step modeling of both pigment absorption and background attenuation with non-negative least squares, following a one-time instrument-specific calibration. The fitted background is shown to be higher than a solvent blank, with features reflecting contributions from both scatter and non-pigment absorption. We assessed pigment aliasing due to absorption spectra similarity by Monte Carlo simulation, and used this information to select a robust set of identifiable pigments that are also expected to be common in natural samples. To test the method’s performance, we analyzed absorbance spectra of pigment extracts from sediment cores, 75 natural lake samples, and four phytoplankton cultures, and compared the estimated pigment concentrations with concentrations obtained using high performance liquid chromatography (HPLC). The deviance between observed and fitted spectra was generally very low, indicating that measured spectra could successfully be reconstructed as weighted sums of pigment and background components. Concentrations of total chlorophylls and total carotenoids could accurately be estimated for both sediment and lake samples, but individual pigment concentrations (especially carotenoids) proved difficult to resolve due to similarity between their absorbance spectra. In general, our modified-GPS method provides an improvement of the GPS method that is a fast, inexpensive, and high-throughput alternative for screening of pigment composition in samples of phytoplankton material.

11.
A discrete time cell cycle kinetics model is developed to account for the effects of cytotoxic chemotherapy, particularly including the existence of cells destined to die. A model structure is determined from related experiments, leaving key parameter values undetermined. These values are found by determining the best least squares fit of the predicted to the observed DNA distribution data at a series of time intervals. The numerical methods include separable least squares, linear inequality constrained least squares and the Gauss-Newton method. This approach is applied to an experiment in which the Ehrlich ascites tumour was given a single dose of bleomycin. The results include several different parameters, including the age response function and a time series of cell age and DNA distributions, which can be used as a basis for further treatment.

12.
A novel method is proposed to determine deductively and uniquely the values of three parameters, a, b, and c, in a fractional function of the form y = a + bx/(c + x), where x and y are experimentally obtainable variables. This type of equation is frequently encountered in chemistry and biochemistry involving relaxation kinetics. The method of least squares with the Taylor expansion is employed for direct curve fitting of observed data to the fractional function. Approximate values of the parameters, which are always necessary prior to commencing the above procedure, can be obtained by the method of rearrangement after canceling the denominator of the fractional function. This procedure is very simple, but very effective for estimating provisional values of the parameters. Deductive and unique determination of the parameters involved in the fractional function shown above can be accomplished for the first time by the combination of these two procedures. This method is extended to include the analysis of relaxation kinetic data such as those of the temperature-jump method, where the determination of equilibrium concentrations of reactants in addition to the three parameters is also necessary.
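The rearrangement step described in this abstract can be sketched as follows. Multiplying y = a + bx/(c + x) by (c + x) gives xy = ac + (a + b)x - cy, which is linear in the unknowns ac, a + b and c. The code below is only one plausible reading of that step, not the authors' procedure; the function names and the noise-free synthetic data are assumptions, and the subsequent Taylor-expansion refinement is not shown.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def provisional_parameters(xs, ys):
    """Estimate a, b, c in y = a + b*x/(c + x) from the rearranged linear form."""
    # x*y = a*c + (a + b)*x - c*y, so regress x*y on the columns [1, x, -y].
    design = [[1.0, x, -y] for x, y in zip(xs, ys)]
    target = [x * y for x, y in zip(xs, ys)]
    # Normal equations: (D^T D) w = D^T t with w = (a*c, a + b, c).
    normal = [[sum(row[i] * row[j] for row in design) for j in range(3)]
              for i in range(3)]
    moment = [sum(row[i] * t for row, t in zip(design, target)) for i in range(3)]
    p, q, r = solve_linear(normal, moment)
    c = r
    a = p / c
    b = q - a
    return a, b, c

# Synthetic, noise-free data generated with a = 1, b = 3, c = 2 (assumed values).
xs = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
ys = [1.0 + 3.0 * x / (2.0 + x) for x in xs]
a, b, c = provisional_parameters(xs, ys)
```

With noisy data the rearranged regression is biased because y appears on both sides, which is presumably why the abstract treats these values only as provisional starting points.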

13.
The pH-dependence of RNAase A and of Nτ-carboxymethylhistidine-12-RNAase (ribonucleate 3'-pyrimidino-oligonucleotidohydrolase) catalysis was studied. Apparent acid dissociation constants were obtained by least squares analysis of the kinetics data. These dissociation constants were compared with pKa values of model imidazole compounds, and with pKa values of histidine residues 12 and 119 on the protein. The shapes of the kcat versus pH profiles for RNAase A and its carboxymethyl derivative are very similar, from which it is concluded that the mechanism of catalysis is closely similar in the two proteins. Apparent pKa values obtained from the kinetic data are higher for the carboxymethylated protein than for RNAase A, as are the pKa values of residues 12 and 119. The similar shifts are consistent with the conclusions that both these residues are functionally significant in the native and modified enzymes, and that an unblocked τ-nitrogen on histidine-12 is not essential for activity. From the enzyme's catalytic dependence on pH and the NMR-determined pKa values, we propose that histidines 12 and 119 function catalytically in their basic and acidic forms, respectively.

14.
A desk-top computing system has been programmed to store accurately the quench curves necessary for the calculation of total disintegrations/minute (d.p.m.) of samples containing either one or two radioactive isotopes. In producing d.p.m. values, background counts are subtracted, and in binary-labelled samples the counts attributable to each radioactive isotope are separated. The programme also relates d.p.m. to the weight, volume and density of the sample. Each variable is easily recalled and adjustments can be made for different batches of samples without reprogramming. Changes of radioactive isotope, quenching agent, scintillator or window setting can be accommodated equally easily. Quadratic equations are used to express the quench curves. Counting efficiencies obtained when the coefficients in the quadratic equations are derived from three carefully chosen points on a quench curve are compared with those obtained when the coefficients are derived by the method of least squares. The results of both mathematical approximations are compared with the efficiencies read by eye from graphs.
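The two ways of deriving the quadratic coefficients compared in this abstract can be sketched with a single routine: a least-squares quadratic fit reduces to exact interpolation when it is given only three points. The sketch below is illustrative and not the original programme; the quench-indicator variable, the function names, and the synthetic curve are assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_quadratic(points):
    """Least-squares coefficients (c0, c1, c2) of e = c0 + c1*q + c2*q**2."""
    # Normal equations use the power sums of q up to q**4.
    power_sums = [sum(q ** k for q, _ in points) for k in range(5)]
    normal = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    moments = [sum(e * q ** i for q, e in points) for i in range(3)]
    base = det3(normal)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of the normal matrix by the moments.
        replaced = [row[:] for row in normal]
        for r in range(3):
            replaced[r][col] = moments[r]
        coeffs.append(det3(replaced) / base)
    return tuple(coeffs)

# Synthetic quench curve (assumed): efficiency = 0.1 + 1.2*q - 0.5*q**2.
quench = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
points = [(q, 0.1 + 1.2 * q - 0.5 * q * q) for q in quench]

three_point = fit_quadratic([points[0], points[3], points[6]])  # exact interpolation
all_points = fit_quadratic(points)                              # least squares
```

On noise-free data both approaches return the same coefficients; with real, scattered measurements the three-point curve depends on which points are chosen, whereas the least-squares curve averages over all of them.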

15.
2-Aminopurine (P) is a mutagen causing A·T to G·C transitions in prokaryotic systems. To study the base-pairing schemes between P and cytosine (C) or thymine (T), two self-complementary dodecamers containing P paired with either C or T were synthesized, and their protonation equilibria were studied by acid-base titrations and melting experiments. The mismatches were incorporated into the self-complementary sequence d(CGCPCCGGXGCG), where X was C or T. Spectroscopic data obtained from molecular absorption, circular dichroism (CD), and molecular fluorescence spectroscopy were analyzed by a factor-analysis-based method, multivariate curve resolution based on the alternating least squares optimization procedure (MCR-ALS). This procedure allows determination of the number of acid-base species or conformations present in an acid-base or melting experiment and the resolution of the concentration profiles and pure spectra for each of them. Acid-base experiments have shown that at pH 7, 150 mM ionic strength, and 37 °C, both C and P are deprotonated. At pH near 4, the majority of species shows C protonated and P deprotonated. Finally, at pH values near 3, the majority of species shows both C and P protonated. These results are in agreement with NMR studies showing a wobble geometry for the P·C base pair and a Watson-Crick geometry for the P·T base pair at neutral pH. Melting experiments were carried out to confirm the proposed acid-base distribution profile. For the sequence including the P·T mismatch, only one transition was observed at neutral pH. However, for the sequence including the P·C mismatch, two transitions were detected by CD but only one by molecular absorption. This behavior agrees with that observed by other authors for oligonucleotides of similar sequence and suggests the following sequence of conformational changes during melting: duplex → hairpin → random coil.

16.
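Zhang Qianqian, Huang Qing. Mycosystema (《菌物学报》), 2018, 37(12): 1792-1801
This paper reports a revised spectrophotometric method, based on the vanillin-perchloric acid colorimetric reaction, for the quantitative determination of Ganoderma (灵芝) triterpenoids, and discusses and optimizes the application of the method. The method was used to measure several triterpenoid acids that occur at relatively high levels in Ganoderma fruiting bodies. The results show that when oleanolic acid is used as the standard for determining Ganoderma triterpenoids, the measured values are far below the true values. For the spectral analysis, the study shows that integrating the area of the absorption peak in the UV-visible scan yields a standard curve with better linearity.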

17.
Question: Can non-parametric multiplicative regression (NPMR) improve estimates of potential direct incident radiation (PDIR) and heat load based on topographic variables, as compared to least-squares multiple regression against trigonometric transforms of the predictors? Methods: We used a multiplicative kernel smoothing technique to interpolate between tabulated values of PDIR, using a locally linear model and a Gaussian kernel, with slope, aspect, and latitude as predictors. Heat load was calculated as a 45 degree rotation of the PDIR response surface. Results: This method yielded a fit to a complex response surface with R² > 0.99 and eliminated the areas of poor fit given by a previously published method based on least squares multiple regression with trigonometric functions of the predictors. Conclusions: Improved estimates of PDIR and heat load based on topographic variables can be obtained by using non-parametric multiplicative regression (NPMR). The main drawback to the method is that it requires reference to the data tables, since those data are part of the model.

18.
The association constants and binding capacities for the association of small molecules with macromolecules have been determined by tangent analysis, graphical analysis, and computer data analysis by trial and convergence of the Scatchard plot. An analytical method for calculating the binding parameters based on the Scatchard plot was derived, and the optimum values of the binding parameters were obtained by a least squares calculation based on this analytical method. The errors of the analytical method were smaller than those of the graphical method in the equilibrium system between 3H-estradiol and some uterine cytosols.
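A least-squares treatment of the Scatchard plot can be sketched for the simplest case. The abstract does not give the authors' analytical method, so the code below assumes the standard single-class Scatchard relation B/F = K(n - B), where B is bound ligand, F is free ligand, K is the association constant and n is the binding capacity; the function name and the synthetic data are likewise assumptions.

```python
def scatchard_fit(bound, free):
    """Fit B/F = K*(n - B) by ordinary least squares of B/F against B."""
    ratios = [b / f for b, f in zip(bound, free)]
    count = len(bound)
    mean_b = sum(bound) / count
    mean_r = sum(ratios) / count
    sxx = sum((b - mean_b) ** 2 for b in bound)
    sxy = sum((b - mean_b) * (r - mean_r) for b, r in zip(bound, ratios))
    slope = sxy / sxx                      # slope of the Scatchard line is -K
    intercept = mean_r - slope * mean_b    # intercept is n*K
    association_constant = -slope
    binding_capacity = intercept / association_constant
    return association_constant, binding_capacity

# Synthetic single-site binding data generated with K = 2.0 and n = 5.0 (assumed).
free = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
bound = [5.0 * 2.0 * f / (1.0 + 2.0 * f) for f in free]
k_assoc, capacity = scatchard_fit(bound, free)
```

Because B appears on both axes of the Scatchard plot, measurement errors are correlated between the two variables, so a plain regression like this one is a rough estimate rather than an optimal fit.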

19.
Important aspects of kLa measurement in agitated aerated vessels are briefly characterized from the standpoint of reliability of the measured data. It seems that most of the kLa data, based on a number of variants of the steady-state and dynamic methods in noncoalescent liquids, do not have a clear physical meaning, because they are affected by the differences between the actual driving force and the driving force assumed by the model used for its evaluation. A reliability test is given for the Na2SO3 feeding steady-state method (FSM), by comparing the results of air and pure oxygen absorption in a noncoalescent liquid (0.5 M Na2SO4 solution) with the results obtained by the independent pressure step dynamic method (RDM). The RDM is one of a few variants of the dynamic method which gives correct kLa data unaffected by nonideal mixing of the gas phase in the reactor. It was found that the FSM yields correct kLa values only when pure oxygen is used for absorption. When air is absorbed, the FSM gives kLa values in the region of kLa > 0.1 s⁻¹ substantially (by up to 55%) lower than those for pure oxygen absorption.

20.
The traditional method for estimating a linear function of fixed parameters in a mixed linear model is a two-stage procedure. In the first stage, the estimators of the variance components are calculated; in the second stage, these estimators are taken as the true values of the variance components in order to estimate the linear function of fixed parameters by the generalized least squares method. In this paper, the general mixed linear model is considered, in which the matrix related to the fixed parameters and/or the dispersion matrix of the observation vector may be deficient in rank. It is shown that the estimators of a set of functions of fixed parameters obtained in the second stage are unbiased, provided that the observation vector is symmetrically distributed about its expected value and the estimators of the variance components from the first stage are translation-invariant and are even functions of the observation vector.
