Similar literature
 Found 20 similar articles; search took 145 ms
1.
Peak detection is a key step in the analysis of SELDI-TOF-MS spectra, but the current default method has low specificity and poor peak annotation. To improve data quality, scientists still have to validate the identified peaks visually, a tedious and time-consuming process, especially for large data sets. Hence, there is a genuine need for methods that minimize manual validation. We have previously reported a multi-spectral signal detection method, called RS for 'region of significance', with improved specificity. Here we extend it to include a peak quantification algorithm based on annotated regions of significance (ARS). For each spectral region flagged as significant by RS, we first identify a dominant spectrum for determining the number of peaks and the m/z region of these peaks. From each m/z region of peaks, a peak template is extracted from all spectra via principal component analysis. Finally, with the template, we estimate the amplitude and location of the peak in each spectrum with the least-squares method and refine the estimation of the amplitude via the mixture model. We have evaluated the ARS algorithm on patient samples from a clinical study. Comparison with the standard method shows that ARS (i) inherits the superior specificity of RS, and (ii) gives more accurate peak annotations than the standard method. In conclusion, we find that ARS alleviates the main problems in the preprocessing of SELDI-TOF spectra. The R-package ProSpect that implements ARS is freely available for academic use at http://www.meb.ki.se/ yudpaw.
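The least-squares step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a known peak template and one spectrum sampled on the same grid; the function name `fit_peak` and the toy numbers are hypothetical and are not taken from the ProSpect package, which may well proceed differently.

```python
def fit_peak(spectrum, template):
    """Slide the template along the spectrum and fit its amplitude.

    At a fixed offset the least-squares amplitude is <t, y> / <t, t>.
    Placing the scaled template at that offset lowers the residual sum of
    squares of the whole spectrum by <t, y>^2 / <t, t>, so the best
    location is the offset that maximizes this quantity.
    """
    tt = sum(t * t for t in template)
    best_offset, best_amp, best_gain = 0, 0.0, -1.0
    for offset in range(len(spectrum) - len(template) + 1):
        window = spectrum[offset:offset + len(template)]
        dot = sum(t * y for t, y in zip(template, window))
        gain = dot * dot / tt          # reduction in residual sum of squares
        if gain > best_gain:
            best_offset, best_amp, best_gain = offset, dot / tt, gain
    return best_offset, best_amp


# Toy data (hypothetical): a triangular template and a spectrum that
# contains the template scaled by 3 and shifted by 2 samples.
template = [0.0, 0.5, 1.0, 0.5, 0.0]
spectrum = [0.0, 0.0, 0.0, 1.5, 3.0, 1.5, 0.0, 0.0]
offset, amplitude = fit_peak(spectrum, template)
```

On this noise-free toy input the fit recovers the shift and scale used to build the spectrum; with real spectra the amplitude would only be an estimate, which is presumably why the abstract refines it further with a mixture model.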

2.
Digestion-resistant starch (RS) has many physiologic functions. The RS content is measured by enzymatically degrading flour samples according to the method of the Association of Official Analytical Chemists. Experiments have been performed with wheat, corn, and other grains, but there are no data for cooked rice grains in the form ingested by humans. Thus, we investigated a method to measure RS that is suitable for cooked rice grains using rice cultivars that are reported to differentially increase postprandial blood glucose in humans. Using a method for cooking individual rice grains and optimized enzyme reaction conditions, we established an RS measurement method. We also found that the amylopectin crystal condition affects the RS content measured using our method.

3.
There is a great challenge in combining soil proximal spectra and remote sensing spectra to improve the accuracy of soil organic carbon (SOC) models. This is primarily because mixing of spectral data from different sources and technologies to improve soil models is still in its infancy. The first objective of this study was to integrate information of SOC derived from visible near-infrared reflectance (Vis-NIR) spectra in the laboratory with remote sensing (RS) images to improve predictions of topsoil SOC in the Skjern river catchment, Denmark. The second objective was to improve SOC prediction results by separately modeling uplands and wetlands. A total of 328 topsoil samples were collected and analyzed for SOC. Satellite Pour l’Observation de la Terre (SPOT5), Landsat Data Continuity Mission (Landsat 8) images, laboratory Vis-NIR and other ancillary environmental data including terrain parameters and soil maps were compiled to predict topsoil SOC using Cubist regression and Bayesian kriging. The results showed that the model developed from RS data, ancillary environmental data and laboratory spectral data yielded a lower root mean square error (RMSE) (2.8%) and higher R² (0.59) than the model developed from only RS data and ancillary environmental data (RMSE: 3.6%, R²: 0.46). Plant-available water (PAW) was the most important predictor for all the models because of its close relationship with soil organic matter content. Moreover, vegetation indices, such as the Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI), were very important predictors in SOC spatial models. Furthermore, the ‘upland model’ was able to more accurately predict SOC compared with the ‘upland & wetland model’. However, the separately calibrated ‘upland and wetland model’ did not improve the prediction accuracy for wetland sites, since it was not possible to adequately discriminate the vegetation in the RS summer images.
We conclude that laboratory Vis-NIR spectroscopy adds critical information that significantly improves the prediction accuracy of SOC compared to using RS data alone. We recommend the incorporation of laboratory spectra with RS data and other environmental data to improve soil spatial modeling and digital soil mapping (DSM).
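The two accuracy measures quoted in this abstract, RMSE and R², can be computed as in the sketch below. It assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not say whether the authors used this definition or a squared correlation. The function names and the toy values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values (hypothetical, not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

A lower RMSE together with a higher R² on the same validation samples is the pattern the abstract reports when laboratory spectra are added to the RS and ancillary data.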

4.
5.
BACKGROUND: Mesenchymal stem cells (MSC) are multipotent progenitors retaining the capability to undergo multilineage differentiation, mostly towards all the mesodermal cellular lineages. MSC growing under standard conditions are composed of two main subpopulations with a characteristic distribution in the morphologic flow cytometric scatter: RS (recycling stem) cells (small, agranular) and m (mature) MSC (large, moderately granular cells). METHODS: MSC obtained from BM of healthy donors and expanded in culture were characterized by evaluating both the expression of conventional markers and differentiation potential. We used CFSE, a lipophilic dye that is taken up by cell membranes, to investigate separately the proliferative activity of RS cells and mMSC subsets. RESULTS: With flow cytometric analysis, RS cells and mMSC showed nearly the same immunophenotypic pattern, even if a significantly smaller percentage of RS cells expressed some of the classic mesenchymal Ag. The RS cell fraction was confirmed to have a higher proliferative potential and such a feature was particularly evident under certain culture conditions. DISCUSSION: CFSE has been shown to be a reliable method for studying the proliferative activity of MSC subpopulations identified by flow cytometric analysis. The acquisition parameter strategy is crucial for the accuracy of the analysis.

6.
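Estimating leaf nitrogen accumulation in winter wheat by coupling ground-based and space-borne remote sensing   Total citations: 1 (self-citations: 0; citations by others: 1)
Using synchronous SPOT-5 multispectral remote sensing images, ground-based spectral data, and plant sampling data from different winter wheat ecological regions, we propose a method for extracting pure-pixel spectra based on spectral response function fitting and mixed-pixel decomposition, and compare the quantitative relationships of pure-pixel, simulated-pixel, and measured-pixel spectra with leaf nitrogen accumulation (LNA) in winter wheat. The results show that simulated-pixel spectra gave the best inversion of LNA, followed by pure-pixel spectra, with measured-pixel spectra performing worst; however, the LNA monitoring model based on simulated-pixel spectra cannot be extrapolated directly to the spatial scale. Model validation shows that the LNA monitoring model based on pure-pixel spectra had good accuracy and stability in both wheat ecological regions. The method combines the advantages of ground-based and space-borne remote sensing and can be extended to remote sensing data with other spatial and spectral resolutions, thereby providing a technical basis for remote sensing monitoring of the nitrogen status of winter wheat at the regional scale.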

7.
We address the neglected issue of ecological and evolutionary significance of root sprouting (RS) in plants. RS has been considered a sort of morphological curiosity. However, existing data of the Central European flora show that it occurs in about 10% of species. These species are therefore independent of a stem-derived bud bank in their resprouting. As sprouting from roots has been hypothesised to help plants survive disturbance, we used a large data set (2914 species with data on presence/absence of RS from Central Europe) to perform comparative analyses of its occurrence in disturbed habitats, evolution of RS in response to disturbance, and its distribution among individual plant lineages. To address these questions, we linked the data with species-level indicator values for disturbance, data on additional functional traits and phylogenetic data. We confirmed that RS ability is more frequent in plants growing in habitats subjected to disturbance, especially in annuals and clonal species. This contrasts with clonality via stem-based organs, which does not promote occurrence in disturbed habitats. Disturbance severity is the most important factor determining RS species distribution, whereas disturbance frequency plays a smaller role. RS is phylogenetically less conservative than sprouting from the stem-based belowground bud bank and thus can be easily acquired or lost in evolution, although these rates strongly differ between individual lineages. Evolution of RS seems to be driven largely by occurrence in disturbed habitats, and has appeared/disappeared independently of the presence of a stem-derived bud bank. Importantly, the data support the scenario in which colonisation of such habitats occurs prior to acquiring the RS ability, which develops only later. RS is hence a more important ecological trait than hitherto assumed. 
It constitutes an independent route of response to severe disturbance and its ecological effects and evolutionary patterns differ from stem-based clonality.

8.
MOTIVATION: Due to recent advances in mass spectrometry technology, there has been an exponential increase in the amount of data being generated in the past few years. Database searches have not been able to keep up with this data explosion. Thus, speeding up the data searches becomes increasingly important in mass-spectrometry-based applications. Traditional database search methods use one-against-all comparisons of a query spectrum against a very large number of peptides generated from in silico digestion of protein sequences in a database, to filter potential candidates from this database followed by a detailed scoring and ranking of those filtered candidates. RESULTS: In this article, we show that we can avoid the one-against-all comparisons. The basic idea is to design a set of hash functions to pre-process peptides in the database such that for each query spectrum we can use the hash functions to find only a small subset of peptide sequences that are most likely to match the spectrum. The construction of each hash function is based on a random spectrum and the hash value of a peptide is the normalized shared peak counts score (cosine) between the random spectrum and the hypothetical spectrum of the peptide. To implement this idea, we first embed each peptide into a unit vector in a high-dimensional metric space. The random spectrum is represented by a random vector, and we use random vectors to construct a set of hash functions called locality sensitive hashing (LSH) for preprocessing. We demonstrate that our mapping is accurate. We show that our method can filter out >95.65% of the spectra without missing any correct sequences, or gain 111 times speedup by filtering out 99.64% of spectra while missing at most 0.19% (2 out of 1014) of the correct sequences. In addition, we show that our method can be effectively used for other mass spectra mining applications such as finding clusters of spectra efficiently and accurately.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
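The hashing idea in this abstract can be sketched with the standard random-hyperplane (sign) variant of locality sensitive hashing for cosine similarity. This is an illustration under assumptions, not the authors' construction: the abstract describes hash values derived from the cosine score itself, whereas the sketch keeps only the sign of each random projection. The peptide names, vectors, and function names are all hypothetical.

```python
import random


def make_hash(dim, n_bits, seed=0):
    """Build a hash function from n_bits random Gaussian vectors.

    Each bit records on which side of a random hyperplane the input lies,
    so vectors separated by a small angle tend to share a hash key.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_vector(vec):
        return tuple(
            1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
            for plane in planes
        )

    return hash_vector


def build_index(peptide_vectors, hash_vector):
    """Pre-process the database: group peptide names by hash key."""
    index = {}
    for name, vec in peptide_vectors.items():
        index.setdefault(hash_vector(vec), []).append(name)
    return index


# Hypothetical binned peak vectors for two peptides.
peptides = {
    "PEPTIDE_A": [1, 0, 1, 1, 0, 0],
    "PEPTIDE_B": [0, 1, 0, 0, 1, 1],
}
hash_vector = make_hash(dim=6, n_bits=4)
index = build_index(peptides, hash_vector)

# A query with the same peaks as PEPTIDE_A at twice the intensity has
# cosine similarity 1 with it and therefore lands in the same bucket.
query = [2, 0, 2, 2, 0, 0]
candidates = index.get(hash_vector(query), [])
```

Only the peptides in the query's bucket go on to detailed scoring, which is how the one-against-all comparison is avoided. In practice several independent hash tables are usually combined so that a near match missed by one table is caught by another.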

9.
The ultimate goal of the Recommender System (RS) is to offer a proposal that is very close to the user's real opinion. Data clustering can be effective in increasing the accuracy of production proposals by the RS. In this paper, single-objective hybrid evolutionary approach is proposed for clustering items in the offline collaborative filtering RS. This method, after generating a population of randomized solutions, at each iteration, improves the population of solutions first by Genetic Algorithm (GA) and then by using the Gravitational Emulation Local Search (GELS) algorithm. Simulation results on standard datasets indicate that although the proposed hybrid meta-heuristic algorithm requires a relatively high run time, it can lead to more appropriate clustering of existing data and thus improvement of the Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and Coverage criteria.  相似文献   

10.
Since consumers are showing increased interest in the origin and method of production of their food, it is important to be able to authenticate dietary history of animals by rapid and robust methods used in the ruminant products. Promising breakthroughs have been made in the use of spectroscopic methods on fat to discriminate pasture-fed and concentrate-fed lambs. However, questions remained on their discriminatory ability in more complex feeding conditions, such as concentrate-finishing after pasture-feeding. We compared the ability of visible reflectance spectroscopy (Vis RS, wavelength range: 400 to 700 nm) with that of visible-near-infrared reflectance spectroscopy (Vis-NIR RS, wavelength range: 400 to 2500 nm) to differentiate between carcasses of lambs reared with three feeding regimes, using partial least square discriminant analysis (PLS-DA) as a classification method. The sample set comprised perirenal fat of Romane male lambs fattened at pasture (P, n=69), stall-fattened indoors on commercial concentrate and straw (S, n=55) and finished indoors with concentrate and straw for 28 days after pasture-feeding (PS, n=65). The overall correct classification rate was better for Vis-NIR RS than for Vis RS (99.0% v. 95.1%, P<0.05). Vis-NIR RS allowed a correct classification rate of 98.6%, 100.0% and 98.5% for P, S and PS lambs, respectively, whereas Vis RS allowed a correct classification rate of 98.6%, 94.5% and 92.3% for P, S and PS lambs, respectively. This study suggests the likely implication of molecules absorbing light in the non-visible part of the Vis-NIR spectra (possibly fatty acids), together with carotenoid and haem pigments, in the discrimination of the three feeding regimes.

11.
Replication stress (RS) is a source of DNA damage that has been linked to cancer and aging, which is suppressed by the ATR kinase. In mice, reduced ATR levels in a model of the ATR-Seckel syndrome lead to RS and accelerated aging. Similarly, ATR-Seckel embryonic fibroblasts (MEF) accumulate RS and undergo cellular senescence. We previously showed that senescence of ATR-Seckel MEF cannot be rescued by p53-deletion. Here, we show that the genetic ablation of the INK4a/Arf locus fully rescues senescence in ATR mutant MEF, as well as senescence induced by other conditions that generate RS, such as low doses of hydroxyurea or ATR inhibitors. In addition, we show that a persistent exposure to RS leads to increased levels of INK4a/Arf products, revealing that INK4a/ARF behaves as a bona fide RS checkpoint. Our data reveal an unknown role for INK4a/ARF in limiting the expansion of cells suffering from persistent replication stress, linking this well-known tumor suppressor to the maintenance of genomic integrity.

12.
Kwon D, Vannucci M, Song JJ, Jeong J, Pfeiffer RM. Proteomics, 2008, 8(15): 3019-3029
In recent years there has been an increased interest in using protein mass spectroscopy to discriminate diseased from healthy individuals with the aim of discovering molecular markers for disease. A crucial step before any statistical analysis is the pre-processing of the mass spectrometry data. Statistical results are typically strongly affected by the specific pre-processing techniques used. One important pre-processing step is the removal of chemical and instrumental noise from the mass spectra. Wavelet denoising techniques are a standard method for denoising. Existing techniques, however, do not accommodate errors that vary across the mass spectrum, but instead assume a homogeneous error structure. In this paper we propose a novel wavelet denoising approach that deals with heterogeneous errors by incorporating a variance change point detection method in the thresholding procedure. We study our method on real and simulated mass spectrometry data and show that it improves the performance of peak detection methods.
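For orientation, the sketch below shows the conventional baseline that this abstract argues against: a one-level Haar wavelet transform with a single soft threshold applied to every detail coefficient, which amounts to assuming a homogeneous error structure. The variance change point detection proposed by the authors is not implemented here; under that approach the threshold would presumably differ between regions of the spectrum. All function names are hypothetical, and the input length is assumed to be even.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level orthonormal Haar transform (even-length input assumed)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Global-threshold denoising: one threshold for the whole spectrum."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# With a very large threshold every detail coefficient is removed, so each
# pair of neighbouring samples is replaced by its mean.
smoothed = denoise([1.0, 3.0, 5.0, 5.0], threshold=10.0)
```

A heterogeneous-error variant could call `soft_threshold` separately on each segment of the detail coefficients with a segment-specific threshold; how the segments and thresholds are chosen is the contribution of the paper and is not reproduced here.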

13.
MOTIVATION: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data. RESULTS: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from a limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification and we outline appropriate data transformations. Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values using the EM algorithm, however, had a notable effect on the result; for our data, the pooled height of peaks which were frequently truncated increased by up to 30%.

14.
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful and widely applied method for the study of biological systems, biomarker discovery and pharmacological interventions. LC-MS measurements are, however, significantly complicated by several technical challenges, including: (1) ionisation suppression/enhancement, disturbing the correct quantification of analytes, and (2) the detection of large amounts of separate derivative ions, increasing the complexity of the spectra, but not their information content. Here we introduce an experimental and analytical strategy that leads to robust metabolome profiles in the face of these challenges. Our method is based on rigorous filtering of the measured signals based on a series of sample dilutions. Such data sets have the additional characteristic that they allow a more robust assessment of detection signal quality for each metabolite. Using our method, almost 80% of the recorded signals can be discarded as uninformative, while important information is retained. As a consequence, we obtain a broader understanding of the information content of our analyses and a better assessment of the metabolites detected in the analyzed data sets. We illustrate the applicability of this method using standard mixtures, as well as cell extracts from bacterial samples. It is evident that this method can be applied in many types of LC-MS analyses and more specifically in untargeted metabolomics.

15.
Protein identification by MS/MS is an important technique in proteome studies. The Open Mass Spectrometry Search Algorithm (OMSSA) is an open‐source search engine that can be used to identify MS/MS spectra acquired in these experiments. Here, we present a software tool, termed OMSSAPercolator, which interfaces OMSSA with Percolator, a post‐search machine learning method for rescoring database search results. We demonstrate that it outperforms the standard OMSSA scoring scheme, and provides reliable significant measurements. OMSSAPercolator is programmed using JAVA and can be readily used as a standalone tool or integrated into existing data analysis pipelines. OMSSAPercolator is freely available and can be downloaded at http://sourceforge.net/projects/omssapercolator/ .

16.
We developed 74 microsatellite marker primer pairs yielding 76 polymorphic loci, specific for the short arm of rye chromosome 1R (1RS) in wheat background. Four libraries enriched for microsatellite motifs AG, AAG, AC and AAC were constructed from DNA of flow-sorted 1RS chromosomes and 1,290 clones were sequenced. Additionally, 2,778 BAC-end-sequences from a 1RS specific BAC library were used for microsatellite screening and marker development. From 724 designed primer pairs, 119 produced 1RS specific bands and 74 of them showed polymorphism in a set of ten rye genotypes. We show that this high attrition rate was due to the highly repetitive nature of the rye genome consisting of a large number of transposable elements. We mapped the 76 polymorphic loci physically into three regions (bins) on 1RS; 29, 30 and 17 loci were assigned to the distal, intercalary and proximal regions of the 1RS arm, respectively. The average polymorphism information content increases with distance from the centromere, which could be due to an increased recombination rate along the chromosome arm toward the telomere. Additionally, we demonstrate, using the data of the whole rice genome, that the intra-genomic length variation of microsatellites correlates (r = 0.87) with microsatellite polymorphism. Based on these results we suggest that an analysis of the microsatellite length variation is conducted for each species prior to microsatellite development, provided that sufficient sequence information is available. This will allow the selective design of microsatellite markers for motifs likely to yield a high level of polymorphism. Electronic supplementary material  The online version of this article (doi:) contains supplementary material, which is available to authorized users.

17.
Monte Carlo (MC) modeling is a valuable tool to gain fundamental understanding of light-tissue interactions, provide guidance and assessment to optical instrument designs, and help analyze experimental data. It has been a major challenge to efficiently extend MC towards modeling of bulk-tissue Raman spectroscopy (RS) due to the wide spectral range, relatively sharp spectral features, and presence of background autofluorescence. Here, we report a computationally efficient MC approach for RS by adapting the massively-parallel Monte Carlo eXtreme (MCX) simulator. Simulation efficiency is achieved through “isoweight,” a novel approach that combines the statistical generation of Raman scattering and fluorescence emission with a lookup-table-based technique well-suited for parallelization. The MC model uses a graphics processor to produce dense Raman and fluorescence spectra over a range of 800–2000 cm⁻¹ with an approximately 100× increase in speed over prior RS Monte Carlo methods. The simulated RS signals are compared against experimentally collected spectra from gelatin phantoms, showing a strong correlation.

18.
MOTIVATION: Tandem mass spectrometry allows for high-throughput identification of complex protein samples. Searching tandem mass spectra against sequence databases is the main analysis method nowadays. Since many peptide variations are possible, including them in the search space seems only logical. However, the search space usually grows exponentially with the number of independent variations and may therefore overwhelm computational resources. RESULTS: We provide fast, cache-efficient search algorithms to screen large peptide search spaces including non-tryptic peptides, whole genomes, dozens of posttranslational modifications, unannotated point mutations and even unannotated splice sites. All these search spaces can be screened simultaneously. By optimizing the cache usage, we achieve a calculation speed that closely approaches the limits of the hardware. At the same time, we control the size of the overall search space by limiting the combinations of variations that can co-occur on the same peptide. Using a hypergeometric scoring scheme, we applied these algorithms to a dataset of 1 420 632 spectra. We were able to identify a considerable number of peptide variations within a modest amount of computing time on standard desktop computers.
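The effect of limiting co-occurring variations, as described in this abstract, can be made concrete with a small counting sketch. It assumes a simplified model in which a peptide has `n_sites` independent sites and each site is either varied or not; the function name and the numbers are hypothetical and do not come from the paper.

```python
from math import comb


def search_space_size(n_sites, max_variations=None):
    """Count the variants of one peptide under a simplified binary model.

    Without a limit every subset of sites may be varied, giving 2**n_sites
    variants. With a limit k, only subsets of at most k sites are allowed,
    giving the sum of C(n_sites, i) for i = 0..k.
    """
    if max_variations is None:
        max_variations = n_sites
    limit = min(max_variations, n_sites)
    return sum(comb(n_sites, i) for i in range(limit + 1))


unrestricted = search_space_size(20)                  # 2**20 variants
restricted = search_space_size(20, max_variations=2)  # 1 + 20 + 190 variants
```

In this toy model the unrestricted count grows exponentially in the number of sites, while the restricted count grows only polynomially, which is the kind of control over search-space size the abstract refers to.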

19.
Radial spokes (RSs) are ubiquitous components in the 9 + 2 axoneme thought to be mechanochemical transducers involved in local control of dynein-driven microtubule sliding. They are composed of >23 polypeptides, whose interactions and placement must be deciphered to understand RS function. In this paper, we show the detailed three-dimensional (3D) structure of RS in situ in Chlamydomonas reinhardtii flagella and Tetrahymena thermophila cilia that we obtained using cryoelectron tomography (cryo-ET). We clarify similarities and differences between the three spoke species, RS1, RS2, and RS3, in T. thermophila and in C. reinhardtii and show that part of RS3 is conserved in C. reinhardtii, which only has two species of complete RSs. By analyzing C. reinhardtii mutants, we identified the specific location of subsets of RS proteins (RSPs). Our 3D reconstructions show a twofold symmetry, suggesting that fully assembled RSs are produced by dimerization. Based on our cryo-ET data, we propose models of subdomain organization within the RS as well as interactions between RSPs and with other axonemal components.

20.
A new method for analyzing three-state protein unfolding equilibria is described that overcomes the difficulties created by direct effects of denaturants on circular dichroism (CD) and fluorescence spectra of the intermediate state. The procedure begins with a singular value analysis of the data matrix to determine the number of contributing species and perturbations. This result is used to choose a fitting model and remove all spectra from the fitting equation. Because the fitting model is a product of a matrix function which is nonlinear in the thermodynamic parameters and a matrix that is linear in the parameters that specify component spectra, the problem is solved with a variable projection algorithm. The advantages of this procedure are that perturbation spectra do not have to be estimated before fitting, arbitrary assumptions about magnitudes of parameters that describe the intermediate state are not required, and multiple experiments involving different spectroscopic techniques can be simultaneously analyzed. Two tests of this method were performed: First, simulated three-state data were analyzed, and the original and recovered thermodynamic parameters agreed within one standard error, whereas recovered and original component spectra agreed within 0.5%. Second, guanidine-induced unfolding titrations of the human retinoid-X-receptor ligand-binding domain were analyzed according to a three-state model. The standard unfolding free energy changes in the absence of guanidine and the guanidine concentrations at zero free-energy change for both transitions were determined from a joint analysis of fluorescence and CD spectra. Realistic spectra of the three protein states were also obtained.
