Introduction: Despite the unquestionable advantages of Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging in visualizing the spatial distribution and the relative abundance of biomolecules directly on-tissue, the yielded data is complex and high dimensional. Therefore, analysis and interpretation of this huge amount of information is mathematically, statistically and computationally challenging.
Areas covered: This article reviews some of the challenges in data elaboration, with particular emphasis on machine learning techniques employed in clinical applications, and can serve as a general entry point for those who want to study the computational aspects. Several characteristics of data processing are described, highlighting their advantages and disadvantages. Different approaches to data elaboration focused on clinical applications are also provided. A practical tutorial based on the Orange Canvas and Weka software is included to help readers become familiar with the data processing.
Expert commentary: Recently, MALDI-MSI has gained considerable attention and has been employed for research and diagnostic purposes, with successful results. Data dimensionality constitutes an important issue and statistical methods for information-preserving data reduction represent one of the most challenging aspects. The most common data reduction methods are characterized by collecting independent observations into a single table. However, the incorporation of relational information can improve the discriminatory capability of the data.
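As a rough illustration of what "collecting independent observations into a single table" can mean for imaging data, the sketch below bins per-pixel peak lists into a pixel-by-m/z intensity table. It is not taken from the article: the function name `build_intensity_table`, the input layout (a dict from pixel coordinates to peak lists) and the default bin width are all assumptions made for the example. Note that the resulting rows carry no information about which pixels are neighbours, which is the relational information the commentary refers to.

```python
from collections import defaultdict


def build_intensity_table(pixels, bin_width=1.0):
    """Bin per-pixel peak lists into one table (hypothetical helper).

    pixels: dict mapping (x, y) -> list of (mz, intensity) tuples.
    Returns (bin_starts, table), where table maps (x, y) -> list of
    summed intensities, one entry per m/z bin in bin_starts.
    """
    binned = {}
    all_bins = set()
    for coord, peaks in pixels.items():
        row = defaultdict(float)
        for mz, intensity in peaks:
            # Assign each peak to a fixed-width m/z bin and sum intensities.
            row[int(mz // bin_width)] += intensity
        binned[coord] = row
        all_bins.update(row)

    columns = sorted(all_bins)
    # Every pixel becomes an independent row; spatial adjacency is dropped.
    table = {
        coord: [row.get(b, 0.0) for b in columns]
        for coord, row in binned.items()
    }
    return [b * bin_width for b in columns], table


# Toy input with made-up values, for illustration only.
toy_pixels = {
    (0, 0): [(500.2, 10.0), (500.7, 5.0), (760.5, 3.0)],
    (0, 1): [(760.1, 4.0)],
}
bin_starts, table = build_intensity_table(toy_pixels)
```

A table of this shape is the kind of input that tools such as Orange Canvas or Weka expect, with one row per observation and one column per feature.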
SUMMARY: GeneSyn is a software tool that allows automatic detection of conserved gene order from annotated genomes. AVAILABILITY: Available free of charge for Unix/Linux/Cygwin platforms at ftp://159.149.110.11/pub/GeneSyn_1.0/ SUPPLEMENTARY INFORMATION: ftp://159.149.110.11/pub/GeneSyn_1.0/
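The abstract does not describe how GeneSyn detects conserved gene order, so the following is only a minimal sketch of the general idea, not GeneSyn's algorithm. It assumes each annotated genome has been reduced to an ordered list of gene names and reports maximal runs of genes that occur contiguously and in the same order in both lists. The function name `conserved_blocks`, the `min_len` parameter and the toy gene orders are hypothetical, and strand orientation, gaps and rearrangements are ignored.

```python
def conserved_blocks(genome_a, genome_b, min_len=2):
    """Return maximal runs of genes shared, in order, by two gene lists."""
    blocks = set()
    # prev[j] = length of the common run ending at genome_a[i-2], genome_b[j-1]
    prev = [0] * (len(genome_b) + 1)
    for i in range(1, len(genome_a) + 1):
        curr = [0] * (len(genome_b) + 1)
        for j in range(1, len(genome_b) + 1):
            if genome_a[i - 1] == genome_b[j - 1]:
                curr[j] = prev[j - 1] + 1
                # A run is maximal when the next genes in both lists differ
                # or when either list ends here.
                can_extend = (
                    i < len(genome_a)
                    and j < len(genome_b)
                    and genome_a[i] == genome_b[j]
                )
                if not can_extend and curr[j] >= min_len:
                    blocks.add(tuple(genome_a[i - curr[j]:i]))
        prev = curr
    return sorted(blocks)


# Toy gene orders, invented for illustration only.
order_a = ["dnaA", "dnaN", "recF", "gyrB", "rpoB"]
order_b = ["rpoB", "dnaA", "dnaN", "recF", "trpA", "gyrB"]
shared = conserved_blocks(order_a, order_b)
```

With these toy inputs the only shared run of at least two genes is `dnaA, dnaN, recF`. Genes that match in isolation, such as `gyrB`, fall below `min_len` and are not reported.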
Phospholipid hydroperoxide glutathione peroxidase (PHGPx; EC 1.11.1.12), a broad-spectrum thiol-dependent peroxidase, deserves renewed interest as a regulatory factor in various signaling cascades and as a structural protein in sperm cells. We present a first attempt to identify catalytic intermediates and derivatives of the selenoprotein by liquid chromatography coupled to electrospray tandem mass spectrometry (LC/ESI-MS/MS) and to explain observed specificities by molecular modeling. The ground state enzyme E proved to correspond to position 3-170 of the deduced porcine sequence with selenium being present as selenocysteine at position 46. The selenenic acid form, which is considered to be the first catalytic intermediate F formed by reaction with hydroperoxide, could not be identified. The second catalytic intermediate G was detected as Se-glutathionylated enzyme. This intermediate is generated in the reverse reaction where the active site selenol interacts with glutathione disulfide (GSSG). According to molecular models, specific binding of reduced glutathione (GSH) and of GSSG is inter alia facilitated by electrostatic attraction of Lys-48 and Lys-125. Polymerization of PHGPx is obtained under oxidizing conditions in the absence of low molecular weight thiols. Analysis of MS spectra revealed that the process is due to a selective reaction of Sec-46 with Cys-148' resulting in linear polymers representing dead-end intermediates (G'). FT Docking of PHGPx molecules allowed reactions of Sec-46 with either Cys-66', Cys-107', Cys-168' or Cys-148', the latter option being most likely as judged by the number of proposed intermediates with reasonable hydrogen bonds, interaction energies and interface areas. We conclude that the same catalytic principles, depending on the conditions, can drive the diverse actions of PHGPx, i.e. hydroperoxide reduction, GSSG reduction, S-derivatization and self-incorporation into biological structures.
The present study aimed to evaluate the efficacy of the hyaluronic acid (HA) binding assay in the selection of motile spermatozoa with normal morphology at high magnification (8400x).
Carbohydrate microarrays have emerged as powerful tools in analyses of microbe-host interactions. Using a microarray with 190 sequence-defined oligosaccharides in the form of natural glycolipids and neoglycolipids representative of diverse mammalian glycans, we examined interactions of simian virus 40 (SV40) with potential carbohydrate receptors. While the results confirmed the high specificity of SV40 for the ganglioside GM1, they also revealed that N-glycolyl GM1 ganglioside [GM1(Gc)], which is characteristic of simian species and many other nonhuman mammals, is a better ligand than the N-acetyl analog [GM1(Ac)] found in mammals, including humans. After supplementing glycolipid-deficient GM95 cells with GM1(Ac) and GM1(Gc) gangliosides and the corresponding neoglycolipids with phosphatidylethanolamine lipid groups, it was found that GM1(Gc) analogs conferred better virus binding and infectivity. Moreover, we visualized the interaction of NeuGc with VP1 protein of SV40 by molecular modeling and identified a conformation for GM1(Gc) ganglioside in complex with the virus VP1 pentamer that is compatible with its presentation as a membrane receptor. Our results open the way not only to detailed studies of SV40 infection in relation to receptor expression in host cells but also to the monitoring of changes that may occur with time in receptor usage by the virus.
Although simulation studies show that combining multiple breeds in one reference population increases accuracy of genomic prediction, this is not always confirmed in empirical studies. This discrepancy might be due to the assumptions on quantitative trait loci (QTL) properties applied in simulation studies, including number of QTL, spectrum of QTL allele frequencies across breeds, and distribution of allele substitution effects. We investigated the effects of QTL properties and of including a random across- and within-breed animal effect in a genomic best linear unbiased prediction (GBLUP) model on accuracy of multi-breed genomic prediction using genotypes of Holstein-Friesian and Jersey cows.
Methods
Genotypes of three classes of variants obtained from whole-genome sequence data, with moderately low, very low or extremely low average minor allele frequencies (MAF), were imputed in 3000 Holstein-Friesian and 3000 Jersey cows that had real high-density genotypes. Phenotypes of traits controlled by QTL with different properties were simulated by sampling 100 or 1000 QTL from one class of variants and their allele substitution effects either randomly from a gamma distribution, or computed such that each QTL explained the same variance, i.e. rare alleles had a large effect. Genomic breeding values for 1000 selection candidates per breed were estimated using GBLUP models including a random across- and a within-breed animal effect.
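The two ways of assigning allele substitution effects can be sketched as follows. This is a minimal illustration rather than the authors' simulation code. It assumes the standard additive expression for the variance explained by a biallelic QTL under Hardy-Weinberg equilibrium, 2p(1-p)a^2, where p is the allele frequency and a the allele substitution effect. The gamma shape and scale values, the random sign and all function names are hypothetical, since the abstract does not give them.

```python
import math
import random


def qtl_variance(maf, effect):
    """Additive variance explained by one QTL: 2 * p * (1 - p) * a^2."""
    return 2.0 * maf * (1.0 - maf) * effect ** 2


def equal_variance_effect(maf, var_per_qtl):
    """Effect size such that the QTL explains exactly var_per_qtl."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("maf must be in (0, 0.5]")
    return math.sqrt(var_per_qtl / (2.0 * maf * (1.0 - maf)))


def gamma_effects(n_qtl, shape=0.4, scale=1.0, seed=0):
    """Effects drawn from a gamma distribution, independent of MAF.

    shape, scale and the random sign are assumptions for this sketch.
    """
    rng = random.Random(seed)
    return [
        rng.gammavariate(shape, scale) * rng.choice((-1, 1))
        for _ in range(n_qtl)
    ]


# Under the equal-variance scenario a rare allele needs a larger effect.
rare = equal_variance_effect(0.01, var_per_qtl=0.001)
common = equal_variance_effect(0.40, var_per_qtl=0.001)
```

Because the effect scales with 1/sqrt(2p(1-p)), the rare allele in this example needs an effect roughly five times larger than the common one to explain the same variance, which is the sense in which "rare alleles had a large effect".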
Results
For all three classes of QTL allele frequency spectra, accuracies of genomic prediction were not affected by the addition of 2000 individuals of the other breed to a reference population of the same breed as the selection candidates. Accuracies of both single- and multi-breed genomic prediction decreased as MAF of QTL decreased, especially when rare alleles had a large effect. Accuracies of genomic prediction were similar for the models with and without a random within-breed animal effect, probably because of insufficient power to separate across- and within-breed animal effects.
Conclusions
Accuracy of both single- and multi-breed genomic prediction depends on the properties of the QTL that underlie the trait. As QTL MAF decreased, accuracy decreased, especially when rare alleles had a large effect. This demonstrates that QTL properties are key parameters that determine the accuracy of genomic prediction.
Electronic supplementary material
The online version of this article (doi:10.1186/s12711-015-0124-6) contains supplementary material, which is available to authorized users.