Similar articles
 Found 20 similar articles (search time: 828 ms)
1.
One goal of single-cell RNA sequencing (scRNA-seq) is to expose possible heterogeneity within cell populations due to meaningful, biological variation. Examining cell-to-cell heterogeneity, and further, identifying subpopulations of cells based on scRNA-seq data has been of common interest in life science research. A key component of successfully identifying cell subpopulations (or clustering cells) is the (dis)similarity measure used to group the cells. In this paper, we introduce a novel measure, named SIDEseq, to assess cell-to-cell similarity using scRNA-seq data. SIDEseq first identifies a list of putative differentially expressed (DE) genes for each pair of cells. SIDEseq then integrates the information from all the DE gene lists (corresponding to all pairs of cells) to build a similarity measure between two cells. SIDEseq can be used in any clustering algorithm that requires a (dis)similarity matrix. This new measure incorporates information from all cells when evaluating the similarity between any two cells, a characteristic not commonly found in existing (dis)similarity measures. This property is advantageous for two reasons: (a) borrowing information from cells of different subpopulations allows for the investigation of pairwise cell relationships from a global perspective and (b) information from other cells of the same subpopulation could help to ensure a robust relationship assessment. We applied SIDEseq to a newly generated human ovarian cancer scRNA-seq dataset, a public human embryo scRNA-seq dataset, and several simulated datasets. The clustering results suggest that the SIDEseq measure is capable of uncovering important relationships between cells, and outperforms or at least does as well as several popular (dis)similarity measures when used on these datasets.

2.
3.
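The antibody content against the capsular polysaccharide of Haemophilus influenzae type b was measured in seven sera using blocking solutions whose main component was either bovine serum albumin or human serum albumin; the two data sets did not differ significantly (P>0.05). The same seven sera were then measured with HRP-labeled goat anti-human IgG and with HRP-labeled mouse anti-human IgG, and the two sets of results again did not differ significantly (P>0.05). In addition, changing the serum reaction temperature and time from 37°C for 1.5 h to 4°C for 16 h (overnight) produced no significant difference (P>0.05). The method was thereby optimized.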

4.
Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which makes the experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein–protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.

5.
Group B Streptococcus (GBS) is classified into nine serotypes that vary in capsular polysaccharide (CPS) architecture but share in common the presence of a terminal sialic acid (Sia) residue. The position and linkage of GBS Sia closely resemble those of cell surface glycans found abundantly on human cells. CD33-related Siglecs (CD33rSiglecs) are a family of Sia-binding lectins expressed on host leukocytes that engage host Sia-capped glycans and send signals that dampen inflammatory gene activation. We hypothesized that GBS evolved to display CPS Sia as a form of molecular mimicry limiting the activation of an effective innate immune response. In this study, we applied a panel of immunologic and cell-based assays to demonstrate that GBS of several serotypes interacts in a Sia- and serotype-specific manner with certain human CD33rSiglecs, including hSiglec-9 and hSiglec-5 expressed on neutrophils and monocytes. Modification of GBS CPS Sia by O-acetylation has recently been recognized, and we further show that the degree of O-acetylation can markedly affect the interaction between GBS and hSiglec-5, -7, and -9. Thus, production of Sia-capped bacterial polysaccharide capsules that mimic human cell surface glycans in order to engage CD33rSiglecs may be an example of a previously unrecognized bacterial mechanism of leukocyte manipulation.

6.
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the alpha-carbon backbone structures of the proteins in order to find three-dimensional (3D) rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. This representation encodes the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of alpha-carbons. We then define a measure of the similarity of two protein structures based on the root mean squared (RMS) distance between corresponding orientation vectors of the two proteins. Our measure has several advantages over measures that are commonly used for comparing protein shapes, such as the minimum RMS distance between the 3D positions of corresponding atoms in two proteins. A key advantage is that this new measure behaves well for identifying common substructures, in contrast with position-based measures where the nonmatching portions of the structure dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with methods based on distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
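The representation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `unit_vectors` and `orientation_rms` are hypothetical, the residue correspondence is assumed to be one-to-one and already known, and no search over rigid motions (rotations) is performed.

```python
import math


def unit_vectors(ca_coords):
    """Return the unit vector between each adjacent pair of alpha-carbon positions."""
    vectors = []
    for (x1, y1, z1), (x2, y2, z2) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            raise ValueError("adjacent alpha-carbons coincide")
        vectors.append((dx / norm, dy / norm, dz / norm))
    return vectors


def orientation_rms(vecs_a, vecs_b):
    """RMS distance between corresponding orientation vectors.

    Assumes both lists are the same length, already paired index by index,
    and expressed in the same frame (no rotation is fitted here).
    """
    if len(vecs_a) != len(vecs_b) or not vecs_a:
        raise ValueError("need two non-empty vector lists of equal length")
    squared = sum(
        (a - b) ** 2 for u, v in zip(vecs_a, vecs_b) for a, b in zip(u, v)
    )
    return math.sqrt(squared / len(vecs_a))
```

Because each vector is a difference of positions, translating a chain leaves the result unchanged; only the relative orientation of the two chains matters, which is why a rotation search would still be needed in practice.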

7.
Following sequence alignment, clustering algorithms are among the most utilized techniques in gene expression data analysis. Clustering gene expression patterns allows researchers to determine which gene expression patterns are alike and most likely to participate in the same biological process being investigated. Gene expression data also allow the clustering of whole samples of data, which makes it possible to find which samples are similar and, consequently, which sampled biological conditions are alike. Here, a novel similarity measure calculation and the resulting rank-based clustering algorithm are presented. The clustering was applied to 418 gene expression samples from 13 data series spanning three model organisms: Homo sapiens, Mus musculus, and Arabidopsis thaliana. The initial results are striking: more than 91% of the samples were clustered as expected. The MESs (most expressed sequences) approach outperformed some of the most used clustering algorithms applied to this kind of data, such as hierarchical clustering and K-means. The clustering performance suggests that the new similarity measure is an alternative to the traditional correlation/distance measures typically used in clustering algorithms.

8.
Distance-based clustering of CGH data   Total citations: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: We consider the problem of clustering a population of Comparative Genomic Hybridization (CGH) data samples. The goal is to develop a systematic way of placing patients with similar CGH imbalance profiles into the same cluster. Our expectation is that patients with the same cancer types will generally belong to the same cluster as their underlying CGH profiles will be similar. RESULTS: We focus on distance-based clustering strategies. We do this in two steps. (1) Distances of all pairs of CGH samples are computed. (2) CGH samples are clustered based on this distance. We develop three pairwise distance/similarity measures, namely raw, cosine and sim. Raw measure disregards correlation between contiguous genomic intervals. It compares the aberrations in each genomic interval separately. The remaining measures assume that consecutive genomic intervals may be correlated. Cosine maps pairs of CGH samples into vectors in a high-dimensional space and measures the angle between them. Sim measures the number of independent common aberrations. We test our distance/similarity measures on three well known clustering algorithms, bottom-up, top-down and k-means with and without centroid shrinking. Our results show that sim consistently performs better than the remaining measures. This indicates that the correlation of neighboring genomic intervals should be considered in the structural analysis of CGH datasets. The combination of sim with top-down clustering emerged as the best approach. AVAILABILITY: All software developed in this article and all the datasets are available from the authors upon request. CONTACT: juliu@cise.ufl.edu.
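As a rough illustration of the angle-based idea behind the cosine measure, the sketch below computes a plain cosine similarity between two CGH profiles. The encoding (+1 for gain, -1 for loss, 0 for no aberration per genomic interval) and the function name `cosine_similarity` are assumptions made for this example; the abstract does not specify how the authors map samples to vectors.

```python
import math


def cosine_similarity(profile_a, profile_b):
    """Cosine of the angle between two equal-length aberration profiles.

    Each profile is assumed to hold one value per genomic interval:
    +1 (gain), -1 (loss) or 0 (no aberration). This encoding is hypothetical.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomic intervals")
    dot = sum(a * b for a, b in zip(profile_a, profile_b))
    norm_a = math.sqrt(sum(a * a for a in profile_a))
    norm_b = math.sqrt(sum(b * b for b in profile_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a profile with no aberrations has no direction to compare
    return dot / (norm_a * norm_b)
```

A value near 1 means the two samples share the same pattern of gains and losses, a value near 0 means their aberrations fall in different intervals, and a negative value means gains in one sample coincide with losses in the other.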

9.
The fundamentals of growth-linked biodegradation occurring at low substrate concentrations are poorly understood. Substrate utilization kinetics and microbial growth yields are two critically important process parameters that can be influenced by low substrate concentrations. Standard biodegradation tests aimed at measuring these parameters generally ignore the ubiquitous occurrence of assimilable organic carbon (AOC) in experimental systems which can be present at concentrations exceeding the concentration of the target substrate. The occurrence of AOC effectively makes biodegradation assays conducted at low substrate concentrations mixed-substrate assays, which can have profound effects on observed substrate utilization kinetics and microbial growth yields. In this work, we introduce a novel methodology for investigating biodegradation at low concentrations by restricting AOC in our experiments. We modified an existing method designed to measure trace concentrations of AOC in water samples and applied it to systems in which pure bacterial strains were growing on pesticide substrates between 0.01 and 50 mg liter−1. We simultaneously measured substrate concentrations by means of high-performance liquid chromatography with UV detection (HPLC-UV) or mass spectrometry (MS) and cell densities by means of flow cytometry. Our data demonstrate that substrate utilization kinetic parameters estimated from high-concentration experiments can be used to predict substrate utilization at low concentrations under AOC-restricted conditions. Further, restricting AOC in our experiments enabled accurate and direct measurement of microbial growth yields at environmentally relevant concentrations for the first time. These are critical measurements for evaluating the degradation potential of natural or engineered remediation systems. Our work provides novel insights into the kinetics of biodegradation processes and growth yields at low substrate concentrations.

10.
A large number of assays are available to monitor viability in mammalian cell cultures with most defining loss of viability as a loss of plasma membrane integrity, a characteristic of necrotic cell death. However, the majority of cultured cells die by apoptosis and early apoptotic cells, although non-viable, maintain an intact plasma membrane and are thus ignored. Here we measure the viability of cultures of a number of common mammalian cell lines by assays that measure membrane integrity (a measure of necrotic cell death) and assays that measure apoptotic cells, and show that discrepancies in the measurement of culture viability have a significant impact on the calculation of cell culture parameters and lead to skewed experimental data.

11.
MOTIVATION: We consider the problem of clustering a population of Comparative Genomic Hybridization (CGH) data samples using similarity based clustering methods. A key requirement for clustering is to avoid using the noisy aberrations in the CGH samples. RESULTS: We develop a dynamic programming algorithm to identify a small set of important genomic intervals called markers. The advantage of using these markers is that the potentially noisy genomic intervals are excluded during the clustering process. We also develop two clustering strategies using these markers. The first one, prototype-based approach, maximizes the support for the markers. The second one, similarity-based approach, develops a new similarity measure called RSim and refines clusters with the aim of maximizing the RSim measure between the samples in the same cluster. Our results demonstrate that the markers we found represent the aberration patterns of cancer types well and they improve the quality of clustering significantly. AVAILABILITY: All software developed in this paper and all the datasets used are available from the authors upon request.

12.
Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like Gene Ontology Annotation (GOA) that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure for semantic similarity independent of external databases of functional-annotation observations.

13.
Cytotoxicity assays are essential tests in studies on the safety and biocompatibility of various substances and on the efficiency of anticancer drugs. The most frequently used assays commonly require the application of externally added labels and read only the collective response of cells. Recent studies show that the internal biophysical parameters of cells can be associated with cellular damage. Therefore, using atomic force microscopy, we assessed the changes in the viscoelastic parameters of cells treated with eight different common cytotoxic agents to gain a more systematic view of the mechanical changes that occur. Using a robust statistical analysis that accounts for both the cell-level variability and the experimental reproducibility, we found that cell softening is a common response after each treatment. More precisely, the combined changes in the viscoelastic parameters of the power-law rheology model led to a significant decrease in the apparent elastic modulus. The comparison with the morphological parameters (cytoskeleton and cell shape) demonstrated a higher sensitivity of the mechanical parameters than of the morphological ones. The obtained results support the idea of cell mechanics-based cytotoxicity tests and suggest that softening is a common way for a cell to respond to damaging actions.

14.
Two assays based on the inhibition of [3H]thymidine incorporation into DNA were used to measure either the antimetabolic or the antiproliferative effects of anticancer drugs. A direct comparison of the two assays was made with cell suspensions obtained from 11 ovarian cancers and 22 malignant melanomas. Drugs with different effects on cell cycle phases were tested by both assays, for a total of 53 drug comparisons. When the sensitivity indices specific for each system were used, a significant association (p < 0.01) was noted between the two assays. The agreement of both assays in defining in vitro sensitivity or resistance was 100% for ovarian cancer. For melanoma, 97% of samples resistant in the antimetabolic assay were also resistant in the antiproliferative assay, whereas only 45% of samples sensitive in the antimetabolic assay were sensitive in the antiproliferative assay.

15.
Among the several linkage disequilibrium measures known to capture different features of the non-independence between alleles at different loci, the most commonly used for diallelic loci is the r² measure. In the present study, we tackled the problem of the bias of the r² estimate, which results from the sample structure and/or the relatedness between genotyped individuals. We derived two novel linkage disequilibrium measures for diallelic loci that are both extensions of the usual r² measure. The first one, r_S², uses the population structure matrix, which consists of information about the origins of each individual and the admixture proportions of each individual genome. The second one, r_V², includes the kinship matrix in the calculation. These two corrections can be applied together in order to correct for both biases and are defined either on phased or unphased genotypes. We proved that these novel measures are linked to the power of association tests under the mixed linear model including structure and kinship corrections. We validated them on simulated data and applied them to real data sets collected on Vitis vinifera plants. Our results clearly showed the usefulness of the two corrected r² measures, which actually captured 'true' linkage disequilibrium, unlike the usual r² measure.
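For reference, the usual (uncorrected) r² between two diallelic loci can be computed from phased haplotypes with the standard formula r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. The Python sketch below covers only this baseline measure; it does not implement the structure- or kinship-corrected versions described in the abstract, and the function name `r_squared` and the 0/1 allele coding are assumptions made for this example.

```python
def r_squared(haplotypes):
    """Usual r^2 linkage disequilibrium between two diallelic loci.

    `haplotypes` is a list of (allele_at_locus_A, allele_at_locus_B) pairs,
    each allele coded 0 or 1 (hypothetical coding for this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denominator
```

This baseline treats every haplotype as an independent draw from one population, which is exactly the assumption that sample structure and relatedness violate.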

16.
The profile hidden Markov model (PHMM) is widely used to assign protein sequences to their respective families. A major limitation of a PHMM is the assumption that, given the states, the observations (amino acids) are independent. To overcome this limitation, the dependency between amino acids in the multiple sequence alignment (MSA) that the PHMM represents can be added to the PHMM. Because the amino acid sequences in an MSA are biologically related, the one-by-one dependency between two amino acids can be considered. In other words, based on the MSA, the dependency between an amino acid and the corresponding amino acid located above it can be combined with the PHMM. For this purpose, a new emission probability matrix that considers the one-by-one dependencies between amino acids is constructed. The parameters of a PHMM are of two types, transition and emission probabilities, which are usually estimated using an EM algorithm called the Baum-Welch algorithm. We have generalized the Baum-Welch algorithm using a similarity emission matrix constructed by integrating the new emission probability matrix with the common emission probability matrix. Then, the performance of the similarity emission is discussed by applying it to the top twenty protein families in the Pfam database. We show that using the similarity emission in the Baum-Welch algorithm significantly outperforms the common Baum-Welch algorithm in the task of assigning protein sequences to protein families.

17.
Group B Streptococcus (GBS) is the leading cause of bacterial sepsis and meningitis among neonates. While the capsular polysaccharide (CPS) is an important virulence factor of GBS, other cell surface components, such as C proteins, may also play a role in GBS disease. CPS production by GBS type III strain M781 was greater when cells were held at a fast (1.4-h mass-doubling time [td]) than at a slow (11-h td) rate of growth. To further investigate growth rate regulation of CPS production and to investigate production of other cell components, different serotypes and strains of GBS were grown in continuous culture in a semidefined and a complex medium. Samples were obtained after at least five generations at the selected growth rate. Cells and cell-free supernatants were processed immediately, and results from all assays were normalized for cell dry weight. All serotypes (Ia, Ib, and III) and strains (one or two strains per serotype) tested produced at least 3.6-fold more CPS at a td of 1.4 h than at a td of 11 h. Production of beta C protein by GBS type Ia strain A909 and type Ib strain H36B was also shown to increase at least 5.5-fold with increased growth rate (production at a td of 1.4 h versus 11 h). The production of alpha C protein by the same strains did not significantly change with increased growth rate. The effect of growth rate on other cell components was also investigated. Production of group B antigen did not change with growth rate, while alkaline phosphatase decreased with increased growth rate. Both CAMP factor and beta-hemolysin production increased fourfold with increased growth rate. Growth rate regulation is specific for select cell components in GBS, including beta C protein, alkaline phosphatase, beta-hemolysin, and CPS production.
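To relate the two mass-doubling times quoted above to growth rates, the standard relationship for exponential growth, μ = ln 2 / td, can be applied. The short Python sketch below is only a unit conversion added for orientation; the function name `specific_growth_rate` is hypothetical and nothing here comes from the authors' methods.

```python
import math


def specific_growth_rate(doubling_time_h):
    """Specific growth rate (per hour) from a mass-doubling time in hours."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


fast = specific_growth_rate(1.4)   # fast-growth condition from the abstract
slow = specific_growth_rate(11)    # slow-growth condition from the abstract
print(f"fast: {fast:.3f} per h, slow: {slow:.3f} per h, ratio: {fast / slow:.2f}")
```

In a continuous culture at steady state the specific growth rate equals the dilution rate, so these two values also indicate the dilution rates that would correspond to the two conditions.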

18.
This study aimed to standardise an in-house real-time polymerase chain reaction (rtPCR) to allow quantification of hepatitis B virus (HBV) DNA in serum or plasma samples, and to compare this method with two commercial assays, the Cobas Amplicor HBV monitor and the Cobas AmpliPrep/Cobas TaqMan HBV test. Samples from 397 patients from the state of São Paulo were analysed by all three methods. Fifty-two samples were from patients who were human immunodeficiency virus and hepatitis C virus positive, but HBV negative. Genotypes were characterised, and the viral load was measured in each sample. The in-house rtPCR showed an excellent success rate compared with commercial tests; inter-assay and intra-assay coefficients correlated with commercial tests (r = 0.96 and r = 0.913, p < 0.001) and the in-house test showed no genotype-dependent differences in detection and quantification rates. The in-house assay tested in this study could be used for screening and quantifying HBV DNA in order to monitor patients during therapy.

19.
RNA dot-blot, quantitative electron microscope immunocytochemistry, and electrophoretic immunoblotting techniques were employed to investigate the expression of carbamoyl-phosphate synthetase I (CPS) and ornithine carbamoyl transferase (OCT) genes in rat liver and intestinal mucosa. Comparing only those cell types in the two tissues which express these enzymes, we show that the concentration of CPS and OCT in hepatocyte mitochondria is 2.3-times and 1.2-times greater, respectively, than in intestinal epithelial cell mitochondria. As a percentage of total tissue protein, however, liver homogenates contain 10-20 times more CPS and 5-10 times more OCT than is found in intestinal mucosa. These relatively large differences in enzyme protein levels between the two tissues are not reflected by differences in their mRNA levels. As a percentage of total translational activity in vitro (based on incorporation of [35S]methionine), total liver mRNA directed synthesis of about twice as much precursor CPS (pCPS) and precursor OCT (pOCT) than did equivalent amounts of mRNA from intestinal mucosa. The ratio of pCPS and pOCT mRNA levels between the two tissues (2:1, liver:intestinal mucosa) was confirmed by dot-blot and Northern hybridizations employing specific cDNA probes. The sizes of the respective mRNAs were the same for the two tissues: about 6000 residues for pCPS mRNA and about 1700 residues for pOCT mRNA.

20.
MOTIVATION: Comparative metabolic profiling by nuclear magnetic resonance (NMR) is showing increasing promise for identifying inter-individual differences in drug response. Two-dimensional (2D) ¹H-¹³C NMR can reduce spectral overlap, a common problem of 1D ¹H NMR. However, the peak alignment tools for 1D NMR spectra are not well suited for 2D NMR. An automated and statistically robust method for aligning 2D NMR peaks is required to enable comparative metabonomic analysis using 2D NMR. RESULTS: A novel statistical method was developed to align NMR peaks that represent the same chemical groups across multiple 2D NMR spectra. The degree of local pattern match among peaks in different spectra is assessed using a similarity measure, and a heuristic algorithm maximizes the similarity measure for peaks across the whole spectrum. This peak alignment method was used to align peaks in 2D NMR spectra of endogenous metabolites in liver extracts obtained from four inbred mouse strains in the study of acetaminophen-induced liver toxicity. This automated alignment method was validated by manual examination of the top 50 peaks as ranked by signal intensity. Manual inspection of 1872 peaks in 39 different spectra demonstrated that the automated algorithm correctly aligned 1810 (96.7%) peaks. AVAILABILITY: Algorithm is available upon request.
