Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
A central problem in biology is extracting informative features from DNA sequence to determine gene interactions, and it has received a great deal of attention from researchers. Epigenetic modifications and their patterns are widely recognized as dominant features affecting gene expression. However, studies of sequence-based features that are highly correlated with these modifications remain limited. The main objective of this research was to propose a new feature, highly correlated with epigenetic modifications, that is capable of classifying genes. In this paper, 34 genes in the PPAR signaling pathway associated with muscle fat tissue in humans were classified using the proposed feature: 5-mers strongly associated with 17 different epigenetic modifications. Diverse statistical outlier detection methods were applied to identify the group of highly correlated genes. The results indicated that these 5-mers can correctly categorize genes involved in the same biological pathway or process. In addition, our results agreed with GeneMania interaction information, which supports the suggested method. These findings imply that not only epigenetic modifications but also their highly correlated 5-mers can be used as supplementary data for reconstructing gene regulatory networks, as well as for other applications such as physical interaction and gene prioritization, indicating a form of data fusion in this analysis.
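The abstract above builds its feature from 5-mers, so a minimal sketch of how overlapping k-mers might be counted in a DNA sequence may help. The function names `count_kmers` and `kmer_frequencies` are hypothetical, and the choice to skip windows containing ambiguous bases is an assumption; the paper's actual pipeline and its correlation with epigenetic marks are not reproduced here.

```python
from collections import Counter

def count_kmers(sequence, k=5):
    """Count overlapping k-mers, skipping windows with non-ACGT characters (an assumption)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def kmer_frequencies(sequence, k=5):
    """Normalize k-mer counts to relative frequencies; empty dict if no valid window exists."""
    counts = count_kmers(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}
```

A frequency vector like this, computed per gene, is the kind of input that could then be correlated with epigenetic modification profiles.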

2.
A variety of methods that predict human nonsynonymous single nucleotide polymorphisms (SNPs) to be neutral or disease-associated have been developed over the last decade. These methods are used for pinpointing disease-associated variants in the many variants obtained with next-generation sequencing technologies. The high performances of current sequence-based predictors indicate that sequence data contains valuable information about a variant being neutral or disease-associated. However, most predictors do not readily disclose this information, and so it remains unclear what sequence properties are most important. Here, we show how we can obtain insight into sequence characteristics of variants and their surroundings by interpreting predictors. We used an extensive range of features derived from the variant itself, its surrounding sequence, sequence conservation, and sequence annotation, and employed linear support vector machine classifiers to enable extracting feature importance from trained predictors. Our approach is useful for providing additional information about what features are most important for the predictions made. Furthermore, for large sets of known variants, it can provide insight into the mechanisms responsible for variants being disease-associated.

3.
4.
Loss or gain of DNA methylation can affect gene expression and is sometimes transmitted across generations. Such epigenetic alterations are thus a possible source of heritable phenotypic variation in the absence of DNA sequence change. However, attempts to assess the prevalence of stable epigenetic variation in natural and experimental populations and to quantify its impact on complex traits have been hampered by the confounding effects of DNA sequence polymorphisms. To overcome this problem as much as possible, two parents with few DNA sequence differences, but contrasting DNA methylation profiles, were used to derive a panel of epigenetic Recombinant Inbred Lines (epiRILs) in the reference plant Arabidopsis thaliana. The epiRILs showed variation and high heritability for flowering time and plant height (~30%), as well as stable inheritance of multiple parental DNA methylation variants (epialleles) over at least eight generations. These findings provide a first rationale to identify epiallelic variants that contribute to heritable variation in complex traits using linkage or association studies. More generally, the demonstration that numerous epialleles across the genome can be stable over many generations in the absence of selection or extensive DNA sequence variation highlights the need to integrate epigenetic information into population genetics studies.

5.
Chromatin is considered to be a principal carrier of epigenetic information due to the ability of alternative chromatin states to persist through generations of cell divisions and to spread on DNA. Replacement histone variants are novel candidates for epigenetic marking of chromatin. We developed a novel approach to analyze the chromatin environment of nucleosomes containing a particular replacement histone. We applied it to human H2AZ, one of the most studied alternative histones. We find that neither H2AZ itself nor other features of the H2AZ-containing nucleosome spread to the neighboring nucleosomes in vivo, arguing against a role for H2AZ as a self-perpetuating epigenetic mark.

6.
Building an accurate disease risk prediction model is an essential step in the modern quest for precision medicine. While high-dimensional genomic data provide valuable resources for investigating disease risk, the large amount of noise they contain and the complex relationships between predictors and outcomes pose tremendous analytical challenges. Deep learning models are the state-of-the-art methods for many prediction tasks and are a promising framework for the analysis of genomic data. However, deep learning models generally suffer from the curse of dimensionality and a lack of biological interpretability, both of which have greatly limited their applications. In this work, we have developed a deep neural network (DNN) based prediction modeling framework. We first proposed a group-wise feature importance score for feature selection, with which genes harboring genetic variants with both linear and non-linear effects are efficiently detected. We then designed an explainable transfer-learning based DNN method, which can directly incorporate information from feature selection and accurately capture complex predictive effects. The proposed DNN framework is biologically interpretable, as it is built on the selected predictive genes. It is also computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we have demonstrated that, compared with many existing methods, our proposed method can not only efficiently detect predictive features but also accurately predict disease risk.

7.
Kaplan and Zucker (1980) argued that dominance and kinship do not function as important organizing features for intragroup behavior and social structure among patas monkeys (Erythrocebus patas). This paper reviews the available data pertinent to this argument and concludes that dominance probably is not a reliable structural variable for captive patas, despite its clear development in most groups. In contrast, kinship is a major organizing feature that strongly affects allogrooming, other affiliative interactions, and socialization.

8.

Background

About 15% to 30% of the DNA in human sperm is packed in nucleosomes, and transmission of this fraction to the embryo potentially serves as a mechanism to facilitate paternal epigenetic programs during embryonic development. However, it has not yet been established whether these nucleosomes are removed like the protamines or indeed contribute to paternal zygotic chromatin, thereby potentially contributing to the epigenome of the embryo.

Results

To clarify the fate of sperm-derived nucleosomes, we have used the deposition characteristics of histone H3 variants, from which it follows that H3 replication variants present in zygotic paternal chromatin prior to S-phase originate from sperm. We have performed heterologous ICSI by injecting human sperm into mouse oocytes. Probing these zygotes with an antibody highly specific for the H3.1/H3.2 replication variants showed a clear signal in the decondensed human sperm chromatin prior to S-phase. In addition, staining of human multipronuclear zygotes also showed the H3.1/H3.2 replication variants in paternal chromatin prior to DNA replication.

Conclusion

These findings reveal that sperm-derived nucleosomal chromatin contributes to paternal zygotic chromatin, potentially serving as a template for replication, when epigenetic information can be copied. Hence, the execution of epigenetic programs originating from transmitted paternal chromatin during subsequent embryonic development is a logical consequence of this observation.

9.
ABSTRACT: BACKGROUND: Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. RESULTS: We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space-efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension - UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. 
On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. CONCLUSIONS: Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/.
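To illustrate what wrapper-based greedy forward selection with a regularized least-squares (RLS) learner looks like, here is a deliberately naive sketch. It is not the RLScore implementation and contains none of the matrix-calculus shortcuts the abstract describes, so its running time is far from linear. It also scores candidates by training error as a simplifying assumption. All names (`solve`, `ridge_error`, `greedy_forward_selection`) and the default `lam` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def ridge_error(rows, y, features, lam):
    """Training squared error of an RLS fit restricted to the given feature columns."""
    gram = [[sum(r[fi] * r[fj] for r in rows) + (lam if i == j else 0.0)
             for j, fj in enumerate(features)]
            for i, fi in enumerate(features)]
    rhs = [sum(r[f] * t for r, t in zip(rows, y)) for f in features]
    w = solve(gram, rhs)
    return sum((sum(wi * r[f] for wi, f in zip(w, features)) - t) ** 2
               for r, t in zip(rows, y))

def greedy_forward_selection(rows, y, n_select, lam=1.0):
    """Repeatedly add the single feature that most reduces the RLS error (lam must be > 0)."""
    selected = []
    remaining = list(range(len(rows[0])))
    for _ in range(n_select):
        best = min(remaining,
                   key=lambda f: ridge_error(rows, y, selected + [f], lam))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each candidate requires a full refit here, which is exactly the cost that a purpose-built algorithm would need to avoid on a data set with hundreds of thousands of variants.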

10.
The prevalence of common chronic non-communicable diseases (CNCDs) far overshadows the prevalence of both monogenic and infectious diseases combined. All CNCDs, also called complex genetic diseases, have a heritable genetic component that can be used for pre-symptomatic risk assessment. Common single nucleotide polymorphisms (SNPs) that tag risk haplotypes across the genome currently account for a non-trivial portion of the germ-line genetic risk and we will likely continue to identify the remaining missing heritability in the form of rare variants, copy number variants and epigenetic modifications. Here, we describe a novel measure for calculating the lifetime risk of a disease, called the genetic composite index (GCI), and demonstrate its predictive value as a clinical classifier. The GCI only considers summary statistics of the effects of genetic variation and hence does not require the results of large-scale studies simultaneously assessing multiple risk factors. Combining GCI scores with environmental risk information provides an additional tool for clinical decision-making. The GCI can be populated with heritable risk information of any type, and thus represents a framework for CNCD pre-symptomatic risk assessment that can be populated as additional risk information is identified through next-generation technologies.

11.
Systems that initially emerged to protect genomes against insertions of transposable elements, represented by mechanisms of splicing regulation, RNA interference, and epigenetic factors, have played a key role in the evolution of animals. Many studies have shown inherited transpositions of mobile elements in embryogenesis and preservation of their activities in certain tissues of adult organisms. It has been proposed that, on the emergence of Metazoa, the self-regulation mechanisms of transposons, related to the gene networks controlling their activity, could have become involved in intercellular coordination in the cascade of successive divisions with differentiated gene expression that generates tissues and organs. It has also been proposed that, during evolution, species-specific features of transposons in the genomes of eukaryotes could have formed the basis for dynamically related complexes of systems for epigenetic regulation of gene expression. These complexes could arise from the influence of noncoding transposon-derived RNAs on DNA methylation, histone modifications, and processing of alternative splicing variants, whereas the mobile elements themselves could be directly involved in the regulation of gene expression in cis and in trans. Transposons are widely distributed in the genomes of eukaryotes; therefore, their activation can change the expression of specific genes. In turn, this can play an important role in cell differentiation during ontogenesis. It is proposed that transposons can form a species-specific pattern for control of gene expression, and that some variants of this pattern can be favorable for adaptation. The presented data indicate a possible influence of transposons on karyotype formation. It is proposed that the localization of transposons relative to one another and to protein-coding genes can influence the species-specific epigenetic regulation of ontogenesis.

12.
R DeMars. Mutation Research, 1974, 24(3): 335-364
In vitro enumeration of diploid human cell variants that are resistant to purine analogues is a possible method of detecting mutagenesis. Their incidences can be increased by the known mutagens, X-rays and N-methyl-N′-nitro-N-nitrosoguanidine (MNNG). Usefulness of this method depends on the kinds of hereditary changes that confer analogue-resistance on somatic cells. If resistance usually results from changes in genetic material, in vitro studies could be useful indicators of mutagenic effects on somatic cells and germ cells in vivo. If epigenetic changes are primarily responsible for analogue-resistant variants, their enumeration might not provide information relevant to germinal mutations but would still be a useful way to detect induction of general kinds of stable phenotypic changes that could cause cancer. This article outlines hypothetical epigenetic and genetic causes of somatic cell variation and a prospective genetic analysis of human cell variants that are resistant to 8-azaguanine (AG) or 2,6-diaminopurine (DAP). Recent evidence and arguments favoring epigenetic origins of resistance to base-analogues are inconclusive. The often cited high rate of changes causing impermeability to BUdR in hamster cells is based on one improperly executed determination. Comparisons of rates of variation conferring BUdR-resistance on cultured haploid and diploid frog cells included diploid variants that did not behave as mutants and ignored major sources of error in estimating mutation rates. AG-resistance could result from recessive mutations in X-chromosomal genes but comparisons of rates of mutation in hamster cells of different ploidies did not provide information about the numbers of X-chromosomes in the variants. 
Reports that normal rodent HGPRT reappeared in hybrids of enzyme-deficient rodent cells and HGPRT-containing cells of other species or in the rodent cells alone in response to the conditions of cell hybridization did not include adequate controls for reversions in mutant genes of the rodent cells. Questions about the epigenetic and genetic origins of analogue-resistance are mostly unanswered. It remains possible that some kinds of abnormal epigenetic changes cause somatic disease. Specific methods for detecting their occurrence and responsiveness to environmental factors should be devised by focusing efforts on traits that are normally subject to epigenetic regulation. Derepression of genes on the inactive X-chromosome and of liver phenylalanine hydroxylase production are presented as possible examples of abnormal epigenetic changes that could be quantitatively studied by direct selection in vitro.

13.
Histone variants play a critical role in chromatin structure and epigenetic regulation. These “deviant” proteins have been historically considered as the evolutionary descendants of ancestral canonical histones, helping specialize the nucleosome structure during eukaryotic evolution. This view is now challenged by two major observations: first, canonical histones present unique features not shared with any other genes; second, histone variants are widespread across many eukaryotic groups. The present work further supports the ancestral nature of histone variants by providing the first in vivo characterization of a functional macroH2A histone (a variant long defined as a specific refinement of vertebrate chromatin) in a non-vertebrate organism (the mussel Mytilus), revealing its recruitment into heterochromatic fractions of actively proliferating tissues. Combined with in silico analyses of genomic data, these results provide evidence for the widespread presence of macroH2A in metazoan animals, as well as in the holozoan Capsaspora, supporting an evolutionary origin for this histone variant lineage before the radiation of Filozoans (including Filasterea, Choanoflagellata and Metazoa). Overall, the results presented in this work help configure a new evolutionary scenario in which histone variants, rather than modern “deviants” of canonical histones, would constitute ancient components of eukaryotic chromatin.

14.
15.
Histone variants and epigenetic inheritance   (total citations: 1; self-citations: 0; citations by others: 1)
Nucleosome particles, which are composed of core histones and DNA, are the basic unit of eukaryotic chromatin. Histone modifications and histone composition determine the structure and function of the chromatin; this genome packaging, often referred to as "epigenetic information", provides additional information beyond the underlying genomic sequence. The epigenetic information must be transmitted from mother cells to daughter cells during mitotic division to maintain the cell lineage identity and proper gene expression. However, the mechanisms responsible for mitotic epigenetic inheritance remain largely unknown. In this review, we focus on recent studies regarding histone variants and discuss the assembly pathways that may contribute to epigenetic inheritance. This article is part of a Special Issue entitled: Histone chaperones and Chromatin assembly.

16.
The millions of mutations and polymorphisms that occur in human populations are potential predictors of disease, of our reactions to drugs, of predisposition to microbial infections, and of age-related conditions such as impaired brain and cardiovascular functions. However, predicting the phenotypic consequences and eventual clinical significance of a sequence variant is not an easy task. Computational approaches have found perturbation of conserved amino acids to be a useful criterion for identifying variants likely to have phenotypic consequences. To our knowledge, however, no study to date has explored the potential of variants that occur at homologous positions within paralogous human proteins as a means of identifying polymorphisms with likely phenotypic consequences. In order to investigate the potential of this approach, we have assembled a unique collection of known disease-causing variants from OMIM and the Human Genome Mutation Database (HGMD) and used them to identify and characterize pairs of sequence variants that occur at homologous positions within paralogous human proteins. Our analyses demonstrate that the locations of variants are correlated in paralogous proteins. Moreover, if one member of a variant-pair is disease-causing, its partner is likely to be disease-causing as well. Thus, information about variant-pairs can be used to identify potentially disease-causing variants, extend existing procedures for polymorphism prioritization, and provide a suite of candidates for further diagnostic and therapeutic purposes.

17.
In the era of structural genomics, the prediction of protein interactions using docking algorithms is an important goal. The success of this method critically relies on the identification of good docking solutions among a vast excess of false solutions. We have adapted the concept of mutual information (MI) from information theory to achieve a fast and quantitative screening of different structural features with respect to their ability to discriminate between physiological and nonphysiological protein interfaces. The strategy includes the discretization of each structural feature into distinct value ranges to optimize its mutual information. We have selected 11 structural features and two datasets to demonstrate that the MI is dimensionless and can be directly compared for diverse structural features and between datasets of different sizes. Conversion of the MI values into a simple scoring function revealed that those features with a higher MI are actually more powerful for the identification of good docking solutions. Thus, an MI-based approach allows the rapid screening of structural features with respect to their information content and should therefore be helpful for the design of improved scoring functions in the future. In addition, the concept presented here may also be adapted to related areas that require feature selection for biomolecules or organic ligands.
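The screening idea in the abstract above rests on the mutual information between a discretized feature and a class label. A minimal sketch of that quantity follows, assuming fixed bin edges rather than the optimized discretization the authors describe. The names `discretize` and `mutual_information` are hypothetical, and the result is reported in bits.

```python
import bisect
import math
from collections import Counter

def discretize(values, edges):
    """Map each value to the index of its bin, given sorted bin edges."""
    return [bisect.bisect_right(edges, v) for v in values]

def mutual_information(feature_bins, labels):
    """Mutual information (in bits) between two equally long sequences of discrete values."""
    n = len(labels)
    if n == 0 or n != len(feature_bins):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(feature_bins, labels))
    p_feature = Counter(feature_bins)
    p_label = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (p_feature[x] * p_label[y]))
    return mi
```

Because the value depends only on the joint distribution of bins and labels, features measured in different units can be ranked on the same scale, which matches the abstract's point that MI is dimensionless.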

18.
19.
20.
What determines phenotype is one of the most fundamental questions in biology. Historically, the search for answers had focused on genetic or environmental variants, but recent studies in epigenetics have revealed a third mechanism that can influence phenotypic outcomes, even in the absence of genetic or environmental heterogeneity. Even more surprisingly, some epigenetic variants, or epialleles, can be inherited by the offspring, indicating the existence of a mechanism for biological heredity that is not based on DNA sequence. Recent work from mouse models, human monozygotic twin studies, and large-scale epigenetic profiling suggests that epigenetically determined phenotypes and epigenetic inheritance are more common than previously appreciated.
