1.

Structural interpretation of protein-protein interaction network

Ataur R Katebi Andrzej Kloczkowski Robert L Jernigan 《BMC structural biology》2010,10(Z1):S4

Background

A huge amount of protein-protein interaction data is currently available from high-throughput experimental methods. In a large network of protein-protein interactions, groups of proteins can be identified as functional clusters having related functions, where a single protein can occur in multiple clusters. However, experimental methods are error-prone, and thus the interactions in a functional cluster may include false positives, or there may be unreported interactions. Therefore, correctly identifying a functional cluster of proteins requires knowing whether any two proteins in a cluster interact, whether an interaction can exclude other interactions, and how strong the affinity between two interacting proteins is.

Methods

In the present work, the yeast protein-protein interaction network is clustered using a spectral clustering method proposed by us in 2006, and the individual clusters are investigated for functional relationships among the member proteins. 3D structural models of the proteins in one cluster have been built; the protein structures are retrieved from the Protein Data Bank or predicted using a comparative modeling approach. A rigid-body protein docking method (ClusPro) is used to predict the protein-protein interaction complexes. Binding sites of the docked complexes are characterized by their buried surface areas, as a measure of the strength of an interaction.
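The buried-surface-area measure mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition BSA = SASA(A) + SASA(B) - SASA(AB); the abstract does not say which formula or software the authors used, and the function name and all numeric values below are hypothetical.

```python
def buried_surface_area(sasa_a, sasa_b, sasa_complex):
    """Total surface area (in square angstroms) buried when A and B form a complex.

    Assumed definition: SASA(A) + SASA(B) - SASA(AB).
    """
    bsa = sasa_a + sasa_b - sasa_complex
    if bsa < 0:
        raise ValueError("complex SASA exceeds the sum of the free partners")
    return bsa


# Hypothetical solvent-accessible surface areas (not taken from the paper):
# (SASA of partner 1, SASA of partner 2, SASA of the docked complex)
docked = {
    ("P1", "P2"): (9500.0, 7200.0, 15100.0),
    ("P1", "P3"): (9500.0, 6100.0, 14900.0),
}

# Rank docked pairs by buried surface area, largest first.
ranked = sorted(
    docked,
    key=lambda pair: buried_surface_area(*docked[pair]),
    reverse=True,
)
```

Under the abstract's reading, a larger buried area is taken as a proxy for a stronger interaction, so the first pair in `ranked` would be the candidate with the tightest predicted interface.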

Results

The clustering method yields functionally coherent clusters. Some of the interactions in a cluster exclude other interactions because of shared binding sites. New interactions among the interacting proteins are uncovered, and thus higher order protein complexes in the cluster are proposed. Also the relative stability of each of the protein complexes in the cluster is reported.
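The idea that interactions sharing a binding site exclude one another can be sketched as a simple set-overlap check. This is only an illustration under stated assumptions: the residue numbers, partner names, and the rule "any shared residue means mutually exclusive" are hypothetical and are not the authors' criterion.

```python
from itertools import combinations


def mutually_exclusive_pairs(binding_sites):
    """Return partner pairs whose binding sites on the shared protein overlap.

    binding_sites maps a partner name to the set of residues it contacts
    on one common protein.
    """
    exclusive = []
    for a, b in combinations(sorted(binding_sites), 2):
        shared = binding_sites[a] & binding_sites[b]
        if shared:
            exclusive.append((a, b, sorted(shared)))
    return exclusive


# Hypothetical residues on one hub protein contacted by each docked partner.
sites = {
    "partnerA": {12, 13, 45, 46},
    "partnerB": {45, 46, 47, 80},
    "partnerC": {101, 102, 130},
}

result = mutually_exclusive_pairs(sites)
```

Partners whose sites do not overlap (here `partnerC` with either of the others) remain candidates for binding at the same time, which is how higher-order complexes could be proposed.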

Conclusions

Although the methods used are computationally expensive and require human intervention and judgment, they can identify interactions that could occur together and ones that are mutually exclusive. In addition, indirect interactions through an intermediate protein can be identified. These theoretical predictions might help crystallographers select targets for the X-ray crystallographic determination of protein complexes.

Similar articles

2.

Diffusion kernel-based logistic regression models for protein function prediction

Lee H Tu Z Deng M Sun F Chen T 《Omics : a journal of integrative biology》2006,10(1):40-55

Assigning functions to unknown proteins is one of the most important problems in proteomics. Several approaches have used protein-protein interaction data to predict protein functions. We previously developed a Markov random field (MRF) based method to infer a protein's functions using protein-protein interaction data and the functional annotations of its protein interaction partners. In the original model, only direct interactions were considered and each function was considered separately. In this study, we develop a new model which extends direct interactions to all neighboring proteins, and one function to multiple functions. The goal is to understand a protein's function based on information on all the neighboring proteins in the interaction network. We first developed a novel kernel logistic regression (KLR) method based on diffusion kernels for protein interaction networks. The diffusion kernels provide means to incorporate all neighbors of proteins in the network. Second, we identified a set of functions that are highly correlated with the function of interest, referred to as the correlated functions, using the chi-square test. Third, the correlated functions were incorporated into our new KLR model. Fourth, we extended our model by incorporating multiple biological data sources such as protein domains, protein complexes, and gene expressions by converting them into networks. We showed that the KLR approach of incorporating all protein neighbors significantly improved the accuracy of protein function predictions over the MRF model. The incorporation of multiple data sets also improved prediction accuracy. The prediction accuracy is comparable to another protein function classifier based on the support vector machine (SVM), using a diffusion kernel. The advantages of the KLR model include its simplicity as well as its ability to explore the contribution of neighbors to the functions of proteins of interest.
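To make the notion of a diffusion kernel on an interaction network concrete, here is a minimal sketch. It assumes the standard graph diffusion kernel K = exp(beta * H), where H is the adjacency matrix minus the diagonal degree matrix; the abstract does not give the exact formulation or the value of beta, so the function names, the toy graph, and the parameters are all illustrative. The matrix exponential is approximated with a truncated Taylor series, which is adequate only for small graphs and modest beta.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(adj, beta=0.5, terms=40):
    """Approximate K = exp(beta * H) with H = A - D (no self-loops assumed)."""
    n = len(adj)
    h = [[float(adj[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        h[i][i] = -float(sum(adj[i]))  # minus the degree on the diagonal
    scaled = [[beta * x for x in row] for row in h]

    # Start from the identity matrix (the k = 0 term of the series).
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms + 1):
        # term becomes (beta * H)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


# Toy path network 0 - 1 - 2 - 3 (hypothetical, not real interaction data).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
K = diffusion_kernel(adj)
```

In this toy case the kernel value between protein 0 and the others decays with network distance but stays positive, which is the sense in which a diffusion kernel lets indirect neighbors contribute to a prediction.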

3.

Mass spectrometry-based functional proteomics: from molecular machines to protein networks (cited by: 1; self-citations: 0; citations by others: 1)

Köcher T Superti-Furga G 《Nature methods》2007,4(10):807-815

The study of protein-protein interactions by mass spectrometry is an increasingly important part of post-genomics strategies to understand protein function. A variety of mass spectrometry-based approaches allow characterization of cellular protein assemblies under near-physiological conditions and subsequent assignment of individual proteins to specific molecular machines, pathways and networks, according to an increasing level of organizational complexity. An appropriate analytical strategy can be individually tailored, from an in-depth analysis of single complexes to a large-scale characterization of entire molecular pathways or even an analysis of the molecular organization of entire expressed proteomes. Here we review different options regarding protein-complex purification strategies, mass spectrometry analysis and bioinformatic methods according to the specific question that is being addressed.

4.

Docking of protein models

Tovchigrechko A Wells CA Vakser IA 《Protein science : a publication of the Protein Society》2002,11(8):1888-1896

5.

Fully automated protein complex prediction based on topological similarity and community structure

Chengwei Lei Saleh Tamim Alexander JR Bishop Jianhua Ruan 《Proteome science》2013,11(Z1):S9

To understand the function of protein complexes and their association with biological processes, many studies have analyzed protein-protein interaction (PPI) networks. However, the advancement in high-throughput technology has resulted in a very large amount of data for analysis. Moreover, high levels of noise, sparseness, and skewness in the degree distribution of PPI networks limit the performance of many clustering algorithms and further analysis of their interactions. To address these problems, we present a novel random walk based algorithm that converts the incomplete and binary PPI network into a protein-protein topological similarity matrix (PP-TS matrix). We believe that if two proteins share some high-order topological similarities they are likely to be interacting with each other. Using the obtained PP-TS matrix, we constructed and used weighted networks to further study and analyze the interaction among proteins. Specifically, we applied a fully automated community structure finding algorithm (Auto-HQcut) on the obtained weighted network to cluster protein complexes. We then analyzed the protein complexes for significance in biological processes. To help visualize and analyze these protein complexes we also developed an interface that displays the resulting complexes as well as the characteristics associated with each complex. Applying our approach to a yeast protein-protein interaction network, we found that the predicted protein-protein interaction pairs with high topological similarities have more significant biological relevance than the original protein-protein interaction pairs. When we compared our PPI network reconstruction algorithm with other existing algorithms using gene ontology and gene co-expression, our algorithm produced the highest similarity scores. Also, our predicted protein complexes showed a higher accuracy measure compared to the other protein complex predictions.

6.

Identification of transient hub proteins and the possible structural basis for their multiple interactions (cited by: 1; self-citations: 0; citations by others: 1)

Higurashi M Ishida T Kinoshita K 《Protein science : a publication of the Protein Society》2008,17(1):72-78

Proteins that can interact with multiple partners play central roles in the network of protein-protein interactions. They are called hub proteins, and recently it was suggested that an abundance of intrinsically disordered regions on their surfaces facilitates their binding to multiple partners. However, in those studies, the hub proteins were identified as proteins with multiple partners, regardless of whether the interactions were transient or permanent. As a result, a certain number of hub proteins are subunits of stable multi-subunit proteins, such as supramolecules. It is well known that stable complexes and transient complexes have different structural features, and thus the statistics based on the current definition of hub proteins will hide the true nature of hub proteins. Therefore, in this paper, we first describe a new approach to identify proteins with multiple partners dynamically, using the Protein Data Bank, and then we performed statistical analyses of the structural features of these proteins. We refer to the proteins as transient hub proteins or sociable proteins, to clarify the difference with hub proteins. As a result, we found that the main difference between sociable and nonsociable proteins is not the abundance of disordered regions, in contrast to the previous studies, but rather the structural flexibility of the entire protein. We also found greater predominance of charged and polar residues in sociable proteins than previously reported.

7.

Categorizing biases in high-confidence high-throughput protein-protein interaction data sets

Yu X Ivanic J Memisević V Wallqvist A Reifman J 《Molecular & cellular proteomics : MCP》2011,10(12):M111.012500

We characterized and evaluated the functional attributes of three yeast high-confidence protein-protein interaction data sets derived from affinity purification/mass spectrometry, protein-fragment complementation assay, and yeast two-hybrid experiments. The interacting proteins retrieved from these data sets formed distinct, partially overlapping sets with different protein-protein interaction characteristics. These differences were primarily a function of the deployed experimental technologies used to recover these interactions. This affected the total coverage of interactions and was especially evident in the recovery of interactions among different functional classes of proteins. We found that the interaction data obtained by the yeast two-hybrid method was the least biased toward any particular functional characterization. In contrast, interacting proteins in the affinity purification/mass spectrometry and protein-fragment complementation assay data sets were over- and under-represented among distinct and different functional categories. We delineated how these differences affected protein complex organization in the network of interactions, in particular for strongly interacting complexes (e.g. RNA and protein synthesis) versus weak and transient interacting complexes (e.g. protein transport). We quantified methodological differences in detecting protein interactions from larger protein complexes, in the correlation of protein abundance among interacting proteins, and in their connectivity of essential proteins. In the latter case, we showed that minimizing inherent methodology biases removed many of the ambiguous conclusions about protein essentiality and protein connectivity. We used these findings to rationalize how biological insights obtained by analyzing data sets originating from different sources sometimes do not agree or may even contradict each other. 
An important corollary of this work was that discrepancies in biological insights did not necessarily imply that one detection methodology was better or worse, but rather that, to a large extent, the insights reflected the methodological biases themselves. Consequently, interpreting the protein interaction data within their experimental or cellular context provided the best avenue for overcoming biases and inferring biological knowledge.

8.

Elucidating the dynamic remodelling of Escherichia coli interactome in different growth conditions using multiplex co-fractionation MS (mCF-MS)

Teck Yew Low 《Proteomics》2023,23(21-22):2300209

Most proteins function by forming complexes within a dynamic interconnected network that underlies various biological mechanisms. To systematically investigate such interactomes, high-throughput techniques, including CF-MS, have been developed to capture, identify, and quantify protein-protein interactions (PPIs) on a large scale. Compared to other techniques, CF-MS allows the global identification and quantification of native protein complexes in one setting, without genetic manipulation. Furthermore, quantitative CF-MS can potentially elucidate the distribution of a protein in multiple co-elution features, informing the stoichiometries and dynamics of a target protein complex. In this issue, Youssef et al. (Proteomics 2023, 00, e2200404) combined multiplex CF-MS and a new algorithm to study the dynamics of the PPI network for Escherichia coli grown under ten different conditions. Although the results demonstrated that most proteins remained stable, the authors were able to detect disrupted interactions that were growth condition specific. Further bioinformatics analyses also revealed the biophysical properties and structural patterns that govern such a response.

9.

Prediction of protein-protein interactions: unifying evolution and structure at protein interfaces

Tuncbag N Gursoy A Keskin O 《Physical biology》2011,8(3):035006

The vast majority of the chores in the living cell involve protein-protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein-protein interaction prediction approaches. As a sample approach for modeling protein interactions, we detail PRISM, which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.

10.

An expanded protein-protein interaction network in Bacillus subtilis reveals a group of hubs: Exploration by an integrative approach

Marchadier E Carballido-López R Brinster S Fabret C Mervelet P Bessières P Noirot-Gros MF Fromion V Noirot P 《Proteomics》2011,11(15):2981-2991

11.

Using indirect protein-protein interactions for protein complex prediction (cited by: 1; self-citations: 0; citations by others: 1)

Chua HN Ning K Sung WK Leong HW Wong L 《Journal of bioinformatics and computational biology》2008,6(3):435-466

Protein complexes are fundamental for understanding principles of cellular organizations. As the sizes of protein-protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete. Here, we study the use of indirect interactions between level-2 neighbors (level-2 interactions) for protein complex prediction. We know from previous work that proteins which do not interact but share interaction partners (level-2 neighbors) often share biological functions. We have proposed a method in which all direct and indirect interactions are first weighted using topological weight (FS-Weight), which estimates the strength of functional association. Interactions with low weight are removed from the network, while level-2 interactions with high weight are introduced into the interaction network. Existing clustering algorithms can then be applied to this modified network. We have also proposed a novel algorithm that searches for cliques in the modified network, and merges cliques to form clusters using a "partial clique merging" method. Experiments show that (1) the use of indirect interactions and topological weight to augment protein-protein interactions can be used to improve the precision of clusters predicted by various existing clustering algorithms; and (2) our complex-finding algorithm performs very well on interaction networks modified in this way. Since no other information except the original PPI network is used, our approach would be very useful for protein complex prediction, especially for prediction of novel protein complexes.
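The notion of level-2 neighbors and a topology-based weight can be sketched as follows. Note the assumption: the weight used here is a plain Jaccard overlap of neighborhoods, chosen as a simple stand-in; it is not the FS-Weight formula, which the abstract does not spell out. The graph and function names are hypothetical.

```python
def level2_neighbors(graph, node):
    """Proteins that share a partner with `node` but do not interact with it."""
    direct = graph[node]
    indirect = set()
    for partner in direct:
        indirect |= graph[partner]
    return indirect - direct - {node}


def shared_partner_weight(graph, u, v):
    """Jaccard overlap of the two neighborhoods, each including the protein itself.

    A simplified stand-in for a topological weight, not FS-Weight.
    """
    nu = graph[u] | {u}
    nv = graph[v] | {v}
    return len(nu & nv) / len(nu | nv)


# Toy undirected interaction network as an adjacency map (hypothetical).
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

candidates = {
    other: shared_partner_weight(graph, "A", other)
    for other in level2_neighbors(graph, "A")
}
```

In the spirit of the method described above, level-2 pairs whose weight clears some threshold would be added as edges and low-weight direct edges dropped before clustering; the threshold itself would have to be chosen empirically.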

12.

Integrative approach for computationally inferring protein domain interactions (cited by: 9; self-citations: 0; citations by others: 9)

Ng SK Zhang Z Tan SH 《Bioinformatics (Oxford, England)》2003,19(8):923-929

MOTIVATION: The current need for high-throughput protein interaction detection has resulted in interaction data being generated en masse through such experimental methods as yeast-two-hybrids and protein chips. Such data can be erroneous and they often do not provide adequate functional information for the detected interactions. Therefore, it is useful to develop an in silico approach to further validate and annotate the detected protein interactions. RESULTS: Given that protein-protein interactions involve physical interactions between protein domains, domain-domain interaction information can be useful for validating, annotating, and even predicting protein interactions. However, large-scale, experimentally determined domain-domain interaction data do not exist. Here, we describe an integrative approach to computationally derive putative domain interactions from multiple data sources, including protein interactions, protein complexes, and Rosetta Stone sequences. We further prove the usefulness of such an integrative approach by applying the derived domain interactions to predict and validate protein-protein interactions. AVAILABILITY: A database of putative protein domain interactions derived using the method described in this paper is available at http://interdom.lit.org.sg.

13.

MULTIPROSPECTOR: an algorithm for the prediction of protein-protein interactions by multimeric threading

Lu L Lu H Skolnick J 《Proteins》2002,49(3):350-364

In this postgenomic era, the ability to identify protein-protein interactions on a genomic scale is very important to assist in the assignment of physiological function. Because of the increasing number of solved structures involving protein complexes, the time is ripe to extend threading to the prediction of quaternary structure. In this spirit, a multimeric threading approach has been developed. The approach is comprised of two phases. In the first phase, traditional threading on a single chain is applied to generate a set of potential structures for the query sequences. In particular, we use our recently developed threading algorithm, PROSPECTOR. Then, for those proteins whose template structures are part of a known complex, we rethread on both partners in the complex and now include a protein-protein interfacial energy. To perform this analysis, a database of multimeric protein structures has been constructed, the necessary interfacial pairwise potentials have been derived, and a set of empirical indicators to identify true multimers based on the threading Z-score and the magnitude of the interfacial energy have been established. The algorithm has been tested on a benchmark set comprised of 40 homodimers, 15 heterodimers, and 69 monomers that were scanned against a protein library of 2478 structures that comprise a representative set of structures in the Protein Data Bank. Of these, the method correctly recognized and assigned 36 homodimers, 15 heterodimers, and 65 monomers. This protocol was applied to identify partners and assign quaternary structures of proteins found in the yeast database of interacting proteins. Our multimeric threading algorithm correctly predicts 144 interacting proteins, compared to the 56 (26) cases assigned by PSI-BLAST using a (less) permissive E-value of 1 (0.01). Next, all possible pairs of yeast proteins have been examined. 
Predictions (n = 2865) of protein-protein interactions are made; 1138 of these 2865 interactions have counterparts in the Database of Interacting Proteins. In contrast, PSI-BLAST made 1781 predictions, and 1215 have counterparts in DIP. An estimation of the false-negative rate for yeast-predicted interactions has also been provided. Thus, a promising approach to help assist in the assignment of protein-protein interactions on a genomic scale has been developed.
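The counts quoted in this abstract can be turned into rates with a few lines of arithmetic. The sketch below only divides the reported numbers; the dictionary keys and the helper name are illustrative, and treating the DIP overlap as a fraction of all predictions is an interpretation, not a figure stated in the abstract.

```python
def fraction(hits, total):
    """Share of `total` accounted for by `hits`."""
    return hits / total


# Benchmark: (correctly recognized, total in the benchmark set).
benchmark = {
    "homodimers": (36, 40),
    "heterodimers": (15, 15),
    "monomers": (65, 69),
}
recognition = {name: fraction(*counts) for name, counts in benchmark.items()}

# Pooled recognition rate over all 124 benchmark proteins.
overall = fraction(
    sum(hits for hits, _ in benchmark.values()),
    sum(total for _, total in benchmark.values()),
)

# Share of genome-wide predictions that have a counterpart in DIP.
dip_overlap = {
    "multimeric threading": fraction(1138, 2865),
    "PSI-BLAST": fraction(1215, 1781),
}
```

Read this way, the threading approach makes more predictions than PSI-BLAST (2865 versus 1781), while a smaller share of them has a counterpart in DIP.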

14.

Prediction of protein functions with gene ontology and interspecies protein homology data

Mitrofanova A Pavlovic V Mishra B 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(3):775-784

Accurate computational prediction of protein functions increasingly relies on network-inspired models for the protein function transfer. This task can become challenging for proteins isolated in their own network or those with poor or uncharacterized neighborhoods. Here, we present a novel probabilistic chain-graph-based approach for predicting protein functions that builds on connecting networks of two (or more) different species by links of high interspecies sequence homology. In this way, proteins are able to "exchange" functional information with their neighbors-homologs from a different species. The knowledge of interspecies relationships, such as the sequence homology, can become crucial in cases of limited information from other sources of data, including the protein-protein interactions or cellular locations of proteins. We further enhance our model to account for the Gene Ontology dependencies by linking multiple but related functional ontology categories within and across multiple species. The resulting networks are of significantly higher complexity than most traditional protein network models. We comprehensively benchmark our method by applying it to the two largest protein networks, the Yeast and the Fly. The joint Fly-Yeast network provides substantial improvements in precision, accuracy, and false positive rate over networks that consider either of the sources in isolation. At the same time, the new model retains a computational efficiency similar to that of the simpler networks.

15.

Mass spectrometry for the study of protein-protein interactions (cited by: 8; self-citations: 0; citations by others: 8)

Figeys D McBroom LD Moran MF 《Methods (San Diego, Calif.)》2001,24(3):230-239

The identification of subpicomolar amounts of protein by mass spectrometry (MS) coupled with two-dimensional methods to separate complex protein mixtures is fueling the field of proteomics, and making feasible the notion of cataloging and comparing all of the expressed proteins in a biological sample. Functional proteomics is a complementary effort aimed at the characterization of functional features of proteins, such as their interactions with other proteins. Proteins comprise modular domains, many of which are noncatalytic modules that direct protein-protein interactions. Capturing proteins of interest and their interacting proteins by using high-affinity antibodies presents a simple method to prepare relatively simple protein mixtures easily resolved in one-dimensional formats. Individual or mixtures of proteins identified as stained bands in polyacrylamide gels are subjected to in situ digestion with the protease trypsin, and the extracted peptide fragments are analyzed by MS. The quality, quantity, and complexity of the tryptic digest, the species origin of the proteins, and the quality of the corresponding databases of genomic and protein information greatly influence the subsequent MS analysis in terms of degree of difficulty and methodological approach required to make an unambiguous protein identification. In this article we report the isolation of associated proteins from a complex cell-derived lysate by using an epitope-directed antibody. The protein pICLn engineered to carry an epitope tag was purified from cultured human embryonic kidney cells, and found to associate with a variety of proteins including the spliceosomal proteins smE and smG. By application of this general approach, the systematic identification of protein complexes and assignment of protein function are possible.

16.

Far-Western based protein-protein interaction screening of high-density protein filter arrays

Mahlknecht U Ottmann OG Hoelzer D 《Journal of biotechnology》2001,88(2):89-94

Even though a rough sketch of the human genome is now available and the number of newly discovered genes, which carry the potential of being biologically and medically relevant is currently greater than ever, only a small proportion has been assigned a biological function. Therefore, enormous attention is now increasingly being drawn towards functional genomics, i.e. the functional characterization of these newly identified sequences. In order to elucidate the role of a particular gene product within its cellular context, we have screened high-density protein filter arrays for protein-protein interactions on the basis of a 'Far-Western' based approach. The methodology described herein easily allows the identification and isolation of cDNAs of proteins, which interact with specific ligands (interacting proteins, antibodies and DNA/RNA sequences), and represents an alternative to tedious conventional protein interaction analyses. Far-Western screening in the context of a whole-genome expression analysis not only facilitates the assignment of biological functions to specific, newly identified protein and DNA sequences, but also is useful in studies that assess the binding capacity of mutant proteins to their interaction partner and in the identification of domains and amino acids involved in known protein-protein interactions. Taken together, we describe an approach that allows the easy and reproducible identification of protein ligands on the basis of a whole-genome expression analysis.

17.

Identification of functional modules from conserved ancestral protein-protein interactions (cited by: 1; self-citations: 0; citations by others: 1)

Dutkowski J Tiuryn J 《Bioinformatics (Oxford, England)》2007,23(13):i149-i158

MOTIVATION: The increasing availability of large-scale protein-protein interaction (PPI) data has fueled the efforts to elucidate the building blocks and organization of cellular machinery. Previous studies have shown cross-species comparison to be an effective approach in uncovering functional modules in protein networks. This has in turn driven the research for new network alignment methods with a more solid grounding in network evolution models and better scalability, to allow multiple network comparison. RESULTS: We develop a new framework for protein network alignment, based on reconstruction of an ancestral PPI network. The reconstruction algorithm is built upon a proposed model of protein network evolution, which takes into account phylogenetic history of the proteins and the evolution of their interactions. The application of our methodology to the PPI networks of yeast, worm and fly reveals that the most probable conserved ancestral interactions are often related to known protein complexes. By projecting the conserved ancestral interactions back onto the input networks we are able to identify the corresponding conserved protein modules in the considered species. In contrast to most of the previous methods, our algorithm is able to compare many networks simultaneously. The performed experiments demonstrate the ability of our method to uncover many functional modules with high specificity. AVAILABILITY: Information for obtaining software and supplementary results is available at http://bioputer.mimuw.edu.pl/papers/cappi.

18.

Functional topology in a network of protein interactions (cited by: 8; self-citations: 0; citations by others: 8)

Przulj N Wigle DA Jurisica I 《Bioinformatics (Oxford, England)》2004,20(3):340-348

MOTIVATION: The building blocks of biological networks are individual protein-protein interactions (PPIs). The cumulative PPI data set in Saccharomyces cerevisiae now exceeds 78 000. Studying the network of these interactions will provide valuable insight into the inner workings of cells. RESULTS: We performed a systematic graph theory-based analysis of this PPI network to construct computational models for describing and predicting the properties of lethal mutations and proteins participating in genetic interactions, functional groups, protein complexes and signaling pathways. Our analysis suggests that lethal mutations are not only highly connected within the network, but they also satisfy an additional property: their removal causes a disruption in network structure. We also provide evidence for the existence of alternate paths that bypass viable proteins in PPI networks, while such paths do not exist for lethal mutations. In addition, we show that distinct functional classes of proteins have differing network properties. We also demonstrate a way to extract and iteratively predict protein complexes and signaling pathways. We evaluate the power of predictions by comparing them with a random model, and assess accuracy of predictions by analyzing their overlap with MIPS database. CONCLUSIONS: Our models provide a means for understanding the complex wiring underlying cellular function, and enable us to predict essentiality, genetic interaction, function, protein complexes and cellular pathways. This analysis uncovers structure-function relationships observable in a large PPI network.
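One simple way to make "removal causes a disruption in network structure" concrete is to ask whether deleting a node increases the number of connected components. The sketch below does this by brute force on a toy graph; it is an assumed reading of the property for illustration, not the authors' analysis, and the graph and function names are hypothetical.

```python
from collections import deque


def component_count(graph, removed=None):
    """Count connected components, optionally ignoring one removed node."""
    seen = {removed} if removed is not None else set()
    count = 0
    for start in graph:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over the remaining nodes
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return count


def disruptive_nodes(graph):
    """Nodes whose removal splits the network into more components."""
    base = component_count(graph)
    return sorted(n for n in graph if component_count(graph, removed=n) > base)


# Toy network: a triangle A-B-C with a tail C-D-E (hypothetical).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

cut_nodes = disruptive_nodes(graph)
```

Nodes A and B sit on a cycle, so an alternate path bypasses each of them, whereas no path bypasses C or D. Recomputing components for every node is quadratic and only suits small examples; a linear-time articulation-point algorithm would be the usual choice for a full interactome.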

19.

Methods for mapping of interaction networks involving membrane proteins (cited by: 2; self-citations: 0; citations by others: 2)

Hooker BS Bigelow DJ Lin CT 《Biochemical and biophysical research communications》2007,363(3):457-461

Nearly one-third of all genes in various organisms encode membrane-associated proteins that participate in numerous protein-protein interactions important to the processes of life. However, membrane protein interactions pose significant challenges due to the need to solubilize membranes without disrupting protein-protein interactions. Traditionally, analysis of isolated protein complexes by high-resolution 2D gel electrophoresis has been the main method used to obtain an overall picture of proteome constituents and interactions. However, this method is time consuming, labor intensive, detects only abundant proteins and is limited with respect to the coverage required to elucidate large interaction networks. In this review, we discuss the application of various methods to elucidate interactions involving membrane proteins. These techniques include methods for the direct isolation of single complexes or interactors as well as methods for characterization of entire subcellular and cellular interactomes.

20.

Prediction of protein function using protein-protein interaction data (cited by: 8; self-citations: 0; citations by others: 8)

Minghua Deng Kui Zhang Shipra Mehta Ting Chen Fengzhu Sun 《Journal of computational biology》2003,10(6):947-960

Assigning functions to novel proteins is one of the most important problems in the postgenomic era. Several approaches have been applied to this problem, including the analysis of gene expression patterns, phylogenetic profiles, protein fusions, and protein-protein interactions. In this paper, we develop a novel approach that employs the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of protein's interaction partners. For each function of interest and protein, we predict the probability that the protein has such function using Bayesian approaches. Unlike other available approaches for protein annotation in which a protein has or does not have a function of interest, we give a probability for having the function. This probability indicates how confident we are about the prediction. We employ our method to predict protein functions based on "biochemical function," "subcellular location," and "cellular role" for yeast proteins defined in the Yeast Proteome Database (YPD, www.incyte.com), using the protein-protein interaction data from the Munich Information Center for Protein Sequences (MIPS, mips.gsf.de). We show that our approach outperforms other available methods for function prediction based on protein interaction data. The supplementary data is available at www-hto.usc.edu/~msms/ProteinFunction.
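The idea of reporting a probability rather than a yes/no annotation can be illustrated with a deliberately simplified sketch. It assumes a logistic model of how many interaction partners do or do not carry the function; this is a stand-in for illustration, not the Markov random field with Bayesian estimation described in the abstract, and the function name and all parameter values are hypothetical.

```python
import math


def function_probability(n_with, n_without,
                         alpha=-1.0, w_with=0.8, w_without=-0.3):
    """Probability that a protein has a function, given its annotated partners.

    n_with / n_without: number of interaction partners that do / do not
    carry the function. alpha, w_with, w_without are made-up parameters;
    in practice they would be estimated from annotated data.
    """
    score = alpha + w_with * n_with + w_without * n_without
    return 1.0 / (1.0 + math.exp(-score))


# A protein with three annotated partners carrying the function,
# versus one whose three partners all lack it.
p_supported = function_probability(n_with=3, n_without=0)
p_unsupported = function_probability(n_with=0, n_without=3)
```

With these assumed weights, evidence from partners moves the estimate up or down from the baseline given by `alpha`, and the resulting value can be read as a confidence in the annotation rather than a hard call.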