Similar articles
 20 similar articles found (search time: 31 ms)
1.
All tick proteins assigned to the lipocalin family lack the structurally conserved regions (SCRs) that are characteristic of the kernel lipocalins and can thus be classified as outliers. These tick proteins have been assigned to the tick lipocalin family based on database searches that indicated homology between tick sequences and the fact that the histamine binding protein (HBP2) from the hard tick Rhipicephalus appendiculatus (Ixodidae) shows structural similarity to the lipocalin fold. Sequence identity between kernel and outlier lipocalins falls below 20%, which raises the question of whether the outlier and kernel lipocalins are truly homologous. More specifically, in the case of the tick lipocalins, the question is whether their structural fold is derived from the lipocalin fold or whether convergent evolution generated the basic lipocalin-like fold, which consists of an eight-stranded continuous anti-parallel beta-barrel terminated by a C-terminal alpha-helix that lies parallel to the barrel. The current study determined the gene structure for HBP2 and for TSGP1, TSGP2 and TSGP4, lipocalins identified from the soft tick Ornithodoros savignyi (Argasidae). All tick lipocalins have four introns (A-D) with conserved positions and phases within the tick lipocalin sequence alignment. The positions and phase information are also conserved with regard to the rest of the lipocalin family. Phylogenetic analysis using this information shows conclusively that tick lipocalins are evolutionarily related to the rest of the lipocalin family. Tick lipocalins are grouped within a monophyletic clade that indicates a monophyletic origin within the tick lineage and also group with the other arthropod lipocalins in a larger clade. Phylogenetic analysis of sequence alignments based on conserved secondary structure of the lipocalin fold supports the conclusions from the gene structure trees. 
These results indicate that exon-intron arrangement can be useful for the inclusion of outlier lipocalins within the larger lipocalin family.

2.
A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase β-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.

3.
4.
The lipocalins constitute a family of proteins that have been found in eubacteria and a variety of eukaryotic cells, where they play diverse physiological roles. It is the primary goal of this review to examine the patterns of change followed by lipocalins through their complex history, in order to stimulate scientists in the field to test our phylogeny-derived hypotheses experimentally. We reexamine our previous work on lipocalin phylogeny and update the phylogenetic analysis of the family. Lipocalins separate into 14 monophyletic clades, some of which are grouped in well supported superclades. The lipocalin tree was rooted with the bacterial lipocalin genes under the assumption that they have evolved from a single common ancestor with the metazoan lipocalins, and not by horizontal transfer. The topology of the rooted tree and the species distribution of lipocalins suggest that the newly arising lipocalins show a higher rate of amino acid sequence divergence and a higher rate of gene duplication, and that their internal pocket has evolved towards binding smaller hydrophobic ligands more efficiently.

5.
The lipocalins are a family of extracellular proteins that bind and transport small hydrophobic molecules. They are found in eubacteria and a great variety of eukaryotic cells, in which they play diverse physiological roles. We report here the detection of two new eukaryotic lipocalins and a phylogenetic analysis of 113 lipocalin family members performed with maximum-likelihood and parsimony methods on their amino acid sequences. Lipocalins segregate into 13 monophyletic clades, some of which are grouped in well-supported superclades. An examination of the G + C content of the bacterial lipocalin genes and the detection of four new conceptual lipocalins in other eubacterial species argue against a recent horizontal transfer as the origin of prokaryotic lipocalins. Therefore, we rooted our lipocalin tree using the clade containing the prokaryotic lipocalins. The topology of the rooted lipocalin tree is in general agreement with the currently accepted view of the organismal phylogeny of arthropods and chordates. The rooted tree allows us to assign polarity to character changes and suggests a plausible scenario for the evolution of important lipocalin properties. More recently evolved lipocalins tend to (1) show greater rates of amino acid substitutions, (2) have more flexible protein structures, (3) bind smaller hydrophobic ligands, and (4) increase the efficiency of their ligand-binding contacts. Finally, we found that the family of fatty-acid-binding proteins originated from the more derived lipocalins and therefore cannot be considered a sister group of the lipocalin family.

6.
Summary: In the previous three reports in this series we demonstrated that the EF-hand family of proteins evolved by a complex pattern of gene duplication, transposition, and splicing. The dendrograms based on exon sequences are nearly identical to those based on protein sequences for troponin C, the essential light chain of myosin, the regulatory light chain, and calpain. This validates both the computational methods and the dendrograms for these subfamilies. The proposal of congruence for calmodulin, troponin C, essential light chain, and regulatory light chain was confirmed. There are, however, significant differences in the calmodulin dendrograms computed from DNA and from protein sequences. In this study we find that introns are distributed throughout the EF-hand domain and the interdomain regions. Further, dendrograms based on intron type and distribution bear little resemblance to those based on protein or on DNA sequences. We conclude that introns are inserted, and probably deleted, with relatively high frequency. Further, in the EF-hand family exons do not correspond to structural domains and exon shuffling played little if any role in the evolution of this widely distributed homolog family. Calmodulin has had a turbulent evolution. Its dendrograms based on protein sequence, exon sequence, 3′-tail sequence, intron sequences, and intron positions all show significant differences.

7.
The lipocalins are a highly divergent, ubiquitous family of proteins that commonly function in binding lipophilic molecules. Although a specific tear lipocalin is a major component of lacrimal fluid and tears in many mammals, there has been no definitive identification of such a protein in rabbit tears. The goals of this project were to identify the major proteins in rabbit (Oryctolagus cuniculus) lacrimal fluid, so as to determine if they include a lipocalin and, if such a protein is present, to determine its source. Lacrimal fluid was collected from sexually mature female NZW rabbits, and culture medium from rabbit lacrimal gland epithelial (acinar) and interstitial cells was isolated. Proteins from these fluids were separated by SDS-PAGE and analyzed by sequencing the intact proteins and sequencing or mass analysis of fragments derived by trypsin digestion. Proteins of approximately 85 and 67 kDa were identified as rabbit transferrin and serum albumin, respectively, while components of 17 and 7 kDa had N-terminal sequences identical to those of lipophilin CL and AL, respectively. BLAST searches of the nr database with the N-terminal sequence of a protein of 18 kDa did not identify any homologues. However, when used to scan the PROSITE database, it was found to contain a lipocalin signature sequence. It is closely related to two lipocalins previously isolated from rabbit saliva and nasal mucus. Further studies with the N-terminal and internal sequences confirmed that the lacrimal protein is a lipocalin that is truncated at the N-terminus as compared with other tear lipocalins and is more similar to odorant binding proteins from rodents.

8.
We studied a protein from the midgut of the silkworm Bombyx mori characterized by its ability to bind the prosthetic group of chlorophyll, which confers fluorescent properties on this protein. Several techniques (2D electrophoresis purification, MS-MS and MALDI-TOF peptide sequencing, RT-PCR and nucleotide sequencing) were used to obtain the nucleotide sequence and the deduced amino acid sequence. The coding sequence was compared to the gene sequence to define the number and size of introns and exons. The gene spanned 45.5 kb of DNA and consisted of 46 exons. The cDNA encoded a protein of 2721 amino acids. The protein was identified as a lipocalin with novel features. Most lipocalins are proteins with high affinity to small lipophilic molecules, with a molecular size in the 25 kDa range and a well conserved tertiary structure. The apoprotein described here revealed 15 lipocalin-like structures arranged in line. We called this protein a polycalin (pentadecacalin).

9.
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.

10.
Lipocalins are functionally diverse proteins that are composed of 120–180 amino acid residues. Members of this family have several important biological functions including ligand transport, cryptic coloration, sensory transduction, endonuclease activity, stress response activity in plants, odorant binding, prostaglandin biosynthesis, cellular homeostasis regulation, immunity, immunotherapy and so on. Identification of lipocalins from protein sequence is challenging because sequence identity is poor and often falls below the twilight zone. So far, no specific method has been reported to identify lipocalins from primary sequence. In this paper, we report a support vector machine (SVM) approach to predict lipocalins from protein sequence using sequence-derived properties. LipoPred was trained using a dataset consisting of 325 lipocalin proteins and 325 non-lipocalin proteins, and evaluated by an independent set of 140 lipocalin proteins and 21,447 non-lipocalin proteins. LipoPred achieved 88.61% accuracy with 89.26% sensitivity, 85.27% specificity and 0.74 Matthews correlation coefficient (MCC). When applied to the test dataset, LipoPred achieved 84.25% accuracy with 88.57% sensitivity, 84.22% specificity and MCC of 0.16. LipoPred performed better than the PSI-BLAST, HMM and SVM-Prot methods. Out of 218 lipocalins, LipoPred correctly predicted 194 proteins including 39 lipocalins that are non-homologous to any protein in the SWISSPROT database. This result shows that LipoPred is potentially useful for predicting the lipocalin proteins that have no sequence homologs in the sequence databases. Further, successful prediction of nine hypothetical lipocalin proteins and five new members of the lipocalin family proves that LipoPred can be efficiently used to identify and annotate the new lipocalin proteins from sequence databases. The LipoPred software and dataset are available at .
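
The four figures reported for the independent test set all follow from one confusion matrix. The sketch below is a rough consistency check, not the authors' evaluation code: the counts (124, 16, 18,063 and 3,384) are assumptions reconstructed by rounding the stated sensitivity and specificity against the stated set sizes (140 lipocalins, 21,447 non-lipocalins). It suggests why MCC drops to about 0.16 on such an imbalanced set even though sensitivity and specificity stay in the mid-80s.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity, MCC) for a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # fraction of true lipocalins recovered
    specificity = tn / (tn + fp)   # fraction of non-lipocalins rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc


# Assumed counts, reconstructed from the reported percentages (not given in the abstract).
tp, fn = 124, 16        # 140 lipocalins in the independent set
tn, fp = 18063, 3384    # 21,447 non-lipocalins in the independent set

acc, sens, spec, mcc = binary_metrics(tp, fn, tn, fp)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%} MCC={mcc:.2f}")
```

With roughly 150 negatives per positive, the few thousand false positives far outnumber the true positives, and that is what pulls the MCC down.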

11.
The evolutionary significance of introns remains a mystery. The current availability of several complete eukaryotic genomes permits new studies to probe the possible function of these peculiar genomic features. Here we investigate the degree to which gene structure (intron position, phase and length) is conserved between homologous protein domains. We find that for certain extracellular-signalling and nuclear domains, gene structures are similar even when protein sequence similarity is low or not significant and sequences can only be aligned with a knowledge of protein tertiary structure. In contrast, other domains, including most intracellular signalling modules, show little gene structure conservation. Intriguingly, many domains with conserved gene structures, such as cytokines, are involved in similar biological processes, such as the immune response. This suggests that gene structure conservation may be a record of key events in evolution, such as the origin of the vertebrate immune system or the duplication of nuclear receptors in nematodes. The results suggest ways to detect new and potentially very remote homologues, and to construct phylogenies for proteins with limited sequence similarity.

12.
The origins of tick toxicoses remain a subject of controversy because no molecular data are yet available to study the evolution of tick-derived toxins. In this study we describe the molecular structure of toxins from the soft tick, Ornithodoros savignyi. The tick salivary gland proteins (TSGPs) are four highly abundant proteins proposed to play a role in salivary gland granule biogenesis of the soft tick O. savignyi, of which the toxins TSGP2 and TSGP4 are a part. They were assigned to the lipocalin family based on sequence similarity to known tick lipocalins. Several other tick lipocalins were also identified using Smith-Waterman database searches, bringing the tick lipocalin family up to 20. Phylogenetic analysis showed that most tick lipocalins group within genus-specific clades, suggesting that gene duplication and divergence of tick lipocalin function occurred after tick speciation, most probably during the evolution of a hematophagous lifestyle. TSGP2 and TSGP3 show high sequence identity and group terminal to moubatin, an inhibitor of collagen-induced platelet aggregation from the tick, O. moubata. However, no platelet aggregation inhibitory activity is associated with the TSGPs using ADP or collagen as agonists, suggesting that TSGP2 and TSGP3 duplicated after divergence of O. savignyi and O. moubata. This timing is supported by the absence of TSGP2-4 in the salivary gland extracts of O. moubata. The absence of TSGP2 and TSGP4 in salivary gland extracts from O. moubata correlates with the nontoxicity of this tick species. The implications of this study are that the various forms of tick toxicoses do not have a common origin, but must have evolved independently in those tick species that cause pathogenesis.

13.
Phylogenetic and exon–intron structure analyses of intra- and interspecific fungal subtilisins in this study provided support for a mixed model of intron evolution: a synthetic theory of introns-early and introns-late speculations. Intraspecifically, Pr1A contained three phase-zero introns. Introns 1 and 2, located at highly conserved positions, were phylogenetically congruent with the coding region, which favors the introns-early view, whereas intron 3 had two different sizes and was evolutionarily incongruent with the coding region, which is evidence for the introns-late view. Notably, the subtilisin Pr1J gene from different strains of M. anisopliae contained different numbers of introns, strong evidence in support of the introns-late theory. Interspecifically, phylogenetic analysis of 60 retrievable fungal subtilisins revealed a clear relationship between amino acid sequence and gene exon–intron structure: homogeneous sequences usually have a similar exon–intron structure. Across the examined fungal subtilisin genes there were 10 intron positions, occupied with a strong bias by phase-zero introns; half of these positions were highly conserved, while the others were species-specific and appeared to be of recent origin through intron insertion, in favor of the introns-late theory. The high conservation of positions 1 and 2, occupied by a high percentage of phase-zero introns, together with the phylogenetic congruence between the evolutionary histories of the intron sequences and the coding region, suggested that the introns at these two positions were primordial. Reviewing Editor: Dr. Manyuan Long

14.
Different species of the lichen-forming ascomycete fungus Teloschistes were found to contain group IB introns at position S1506 in the small subunit ribosomal RNA gene. We have characterized the structural organization and phylogeny of the Teloschistes introns Tco.S1506, Tla.S1506, and Tvi.S1506. Features common to all the introns are a small size, a compact RNA structure, and an atypical catalytic ribozyme core sequence motif. Variations in intron sizes, due to sequence extensions in the P1 and P8 loop segments, were observed in different species and isolates. Phylogenetic analyses based on the ITS1-5.8S-ITS2 region as well as the introns show that the Teloschistes S1506 introns represent a distinct, evolutionarily isolated cluster among the nuclear group I introns. Furthermore, introns from different lineages of Teloschistes villosus appear not to be strictly vertically inherited, probably owing to horizontal transfer in one of the lineages.

15.
Savicalin is a lipocalin found in the hemocytes of the soft tick, Ornithodoros savignyi. It could be assigned to the tick lipocalin family based on BLAST analysis. Savicalin is the first non-salivary gland lipocalin described in ticks. The mature sequence is composed of 188 amino acids with a molecular mass of 21481.9 Da. A homolog for savicalin was found in a whole body EST-library from a related soft tick O. porcinus, while other tick salivary gland derived lipocalins retrieved from the non-redundant sequence database are more distantly related. Homology modeling supports the inclusion of savicalin into the lipocalin family. The model as well as multiple alignments suggests the presence of five disulphide bonds. Two conserved disulphide bonds are found in hard and soft tick lipocalins. A third disulphide bond is shared with the TSGP4-clade of leukotriene C4 binding soft tick lipocalins and a fourth is shared with a lipocalin from the hard tick Ixodes scapularis. The fifth disulphide bond is unique and links strands D-E. Phylogenetic analysis showed that savicalin is a distant relative of salivary gland derived lipocalins, but groups within a clade that is possibly non-salivary gland derived. It lacks the biogenic amine-binding motif associated with tick histamine and serotonin binding proteins. Expression profiles indicate that savicalin is found in hemocytes, midgut and ovaries, but not in the salivary glands. Up-regulation occurs in hemocytes after bacterial challenge and in midguts and ovaries after feeding. Given its tissue distribution and up-regulation of expression, it is possible that this lipocalin functions in tick development after feeding or in an anti-microbial capacity.

16.
The origin and modes of transmission of introns remain matters of much debate. Previous studies of the group I intron in the angiosperm cox1 gene inferred frequent angiosperm-to-angiosperm horizontal transmission of the intron from apparent incongruence between intron phylogenies and angiosperm phylogenies, patchy distribution of the intron among angiosperms, and differences between cox1 exonic coconversion tracts (the first 22 nt downstream of where the intron inserted). We analyzed the cox1 gene in 179 angiosperms, 110 of them containing the intron (intron(+)) and 69 lacking it (intron(-)). Our taxon sampling in Araceae is especially dense to test hypotheses about vertical and horizontal intron transmission put forward by Cho and Palmer (1999. Multiple acquisitions via horizontal transfer of a group I intron in the mitochondrial cox1 gene during evolution of the Araceae family. Mol Biol Evol. 16:1155-1165). Maximum likelihood trees of Araceae cox1 introns, and also of all angiosperm cox1 introns, are largely congruent with known phylogenetic relationships in these taxa. The exceptions can be explained by low signal in the intron and long-branch attraction among a few taxa with high mitochondrial substitution rates. Analysis of the 179 coconversion tracts reveals 20 types of tracts (11 of them only found in single species, all involving silent substitutions). The distribution of these tracts on the angiosperm phylogeny shows a common ancestral type, characterizing most intron(+) and some intron(-) angiosperms, and several derivative tract types arising from gradual back mutation of the coconverted nucleotides. Molecular clock dating of small intron(+) and intron(-) sister clades suggests that coconversion tracts have persisted for 70 Myr in Araceae, whose cox1 sequences evolve comparatively slowly. 
Sequence similarity among the 110 introns ranges from 91% to identical, whereas putative homologs from fungi are highly different, but sampling in fungi is still sparse. Together, these results suggest that the cox1 intron entered angiosperms once, has largely or entirely been transmitted vertically, and has been lost numerous times, with coconversion tract footprints providing unreliable signal of former intron presence.

17.
The lipocalins and fatty acid-binding proteins (FABPs) are two recently identified protein families that both function by binding small hydrophobic molecules. We have sought to clarify relationships within and between these two groups through an analysis of both structure and sequence. Within a similar overall folding pattern, we find large parts of the lipocalin and FABP structures to be quantitatively equivalent. The three largest structurally conserved regions within the lipocalin common core correspond to characteristic sequence motifs that we have used to determine the constitution of this family using an iterative sequence analysis procedure. This afforded a new interpretation of the family, which highlighted the difficulties of determining a comprehensive and coherent classification of the lipocalins. The first of the three conserved sequence motifs is also common to the FABPs and corresponds to a conserved structural element characteristic of both families. Similarities of structure and sequence within the two families suggest that they form part of a larger "structural superfamily"; we have christened this overall group the calycins to reflect the cup-shaped structure of its members.

18.
We describe the complete sequence of the gene encoding mouse NF-M, the middle-molecular-mass neurofilament protein. The coding sequence is interrupted by two intervening sequences which align perfectly with the first two intervening sequences in the gene encoding NF-L (the low-molecular-mass neurofilament protein); there is no intron in the gene encoding NF-M corresponding to the third intron in NF-L. Therefore, both the number of introns and their arrangement in the genes encoding NF-L and NF-M contrast sharply with the number and arrangement of introns in the genes of known sequence encoding other members of the intermediate filament multigene family (desmin, vimentin, glial fibrillary acidic protein and the acidic and basic keratins); with the exception of a single truncated keratin gene that lacks an encoded tailpiece, these genes all contain eight introns, of which at least six are placed at homologous locations. Assuming the existence of a primordial intermediate filament gene containing most (if not all) of the introns found in contemporary non-neurofilament intermediate filament genes, it seems likely that an RNA-mediated transposition event was involved in the generation of an ancestral gene encoding the NF polypeptides. A combination of insertional transposition and gene-duplication events could then explain the anomalous number and placement of introns within these genes. Consistent with this notion, we show that the genes encoding NF-M and NF-L are linked.

19.
The concept of scaffolds that can be equipped with artificial biochemically active sites has gained recent interest in the field of protein design. Members of the lipocalin protein family represent promising model systems in this respect. Especially prototypic lipocalins, such as the retinol-binding protein or the bilin-binding protein (BBP), exhibit a structurally simple one-domain fold with a conformationally well conserved beta-barrel as their central motif. This type of supersecondary structure is made of a cylindrically closed beta-sheet of eight antiparallel strands. At the open end of the barrel the beta-strands are connected by four loops in a pairwise manner so that a pocket for the ligand is formed. In a rational protein design study a metal-binding site was functionally grafted on the solvent-exposed surface of the beta-barrel, whereby the rigid backbone conformation permitted the spatially defined arrangement of three His side chains. In a combinatorial protein design approach, the natural ligand pocket of a lipocalin was reshaped. In this manner variants of the BBP were engineered which exhibit high affinity and remarkable specificity for haptens like fluorescein and digoxigenin. The so-called 'anticalins', i.e. artificial lipocalins recognizing prescribed ligands, could provide an interesting alternative to recombinant antibody fragments. Consequently, the use of lipocalins as a scaffold opens new applications for members of this functionally diverse protein family in biotechnology and medicine.

20.
The Artemia hemoglobin contains two sub-units that are similar or different chains of nine globin domains. The domains are ancestrally related and are presumed to be derived from copies of an original single-domain parent gene. Since the gene copies have remained in the same environment for several hundred million years they provide an excellent model for the investigation of intron stability. The cDNA for one of the two types of nine-domain subunit (domains T1–T9) has been sequenced. Comparison with the corresponding genomic DNA reveals a total of 17 intradomain introns. Fourteen of the introns are in locations on the protein that are conventional in globins of other species. In eight of the nine domains an intron corresponds to the B helix, amino acid B12, following the second nucleotide (phase 2), and in six domains a G-helix intron is located between G6 and G7 (phase 0). The consistency of this pattern is supportive of the introns having been inherited from a single-domain parent gene. The remaining three introns are in unconventional locations. Two occur in the F helix, either in amino acid F3 (phase 1) in domain T3, or between F2 and F3 (phase 0) in domain T6. The two F introns strengthen an interpretation of intron inheritance since globin F introns are rare, and in domains T3 and T6 they replace rather than supplement the conventional G introns, as though displacement from G to F occurred before that part of the gene became duplicated. It is inferred that one of the F introns subsequently moved by one nucleotide. Similarly, the third unconventional intron location is the G intron in domain T4 which is in G6, phase 2, one nucleotide earlier than the other G introns. Domain T4 is also unusual in lacking a B intron. 
The pattern of introns in the Artemia globin gene supports a concept of general positional stability but the exceptions, where introns have moved out of reading frame, or have moved by several codons, or have been deleted, suggest that intron displacements can occur after inheritance from an ancient source. Correspondence to: C.N.A. Trotman
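
The phase terminology used in the abstract above (phase 0 between codons, phase 1 after the first nucleotide of a codon, phase 2 after the second) can be made concrete with a small sketch. It assumes the coding length of each exon is known; the exon lengths in the example are hypothetical and are not taken from the Artemia gene or any other gene discussed here.

```python
def intron_positions(coding_exon_lengths):
    """Locate each intron relative to the reading frame.

    coding_exon_lengths: coding nucleotides contributed by each exon, in order
    (untranslated regions excluded). Returns one (complete_codons_before, phase)
    pair per intron, where phase is the number of nucleotides of the interrupted
    codon that lie upstream of the intron (0 means the intron sits between codons).
    """
    positions = []
    coding_so_far = 0
    # An intron follows every exon except the last one.
    for length in coding_exon_lengths[:-1]:
        coding_so_far += length
        positions.append((coding_so_far // 3, coding_so_far % 3))
    return positions


# Hypothetical three-exon gene with 95, 130 and 75 coding nucleotides.
print(intron_positions([95, 130, 75]))  # [(31, 2), (75, 0)]
```

Here the first intron splits codon 32 after its second nucleotide (phase 2) and the second falls between codons 75 and 76 (phase 0). An intron that shifts by a single nucleotide, as inferred for one of the F introns, changes phase while staying at essentially the same codon.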
