20 similar articles found (search time: 0 ms)
1.
2.
Simple sequence repeats (SSRs) are ubiquitous short tandem repeats that are associated with various regulatory mechanisms and have been found in viral genomes. Herein, we develop MfSAT (Multi-functional SSRs Analytical Tool), a powerful new tool that can rapidly identify SSRs in multiple short viral genomes and then automatically calculate the numbers and proportions of the various SSR types (mono-, di-, tri-, tetra-, penta- and hexanucleotide repeats). Furthermore, it can also detect codon repeats and report the corresponding amino acids.
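The kind of analysis this abstract describes (finding mono- to hexanucleotide repeats, then counting each type and its proportion) can be illustrated with a minimal sketch. This is not MfSAT's implementation: the regex approach, the function names `find_ssrs` and `summarize`, and the minimum repeat thresholds in `MIN_REPEATS` are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length; NOT MfSAT's defaults.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}
SSR_TYPES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size in range(1, 7):
        # A motif of `size` bases followed by at least (minimum - 1) more copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (size, MIN_REPEATS[size] - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


def summarize(hits):
    """Map each SSR type to (count, proportion of all SSRs found)."""
    counts = Counter(SSR_TYPES[len(motif)] for _, motif, _ in hits)
    total = sum(counts.values())
    return {ssr_type: (n, n / total) for ssr_type, n in counts.items()}


if __name__ == "__main__":
    demo = "GG" + "A" * 7 + "CC" + "AT" * 4 + "GC" + "CAG" * 3 + "T"
    ssrs = find_ssrs(demo)
    print(ssrs)
    print(summarize(ssrs))
```

The `is_primitive` check keeps a run such as AAAAAA from being reported a second time as a dinucleotide repeat of "AA". Real tools differ in their thresholds and in how they treat imperfect or compound repeats, which this sketch ignores.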
3.
Summary: Segments of the Japanese quail mitochondrial genome encompassing many tRNA and protein genes, the small and part of the large rRNA genes, and the control region have been cloned and sequenced. Analysis of the relative position of these genes confirmed that the tRNA-Glu and ND6 genes in galliform mitochondrial DNA are located immediately adjacent to the control region of the molecule instead of between the cytochrome b and ND5 genes as in other vertebrates. Japanese quail and chicken display another distinctive characteristic, that is, they both lack an equivalent to the light-strand replication origin found between the tRNA-Cys and tRNA-Asn genes in all vertebrate mitochondrial genomes sequenced thus far. Comparison of the protein-encoding genes revealed that a great proportion of the substitutions are silent and involve mainly transitions. This bias toward transitions also occurs in the tRNA and rRNA genes but is not observed in the control region, where transversions account for many of the substitutions. Sequence alignment indicated that the two avian control regions evolve mainly through base substitutions but are also characterized by the occurrence of a 57-bp deletion/addition event at their 5′ end. The overall sequence divergence between the two gallinaceous birds suggests that avian mitochondrial genomes evolve at a similar rate to other vertebrate mitochondrial DNAs.
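The transition/transversion distinction this abstract relies on can be made concrete with a short sketch. It assumes two already-aligned sequences of equal length and skips gaps and ambiguous bases; the function name `count_substitutions` is hypothetical and is not taken from the paper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned DNA sequences.

    Transitions are purine<->purine or pyrimidine<->pyrimidine changes;
    transversions are purine<->pyrimidine changes. Positions with gaps or
    non-ACGT symbols are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    ts, tv = count_substitutions("ACGTAC-A", "GCATCC-T")
    print(f"transitions={ts}, transversions={tv}")
```

Applied separately to protein-coding genes and to the control region, counts like these would show the contrast the abstract reports, with transitions dominating in the former and transversions more frequent in the latter.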
4.
Gene population statistical studies of protein coding genes and introns have identified two types of periodicities on the purine/pyrimidine alphabet: (i) the modulo 3 periodicity or coding periodicity (periodicity P3) in protein coding genes of eukaryotes, prokaryotes, viruses, chloroplasts, mitochondria, plasmids and in introns of viruses and mitochondria, and (ii) the modulo 2 periodicity (periodicity P2) in the eukaryotic introns. The periodicity study is herein extended to the 5' and 3' regions of eukaryotes, prokaryotes and viruses and shows: (i) the periodicity P3 in the 5' and 3' regions of eukaryotes. Therefore, these observations suggest a unitary and dynamic concept for the genes as for a given genome, the 5' and 3' regions have the genetic information for protein coding genes and for introns: (1) In the eukaryotic genome, the 5' (P2 and P3) and 3' (P2 and P3) regions have the information for protein coding genes (P3) and for introns (P2). The intensity of P3 is high in 5' regions and weak in 3' regions, while the intensity of P2 is weak in 5' regions and high in 3' regions. (2) In the prokaryotic genome, the 5' (P3) and 3' (P3) regions have the information for protein coding genes (P3). (3) In the viral genome, the 5' (P3) and 3' (P3) regions have the information for protein coding genes (P3) and for introns (P3). The absence of P2 in viral introns (in opposition to eukaryotic introns) may be related to the absence of P2 in 5' and 3' regions of viruses.
5.
A comparison was made of the structures of the Fnr and ArcA modulons and regulons. The data on modulon composition were taken from published microarray assays, whereas regulons were characterized using comparative genomic approaches. The regulatory cascade involving Fnr and ArcA contributes greatly to the extension of the Fnr modulon over the Fnr regulon by adding operons of the ArcA modulon. The Fnr and ArcA regulons were shown to contain 26 and 16 operons, respectively. Ten operons had high-score and highly conserved sites for both Fnr and ArcA and were isolated as a so-called core of regulons.
6.
Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences (total citations: 16; self-citations: 0; citations by others: 16)
We have examined oligopeptides with lengths ranging from 2 to 11 residues in protein sequences that show no obvious evolutionary relationship. All sequences in the Protein Identification Resource database were carefully classified by sensitive homology searches into superfamilies to obtain unbiased oligopeptide counts. The results, contrary to previous studies, show clear prejudices in protein sequences. The oligopeptide preferences were used to help decide the significance of sequence homologies and to improve the more general methods for detecting protein coding regions within nucleotide sequences.
7.
8.
9.
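The cDNA of the porcine myostatin (MSTN) gene, with the signal peptide removed, was amplified by PCR to yield a 1.2 kb fragment encoding the mature protein, and this fragment was ligated into the pMD18-T vector. The construct was transformed into JM109 host cells; positive clones were screened and sequenced, and the results matched the designed sequence exactly. Plasmid DNA from this cloning vector was then amplified by PCR with a second primer pair carrying BamHI and SalI restriction enzyme recognition sequences, and the recovered 1.2 kb PCR product was directionally cloned into the pET28a(+) expression vector. A prokaryotic expression vector encoding the mature porcine myostatin protein was thus successfully constructed. Expression was induced with IPTG in LB liquid medium, and SDS-PAGE showed that the MSTN protein expressed by the recombinant bacteria was present as inclusion bodies. Thin-layer scanner analysis of the SDS-PAGE gel showed that the expressed MSTN inclusion-body protein accounted for 27.9% of the insoluble bacterial protein, with a relative molecular mass of 41451.3. The constructed expression vector contains a hexahistidine tag, and purity reached 92.5% after purification on a HisTrap affinity column. This work lays a good foundation for obtaining high-quality porcine myostatin and for preparing antibodies.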
10.
MtDNA substitution rate and segregation of heteroplasmy in coding and noncoding regions (total citations: 3; self-citations: 0; citations by others: 3)
The mitochondrial DNA (mtDNA) substitution rate and segregation of heteroplasmy were studied for the non-coding control region (D-loop) and 500 bp of the coding region between nucleotide positions 5550 and 6050, by sequence analysis of blood samples from 194 individuals, representing 33 maternal lineages. No homoplasmic nucleotide substitutions were detected in a total of 292 transmissions. The estimated substitution rate per nucleotide per million years for the control region (μ>0.21, 95% CI 0-0.6) was not significantly different from that for the coding region (μ>0.54, 95% CI 0-1.0). Variation in the length of homopolymeric C stretches was observed at three sites in the control region (positions 65, 309 and 16,189), all of which were in the heteroplasmic state. Segregation of heteroplasmic genotypes between generations was observed in several maternal pedigrees. At position 309, a longer poly C tract length was strongly associated with a higher probability for heteroplasmy and rapid segregation between generations. The length heteroplasmy at positions 65 and 16,189 was found at low frequency and was confined to a few families.
11.
12.
Donald J. Cummings, Joanne M. Domenico, James Nelson, Mitchell L. Sogin 《Journal of molecular evolution》1989,28(3):232-241
Summary: DNA sequence analysis and the localization of the 5′ and 3′ termini by S1 mapping have shown that the mitochondrial (mt) small subunit rRNA coding region from Podospora anserina is 1980 bp in length. The analogous coding region for mt rRNA is 1962 bp in maize, 1686 bp in Saccharomyces cerevisiae, and 956 bp in mammals, whereas its counterpart in Escherichia coli is 1542 bp. The P. anserina mt 16S-like rRNA is 400 bases longer than that from E. coli, but can be folded into a similar secondary structure. The additional bases appear to be clustered at specific locations, including extensions at the 5′ and 3′ termini. Comparison with secondary structure diagrams of 16S-like RNAs from several organisms allowed us to specify highly conserved and variable regions of this gene. Phylogenetic tree construction indicated that this gene is grouped with other mitochondrial genes, but most closely, as expected, with the fungal mitochondrial genes.
13.
14.
The animal in the genome: comparative genomics and evolution (total citations: 1; self-citations: 0; citations by others: 1)
Copley RR 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2008,363(1496):1453-1461
Comparisons between completely sequenced metazoan genomes have generally emphasized how similar their encoded protein content is, even when the comparison is between phyla. Given the manifest differences between phyla and, in particular, intuitive notions that some animals are more complex than others, this creates something of a paradox. Simplistic explanations have included arguments such as increased numbers of genes; greater numbers of protein products produced through alternative splicing; increased numbers of regulatory non-coding RNAs and increased complexity of the cis-regulatory code. An obvious value of complete genome sequences lies in their ability to provide us with inventories of such components. I examine progress being made in linking genome content to the pattern of animal evolution, and argue that the gap between genomic and phenotypic complexity can only be understood through the totality of interacting components.
15.
Bonasa sewerzowi, the smallest and most southerly distributed grouse species in the world, is a bird endemic to China. The population of B. sewerzowi has shown a declining trend, which has led to its listing as an endangered species in the China Red Data Book and as a Category I nationally protected animal. So far, however, most studies of this species have focused on morphological and ecological aspects. To further characterize B. sewerzowi, its complete mitochondrial genome (mitogenome) was sequenced by Illumina HiSeq 2000 high-throughput sequencing. We then compared the mitogenomes of the two Bonasa species to identify their characteristics. Finally, the phylogenetic position of Bonasa was determined based on the mitogenome dataset. Our results revealed that: (1) the mitogenome of B. sewerzowi, consisting of 16 658 bp, displayed the typical genome organization and gene order found in other previously determined Galliformes mitogenomes; (2) the structure and composition of the mitogenomes were similar between B. sewerzowi and B. bonasia; (3) the monophyly of Bonasa was well supported, and the genus had a closer phylogenetic relationship with Meleagris gallopavo.
16.
17.
Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat (total citations: 97; self-citations: 0; citations by others: 97)
Plant genomics projects involving model species and many agriculturally important crops are resulting in a rapidly increasing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used in the analysis of both structural and functional relationships in these genomes. We analyzed over 260000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species in order to reduce the redundancy and to produce a longer consensus sequence. The consensus and singleton sequences from each species were pooled and clustered to identify cross-species matches. Overall a reduction in the redundancy by 85% was observed when the resulting consensus and singleton sequences (3569) were compared to the total number of SSR-EST and related sequences analyzed (24606). This information can be useful for the development of SSR markers that can amplify across the grass genera for comparative mapping and genetics. Functional analysis may reveal their role in plant metabolism and gene evolution.
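The redundancy figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch assumes that "reduction in redundancy" means one minus the ratio of clustered sequences to input sequences; the abstract does not state the formula, and the function name `redundancy_reduction` is hypothetical.

```python
def redundancy_reduction(total_sequences, clustered_sequences):
    """Fraction of input sequences removed by clustering (assumed definition)."""
    return 1 - clustered_sequences / total_sequences


if __name__ == "__main__":
    # 24606 SSR-EST and related sequences collapse to 3569 consensus/singleton sequences.
    reduction = redundancy_reduction(24606, 3569)
    print(f"{reduction:.1%}")
```

Under that assumption the two counts give roughly 85.5%, consistent with the approximately 85% reduction reported.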
18.
19.
20.
Comparative studies of the proteomes from different organisms have provided valuable information about protein domain distribution in the kingdoms of life. Earlier studies have been limited by the fact that only about 50% of the proteomes could be matched to a domain. Here, we have extended these studies by including less well-defined domain definitions, Pfam-B and clustered domains, MAS, in addition to Pfam-A and SCOP domains. It was found that a significant fraction of these domain families are homologous to Pfam-A or SCOP domains. Further, we show that all regions that do not match a Pfam-A or SCOP domain contain a significantly higher fraction of disordered structure. These unstructured regions may be contained within orphan domains or function as linkers between structured domains. Using several different definitions we have re-estimated the number of multi-domain proteins in different organisms and found that several methods all predict that eukaryotes have approximately 65% multi-domain proteins, while the prokaryotes consist of approximately 40% multi-domain proteins. However, these numbers are strongly dependent on the exact choice of cut-off for domains in unassigned regions. In conclusion, all eukaryotes have similar fractions of multi-domain proteins and disorder, whereas a high fraction of repeating domain is distinguished only in multicellular eukaryotes. This implies a role for repeats in cell-cell contacts while the other two features are important for intracellular functions.