Similar literature
 20 similar articles found (search time: 15 ms)
1.
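Analysis of minisatellite repeat sequences in the genome of the Chinese shrimp Fenneropenaeus chinensis   Total citations: 4 (self-citations: 0, citations by others: 4)
Gao Huan (高焕), Kong Jie (孔杰). Acta Zoologica Sinica (《动物学报》), 2005, 51(1): 101-107
By sequencing random genomic DNA fragments of the Chinese shrimp (Fenneropenaeus chinensis), we obtained genomic DNA sequences totalling about 641,000 bases, in which 1,720 repeat sequences were found. Of these, 398 were minisatellites, accounting for 23.14% of all repeat sequences. The repeat units of these minisatellites were 7-165 bases long and concentrated in the 7-21 base range; repeats with a 12-base unit were the most numerous (58), accounting for 14.57% of all minisatellite repeats. With respect to copy number, repeats consisting of two copies of the unit were the most numerous (137), followed by those with three copies (122), and the number of repeats decreased as the copy number increased. Some of these sequences are available in GenBank under accession numbers AY699072-AY699076. The 398 repeats are composed of 398 different repeat units, so there are many types of minisatellite repeats. We provisionally divided them into three classes (units composed of two, three, or four kinds of bases) and further subdivided each class by ordering the bases in each repeat from most to least abundant. These classifications show that the minisatellites in the F. chinensis genome are, overall, A/T-rich repeats with a certain "hierarchy", which reveals their relationship with microsatellite repeats: some minisatellite repeats may have originated from microsatellites.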

2.
A total of 100 throat swabs that tested positive for H1N1 influenza by real-time PCR, collected from fever patients in Zhejiang, Hubei and Guangdong between June and November 2009, were provided by local CDC laboratories. After MDCK cell culture, 57 pandemic influenza A (H1N1) viruses were isolated and submitted for whole-genome sequencing. A total of 39 HA sequences, 52 NA sequences, 36 PB2 sequences, 31 PB1 sequences, 40 PA sequences, 48 NP sequences, 51 MP sequences and 36 NS sequences were obtained, including 20 whole genome seq...

3.
Comparison of 170 sequences of the kinetoplast DNA minicircle hypervariable region, obtained from 19 stocks of Trypanosoma cruzi and 2 stocks of Trypanosoma cruzi marinkellei, showed that only 56% exhibited significant homology with other sequences. These sequences could be grouped into homology classes showing no significant sequence similarity with any other homology group. The remaining 44% of the sequences thus corresponded to unique sequences in our data set. In DTU I (DTU: "Discrete Typing Unit"), 51% of the sequences were unique. In contrast, in DTU IId, 87.5% of the sequences were distributed into three classes. The results obtained for T. cruzi marinkellei showed that all sequences were unique, without any similarity between them and T. cruzi sequences. Analysis of palindromes in all sequence sets showed a high frequency of the EcoRI site. Analysis of repetitive sequences suggested a common ancestral origin of the kDNA. The editing mechanism that occurs in kinetoplastidae is discussed.

4.
The uRNA database.   Total citations: 3 (self-citations: 0, citations by others: 3)
The uRNADB offers aligned, annotated and phylogenetically ordered sequences of several U RNAs. New to this release are RNAs from U7 (14 sequences), U8 (two sequences), U11 (three sequences), U12 (two sequences), U14 (11 sequences), U18, U48 and U49. A total of 34 new sequences were aligned with the previously compiled snRNAs U1, U2, U3, U4, U5 and U6.

5.
6.
Comparing DNA or protein sequences plays an important role in the functional analysis of genomes. Although many methods are available for sequence comparison, few retain the information content of the sequences. We propose a new approach, the Yau-Hausdorff method, which considers all translations and rotations when seeking the best match between graphical curves of DNA or protein sequences. The complexity of this method is lower than that of any other two-dimensional minimum Hausdorff algorithm. The Yau-Hausdorff method can be used to measure the similarity of DNA sequences based on two important tools: the Yau-Hausdorff distance and the graphical representation of DNA sequences. The graphical representations of DNA sequences conserve all sequence information, and the Yau-Hausdorff distance is mathematically proved to be a true metric. Therefore, the proposed distance can precisely measure the similarity of DNA sequences. Phylogenetic analyses of DNA sequences using the Yau-Hausdorff distance show the accuracy and stability of our approach in similarity comparison of DNA or protein sequences. This study demonstrates that the Yau-Hausdorff distance is a natural metric for DNA and protein sequences with a high level of stability. The approach can also be applied to similarity analysis of protein sequences by graphical representations, as well as to general two-dimensional shape matching.
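
To make the underlying distance concrete, here is a minimal Python sketch of the ordinary symmetric Hausdorff distance between two sets of 2D points. It is not the Yau-Hausdorff method described in the abstract: it does not search over translations and rotations, and the function names and toy point sets are illustrative assumptions rather than anything taken from the paper.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy 2D point sets standing in for graphical curves (hypothetical data).
curve_a = [(0, 0), (1, 0)]
curve_b = [(0, 0), (1, 0), (1, 3)]

print(hausdorff(curve_a, curve_b))
```

A minimum-Hausdorff approach such as the one the abstract describes would additionally minimize this quantity over rigid motions of one curve; that search is deliberately left out of the sketch.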

7.
8.
Goto N, Kurokawa K, Yasunaga T. Gene, 2007, 401(1-2): 172-180
To date, the complete genome sequences of more than 250 organisms have been determined. This information can now be used to determine whether there exist any invariant sequences that are conserved among all organisms, from bacteria to plants, animals, and humans. The existence of invariant sequences would strongly suggest that these sequences have been inherited unchanged from the last common ancestor of all life, and that they have essential functions. We have developed a new software program to identify invariant sequences conserved among the currently sequenced genomes and applied this analysis to the complete genome sequences of 266 organisms. We have identified 3 invariant DNA sequences longer than or equal to 11 bp and 6 invariant amino acid sequences longer than or equal to 6 aa. The longest invariant DNA sequence, AAGTCGTAACAAGGT (15 bp), was found in the 16S/18S rRNA gene. Two 8 aa sequences, GHVDHGKT in IF2 and EF-Tu and DTPGHVDF in EF-G, were the longest invariant amino acid sequences detected. These sequences could be essential elements from the genome of the last common ancestor and may have remained unchanged throughout evolution.
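
The idea of an invariant sequence can be illustrated with a short Python sketch that intersects the sets of length-k substrings of several sequences. This is only a toy illustration under stated assumptions: the abstract does not describe how the authors' program works, the function names `kmers` and `invariant_kmers` are hypothetical, and the input strings are made up rather than real genomes.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def invariant_kmers(genomes, k):
    """Return the k-mers that occur in every sequence in `genomes`."""
    if not genomes:
        return set()
    shared = kmers(genomes[0], k)
    for genome in genomes[1:]:
        shared &= kmers(genome, k)
        if not shared:
            break  # nothing can be invariant once the intersection is empty
    return shared


# Made-up toy sequences, not real genomic data.
toy_genomes = ["CCGATTACAGG", "TTGATTACACC", "AGATTACATTT"]

print(invariant_kmers(toy_genomes, 7))
```

Holding whole k-mer sets in memory is fine for toy inputs; for hundreds of complete genomes a real tool would presumably need a more compact index, which is beyond this sketch.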

9.
10.
The frequencies of "words", oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence "texts". Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (unrelatedness) between the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences.
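
As a rough illustration of comparing sequences by their word frequencies, the following Python sketch correlates plain k-letter word frequencies of two sequences. It is a simplification under stated assumptions: it uses raw frequencies and the Pearson coefficient, not the contrast vocabularies or the linguistic similarity value defined in the paper, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def word_frequencies(seq, k):
    """Relative frequency of every possible k-letter word over ACGT."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(w)] / total for w in product("ACGT", repeat=k)]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def frequency_correlation(seq_a, seq_b, k=2):
    """Correlate the k-letter word frequency vectors of two sequences."""
    return pearson(word_frequencies(seq_a, k), word_frequencies(seq_b, k))


# Made-up toy sequences for illustration only.
print(frequency_correlation("ACGTTGCAAC", "ACGTTGCATC"))
```

Values near 1 indicate similar word usage; how well such a crude score tracks biological relatedness is not something this sketch can establish.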

11.
To study the properties of DNA sequences, we have transformed the sequences of bases into sequences of twist angles along the chain of the DNA double helix by using the Dickerson sum function. The Fourier transform and the auto-correlation function of the twist-angle sequences have been used to study the periodicity and randomness of the original DNA sequences. Based on the correlation coefficient, a "distance" between two DNA fragments has been defined and used to compare some realistic DNA sequences. It is hoped that the techniques developed here could be used to analyze more realistic DNA sequences.

12.
Sequence alignment is an important bioinformatics tool for identifying homology, but searching against the full set of available sequences is likely to result in many hits to poorly annotated sequences that provide very little information. Consequently, we often want alignments against a specific subset of sequences: for instance, sequences from a particular species, sequences that have known 3D structures, sequences that have a reliable (curated) function annotation, and so on. Although such subset databases are readily available, they represent only a small fraction of all sequences. Thus, the likelihood of finding close homologs for query sequences is smaller, and the alignments will in general have lower scores. This makes it difficult to distinguish hits to homologous sequences from random hits to unrelated sequences. Here, we propose a method that addresses this problem by first aligning query sequences against a large database representing the corpus of known sequences, and then constructing indirect (or transitive) alignments by combining the results with alignments from the large database against the desired target database. We compare the results to direct pairwise alignments, and show that our method gives higher-sensitivity alignments against the target database.

13.
14.
Here we report radiation hybrid mapping of 105 new porcine microsatellite markers on the IMpRH7000 radiation hybrid panel. In addition, we searched flanking sequences of these markers, as well as 673 previously reported RH-mapped microsatellite markers, for orthology to human sequences. Eighty-seven new and 111 previously mapped sequences exhibited orthology to human sequences. Using a stringent sequence alignment, 25 microsatellite-flanking sequences were found to be highly similar to genic sequences, whereas 173 were similar to non-genic sequences in the human genome. Five markers were located near known breakpoints of synteny between human and swine.

15.
16.
Repetitive sequences constitute a significant component of most eukaryotic genomes, and the isolation and characterization of repetitive DNA sequences provide an insight into the organization and evolution of the genome of interest. We report the isolation and characterization of the major classes of repetitive sequences from the genome of Panax ginseng. The isolation of repetitive DNA from P. ginseng was achieved by the reannealing of chemically hydrolyzed (200 bp-1 kb fragments) and heat-denatured genomic DNA to a low C(o)t value. The low C(o)t fraction was cloned, and fifty-five P. ginseng clones were identified that contained repetitive sequences. Sequence analysis revealed that the fraction includes repetitive telomeric sequences, species-specific satellite sequences, chloroplast DNA fragments and sequences that are homologous to retrotransposons. Two of the retrotransposon-like sequences are homologous to Ty1/copia-type retroelements of Zea mays, and six cloned sequences are homologous to various regions of the del retrotransposon of Lilium henryi. The del retrotransposon-like sequences and several novel repetitive DNA sequences from P. ginseng were used to differentiate P. ginseng from P. quinquefolius, and should be useful for evolutionary studies of these disjunct species.

17.
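Objective: To study the genetic homology of HBV strains from different sources. Methods: Twenty-four complete genome sequences of HBV strains submitted from various regions of China were retrieved from GenBank. ClustalW 1.83 was used to compare the homology of the complete genome sequences of the strains and to construct a phylogenetic tree, whose characteristics were then analysed. Results: The complete genome sequences of HBV strains from different regions were not identical, and those of strains from the same region were also not identical, in some cases differing greatly. Conclusion: The complete genome sequences of HBV strains show great heterogeneity both between regions and within regions.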

18.
The 5' portions and flanking sequences of genes encoding types 1, 12, 24, and 6 M proteins were compared. Although the DNA sequences encoding the amino-termini of the mature M proteins had no obvious similarity, upstream sequences, and those encoding the signal peptides (leader sequences) of the four M protein genes had considerable similarity. In general, the 5' ends of all the leader sequences were more conserved than the 3' ends, although the M6 and M24 leader sequences had identical 3' ends. Sequence similarity among the deduced amino acid sequences of the four signal peptides was more extensive than the corresponding DNA sequences. We found that strict DNA similarity among all four sequences extended only to the ends of the hydrophilic amino-terminal regions of the signal peptides, but that amino acid sequence conservation continued to the ends of the respective hydrophobic cores. With the exception of the M6 and M24 sequences, the regions adjacent to the signal peptidase cleavage sites were highly variable.

19.
Repetitive DNA sequences near immunoglobulin genes in the mouse genome (Steinmetz et al., 1980a,b) were characterized by restriction mapping and hybridization. Six sequences were determined that turned out to belong to a new family of dispersed repetitive DNA. From the sequences, which are called R1 to R6, a 475 base-pair consensus sequence was derived. The R family is clearly distinct from the mouse B1 family (Krayev et al., 1980). According to saturation hybridization experiments, there are about 100,000 R sequences per haploid genome, and they are probably distributed throughout the genome. The individual R sequences have an average divergence from the consensus sequence of 12.5%, which is largely due to point mutations and, among those, to transitions. Some R sequences are severely truncated. The R sequences extend into A-rich sequences and are flanked by short direct repeats. Also, two large insertions in the R2 sequence are flanked by direct repeats. In the neighbourhood of and within R sequences, stretches of DNA have been identified that are homologous to parts of small nuclear RNA sequences. Mouse satellite DNA-like sequences and members of the B1 family were also found in close proximity to the R sequences. The dispersion of R sequences within the mouse genome may be a consequence of transposition events. The possible role of the R sequences in recombination and/or gene conversion processes is discussed.

20.
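Using the genomes of 7 archaea, 46 bacteria, and 10 eukaryotes as samples, and taking into account both short-range and long-range correlations between bases, we obtained the dinucleotide frequencies at different positions within codon pairs of coding sequences and within triplet pairs of intergenic sequences, and on this basis constructed phylogenetic relationships based on coding sequences and on intergenic sequences. Whether clustering was based on coding-sequence or intergenic-sequence information, the archaea or the eukaryotes were each grouped into a single branch, indicating that the choice of clustering parameters was appropriate. Pairwise comparison with the phylogeny constructed from amino acid sequences showed that, for most Firmicutes, evolution differs considerably between coding sequences and intergenic sequences, and between coding sequences and amino acid sequences. The analysis suggests that a more natural phylogenetic relationship can be obtained only by jointly considering the evolutionary information of all three types of sequences.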
