Similar articles
 20 similar articles found (search time: 31 ms)
1.
EPS8 encodes a protein essential in Ras to Rac signaling leading to actin remodeling. Three genes highly homologous to EPS8 were discovered, thereby defining a novel gene family. Here, we report the genomic structure of EPS8 and the EPS8-related genes in human and mouse. We performed BLASTN searches against the Celera Human Genome and Mouse Fragments Database. The mouse fragments were manually assembled, and the organization of both human and mouse genes was reconstructed. The gene structures in Celera annotations of the human and mouse genomes were compared to outline correspondences and divergences. We also compared the EPS8 family gene structures predicted by Celera with those predicted by NCBI. Moreover, we performed a virtual analysis of the expression of the EPS8 gene family members by using the SAGEmap Database in NCBI. Finally, we analyzed the domain organization of the gene products and their evolutionary conservation to define novel putative domains, thereby helping to predict novel modes of action for the members of this gene family. The data obtained will be instrumental in directing further experimental functional characterization of these genes.

2.
Garcia SP, Pinho AJ 《PLoS ONE》2011,6(12):e29344
Minimal absent words have been computed in genomes of organisms from all domains of life. Here, we aim to contribute to the catalogue of human genomic variation by investigating the variation in number and content of minimal absent words within a species, using four human genome assemblies. We compare the reference human genome GRCh37 assembly, the HuRef assembly of the genome of Craig Venter, the NA12878 assembly from cell line GM12878, and the YH assembly of the genome of a Han Chinese individual. We find the variation in number and content of minimal absent words between assemblies more significant for large and very large minimal absent words, where the biases of sequencing and assembly methodologies become more pronounced. Moreover, we find generally greater similarity between the human genome assemblies sequenced with capillary-based technologies (GRCh37 and HuRef) than between the human genome assemblies sequenced with massively parallel technologies (NA12878 and YH). Finally, as expected, we find the overall variation in number and content of minimal absent words within a species to be generally smaller than the variation between species.
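
For readers unfamiliar with the term, a minimal absent word is commonly defined as a word that does not occur in a sequence although all of its proper substrings do. The following is a minimal brute-force sketch of that definition, not the method used in the article, which is not described in the abstract. The function name `minimal_absent_words`, the `max_len` cutoff, and the default alphabet are illustrative assumptions, and exhaustive enumeration like this is only practical for short sequences and small word lengths.

```python
from itertools import product


def minimal_absent_words(seq, max_len, alphabet="ACGT"):
    """Brute-force sketch: minimal absent words of length 1..max_len in seq.

    A word is reported when it is absent from seq while both its longest
    proper prefix and its longest proper suffix are present. Every other
    proper substring is contained in one of those two, so this check is
    enough. All names here are hypothetical and for illustration only.
    """
    # Every substring of seq up to max_len characters, plus the empty word
    # (so that single letters missing from seq are handled uniformly).
    present = {""}
    for k in range(1, max_len + 1):
        for i in range(len(seq) - k + 1):
            present.add(seq[i:i + k])

    found = []
    for k in range(1, max_len + 1):
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            if (word not in present
                    and word[:-1] in present
                    and word[1:] in present):
                found.append(word)
    return found


if __name__ == "__main__":
    # Toy two-letter alphabet so the result can be checked by hand.
    print(minimal_absent_words("AACA", 3, alphabet="AC"))
```

Comparing the sets returned for two assemblies, for example with a set difference, would be one simple way to look at variation in content, again only as an illustration.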

3.
Gap junctions serve for direct intercellular communication by docking of two hemichannels in adjacent cells, thereby forming conduits between the cytoplasmic compartments of adjacent cells. Connexin genes code for subunit proteins of gap junction channels and are members of large gene families in mammals. So far, 17 connexin (Cx) genes have been described and characterized in the murine genome. For most of them, orthologues in the human genome have been found (see White and Paul 1999; Manthey et al. 1999; Teubner et al. 2001; Söhl et al. 2001). We have recently performed searches for connexin genes in murine and human gene libraries available at EMBL/Heidelberg, NCBI and the Celera company that have increased the number of identified connexins to 19 in mouse and 20 in humans. For one mouse connexin gene and two human connexin genes we did not find orthologues in the other genome. Here we present a short overview of the distinct connexin genes which we found in the mouse and human genomes and which may include all members of this gene family, provided no further connexin genes are discovered in the remaining non-sequenced parts (about 1-5%) of the genomes.

4.
Chudin E, Walker R, Kosaka A, Wu SX, Rabert D, Chang TK, Kreder DE 《Genome biology》2002,4(1):1-10

Background

The availability of both mouse and human draft genomes has marked the beginning of a new era of comparative mammalian genomics. The two available mouse genome assemblies, from the public mouse genome sequencing consortium and Celera Genomics, were obtained using different clone libraries and different assembly methods.

Results

We present here a critical comparison of the two latest mouse genome assemblies. The utility of the combined genomes is further demonstrated by comparing them with the human 'golden path' and through a subsequent analysis of a resulting conserved sequence element (CSE) database, which allows us to identify over 6,000 potential novel genes and to derive independent estimates of the number of human protein-coding genes.

Conclusion

The Celera and public mouse assemblies differ in about 10% of the mouse genome. Each assembly has advantages over the other: Celera has higher base-pair accuracy and overall higher coverage of the genome; the public assembly, however, has higher sequence quality in some newly finished bacterial artificial chromosome (BAC) clone regions, and its data are freely accessible. Perhaps most important, by combining both assemblies, we can get a better annotation of the human genome; in particular, we can obtain the most complete set of CSEs, one third of which are related to known genes, while some others are related to other functional genomic regions. More than half the CSEs are of unknown function. From the CSEs, we estimate the total number of human protein-coding genes to be about 40,000. This searchable, publicly available online CSEdb will expedite new discoveries through comparative genomics.

11.
We present a bacterial genome computational analysis pipeline, called GenVar. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants such as genes with disrupted reading frames (split genes) and those with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions probably caused by sequencing errors. We exemplify GenVar's capabilities by presenting results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species specific and hence provide valuable clues to the understanding of the genomic basis of Brucella pathogenicity and host specificity.

12.
Species trees have traditionally been inferred from a few selected markers, and genome-wide investigations remain largely restricted to model organisms or small groups of species for which sampling of fresh material is available, leaving out most of the existing and historical species diversity. The genomes of an increasing number of species, including specimens extracted from natural history collections, are being sequenced at low depth. While these data sets are widely used to analyse organelle genomes, the nuclear fraction is generally ignored. Here we evaluate different reference-based methods to infer phylogenies of large taxonomic groups from such data sets. Using the example of the Oleeae tribe, a worldwide-distributed group, we build phylogenies based on single nucleotide polymorphisms (SNPs) obtained using two reference genomes (the olive and ash trees). The inferred phylogenies are overall congruent, yet present differences that might reflect the effect of distance to the reference on the amount of missing data. To limit this issue, genome complexity was reduced by using pairs of orthologous coding sequences as the reference, thus allowing us to combine SNPs obtained using two distinct references. Concatenated and coalescence trees based on these combined SNPs suggest events of incomplete lineage sorting and/or hybridization during the diversification of this large phylogenetic group. Our results show that genome-wide phylogenetic trees can be inferred from low-depth sequence data sets for eukaryote groups with complex genomes, and histories of reticulate evolution. This opens new avenues for large-scale phylogenomics and biogeographical analyses covering both the extant and the historical diversity stored in museum collections.

13.
MOTIVATION: Since the simultaneous publication of the human genome assembly by the International Human Genome Sequencing Consortium (HGSC) and Celera Genomics, several comparisons have been made of various aspects of these two assemblies. In this work, we set out to provide a more comprehensive comparative analysis of the two assemblies and their associated gene sets. RESULTS: The local sequence content for both draft genome assemblies has been similar since the early releases; however, it took a year for the quality of the Celera assembly to approach that of HGSC, suggesting an advantage of HGSC's hierarchical shotgun (HS) sequencing strategy over Celera's whole genome shotgun (WGS) approach. While similar numbers of ab initio predicted genes can be derived from both assemblies, Celera's Otto approach consistently generated larger, more varied gene sets than the Ensembl gene build system. The presence of a non-overlapping gene set has persisted with successive data releases from both groups. Since most of the unique genes from either genome assembly could be mapped back to the other assembly, we conclude that the gene set discrepancies do not reflect differences in local sequence content but rather in the assemblies and especially the different gene-prediction methodologies.

14.
RefSeq and LocusLink: NCBI gene-centered resources   Cited by: 50 (self-citations: 4; citations by others: 46)

15.
A white spruce gene catalog for conifer genome analyses   Cited by: 1 (self-citations: 0; citations by others: 1)

16.
Recent sequencing of the human and other mammalian genomes has brought about the necessity to align them, to identify and characterize their commonalities and differences. Programs that align whole genomes generally use a seed-and-extend technique: they start from exact or near-exact matches, select a reliable subset of these, called anchors, and then fill in the remaining portions between the anchors using a combination of local and global alignment algorithms. Their parameter choices, however, have so far been primarily heuristic. We present a statistical framework and practical methods for selecting a set of matches that is both sensitive and specific and can constitute a reliable set of anchors for a one-to-one mapping of two genomes from which a whole-genome alignment can be built. Starting from exact matches, we introduce a novel per-base repeat annotation, the Z-score, from which noise and repeat filtering conditions are explored. Dynamic programming-based chaining algorithms are also evaluated as context-based filters. We apply the methods described here to the comparison of two progressive assemblies of the human genome, NCBI build 28 and build 34 (www.genome.ucsc.edu), and show that a significant portion of the two genomes can be found in selected exact matches, with a very limited amount of sequence duplication.
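
As a rough illustration of the seed-and-chain idea mentioned in this abstract, the sketch below finds exact k-mer matches between two sequences and then keeps the longest collinear, non-overlapping chain using a simple quadratic dynamic program. It is a toy under stated assumptions, not the authors' framework: the Z-score repeat annotation is not reproduced because the abstract does not define it, and the function names `exact_seeds` and `chain_seeds`, the fixed seed length `k`, and the scoring by seed count are all hypothetical choices.

```python
from collections import defaultdict


def exact_seeds(a, b, k):
    """Return sorted (i, j) pairs such that a[i:i+k] == b[j:j+k]."""
    # Index every k-mer of a by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            seeds.append((i, j))
    return sorted(seeds)


def chain_seeds(seeds, k):
    """Longest chain of collinear, non-overlapping seeds (O(n^2) DP).

    Seed s may precede seed t only if it ends before t starts in both
    sequences. Seeds that break collinearity, such as matches caused by
    repeats or rearrangements, are left out of the chain.
    """
    if not seeds:
        return []
    seeds = sorted(seeds)
    best = [1] * len(seeds)    # best[t]: length of best chain ending at t
    prev = [-1] * len(seeds)   # back-pointers for reconstructing the chain
    for t, (i, j) in enumerate(seeds):
        for s in range(t):
            pi, pj = seeds[s]
            if pi + k <= i and pj + k <= j and best[s] + 1 > best[t]:
                best[t] = best[s] + 1
                prev[t] = s
    t = max(range(len(seeds)), key=best.__getitem__)
    chain = []
    while t != -1:
        chain.append(seeds[t])
        t = prev[t]
    return chain[::-1]


if __name__ == "__main__":
    seeds = exact_seeds("ACGTTTGACC", "ACGTAAGACC", 4)
    print(chain_seeds(seeds, 4))
```

The chained seeds play the role of anchors here; a real aligner would fill the gaps between them with local or global alignment and would use a faster chaining algorithm than this quadratic loop.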

17.
Liu Y, Li J 《Current microbiology》2011,62(3):770-776
Our understanding of the interaction between bacteria and humans is still incomplete. The recent availability of many microbial genomes and the human genome, together with the basic local alignment search tool (BLAST) family of programs, offers a new perspective on this interaction. This study set out to determine whether sequence identity exists between bacterial genomes and the human genome, and to explain any such identity in terms of bacteriophages and other mobile genetic elements. BLAST searches of the genomes of bacteria, bacteriophages, and plasmids against the human genome were performed using the resources of the National Center for Biotechnology Information (NCBI). All bacteria studied contained variable numbers of short regions of sequence identity to the human genome, ranging from 27 to 84 nt in length. These regions were found at multiple sites within the human genome. Based on these findings, we develop the hypothesis that viruses, especially bacteriophages, might play a significant role in shaping both bacterial and human genomes and contribute to the short regions of sequence identity.

18.
A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37), a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions.

19.
The human 13q32-q33 region has been linked to both bipolar disorder and schizophrenia. Before completion of the draft sequences, we developed an approximately 15-Mb comprehensive map for the region extending from D13S1300 to ATA35H12. This map was assembled using publicly available mapping data and sequence-tagged site (STS)-based PCR confirmation. We then compared this map with the NCBI, Celera Genomics, and UCSC Golden Path data in February, June, and September 2001. All data sets showed gaps, misassignment of STSs, and errors in orientation and marker order. Surprisingly, the completed sequences of many bacterial artificial chromosomes (BACs) had been truncated. Of 21 gaps that were detected, 4 were present in both the NCBI and Celera databases. All gaps could be filled using 1-2 BAC clones. A total of 39 loci mapped to additional sites within the human genome, providing evidence of segmental duplications. Additionally, 61 unique cDNA clones were sequenced to increase available transcribed sequence, and 11,353 reference single-nucleotide polymorphisms (SNPs) with an average density of 1 SNP/3720 bases were identified. Overall, integration of the data from multiple sources is still needed for complete assembly of the 13q32-q33 region.
