Similar literature
1.
A report on the genomics workshop 'Identification of Functional Elements in Mammalian Genomes', Cold Spring Harbor, New York, 11-13 November 2004.

2.
MOTIVATION: The rapid accumulation of microarray datasets provides unique opportunities to perform systematic functional characterization of the human genome. We designed a graph-based approach to integrate cross-platform microarray data and extract recurrent expression patterns. A series of microarray datasets can be modeled as a series of co-expression networks, in which we search for frequently occurring network patterns. The integrative approach provides three major advantages over commonly used microarray analysis methods: (1) it enhances signal-to-noise separation, (2) it identifies functionally related genes that are not co-expressed, and (3) it provides a way to predict gene functions in a context-specific way. RESULTS: We integrate 65 human microarray datasets, comprising 1105 experiments and over 11 million expression measurements. We develop a data mining procedure based on frequent itemset mining and biclustering to systematically discover network patterns that recur in at least five datasets. This resulted in 143,401 potential functional modules. Subsequently, we design a network topology statistic based on graph random walk that effectively captures characteristics of a gene's local functional environment. Function annotations based on this statistic are then assessed using the random forest method, combining six other attributes of the network modules. We assign 1126 functions to 895 genes, 779 known and 116 unknown, with a validation accuracy of 70%. Among our assignments, 20% of genes are assigned multiple functions based on different network environments. AVAILABILITY: http://zhoulab.usc.edu/ContextAnnotation.
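As a rough illustration of searching for network patterns that recur across datasets, the sketch below counts how many co-expression networks contain each gene-gene edge and keeps the edges seen in at least a minimum number of networks. It is a deliberately simplified stand-in (single-edge support counting), not the frequent itemset mining and biclustering procedure of the abstract; the function name `recurrent_edges`, the parameter `min_support`, and the toy networks are all hypothetical.

```python
from collections import Counter


def recurrent_edges(networks, min_support):
    """Return {edge: count} for edges present in at least min_support networks.

    networks: iterable of edge collections, one per dataset; each edge is a
    pair of gene names and is treated as undirected.
    """
    counts = Counter()
    for edges in networks:
        # Sort each pair so ("B", "A") and ("A", "B") are the same edge,
        # and use a set so an edge counts at most once per network.
        normalized = {tuple(sorted(edge)) for edge in edges}
        counts.update(normalized)
    return {edge: n for edge, n in counts.items() if n >= min_support}


# Toy co-expression networks (hypothetical genes A-D), one set per dataset.
toy_networks = [
    {("A", "B"), ("B", "C")},
    {("B", "A"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]
toy_recurrent = recurrent_edges(toy_networks, min_support=2)
```

With the abstract's setting, `min_support` would be 5; the toy call uses 2 only because it has three networks.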

3.
New techniques for physical mapping of the human genome. Total citations: 2 (self-citations: 0; cited by others: 2)
We describe improvements in techniques and strategies used for making maps of the human genome. The methods currently used are changing and evolving rapidly. Today's techniques can produce ordered arrays of DNA fragments and overlapping sets of DNA clones covering extensive genomic regions, but they are relatively slow and tedious. Methods under development will speed the process considerably. New developments include a range of applications of the polymerase chain reaction, enhanced procedures for high resolution in situ hybridization, and improved methods for generating, manipulating, and cloning large DNA fragments. More detailed genetic and physical maps will be useful for finding genes, including those associated with human diseases, long before the complete DNA sequence of the human genome is available.

4.
Gap junctions are clustered channels between contacting cells through which direct intercellular communication via diffusion of ions and metabolites can occur. Two hemichannels, each built up of six connexin protein subunits in the plasma membrane of adjacent cells, can dock to each other to form conduits between cells. We have recently screened mouse and human genomic databases and have found 19 connexin (Cx) genes in the mouse genome and 20 connexin genes in the human genome. One mouse connexin gene and two human connexin genes do not appear to have orthologs in the other genome. With three exceptions, the characterized connexin genes comprise two exons, with the complete reading frame located on the second exon. Targeted ablation of eleven mouse connexin genes revealed basic insights into the functional diversity of the connexin gene family. In addition, the phenotypes of human genetic disorders caused by mutated connexin genes further complement our understanding of connexin functions in the human organism. In this review we compare currently identified connexin genes in both the mouse and human genome and discuss the functions of gap junctions deduced from targeted mouse mutants and human genetic disorders.

5.

Background

Predicting the functional impact of amino acid substitutions (AAS) caused by nonsynonymous single nucleotide polymorphisms (nsSNPs) is becoming increasingly important as more and more novel variants are discovered. Bioinformatics analysis is essential for predicting which AAS are potentially causal or contributing to human diseases and merit further analysis, because each genome contains thousands of rare or private AAS, only a very small number of which are related to an underlying disease. Existing algorithms in this field still have a high false prediction rate, and novel development is needed to take full advantage of the vast amount of genomic data.

Results

Here we report a novel algorithm that features two innovative changes: (1) making better use of sequence conservation information by grouping the homologous protein sequences into six blocks according to their evolutionary distance to human and evaluating sequence conservation in each block independently, and (2) including as many such homologous sequences as possible in the analyses. Random forests are used to evaluate sequence conservation in each block and to predict the potential impact of an AAS on protein function. Testing of this algorithm on a comprehensive dataset showed a significant improvement in prediction accuracy over currently widely used programs. The algorithm and a web-based application tool implementing it, EFIN (Evaluation of Functional Impact of Nonsynonymous SNPs), were made freely available to the public (http://paed.hku.hk/efin/).

Conclusions

Grouping homologous sequences into different blocks according to the evolutionary distance of the species to human and evaluating sequence conservation in each group independently significantly improved prediction accuracy. This approach may help us better understand the roles of genetic variants in human disease and health.
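The block-wise idea can be sketched as follows: homologous sequences are binned by their evolutionary distance to human, and conservation of the human residue is scored separately in each bin. This is only a minimal sketch under stated assumptions. The distance boundaries, the function names, and the use of a simple match fraction as the conservation score are hypothetical; the abstract states that random forests are used for this evaluation, which the sketch does not reproduce.

```python
from bisect import bisect_right


def assign_block(distance, boundaries):
    """Map an evolutionary distance to a block index (0 = closest to human).

    boundaries: ascending cut points; k cut points define k + 1 blocks.
    """
    return bisect_right(boundaries, distance)


def per_block_conservation(human_residue, homologs, boundaries):
    """Fraction of homologs per block that carry the human residue.

    homologs: list of (distance_to_human, residue_at_position) tuples.
    Returns one value per block, or None for a block with no sequences.
    """
    n_blocks = len(boundaries) + 1
    matches = [0] * n_blocks
    totals = [0] * n_blocks
    for distance, residue in homologs:
        block = assign_block(distance, boundaries)
        totals[block] += 1
        if residue == human_residue:
            matches[block] += 1
    return [m / t if t else None for m, t in zip(matches, totals)]


# Five hypothetical cut points give six blocks, as in the abstract.
toy_boundaries = [0.1, 0.2, 0.4, 0.8, 1.6]
toy_homologs = [(0.05, "R"), (0.08, "R"), (0.3, "R"), (0.35, "K"),
                (1.0, "K"), (2.0, "Q")]
toy_scores = per_block_conservation("R", toy_homologs, toy_boundaries)
```

Scoring each block separately keeps the signal from close relatives, which are nearly always conserved, from swamping the more informative signal from distant species.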

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-455) contains supplementary material, which is available to authorized users.

6.
7.
Availability of the human genome sequence and high similarity between humans and pigs at the molecular level provide an opportunity to use a comparative mapping approach to piggy-BAC the human genome. In order to advance the pig genome sequencing initiative, sequence similarity between large-scale porcine BAC-end sequences (BESs) and human genome sequence was used to construct a comparatively-anchored porcine physical map that is a first step towards sequencing the pig genome. A total of 50,300 porcine BAC clones were end-sequenced, yielding 76,906 BESs after trimming, with an average read length of 538 bp. To anchor the porcine BACs on the human genome, these BESs were subjected to BLAST analysis using the human draft sequence, revealing 31.5% significant hits (E < e(-5)). Both genic and non-genic regions of homology contributed to the alignments between the human and porcine genomes. Porcine BESs with unique homology matches within the human genome provided a source of markers spaced approximately 70 to 300 kb along each human chromosome. In order to evaluate the utility of piggy-BACing human genome sequences, and to confirm predictions of orthology, 193 evenly spaced BESs with similarity to HSA3 and HSA21 were selected and then used to develop a high-resolution (1.22 Mb) comparative radiation hybrid map of SSC13, which represents a fusion of HSA3 and HSA21. The resulting RH mapping of SSC13 covers 99% and 97% of HSA3 and HSA21, respectively. Seven evolutionarily conserved blocks were identified, including six on HSA3 and a single syntenic block corresponding to HSA21. The strategy of piggy-BACing the human genome described in this study demonstrates that a high-resolution anchored physical map of the pig genome can be constructed through a directed, targeted comparative genomics approach. This map supports the selection of BACs to construct a minimal tiling path for genome sequencing and targeted gap filling. Moreover, this approach is highly relevant to other genome sequencing projects.
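The anchoring step described above (keep significant BLAST hits, then use only BESs with a unique match as markers) can be sketched as a simple filter. This is an illustrative sketch, not the authors' pipeline: the tuple layout, the function name `unique_anchors`, and the toy hits are hypothetical, and the cutoff of 1e-5 is an assumed reading of the abstract's "E < e(-5)".

```python
from collections import defaultdict


def unique_anchors(hits, max_evalue=1e-5):
    """Return {query: (chromosome, position)} for queries with exactly one
    hit whose E-value is below max_evalue.

    hits: iterable of (query_id, chromosome, position, evalue) tuples.
    """
    significant = defaultdict(list)
    for query, chromosome, position, evalue in hits:
        if evalue < max_evalue:
            significant[query].append((chromosome, position))
    # A query matching several places is ambiguous and cannot serve as a marker.
    return {q: locs[0] for q, locs in significant.items() if len(locs) == 1}


# Toy hits: bes1 is unique, bes2 matches two places, bes3 is not significant.
toy_hits = [
    ("bes1", "HSA3", 1000, 1e-20),
    ("bes2", "HSA3", 5000, 1e-8),
    ("bes2", "HSA21", 700, 1e-7),
    ("bes3", "HSA21", 900, 1e-3),
]
toy_anchors = unique_anchors(toy_hits)
```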

8.
9.
Availability of the human genome data has enabled the exploration of a huge amount of biological information encoded in it. There are extensive ongoing experimental efforts to understand the biological functions of the gene products encoded in the human genome. However, computational analysis can aid immensely in the interpretation of biological function by associating known functional/structural domains to the human proteins. In this article we have discussed the implications of such associations. The association of structural domains to human proteins could help in prioritizing the targets for structure determination in the structural genomics initiatives. The protein kinase family is one of the most frequently occurring protein domain families in the human proteome while P-loop hydrolase, which comprises many GTPases and ATPases, is a highly represented superfamily. Using the superfamily relationships between families of unknown and known structures we could increase structural information content of the human genome by about 5%. We could also make new associations of domain families to 33 human proteins that are potentially linked to genetically inherited diseases.

10.
11.

Background  

In order to take full advantage of the newly available public human genome sequence data and associated annotations, biologists require visualization tools ("genome browsers") that can accommodate the high frequency of alternative splicing in human genes and other complexities.

12.
13.
14.
Nefedova LN, Kim AI. Genetika, 2007, 43(5): 620-632
The structure was analyzed for 60 annotated copies of the mobile genetic element (MGE) HB from the Drosophila melanogaster genome. The genomic distribution of HB copies was studied, and preferential insertion sites (hot spots) were identified, which presumably span several kilobases. Structural analysis of the open reading frame (ORF) and terminal repeats of HB was performed. All 26 HB copies retaining the ORF sequence have a stop codon in the same position. Consequently, the HB ORF was shown to code for an enzyme of 148 amino acid residues, relatively small for Tc1-family transposases. The ORF consensus sequence was established. HB{}1185 was identified as the only HB copy potentially coding for a functional protein. All 37 repeat-containing HB copies were analyzed. Of these, only four had functional terminal sequences; these four, however, lacked a functional transposase gene. A new 7762-bp copy of MGE roo was found in the D. melanogaster genome; the copy was previously unavailable from databases and represents an insert in the HB{}1605 sequence.

15.

Background

Cattle are important agriculturally and relevant as a model organism. Previously described genetic and radiation hybrid (RH) maps of the bovine genome have been used to identify genomic regions and genes affecting specific traits. Application of these maps to identify influential genetic polymorphisms will be enhanced by integration with each other and with bacterial artificial chromosome (BAC) libraries. The BAC libraries and clone maps are essential for the hybrid clone-by-clone/whole-genome shotgun sequencing approach taken by the bovine genome sequencing project.

Results

A bovine BAC map was constructed with HindIII restriction digest fragments of 290,797 BAC clones from animals of three different breeds. Comparative mapping of 422,522 BAC end sequences assisted with BAC map ordering and assembly. Genotypes and pedigree from two genetic maps and marker scores from three whole-genome RH panels were consolidated on a 17,254-marker composite map. Sequence similarity allowed integrating the BAC and composite maps with the bovine draft assembly (Btau3.1), establishing a comprehensive resource describing the bovine genome. Agreement between the marker and BAC maps and the draft assembly is high, although discrepancies exist. The composite and BAC maps are more similar than either is to the draft assembly.

Conclusion

Further refinement of the maps and greater integration into the genome assembly process may contribute to a high quality assembly. The maps provide resources to associate phenotypic variation with underlying genomic variation, and are crucial resources for understanding the biology underpinning this important ruminant species so closely associated with humans.

16.
17.
SNPing in the human genome. Total citations: 4 (self-citations: 0; cited by others: 4)
More than a million genetic markers in the form of single nucleotide polymorphisms are now available for use in genotype-phenotype studies in humans. The application of new strategies for representational cloning and sequencing from genomes combined with the mining of high-quality sequence variations in clone overlaps of genomic and/or cDNA sequences has played an important role in generating this new resource. The focus of variation analysis is now shifting from the identification of new markers to their typing in populations, and novel typing strategies are rapidly emerging. Assay readouts on oligonucleotide arrays, in microtiter plates, gels, flow cytometers and mass spectrometers have all been developed, but decreasing cost and increasing throughput of DNA typing remain key if high-density genetic maps are to be applied on a large scale.

18.
We have devised a strategy (called recombinase-mediated genomic replacement, RMGR) to allow the replacement of large segments (>100 kb) of the mouse genome with the equivalent human syntenic region. The technique involves modifying a mouse ES cell chromosome and a human BAC by inserting heterotypic lox sites to flank the proposed exchange interval and then using Cre recombinase to achieve segmental exchange. We have demonstrated the feasibility of this approach by replacing the mouse alpha globin regulatory domain with the human syntenic region and generating homozygous mice that produce only human alpha globin chains. Furthermore, modified ES cells can be used iteratively for functional studies, and here, as an example, we have used RMGR to produce an accurate mouse model of human alpha thalassemia. RMGR has general applicability and will overcome limitations inherent in current transgenic technology when studying the expression of human genes and modeling human genetic diseases.

19.
The open reading frames of human cytomegalovirus (human herpesvirus-5, HHV5) encode some 213 unique proteins with mostly unknown functions. Using the threading program, ProCeryon, we calculated possible matches between the amino acid sequences of these proteins and the Protein Data Bank library of three-dimensional structures. Thirty-six proteins were fully identified in terms of their structure and, often, function; 65 proteins were recognized as members of narrow structural/functional families (e.g. DNA-binding factors, cytokines, enzymes, signaling particles, cell surface receptors etc.); and 87 proteins were assigned to broad structural classes (e.g. all-beta, 3-layer-alphabetaalpha, multidomain, etc.). Genes encoding proteins with similar folds, or containing identical structural traits (extreme sequence length, runs of unstructured (Pro and/or Gly-rich) residues, transmembrane segments, etc.) often formed tandem clusters throughout the genome. In the course of this work, benchmarks on about 20 known folds were used to optimize adjustable parameters of threading calculations, i.e. gap penalty weights used in sequence/structure alignments; new scores obtained as simple combinations of existing scoring functions; and number of threading runs conducive to meaningful results. An introduction of summed, per-residue-normalized scores has been essential for discovery of subdomains (EGF-like, SH2, SH3) in longer protein sequences, such as the eight "open sandwich" cytokine domains, 60-70 amino acids long and having the 3beta1alpha fold with one or two disulfide bridges, present in otherwise unrelated proteins.

20.
A physical map is presented for the 1200 kb genome of Mycoplasma mycoides subsp. mycoides Y, locating 32 cleavage sites for 8 restriction endonucleases. The large restriction fragments involved were separated and sized by pulsed-field agarose gel electrophoresis. Their locations on the map were determined by probing Southern blots of digests with individual fragments isolated from other digests and by correlating the products of double and triple digestions. Loci for 2 ribosomal RNA operons and 2 tRNA operons have been determined by probing with cloned genes and the broad regions of the replication origin and terminus have also been outlined by in vivo labelling studies.
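Correlating single and double digests amounts to checking that a proposed set of cut sites predicts the observed fragment sizes. The sketch below computes the predicted fragments for a circular map; it is an illustration only. The circular topology, the function name `digest_fragments`, and the cut positions are assumptions, with only the 1200 kb genome size taken from the abstract.

```python
def digest_fragments(cut_sites, genome_size):
    """Predicted fragment sizes (largest first) for a circular genome.

    cut_sites: positions (same unit as genome_size) where enzymes cut.
    With no cut sites the circle stays intact and is reported as one
    fragment of length genome_size.
    """
    sites = sorted(set(cut_sites))
    if not sites:
        return [genome_size]
    # Fragments between consecutive sites...
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # ...plus the fragment that wraps around the origin of the circle.
    fragments.append(genome_size - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)


GENOME_KB = 1200           # genome size from the abstract
enzyme_a = [100, 500]      # hypothetical cut positions in kb
enzyme_b = [800]           # hypothetical cut position in kb

single_a = digest_fragments(enzyme_a, GENOME_KB)
single_b = digest_fragments(enzyme_b, GENOME_KB)
double_ab = digest_fragments(enzyme_a + enzyme_b, GENOME_KB)
```

In each digest the fragment sizes sum to the genome size, and the double digest splits the 800 kb fragment of enzyme A into 500 kb and 300 kb pieces, which is what places the enzyme B site relative to the A sites.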
