
1.

An efficient algorithm for minimal primer set selection

Hsieh MH Hsu WC Chiu SK Tzeng CM 《Bioinformatics (Oxford, England)》2003,19(2):285-286

SUMMARY: We have developed U-PRIMER, a primer design program, to compute a minimal primer set (MPS) for any given set of DNA sequences. The U-PRIMER algorithm uses automatic variable fixing and automatic redundant constraint elimination to tackle the binary integer programming problem associated with the MPS selection problem. The program has been tested successfully with 32 adipocyte development-related genes and 9 TB-specific genes to obtain their respective MPSs. AVAILABILITY: A free copy of U-PRIMER, implemented in the C++ programming language, is available from http://www.u-vision-biotech.com
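The summary above names two preprocessing ideas, variable fixing and redundant constraint elimination, without giving details. The following is a minimal sketch of how such reductions usually look for a set-cover style binary integer program; it is not U-PRIMER's actual code. The function names, the `covers` input format, and the final exhaustive search are all assumptions made for illustration.

```python
from itertools import combinations

def reduce_constraints(constraints):
    """Apply two standard set-cover reductions until nothing changes.

    constraints maps each sequence to the frozenset of primers that cover it
    (one ">= 1" covering constraint per sequence). Hypothetical helper.
    """
    chosen = set()
    constraints = dict(constraints)
    while True:
        # Variable fixing: a sequence covered by a single primer forces that primer.
        forced = {next(iter(ps)) for ps in constraints.values() if len(ps) == 1}
        if forced:
            chosen |= forced
            # Constraints already satisfied by a forced primer can be dropped.
            constraints = {s: ps for s, ps in constraints.items() if not ps & forced}
            continue
        # Redundant constraint elimination: if the primers covering s are a
        # subset of those covering t, satisfying s always satisfies t.
        names = sorted(constraints)
        redundant = set()
        for i, s in enumerate(names):
            if s in redundant:
                continue
            for j, t in enumerate(names):
                if i == j or t in redundant:
                    continue
                ps, pt = constraints[s], constraints[t]
                if ps < pt or (ps == pt and i < j):
                    redundant.add(t)
        if not redundant:
            return chosen, constraints
        constraints = {s: ps for s, ps in constraints.items() if s not in redundant}

def minimal_primer_set(covers):
    """covers maps each candidate primer to the set of sequences it amplifies."""
    sequences = set().union(*covers.values())
    constraints = {s: frozenset(p for p in covers if s in covers[p]) for s in sequences}
    chosen, rest = reduce_constraints(constraints)
    # Solve the (hopefully small) remaining problem by exhaustive search.
    candidates = sorted(set().union(*rest.values()))
    for k in range(len(candidates) + 1):
        for combo in combinations(candidates, k):
            picked = set(combo)
            if all(ps & picked for ps in rest.values()):
                return chosen | picked

if __name__ == "__main__":
    example = {"p1": {"g1", "g2"}, "p2": {"g2", "g3"}, "p3": {"g3", "g4"},
               "p4": {"g4"}, "p5": {"g5"}}
    print(sorted(minimal_primer_set(example)))
```

Both reductions preserve optimality, so the exhaustive step only has to handle whatever the reductions leave behind; a production solver would replace that step with branch and bound.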

2.

An efficient comprehensive search algorithm for tagSNP selection using linkage disequilibrium criteria Total citations: 1; self-citations: 0; citations by others: 1

Qin ZS Gopalakrishnan S Abecasis GR 《Bioinformatics (Oxford, England)》2006,22(2):220-225

MOTIVATION: Selecting SNP markers for genome-wide association studies is an important and challenging task. The goal is to minimize the number of markers selected for genotyping in a particular platform and therefore reduce genotyping cost while simultaneously maximizing the information content provided by selected markers. RESULTS: We devised an improved algorithm for tagSNP selection using the pairwise r² criterion. We first break down large marker sets into disjoint pieces, where more exhaustive searches can replace the greedy algorithm for tagSNP selection. These exhaustive searches lead to smaller tagSNP sets being generated. In addition, our method evaluates multiple solutions that are equivalent according to the linkage disequilibrium criteria to accommodate additional constraints. Its performance was assessed using HapMap data. AVAILABILITY: A computer program named FESTA has been developed based on this algorithm. The program is freely available and can be downloaded at http://www.sph.umich.edu/csg/qin/FESTA/
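The idea in the abstract above (split the markers into disjoint pieces, then search small pieces exhaustively instead of greedily) can be sketched as follows. This is an illustrative sketch, not FESTA's implementation: the function names, the 0.8 threshold default, and the `max_exhaustive` cutoff are assumptions, and the input is assumed to be a precomputed symmetric matrix of pairwise r² values.

```python
from itertools import combinations

def components(n, linked):
    """Split markers 0..n-1 into disjoint blocks connected by high-r2 links."""
    seen, blocks = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, block = [start], []
        while stack:
            node = stack.pop()
            block.append(node)
            for other in linked[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        blocks.append(sorted(block))
    return blocks

def greedy_tags(block, linked):
    """Repeatedly pick the marker that tags the most still-untagged markers."""
    uncovered, tags = set(block), []
    while uncovered:
        best = max(block, key=lambda s: len(linked[s] & uncovered))
        tags.append(best)
        uncovered -= linked[best]
    return tags

def exhaustive_tags(block, linked):
    """Try tag sets of increasing size; the first that covers the block is minimal."""
    target = set(block)
    for k in range(1, len(block) + 1):
        for combo in combinations(block, k):
            if set().union(*(linked[t] for t in combo)) >= target:
                return list(combo)

def select_tags(r2, threshold=0.8, max_exhaustive=12):
    """r2 is a symmetric matrix of pairwise r2 values (assumed precomputed)."""
    n = len(r2)
    linked = [{j for j in range(n) if j == i or r2[i][j] >= threshold} for i in range(n)]
    tags = []
    for block in components(n, linked):
        search = exhaustive_tags if len(block) <= max_exhaustive else greedy_tags
        tags.extend(search(block, linked))
    return sorted(tags)
```

The exhaustive search is exponential in the block size, which is why it is only applied to blocks at or below the cutoff; larger blocks fall back to the greedy rule.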

3.

HLA and mate selection

Leon T. Rosenberg Debra Cooperman Rose Payne 《Immunogenetics》1983,17(1):89-93

HLA types of the partners in 1017 couples were determined. Like types appeared to occur together more frequently than predicted by chance, and the excess was statistically significant. The existence of ethnic or racial groups with characteristically different frequencies of the HLA types might explain the result.

4.

Theories of mate selection

《Biodemography and social biology》2013,59(2):71-84


5.

Sexual selection does not equal mate selection

Louis Levine 《Animal behaviour》1985,33(4):1363-1364


6.

An efficient algorithm for the additive kinship matrix

Backus V Gilpin M 《The Journal of heredity》2002,93(6):453-456


7.

Sexual selection and mate choice Total citations: 1; self-citations: 0; citations by others: 1

Andersson M Simmons LW 《Trends in ecology & evolution》2006,21(6):296-302

The past two decades have seen extensive growth of sexual selection research. Theoretical and empirical work has clarified many components of pre- and postcopulatory sexual selection, such as aggressive competition, mate choice, sperm utilization and sexual conflict. Genetic mechanisms of mate choice evolution have been less amenable to empirical testing, but molecular genetic analyses can now be used for incisive experimentation. Here, we highlight some of the currently debated areas in pre- and postcopulatory sexual selection. We identify where new techniques can help estimate the relative roles of the various selection mechanisms that might work together in the evolution of mating preferences and attractive traits, and in sperm-egg interactions.

8.

An algorithm for selection of functional siRNA sequences Total citations: 33; self-citations: 0; citations by others: 33

Amarzguioui M Prydz H 《Biochemical and biophysical research communications》2004,316(4):1050-1058

Randomly designed siRNA targeting different positions within the same mRNA display widely differing activities. We have performed a statistical analysis of 46 siRNA, identifying various features of the 19bp duplex that correlate significantly with functionality at the 70% knockdown level and verified these results against an independent data set of 34 siRNA recently reported by others. Features that consistently correlated positively with functionality across the two data sets included an asymmetry in the stability of the duplex ends (measured as the A/U differential of the three terminal basepairs at either end of the duplex) and the motifs S1, A6, and W19. The presence of the motifs U1 or G19 was associated with lack of functionality. A selection algorithm based on these findings strongly differentiated between the two functional groups of siRNA in both data sets and proved highly effective when used to design siRNA targeting new endogenous human genes.

9.

A simple and efficient algorithm for gene selection using sparse logistic regression Total citations: 4; self-citations: 0; citations by others: 4

Shevade SK Keerthi SS 《Bioinformatics (Oxford, England)》2003,19(17):2246-2253

MOTIVATION: This paper gives a new and efficient algorithm for the sparse logistic regression problem. The proposed algorithm is based on the Gauss-Seidel method and is asymptotically convergent. It is simple and extremely easy to implement; it neither uses any sophisticated mathematical programming software nor needs any matrix operations. It can be applied to a variety of real-world problems like identifying marker genes and building a classifier in the context of cancer diagnosis using microarray data. RESULTS: The gene selection method suggested in this paper is demonstrated on two real-world data sets and the results were found to be consistent with the literature. AVAILABILITY: The implementation of this algorithm is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml Supplementary Information: Supplementary material is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml

10.

An efficient algorithm for finding short approximate non-tandem repeats

Adebiyi EF Jiang T Kaufmann M 《Bioinformatics (Oxford, England)》2001,17(Z1):S5-S12

We study the problem of approximate non-tandem repeat extraction. Given a long subject string S of length N over a finite alphabet Sigma and a threshold D, we would like to find all short substrings of S of length P that repeat with at most D differences, i.e., insertions, deletions, and mismatches. We give a careful theoretical characterization of the set of seeds (i.e., some maximal exact repeats) required by the algorithm, and prove a sublinear bound on their expected numbers. Using this result, we present a sub-quadratic algorithm for finding all short (i.e., of length O(log N)) approximate repeats. The running time of our algorithm is O(D N^(3 pow(epsilon) - 1) log N), where epsilon = D/P and pow(epsilon) is an increasing, concave function that is 0 when epsilon = 0 and about 0.9 for DNA and protein sequences.

11.

An efficient algorithm for DNA fragment assembly in MapReduce

Baomin Xu Jin Gao Chunyan Li 《Biochemical and biophysical research communications》2012,426(3):395-398

Fragment assembly is one of the most important problems of sequence assembly. Algorithms for DNA fragment assembly using de Bruijn graphs have been widely used. These algorithms require a large amount of memory and running time to build the de Bruijn graph. Another drawback of the conventional de Bruijn approach is the loss of information. To overcome these shortcomings, this paper proposes a parallel strategy to construct the de Bruijn graph. Its main characteristic is that it avoids dividing the de Bruijn graph. A novel fragment assembly algorithm based on our parallel strategy is implemented in the MapReduce framework. The experimental results show that the parallel strategy can effectively improve the computational efficiency and remove the memory limitations of the assembly algorithm based on Euler superpath. This paper represents a useful step toward assembling large-scale genome sequences using cloud computing.
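For readers unfamiliar with how a de Bruijn graph fits the MapReduce model, here is a minimal single-process sketch of the generic map, shuffle, and reduce phases. It is not the parallel strategy proposed in the paper, whose details the abstract does not give; the function names and the in-memory `shuffle` step are assumptions that stand in for what a real MapReduce framework would provide.

```python
from collections import defaultdict

def map_read(read, k):
    """Map phase: emit one (prefix, suffix) edge for every k-mer in a read."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield kmer[:-1], kmer[1:]

def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_node(node, successors):
    """Reduce phase: merge duplicate edges into one adjacency list per node."""
    return node, sorted(set(successors))

def build_de_bruijn(reads, k):
    """Build the graph whose nodes are (k-1)-mers and whose edges are k-mers."""
    pairs = (pair for read in reads for pair in map_read(read, k))
    return dict(reduce_node(node, succ) for node, succ in shuffle(pairs).items())

if __name__ == "__main__":
    print(build_de_bruijn(["ACGT", "CGTA"], 3))
```

Because each read is mapped independently and each node is reduced independently, both phases can be distributed across workers; the memory cost then sits in the shuffle, which is the part a real framework spreads over the cluster.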

12.

An algorithm for efficient identification of branched metabolic pathways

Heath AP Bennett GN Kavraki LE 《Journal of computational biology》2011,18(11):1575-1597

This article presents a new graph-based algorithm for identifying branched metabolic pathways in multi-genome scale metabolic data. The term branched is used to refer to metabolic pathways between compounds that consist of multiple pathways that interact biochemically. A branched pathway may produce a target compound through a combination of linear pathways that split compounds into smaller ones, work in parallel with many compounds, and join compounds into larger ones. While branched metabolic pathways predominate in metabolic networks, most previous work has focused on identifying linear metabolic pathways. The ability to automatically identify branched pathways is important in applications that require a deeper understanding of metabolism, such as metabolic engineering and drug target identification. The algorithm presented in this article utilizes explicit atom tracking to identify linear metabolic pathways and then merges them together into branched metabolic pathways. We provide results on several well-characterized metabolic pathways that demonstrate that the new merging approach can efficiently find biologically relevant branched metabolic pathways.

13.

An efficient algorithm for large-scale detection of protein families Total citations: 6; self-citations: 0; citations by others: 6

Enright AJ Van Dongen S Ouzounis CA 《Nucleic acids research》2002,30(7):1575-1584

Detection of protein families in large databases is one of the principal research objectives in structural and functional genomics. Protein family classification can significantly contribute to the delineation of functional diversity of homologous proteins, the prediction of function based on domain architecture or the presence of sequence motifs as well as comparative genomics, providing valuable evolutionary insights. We present a novel approach called TRIBE-MCL for rapid and accurate clustering of protein sequences into families. The method relies on the Markov cluster (MCL) algorithm for the assignment of proteins into families based on precomputed sequence similarity information. This novel approach does not suffer from the problems that normally hinder other protein sequence clustering algorithms, such as the presence of multi-domain proteins, promiscuous domains and fragmented proteins. The method has been rigorously tested and validated on a number of very large databases, including SwissProt, InterPro, SCOP and the draft human genome. Our results indicate that the method is ideally suited to the rapid and accurate detection of protein families on a large scale. The method has been used to detect and categorise protein families within the draft human genome and the resulting families have been used to annotate a large proportion of human proteins.
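The Markov cluster (MCL) algorithm mentioned above alternates two matrix operations, usually called expansion and inflation, on a column-stochastic similarity matrix. The sketch below shows that core loop in plain Python on a dense matrix. It is only an illustration and not the TRIBE-MCL software: the function names, the inflation value of 2.0, the fixed iteration count, the added self-loops, and the 1e-6 cutoff for reading out clusters are all assumptions.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 so the matrix is column-stochastic."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def expand(m):
    """Expansion: square the matrix, i.e. take two steps of the random walk."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation: raise entries to the power r and renormalize each column,
    which strengthens strong links and weakens weak ones."""
    return normalize_columns([[x ** r for x in row] for row in m])

def mcl(similarity, inflation=2.0, iterations=50):
    """Cluster items from a symmetric, non-negative similarity matrix."""
    n = len(similarity)
    # Self-loops keep every column sum positive before normalization.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row that still carries weight lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

if __name__ == "__main__":
    # Two hypothetical families of three proteins each, with no cross-family hits.
    sim = [[0.0] * 6 for _ in range(6)]
    for family in ([0, 1, 2], [3, 4, 5]):
        for i in family:
            for j in family:
                if i != j:
                    sim[i][j] = 1.0
    print(mcl(sim))
```

Raising the inflation parameter generally produces finer clusters. A practical implementation would use sparse matrices and prune small entries, since dense squaring costs cubic time in the number of proteins.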

14.

An efficient Z-score algorithm for assessing sequence alignments

Hilary S Booth John H Maindonald Susan R Wilson Jill E Gready 《Journal of computational biology》2004,11(4):616-625

We describe an alternative method for scoring of the pairwise alignment of two biological sequences. Designed to overcome the bias due to the composition of the alignment, it measures the distance (in standard deviations) between the given alignment and the mean value of all other alignments that can be obtained by a permutation of either sequence. We demonstrate that the standard deviation can be calculated efficiently. By concentrating upon the ungapped case, the mean and standard deviation can be calculated exactly and in two steps, the first being O(N) time, where N is the length of the sequence, the second in a fixed number of calculations, i.e., in O(1) time. We argue that this statistic is a more consistent measure than a similarity score based upon a standard scoring matrix. Even in the ungapped case, the statistic proves in many cases to be more accurate than the commonly used (FASTA) (Pearson and Lipman, 1988) gapped Z-score in which the sequence is matched against a random sample of the database. We demonstrate the use of the POZ-score as a secondary filter which screens out several well-known types of false positive, reducing the amount of manual screening to be done by the biologist.
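One way to see why the ungapped permutation mean and standard deviation need no sampling is the classical closed form for the moments of a sum over a random permutation, applied to letter counts. The sketch below uses that textbook result; it is not taken from the paper and may differ from the authors' derivation. In particular it averages over all permutations including the observed one, and the function names and the `score` callback are assumptions.

```python
from collections import Counter
from math import sqrt

def permutation_stats(a, b, score):
    """Exact mean and standard deviation of the ungapped alignment score
    sum(score(a[i], b[p[i]])) over all permutations p of sequence b."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    # O(N) step: only the letter compositions of the two sequences matter.
    count_a, count_b = Counter(a), Counter(b)
    # Fixed-size step: work over letter pairs instead of position pairs.
    row = {x: sum(m * score(x, y) for y, m in count_b.items()) / n for x in count_a}
    col = {y: sum(c * score(x, y) for x, c in count_a.items()) / n for y in count_b}
    grand = sum(c * row[x] for x, c in count_a.items()) / n
    mean = n * grand
    variance = sum(c * m * (score(x, y) - row[x] - col[y] + grand) ** 2
                   for x, c in count_a.items()
                   for y, m in count_b.items()) / (n - 1)
    return mean, sqrt(variance)

def permutation_z_score(a, b, score):
    """Distance, in standard deviations, of the observed score from the mean."""
    observed = sum(score(x, y) for x, y in zip(a, b))
    mean, sd = permutation_stats(a, b, score)
    if sd == 0:
        raise ValueError("all permutations give the same score")
    return (observed - mean) / sd

if __name__ == "__main__":
    identity = lambda x, y: 1.0 if x == y else -1.0
    print(permutation_z_score("ACGTACGT", "ACGTACGA", identity))
```

For a fixed alphabet the second step touches a bounded number of letter pairs, which matches the O(N) plus O(1) cost described in the abstract. The `identity` scoring function is a stand-in for a real substitution matrix.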

15.

Sexual selection and condition-dependent mate preferences

Cotton S Small J Pomiankowski A 《Current biology : CB》2006,16(17):R755-R765

The last decade has witnessed considerable theoretical and empirical investigation of how male sexual ornaments evolve. This strong male-biased perspective has resulted in the relative neglect of variation in female mate preferences and its consequences for ornament evolution. As sexual selection is a co-evolutionary process between males and females, ignoring variation in females overlooks a key aspect of this process. Here, we review the empirical evidence that female mate preferences, like male ornaments, are condition dependent. We show accumulating support for the hypothesis that high quality females show the strongest mate preference. Nonetheless, this is still an infant field, and we highlight areas in need of more research, both theoretical and empirical. We also examine some of the wider implications of condition-dependent mating decisions and their effect on the strength of sexual selection.

16.

An efficient randomized algorithm for contact-based NMR backbone resonance assignment

Kamisetty H Bailey-Kellogg C Pandurangan G 《Bioinformatics (Oxford, England)》2006,22(2):172-180

MOTIVATION: Backbone resonance assignment is a critical bottleneck in studies of protein structure, dynamics and interactions by nuclear magnetic resonance (NMR) spectroscopy. A minimalist approach to assignment, which we call 'contact-based', seeks to dramatically reduce experimental time and expense by replacing the standard suite of through-bond experiments with the through-space (nuclear Overhauser enhancement spectroscopy, NOESY) experiment. In the contact-based approach, spectral data are represented in a graph with vertices for putative residues (of unknown relation to the primary sequence) and edges for hypothesized NOESY interactions, such that observed spectral peaks could be explained if the residues were 'close enough'. Due to experimental ambiguity, several incorrect edges can be hypothesized for each spectral peak. An assignment is derived by identifying consistent patterns of edges (e.g. for alpha-helices and beta-sheets) within a graph and by mapping the vertices to the primary sequence. The key algorithmic challenge is to be able to uncover these patterns even when they are obscured by significant noise. RESULTS: This paper develops, analyzes and applies a novel algorithm for the identification of polytopes representing consistent patterns of edges in a corrupted NOESY graph. Our randomized algorithm aggregates simplices into polytopes and fixes inconsistencies with simple local modifications, called rotations, that maintain most of the structure already uncovered. In characterizing the effects of experimental noise, we employ an NMR-specific random graph model in proving that our algorithm gives optimal performance in expected polynomial time, even when the input graph is significantly corrupted. We confirm this analysis in simulation studies with graphs corrupted by up to 500% noise. Finally, we demonstrate the practical application of the algorithm on several experimental beta-sheet datasets. Our approach is able to eliminate a large majority of noise edges and to uncover large consistent sets of interactions. AVAILABILITY: Our algorithm has been implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.

17.

An efficient algorithm for detecting frequent subgraphs in biological networks

Koyutürk M Grama A Szpankowski W 《Bioinformatics (Oxford, England)》2004,20(Z1):i200-i207


18.

An efficient algorithm for optimizing whole genome alignment with noise

Wong PW Lam TW Lu N Ting HF Yiu SM 《Bioinformatics (Oxford, England)》2004,20(16):2676-2684

MOTIVATION: This paper is concerned with algorithms for aligning two whole genomes so as to identify regions that possibly contain conserved genes. Motivated by existing heuristic-based software tools, we initiate the study of an optimization problem that attempts to uncover conserved genes with a global concern. Another interesting feature in our formulation is the tolerance of noise, which also complicates the optimization problem. A brute-force approach takes time exponential in the noise level. RESULTS: We show how an insight into the optimization structure can lead to a drastic improvement in the time and space requirement [precisely, to O(k²n²) and O(k²n), respectively, where n is the size of the input and k is the noise level]. The reduced space requirement allows us to implement the new algorithm, called MaxMinCluster, on a PC. It is exciting to see that when tested with different real data sets, MaxMinCluster consistently uncovers a high percentage of conserved genes that have been published by GenBank. Its performance indeed compares favorably with that of MUMmer (perhaps the most popular software tool for uncovering conserved genes in a whole-genome scale). AVAILABILITY: The source code is available from the website http://www.csis.hku.hk/~colly/maxmincluster/; detailed proofs of the propositions can also be found there.

19.

An efficient algorithm for approximating geodesic distances in tree space

Battagliero S Puglia G Vicario S Rubino F Scioscia G Leo P 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(5):1196-1207

The increasing use of phylogeny in biological studies is limited by the need to make available more efficient tools for computing distances between trees. The geodesic tree distance, introduced by Billera, Holmes, and Vogtmann, combines both the tree topology and edge lengths into a single metric. Despite the conceptual simplicity of the geodesic tree distance, algorithms to compute it don't scale well to large, real-world phylogenetic trees composed of hundreds or even thousands of leaves. In this paper, we propose the geodesic distance as an effective tool for exploring the likelihood profile in the space of phylogenetic trees, and we give a cubic time algorithm, GeoHeuristic, in order to compute an approximation of the distance. We compare it with the GTP algorithm, which calculates the exact distance, and the cone path length, which is another approximation, showing that GeoHeuristic achieves a good trade-off between accuracy (relative error always lower than 0.0001) and efficiency. We also prove the equivalence among GeoHeuristic, cone path, and Robinson-Foulds distances when assuming branch lengths equal to unity and we show empirically that, under this restriction, these distances are almost always equal to the actual geodesic.

20.

FastTagger: an efficient algorithm for genome-wide tag SNP selection using multi-marker linkage disequilibrium

Guimei Liu Yue Wang Limsoon Wong 《BMC bioinformatics》2010,11(1):66

Background

The human genome contains millions of common single nucleotide polymorphisms (SNPs), and these SNPs play an important role in understanding the association between genetic variations and human diseases. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), so it is not necessary to genotype all SNPs for an association study. Many algorithms have been developed to find a small subset of SNPs called tag SNPs that are sufficient to infer all the other SNPs. Algorithms based on the r² LD statistic have gained popularity because r² is directly related to the statistical power to detect disease associations. Most existing r²-based algorithms use pairwise LD. Recent studies show that multi-marker LD can help further reduce the number of tag SNPs. However, existing tag SNP selection algorithms based on multi-marker LD are both time-consuming and memory-consuming. They cannot work on chromosomes containing more than 100k SNPs using length-3 tagging rules.