A new method for finding long consensus patterns in nucleic acid sequences |
| |
Authors: | Taylor, Philip; Rosenberg, Paul; Samsonova, Mary G. |
| |
Affiliation: | MRC Virology Unit, Institute of Virology, Glasgow G11 5JR; Computing Service, Glasgow University, Glasgow G12 8QQ, UK; Department of Genetics, Leningrad State University, Leningrad 199034, USSR |
| |
Abstract: | We describe a fast computer algorithm for identifying consensus patterns in DNA sequences. The method requires no prior assumptions about the consensus pattern other than its length. In particular, no previous knowledge of the frequency or spacing of consensus patterns is required. However, a priori information about the shape of the consensus pattern, or invariability of individual positions, or the overall conservation level, can be utilized to enhance the selectivity and sensitivity of the search. As the number of all possible consensus words increases very rapidly with length, comprehensive searches have usually been restricted to a maximum of 10-12 nucleotides, even when large mainframes are used. Our algorithm enables searching for consensus patterns of this order on current mid-range and powerful microcomputers. Searches may be conducted on single, long sequences or a set of possibly aligned shorter sequences. We give examples of identified consensus patterns in both prokaryotic and eukaryotic DNA sequences, along with some typical program timings. Received on January 14, 1991; accepted on March 5, 1991 |
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|
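The abstract's remark that the number of possible consensus words grows very rapidly with length can be illustrated with a naive exhaustive search. The sketch below is not the authors' algorithm, which the record does not describe. It is a brute-force baseline under stated assumptions: a four-letter DNA alphabet, a fixed word length, and a Hamming-distance mismatch limit. The function names `count_candidate_words`, `hamming`, and `best_consensus` are hypothetical and chosen only for illustration.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: plain four-letter DNA alphabet


def count_candidate_words(length):
    """Number of distinct words of the given length over ALPHABET (4**length)."""
    return len(ALPHABET) ** length


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_consensus(sequence, length, max_mismatches):
    """Brute-force baseline (hypothetical, not the published method).

    Scores every possible word of the given length by counting the windows
    of `sequence` that match it within `max_mismatches` substitutions, and
    returns the first best-scoring (word, hits) pair in lexicographic order.
    """
    windows = [sequence[i:i + length]
               for i in range(len(sequence) - length + 1)]
    best_word, best_hits = None, -1
    for letters in product(ALPHABET, repeat=length):
        word = "".join(letters)
        hits = sum(1 for w in windows if hamming(word, w) <= max_mismatches)
        if hits > best_hits:
            best_word, best_hits = word, hits
    return best_word, best_hits


if __name__ == "__main__":
    # Search-space size at the lengths mentioned in the abstract.
    for n in (10, 12):
        print(n, count_candidate_words(n))
    # Toy sequence with three exact copies of TATAAT.
    print(best_consensus("TATAATGGGTATAATCCCTATAAT", 6, 0))
```

At length 10 there are 1,048,576 candidate words and at length 12 there are 16,777,216, and this baseline compares each of them against every window of the sequence. That cost is why exhaustive enumeration becomes impractical at these lengths and why a faster method is of interest.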