- Authors
-
Masayuki Takeda
Yusuke Shibata
Tetsuya Matsumoto
Takuya Kida
Ayumi Shinohara
Shuichi Fukamachi
Takeshi Shinohara
Setsuo Arikawa
- Journal
- 情報処理学会論文誌 (ISSN:18827764)
- Volume, issue, pages, and publication date
- vol.42, no.3, pp.370-384, 2001-03-15
This paper describes our recent studies on string pattern matching in compressed texts, mainly from practical viewpoints. The aim is to speed up the string pattern matching task in comparison with an ordinary search over the original texts. We have successfully developed (1) an AC-type algorithm for searching in Huffman-encoded files, and (2) a KMP-type algorithm and (3) a BM-type algorithm for searching in files compressed by the so-called byte pair encoding (BPE). Each of the algorithms reduces the search time at nearly the same rate as the compression ratio. Surprisingly, the BM-type algorithm runs over BPE-compressed files about 1.2 to 3.0 times faster than the exact match routines of the software package agrep, which is known as the fastest pattern matching tool.
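
For readers unfamiliar with byte pair encoding, the following is a minimal sketch of the general compression scheme only, not of the search algorithms described in the abstract. It assumes the textbook formulation: the most frequent pair of adjacent bytes is repeatedly replaced by a byte value that does not occur in the input. The function names `bpe_compress` and `bpe_expand` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bpe_compress(data: bytes):
    """Illustrative BPE: return (compressed bytes, rules).

    rules maps each newly introduced byte value to the pair it replaces.
    """
    seq = list(data)
    present = set(seq)
    # Byte values absent from the input can serve as new symbols.
    unused = [b for b in range(256) if b not in present]
    rules = {}
    while unused:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so a substitution would not help
        new = unused.pop()
        rules[new] = (a, b)
        # Replace occurrences of (a, b) from left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(new)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return bytes(seq), rules


def bpe_expand(data: bytes, rules) -> bytes:
    """Undo bpe_compress by expanding substituted bytes recursively."""
    out = []
    stack = list(reversed(data))
    while stack:
        s = stack.pop()
        if s in rules:
            a, b = rules[s]
            stack.append(b)  # pushed first so that a is popped first
            stack.append(a)
        else:
            out.append(s)
    return bytes(out)
```

Because every compressed symbol is still a single byte, a pattern matcher can step through the compressed file byte by byte, which is plausibly why search time can shrink along with file size; the actual KMP-type and BM-type algorithms of the paper are not reproduced here.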