期刊文献+

Optimizing of large-number-patterns string matching algorithms based on definite-state automata 被引量:3

Optimizing of large-number-patterns string matching algorithms based on definite-state automata
下载PDF
导出
摘要 Because the small CACHE size of computers, the scanning speed of DFA based multi-pattern string-matching algorithms slows down rapidly especially when the number of patterns is very large. For solving such problems, we cut down the scanning time of those algorithms (i.e. DFA based) by rearranging the states table and shrinking the DFA alphabet size. Both the methods can decrease the probability of large-scale random memory accessing and increase the probability of continuously memory accessing. Then the hitting rate of the CACHE is increased and the searching time of on the DFA is reduced. Shrinking the alphabet size of the DFA also reduces the storage complication. The AC++algorithm, by optimizing the Aho-Corasick (i.e. AC) algorithm using such methods, proves the theoretical analysis. And the experimentation results show that the scanning time of AC++and the storage occupied is better than that of AC in most cases and the result is much attractive when the number of patterns is very large. Because DFA is a widely used base algorithm in may string matching algorithms, such as DAWG, SBOM etc., the optimizing method discussed is significant in practice. Because the small CACHE size of computers, the scanning speed of DFA based multi-pattern stringmatching algorithms slows down rapidly especially when the number of patterns is very large. For solving such problems, we cut down the scanning time of those algorithms (i. e. DFA based) by rearranging the states table and shrinking the DFA alphabet size. Both the methods can decrease the probability of large-scale random mem- ory accessing and increase the probability of continuously memory accessing. Then the hitting rate of the CACHE is increased and the searching time of on the DFA is reduced. Shrinking the alphabet size of the DFA also reduces the storage complication. The AC + + algorithm, by optimizing the Aho-Corasick (i. e. AC) algorithm using such methods, proves the theoretical analysis. And the experimentation results show that the scanning time of AC + + and the storage occupied is better than that of AC in most cases and the result is much attractive when the number of patterns is very large. Because DFA is a widely used base algorithm in may string matching algorithms, such as DAWG, SBOM etc. , the optimizing method discussed is significant in practice.
出处 《Journal of Harbin Institute of Technology(New Series)》 EI CAS 2007年第2期236-239,共4页 哈尔滨工业大学学报(英文版)
关键词 multi-pattern string-matching definite-state automata Aho-Corasick algorithm CACHE 有限自动机 大数型字符串匹配算法 优化 多类型字符串 CACHE
  • 相关文献

参考文献4

  • 1Allauzen C,Raffinot M.Factor Oracle of a Set of Words[]..1999
  • 2Aho A,Corasick M.Efficient string matching: An aid to bibliographic search[].Communications of the ACM.1975
  • 3Knuth D,Morris J,Pratt V.Fast pattern matching in strings[].SIAM Journal on Computing.1977
  • 4Wu S,Manber U.A Fast Algorithm for Multi-pattern Searching[]..1994

同被引文献12

  • 1宋明秋,张国权,邓贵仕.IDS中新的快速多模式匹配算法及其设计[J].计算机工程与应用,2005,41(21):159-162. 被引量:9
  • 2杨薇薇,廖翔.一种改进的BM模式匹配算法[J].计算机应用,2006,26(2):318-319. 被引量:25
  • 3Boyer RS,Moore JS.A fast string searching algorithm[J].Communication of the ACM.1997,20(10):762-772.
  • 4R.Nigel Horspool.Practical fast searching in strings.Software Practice and Experience.1980,10(6):501-506.
  • 5Mike fisk,George Varghese,Fast content-based packet handling for intrusion detection[R].UCSD Technical Report CS2001-0670.2001-05.
  • 6Boyer RS,Moore Js.A fast string searching algorithm[J].Communication of the ACM.1997,20(10):762-772.
  • 7R.Nigel Horspool.Practical fast searching in strings.Software Practice and Experience.1980,10(6):501-506.
  • 8A Aho,M Corasick.Efficient string matching an aid to bibliographic search.Communication of the ACM,1975,18(6):333-340.
  • 9C Jason coit,Stuart Staniford.Toward faster string matching for intrusion detection or exceeding the speed of snort[J].IEEE,CS Press,2001:367-373.
  • 10Mike fisk,George Varghese,Fast content-based packet handling for intrusion detection[R].UCSD Technical Report CS2001-0670.2001-05.

引证文献3

二级引证文献4

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部