期刊文献+

一种LSH索引的自动参数调整方法 被引量:6

A self-tuning method of LSH index
下载PDF
导出
摘要 针对LSH技术的固有缺点提出了一种根据数据自动调整LSH索引结构关键参数的方法,该方法面向数据集,使得索引结构可以针对不同数据集的统计特征选取适当的散列函数,而不用手工调整LSH索引结构中的关键参数,提高了LSH算法的准确性,且在进行查询时不增加额外的时间空间开销.模拟实验表明,和使用原始LSH算法相比较,使用该方法进行最近邻查询得到结果集的相似性可以提高10%左右,相似偏差可以减小8%左右;并且由于参数调整过程在查询过程之前,因此改进LSH算法和原始LSH算法在进行查询时有相同的时间空间性能. To overcome the handicap of original LSH indexing, an improving approach is presented which enables self-tuning of key parameters of indexing structure. The new approach is dataset-oriented, which make it possible to select appropriate hashing functions according to the statistic feature of a dataset automatically, instead of settling the key parameters manually. This approach improves the indexing performance while not increasing storage and query overhead experiment study shows that comparing to the original LSH method the new approach can improve the inter similarity of result set of query by about 10 %, reduce the error of result set by about 8 %. Meanwhile the new approach has the same temporal-spatial overhead as original LSH when performing query, since the query process is preceded with tuning process.
作者 卢炎生 饶祺
出处 《华中科技大学学报(自然科学版)》 EI CAS CSCD 北大核心 2006年第11期38-40,57,共4页 Journal of Huazhong University of Science and Technology(Natural Science Edition)
基金 湖北省自然科学基金资助项目(ABA048)
关键词 高维数据索引 相似度查询 近似最近邻查询 high dimensional data indexing similarity search approximate nearest neighbor search
  • 相关文献

参考文献6

  • 1Edelsbrunner H. Algorithms in combinatorial geometry[M].Edelsbrunner: Springer-Verlag, 1987.
  • 2Weber R, Schek H, Blott S. A quantitative analysis and performance study for similarity search methodsin high dimensional spaces[C]// Ashish O S, JenniferW. Proceedings of the 24th International Conference on Very Large Data Bases (VLDB). New York:Morgan Kaufmann Publishers Inc, 1998: 194-205.
  • 3Gionis A, Indyk P, Motwani R. Similarity search in high dimensions via hashing[C]// Malcolm P A, Maria E Q, et al. Proc of VLDB. Edinburgh: Morgan Kaufmann Publishers Inc, 1999:518-529.
  • 4Indyk P, Motwani R. Approximate nearest neighbor towards removing the curse of dimensionality[C]//Laszio B. Proc of STOC. New York: ACM Press,1998:604-613.
  • 5Qamra A, Meng Y, Chang Y. Enhanced distance functions and indexing for image replica recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 21(3):375-391.
  • 6Li B, Chang E, Wu Y. Discovery of a perceptual distance function for measuring image similarity [J].ACM Multimedia J, 2003, 8(6): 512-522.

同被引文献93

  • 1王国仁,黄健美,王斌,韩东红,乔百友,于戈.基于最大间隙空间映射的高维数据索引技术[J].软件学报,2007,18(6):1419-1428. 被引量:9
  • 2Daswani N, Garcia-Molina H, Yang B. Open problems in data sharing peer-to-peer systems [C]. Heidelberg: Springer-Veda, 2003:1-15.
  • 3Li J,Loo B T, Hellerstein J,et al.On the feasibility of peer-to-peer web indexing and search[C].Berkeley:Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS), 2003: 207-215.
  • 4Reynolds P, Vahdat A.Efficient peer-to-peer keyword searching [C].Riode Janeiro,Brazil:Middleware,2003:21-40.
  • 5Indyk P. Approximate nearest neighbor algorithms for Frechet distance via product metrics[C].Barcelona:Symposium on Computational Geometry,2002:102-106.
  • 6Broder A Z,Charikar M,Frieze A M,et al.Min-wise independent permutations[J].J Comput System Sci,2000,60(3):630- 659.
  • 7Smith M K. Web ontology issue status [EB/OL] .http://www. w3.org/2001/sw/WebOnt/webont-issues.html,2003-11.
  • 8TREC: Text retrieval conference [EB/OL] .http://trec.nist.gov, 2006-05.
  • 9Schaffalitzky F, Zisserman A. Multi-view matching for unordered image sets, or "how do I organize my holiday snaps?"[C] //Proceedings of the 7th European Conference on Computer Vision, Copenhagen, 2002 : 414-431.
  • 10Snavely N, Seitz S M, Szeliski R. Photo tourism: exploring photo collections in 3D [J]. ACM Transactions on Graphics, 2006, 25(3): 835-846.

引证文献6

二级引证文献26

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部