SOTU is a near-duplicate image/video retrieval and detection toolkit built entirely on the bag-of-visual-words (BoW) scheme. In SOTU, we provide solutions for both efficient near-duplicate image retrieval and detection. The major difference between our scheme and traditional BoW is that SOTU incorporates visual and geometric verification during the online retrieval stage. Besides integrating the scheme proposed in [1], SOTU also offers options to perform traditional BoW retrieval and BoW retrieval with Hamming Embedding (HE) [3] and weak geometric consistency (WGC) [2] support.

Functions Integrated

  • Bag-of-visual-words generation
  • Vector Quantization with various schemes
  • Hamming Embedding Training
  • Near-duplicate Retrieval/Detection with various schemes
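
To make the vector quantization and Hamming Embedding steps concrete, here is a minimal sketch in plain Python. It is not SOTU's implementation: every function name (`nearest_word`, `train_medians`, `signature`, `he_match`) and the threshold `ht` are hypothetical and chosen for illustration only. It assumes the formulation of [2], where a descriptor is assigned to its nearest visual word, projected, and binarized against per-word medians, and two descriptors match only if they share a visual word and their signatures are within a Hamming distance threshold. For brevity, the projection matrix `P` is passed in by the caller, whereas [2] uses a random orthogonal projection.

```python
import statistics

def nearest_word(desc, centroids):
    """Vector quantization: index of the closest centroid (squared L2)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(desc, centroids[k])),
    )

def project(desc, P):
    """Project a descriptor with matrix P (one row per signature bit)."""
    return [sum(p * x for p, x in zip(row, desc)) for row in P]

def train_medians(descs, centroids, P):
    """HE training: per visual word, the median of each projected component."""
    buckets = {}
    for d in descs:
        buckets.setdefault(nearest_word(d, centroids), []).append(project(d, P))
    return {
        w: [statistics.median(col) for col in zip(*rows)]
        for w, rows in buckets.items()
    }

def signature(desc, centroids, P, medians):
    """Return (visual word, binary signature packed into an int)."""
    w = nearest_word(desc, centroids)
    bits = 0
    for i, (z, m) in enumerate(zip(project(desc, P), medians[w])):
        if z > m:  # bit i is set when the component exceeds the median
            bits |= 1 << i
    return w, bits

def he_match(a, b, ht):
    """Match if same visual word and Hamming distance <= ht (assumed threshold)."""
    return a[0] == b[0] and bin(a[1] ^ b[1]).count("1") <= ht
```

In such a scheme, `train_medians` would run once offline on a training set, while `signature` and `he_match` would be applied to query and reference keypoints that fall into the same inverted-file entry.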

Core idea with examples

  • True positives and false alarms with matches of visual words
  • WGC [2] Histograms
  • EWGC [1] Histograms
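
As an illustration of what a WGC histogram captures, the following is a minimal Python sketch of the voting idea described in [2]. It is not the code used in SOTU: the function name `wgc_score`, the bin counts, and the one-bin-per-octave scale quantization are assumptions made for this example. Each keypoint match votes with its orientation difference and its log-scale ratio. True near-duplicates tend to produce a sharp peak in both histograms, while false alarms spread their votes. The image score is the smaller of the two peaks. Votes here are plain counts, whereas [2] accumulates weighted match scores. The EWGC variant of [1] is not covered by this sketch.

```python
import math

def wgc_score(matches, n_angle_bins=8, n_scale_bins=8):
    """Score a candidate image from its keypoint matches.

    matches: list of ((theta_q, scale_q), (theta_d, scale_d)) pairs, with
    orientations in radians and positive characteristic scales.
    Returns (score, angle_hist, scale_hist).
    """
    angle_hist = [0] * n_angle_bins
    scale_hist = [0] * n_scale_bins
    for (tq, sq), (td, sd) in matches:
        # Orientation difference, wrapped to [0, 2*pi), then quantized.
        dtheta = (td - tq) % (2 * math.pi)
        a = int(dtheta / (2 * math.pi) * n_angle_bins) % n_angle_bins
        angle_hist[a] += 1
        # Log-scale ratio, one bin per octave, clamped to the histogram range.
        octave = math.floor(math.log2(sd / sq))
        s = min(max(octave + n_scale_bins // 2, 0), n_scale_bins - 1)
        scale_hist[s] += 1
    # Only matches agreeing on BOTH rotation and scaling should count.
    return min(max(angle_hist), max(scale_hist)), angle_hist, scale_hist
```

With three matches that share a rotation of 1 radian and a scale ratio of 2, plus one unrelated match, the consistent matches dominate both histograms and the outlier contributes nothing to the score.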


[1] Wan-Lei Zhao, Xiao Wu and Chong-Wah Ngo, "On the Annotation of Web Videos by Efficient Near-duplicate Search", in IEEE Trans. on Multimedia. (Accepted)
[2] Herve Jegou, Matthijs Douze and Cordelia Schmid, "Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search", in Proceedings of ECCV'2008, pages: 304-317. 
[3] Chong-Wah Ngo, Yu-Gang Jiang, Xiao-Yong Wei, Wan-Lei Zhao, Yang Liu, Jun Wang, Shiai Zhu and Shih-Fu Chang, "VIREO/DVMM at TRECVID 2009: High-Level Feature Extraction, Automatic Video Search, and Content-Based Copy Detection", in Proceedings of TRECVID 2009.
[4] Wan-Lei Zhao and Chong-Wah Ngo, "Scale-Rotation Invariant Pattern Entropy for Keypoint-based Near-Duplicate Detection", in IEEE Trans. on Image Processing, Vol 18, pp 412-423.