Visual co-occurrence network: Using context for large-scale object recognition in retail

Siddharth Advani, Brigid Smith, Yasuki Tanabe, Kevin Irick, Matthew Cotter, Jack Sampson, Vijaykrishnan Narayanan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

12 Scopus citations

Abstract

In any visual object recognition system, classification accuracy will likely determine the usefulness of the system as a whole. In many real-world applications, it is also important to recognize a large number of diverse objects, so that the system is robust enough to handle the sort of tasks that the human visual system handles on an average day. These objectives are often at odds with performance, as running too many detectors on any one scene is prohibitively slow for real-time use. However, visual information has temporal and spatial context that can be exploited to reduce the number of detectors that need to be triggered at any given instant. In this paper, we propose a dynamic approach to encoding such context, called the Visual Co-occurrence Network (ViCoNet), which establishes relationships between objects observed in a visual scene. We investigate the utility of ViCoNet when integrated into a vision pipeline targeted at retail shopping. When evaluated on a large and deep dataset, we achieve a 50% improvement in performance and a 7% improvement in accuracy in the best case, and a 45% improvement in performance and a 3% improvement in accuracy in the average case, over an established baseline. The memory overhead of ViCoNet is around 10 KB, highlighting its effectiveness on temporal big data.
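The abstract does not describe ViCoNet's internal data structure, so the following is only a minimal sketch of the general idea it states: recording which objects are seen together and using that context to shortlist the detectors worth running next. The class name `CooccurrenceNetwork`, the methods `observe` and `candidates`, the count-based edge weights, and the product labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


class CooccurrenceNetwork:
    """Toy co-occurrence graph (an assumption, not the paper's design).

    Nodes are object labels; an edge weight counts how often two labels
    were recognized in the same scene.
    """

    def __init__(self):
        self.weights = defaultdict(lambda: defaultdict(int))

    def observe(self, labels):
        """Update edge weights with the labels recognized in one scene."""
        for a, b in combinations(sorted(set(labels)), 2):
            self.weights[a][b] += 1
            self.weights[b][a] += 1

    def candidates(self, recent_labels, k):
        """Return up to k labels most strongly linked to recent_labels.

        Only detectors for these labels would be triggered, instead of
        running every detector on the scene.
        """
        recent = set(recent_labels)
        scores = defaultdict(int)
        for label in recent:
            for neighbor, weight in self.weights.get(label, {}).items():
                if neighbor not in recent:
                    scores[neighbor] += weight
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [label for label, _ in ranked[:k]]


net = CooccurrenceNetwork()
net.observe(["cereal", "milk", "oatmeal"])
net.observe(["cereal", "milk"])
net.observe(["shampoo", "soap"])
shortlist = net.candidates(["cereal"], k=2)
```

In this sketch, having just seen `cereal`, the shortlist ranks `milk` ahead of `oatmeal` and excludes `shampoo` and `soap`, so only two detectors would be run rather than four. A real system would also need a fallback for scenes in which no shortlisted detector fires.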

Original language: English (US)
Title of host publication: ESTIMedia 2015 - 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781467381642
DOIs: https://doi.org/10.1109/ESTIMedia.2015.7351774
State: Published - Dec 9 2015
Event: 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia, ESTIMedia 2015 - Amsterdam, Netherlands
Duration: Oct 8 2015 → Oct 9 2015

Publication series

Name: ESTIMedia 2015 - 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia

Other

Other: 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia, ESTIMedia 2015
Country: Netherlands
City: Amsterdam
Period: 10/8/15 → 10/9/15

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Media Technology


  • Cite this

    Advani, S., Smith, B., Tanabe, Y., Irick, K., Cotter, M., Sampson, J., & Narayanan, V. (2015). Visual co-occurrence network: Using context for large-scale object recognition in retail. In ESTIMedia 2015 - 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia [7351774] (ESTIMedia 2015 - 13th IEEE Symposium on Embedded Systems for Real-Time Multimedia). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ESTIMedia.2015.7351774