TY - JOUR
T1 - Task-agnostic object recognition for mobile robots through few-shot image matching
AU - Chiatti, Agnese
AU - Bardaro, Gianluca
AU - Bastianelli, Emanuele
AU - Tiddi, Ilaria
AU - Mitra, Prasenjit
AU - Motta, Enrico
N1 - Funding Information:
This work has been partially supported by a European Union's Horizon 2020 grant, Sciroc, No. 780086. Acknowledgments: The authors would like to thank Prof. Stefan Rueger (Knowledge Media Institute, The Open University, UK) for graciously providing the computing resources used in this work.
Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2020/3
Y1 - 2020/3
AB - To assist humans with their daily tasks, mobile robots are expected to navigate complex and dynamic environments, presenting unpredictable combinations of known and unknown objects. Most state-of-the-art object recognition methods are unsuitable for this scenario because they require that: (i) all target object classes are known beforehand, and (ii) a vast number of training examples is provided for each class. This evidence calls for novel methods to handle unknown object classes, for which fewer images are initially available (few-shot recognition). One way of tackling the problem is learning how to match novel objects to their most similar supporting example. Here, we compare different (shallow and deep) approaches to few-shot image matching on a novel data set, consisting of 2D views of common object types drawn from a combination of ShapeNet and Google. First, we assess if the similarity of objects learned from a combination of ShapeNet and Google can scale up to new object classes, i.e., categories unseen at training time. Furthermore, we show how normalising the learned embeddings can impact the generalisation abilities of the tested methods, in the context of two novel configurations: (i) where the weights of a Convolutional two-branch Network are imprinted and (ii) where the embeddings of a Convolutional Siamese Network are L2-normalised.
UR - http://www.scopus.com/inward/record.url?scp=85080854351&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080854351&partnerID=8YFLogxK
U2 - 10.3390/electronics9030380
DO - 10.3390/electronics9030380
M3 - Article
AN - SCOPUS:85080854351
VL - 9
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
SN - 2079-9292
IS - 3
M1 - 380
ER -