TY - GEN
T1 - VidQ
T2 - 17th ACM Symposium on QoS and Security for Wireless and Mobile Networks, Q2SWinet 2021
AU - Felemban, Noor
AU - Mehmeti, Fidan
AU - La Porta, Thomas
N1 - Funding Information:
This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence under Agreement Number W911NF-16-3-0001. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.
Publisher Copyright:
© 2021 ACM.
PY - 2021/11/22
Y1 - 2021/11/22
N2 - As the number of videos recorded and stored on mobile devices increases, efficient techniques for searching video content become more and more important, especially for applications such as searching for the moment of a crime or other specific actions. When a user sends a query searching for a specific action in a large amount of data, the goal is to respond to the query accurately and quickly. In this paper, we address the problem of responding in a timely manner to queries that search for specific actions on mobile devices, utilizing both visual and audio content processing approaches. We build a system, called VidQ, which consists of several stages and uses various Convolutional Neural Networks (CNNs) and speech APIs to respond to such queries. As state-of-the-art computer vision and speech algorithms are computationally intensive, we use servers with GPUs to assist mobile users in the process. After a query has been issued, we identify the different stages of processing that may take place. This is followed by determining the order of the stages that make up the system. Finally, we distribute the processing among the available network resources to further improve performance by minimizing the processing time. Results show that VidQ reduces the completion time by at least 50% compared to other approaches.
AB - As the number of videos recorded and stored on mobile devices increases, efficient techniques for searching video content become more and more important, especially for applications such as searching for the moment of a crime or other specific actions. When a user sends a query searching for a specific action in a large amount of data, the goal is to respond to the query accurately and quickly. In this paper, we address the problem of responding in a timely manner to queries that search for specific actions on mobile devices, utilizing both visual and audio content processing approaches. We build a system, called VidQ, which consists of several stages and uses various Convolutional Neural Networks (CNNs) and speech APIs to respond to such queries. As state-of-the-art computer vision and speech algorithms are computationally intensive, we use servers with GPUs to assist mobile users in the process. After a query has been issued, we identify the different stages of processing that may take place. This is followed by determining the order of the stages that make up the system. Finally, we distribute the processing among the available network resources to further improve performance by minimizing the processing time. Results show that VidQ reduces the completion time by at least 50% compared to other approaches.
UR - http://www.scopus.com/inward/record.url?scp=85121712419&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121712419&partnerID=8YFLogxK
U2 - 10.1145/3479242.3487320
DO - 10.1145/3479242.3487320
M3 - Conference contribution
AN - SCOPUS:85121712419
T3 - Q2SWinet 2021 - Proceedings of the 17th ACM Symposium on QoS and Security for Wireless and Mobile Networks
SP - 51
EP - 60
BT - Q2SWinet 2021 - Proceedings of the 17th ACM Symposium on QoS and Security for Wireless and Mobile Networks
PB - Association for Computing Machinery, Inc
Y2 - 22 November 2021 through 26 November 2021
ER -