Large-Scale Third-Party Library Detection in Android Markets

Menghao Li, Pei Wang, Wei Wang, Shuai Wang, Dinghao Wu, Jian Liu, Rui Xue, Wei Huo, Wei Zou

Research output: Contribution to journal > Article > peer-review

1 Scopus citation

Abstract

With the thriving of mobile app markets, third-party libraries are pervasively used in Android applications. These libraries provide functionality such as advertising, location, and social networking services, making app development much more productive. However, the spread of vulnerable and harmful third-party libraries can also hurt the mobile ecosystem, leading to various security problems. Third-party library identification has therefore emerged as an important problem, serving as the basis of many security applications such as repackaging detection, vulnerability identification, and malware analysis. Previously, we proposed a novel approach to identifying third-party Android libraries at a massive scale. Our method uses the internal code dependencies of an app to recognize library candidates and then classify them. With a fine-grained feature hashing strategy, we can handle code whose package and method names are obfuscated better than prior work. We have developed a prototype tool called LibD and evaluated it on an up-to-date dataset containing 1,427,395 Android apps. Our experimental results show that LibD outperforms existing tools in detecting multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without loss of scalability. In this paper, we extend our earlier work by investigating whether effective and scalable library detection can boost the performance of large-scale app analyses in the real world. We show that the technique of LibD can be used to accelerate whole-app Android vulnerability detection and to quickly identify variants of vulnerable third-party libraries. This extension paper sheds light on the practical value of our previous research.

Original language: English (US)
Article number: 8478000
Pages (from-to): 981-1003
Number of pages: 23
Journal: IEEE Transactions on Software Engineering
Volume: 46
Issue number: 9
State: Published - Sep 1 2020

All Science Journal Classification (ASJC) codes

  • Software
