As mobile app markets thrive, third-party libraries have become pervasive in Android applications. These libraries provide functionality such as advertising, location, and social networking services, making app development much more productive. However, the spread of vulnerable and harmful third-party libraries can also hurt the mobile ecosystem, leading to various security problems. Third-party library identification has therefore emerged as an important problem and as the basis of many security applications, such as repackaging detection, vulnerability identification, and malware analysis. Previously, we proposed a novel approach to identifying third-party Android libraries at a massive scale. Our method uses the internal code dependencies of an app to detect and classify library candidates. With a fine-grained feature hashing strategy, it can better handle code whose package and method names are obfuscated. We developed a prototype tool called LibD and evaluated it on a large, up-to-date dataset. Our experimental results on 1,427,395 apps show that, compared to existing tools, LibD better handles multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without loss of scalability. In this paper, we extend our previous work by demonstrating that effective and scalable library detection can significantly improve the performance of large-scale app analyses in the real world. We show that LibD's technique can be used to speed up whole-app Android vulnerability detection and to quickly identify variants of vulnerable third-party libraries. This extension sheds light on the practical value of our previous work.