Understanding the reproducibility of crowd-reported security vulnerabilities

Dongliang Mu, Alejandro Cuevas, Limin Yang, Hang Hu, Xinyu Xing, Bing Mao, Gang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Today's software systems increasingly rely on the "power of the crowd" to identify new security vulnerabilities. And yet, it is not well understood how reproducible these crowd-reported vulnerabilities are. In this paper, we perform the first empirical analysis of a wide range of real-world security vulnerabilities (368 in total) with the goal of quantifying their reproducibility. Following a carefully controlled workflow, we organize a focused group of security analysts to carry out reproduction experiments. With 3,600 man-hours spent, we obtain quantitative evidence on the prevalence of missing information in vulnerability reports and the low reproducibility of the vulnerabilities. We find that reproducing a vulnerability from a single report on a popular security forum is generally difficult due to incomplete information. By crowdsourcing the information gathering more widely, security analysts can increase the reproduction success rate, but they still face key challenges in troubleshooting the non-reproducible cases. To further explore solutions, we surveyed hackers, researchers, and engineers with extensive domain expertise in software security (N=43). We find that, going beyond Internet-scale crowdsourcing, security professionals rely heavily on manual debugging and speculative guessing to infer the missing information. Our results suggest not only a need to overhaul the way security forums collect vulnerability reports, but also a need for automated mechanisms to gather the information commonly missing from reports.
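
The abstract's closing suggestion, automated mechanisms that gather the information commonly missing from reports, invites a concrete illustration. The Python sketch below is not from the paper; it is a minimal, hypothetical example of the kind of environment snapshot such a mechanism could attach to a vulnerability report: OS and architecture, toolchain versions, and (on Linux) the ASLR setting, details that often decide whether a proof-of-concept reproduces. The specific fields and tool names (gcc, clang) are illustrative assumptions.

#!/usr/bin/env python3
"""Illustrative sketch (not the paper's tool): capture environment
details that vulnerability reports commonly omit."""

import json
import platform
import shutil
import subprocess

def tool_version(tool: str) -> str:
    """Return the first line of `tool --version`, or a note if missing."""
    if shutil.which(tool) is None:
        return "not installed"
    out = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return out.stdout.splitlines()[0] if out.stdout else "unknown"

def collect_environment() -> dict:
    info = {
        "os": platform.platform(),            # e.g. Linux-5.15.0-...-x86_64
        "machine": platform.machine(),        # CPU architecture
        "python": platform.python_version(),
        "gcc": tool_version("gcc"),
        "clang": tool_version("clang"),
    }
    # On Linux, record the ASLR mode (0/1/2): memory-corruption PoCs
    # frequently behave differently across these settings.
    try:
        with open("/proc/sys/kernel/randomize_va_space") as f:
            info["aslr"] = f.read().strip()
    except OSError:
        info["aslr"] = "unavailable"
    return info

if __name__ == "__main__":
    # Emit JSON so the snapshot can be attached to a report verbatim.
    print(json.dumps(collect_environment(), indent=2))

A report form that required attaching such a snapshot would address, mechanically, part of the missing-information problem the study quantifies.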

Original language: English (US)
Title of host publication: Proceedings of the 27th USENIX Security Symposium
Publisher: USENIX Association
Pages: 919-936
Number of pages: 18
ISBN (Electronic): 9781939133045
State: Published - Jan 1 2018
Event: 27th USENIX Security Symposium - Baltimore, United States
Duration: Aug 15 2018 - Aug 17 2018

Publication series

Name: Proceedings of the 27th USENIX Security Symposium

Conference

Conference: 27th USENIX Security Symposium
Country: United States
City: Baltimore
Period: 8/15/18 - 8/17/18

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Cite this

Mu, D., Cuevas, A., Yang, L., Hu, H., Xing, X., Mao, B., & Wang, G. (2018). Understanding the reproducibility of crowd-reported security vulnerabilities. In Proceedings of the 27th USENIX Security Symposium (pp. 919-936). (Proceedings of the 27th USENIX Security Symposium). USENIX Association.

@inproceedings{a6320046bd2a4607989e871560fd4a81,
  title = "Understanding the reproducibility of crowd-reported security vulnerabilities",
  author = "Dongliang Mu and Alejandro Cuevas and Limin Yang and Hang Hu and Xinyu Xing and Bing Mao and Gang Wang",
  year = "2018",
  month = "1",
  day = "1",
  language = "English (US)",
  series = "Proceedings of the 27th USENIX Security Symposium",
  publisher = "USENIX Association",
  pages = "919--936",
  booktitle = "Proceedings of the 27th USENIX Security Symposium",
}

Scopus record: http://www.scopus.com/inward/record.url?scp=85071720467&partnerID=8YFLogxK
Cited by (Scopus): http://www.scopus.com/inward/citedby.url?scp=85071720467&partnerID=8YFLogxK
Accession number: SCOPUS:85071720467
