Errors, misunderstandings, and attacks: Analyzing the crowdsourcing process of ad-blocking systems

Mshabab Alrizah, Xinyu Xing, Sencun Zhu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covered the dynamic changes of the popular filter list EasyList for nine years and the error reports submitted by the crowd in the same period. Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time before they could be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad-blockers. Through in-depth analysis, our findings are expected to help shed light on any future work to evolve ad blocking and optimize crowdsourcing mechanisms.

Original language: English (US)
Title of host publication: IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference
Publisher: Association for Computing Machinery
Pages: 230-244
Number of pages: 15
ISBN (Electronic): 9781450369480
DOI: https://doi.org/10.1145/3355369.3355588
State: Published - Oct 21 2019
Event: 19th ACM Internet Measurement Conference, IMC 2019 - Amsterdam, Netherlands
Duration: Oct 21 2019 → Oct 23 2019

Publication series

Name: Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC

Conference

Conference: 19th ACM Internet Measurement Conference, IMC 2019
Country: Netherlands
City: Amsterdam
Period: 10/21/19 → 10/23/19

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications

Cite this

Alrizah, M., Xing, X., & Zhu, S. (2019). Errors, misunderstandings, and attacks: Analyzing the crowdsourcing process of ad-blocking systems. In IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference (pp. 230-244). (Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC). Association for Computing Machinery. https://doi.org/10.1145/3355369.3355588

Alrizah, Mshabab ; Xing, Xinyu ; Zhu, Sencun. / Errors, misunderstandings, and attacks : Analyzing the crowdsourcing process of ad-blocking systems. IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference. Association for Computing Machinery, 2019. pp. 230-244 (Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC).

@inproceedings{c2ceffe0f15b4541962836dfea114829,
title = "Errors, misunderstandings, and attacks: Analyzing the crowdsourcing process of ad-blocking systems",
abstract = "Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covered the dynamic changes of the popular filter list EasyList for nine years and the error reports submitted by the crowd in the same period. Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time before they could be discovered (50{\%} of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad-blockers. Through in-depth analysis, our findings are expected to help shed light on any future work to evolve ad blocking and optimize crowdsourcing mechanisms.",
author = "Mshabab Alrizah and Xinyu Xing and Sencun Zhu",
year = "2019",
month = "10",
day = "21",
doi = "10.1145/3355369.3355588",
language = "English (US)",
series = "Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC",
publisher = "Association for Computing Machinery",
pages = "230--244",
booktitle = "IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference",

}

Alrizah, M, Xing, X & Zhu, S 2019, Errors, misunderstandings, and attacks: Analyzing the crowdsourcing process of ad-blocking systems. in IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference. Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC, Association for Computing Machinery, pp. 230-244, 19th ACM Internet Measurement Conference, IMC 2019, Amsterdam, Netherlands, 10/21/19. https://doi.org/10.1145/3355369.3355588

Errors, misunderstandings, and attacks : Analyzing the crowdsourcing process of ad-blocking systems. / Alrizah, Mshabab; Xing, Xinyu; Zhu, Sencun.

IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference. Association for Computing Machinery, 2019. p. 230-244 (Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC).

TY - GEN

T1 - Errors, misunderstandings, and attacks

T2 - Analyzing the crowdsourcing process of ad-blocking systems

AU - Alrizah, Mshabab

AU - Xing, Xinyu

AU - Zhu, Sencun

PY - 2019/10/21

Y1 - 2019/10/21

N2 - Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covered the dynamic changes of the popular filter list EasyList for nine years and the error reports submitted by the crowd in the same period. Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time before they could be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad-blockers. Through in-depth analysis, our findings are expected to help shed light on any future work to evolve ad blocking and optimize crowdsourcing mechanisms.

AB - Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset that covered the dynamic changes of the popular filter list EasyList for nine years and the error reports submitted by the crowd in the same period. Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time before they could be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods including 8 methods that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad-blockers. Through in-depth analysis, our findings are expected to help shed light on any future work to evolve ad blocking and optimize crowdsourcing mechanisms.

UR - http://www.scopus.com/inward/record.url?scp=85074855325&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074855325&partnerID=8YFLogxK

U2 - 10.1145/3355369.3355588

DO - 10.1145/3355369.3355588

M3 - Conference contribution

AN - SCOPUS:85074855325

T3 - Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC

SP - 230

EP - 244

BT - IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference

PB - Association for Computing Machinery

ER -

Alrizah M, Xing X, Zhu S. Errors, misunderstandings, and attacks: Analyzing the crowdsourcing process of ad-blocking systems. In IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference. Association for Computing Machinery. 2019. p. 230-244. (Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC). https://doi.org/10.1145/3355369.3355588