Probabilistic Models for Fine-Grained Truth Discovery from Crowdsourced Data

Fenglong Ma, Jing Gao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In crowdsourced data aggregation tasks, the answers provided by large numbers of sources on the same set of questions often conflict. The most important challenge for this task is to estimate source reliability and select the answers provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources' reliability and inferring the questions' true answers (i.e., the truths). However, these methods assume that a source has the same degree of reliability on all questions, ignoring the fact that a source's reliability may vary significantly across topics. To capture varying expertise levels on different topics, we propose three fine-grained truth discovery models for aggregating conflicting data collected from multiple users/sources: a parametric probabilistic model (FaitCrowd), a non-parametric probabilistic model, and a topical influence-aware model. These probabilistic models jointly model the process of generating question content and the answers that sources provide, estimating fine-grained expertise and true answers simultaneously. This leads to a more precise estimate of source reliability. Therefore, these models are better able to obtain the true answers to the questions than existing approaches.
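The general idea of estimating topic-specific source reliability and true answers together can be illustrated with a small sketch. This is not FaitCrowd or either of the other two models in the paper, which are probabilistic generative models; it is a generic iterative weighted-voting scheme chosen only for illustration. The function name `discover_truths`, the `smoothing` parameter, the toy data, and the assumption that each question's topic is already known (rather than inferred from question content) are all hypothetical.

```python
from collections import defaultdict


def discover_truths(answers, topics, n_iters=10, smoothing=1.0):
    """Illustrative topic-aware truth discovery (not the paper's models).

    answers: list of (source, question, answer) triples.
    topics:  dict mapping each question to a topic label (assumed given).
    """
    weights = {}  # (source, topic) -> estimated reliability in (0, 1)
    truths = {}
    for _ in range(n_iters):
        # Step 1: weighted vote per question, using topic-specific weights.
        # Unseen (source, topic) pairs start at a neutral weight of 0.5.
        votes = defaultdict(lambda: defaultdict(float))
        for source, question, answer in answers:
            votes[question][answer] += weights.get((source, topics[question]), 0.5)
        # Ties are broken deterministically by sorted answer order.
        truths = {q: max(sorted(c), key=c.get) for q, c in votes.items()}

        # Step 2: re-estimate each source's accuracy on each topic
        # against the current truths, with additive smoothing.
        correct = defaultdict(float)
        total = defaultdict(float)
        for source, question, answer in answers:
            key = (source, topics[question])
            total[key] += 1.0
            if truths[question] == answer:
                correct[key] += 1.0
        weights = {k: (correct[k] + smoothing) / (total[k] + 2.0 * smoothing)
                   for k in total}
    return truths, weights


# Toy data: alice agrees with bob on sports, carol agrees with bob on history.
topics = {"q1": "sports", "q2": "sports", "q3": "sports",
          "q4": "history", "q5": "history", "q6": "history"}
answers = [
    ("alice", "q1", "a"), ("bob", "q1", "a"), ("carol", "q1", "b"),
    ("alice", "q2", "x"), ("bob", "q2", "x"), ("carol", "q2", "y"),
    ("alice", "q3", "z"), ("carol", "q3", "n"),
    ("alice", "q4", "h2"), ("bob", "q4", "h1"), ("carol", "q4", "h1"),
    ("alice", "q5", "k2"), ("carol", "q5", "k1"),
    ("alice", "q6", "p2"), ("bob", "q6", "p1"), ("carol", "q6", "p1"),
]
truths, weights = discover_truths(answers, topics)
print(truths["q3"], truths["q5"])  # z k1
```

On the two questions where the raw votes are tied (q3 and q5), the sketch sides with alice on the sports question and with carol on the history question, because each has the higher estimated reliability on that topic. A single global reliability score per source could not make both of those calls.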

Original language: English (US)
Title of host publication: Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015
Editors: Xindong Wu, Alexander Tuzhilin, Hui Xiong, Jennifer G. Dy, Charu Aggarwal, Zhi-Hua Zhou, Peng Cui
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1556-1557
Number of pages: 2
ISBN (Electronic): 9781467384926
DOIs: https://doi.org/10.1109/ICDMW.2015.109
State: Published - Jan 29 2016
Event: 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015 - Atlantic City, United States
Duration: Nov 14 2015 to Nov 17 2015

Publication series

Name: Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015

Other

Other: 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015
Country: United States
City: Atlantic City
Period: 11/14/15 to 11/17/15

Fingerprint

Agglomeration
Statistical Models

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Computer Science Applications

Cite this

Ma, F., & Gao, J. (2016). Probabilistic Models for Fine-Grained Truth Discovery from Crowdsourced Data. In X. Wu, A. Tuzhilin, H. Xiong, J. G. Dy, C. Aggarwal, Z-H. Zhou, & P. Cui (Eds.), Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015 (pp. 1556-1557). [7395859] (Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDMW.2015.109
@inproceedings{b1bcb1d92e964ab592f3ca30b9ce4bc9,
title = "Probabilistic Models for Fine-Grained Truth Discovery from Crowdsourced Data",
abstract = "In crowdsourced data aggregation tasks, the answers provided by large numbers of sources on the same set of questions often conflict. The most important challenge for this task is to estimate source reliability and select the answers provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources' reliability and inferring the questions' true answers (i.e., the truths). However, these methods assume that a source has the same degree of reliability on all questions, ignoring the fact that a source's reliability may vary significantly across topics. To capture varying expertise levels on different topics, we propose three fine-grained truth discovery models for aggregating conflicting data collected from multiple users/sources: a parametric probabilistic model (FaitCrowd), a non-parametric probabilistic model, and a topical influence-aware model. These probabilistic models jointly model the process of generating question content and the answers that sources provide, estimating fine-grained expertise and true answers simultaneously. This leads to a more precise estimate of source reliability. Therefore, these models are better able to obtain the true answers to the questions than existing approaches.",
author = "Fenglong Ma and Jing Gao",
year = "2016",
month = "1",
day = "29",
doi = "10.1109/ICDMW.2015.109",
language = "English (US)",
series = "Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1556--1557",
editor = "Xindong Wu and Alexander Tuzhilin and Hui Xiong and Dy, {Jennifer G.} and Charu Aggarwal and Zhi-Hua Zhou and Peng Cui",
booktitle = "Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015",
address = "United States",

}

TY - GEN
T1 - Probabilistic Models for Fine-Grained Truth Discovery from Crowdsourced Data
AU - Ma, Fenglong
AU - Gao, Jing
PY - 2016/1/29
Y1 - 2016/1/29
AB - In crowdsourced data aggregation tasks, the answers provided by large numbers of sources on the same set of questions often conflict. The most important challenge for this task is to estimate source reliability and select the answers provided by high-quality sources. Existing work solves this problem by simultaneously estimating sources' reliability and inferring the questions' true answers (i.e., the truths). However, these methods assume that a source has the same degree of reliability on all questions, ignoring the fact that a source's reliability may vary significantly across topics. To capture varying expertise levels on different topics, we propose three fine-grained truth discovery models for aggregating conflicting data collected from multiple users/sources: a parametric probabilistic model (FaitCrowd), a non-parametric probabilistic model, and a topical influence-aware model. These probabilistic models jointly model the process of generating question content and the answers that sources provide, estimating fine-grained expertise and true answers simultaneously. This leads to a more precise estimate of source reliability. Therefore, these models are better able to obtain the true answers to the questions than existing approaches.
UR - http://www.scopus.com/inward/record.url?scp=84964734734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964734734&partnerID=8YFLogxK
U2 - 10.1109/ICDMW.2015.109
DO - 10.1109/ICDMW.2015.109
M3 - Conference contribution
AN - SCOPUS:84964734734
T3 - Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015
SP - 1556
EP - 1557
BT - Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015
A2 - Wu, Xindong
A2 - Tuzhilin, Alexander
A2 - Xiong, Hui
A2 - Dy, Jennifer G.
A2 - Aggarwal, Charu
A2 - Zhou, Zhi-Hua
A2 - Cui, Peng
PB - Institute of Electrical and Electronics Engineers Inc.
ER -
