A Faster Algorithm for Truth Discovery via Range Cover

Ziyun Huang, Hu Ding, Jinhui Xu

Research output: Contribution to journal › Article

Abstract

Truth discovery is a key problem in data analytics that has received a great deal of attention in recent years. In this problem, we seek to obtain trustworthy information from data aggregated from multiple (possibly) unreliable sources. Most existing approaches to this problem are heuristic in nature and do not provide any quality guarantee. Very recently, the first quality-guaranteed algorithm was discovered. However, the running time of that algorithm depends on the spread ratio of the input points and is fully polynomial only when the spread ratio is relatively small. This could restrict the applicability of the algorithm. To resolve this issue, we propose in this paper a new algorithm that, with constant probability, yields a (1 + ϵ)-approximation in near-quadratic time for any dataset. Our algorithm relies on a data structure called range cover, which is interesting in its own right. The data structure provides a general approach for solving some high-dimensional optimization problems by breaking them down into a small number of parametrized cases.
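To make the problem setting concrete, the following is a minimal sketch of the kind of heuristic the abstract contrasts against: it alternates between estimating the truth as a weighted mean of the sources' reported vectors and re-weighting each source by how close it lies to that estimate. This is not the paper's range-cover algorithm and carries no approximation guarantee; the function name, the logarithmic weight update, and the parameters `iterations` and `eps` are all assumptions chosen for illustration.

```python
import math


def truth_discovery_heuristic(points, iterations=20, eps=1e-12):
    """Illustrative alternating heuristic (not the paper's algorithm).

    points: list of equal-length coordinate lists, one vector per source.
    Returns (truth_estimate, weights).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two sources")
    dim = len(points[0])
    weights = [1.0 / n] * n  # start by trusting every source equally
    truth = [0.0] * dim
    for _ in range(iterations):
        # Step 1: truth estimate = weighted mean of the reported vectors.
        total = sum(weights)
        truth = [
            sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(dim)
        ]
        # Step 2: squared distance of each source from the estimate.
        # eps keeps every distance positive so the logarithm is defined.
        dists = [
            sum((p[j] - truth[j]) ** 2 for j in range(dim)) + eps
            for p in points
        ]
        dist_sum = sum(dists)
        # Step 3: sources closer to the estimate get larger weights.
        # With n >= 2 each ratio is below 1, so every weight is positive.
        weights = [-math.log(d / dist_sum) for d in dists]
    return truth, weights


if __name__ == "__main__":
    # Three sources roughly agree near (1, 1); the fourth is unreliable.
    reports = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [5.0, 5.0]]
    estimate, source_weights = truth_discovery_heuristic(reports)
    print("estimate:", estimate)
    print("weights:", source_weights)
```

In this toy input the outlying source ends up with the smallest weight and the estimate settles near the three agreeing reports, but a heuristic of this kind can converge to a poor solution on adversarial inputs, which is the gap that quality-guaranteed algorithms aim to close.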

Original language: English (US)
Pages (from-to): 4118-4133
Number of pages: 16
Journal: Algorithmica
Volume: 81
Issue number: 10
DOIs: 10.1007/s00453-019-00562-z
State: Published - Oct 1 2019

Fingerprint

Fast Algorithm
Cover
Range of data
Data structures
Resolve
High-dimensional
Truth
Polynomials
Heuristics
Optimization Problem
Polynomial
Approximation

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Computer Science Applications
  • Applied Mathematics

Cite this

Huang, Ziyun ; Ding, Hu ; Xu, Jinhui. / A Faster Algorithm for Truth Discovery via Range Cover. In: Algorithmica. 2019 ; Vol. 81, No. 10. pp. 4118-4133.
@article{49c262e4f8e04beea7850963fecf2a11,
title = "A Faster Algorithm for Truth Discovery via Range Cover",
abstract = "Truth discovery is a key problem in data analytics that has received a great deal of attention in recent years. In this problem, we seek to obtain trustworthy information from data aggregated from multiple (possibly) unreliable sources. Most existing approaches to this problem are heuristic in nature and do not provide any quality guarantee. Very recently, the first quality-guaranteed algorithm was discovered. However, the running time of that algorithm depends on the spread ratio of the input points and is fully polynomial only when the spread ratio is relatively small. This could restrict the applicability of the algorithm. To resolve this issue, we propose in this paper a new algorithm that, with constant probability, yields a (1 + ϵ)-approximation in near-quadratic time for any dataset. Our algorithm relies on a data structure called range cover, which is interesting in its own right. The data structure provides a general approach for solving some high-dimensional optimization problems by breaking them down into a small number of parametrized cases.",
author = "Ziyun Huang and Hu Ding and Jinhui Xu",
year = "2019",
month = "10",
day = "1",
doi = "10.1007/s00453-019-00562-z",
language = "English (US)",
volume = "81",
pages = "4118--4133",
journal = "Algorithmica",
issn = "0178-4617",
publisher = "Springer New York",
number = "10",

}

TY - JOUR

T1 - A Faster Algorithm for Truth Discovery via Range Cover

AU - Huang, Ziyun

AU - Ding, Hu

AU - Xu, Jinhui

PY - 2019/10/1

Y1 - 2019/10/1

N2 - Truth discovery is a key problem in data analytics that has received a great deal of attention in recent years. In this problem, we seek to obtain trustworthy information from data aggregated from multiple (possibly) unreliable sources. Most existing approaches to this problem are heuristic in nature and do not provide any quality guarantee. Very recently, the first quality-guaranteed algorithm was discovered. However, the running time of that algorithm depends on the spread ratio of the input points and is fully polynomial only when the spread ratio is relatively small. This could restrict the applicability of the algorithm. To resolve this issue, we propose in this paper a new algorithm that, with constant probability, yields a (1 + ϵ)-approximation in near-quadratic time for any dataset. Our algorithm relies on a data structure called range cover, which is interesting in its own right. The data structure provides a general approach for solving some high-dimensional optimization problems by breaking them down into a small number of parametrized cases.

UR - http://www.scopus.com/inward/record.url?scp=85064082201&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064082201&partnerID=8YFLogxK

U2 - 10.1007/s00453-019-00562-z

DO - 10.1007/s00453-019-00562-z

M3 - Article

AN - SCOPUS:85064082201

VL - 81

SP - 4118

EP - 4133

JO - Algorithmica

JF - Algorithmica

SN - 0178-4617

IS - 10

ER -