Unsupervised discovery of drug side-effects from heterogeneous data sources

Fenglong Ma, Chuishi Meng, Houping Xiao, Qi Li, Jing Gao, Lu Su, Aidong Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

22 Citations (Scopus)

Abstract

Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.

Original languageEnglish (US)
Title of host publicationKDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages967-976
Number of pages10
ISBN (Electronic)9781450348874
DOIs
StatePublished - Aug 13 2017
Event23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017 - Halifax, Canada
Duration: Aug 13 2017Aug 17 2017

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
VolumePart F129685

Other

Other23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017
CountryCanada
CityHalifax
Period8/13/178/17/17

Fingerprint

Public health
Drug products
Labels
Industry

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Cite this

Ma, F., Meng, C., Xiao, H., Li, Q., Gao, J., Su, L., & Zhang, A. (2017). Unsupervised discovery of drug side-effects from heterogeneous data sources. In KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 967-976). (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Vol. Part F129685). Association for Computing Machinery. https://doi.org/10.1145/3097983.3098129
Ma, Fenglong ; Meng, Chuishi ; Xiao, Houping ; Li, Qi ; Gao, Jing ; Su, Lu ; Zhang, Aidong. / Unsupervised discovery of drug side-effects from heterogeneous data sources. KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2017. pp. 967-976 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).
@inproceedings{0f6acbcf65014a75a0bdf80d6a1c9a95,
title = "Unsupervised discovery of drug side-effects from heterogeneous data sources",
abstract = "Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.",
author = "Fenglong Ma and Chuishi Meng and Houping Xiao and Qi Li and Jing Gao and Lu Su and Aidong Zhang",
year = "2017",
month = "8",
day = "13",
doi = "10.1145/3097983.3098129",
language = "English (US)",
series = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
publisher = "Association for Computing Machinery",
pages = "967--976",
booktitle = "KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

}

Ma, F, Meng, C, Xiao, H, Li, Q, Gao, J, Su, L & Zhang, A 2017, Unsupervised discovery of drug side-effects from heterogeneous data sources. in KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. Part F129685, Association for Computing Machinery, pp. 967-976, 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017, Halifax, Canada, 8/13/17. https://doi.org/10.1145/3097983.3098129

Unsupervised discovery of drug side-effects from heterogeneous data sources. / Ma, Fenglong; Meng, Chuishi; Xiao, Houping; Li, Qi; Gao, Jing; Su, Lu; Zhang, Aidong.

KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2017. p. 967-976 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Vol. Part F129685).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Unsupervised discovery of drug side-effects from heterogeneous data sources

AU - Ma, Fenglong

AU - Meng, Chuishi

AU - Xiao, Houping

AU - Li, Qi

AU - Gao, Jing

AU - Su, Lu

AU - Zhang, Aidong

PY - 2017/8/13

Y1 - 2017/8/13

N2 - Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.

AB - Drug side-effects become a worldwide public health concern, which are the fourth leading cause of death in the United States. Pharmaceutical industry has paid tremendous effort to identify drug side-effects during the drug development. However, it is impossible and impractical to identify all of them. Fortunately, drug side-effects can also be reported on heterogeneous platforms (i.e., data sources), such as FDA Adverse Event Reporting System and various online communities. However, existing supervised and semi-supervised approaches are not practical as annotating labels are expensive in the medical field. In this paper, we propose a novel and effective unsupervised model Sifter to automatically discover drug side-effects. Sifter enhances the estimation on drug side-effects by learning from various online platforms and measuring platform-level and user-level quality simultaneously. In this way, Sifter demonstrates better performance compared with existing approaches in terms of correctly identifying drug side-effects. Experimental results on five real-world datasets show that Sifter can significantly improve the performance of identifying side-effects compared with the state-of-the-art approaches.

UR - http://www.scopus.com/inward/record.url?scp=85029026084&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029026084&partnerID=8YFLogxK

U2 - 10.1145/3097983.3098129

DO - 10.1145/3097983.3098129

M3 - Conference contribution

AN - SCOPUS:85029026084

T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

SP - 967

EP - 976

BT - KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

ER -

Ma F, Meng C, Xiao H, Li Q, Gao J, Su L et al. Unsupervised discovery of drug side-effects from heterogeneous data sources. In KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery. 2017. p. 967-976. (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). https://doi.org/10.1145/3097983.3098129