"is a picture really worth a thousand words?": A case study on classifying user attributes on Instagram

Junho Song, Kyungsik Han, Dongwon Lee, Sang Wook Kim

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Because using social media has become a major part of people's daily lives, many of their personal characteristics are often implicitly or explicitly reflected in the content they share. We present a study of two personal characteristics, age and gender, related to user engagement on Instagram that can be determined through the characterization of images and tags. We demonstrate the strong influence of age and gender on Instagram use in terms of topical and content differences. We then build age and gender classification models that yield F1 scores of up to 88% and 74% in the detection of age and gender, respectively, and that better characterize users by images than by tags. We further demonstrate the robustness of our models using a new set of test data, with which the models exhibit greater overall performance than human raters. Our study highlights that future research should look to exploit images to a greater degree because they complement text and there are many unexamined images with no embedded text available.

Original language: English (US)
Article number: e0204938
Journal: PLoS One
Volume: 13
Issue number: 10
DOI: 10.1371/journal.pone.0204938
State: Published - Oct 2018

Fingerprint

Social Media
case studies
gender
social networks
complement
Datasets
testing

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

@article{9cae8771749e4e69ad9d737ea8e255ec,
title = "{"}is a picture really worth a thousand words?{"}: A case study on classifying user attributes on Instagram",
abstract = "Because using social media has become a major part of people's daily lives, many of their personal characteristics are often implicitly or explicitly reflected in the content they share. We present a study of two personal characteristics, age and gender, related to user engagement on Instagram that can be determined through the characterization of images and tags. We demonstrate the strong influence of age and gender on Instagram use in terms of topical and content differences. We then build age and gender classification models that yield F1 scores of up to 88{\%} and 74{\%} in the detection of age and gender, respectively, and that better characterize users by images than by tags. We further demonstrate the robustness of our models using a new set of test data, with which the models exhibit greater overall performance than human raters. Our study highlights that future research should look to exploit images to a greater degree because they complement text and there are many unexamined images with no embedded text available.",
author = "Junho Song and Kyungsik Han and Dongwon Lee and Kim, {Sang Wook}",
year = "2018",
month = "10",
doi = "10.1371/journal.pone.0204938",
language = "English (US)",
volume = "13",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "10",

}

"is a picture really worth a thousand words?" : A case study on classifying user attributes on Instagram. / Song, Junho; Han, Kyungsik; Lee, Dongwon; Kim, Sang Wook.

In: PLoS One, Vol. 13, No. 10, e0204938, 10.2018.


TY - JOUR

T1 - "is a picture really worth a thousand words?"

T2 - A case study on classifying user attributes on Instagram

AU - Song, Junho

AU - Han, Kyungsik

AU - Lee, Dongwon

AU - Kim, Sang Wook

PY - 2018/10

Y1 - 2018/10

N2 - Because using social media has become a major part of people's daily lives, many of their personal characteristics are often implicitly or explicitly reflected in the content they share. We present a study of two personal characteristics, age and gender, related to user engagement on Instagram that can be determined through the characterization of images and tags. We demonstrate the strong influence of age and gender on Instagram use in terms of topical and content differences. We then build age and gender classification models that yield F1 scores of up to 88% and 74% in the detection of age and gender, respectively, and that better characterize users by images than by tags. We further demonstrate the robustness of our models using a new set of test data, with which the models exhibit greater overall performance than human raters. Our study highlights that future research should look to exploit images to a greater degree because they complement text and there are many unexamined images with no embedded text available.

AB - Because using social media has become a major part of people's daily lives, many of their personal characteristics are often implicitly or explicitly reflected in the content they share. We present a study of two personal characteristics, age and gender, related to user engagement on Instagram that can be determined through the characterization of images and tags. We demonstrate the strong influence of age and gender on Instagram use in terms of topical and content differences. We then build age and gender classification models that yield F1 scores of up to 88% and 74% in the detection of age and gender, respectively, and that better characterize users by images than by tags. We further demonstrate the robustness of our models using a new set of test data, with which the models exhibit greater overall performance than human raters. Our study highlights that future research should look to exploit images to a greater degree because they complement text and there are many unexamined images with no embedded text available.

UR - http://www.scopus.com/inward/record.url?scp=85054465519&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054465519&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0204938

DO - 10.1371/journal.pone.0204938

M3 - Article

C2 - 30289937

AN - SCOPUS:85054465519

VL - 13

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 10

M1 - e0204938

ER -