Computational linguistics for metadata building (CLiMB): Using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata

Judith L. Klavans, Carolyn Sheffield, Eileen Abels, Jimmy Lin, Rebecca Jane Passonneau, Tandeep Sidhu, Dagobert Soergel

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.

Original language: English (US)
Pages (from-to): 115-138
Number of pages: 24
Journal: Multimedia Tools and Applications
Volume: 42
Issue number: 1
DOIs: 10.1007/s11042-008-0253-9
State: Published - Mar 1 2009

Fingerprint

Computational linguistics
Thesauri
Metadata
Painting
Labeling
Learning systems
Semantics

All Science Journal Classification (ASJC) codes

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

@article{9898cb1e73e7422193747d245b871d31,
title = "Computational linguistics for metadata building (CLiMB): Using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata",
abstract = "In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.",
author = "Klavans, {Judith L.} and Carolyn Sheffield and Eileen Abels and Jimmy Lin and Passonneau, {Rebecca Jane} and Tandeep Sidhu and Dagobert Soergel",
year = "2009",
month = "3",
day = "1",
doi = "10.1007/s11042-008-0253-9",
language = "English (US)",
volume = "42",
pages = "115--138",
journal = "Multimedia Tools and Applications",
issn = "1380-7501",
publisher = "Springer Netherlands",
number = "1",
}

Computational linguistics for metadata building (CLiMB): Using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata. / Klavans, Judith L.; Sheffield, Carolyn; Abels, Eileen; Lin, Jimmy; Passonneau, Rebecca Jane; Sidhu, Tandeep; Soergel, Dagobert.

In: Multimedia Tools and Applications, Vol. 42, No. 1, 01.03.2009, p. 115-138.

TY - JOUR

T1 - Computational linguistics for metadata building (CLiMB)

T2 - Using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata

AU - Klavans, Judith L.

AU - Sheffield, Carolyn

AU - Abels, Eileen

AU - Lin, Jimmy

AU - Passonneau, Rebecca Jane

AU - Sidhu, Tandeep

AU - Soergel, Dagobert

PY - 2009/3/1

Y1 - 2009/3/1

N2 - In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.

AB - In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.

UR - http://www.scopus.com/inward/record.url?scp=59849109123&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=59849109123&partnerID=8YFLogxK

U2 - 10.1007/s11042-008-0253-9

DO - 10.1007/s11042-008-0253-9

M3 - Article

AN - SCOPUS:59849109123

VL - 42

SP - 115

EP - 138

JO - Multimedia Tools and Applications

JF - Multimedia Tools and Applications

SN - 1380-7501

IS - 1

ER -