Warehousing structured and unstructured data for data mining

L. L. Miller, Vasant Honavar, Tom Barta

Research output: Contribution to journal (Article)

2 Scopus citations

Abstract

More data, especially unstructured data, is available to users than ever before. There is so much of it that users find it difficult to make use of their data in its raw form. To handle the diversity of data types, we have designed and prototyped a multidatabase/warehouse system. The system has been designed specifically to facilitate the interaction of structured and unstructured data, and it makes use of object-oriented views. The main features of the view mechanism, especially as they relate to textual documents, are presented in the paper. The system is designed to take target documents either from large repositories or from the Web, and issues for both sources of documents are examined. The paper also looks at how the view approach allows interaction among data taken from structured (e.g., relational), semistructured (e.g., object-oriented), and unstructured (e.g., text) data sources. The warehouse support provided by the system is briefly examined, and the paper concludes by looking at our approach to data mining and how the system will operate in the complete environment.

Original language: English (US)
Pages (from-to): 215-224
Number of pages: 10
Journal: Proceedings of the ASIS Annual Meeting
Volume: 34
State: Published - 1997

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences
