Attribute domain discovery for hidden web databases

Xin Jin, Nan Zhang, Gautam Das

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

17 Citations (Scopus)

Abstract

Many web databases are hidden behind restrictive form-like interfaces which may or may not provide domain information for an attribute. When attribute domains are not available, domain discovery becomes a critical challenge facing the application of a broad range of existing techniques on third-party analytical and mash-up applications over hidden databases. In this paper, we consider the problem of domain discovery over a hidden database through its web interface. We prove that for any database schema, an achievability guarantee on domain discovery can be made based solely upon the interface design. We also develop novel techniques which provide effective guarantees on the comprehensiveness of domain discovery. We present theoretical analysis and extensive experiments to illustrate the effectiveness of our approach.

Original language: English (US)
Title of host publication: Proceedings of SIGMOD 2011 and PODS 2011
Pages: 553-564
Number of pages: 12
DOI: https://doi.org/10.1145/1989323.1989381
State: Published - Jul 11 2011
Event: 2011 ACM SIGMOD and 30th PODS 2011 Conference - Athens, Greece
Duration: Jun 12 2011 - Jun 16 2011

Other

Other: 2011 ACM SIGMOD and 30th PODS 2011 Conference
Country: Greece
City: Athens
Period: 6/12/11 - 6/16/11

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Cite this

Jin, X., Zhang, N., & Das, G. (2011). Attribute domain discovery for hidden web databases. In Proceedings of SIGMOD 2011 and PODS 2011 (pp. 553-564). https://doi.org/10.1145/1989323.1989381
@inproceedings{fc48552fbeb040a690faf03606bb0b9b,
title = "Attribute domain discovery for hidden web databases",
author = "Xin Jin and Nan Zhang and Gautam Das",
year = "2011",
month = "7",
day = "11",
doi = "10.1145/1989323.1989381",
language = "English (US)",
isbn = "9781450306614",
pages = "553--564",
booktitle = "Proceedings of SIGMOD 2011 and PODS 2011",

}

TY - GEN
T1 - Attribute domain discovery for hidden web databases
AU - Jin, Xin
AU - Zhang, Nan
AU - Das, Gautam
PY - 2011/7/11
Y1 - 2011/7/11
UR - http://www.scopus.com/inward/record.url?scp=79959982304&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959982304&partnerID=8YFLogxK
U2 - 10.1145/1989323.1989381
DO - 10.1145/1989323.1989381
M3 - Conference contribution
SN - 9781450306614
SP - 553
EP - 564
BT - Proceedings of SIGMOD 2011 and PODS 2011
ER -
