Assessing the certainty of locations produced by an address geocoding system

Clodoveu A. Davis, Frederico T. Fonseca

Research output: Contribution to journal › Article

48 Citations (Scopus)

Abstract

Addresses are the most common georeferencing resource people use to communicate to others a location within a city. Urban GIS applications that receive data directly from citizens, or from legacy information systems, need to be able to quickly and efficiently obtain a spatial location from addresses. In this paper we understand addresses in a broader perspective, in which not only the conventional elements of postal addresses are considered, but other kinds of direct or indirect references to places, such as building names, postal codes, or telephone area codes, which are also valuable as locators to urban places. This broader view on addresses allows us to work with two perspectives. First, in the ontological definition, modeling, and implementation of an addressing database that is flexible enough to accommodate the variety of concepts and address formats used worldwide, along with direct and indirect references to places. Second, in the definition of an indicator that is able to quantify the degree of certainty that could be reached when a user-given, semi-structured address is geocoded into a spatial position, as a function of the type and completeness of the available addressing data and of the geocoding method that has been employed. This indicator, which we call Geocoding Certainty Indicator (GCI), can be used as a threshold, beyond which the geocoded event should be left out of any statistical analysis, or as a weight that allows spatial analysis methods to reduce the influence of events that have been less reliably located. In order to support geocoding activities and the determination of the GCI, we propose a conceptual schema for addressing databases. The schema is flexible enough to accommodate a variety of addressing systems, at various levels of detail, and in different countries. Our intention is to depart from the usual geocoding strategy employed in commercial GIS products, which is usually limited to the average American or British address format. The schema also extends the notion of postal address to something broader, including popular names for places, building names, reference places, and other concepts. This approach extends the work of Simpson and Yu (Comput. Environ. Urban Syst., 27: 283-307, 2003) on postal codes to records of any kind, including place names and loosely formatted addresses.
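The two uses of the GCI named in the abstract (a cutoff and a weight) can be illustrated with a minimal sketch. The abstract does not give the GCI formula or its scale, so the code below assumes a score in [0, 1] where higher means more certain; the names GeocodedEvent, filter_by_gci, and weighted_mean_center are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class GeocodedEvent:
    x: float
    y: float
    gci: float  # ASSUMPTION: certainty score in [0, 1], higher = more reliable


def filter_by_gci(events, min_gci):
    """Threshold use: keep only events whose certainty reaches min_gci."""
    return [e for e in events if e.gci >= min_gci]


def weighted_mean_center(events):
    """Weight use: mean center in which each event counts in proportion to its GCI."""
    total = sum(e.gci for e in events)
    if total == 0:
        raise ValueError("no events with positive certainty")
    cx = sum(e.x * e.gci for e in events) / total
    cy = sum(e.y * e.gci for e in events) / total
    return cx, cy


# Illustrative data only; these are not values from the paper.
events = [
    GeocodedEvent(0.0, 0.0, 1.0),      # e.g. matched to an exact address
    GeocodedEvent(10.0, 0.0, 0.5),     # e.g. matched to a street centerline
    GeocodedEvent(100.0, 100.0, 0.1),  # e.g. matched only to a large region
]

reliable = filter_by_gci(events, min_gci=0.5)
center = weighted_mean_center(reliable)
```

Filtering discards the poorly located event outright, while weighting keeps every retained event but lets the less certain one pull the center less strongly. Which use is appropriate depends on the analysis method.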

Original language: English (US)
Pages (from-to): 103-129
Number of pages: 27
Journal: GeoInformatica
Volume: 11
Issue number: 1
DOI: 10.1007/s10707-006-0015-7
State: Published - Mar 1 2007

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Geography, Planning and Development

Cite this

@article{c88c66e9809c423aa4fc971357ead9df,
title = "Assessing the certainty of locations produced by an address geocoding system",
author = "Davis, {Clodoveu A.} and Fonseca, {Frederico T.}",
year = "2007",
month = "3",
day = "1",
doi = "10.1007/s10707-006-0015-7",
language = "English (US)",
volume = "11",
pages = "103--129",
journal = "GeoInformatica",
issn = "1384-6175",
publisher = "Kluwer Academic Publishers",
number = "1",
}

Assessing the certainty of locations produced by an address geocoding system. / Davis, Clodoveu A.; Fonseca, Frederico T.

In: GeoInformatica, Vol. 11, No. 1, 01.03.2007, p. 103-129.

TY - JOUR

T1 - Assessing the certainty of locations produced by an address geocoding system

AU - Davis, Clodoveu A.

AU - Fonseca, Frederico T.

PY - 2007/3/1

Y1 - 2007/3/1

AB - Addresses are the most common georeferencing resource people use to communicate to others a location within a city. Urban GIS applications that receive data directly from citizens, or from legacy information systems, need to be able to quickly and efficiently obtain a spatial location from addresses. In this paper we understand addresses in a broader perspective, in which not only the conventional elements of postal addresses are considered, but other kinds of direct or indirect references to places, such as building names, postal codes, or telephone area codes, which are also valuable as locators to urban places. This broader view on addresses allows us to work with two perspectives. First, in the ontological definition, modeling, and implementation of an addressing database that is flexible enough to accommodate the variety of concepts and address formats used worldwide, along with direct and indirect references to places. Second, in the definition of an indicator that is able to quantify the degree of certainty that could be reached when a user-given, semi-structured address is geocoded into a spatial position, as a function of the type and completeness of the available addressing data and of the geocoding method that has been employed. This indicator, which we call Geocoding Certainty Indicator (GCI), can be used as a threshold, beyond which the geocoded event should be left out of any statistical analysis, or as a weight that allows spatial analysis methods to reduce the influence of events that have been less reliably located. In order to support geocoding activities and the determination of the GCI, we propose a conceptual schema for addressing databases. The schema is flexible enough to accommodate a variety of addressing systems, at various levels of detail, and in different countries. Our intention is to depart from the usual geocoding strategy employed in commercial GIS products, which is usually limited to the average American or British address format. The schema also extends the notion of postal address to something broader, including popular names for places, building names, reference places, and other concepts. This approach extends the work of Simpson and Yu (Comput. Environ. Urban Syst., 27: 283-307, 2003) on postal codes to records of any kind, including place names and loosely formatted addresses.

UR - http://www.scopus.com/inward/record.url?scp=33847622632&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33847622632&partnerID=8YFLogxK

U2 - 10.1007/s10707-006-0015-7

DO - 10.1007/s10707-006-0015-7

M3 - Article

AN - SCOPUS:33847622632

VL - 11

SP - 103

EP - 129

JO - GeoInformatica

JF - GeoInformatica

SN - 1384-6175

IS - 1

ER -