Address Geocoding with ArcGIS
Geographic Information Systems help us associate information from various sources based on spatial references. Fundamentally, the spatial references required are X,Y coordinates in a documented geographic or projected coordinate system. Nevertheless, we often have useful information about locations that are referenced by street address (e.g. 48 Quincy Street, Cambridge MA, 02138). Transforming this sort of reference into a simple X and Y is a process known as Address Matching, or Geocoding. Thanks to a nationwide GIS database of streets and address ranges and GIS applications such as ArcGIS, the procedure for geocoding a table of 50 addresses is far simpler than the alternative of finding them yourself.
The U.S. Census Bureau maintains a database of streets with address ranges, which has been enhanced in many commercial geocoding products.
A good geocoding tool can tolerate predictable errors, but the process inevitably involves several types of inaccuracy that should be understood.
These days, address matching is embedded in many web search tools that return spots on a map in answer to a web query that includes an address. Some web map providers, such as Yahoo, offer open interfaces for geocoding. Here is an example of an application from Juice Analytics that uses Yahoo's geocoding interface to geocode an Excel spreadsheet. When we use geocoded addresses for research purposes, it is critical to understand how the process works and the types of errors that can be expected. ArcGIS provides an interface that offers a decent level of control over this process and transparency into it.
Address-match geocoding requires a database of properly formed addresses, a reference database of streets, and a set of rules for matching them. In ArcMap, this set of rules is known as a Geocoding Service. Within the GSD, we have a commercial national streets database product, ESRI Streetmap, shared on the network as l:\public\geo\esridata_9\streetmap. ESRI Streetmap includes a ready-made geocoding service, so if you are working in the Design School and your U.S. addresses with zip codes are formatted correctly in a dBase table, getting started with geocoding should be fairly easy.
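To make the idea of a "properly formed" address concrete, here is a minimal Python sketch of the kind of standardization a matching rule might apply before comparing an address with the street table. It is not how ArcGIS parses addresses; the function name parse_address and the small abbreviation table are assumptions made up for illustration.

```python
import re

# Hypothetical abbreviation table; a real geocoding service uses a far larger one.
STREET_TYPES = {"STREET": "ST", "ST": "ST", "AVENUE": "AVE", "AVE": "AVE",
                "ROAD": "RD", "RD": "RD"}

def parse_address(line):
    """Split '48 Quincy Street' into (house_number, street_name, street_type)."""
    # Treat periods and commas as spaces, then compare in upper case.
    tokens = re.sub(r"[.,]", " ", line).upper().split()
    if not tokens or not tokens[0].isdigit():
        return None  # no leading house number, so there is nothing to match on
    number = int(tokens[0])
    rest = tokens[1:]
    street_type = ""
    if rest and rest[-1] in STREET_TYPES:
        street_type = STREET_TYPES[rest.pop()]
    return number, " ".join(rest), street_type
```

With this sketch, "48 Quincy Street" and "48 Quincy St." reduce to the same three parts, which is what lets a matching rule tolerate small differences in how an address was typed.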
References and Deeper Reading
- Geocoding in ArcGIS goes into all the nuts and bolts.
Our Conceptual Model
It is really not very interesting to go through an exercise in information transformation without a purpose. Today our purpose is to represent coffee shops and laundromats in Somerville, Massachusetts as best we can. We will see many ways of doing this, and we will try to evaluate them all and choose the best one for a given purpose (for example, exploring the distribution of income, measured as median household income aggregated at the block group level).
The premise for the first part of Part A is that we want to experience the joys and sorrows of Address-Match Geocoding. We will take a list of businesses in our study area, geocode them, and think critically about the results.
- Find the data
- Set up a geocoding service
- Geocode the businesses in our study area
- Evaluate the result.
Explore the Sample Dataset
- A text file named coffeelaund.csv from the Businesses folder. This file was gathered through the process described in the GSD manual page, US Business Directory.
- For reference, we have an excerpt of the detailed streets from ESRI's Streetmap USA. This is the shapefile and layer file for detailed streets in the sample dataset. You should use the layer file to provide the symbolization for your streets. You can read about how to get your own information from Streetmap in the GSD On-Line manual page: Beginning your GIS Database.
- In order to evaluate the geocoding results you may want to use the parcels layer found in the Somerville Group Layer.
Explore the Data
- Look at the coffeelaund.csv file using WordPad.
- Open it in ArcMap and look at it.
- Check out the Local Streets Layer (in the streetmap group) and its attributes. Note especially the address ranges and zip codes.
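If you prefer to inspect the table in a script rather than in WordPad, the following Python sketch lists the column names and counts the rows. It assumes the file is comma-delimited with a header row, which this page does not state; the function name summarize_table and the column names in the stand-in data are invented for illustration and are not the real fields of coffeelaund.csv.

```python
import csv
import io

def summarize_table(fileobj):
    """Return (field_names, row_count) for a comma-delimited table with a header row."""
    reader = csv.DictReader(fileobj)
    rows = list(reader)  # reading the rows also populates reader.fieldnames
    return reader.fieldnames, len(rows)

# With the tutorial file (adjust the path to wherever you saved the sample data):
# with open("coffeelaund.csv", newline="") as f:
#     print(summarize_table(f))

# Stand-in data with made-up column names, only to show the call.
sample = io.StringIO("NAME,ADDRESS,ZIP\nExample Cafe,1 Example St,02143\n")
fields, count = summarize_table(sample)
```

Knowing which columns hold the street address and the zip code is what you need when the geocoding dialog asks you to identify those fields.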
Set up an Address Locator
The Address Locator more or less sets up a very complex and surprisingly flexible join condition between a properly formatted table of addresses and a properly formatted table of vector objects representing street segments (you might think of these as block edges) with their address ranges.
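The join described above can be sketched in a few lines of Python. This is an illustration of the general technique (match on street name and zip code, test the house number against each side's address range, then interpolate along the segment), not the rules ArcGIS actually applies. The Segment fields are hypothetical names, and each segment is assumed to be a straight line between two points.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    # Hypothetical field names; real street tables name these differently.
    name: str
    zip_code: str
    left_from: int
    left_to: int
    right_from: int
    right_to: int
    start: tuple  # (x, y) where the address range begins
    end: tuple    # (x, y) where the address range ends

def locate(number, street, zip_code, segments):
    """Return (x, y, side) for the first segment whose range contains the address."""
    for seg in segments:
        if seg.name != street or seg.zip_code != zip_code:
            continue
        for side, a, b in (("L", seg.left_from, seg.left_to),
                           ("R", seg.right_from, seg.right_to)):
            # The number must lie in the range and share its odd/even parity.
            if min(a, b) <= number <= max(a, b) and number % 2 == a % 2:
                # Fraction of the way along the block, by house number.
                t = 0.5 if a == b else (number - a) / (b - a)
                x = seg.start[0] + t * (seg.end[0] - seg.start[0])
                y = seg.start[1] + t * (seg.end[1] - seg.start[1])
                return x, y, side
    return None  # unmatched address
```

Note that the position is interpolated purely from the house number, so the result assumes addresses are spaced evenly along the block. That assumption is one reason geocoded points rarely land on the correct parcel.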
- Open ArcCatalog
- Navigate to your Tools folder
- Right-Click and choose New->Address Locator
- In the first dialog box, for choosing the style of the geocoding service, choose U.S. Streets with Zone
- In the next dialog box, set the Reference Data property to point to your copy of the streets file. You can use the defaults for the rest of the blanks in this dialog.
- Check out all the options for your Address Locator, especially 'Side Offset'.
- Copy this address locator to your working directory.
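The 'Side Offset' option mentioned in the steps above moves each matched point off the street centerline toward the side of the street where the address falls. The Python sketch below shows the general geometry of such an offset, assuming a straight segment and planar coordinates. It is not ArcGIS's implementation, and the function name offset_point and its parameters are hypothetical.

```python
import math

def offset_point(start, end, t, side, distance):
    """Point a fraction t along start->end, pushed `distance` units to one side."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    # Position on the centerline.
    x = start[0] + t * dx
    y = start[1] + t * dy
    # Unit normal pointing left of the direction of travel; flipped for the right side.
    nx, ny = -dy / length, dx / length
    sign = 1 if side == "L" else -1
    return x + sign * distance * nx, y + sign * distance * ny
```

The offset distance is in the units of the street data's coordinate system, so a value that looks sensible in feet would be far too large for data stored in decimal degrees.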
Note that there is no way to set relative pathnames when referencing your streets data file, so expect problems if you move your working directory!
If you are geocoding using streets from the on-line version of Streetmap, you will find a ready-made address locator in the on-line Streetmap directory.
Geocode the Addresses in our study area
Now that we have set up the join, our address locator will allow us to take any table, with appropriately formatted address and zip fields, and associate each row in that table with a street segment in our reference database. Or at least it will try very hard to do this. But as you will see, it is not always successful. Sometimes it may be successful, when it might better have failed!
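One way to see how a locator can "succeed" when it should have failed is to look at approximate string matching. The Python sketch below scores a street name against candidates with the standard library's difflib and accepts the best one above a threshold. This is a stand-in technique chosen for illustration, not the scoring ArcGIS uses; best_match and its min_score default are hypothetical.

```python
from difflib import SequenceMatcher

def best_match(query, candidates, min_score=0.8):
    """Return (candidate, score) for the closest street name, or None below min_score."""
    scored = [(SequenceMatcher(None, query.upper(), c.upper()).ratio(), c)
              for c in candidates]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= min_score else None
```

A tolerant threshold lets a misspelled name still find its street, but it also means two similar street names can score almost equally for the same input, and the locator will confidently pick one of them. A strict threshold trades those false matches for more unmatched rows.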
- Geocoding a table of addresses
- Rematching a Geocoded Featureclass. See especially the section on interactive matching.
- Geocode your file
- Try some Interactive Matching
Evaluate the Result
To understand geocoding you must understand the various kinds of error that result from the process:
- Almost all of the resulting points fall, at best, on the right block, but almost never within the correct property parcel.
- Many addresses can be matched in the wrong place based on errors in the street database.
- Many addresses are not matched at all because of inconsistencies between the address data and the street data.
If you want to really understand these errors, you may want to do a systematic comparison of your geocoded addresses with the provided property parcel map, or with the locations of the businesses as indicated by the latitude and longitude fields (hint: use ArcView's "Tools->AddXY Data" procedure).
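If you export both sets of coordinates, the comparison with the latitude and longitude fields can also be done numerically. The Python sketch below computes the great-circle distance between each geocoded point and its reference point. It assumes both are expressed as latitude and longitude in decimal degrees on the same datum; the function names are hypothetical, and the spherical Earth radius is an approximation that is adequate at block scale.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def summarize_errors(pairs):
    """pairs: iterable of ((lat, lon) geocoded, (lat, lon) reference).

    Returns the positional errors in meters, sorted from smallest to largest.
    """
    return sorted(haversine_m(g[0], g[1], r[0], r[1]) for g, r in pairs)
```

Sorting the distances makes it easy to read off the median error and to spot the few large values that usually indicate a match on the wrong street or in the wrong zip code.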