Relational Procedures in ArcGIS
In this lab, we will explore the fundamental relational database procedures available inthe ArcMap interface. These procedures: Sorts, Selections, Transformations, Summaries, Joins are at the heart of the ability of GIS and relational database management systems (RDBMS) to create new information by associating data from different sources. These operations are very well described in Using ArcMap 9.0 Chapter 10
For the next hour and half, you are a reporter for the Boston Globe. You are called upon to investigate a rumor that vast amounts of land in the neigborhoods of Allston and Brighton are being bought up by out-of state interests. Data-savvy reporter that you are, you are able to obtain property tax records for the neighborhood in question, and use relational database procedures to sort, sift and re-organize the data, and combine it with national data to see if, indeed there is a pattern.
|
| We begin with a table of information about property, owners and taxes from the Boston Tax Assessor and a table of 3-digit zipcodes from the US Census Bureau. By applying numeric and text-based transformations to some of the atrributes, we can find new patterns in the value of parcels and the geography of tax bills. Using the relational database procedures of summarize the tax assessor information according to zip-code. Then the summary results can be associated with the geography of zip-coeds through a relational join procedure. Thus we are able to associate information about where tax bills are sent with the geography of zipcodes to create new information about where tax bills are sent. |
It is important to keep the question in mind: Is there a significant amount of land in allston and brighton that is owned by out-of-state people?
We will pay particular attention to the logic of tables that is assumed by the tools that we are using, and how this logic applies to what we are doing. First, we should think about how the tax assessor's data performs as a representation of properties and the locations of owners. Then we have to think about how the procedures we are using serve to represent the real-world associations between the geography of the assessor's database and the geography of US zip codes.
We also should be wary of whether we are using relational tools appropriately. Some very important concepts in this regard are discussed in Using ArcMap 9.0 Chapter 10 in the section on Table Joins Page 338.
References:
- Using ArcMap 9.0 Chapter 10
- ArcMap OnLine Help should be your close companion! References Highlighted Like this can be found in the online help index.
Outline of the Flow of this Lab
- Introduce theAllston & Brighton property databases
- Explore property values with sorts and queries and numeric transformations
- Use a table summary explore national patterns in absentee owners
- Use a table Join to associate information on property owners' zipcodes with a national map.
Introducing the Sample Data
Download the sample dataset relational.zip and extract its contents to a folder in your computer's c:\temp\directory. These data comprise a model of the properties in Allston and Brighton Masssachusetts in 1999, and the Building permits issued that year, as well as the locations of the zipcodes in the United States.
- albripar99.shp A shape file representing the 1999 property parcels of Allston and Brighton from the Boston Tax Assessing Office. We have some minimal metadata for this data. Click Here.
- albri_perm.dbf a DBase table representing the Building permits issued for Allston and Brighton properties in 1999. There is no metadata for this table, but we can guess what most of the attributes are.
- st_class_lut a table that has several descriptions for each of the 200 land use codes used by tax assessors in Massachusetts.
- Zip3.shp this is shape file with a polygon representing groups of zipcodes sharing a common first therr digits. This is from the ESRI Maps and Data CDs that you can find on the shared directory: L:\public\geo\esridata.
- States.shp this is a map of the states from the ESRI Data CDs.
Explore property values with sorts and queries, and transformations
What can we learn from a database of parcels?
- Look at online help for Sorting Records in Tables and find information on the 5 most pricey parcels in the database.
- Read the help on Selecting Records in Tables.
- Select the top 5 parcels and look at them on the map.
It should come as little surprise that the most expensive parcels on the map are also some of the largest. To find the most pricey parcels in the area we shpould normalize, right? Right!
Transforming Numeric Attributes
- Add a new column to your copy of the parcels attribute table and calculate its value to be a normalized figure representing price per square foot. You may use the SumofTOtalValue divided by the LOT_SIZE to get a value of the parcel per square foot.
- YOu may have a problem if any of the numbers you are trying to divide by are zero.
- In order to get around this, you can use an attribute query to select all of the rows where your area column is not zero.
- In the Select By Attributes window, this query would look would
look like:
"LOT_SIZE" <> 0.
- WIth the non zero rows selcted, your calculation will operate only on the selected rows -- every row that does not have a zero lot size.
- Now find the 5 most expensive parcels in terms of value per square foot..
- Select an expensive parcel and find it on the map, see Selecting Features.
You should remember how to add a column and calculate its value from the last exercise. If you don't, check arcGIS online help topics under Tables, Adding and Deleting Fields...ArcMap and Tables, Calculating a Field Values You will prbably want to use a Number field of type Double with a precision of 10 and a scale of 2. This will let you have 2 decimal places in a number of 10 digits. This is the last time we will provide detailed instructions about adding a new field and calculating its values.
Use a table summary to look at the national distribution of people who pay property taxes on Allston and Brighton Parcels
Some of the land in our study area is apparently owned by people who live outside of Massachusetts. In this exercise, we will use table summaries and joins to make a map of the US that shows where these people live.
Our map of three-digit zipcodes provides a link between the parcel information and a national spatial framework of zipcodes. But before we can associate our parcels with the referencing system of three-digit zip codes, we have to perform a transformation -- parsing out the first three digits of the mailing-address zipcode. It is interesting to note that although zipcodes are made up of numerals, they should not be operated on using the logic of numbers. It does not make sense to add two zipcodes. But using string operators, we can take advantage of the fact that the first three places in a zipcode denote a more general spatial hierarchy.Transforming a Character-String References
- Create a new field named zip_3dig in your parcels attribute table.
- Make it a String type field with a length of three characters.
- Calculate its values using the Left string operator. Here is a picture of this calculation.
Now that we have transformed the zipcode refernences of our parcels to match that used in our zipcode map, we can join the thwo tables together. But first we must normalize our property data to create a new table with a unique row per zipcode, as explained in Using_ArcMap.pdf Pages 337 and 338.
- Read the help on Tables, Summarizing Data in
- Make a new summary table showing the total value of the parcels for each unique value in the column "MAIL_ZIP." Be sure to save this table in your assignment2 folder, using the Specify Output Table blank at the bottom of the Summarize dialog. Of course, you will want to give this table a name that you will remember later -- like 'par_zip.dbf'
You should add this table to your map when it is finished. Note that you may not see it in your table of contents unless the Source Tab is selected at the bottom. Also, remember that when your table of contents is displaying source information, you can't rearrange your layers. To rearrange the order of layers, the table of contents should be set to Display mode.
Use a Table Join
Now comes our first table join! Add the zip3.shp shape file to your map and look at its attribute table. We want to know, for each of these zipcodes, how many parcels are owned by people in any of these zips. If we can join the information from our zipcode summary table to this table, then we will have new information on the locations of these owners.
- Open your parcel zipcode summary table and examine its attributes. Does it make sense?
- Which zipcodes have the greatest counts of parcels, which have the least?
- >Open the zip3 attribute table. Examine its columns. Read the online help topic on Tables, Joining And create a join on the zip3 table using the 'zip3' column to match with the 'zip_3dig' column from your summary table. Now see how the columnns from the parcel summary table have been joined to the zip3 table?
- The join we created is known as an Outer Join which preserves the records for zipcodes whether ot not there was a matching record in your summary table. This is why you have so many zip codes with null values for the count of parcels. Remove this join and go through the join procedure again but choose Advanced options and change them so that the resulting table will only preserve the records that match.
- Do a sort on this table that would allow you to see the zipcodes in order of the COUNT of parcels per zipcode, and scroll down to see the state with the second largest number of parcels owned. What state is it?
- To preserve the result of this join, you may right-click on your newly joined layer, and choose Data->Export
Thinking Critically about this model
We begin with a formal question about the world:
- How much property in Allston-Brighton is owned by Parties outside of Massachusetts? Are these owners concentrated in some specific place?
We then take a look at the data resources that may serve to represent the aspects of the world that we are interested in:
- Records in 1999 Tax Assessor Database represent Property Parcels and their square footage
- The Mailing ZipCode of the Owner in the same database represents The location of the owner.
- THe 3digit ZipCode map represents the Geography of zipcodes (which ones are near eachother
- Transformation of the mailing zipcode from 5 to three digits permits us to associate the parcel zipcodes with the zipcode database
- for the final analysis a map lets us evaluate the answer to the question.
Does our map answer the question? Is the map 100% accurate? Is it a 'true' representation of the condition we were asking about? In what ways is it flawed? How can we expect these flaws to affect our result? When we say that there are 2,500,000 square feet of proerty owned by entities in philadelphia and Fairfax virginia, do you think that this is an overestimate or an underestimate? How well does the tax assessor data represent the amount of land? How well does it represent the locations of owners? Do we loose or gain anything by transforming the zipcodes from 5 to three digits? Can you think of any ways that we could improve this analysis as an answer to the original question? Can you imagine aby better data? Any associative or transforming procedures that would better represent the relationships of interest?
Ideas for Individual Exploration
If you want to practice using relational procedures to answer other questions, consider doing some summaries of building permit information per parcel and joining this back to the parcel map to see if there are patterns in the amount of money being spent on building improvements.


