Thematic Mapping with Nominal Class Data
Our purpose in mapping and transforming geographic datasets is often to discover and represent useful distinctions about places. Most of the time, we will be using data that were collected for purposes different from ours, and the referencing systems used as attributes for each feature will incorporate very fine categorical distinctions, most of which are irrelevant to our purpose. This tutorial examines some techniques for exploring, transforming, and portraying data with categorical referencing systems.
This tutorial is also the first in a series about modeling with spatial data. One of the keys to successful modeling is to have a clear purpose in mind. There is really no point in dealing with data and mapping if you aren't clear about the critical distinctions you need to represent. This idea is discussed on the page Modeling What's Important. To make this tutorial interesting, we will state our purpose:
Modeling Concepts and our Purpose
We are interested in the relationship between natural areas in the city and nearby high-density residential areas. Our ultimate purpose is to develop a model that might help us find good places in the Boston area to invest money to improve this relationship. You can think of this model as a logical construction with two Nouns, natural areas and dense residential areas, and some sort of Verb or Conjunction representing the relationship, or lack of relationship, between these two types of places. We will call this logical construction our Conceptual Model; the nouns and the verbs are our Concepts.
Deeper Reading
Download the sample dataset
Evaluating, Portraying and Discussing Critical Aspects of Data
A key part of this exercise is to investigate several datasets that may serve to represent the concepts from our model. The modeling process usually begins with many choices about how to represent the concepts in our model. We have many different datasets that may serve better or worse according to several criteria. In the end, we may decide to use one dataset or another, or, more likely, to pull some information from one dataset and other information from another. In making these decisions, we are deciding to create or throw away information, and to emphasize information from one source and diminish other information. It is important that we are able to do this and to explain it, lest we be vulnerable to the old cliche, Garbage In, Garbage Out, which effectively means that if you can't explain what you put into the model, there should be no confidence in the result.
The page Modeling What's Important provides a rationale and a checklist for evaluating datasets with respect to their utility for representing specific concepts. An important aim of checking various criteria from a dataset's metadata, and of making inferences about the apparent relationship of one dataset to another, is to understand the inevitable biases in a dataset as they relate to the purpose we have for it. One very good exercise is to consider the cases where using a dataset to represent your concept will result in Errors of Omission and/or Errors of Commission. We will take some care in this demonstration to discuss these things, as you should when developing your own model.
Our first investigation will be relatively simple. We will begin with a layer of polygons representing Protected Open Space from MassGIS. This will give us an opportunity to examine the various referencing systems used in the attributes of this table, and to make portrayal layers that are tuned to bring out the particular distinctions that are important for representing "Natural Areas". This will give us some practice exploring tables, making categorical portrayals, and selecting features by their attributes.
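One way to make the two kinds of error concrete is to compare the places a dataset marks as natural against a list you trust, such as sites checked against the aerial photo. The sketch below is only an illustration: the site names are invented, and the function name `omission_and_commission` is a hypothetical helper, not part of any GIS package.

```python
def omission_and_commission(layer_sites, reference_sites):
    """Compare sites a layer marks as natural with a trusted reference list."""
    layer, reference = set(layer_sites), set(reference_sites)
    omitted = sorted(reference - layer)    # natural areas the layer misses
    committed = sorted(layer - reference)  # areas the layer wrongly marks as natural
    return omitted, committed


# Hypothetical site names, invented for illustration only.
layer_sites = ["Riverside Woods", "Paved Schoolyard", "Hilltop Meadow"]
field_checked = ["Riverside Woods", "Hilltop Meadow", "Unprotected Marsh"]

omitted, committed = omission_and_commission(layer_sites, field_checked)
print("Errors of omission:", omitted)
print("Errors of commission:", committed)
```

In practice the comparison is usually visual and spatial rather than a match on names, but the two set differences capture the logic of each error type.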
References
- Drawing Features to Show Categories
- Selecting Records from a Table
- Selecting Features by Attributes
- About Building an SQL Expression
- Displaying a Subset of Features in a Layer (also known as setting the definition query for a layer)
- Joining Tables
- Changing a Layer's Text and Feature Descriptions
- Open the ArcMap document in the sample dataset and appreciate the contextual framework of roads and physical features provided. No map should be made that does not have such a framework!
- Add the MassGIS Protected Open Space dataset as a layer.
- Examine the metadata for this layer (the osp.htm file in the massgis folder). Evaluate its Purpose and Authority and its Time Period of Content. Examine the Logical Consistency of this layer with the City of Somerville official parks layer and with the Aerial Photo. What do you think of the spatial precision of this layer?
- Now open the attribute table and look at the attribute definitions in the metadata. How do you rate the appropriateness and precision of the categorical referencing systems used as they relate (or not) to the distinctions that are important in our conceptual model?
- Try to imagine some cases where this layer would err from our concept of Natural Areas through Errors of Omission, or incompleteness. Where would this layer create Errors of Commission, the inappropriate indication of natural areas where there are, in fact, none?
- Just as an example, we may decide upon looking at this layer that it holds two important distinctions that will serve our model. First, there are many types of open space that we would not consider to be Natural Areas for our purposes.
- Second, among the open spaces that may serve a purpose associated with Natural Areas, some are publicly accessible and some aren't. In our model we may prefer to think of these types of areas as Public and Ambient natural areas.
- Use the definition query property of the parks layer to eliminate protected open spaces whose Purpose is inappropriate.
- Adjust the layer symbology properties of this layer to highlight important distinctions in terms of public access.
- Practice combining legend classes
- Take a look at the technique of making new classes by mapping combinations of attributes.
- Adjust all of the layer name, heading, and symbol description information in this layer to reflect its Source, its Theme, and the categorical distinctions you are using. It is very important to adjust this legend information to help your colleagues, collaborators, and critics understand what this layer is. Never, ever, simply leave these headings and labels as they are, with filenames and inscrutable attribute codes, unless you identify all of these in the caption for the map.
- Try to imagine the paragraph or two you would associate with this map to explain your evaluation of how this dataset serves your purposes. You should probably describe what the data were intended to represent, and how you have decided to transform them to represent concepts for your model.
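The logic behind the definition query and the combined legend classes in the steps above can be sketched outside ArcMap. This is a minimal illustration under stated assumptions: the field names `PRIM_PURP` and `PUB_ACCESS`, the one-letter codes, and the site rows are placeholders made up for the example, so take the real fields and codes from the layer's metadata. The helper names `definition_query` and `legend_class` are hypothetical.

```python
# Hypothetical attribute rows. Field names and codes are placeholders;
# the real ones are defined in the layer's metadata.
open_spaces = [
    {"SITE": "Site A", "PRIM_PURP": "C", "PUB_ACCESS": "Y"},
    {"SITE": "Site B", "PRIM_PURP": "R", "PUB_ACCESS": "Y"},
    {"SITE": "Site C", "PRIM_PURP": "C", "PUB_ACCESS": "N"},
    {"SITE": "Site D", "PRIM_PURP": "B", "PUB_ACCESS": "L"},
]

# Assumed purpose codes that we accept as "Natural Areas" for our model.
NATURAL_PURPOSES = {"C", "B"}

# Several access codes are combined into two legend classes (assumed codes).
ACCESS_TO_LEGEND = {
    "Y": "Public natural area",
    "L": "Public natural area",
    "N": "Ambient natural area",
}


def definition_query(rows, allowed_purposes):
    """Keep only rows whose purpose code is in the allowed set."""
    return [row for row in rows if row["PRIM_PURP"] in allowed_purposes]


def legend_class(row):
    """Map an access code to one of our combined legend classes."""
    return ACCESS_TO_LEGEND.get(row["PUB_ACCESS"], "Access unknown")


natural = definition_query(open_spaces, NATURAL_PURPOSES)
for row in natural:
    print(row["SITE"], "->", legend_class(row))
```

Writing the class names out in full, rather than leaving raw codes, is the same habit the legend step asks for: the labels should say what the category means for our model.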
Transformation of Categorical Data with Lookup and Crosswalk Tables
There are two technical parts to this tutorial. The first involves a technique for dealing with datasets that use categorical referencing systems much different from the sort that our model requires. This will usually be the case: data collection is a very costly enterprise, so the classification systems used, intended to be versatile, can be more fine-grained than we might like. Our first experience with this will be with a layer of Protected Open Space from MassGIS. First we will look at a simple technique of reclassifying data.
In this next example, we will look at a dataset whose category schemes are more difficult to deal with by simply grouping them in the symbology editor. We will discover that the problem of reclassifying data according to attributes can be handled using a new table that holds, for each land-use type code, a corresponding description. This is useful first of all for helping to add meaning to the reference codes used in the table, but we will also see how such a Lookup Table can help us reclassify a given referencing system to more closely match the concepts in our own model.
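As a rough sketch of what a lookup table join does, the example below reads a small code-to-description table from CSV text and attaches the description to each feature by its code. The column names `LU_CODE` and `DESCRIPTION`, the three sample rows, and the polygon records are assumptions made for illustration; they are not copied from the MassGIS files.

```python
import csv
import io

# Stand-in for a lookup file; column names and rows are illustrative only.
LOOKUP_CSV = """LU_CODE,DESCRIPTION
3,Forest
11,High density residential
17,Urban open
"""


def load_lookup(csv_text):
    """Build a code -> description dictionary from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["LU_CODE"]): row["DESCRIPTION"] for row in reader}


def join_descriptions(features, lookup):
    """Attach a description to each feature, flagging codes with no match."""
    return [
        dict(feature, DESCRIPTION=lookup.get(feature["LU_CODE"], "(no match)"))
        for feature in features
    ]


# Hypothetical polygons carrying only a numeric land use code.
polygons = [
    {"ID": 1, "LU_CODE": 3},
    {"ID": 2, "LU_CODE": 11},
    {"ID": 3, "LU_CODE": 99},
]

lookup = load_lookup(LOOKUP_CSV)
joined = join_descriptions(polygons, lookup)
for feature in joined:
    print(feature["ID"], feature["LU_CODE"], feature["DESCRIPTION"])
```

Flagging unmatched codes instead of dropping them makes gaps in the lookup table visible, which is worth checking before trusting a thematic map built on the join.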
References
- An Overview of Tables
- Adding Tables in ArcGIS
- Common Tables and Attribute Tasks
- Creating New Tables (see especially "From Text Files")
- About Joining Tables
- About Working with Excel Tables in ArcGIS
Play with Lookup Tables
- Open the land use layer from MassGIS
- Take a look at its metadata (the lus.htm file in the massgis folder)
- Consider the pros and cons of using the legend editor
- Take a look at the lu37.csv text file and imagine how this was made using the metadata, cutting and pasting into WordPad.
- Open this lookup table in ArcMap
- Join the new lookup table with the land use layer.
- Now make a thematic map with the MassGIS 37-class land use categories
- Now let's add a column to this table to "crosswalk" their system to our concepts of public versus ambient open space.
- Before we can add a column to this lookup table we need to export it to a DBF table.
- Now we will add a new column to hold descriptions for natural classes, select the classes that represent natural patches, and provide a name for this class using the lookup table.
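The crosswalk idea in the last steps can be sketched as adding one more column to the lookup table and then reading the new class through the same join. Everything specific here is an assumption for illustration: the column names `LU_CODE` and `NAT_CLASS`, the sample rows, the choice of which codes count as natural, and the class name "Natural patch".

```python
from collections import Counter

# Illustrative lookup rows; the real table would come from the exported DBF.
lookup_rows = [
    {"LU_CODE": 3, "DESCRIPTION": "Forest"},
    {"LU_CODE": 4, "DESCRIPTION": "Wetland"},
    {"LU_CODE": 11, "DESCRIPTION": "High density residential"},
    {"LU_CODE": 15, "DESCRIPTION": "Commercial"},
]

# Assumed selection of codes that represent natural patches in our model.
NATURAL_CODES = {3, 4}


def add_crosswalk_column(rows, selected_codes, class_name):
    """Return copies of the rows with a NAT_CLASS column filled for selected codes."""
    return [
        dict(row, NAT_CLASS=class_name if row["LU_CODE"] in selected_codes else "")
        for row in rows
    ]


def reclassify(features, crosswalk_rows):
    """Give each feature the NAT_CLASS of its land use code."""
    by_code = {row["LU_CODE"]: row["NAT_CLASS"] for row in crosswalk_rows}
    return [dict(f, NAT_CLASS=by_code.get(f["LU_CODE"], "")) for f in features]


crosswalk = add_crosswalk_column(lookup_rows, NATURAL_CODES, "Natural patch")

# Hypothetical polygons carrying only the original land use code.
polygons = [{"ID": 1, "LU_CODE": 3}, {"ID": 2, "LU_CODE": 15}, {"ID": 3, "LU_CODE": 4}]
reclassified = reclassify(polygons, crosswalk)

counts = Counter(f["NAT_CLASS"] or "Not natural" for f in reclassified)
print(dict(counts))
```

Keeping the crosswalk in the lookup table, rather than editing the land use layer itself, leaves the original classification intact and makes the reclassification easy to explain or revise.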
Food for Thought
This is only the beginning. Of course, now you are faced with comparing these two ways of representing natural areas in the city with specific regard to our purpose. You may decide to use one or the other, or you may decide to combine features from each of these layers to make a hybrid representation. For now we will simply combine them graphically on our map. Most importantly of all, we should consider whether the data we have will serve well enough to represent our concepts and what the biases will be if we decide to use one set of representations or another. We might also start to think about where we might look for data sources that are closer to ideal.
