Analytic Techniques

Raster GIS Tutorial (Earth Shelter)

This tutorial serves several purposes. First, it demonstrates a methodology for turning questions and intentions regarding the real world into conceptual models; it shows how the concepts in a model may be represented with data and functions in a data model; and it shows how a data model may serve as a laboratory for experimentation. Most importantly, this process will end with a mentality and a method for understanding and discussing the level of confidence we have in the model, and for evaluating the utility of a model, either as a means of generating useful information about a place and alternative futures, or at least as a means of understanding something about the challenge and potential for making useful models of the data-world that help us to know useful things about the real world.


On a secondary level, this tutorial takes us through several layers of technology: an understanding of raster data structures, and of the functional components and grammar of map algebra, which transform and derive associations from raster layers, cells, zones and neighborhoods to represent spatial relationships among concepts. Finally, we will look at some ways that map algebra functions can be linked together to make elaborate chains of logical operations that may be tweaked and re-run to perform experiments for investigating alternative strategies for understanding or changing the landscape.

Tutorial Dataset


Purpose and Question

In beginning any exploration of models, it is important to have a clear statement of purpose. Without a clearly stated purpose it is pointless to try to evaluate anything.

Power in the Landscape: The Pilgrim Pueblos Project

The rising cost of fuel and the problems caused by burning fossil fuels and nuclear power have led us to an increased appreciation for renewable energy, and in particular passive solar design. After some research, we have determined that some sites are better for passive solar homes than others. We want to buy up select sites around the country to build housing developments that will have the greatest passive solar potential. We plan to create a system that will be national in scope. We will employ a Mashup similar to HousingMaps.com, which will automatically process real-estate parcels that come on the market. We will take the location of the parcel and use data of national extent to filter and flag parcels that have high earthshelter potential. Flagged parcels will be examined more closely by a trained map interpreter before being recommended for a site visit and potential offer.

We begin with a few simple criteria for establishing the potential of a site:

Conceptual Model

We seek sites having the following properties:

  • Building Sites must Have Sufficient Slopes to take advantage of earth-sheltered design and passive cooling in the summer.
  • Slopes should Be More or Less South-Facing for best solar heat gain in the winter.
  • Sites should Have Forested Areas Upwind so that trees can provide shade and slow down the winter winds.
  • Building sites that Are Directly Up-Slope from Drainage Features will not be rated so highly.
  • But building sites Having Potential Water Views will receive higher marks.
  • Building sites that have High Accessibility to Commercial Areas will be flagged for special scrutiny (for our more urbane customers).

Note that each of the terms highlighted in Bold in the conceptual model involves a term of fact, e.g. Commercial Areas, and a term of relationship, e.g. High Accessibility. Part of our task will be to find data to represent the static facts, and procedures to represent the relationships in this model. This arrangement of data and procedures will be our Data Model. Our data model becomes a laboratory for experimentation when we alter aspects of the facts and the relationships and think logically about the impacts of various decisions.

Understanding of Models, and Choice of Errors

Part of our goal is to understand how well we can build a model that will work on a national scale, using the concept of a mashup (see HousingMaps.com). This mashup would automatically find listings of real estate for sale, evaluate each of them for potential for one of our developments, and flag potential properties for further on-site evaluation. This will require somewhat consistent data that is national in scope. But before implementing this national program, we need to develop a pilot test case using the best quality local data that we can find. This calibration dataset will allow us to evaluate the sorts of error we may be facing when we try to build a model with coarser-grained national data.

Closely examining both models in the same pilot area will help us develop a level of confidence in the national model. It is important to think about two types of error. First are Errors of Omission, which are failures to identify a site in our data model as having potential when, on the ground, it actually does. We can also expect Errors of Commission, in which we identify sites in our data model as having potential, but investigation on the ground reveals that they don't, in reality, meet our criteria. As it happens, in the implementation of the model, the decisions we make will lead to more or less of each of these types of errors. Choosing which type of error you would rather have is a key element in the design of data models!

Because our model is the first, automated stage in a site evaluation process, we would rather have a model that is biased toward errors of commission. The second stage of evaluating sites is to use a trained human being to evaluate sites that are recommended by our model. If our model fails to flag sites that possibly have potential, then our analyst will not have as much work, but we may lose an opportunity to examine a site which may, in reality, have potential. The feedback from the analyst about what sites are flagged, either rightly or wrongly, will help us to fine-tune the model over time.
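
To make the two error types concrete, here is a minimal sketch that tallies them by comparing model flags against field verdicts. The function name and the six parcel results are made up for illustration; they are not part of the tutorial dataset.

```python
def error_counts(flagged, actual):
    """Tally errors of omission and commission.

    flagged[i] is True when the data model flags site i.
    actual[i] is True when a site visit confirms real potential.
    """
    # Omission: the site has potential on the ground, but the model missed it.
    omission = sum(1 for f, a in zip(flagged, actual) if a and not f)
    # Commission: the model flagged the site, but it fails on the ground.
    commission = sum(1 for f, a in zip(flagged, actual) if f and not a)
    return {"omission": omission, "commission": commission}


# Hypothetical pilot results for six parcels (illustrative values only).
flagged = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]
counts = error_counts(flagged, actual)
print(counts)
```

A model biased toward commission, as we prefer here, would show a larger commission count and a smaller omission count when its thresholds are loosened.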


At each step in this procedure we will create functions that transform input data into a scheme of logical values that are driven by our purpose. Some of our criteria will be derived by looking at associations among our transformed layers. At each step it is crucial to examine the outputs that are generated to make sure that they make logical sense based on an examination of the topographic map and aerial photograph. It is very easy to make a mistake that will generate a ridiculous result. The absurdity of the result may be obvious when we look at it directly; however, it will be very difficult to figure out that we have made a mistake once we have combined the erroneous layer with other steps of our analysis. This step-by-step evaluation of our work will also be a good time to think about which factors of data quality, or which decisions that we make, will be critical factors leading to errors of omission and commission.

Examine a Discrete Value Raster

Cartographic Modeling procedures make a lot of use of Raster Layers, which represent evenly spaced, congruently defined locations on the ground as Cells. In a single layer, each cell is tagged by a Value, which may be used to discriminate various discrete types of locations known as Zones, or to represent surfaces that vary from cell to cell in Continuous fashion. The regular relationships among cells allow for many powerful ways to create and use logical relationships among locations and their properties, as we shall see.

The New England Gap Vegetation (Gap_Veg) layer is a discrete value raster. There are several ways to evaluate the raster dataset. For one thing, it has metadata that can be found in its folder (or click here to see the metadata). But even without any metadata at all, we can learn a lot about this layer by examining its properties and its logical consistency with other layers.


MassGIS land cover layer, or the USGS National Land Cover Dataset, the USGS Hydrography Layers, or the Massachusetts Department of Environmental Protection wetlands layers, in terms of their fitness for identifying potentially scenic water features.

Examine a Continuous Surface Raster and some Surface Functions

The Digital Elevation Model (DEM) is an example of a Continuous Value raster. Each cell value is an observation of the height of that cell. Obviously, no matter how precise you want to get spatially, you could measure a different height. So the precision of this raster is a critical question to investigate. Without knowing anything about the DEM, we can guess that fluctuations of elevation smaller than the cell size are not well represented. The DEM provides, at best, a general idea of the elevation surface in our study area. Because of the regular arrangement of cells, certain things like Slope and Aspect can be calculated by looking at the other cells in the 3x3 cell Neighborhood of the cell in question. Naturally, the precision of the new slope and aspect layers is a function of the precision of the DEM. We may be able to get a sense of the utility of the slope and aspect layers by looking at the results on top of our familiar USGS Topographic map.

This demonstration will also introduce ad-hoc use of tools in the ArcGIS toolbox.



  1. Take a look at the USGS Digital Elevation Model.
  2. Use the information tool to measure some heights. Note that the values of the cells have a lot of decimal places. Looks precise, but...
  3. Determine the Cell Size.
  4. Look at the DEM on top of the USGS quad map. Zoom into a landform on the topo map and flick the DEM on and off.
  5. It appears as if the DEM is as good a representation of terrain as the topo map. Maybe better. This layer is definitely not adequate for planning the precise location of a building foundation, but it may be useful in quickly locating areas of generally steep slope and generally south-facing aspect.
  6. Find the Aspect Tool using the Search For Tools option in the geoprocessing menu.
  7. Open the Aspect dialog and fill in the blanks to create a map of slope direction.
  8. Adjust the Display properties to make it 40% transparent.
  9. Look at this layer over the USGS map and shaded relief and judge whether it makes sense.
  10. Do the same with a slope calculation.
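
The idea of deriving slope and aspect from a cell's 3x3 neighborhood can be sketched in a few lines. This is an illustration using simple central differences, not necessarily the formula the ArcGIS Slope and Aspect tools apply; the function name and the tiny DEM are assumptions made for the example.

```python
import math


def slope_aspect(dem, row, col, cell_size):
    """Return (slope_degrees, aspect_degrees) for one interior cell.

    dem is a list of rows ordered north to south; columns run west to east.
    Aspect is the compass bearing of the downhill direction (0 = north,
    90 = east, 180 = south), or None where the cell is flat.
    """
    # Central differences across the 3x3 neighborhood (a simplification;
    # GIS packages typically weight all eight neighbors).
    dz_east = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_north = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size)

    slope = math.degrees(math.atan(math.hypot(dz_east, dz_north)))
    if dz_east == 0 and dz_north == 0:
        return slope, None
    # Downhill points opposite to the gradient.
    aspect = math.degrees(math.atan2(-dz_east, -dz_north)) % 360
    return slope, aspect


# Hypothetical 3x3 DEM (meters) on 10 m cells: the ground falls 10 m per cell
# toward the south, so the center cell should face due south.
dem = [
    [100.0, 100.0, 100.0],
    [90.0, 90.0, 90.0],
    [80.0, 80.0, 80.0],
]
slope, aspect = slope_aspect(dem, 1, 1, 10.0)
print(slope, aspect)
```

Because every term is a difference between neighboring cells, any noise or stair-stepping in the DEM passes straight into the slope and aspect layers.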

Making a Model to Combine and Save Strings of Data and Operations

With the Aspect operation we carried out in the previous demonstration, we have seen how a geoprocessing tool can be used in an ad-hoc fashion. Tools take input data, have settings that we can set to control how things are processed, and produce output datasets. You could imagine how the output of one tool may be the input of another, and how the settings of individual tools might be saved so that we can re-run a chain of procedures, altering some settings, without having to set up several tools again and again. To do this we will learn how to save our Workflows as Geoprocessing Models. Like Map Documents, models can be set up with relative pathnames so that they can travel around with our data and be explored, re-run and adjusted by our collaborators.



  1. Create a new folder within the sample dataset folder to hold your tools, models and maps. Name it with your unique user name.
  2. Within your folder make a scratch folder to hold the intermediate results of your models, and create a data folder to hold new data created by your models (data that you will want to store persistently after your model has finished). Also create a Tools folder.
  3. Click Geoprocessing->Options to allow models to Overwrite the results of Geoprocessing Operations
  4. Use the Environment button on the Geoprocessing Menu to set the following:
      • General Settings: workspace and scratch workspace settings to your own scratch folder.
      • General Settings: the analysis extent to be the extent of the layer T_m_Clip.
      • Raster Analysis: the Analysis Cell Size to be 10 meters.
      • Output Coordinate System: make it Massachusetts State Plane, NAD 83 Meters.
  5. Right-click in the toolbox window to create a new toolbox for your models. Right-click on this toolbox and check the properties to see that the new toolbox was created in your user folder.
  6. Create a new Model in your toolbox, and open it for editing
  7. Drag digital elevation model raster into your new geoprocessing model.
  8. Find the slope tool and drag it into the model
  9. Double-click the box for the slope tool and fill in the blanks. Note that the icon for the DEM layer in the model is blue. Note that the output data set is being placed into the scratch workspace folder. We can give it a reasonable name.
  10. Use File->Model Properties to set the model to use Relative Pathnames
  11. Run the model
  12. Right-Click on the slope output layer and choose Add to Display to see it on the map.

Exploring a Ready-Made Model and Reclass and Map Algebra Functions

The fact that models can be made in advance and re-used is a great advantage in teaching, since I can prepare a model and demonstrate it in class without having to spend a lot of class time fiddling with a lot of settings. For the next demonstration, we will examine a model that has already been prepared and packaged with the sample dataset. The next few demonstrations will discuss parts of the Earth_Shelter model in the pbc_tool toolbox. The first phase of this model adds Reclass functions to convert the different scales of the slope and aspect maps into a common scale. Then we will use a Map Algebra statement to calculate a new layer whose cell values are calculated as a simple function of the cell values of two input maps. This combination of operations will produce a new raster that ranks each cell in the study area on a scale of -2 through +2 according to the compatibility of the terrain with our values for steep south-facing slopes.

In this demonstration we reveal a special cell value, named NoData. NoData is a cell value that is used as blank space. In some functions, like Map Algebra, the blank areas of NoData prevent analysis from happening.



  1. Find the pbc_Earthshelter toolbox in your toolbox panel or add it.
  2. Right-click on the model named 0. Sample Model and choose Edit to see and alter the contents of this model. Note that simply double-clicking on it, or choosing open, only shows you the parameters of the model.
  3. Take a look at the settings of the Reclass function for the Slope Map. Note how all possible values for slope are reclassified as 2: Preferred, 1: Good, 0: Possibly OK, and NoData: Absolutely not, representing our values with respect to choosing a site for an earth-sheltered house. Values outside of our range of desirable slope and aspect are set to NoData, which means that no matter what the values may be in our other layers, cell locations that have a value of NoData in any layer will not be considered. In this model, you can think of NoData as representing No-Build.
  4. Take a look at the reclass function for the Aspect map. Note that areas that have north-facing slope are reclassed to NoData, and other values for aspect are assigned values in the same scale.
  5. Now let's look at the Raster Calculator function that brings the Slope Values and the Aspect Values together. Think about what this expression is doing. It creates a new raster layer and calculates the value of each cell to equal the sum of the cell values of the reclassed Slope and Aspect rasters, divided by 2.
  6. Wherever one of the cells in either layer is NoData, the cell in the output raster is NoData.
  7. We want to let Aspect carry double value relative to slope, so we will change this expression to ((2 * aspect_val) + slopeval) / 3.
  8. Right-click the Single Output Map Algebra function and choose Run
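
The reclass and map algebra steps above can be sketched cell by cell. This is a minimal illustration, not the contents of the Earth_Shelter model: the break values in the two tables are hypothetical, and Python's None stands in for NoData.

```python
NODATA = None  # stand-in for the raster NoData value


def reclass(value, table):
    """Map a raw value to a score using (low, high, score) ranges.

    Values that fall in no range become NoData, as do NoData inputs.
    """
    if value is NODATA:
        return NODATA
    for low, high, score in table:
        if low <= value < high:
            return score
    return NODATA


def combine(slope_val, aspect_val):
    """Local function: aspect counts double; NoData in either input wins."""
    if slope_val is NODATA or aspect_val is NODATA:
        return NODATA
    return ((2 * aspect_val) + slope_val) / 3


# Hypothetical break values (degrees); the real model's tables may differ.
SLOPE_TABLE = [(3, 8, 0), (8, 15, 1), (15, 30, 2)]
ASPECT_TABLE = [(45, 90, 0), (90, 135, 1), (135, 225, 2), (225, 270, 1), (270, 315, 0)]

slope = [[20, 10], [20, 1]]
aspect = [[180, 100], [10, 180]]

earth_val = [
    [combine(reclass(s, SLOPE_TABLE), reclass(a, ASPECT_TABLE))
     for s, a in zip(slope_row, aspect_row)]
    for slope_row, aspect_row in zip(slope, aspect)
]
print(earth_val)
```

The north-facing cell and the nearly flat cell both come out as NoData, which is the No-Build behavior described in the steps above.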


It is important to take a break here and understand the new information we have created. It is very easy to make a little mistake in the settings of your functions that can have very important consequences in the model. If you don't check each step, you can easily fold a bad mistake into several other steps, whereupon it will be very difficult to figure out where the problem is (if you ever discover it at all). Aside from this, even if the functions operated as expected, it is important to see whether the result makes logical sense vis-a-vis other data.

  1. Open the new earth_val layer and give it a shadeset that includes a ramp of color from red to green. Make the layer 40% transparent.
  2. Overlay this layer on top of the USGS Quad Maps. Does it appear to make sense?
  3. Use the Information tool to understand the values for each cell. Do they jibe with your understanding of what should have happened in the Map Algebra expression?
  4. Examine the output of this map for logical consistency with the USGS Quad Maps.

Converting Vector Features to Raster

Now that we have found places that have landform capability to support an earth-sheltered house, we need to narrow our focus. We are interested especially in land that is forested or that is already residential. Later on, we may add more criteria, but this is a start. This will lead us to explore two more important activities of cartographic modeling: conversion of vector features to rasters, and a new way of using the map algebra function. As with many of the datasets we have, the MassGIS Land Use data set is a vector feature class, not a raster. Since many cartographic modeling functions require their inputs to be rasters, we must convert them through a process called Sampling. When doing this, we have to keep an eye on the cell size and on what we are losing when we trade polygons with vertices for cells.



  1. Turn on the Land Use layer from MassGIS.
  2. Which of the polygons on this map meet our criteria for land use?
  3. It is very important that you understand the data you are using and not make assumptions.
  4. Examine the attribute table and Read the metadata
  5. In the Earthshelter model, open the Polygon to Raster function and examine its settings, especially the Cell Size and Field.
  6. Run the Polygon to Raster function.
  7. Examine the raster result with regard to its original vector polygons. Have we lost anything important?
  8. Use the information tool to click around the new Land Use raster and observe the values for various cells. It is useful here to consider how the choice of cell size has resulted in the loss of information about the precise edge of the polygons. Do you think that the resulting error is going to be critical in terms of the overall utility of the model?
  9. Now run the reclass command to convert land use to land use value
  10. Note that land use zones that are not assigned to a specific value are converted to 0, since the existing land use of a site may enhance its desirability but does not constitute a deal-breaker in the site selection process.


Once the Sample Model has been run all the way through, we should take a look at the result to see if it makes sense and reflects our values. I will call your attention to an interesting thing that happens with raster elevation models. Use View->Bookmarks to go to Great Hill, in Acton. Notice first that the model seems to make sense with regard to the aspect and the distance from the stream. But notice the horizontal striations that are blanking out the slopes along the hill. If we examine the component layers, we can see that this apparent error of omission comes from the output of the slope function. Apparently the elevation model shows a horizontal step on the hillside that is not reflected in the USGS contour map. The fact that these striations are mostly east-west on our map arouses some suspicion about whether these may be artifacts of the process of creating elevation models. This sort of error isn't something that should cause us to doubt the entire model, but it is something we should keep in mind when thinking about whether our data are as good as they ought to be or whether we ought to look for a better elevation model.

Thinking about Relationships

Let's back up now and think formally about Modeling. We have developed a concept of sites that are propitious for earth-sheltered housing. Our conceptual model so far includes three concepts of fact:

  1. Slopes steep enough for earthshelter
  2. Aspect for southern exposure
  3. Land Use Forested or existing Residential

Our model also includes a concept of Relationship:

  1. Juxtaposition of propitious factors occurring in the Same Location

You can see that this association among facts is simulated in the Map Algebra procedure in the model. This procedure examines each location in a list of layers and produces a new layer whose cell values are each a function of juxtaposed cells in the input layers. According to Tomlin's Map Algebra notation, this sort of association of cells is a Local Function. Tomlin has several other classes of associative procedures in his Cartographic Modeling language: Incremental Functions, Focal (Neighborhood) Functions and Zonal Functions. You can check out these illustrations to see how these work.


Incremental Functions: Distance, Movement Across the Landscape

A Local relationship describes two facts corresponding to the same location. Let's think about another relationship we may want to model. For example, in our investigation of earthshelter sites, we can see that there are many areas that our model designates as propitious for earth-sheltered housing, which are actually on bluffs that are up-slope from drainage features such as streams or open water. This is a problem not only of erosion but also of the long-term stability of the site, which may be involved in a landslide. We would like our model to eliminate the worst of these and to rank other sites as lower in value if they are too near water bodies, so that we don't have to waste our time and money examining sites that are more than likely unbuildable. So we have a new concept of fact, drainage feature, and a new concept of relationship, Proximal To. There are several ways we could model this relationship. The simplest is the concept of Euclidean Distance. This function takes a source layer, which defines an area to be measured from and an area to be measured into. The source layer may be a raster or it may be vector features. In the case of a raster source layer, the cells we intend to measure From can have any value, but the cells in the area we intend to measure Into need to have a value of NoData. Incremental functions produce an output that measures, incrementally, some function of distance, or the accumulated cost of traversing the landscape to get to the nearest source cell.


In class we will demonstrate the Euclidean Distance function in the Sample Model. This application uses Euclidean Distance, with the Linear Hydrologic features as the source layer. The distance function produces an output layer whose cell values represent the distance to the nearest source cell on the source layer. This new layer then represents the relationship Near Hydrographic Features. In this sense it is like the buffer functions that are commonly used in water protection ordinances in many cities. It is easy to measure on a map (even without GIS), but does not take slope or ground cover into account. We will look at a better way of doing this later in this tutorial. In our model, we translate this distance according to our purpose for evaluating sites for earth-sheltered housing. Sites within 50 meters of a stream will be considered no-build. Sites between 50 and 100 meters will be considered. All other factors being equal, sites farther than 100 meters away from hydrographic features are preferred.
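
As a rough sketch of this step, the code below measures straight-line distance from every cell to the nearest source cell and then translates it into a site value. It is a brute-force illustration rather than the ArcGIS tool; the scores 1 and 2 for the "considered" and "preferred" bands are assumptions, and None stands in for NoData.

```python
import math


def euclidean_distance(source, cell_size):
    """Distance in map units from each cell to the nearest source cell.

    source holds any value in cells to measure From and None (NoData)
    in cells to measure Into. Assumes at least one source cell exists.
    """
    sources = [
        (r, c)
        for r, row in enumerate(source)
        for c, value in enumerate(row)
        if value is not None
    ]
    return [
        [
            cell_size * min(math.hypot(r - sr, c - sc) for sr, sc in sources)
            for c in range(len(row))
        ]
        for r, row in enumerate(source)
    ]


def hydro_value(distance_m):
    """Translate distance from water into a value (scores are hypothetical)."""
    if distance_m < 50:
        return None  # no-build
    if distance_m <= 100:
        return 1  # considered
    return 2  # preferred


# One row of 50 m cells with a stream in the westernmost cell.
stream = [[1, None, None, None, None]]
distance = euclidean_distance(stream, 50)
hy_dist_val = [[hydro_value(d) for d in row] for row in distance]
print(distance, hy_dist_val)
```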

Weighted Overlay: Putting it all Together

Note how the last step in our simple sample model uses the Single Output Map Algebra function to compute the weighted average of all four factors:
( slopeval + ( 2 * aspect_val) + (0.5 * mgis_lu_val) + hy_dist_val) / 4.5
We note here that this map algebra function is a local function. We can also see that the terms in the map algebra expression are weighted, so that Aspect is weighted double with regard to slope. The land use is given half a weight. Distance from water and slope are not weighted, which is the same as giving them a weight of one. The weighted average is computed by summing the weighted terms and dividing by the sum of the weights.
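
The same weighted average can be written as a small function for a single cell. The layer names and weights come from the expression above; the function itself and the sample cell values are illustrative, with None standing in for NoData.

```python
# Weights taken from the map algebra expression; they sum to 4.5.
WEIGHTS = {"slopeval": 1, "aspect_val": 2, "mgis_lu_val": 0.5, "hy_dist_val": 1}


def weighted_overlay(cell_values, weights=WEIGHTS):
    """Weighted average of the factor scores for one cell.

    Returns None (NoData) if any factor is NoData for this cell.
    """
    if any(cell_values[name] is None for name in weights):
        return None
    total = sum(weights[name] * cell_values[name] for name in weights)
    return total / sum(weights.values())


# Hypothetical scores for two cells.
good_cell = {"slopeval": 2, "aspect_val": 2, "mgis_lu_val": 2, "hy_dist_val": 2}
wet_cell = {"slopeval": 2, "aspect_val": 1, "mgis_lu_val": 2, "hy_dist_val": None}
print(weighted_overlay(good_cell), weighted_overlay(wet_cell))
```

Dividing by the sum of the weights keeps the result on the same scale as the inputs, so a cell that scores 2 on every factor still scores 2 overall.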



At this point, we should take a look at the results of the model to judge whether it makes sense, based on our visual inspection of the results on top of the topographic map and aerial photograph. It is useful to keep in mind that we don't expect this model to be perfect, but it would be nice if it did as good a job as we feel is possible given the data and the procedural tools that we have. In the next series of sections, we will explore a few more sophisticated ways that we might approach this problem.

Modeling Complex Incremental, Focal and Zonal Relationships

The approach we have taken so far in this tutorial has shown us the major workflow patterns involved with cartographic modeling: transforming data into representations of our own values, modeling local and incremental relationships, and some simple terrain analysis. The next section will elaborate on our model a bit more, to showcase some fancier incremental functions for taking into account areas having variable resistance. We will also look at visibility functions, which are another sort of incremental relationship. There are two more major types of relationships that are part of the cartographic modeling toolkit: Focal Functions, which consider the relationship of a cell with the facts represented in a data layer within a specified neighborhood, and Zonal Functions, which take measurements of phenomena within areas defined by the zones in a raster or the polygons in a feature class.

Since our model is going to get a bit more complicated, we will break it down into separate sub-models, which will each yield a value raster which summarizes the local pluses and minuses for each cell within our study area with special regard to the development of earth-sheltered housing. These will then be given the weighted average treatment in the very last phase.

Cost-Weighted Distance


Model Number 2 uses a distance function to look at the proximity of commercial centers. We value good access to commercial centers. To demonstrate this sort of distance function, we will consider accessibility by car, since this is a common thing to do. We will leave it as an exercise for you to figure out how to convert this to a model for pedestrian and cycling accessibility. Our model converts the road class attribute to a value of cost for crossing a cell. Because the reclass function can only yield an integer raster, our first transformation yields values of Minutes per Kilometer. Because people do not move only on roads, we reclass the value of the land between the roads from NoData to 10. This simulates the cost of traveling overland to get from the road to your destination. The cost units required by the cost distance function are expected to be expressed in terms of the distance units of the input layers or the geoprocessing environment. This is why we use the map algebra function to divide our cost units by 1000, converting them to minutes per meter. Now when we use this cost layer in the Cost Distance function, the relationship that is represented in our output for each cell will be the cost of getting to the nearest commercial area (in minutes).
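
To show how the unit conversion feeds an accumulated cost, here is a simplified sketch that spreads outward from source cells with Dijkstra's algorithm. It charges each move the step length times the average cost of the two cells involved; treat that rule, the function name, and the three-cell raster as assumptions for illustration rather than a description of the ArcGIS tool's internals.

```python
import heapq
import math


def cost_distance(cost_per_meter, sources, cell_size):
    """Least accumulated cost from each cell to the nearest source cell.

    cost_per_meter[r][c] is the cost of crossing one meter of that cell;
    None marks an impassable (NoData) cell. Unreachable cells stay at inf.
    """
    rows, cols = len(cost_per_meter), len(cost_per_meter[0])
    best = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        best[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        acc, r, c = heapq.heappop(heap)
        if acc > best[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if cost_per_meter[nr][nc] is None:
                    continue
                # Diagonal steps are longer than orthogonal ones.
                step_m = cell_size * math.hypot(dr, dc)
                move = step_m * (cost_per_meter[r][c] + cost_per_meter[nr][nc]) / 2
                if acc + move < best[nr][nc]:
                    best[nr][nc] = acc + move
                    heapq.heappush(heap, (acc + move, nr, nc))
    return best


# Integer reclass output in minutes per kilometer: road, road, overland.
minutes_per_km = [[1, 1, 10]]
# Divide by 1000 so cost units match the map's distance units (meters).
cost_per_meter = [[v / 1000 for v in row] for row in minutes_per_km]
minutes = cost_distance(cost_per_meter, [(0, 0)], cell_size=10)
print(minutes)
```

Skipping the division by 1000 would overstate every travel time by a factor of 1000, which is exactly the kind of mistake that is easy to catch by inspecting the output layer.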

Model 2a makes this accessibility model more realistic, by stipulating that water cells are not so easily crossed as land cells. This model introduces a couple of important functions that are often necessary when transforming raster datasets:

  • Merge Function: This effectively overlays a list of rasters, and allows rasters listed first to supersede rasters listed subsequently. In the merge function, cells having a value of NoData are treated as transparent, according to this overlay analogy.
  • Conditional Functions: These allow you to create a new raster whose cell values are defined conditionally based on an evaluation of an expression.

The merge function lets us overlay the road values over the stream values, and the conditional function SetNull allows us to set the wet cells to a value of NoData, which will effectively dictate that the only way to cross a hydrographic feature is where a bridge already exists.
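
The overlay logic of these two functions can be sketched on small grids. The helper names below mirror the tool names, but they are plain Python written for this illustration; the order of operations and the cost values are assumptions, and None stands in for NoData.

```python
def merge(*rasters):
    """Overlay rasters; the first non-NoData value at each cell wins."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            value = None
            for raster in rasters:
                if raster[r][c] is not None:
                    value = raster[r][c]
                    break
            out_row.append(value)
        out.append(out_row)
    return out


def set_null(condition, raster):
    """Return raster with NoData wherever condition is True."""
    return [
        [None if cond else value for cond, value in zip(cond_row, value_row)]
        for cond_row, value_row in zip(condition, raster)
    ]


# A road runs along the top row and bridges a stream in the middle column.
roads = [[1, 1, 1], [None, None, None]]
wet = [[False, True, False], [False, True, False]]
overland = [[10, 10, 10], [10, 10, 10]]

dry_land = set_null(wet, overland)  # water becomes impassable
cost = merge(roads, dry_land)       # roads supersede everything beneath them
print(cost)
```

In the result, the stream cell under the road keeps the road's cost, while the stream cell away from the road stays NoData and so cannot be crossed.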

Incremental Relationships where Slope is a Factor

Model 3 provides a better way of modeling the relationship of water features to potential building sites. This model uses the Path Distance tool, which will yield the distance across the landscape (considering that travel on a slope accumulates more distance than travel on a level surface). Path Distance will also consider that some types of land cost more to cross than others, due to vegetation or roads, for example. Finally, we can adjust the Vertical Factor Properties to restrict the measurement so that it will only travel up-slope, which, where runoff is concerned, is the sort of relationship we want to measure.

These distance measuring tools provide a means of representing a relationship of proximity or accessibility. For example, now that we have a means of measuring the relationship between a water body and potential sites upslope, we can use this distance layer in our evaluation of sites.

Note that this model does a couple of new things. First, we transform land use to runoff resistance using a relational join to a lookup table. Before we do this, we make a new copy of the land use layer. Adding the join to a copy of a layer is a good idea, since we may want to run this model more than once, and adding the join to the new layer guarantees that the model won't encounter the situation where the join already exists, which would cause the model to fail with an error.

Another feature of this model is that we have pulled the cell-size environment variable out of the path distance procedure. Because this process is fairly time-consuming, enlarging the cell size from 10 to 30 meters reduces the number of cells that need to be processed by a factor of 9. Once we have this model working the way we want it, we can poke this cell size back down to 10 and run the model when we are at home in bed.


Focal Functions: What is in My Neighborhood?

Sometimes a quality of a place is a function of qualities of the areas surrounding it. For example, a good location for a farmhouse may be related to the amount of productive land within a circular area of 500 meters in radius around the location in question. Focal functions (also known as neighborhood analysis) let us define the geometry of a neighborhood, and then will summarize the values on a data layer (e.g. productive land) within that neighborhood, as it is centered on and evaluated for every cell on the map. The value of each cell in the output grid is the statistical summary (sum, average, variety...) of the values within that cell's neighborhood.
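
A focal sum over a circular neighborhood can be sketched as follows. The radius is given in cells rather than meters, cells beyond the raster edge are simply ignored, and the 3x3 "productive land" raster is made up for the example.

```python
import math


def focal_sum(raster, radius_cells):
    """Sum the values within a circular neighborhood around every cell.

    NoData (None) cells and cells beyond the raster edge are skipped.
    """
    rows, cols = len(raster), len(raster[0])
    # Offsets of every cell whose center lies within the radius.
    offsets = [
        (dr, dc)
        for dr in range(-radius_cells, radius_cells + 1)
        for dc in range(-radius_cells, radius_cells + 1)
        if math.hypot(dr, dc) <= radius_cells
    ]
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total = 0
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and raster[nr][nc] is not None:
                    total += raster[nr][nc]
            out_row.append(total)
        out.append(out_row)
    return out


# Hypothetical raster: 1 marks productive land.
productive = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = focal_sum(productive, 1)
print(result)
```

With a radius of one cell the neighborhood is a plus shape of five cells, so the center scores 5 while edge and corner cells score less because part of their neighborhood falls off the map.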


Measuring the amount of trees upwind

Our simple model favors sites that are located in the forest. Part of this idea is that the trees will slow down the winter wind, which comes predominantly from the northwest. In our more complex model we want to be more precise about the relationship between our site and trees. It is not necessary to be IN the forest to get the value of trees as a windbreak. The critical relationship is that a site should have Trees Upwind in the wintertime. Our Trees Upwind model creates a wedge-shaped analysis neighborhood, which can be used to represent the Upwind relationship for each cell. This neighborhood can be moved over a forest raster to count the number of forested cells upwind.
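
A wedge-shaped neighborhood is just a circular neighborhood filtered by compass bearing. The sketch below builds a wedge centered on northwest (315 degrees) and counts forested cells inside it; the radius, the 22.5-degree half-width, and the small forest raster are hypothetical choices for illustration.

```python
import math


def wedge_offsets(radius_cells, center_bearing, half_width):
    """Cell offsets (d_row, d_col) inside a wedge pointing at center_bearing.

    Rows increase southward, so north is -d_row and east is +d_col.
    """
    offsets = []
    for dr in range(-radius_cells, radius_cells + 1):
        for dc in range(-radius_cells, radius_cells + 1):
            if (dr == 0 and dc == 0) or math.hypot(dr, dc) > radius_cells:
                continue
            bearing = math.degrees(math.atan2(dc, -dr)) % 360
            # Smallest angular difference between the two bearings.
            diff = abs((bearing - center_bearing + 180) % 360 - 180)
            if diff <= half_width:
                offsets.append((dr, dc))
    return offsets


def count_upwind(forest, offsets):
    """Count forested cells (value 1) inside the wedge around every cell."""
    rows, cols = len(forest), len(forest[0])
    return [
        [
            sum(
                1
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
                and forest[r + dr][c + dc] == 1
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


northwest = wedge_offsets(radius_cells=3, center_bearing=315, half_width=22.5)

# Hypothetical forest raster: trees in the northwest corner and far southeast.
forest = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
upwind = count_upwind(forest, northwest)
print(upwind[2][2], upwind[4][4])
```

The center cell sees both northwest trees, while the forested cell in the southeast corner gets no credit, since it has no trees upwind of it.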

Potential Views of Water

One of the more interesting incremental functions is Viewshed which uses a layer defining the source (which is known as the observer locations, though may just as easily be thought of as the observed!) An elevation raster establishes the barriers to visibility. Like the distance functions, the Viewshed function attempts to spread from the observer points, across the elevation model, until it encounters a barrier. In our model of Water Views, we begin by establishing observer points in the water. For this, the Random Points function is very handy. Then we simply calculate the viewshed from these points. Note that although the output of the viewshed tool is portrayed with only two values: Visible and Invisible, if you change the symology of this raster, and use the get-info tool, you can see that the value of each cell varies -- it is a measure of the number of observers tha can see a given cell. Note that in this model, we set the raster analysis environment of the viewshed tool so that it uses a cellsize of 50 meters. Using the default cell size fo rthis project (10) meters, makes this procedure take 25 times longer. After we get comfortable with the way this is working, we may adjust this cell size down. This will be particularly important when we add buildings to the model.

Our second version of this model adds buildings to the terrain model and adds offset values to the observer points, so that the cells on land (where we expect our earthshelter residents to be looking from) are considered to be offset 2 meters, simulating the height of a person.


Parsing Ridges, Hilltops, Valleys, and Hillsides

Take a look at the feng shui model in my latest sample earthshelter dataset. It uses two focal mean operations to find ridges, valleys and mid-slope areas. You need to tune it to find the sorts of areas you are looking for. It is tuned by playing with the radii of the larger and smaller focal mean operations. (Actually they should be 50 and 100 to start -- there is an error in the model in the on-line zip file.) This model basically makes two different smoothed elevation models and subtracts them to create a difference raster where the wrinkles are (+) and the creases are (-). You also tune it in the final reclass of the difference raster. In our case, you would want to give valley areas and hilltops a lower score and valley walls something else, etc. Actually, I have left this factor out of our model, since I think that our slope criteria serve well enough, but I have left the sample in the toolbox, just because it is a useful demonstration of Focal Statistics.
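The two-focal-means trick can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the toolbox model: it uses square windows measured in cells instead of circular ones, and the function names, the radii and the reclass threshold are all made up for the example.

```python
def focal_mean(grid, radius):
    """Mean of the square window of `radius` cells around each cell,
    clipped at the edges of the raster."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

def relief_difference(elev, small_radius, large_radius):
    """Lightly smoothed surface minus heavily smoothed surface:
    positive on wrinkles, negative in creases."""
    small = focal_mean(elev, small_radius)
    large = focal_mean(elev, large_radius)
    return [[s - l for s, l in zip(s_row, l_row)]
            for s_row, l_row in zip(small, large)]

def classify(diff, threshold):
    """A simple three-way reclass of the difference raster."""
    def label(value):
        if value > threshold:
            return "wrinkle"
        if value < -threshold:
            return "crease"
        return "mid-slope"
    return [[label(v) for v in row] for row in diff]

# One row of terrain with a single 10-meter ridge in the middle.
profile = [[0, 0, 0, 10, 0, 0, 0]]
classes = classify(relief_difference(profile, 1, 3), 1.0)
```

The ridge cell comes out as a wrinkle, and the low cells at either end, which sit below their broadly smoothed surroundings, come out as creases. Changing either radius or the threshold shifts these boundaries, which is exactly the tuning described above.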


Summarizing Earthshelter Potential

The Site Score model simply takes a weighted average of the scores for each of the concepts in our model. As you can see, there are a couple of fairly large potential sites. The spread of values is not as great as it could be. In order to get a broader differentiation, we could start to fine-tune our model to make sure that there are more cells that are awarded 2 or -2 in the component models. We could also try different weights on the various components in our weighted average function. But for now, we will just proceed.
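As a rough sketch of what a weighted average of score layers amounts to, consider the following plain Python. The layer names and the weights are hypothetical; the only thing taken from our model is that each component layer holds scores between -2 and 2.

```python
def site_score(layers, weights):
    """Cell-by-cell weighted average of several score rasters.
    `layers` maps a layer name to a grid; `weights` maps the same names to weights."""
    names = list(weights)
    total_weight = sum(weights.values())
    first = layers[names[0]]
    return [[sum(weights[n] * layers[n][r][c] for n in names) / total_weight
             for c in range(len(first[0]))]
            for r in range(len(first))]

# Two hypothetical component layers for a raster of one row and two cells.
layers = {
    "slope":        [[2, -2]],
    "trees_upwind": [[-1, 1]],
}
weights = {"slope": 2, "trees_upwind": 1}  # slope counts twice as much
scores = site_score(layers, weights)
```

Here the first cell scores (2 * 2 + 1 * -1) / 3 = 1.0 and the second scores -1.0. Because the result is an average, it can only reach 2 or -2 where every component agrees, which is one reason the spread of values in the combined layer tends to be narrow.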

Zonal Functions

So far, we have looked at lots of different sorts of relationships -- those based on the cell location, others focused on cell neighborhood (focal functions), and relationships based on proximity or accessibility (incremental functions). The last sort of function we will look at in this model is based on the coverage and geometry of zones. If you look at the attribute table for our score layer, you will see that all the cells are divided into just three zones. We can look at the Count field to get a sense of how many cells in our study area have been assigned to each category. This is interesting, and a good reminder of what zones are in the raster context. One thing that we need to consider is that we require a fairly large site for our pilgrim pueblo developments. We are looking to site at least 50 houses. And so it would be helpful to be able to evaluate each clump of contiguous cells as a potential site.

The Large Sites model uses the RegionGroup function to create a new zone for each clump of contiguous cells in our site-score layer. The model then reclasses this layer based on the cell count, to make a mask that will weed out potential sites that do not have at least 50 cells.
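The clump-and-filter logic can be sketched in plain Python like this. It is an illustration, not the tool itself: the function names are hypothetical, and treating only the four edge-sharing neighbors as contiguous is an assumption of this sketch.

```python
from collections import deque

def region_group(grid):
    """Give every clump of contiguous, equal-valued cells its own label.
    Returns the label grid and a dict of cell counts per label."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    counts = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already part of an earlier clump
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            size = 0
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and grid[nr][nc] == grid[r][c]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            counts[next_label] = size
    return labels, counts

def large_sites(grid, target_value, min_cells):
    """Mask of cells that hold `target_value` and belong to a clump
    of at least `min_cells` cells."""
    labels, counts = region_group(grid)
    return [[grid[r][c] == target_value and counts[labels[r][c]] >= min_cells
             for c in range(len(grid[0]))]
            for r in range(len(grid))]

# A tiny score raster where 1 marks a high-scoring cell.
score = [[1, 1, 0],
         [0, 1, 0],
         [0, 0, 1]]
mask = large_sites(score, target_value=1, min_cells=3)
```

With a threshold of 3 cells, the three connected high-scoring cells survive and the isolated one in the corner is weeded out. In the real model the threshold would be the 50 cells mentioned above.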

Now, let's pretend that some parcels have been reported to be on the market, and we want to evaluate them based on how many high-quality cells for earth-sheltered housing each one contains. If you look at the Parcel-Score model, you will see that this uses the parcels as zones, and calculates summary statistics for each parcel, based on the values of the cells in our large_sites raster. The output of this function is a table, which can be joined back to the parcel table to give every parcel several statistics that will help us to evaluate whether it merits further study.
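A zonal summary of this kind can be sketched in plain Python as follows. The function name, the parcel ids and the choice of statistics are assumptions for the example; the idea taken from the Parcel-Score model is simply that one raster supplies the zones and another supplies the values to be summarized.

```python
def zonal_stats(zones, values):
    """Summarize `values` within each zone id found in `zones`.
    Returns one row of statistics per zone, keyed by zone id."""
    table = {}
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            if zone is None:  # None marks cells outside every parcel
                continue
            row = table.setdefault(
                zone, {"count": 0, "sum": 0, "min": value, "max": value})
            row["count"] += 1
            row["sum"] += value
            row["min"] = min(row["min"], value)
            row["max"] = max(row["max"], value)
    for row in table.values():
        row["mean"] = row["sum"] / row["count"]
    return table

# Two hypothetical parcels (ids 1 and 2) over a large-sites raster of 0s and 1s.
parcels = [[1, 1, 2],
           [1, 2, 2]]
large = [[1, 0, 1],
         [1, 1, 1]]
parcel_table = zonal_stats(parcels, large)
```

For a raster of 0s and 1s, the sum for each parcel is its number of qualifying cells and the mean is the fraction of the parcel that qualifies. Here parcel 1 has 2 of 3 cells qualifying and parcel 2 has all 3, so parcel 2 would be the stronger candidate for a closer look.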


Model Evaluation

Is our model a good one or a bad one? By now, you know that this is the wrong question. We should ask instead:

  • Is our model useful for our purposes? We will get to this...
  • What are the weakest inputs and procedures in the model?
  • In what way would we expect the model to be biased -- toward errors of Omission or errors of Commission?
  • Are there important considerations that are ignored by our model?
  • How could the model be improved?

To me, this model seems to do at least as good a job of finding places that are likely to be good for building earthsheltered houses as I could by studying the USGS quad map -- and much faster -- particularly since our plan is to scan and evaluate all of the real-estate offerings in the United States. In terms of our pilot study area, I'm sure that the method has identified many areas as potentially good that are actually not good. The model has also probably missed some spots that would be great for an earthsheltered house. The weak point here is the coarseness of the terrain model (10 meters), a problem that is compounded in the slope and aspect maps, which can only be seen as 90 meter averages at best. Because of this coarseness, the model undoubtedly misses a lot of berms and wrinkles that might make a fine backdrop for a house. Better terrain data would no doubt make this model better at finding smaller nooks for a house (and perhaps smaller is better?). But no matter how fine the terrain model, it would be difficult to model all of the site conditions, especially the ones that can potentially be created. One can always regrade a site to an extent.

Of course I would actually spend a lot of time walking around in the study area before I decided on a house site. I think that this analysis is a useful aid in narrowing down the areas that I would visit. By finding many of the more obvious spots in a fairly easy way, it gives me more time to study the finer effects that may make a place more or less fit -- and this activity would probably give me more insight to fine-tune the model or add new data (soil type, access to roads) that would be useful. This knowledge may also help us to learn a general method or rating criteria for establishing the thermal propensity of a site.

Dimensions of Experimentation