GSD6322 Fundamentals of GIS

Business

Thoughts about Last Assignment:

  • Don't worry too much about the numbers.
  • Keep the contextual framework simple, but answer the predictable questions about local and regional context described in Elements of Cartographic Style.
  • Be sure to check your CDs with the method covered in the first lab.

Thoughts about next assignment

  • Be sure to check the notes on evaluating data in the GIS Manual page about Models.
  • Be sure to observe the thematic mapping tips in the Elements of Cartographic Style and labs on thematic mapping with categorical and numeric data.
  • Be sure to check your CDs with the method covered in the first lab.
  • Be sure to include a contextual framework on your maps.
  • It may be helpful to look at regional patterns AND one or two detailed areas.

Segue from Last Week

Last week we didn't get time to look at the more elaborate vector relational applications. We will look at them at the beginning of class today. These represent some of the particular characteristics of vector-relational data models that work with geometrically distinct entities. This will serve as our segue to raster GIS functions, which can deal with continuous, gradual, or finely incremental sorts of phenomena and associations.

Raster GIS Fundamentals

Now we will examine another sort of information system. Recall that our problem, expressed in its most formal aspect, is to create meaning from information. To this end, we are studying information systems. We should review our formal framework for modeling to see that information systems can be described and differentiated based on their formal capacities to:

  • Provide structures for representing entities or phenomena (referencing systems).
  • Provide procedures for associating representations with each other.
  • Provide procedures for transforming representations.
  • Provide facilities for combining the structures and procedures and visualizing information.


We have seen that relational database management systems and vector GIS are closely related -- vector GIS is basically built from the principles of relational databases, extending their storage, associative and query capabilities into two dimensions. Raster GIS is very different from relational and vector GIS systems in each of these capacities. Relational tools are useful when dealing with rasters and their value tables, but the basic representations, associations, and transformations that we do with rasters are fundamentally different. Raster GIS provides us with new structures for representing the properties of locations, including how locations are associated with one another, based on their properties and spatial relationships. Raster data models include a unique vocabulary of operations for transforming and associating these references. For an overview of Raster GIS in the context of the ArcGIS Spatial Analyst, see the ArcGIS Spatial Analyst User Guide.

Readings:

A Bit of History

The earliest GIS architectures, implemented by Roger Tomlinson for the Canada Land Inventory in the mid '60s, emulated traditional map drafting. Entities were represented by points and lines that could be drawn with an automated drafting machine (aka pen plotter). An outline history of GIS can be found at National Center for Geographic Information and Analysis History of GIS and Charting the Unknown: How Computer Mapping at Harvard Became GIS.

At the Harvard Lab for Computer Graphics and Spatial Analysis (at the Harvard Graduate School of Design) in the late sixties and early seventies, Carl Steinitz and many others were experimenting with ways to use digital geographic data to emulate the cartographic overlay techniques portrayed by Ian McHarg in his book Design with Nature. McHarg and Steinitz collaborated in the first digitally augmented landscape planning studio at the GSD in 1967.

The students and researchers at Harvard were using a computer language called FORTRAN, IBM's highly customizable information tool, to represent landscapes and things that may happen on them. In FORTRAN, one of the fundamental forms of representation is a matrix -- an array of values. Naturally, someone began to experiment with this structure to represent the character of locations in 2-D space, and raster GIS was born.

Consider a simple overlay operation: find areas that are covered with forest and also above a certain elevation. First try to imagine how you would derive this geometry if your forest cover and elevation maps were stored as lists of vertex coordinates, as they are in a vector GIS system. Now consider how you could calculate this if your two maps were stored as arrays of evenly spaced values: you simply compare the two values stored at each position. It turns out that this array structure is VERY efficient for representing the attributes of places and how they may interact with each other.
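The array version of this overlay can be sketched in a few lines of plain Python. This is only an illustration, not how any particular GIS implements it: the cell values, the 500-unit threshold, and the function name high_forest are all made up for the example, and the two layers are assumed to cover the same area with the same cell size.

```python
# Two co-registered layers stored as arrays of evenly spaced values.
# All values below are invented for illustration.
forest = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
elevation = [
    [120, 340, 560, 610],
    [90, 410, 530, 480],
    [60, 150, 505, 700],
]
THRESHOLD = 500  # hypothetical elevation cutoff


def high_forest(forest, elevation, threshold):
    """Return 1 where a cell is forested AND above the threshold, else 0."""
    return [
        [1 if f == 1 and e > threshold else 0 for f, e in zip(frow, erow)]
        for frow, erow in zip(forest, elevation)
    ]


result = high_forest(forest, elevation, THRESHOLD)
for row in result:
    print(row)
```

Because cells at the same row and column describe the same location, the overlay needs no geometry at all: one comparison per cell is enough.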

One of the researchers at the Harvard Lab was a student named C. Dana Tomlin. Tomlin found a good thesis topic when he figured out that the many different things that you could do with these arrays can be chained together into systems of equations that operate like algebra. At this point, he seized the opportunity to lay out a taxonomy of these operations and a notation for designing and sharing models made up of Map Algebra equations.

Consider how the design of a notation for representing abstract or physical things and their relationships affects our understanding and portrayal of our universe. Another case in point: How would the world be different if we had adopted the calculus notation of Leibniz vs. Isaac Newton or E.F. Codd?

Map Algebra

Consider the concept of an algebraic function: Y = f(X), read as "the quantity Y is a function of the quantity X." That function may be something like the equation for a line: Y = aX + b.

Now think of two dimensional patterns being substituted for the one-dimensional scalar values X and Y. In Map Algebra you may have a function stated as:
SCHOOL_ACCESS = distance(SCHOOLS) < 2000m
SLED_SLOPE = slope(ELEVATION) > 20
SCHOOL_SLED = SCHOOL_ACCESS * SLED_SLOPE

Think about how these functions allow us to abstractly, symbolically, and formally express relationships between patterns, and to make new patterns from interesting functions of existing patterns!
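To make the algebra analogy concrete, here is a minimal sketch in plain Python of a layer type that supports the comparison and multiplication used in the three statements above. The Layer class is hypothetical and is not the ArcGIS implementation. The distance and slope grids are assumed to be already computed, and their values are invented.

```python
class Layer:
    """A tiny raster layer: a grid of numbers supporting cell-by-cell algebra."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def _local(self, other, op):
        # Apply op cell by cell; 'other' may be a Layer or a plain number.
        if isinstance(other, Layer):
            return Layer(
                [[op(a, b) for a, b in zip(r1, r2)]
                 for r1, r2 in zip(self.rows, other.rows)]
            )
        return Layer([[op(a, other) for a in r] for r in self.rows])

    def __lt__(self, other):
        return self._local(other, lambda a, b: int(a < b))

    def __gt__(self, other):
        return self._local(other, lambda a, b: int(a > b))

    def __mul__(self, other):
        return self._local(other, lambda a, b: a * b)


# Hypothetical precomputed inputs: distance to the nearest school in meters,
# and slope in the same units as the threshold of 20 used in the notes.
school_distance = Layer([[500, 1500, 2500], [1200, 1900, 3100]])
slope = Layer([[5, 25, 30], [22, 10, 40]])

school_access = school_distance < 2000   # 1 where a school is within 2000 m
sled_slope = slope > 20                  # 1 where the slope exceeds 20
school_sled = school_access * sled_slope # 1 only where both are true

print(school_sled.rows)
```

Multiplying two 0/1 layers acts as a logical AND, which is why the third statement can be written as a product.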

Structures for Representation

It may be useful to think of the representational and functional vocabulary of raster GIS in the context of what we know about Relational Database Management Systems and Vector GIS. Whereas the latter are primarily oriented toward representing Distinct Entities, Raster GIS is oriented toward representing Locations, Neighborhoods, and Regions.

In lecture we will demonstrate these basic representational and procedural capabilities using ArcGIS. If you want to study them in your own time, look at the ArcGIS Spatial Analyst User Guide linked at the top of this page. We may also refer to the illustrations provided in Illustrations of Raster Representation Vocabulary.

Cells

To begin with, the fundamental unit of analysis in raster systems is the Cell. A cell represents a location in tessellated space. The condition of a given cell is recorded as a numeric value. A raster or layer, sometimes referred to as a GRID, is a regular arrangement of cells. The process of choosing a size for these cells and assigning values to them is called sampling.

There are basically two types of grids. The first is the integer grid, in which the attribute for each cell is an integer and there are typically fewer than a few thousand possible values for cells. In this case, we refer to the set of cells having the same value as a zone. Each zone in an integer grid is represented by a row in the grid's value table.

We also have grids whose values represent small incremental changes in quantities like elevation or distance, stored as floating-point (decimal) numbers. These grids have so many potential values that the concept of zones is typically not applied, and these grids do not have value tables.
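The relationship between an integer grid and its value table can be sketched as follows. This is an illustration in plain Python, not the ArcGIS storage format: the land-use codes other than 24 (Forest) and the helper name value_table are assumptions made for the example.

```python
from collections import Counter

# A small integer grid of land-use codes (values invented for illustration).
landuse = [
    [24, 24, 11],
    [24, 11, 11],
    [30, 30, 24],
]

# Hypothetical labels; only "24: Forest" appears in these notes.
names = {11: "Residential", 24: "Forest", 30: "Water"}


def value_table(grid):
    """Return one (value, cell_count) row per zone, sorted by value."""
    counts = Counter(v for row in grid for v in row)
    return [(value, counts[value]) for value in sorted(counts)]


table = value_table(landuse)
for value, count in table:
    print(value, names[value], count)
```

A floating-point grid would produce nearly one row per cell under this scheme, which is the practical reason such grids carry no value table.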

Layers (aka Rasters or Grids)

Another fundamental unit of analysis in raster GIS is a Layer. Layers are containers for handling regular arrays of cells. Cells are usually square, but some people have argued that they should be hexagons. Why?

Layers are geographically referenced and can be stacked up, creating relationships among cells that share the same location (see local functions below.)

Zones

In an integer-based grid, each unique cell value is associated with a row in a Value Attribute Table. Each cell sharing the same value (e.g. 24: Forest) is associated with every other cell having the same value on that layer. This set of related cells (which need not be contiguous) is known as a Zone. If we wanted to compute a summary, such as the average of the values on an Elevation layer, for each zone, such as Forest or Residential, we could use an associative procedure known as a Zonal Function to find the average elevation that occurs over all forested cells and over all residential cells.
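A zonal average of this kind can be sketched in plain Python as below. The function name zonal_mean, the code 11 for Residential, and all cell values are assumptions for illustration. The two layers are assumed to be co-registered.

```python
from collections import defaultdict


def zonal_mean(zones, values):
    """Average the cells of 'values' that fall under each zone of 'zones'."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zrow, vrow in zip(zones, values):
        for z, v in zip(zrow, vrow):
            totals[z] += v
            counts[z] += 1
    return {z: totals[z] / counts[z] for z in totals}


# Zone layer: 24 = Forest, 11 = Residential (11 is a hypothetical code).
landuse = [
    [24, 24, 11],
    [24, 11, 11],
]
# Data layer: elevation values invented for illustration.
elevation = [
    [100.0, 200.0, 50.0],
    [300.0, 70.0, 60.0],
]

means = zonal_mean(landuse, elevation)
print(means)
```

Note that the cells of a zone are grouped by value alone, so the result is the same whether the forest cells touch each other or not.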

We can use the value attribute table of an integer grid with various relational procedures like select queries and joins.

NoData or Null Values

Cells may hold integer or floating-point values, but in developing the logic of raster GIS it is necessary to allow cells to have a value of NoData (Null) as well. NoData cells affect map algebra functions in various interesting and useful ways. In some functions, like Distance, NoData provides the empty areas into which distance is calculated. But in functions that have multiple inputs, NoData acts as a mask: a cell with a value of NoData in any input layer returns NoData at the same location in the output layer. In a Merge function, the NoData zone is treated as transparent, allowing a new raster to be built up from a stack of other rasters.
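The two NoData behaviors described above, masking and transparency, can be contrasted in a short sketch. This is plain Python with None standing in for NoData. The function names local_sum and merge and the cell values are illustrative assumptions, not the ArcGIS functions themselves.

```python
NODATA = None  # stand-in for a NoData cell


def local_sum(a, b):
    """Cell-by-cell sum; NoData in either input gives NoData in the output."""
    return [
        [NODATA if x is NODATA or y is NODATA else x + y
         for x, y in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def merge(*layers):
    """The first non-NoData value in the stack wins; NoData is transparent."""
    def first_value(cells):
        for c in cells:
            if c is not NODATA:
                return c
        return NODATA

    return [
        [first_value(cells) for cells in zip(*rows)]
        for rows in zip(*layers)
    ]


# Roads coded 99 (a hypothetical value), NoData everywhere else.
roads = [
    [NODATA, 99, NODATA],
    [99, 99, NODATA],
]
landuse = [
    [24, 24, 11],
    [24, NODATA, 11],
]

masked = local_sum(roads, landuse)   # NoData acts as a mask
covered = merge(roads, landuse)      # NoData acts as transparent

print(masked)
print(covered)
```

The same input cells give opposite outcomes: the sum is defined only where both layers have data, while the merge is defined wherever at least one layer does.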

Important Geoprocessing Environment Settings for Working with Rasters

General -> Analysis Extent: When making models with raster functions you should be sure to set your Analysis Extent to a specific area (perhaps defined by the extent of one of your input layers). The default extent, "Minimum of Inputs," will cause the extent of some outputs to be clipped, especially when your input is a point feature class.

General -> Coordinate System: In many raster analysis operations a cell size may be declared, or a distance may be calculated. The units used will default to the units of the coordinate system of the input data. This can lead to unpredictable results. It is better to set the output coordinate system to an appropriate projection for your area.

Raster Analysis -> Cell Size: The default cell size is "Maximum of Inputs," but you may want more control than this.

Basic Raster Transformations and Associations

The following is a discussion of the basic raster functions expected in any Raster GIS software implementation. You can learn more about how ArcGIS exposes these functions by looking them up in your ESRI Toolbox using the search tool and then looking at their help.

  • Feature to Raster: Lets you rasterize a vector layer.
  • Euclidean Distance, Cost Distance: Each cell on the resulting grid holds the distance to the nearest non-NoData cell in the specified existing grid. This function would be used if we wanted to find areas within a certain distance of wetlands. Cost Distance lets you calculate the cost of travelling to or from a source zone or feature class through a raster of uneven resistance.
  • Reclassify: Permits the reclassification of the values in a layer. This is what you would do to reclassify the values from a land-use grid according to their suitability for siting a garbage dump, or to categorize slope into classes such as Low, Medium and Steep.
  • Single Output Map Algebra: The values for cells in the output are a function of the cells in one or more input grids. In this tool you can do math or logical functions, calculating an expression with a single layer or with several layers at once. There are also many other functions that can be entered here. This is the tool that lets you calculate a weighted combination of various intermediate value grids in your McHargian overlay.
  • Slope, Aspect: These functions take an existing elevation grid (aka digital elevation model, or DEM) and return grids of Slope, Aspect, or synthetic shaded relief.
  • Focal Statistics, Point Statistics: Analyzes a data raster using a moving neighborhood region. Returns a raster in which each cell value expresses a summary of the data grid for the neighborhood centered on that cell. The Point Statistics function can visit each cell and count the number of bakeries within 500 meters.
  • Zonal Statistics: Given a layer of regions (or zones) and a data raster, the zonal analysis tools will compute a summary of the data values occurring under each zone. The zones can be polygons or rasters. You could use this tool to calculate the range of elevations that are covered by the zones of various land cover types.
  • Viewshed: Calculates the area visible from a point on (or a specified offset elevation above) a terrain surface.
  • Region Group: Creates distinct regions for groups of contiguous cells. This is very useful for "clumping" areas of forest in order to distinguish large patches from small ones.
  • Merge (Map Algebra Function): Merge is the function that allows you to 'cover' values in one raster with values from another. For example, let's say you want to add roads to your land use layer. To do this you could reclass your roads to a value that does not already exist on the landuse layer, and make sure that all the cells that aren't roads are classed as NoData. For some reason there is no wizard for this function in the toolbox; you must use it as a function in the Map Algebra tool, e.g.
    Merge(roads, landuse)
    Two or more rasters can be listed within the parentheses. Values from the layers listed first supersede the values on lower grids, and NoData cells behave as if they are transparent. Documentation for this function can be found in the ArcGIS online help under How Merge Works.
  • Combine: For each cell in the output grid, we get a unique value for each combination of values on any number of input grids. This would be useful if we had a grid of 4 classes of solar aspect, and a grid showing diseased and non-diseased Aspen trees. The resulting grid would distinguish between healthy/north, healthy/south, diseased/north, diseased/south... up to 8 different combinations.
  • Conditional: Conditional functions choose the value of each output cell from one of two rasters, depending on whether a logical expression is true at that location. For example, if the value for a location in the Elevation raster is greater than zero, the cell's value on the output will be taken from the Terrestrial_Species grid. If the elevation is less than or equal to zero, the value of the output cell will be taken from the Marine_Species grid.
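Several of the functions in the list above are simple enough to sketch in plain Python. The versions below are illustrations of the ideas only: the function names, signatures, and sample values are assumptions, and they do not reproduce the ArcGIS tools (for example, the distance function is a brute-force search that assumes at least one source cell, and the codes that Combine assigns here follow order of first appearance).

```python
import math

NODATA = None  # stand-in for a NoData cell


def euclidean_distance(source, cell_size=1.0):
    """Distance from each cell to the nearest non-NoData cell in 'source'."""
    targets = [(r, c) for r, row in enumerate(source)
               for c, v in enumerate(row) if v is not NODATA]
    return [
        [min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
         for c in range(len(row))]
        for r, row in enumerate(source)
    ]


def reclassify(grid, ranges):
    """Give each value the class of the first (low, high, new) range
    for which low <= value < high; values in no range become NoData."""
    def lookup(v):
        for low, high, new in ranges:
            if low <= v < high:
                return new
        return NODATA
    return [[lookup(v) for v in row] for row in grid]


def con(condition, if_true, if_false):
    """Take each cell from 'if_true' where 'condition' holds, else 'if_false'."""
    return [[t if c else f for c, t, f in zip(cr, tr, fr)]
            for cr, tr, fr in zip(condition, if_true, if_false)]


def combine(a, b):
    """Assign one output code per unique pair of input values."""
    codes = {}
    out = []
    for ra, rb in zip(a, b):
        out_row = []
        for pair in zip(ra, rb):
            codes.setdefault(pair, len(codes) + 1)
            out_row.append(codes[pair])
        out.append(out_row)
    return out, codes


# Areas within 150 m of a wetland, on a grid of 100 m cells (values invented).
wetlands = [[1, NODATA, NODATA], [NODATA, NODATA, NODATA]]
dist = euclidean_distance(wetlands, cell_size=100.0)
near = reclassify(dist, [(0, 150, 1), (150, math.inf, 0)])

# Species grid chosen by elevation, as in the Conditional example.
elevation = [[5, -2, 0], [3, -1, 8]]
is_land = [[e > 0 for e in row] for row in elevation]
terrestrial = [[10, 10, 10], [10, 10, 10]]
marine = [[20, 20, 20], [20, 20, 20]]
species = con(is_land, terrestrial, marine)

# Aspect class crossed with a diseased (1) / healthy (0) grid.
aspect = [[1, 2], [1, 2]]
diseased = [[0, 0], [1, 1]]
combined, codes = combine(aspect, diseased)

print(near, species, combined, sep="\n")
```

Chaining the outputs this way, distance into reclassify, or a comparison into a conditional, is the same pattern that the Map Algebra tool expresses in a single statement.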