The National Spatial Data Infrastructure: A Societal Information System

Presented November 8, 2000 by Paul Cote as a guest lecture for ENR-100.

Professor Clark has asked me to prepare a lecture, some readings, and a laboratory exercise for you dealing with the subject of GIS and Remote Sensing as they relate to biodiversity. I will admit from the start that I am not an expert on biodiversity, but I have done a lot of thinking about GIS and a little about remote sensing. I know that investigations of biodiversity, and analyses of policy impacts upon it, are very information-intensive enterprises, and many if not most of the important questions related to biodiversity have an important spatial component. Therefore, if we are interested in understanding biodiversity and how our actions may affect it, we really need to understand the information systems that permit us to look at and model these complicated spatial relationships.

Information policy is a very important piece of environmental policy.

In my class on GIS, I find it useful to demystify information systems by beginning with the fundamentals: The essential functions of an information system consist in systematically storing and retrieving representations, and modeling systematic associations among these elements. One important corollary of this definition is that the effective differences among various information systems can be boiled down to the following questions:

  1. What is the nature of the representations that may be stored?
  2. What is the nature of the associations that may be made?
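As a rough illustration of the two functions named above (storing and retrieving representations, and modeling associations among them), here is a minimal Python sketch. The class name, the record fields, and the sample places are all hypothetical, invented only for illustration; real information systems differ mainly in how rich the stored representations and the permitted associations can be.

```python
class InformationSystem:
    """Toy model: a store of representations plus a way to associate them."""

    def __init__(self):
        self._records = {}

    def store(self, key, representation):
        # Systematically store a representation under a key.
        self._records[key] = representation

    def retrieve(self, key):
        # Systematically retrieve a stored representation.
        return self._records[key]

    def associate(self, relation):
        # Return every ordered pair of keys whose representations
        # satisfy the given relation.
        keys = sorted(self._records)
        return [
            (a, b)
            for a in keys
            for b in keys
            if a != b and relation(self._records[a], self._records[b])
        ]


# Hypothetical sample data.
system = InformationSystem()
system.store("pond", {"kind": "habitat", "town": "Concord"})
system.store("landfill", {"kind": "hazard", "town": "Concord"})
system.store("marsh", {"kind": "habitat", "town": "Lincoln"})


def hazard_in_same_town(a, b):
    # One possible association: a hazard and a habitat in the same town.
    return a["kind"] == "hazard" and b["kind"] == "habitat" and a["town"] == b["town"]


pairs = system.associate(hazard_in_same_town)
```

Here the representations are simple attribute records and the only association is a shared town name. A GIS extends both answers: the representations include shapes and locations, and the associations include spatial ones such as nearness and overlap.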

Given this theoretical framework for evaluating information systems, let's look at the common information systems that are important to us, and how they function to help us expand our mental models of the world:

The fundamental capacities of information systems, when reduced to their elements as above, appear simplistic. Yet information systems have many interesting properties when complicated sequences of data input and processing are combined. The web is an interesting case of an information system that promotes societal intelligence -- the structure of the web itself is not designed, but the individual components are dynamically accessed and associated in such a way that new information is generated by the sum of the contributions. In the next few topics we will look at how GIS is beginning to behave in a similar way.


From Project-Based GIS to Information Infrastructure

An interesting evolution is occurring in information systems. In the early stages of information systems and GIS, most databases were single-purpose. More recently, organizations that have learned to manage information are planning I.S. strategies to leverage information resources in virtual Data Warehouses that permit people to gather and inter-associate information collected for different purposes within the organization (Dangermond, 1999). Such information infrastructure-building efforts require strategic information-system planning at the very top of the organization.

From the perspective of GIS, one such strategic information infrastructure plan has been undertaken by the Federal Government of the United States. The National Spatial Data Infrastructure was mandated by an executive order of President Clinton.

The National Spatial Data Infrastructure (NSDI) recognizes that:

These justifications make sense merely at face value; but in a short time, we will see that freely available government information is creating elemental building blocks for a much more intelligent society.

A small sampling of federal data components of the NSDI available at no charge, with minimal or no copyright restrictions:

The US Census TIGER Line Files

US Geological Survey Digital Line Graphs

US Department of Agriculture Soils Data

US Geological Survey and Defense Department Digital Elevation Models

NASA Satellite Data from 1978 onward

Spatially referenced hazards data from EPA
...and many, many more

At the state level (in Massachusetts and most other states), similar strategic data sharing policies are in effect. Some examples of data available to the public over the web, at no cost, include:

Mid-Scale Land Use Data

USGS/MassGIS Orthophotographs

Surficial Geology
...and many, many more

At the local level, administrators have not been quite as quick to recognize the benefits of making data free to the public or easy to share. These information resources are some of the most useful for planning and understanding environments; however, public agencies in Boston and Cambridge (for example) attempt, fallaciously, to "recover costs" of database production by charging well over the costs of reproduction for these databases.

Property Parcel Data from the Tax Assessor

Engineering-Scale information regarding Buildings, Streets and Terrain


Societal GIS

As far as I know, the term societal GIS was coined by Jack Dangermond (GSD MLA, 1969). It is interesting to think that government officials, long the stewards of a physical public infrastructure, are increasingly the creators and stewards of an infrastructure of representations -- an information infrastructure. Every application that begins with government data sources builds on this infrastructure. The spatial data infrastructure provides a basis for development, just as the physical infrastructure does.

Consider the following example (by Amy Cupples Rubiano, a former student in GSD6322, Fundamentals of GIS) which takes advantage of several independently published data sources:

  1. A person starts with a listing of events referenced by street addresses (in this case, the EPA's Toxics Release Inventory).
  2. Using a vector GIS, the person associates the Toxic Release addresses with street locations from US Census TIGER Line files.
  3. Another GIS association defines the areas within half a mile from the Toxic Release locations.
  4. Information about how many people live in the area is refined with land use data from MassGIS.
  5. The refined population map is then used to estimate the population and ethnic composition in the areas near toxic releases.
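The five steps above can be sketched in a few lines of Python. This is only a toy sketch under stated assumptions, not the student's actual method: the street table, cell grid, tract populations, and all function names are hypothetical stand-ins for the TIGER, land use, and census layers, and coordinates are assumed to be planar and in meters. Ethnic composition is left out; it could be handled the same way, one group at a time.

```python
import math

HALF_MILE_M = 804.672  # half a mile expressed in meters

# Hypothetical stand-in for TIGER address matching: address -> (x, y) in meters.
STREET_LOCATIONS = {
    "10 Mill St": (0.0, 0.0),
    "200 River Rd": (3000.0, 0.0),
}


def geocode(addresses, street_locations):
    """Step 2: attach a location to each address; report misses, never guess."""
    matched, unmatched = {}, []
    for address in addresses:
        if address in street_locations:
            matched[address] = street_locations[address]
        else:
            unmatched.append(address)
    return matched, unmatched


def within_buffer(point, sites, radius):
    """Step 3: is the point within the given radius of any release site?"""
    return any(math.dist(point, site) <= radius for site in sites)


def refine_population(tract_population, cells):
    """Step 4: spread each tract's population over its residential cells only."""
    residential_counts = {}
    for cell in cells:
        if cell["residential"]:
            tract = cell["tract"]
            residential_counts[tract] = residential_counts.get(tract, 0) + 1
    return [
        tract_population[c["tract"]] / residential_counts[c["tract"]]
        if c["residential"] else 0.0
        for c in cells
    ]


def population_near_sites(cells, cell_population, sites, radius=HALF_MILE_M):
    """Step 5: total the refined population of cells inside any buffer."""
    return sum(
        pop for cell, pop in zip(cells, cell_population)
        if within_buffer(cell["xy"], sites, radius)
    )


# Step 1: a listing of events referenced by street address (hypothetical data).
release_addresses = ["10 Mill St", "200 River Rd", "1 Nowhere Ln"]
tract_population = {"A": 1000, "B": 600}
cells = [
    {"xy": (100.0, 0.0), "tract": "A", "residential": True},
    {"xy": (1500.0, 0.0), "tract": "A", "residential": True},
    {"xy": (400.0, 400.0), "tract": "A", "residential": False},
    {"xy": (3000.0, 500.0), "tract": "B", "residential": True},
    {"xy": (3500.0, 500.0), "tract": "B", "residential": True},
    {"xy": (5000.0, 0.0), "tract": "B", "residential": True},
]

matched, unmatched = geocode(release_addresses, STREET_LOCATIONS)
cell_population = refine_population(tract_population, cells)
exposed = population_near_sites(cells, cell_population, list(matched.values()))
```

Note that the unmatched address is kept visible rather than silently dropped, and that the non-residential cell near the first site contributes no population even though it falls inside the buffer. That is the refinement the land use layer provides.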

Phenomena in which individuals and groups, by their collective actions, create systems that are more than collections of individual parts can be termed societal phenomena. As time goes on, datasets become more detailed and more useful. At the same time, people are sharing data more, and using one data layer as a base to make yet another useful database. Understanding this, it is easy to predict that although GIS is already a very important means of storing, studying and sharing spatial information, its use and usefulness are only at the beginning of a very steep, exponential ascent.


Growing Pains on the GIS Frontier

It should be clear from the arguments made above that, for people interested in the arrangement of things in space, GIS and information infrastructure are important things to understand. I believe that it is useful to discuss five levels of GIS literacy:

  1. Formal Capacities of Information Systems: In order to plan or think critically about GIS applications, one should have the most general knowledge about how "Ideal" information systems behave. These formal capacities can be understood without regard to specific data or software or any specific application:
    1. What are the capacities of a particular information system to store and associate representations of things?
    2. How can the capacities of a particular I.S. be applied to represent entities and relationships that are of interest to us?

  2. Practical Problems of Implementation: These are the issues that come up as soon as one attempts to find or create real data to build information models.
    1. How are databases designed such that formal queries and associations return valid results?
    2. What are the most appropriate data sources, and how can one assess the relative quality of geographic data sources?
    3. What sorts of spatial coordinate systems is one likely to encounter? Which one is most appropriate for a particular application in a given geographic area?
    4. What sort of problems is one likely to encounter in data collection, conversion, and assimilation of various datasets into a useful assemblage of information resources for a model?

  3. Technical Problems of Implementation: These are often software-specific issues of how-to-do-it:
    1. How do the buttons and menus of this application represent the formal capacities of the information system?
    2. How should information resources be organized for an effective information system implementation?
    3. How can I write a program or customize the software to facilitate a particular process?

  4. Critical understanding of I.S. Models: No representation is complete. In an assemblage of representations, many practical and technical choices are made, and each one has an effect on the quality of the result. Formal, practical and technical understanding of GIS are prerequisites for evaluating:
    1. Is there an appropriate fit between the formal capacities of information systems employed and the types of entities and relationships that are being modeled?
    2. Are the databases designed in such a way that queries and associations employed are returning valid results?
    3. What sort of errors should one expect from a given assemblage of imperfect information resources?
    4. How would one evaluate whether a GIS representation is "good enough" for a given purpose?
    5. How could a given GIS technical implementation be improved?

  5. Understanding of the Institutional Concerns related to Information Infrastructure: As leaders, you may choose to distance yourselves from "nerdy" issues such as database administration. I hope that you now have a sense of the importance of institutional information and GIS, and that you will recognize the importance of proper planning, stewardship and sharing of institutional information. As Bill Clinton demonstrated with his Executive Order, these issues often require mandates from the highest levels.