Beginning your Place-Based Data Collection
Nearly every design project considered at the Harvard Graduate School of Design has a site. One task in beginning a design project is to gather information from various sources that will help the designer and their clients understand the site and its context. This tutorial provides instruction for finding and assembling basic geographic information for the purpose of understanding and portraying basic site context. In addition to being useful for the production of basic maps, these basic information layers will provide a framework for more complex thematic maps and analytical studies, and a setting for new information created in the design process.
Without some thought devoted to structuring collections of data, beginning users of GIS tend to create very messy agglomerations of files, which become increasingly burdensome and nearly impossible to reorganize as the project progresses. Therefore, we will introduce a proven set of principles for a scalable file structure for a GIS data collection.
Big Ideas in this Tutorial
- GIS Datasets represent measurements and observations of entities made with a specific purpose in mind, and employing particular referencing systems.
- Metadata is an essential element of a collection of place-based data. While collecting data, you should also collect metadata for each dataset, if it exists.
- Portrayals (e.g. Layers and Map Documents) reference datasets through file-system path references. A dataset may be referenced in many different layers and maps.
- Structuring collections of GIS data, metadata and portrayals so that the relative path references among the portrayals and datasets remain stable is of critical importance for making a collection of resources that is shareable and reusable on different computers.
Objective:
After studying this tutorial, you should be able to begin your place-based data compilation using datasets from the internet and the GSD data collection. Your dataset should be well structured and well documented to support collaboration by a team of designers, and archiving for use in future projects.
Overall Flow
- Explore a Collection of GIS Data
- Observe the properties of GIS Data and Metadata
- Observe the properties of GIS Data portrayals (Map Files and Layers)
- Explore two modes of GIS Data assimilation: Downloads from MassGIS, and Extracting Data from ESRI Streetmap.
- Learn to make file path references relative to the working directory
- Learn to check file references
- Learn to back up the project folder
Sample Dataset
Click here to download the sample dataset. Create a folder for yourself in your c:\temp folder, and expand the zip archive containing the sample dataset into this folder.
Background Information
Software User Guides for Deeper Reading:
Explore a Compilation of Data and Maps
Our objective over the next few weeks is to gather data together into a cohesive compilation that will serve as a base of resources that we will build on over the course of the semester. Our compilation will involve data from various sources and map documents that reference these data, and later we will add models to this collection as well. Because maps and models reference the data, it is best if the organization remains stable, since moving data will cause the path references in our maps and models to break. This structure will also help us to keep our data compilation backed up.
The folder greenline_project in our sample dataset is such a collection. You can open this folder in Windows Explorer and see that there is a folder structure that conforms to the structure as explained below:
Collection Folder
All of the data and documents we collect will be contained in an Outer Folder named for our site or project.
Our folder will contain a Metadata File that explains the purpose of the collection, and when it was collected, among other things. The collection may have subfolders for various types of data. The collection also contains a Documents folder named docs that contains the presentation documents and master map documents created for the project.
The metadata in the various folders in this collection can be made with the GSD Metadata tool. Here are examples for the Overall Collection Metadata and the Compilation of data from Rhode Island GIS.
GIS Data Folder
The GIS Folder in our collection has subfolders to hold data from different sources.
Each source folder has its own Metadata File that explains the source of the data, and when it was collected, among other things. In addition to holding GIS Datasets, each of these may have dataset-specific metadata and dataset-specific Portrayal Information. Within each application-specific folder, there may be working folders for each project collaborator containing working data, documents and tools created by that user.
This structure provides a predictable means of understanding where each piece of data should go when it is collected, and where it may be found later. It also allows the project collection to grow in terms of the number of collaborators and derived datasets without becoming unmanageable. Before the project compilation is shared, user data considered to be intermediate can be cleaned up. User documents that are to be shared with the project may be moved to the Project Documents folder.
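The folder layout described above can be sketched in a few lines of Python. This is only an illustration and not part of any GSD tool: the function name make_collection, the example source-folder names, and the metadata.txt placeholder file are all assumptions made for the sketch.

```python
from pathlib import Path

def make_collection(root, sources=("massgis", "streetmap")):
    """Create a collection folder with docs/ and gis/<source>/ subfolders.

    Folder and file names here are illustrative assumptions.
    """
    root = Path(root)
    folders = [root / "docs"] + [root / "gis" / s for s in sources]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    # Put a placeholder metadata note at the top level and in each source folder.
    for folder in [root] + folders[1:]:
        note = folder / "metadata.txt"
        if not note.exists():
            note.write_text("Purpose:\nSource:\nDate collected:\n")
    return root
```

Calling make_collection with a path such as c:\temp\yourusername\my_compilation would produce the docs and gis folders in one step; running it again leaves existing files alone.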
Tips for Reliable Filesystems
- Keep your working files on the local hard drive. Working with data on network filesystems or USB drives is not only slower, but is also subject to all sorts of unpredictable behavior which we don't need complicating our lives.
- Don't work in your Backup Copy. The tips in this tutorial will help you build a file structure that you can move to the local filesystem before starting to work, and back up to the network or your USB drive when you are ready to take a break.
- Save backup copies of your map documents and your entire project as versions. That way, if you, the software, the computer or the electric company does something terrible to your work, you can always revert to your previous working version!
- Don't work in folders that have spaces in their names. This includes the Desktop and My Documents folders.
- Never begin the name of a file with a numeral. It's not entirely clear why, but trust me, it can bring you very bad luck! It has something to do with the assumptions that are made by programming languages.
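The last two tips are easy to check mechanically. The sketch below is a hypothetical helper (the name risky_parts is invented here) that flags any folder or file name in a Windows-style path that contains a space or begins with a numeral. It only inspects the text of the path and does not touch the disk.

```python
from pathlib import PureWindowsPath

def risky_parts(path):
    """Return the parts of a path that contain spaces or start with a digit."""
    flagged = []
    for part in PureWindowsPath(path).parts:
        name = part.strip("\\/")
        if not name or name.endswith(":"):
            continue  # skip the drive or root component, e.g. "c:\"
        if " " in name or name[0].isdigit():
            flagged.append(name)
    return flagged
```

For example, risky_parts(r"c:\temp\My Documents\2008_roads.shp") returns ["My Documents", "2008_roads.shp"], while a path such as c:\temp\greenline_project\gis returns an empty list.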
Getting started with ArcMap
Let's dig in to ArcMap and start exploring this collection of data.
ArcMap Documentation
- Workspace and Dataset Management with ArcCatalog
- Adding Layers to a Map and other documents under the heading Working with Layers
- The ArcMap Tools Toolbar
- Elements of Geographic Information
- Using ArcMap Chapter 10, Working with Tables
- Use Windows Explorer to find your way into begining/greenline_project/docs and double-click the ArcMap Map Document named compilation.mxd.
- You will see a nice display show up with a Map Window and a Table of Contents window filled with Map Layers.
- Use the various tools on the toolbar to zoom in and zoom out on these data and to identify features.
- Open the Attributes Table of your Institutions layer and take a look at all of the rows representing entities and the attributes represented in this layer.
- Is this data useful for us? What does it mean? Who collected it, How? Why??? What do the codes for FIPSSTCO mean? Are they useful?
- Right Click on the Institutions layer and choose Data > View Metadata
- Because the institutions dataset uses metadata formatted according to the Federal Geographic Data Committee standard for Geospatial Metadata, this documentation can be viewed and managed within ArcMap.
- Look at the Attributes for the MassGIS openspace layer. What does Pub_Access mean? When we try View Metadata on this layer, nothing happens. Apparently, there is no FGDC metadata for this.
- Right-click on MassGIS openspace and open its Properties. Look at its Source properties to figure out where on the filesystem this dataset is stored.
- Use Windows Explorer to find the lus.htm metadata file associated with the protected_openspace.shp file. Open this to get an idea of what these data represent.
- The world of metadata is still in a fairly disorganized state. You can see from this example how metadata is essential to understanding data, but some care may be necessary to make sure that we obtain and manage the metadata, in its various forms, that comes with our data. In some cases, we may not be given any metadata, but we could at least create a little text file that explains where we got the data, and when, and any other information that may be useful.
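The "little text file" mentioned in the last step can be written by hand, or with a small script like the sketch below. The function name write_source_note, the _source.txt naming pattern, and the four fields are assumptions chosen for illustration; this is not FGDC metadata and ArcMap will not recognize it as such.

```python
from datetime import date
from pathlib import Path

def write_source_note(dataset_path, source, notes="", collected=None):
    """Write <dataset name>_source.txt next to a dataset and return its path."""
    dataset_path = Path(dataset_path)
    collected = collected or date.today()
    note_path = dataset_path.with_name(dataset_path.stem + "_source.txt")
    lines = [
        f"Dataset: {dataset_path.name}",
        f"Source: {source}",
        f"Date collected: {collected.isoformat()}",
        f"Notes: {notes}",
    ]
    note_path.write_text("\n".join(lines) + "\n")
    return note_path
```

Because the note shares the dataset's name prefix and sits in the same folder, it is easy to spot and to move along with the data.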
Use ArcCatalog to Simplify the Task of Managing GIS Data
We have seen in our forays into our data folders that GIS datasets are complexes of files. A Shape File dataset is actually composed of 4 or 5 different files (for example .shp, .shx, .dbf and .prj) that all share the same file prefix. This presents problems for moving and copying files. The application ArcCatalog is a special file browser that makes this process simpler.
- Use ArcCatalog to explore the data (for example within the net_resources/esri_streetmap folder.) Notice how it looks simpler.
- Use ArcCatalog to look at the metadata for the Public_Buildings dataset. Since this dataset has metadata that complies with the Federal Geographic Data Committee standard for geospatial metadata, ArcCatalog can be used to view and manage it. If the layer is copied, ArcCatalog will also copy the metadata!
- Use ArcCatalog to look in the net_resources/massgis_downloads folder. Notice that these datasets do not have FGDC-compliant metadata. But if you look in the folder with Windows Explorer you will see that there are HTML documents in there that represent the MassGIS metadata. MassGIS includes these files when we download the data, but we need to take care that the metadata stays with the data when we move it around.
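To make concrete what "a complex of files sharing a prefix" means, here is a minimal Python sketch that copies every file belonging to a shapefile, plus any separately named metadata files you list. It is not how ArcCatalog works internally; copy_dataset and its extra_files parameter are hypothetical names used only for this example, and on a case-sensitive filesystem the prefix match is case-sensitive too.

```python
import shutil
from pathlib import Path

def copy_dataset(shp_path, dest_folder, extra_files=()):
    """Copy every file sharing the shapefile's prefix, plus named extras."""
    shp_path = Path(shp_path)
    dest_folder = Path(dest_folder)
    dest_folder.mkdir(parents=True, exist_ok=True)
    copied = []
    # All parts of a shapefile share one prefix: roads.shp, roads.shx, roads.dbf ...
    for part in sorted(shp_path.parent.glob(shp_path.stem + ".*")):
        if part.is_file():
            shutil.copy2(part, dest_folder / part.name)
            copied.append(part.name)
    # Metadata such as an HTML page often has a different name, so list it explicitly.
    for name in extra_files:
        shutil.copy2(shp_path.parent / name, dest_folder / name)
        copied.append(name)
    return copied
```

The extra_files argument exists because metadata pages like the MassGIS HTML documents do not share the dataset's prefix, so no pattern match will find them; someone has to name them.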
Beginning a Data Compilation
So now that we have seen what a compilation of data looks like, let's pretend we are beginning our own from scratch. We will start by building a folder structure to hold our documents and our data. Then we will go find some data on the network and bring it in to our compilation. To make this lab simpler, I have taken some things off the network and put them into the net_resources folder in our sample dataset. For our example, we will pretend to get some data from Massachusetts GIS and then we will go through some techniques for extracting location-specific data from very large national or worldwide datasets such as ESRI Streetmap, which you will find explained in GIS Data Resources.
Downloading Data from the Web
The data in the MassGIS_Downloads folder is similar to what you might download from various state or city GIS sites that allow data downloads. These sites will often package datasets as zip archives with shape files, layer files and sometimes metadata, as MassGIS does. When downloading data it is a good idea to create a temporary downloads folder so that you can explore what you have before mixing it up with your compilation. In this exercise we will explore some of the data we got from MassGIS, and move information into our compilation.
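The idea of a temporary downloads folder can be sketched with Python's standard zipfile module. The function below (unpack_download is a made-up name) expands an archive into its own subfolder of a downloads folder and reports what was inside, so you can look for shape files, layer files and metadata before copying anything into the compilation.

```python
import zipfile
from pathlib import Path

def unpack_download(zip_path, downloads_folder):
    """Extract a downloaded zip into its own subfolder and list what it held."""
    zip_path = Path(zip_path)
    target = Path(downloads_folder) / zip_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = sorted(archive.namelist())
    return target, names
```

Giving each archive its own subfolder keeps the contents of different downloads from overwriting one another when they happen to use the same file names.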
Working with ArcGIS Portrayals: Map Files and Layers
In this phase of the tutorial we will create a new ArcMap document and use it to look at some of the data in our MassGIS downloads folder. We will look at a shape file directly, and also look at one of the official MassGIS portrayals that they make available as ArcGIS layer (.lyr) files. At this stage we will begin to understand the relationship between GIS datasets and the maps and layers that reference them, which is very important. Our goal is to create a compilation of maps and data that is all contained within one folder that we can easily move from one computer, disk or USB drive to another.
ArcMap Documentation
- Changing a layer's drawing order
- Changing a layer's text description
- Setting layer properties
- Displaying Labels
- Saving a Layer to Disk
- Referencing Data on the Map
- Repairing Data Source Links
- Create a new folder named my_compilation. Within it, create subfolders named docs and gis.
- Use File > New in ArcMap to create a new ArcMap document named compilation in your docs folder.
- Use the Add Data button to add the OPENSPACE_POLY.shp and the OpenSpacePrim_Purp.lyr from the MassGIS_downloads folder.
- For the moment, ignore the fact that the Protected and Recreational Openspace layer seems to have a problem.
- Change the name of the OPENSPACE_POLY.SHP layer to Public Open Space.
- Make your Parks green
- Adjust the layer's label settings to label the parks with their names.
- Right click on this layer and choose Save as Layer File to save this new portrayal to your Compilation/gis/massgis folder. You may need to use the Save dialog to create this folder.
- Use the Add Data button to add your layer back to the map
- Examine the data source properties of your new layer. Note that while this layer is within our compilation folder, it is referencing a dataset that is still in our massGIS_downloads folder.
- Since we want to keep the Openspace Poly layer with our compilation, we should use ArcCatalog to move a copy of it from downloads into compilation/gis/massgis.
- Don't forget to copy the metadata file as well! (In this case, the metadata file is named osp.htm.)
- Now fix the source property of our new Public Open Space layer so that it references the dataset in your own compilation folder.
We have just created a layer file and explored how it references a dataset. This new knowledge can help us in figuring out what is wrong with the Protected and Recreational Openspace layer that we added at the beginning of the last series of steps. The red exclamation mark indicates that the layer cannot find its source data. Examine the Source Properties of this broken layer, and fix the reference to point to our new copy of the OPENSPACE_POLY.SHP. Now you can see that this is a ready-made portrayal that MassGIS has created in order to distinguish the openspace polygons according to the Primary Purpose attribute. Take a look at the Symbology Properties for this layer and note that this portrayal assigns a color to each of the values of Prim_Purp and also translates the terse codes for this attribute into a human-readable legend. It is nice to be able to use portrayal information that comes with data, even if we may need to fix the references!
Extracting a Subset of Features from a Large Feature Class
A great starting place for any data compilation is the collection of data that comes with ArcGIS. At the GSD, this data collection is stored on-line in l:\public\geo\esridata_9. These data came on 4 DVDs and cover the whole world. We want to keep our data collection focused on our area of interest and its greater context. If we aren't careful, we will end up with gigabytes of data that aren't of any use to us and our collaborators. So we will export a subset of these features to our compilation folder.
In our sample dataset, the net_resources/esri_streetmap folder contains a small subset of data from ESRI Maps and Data that will help us to demonstrate the techniques of exploring these data and extracting small subsets to our compilation.
ArcMap Documentation
- Displaying layers at certain scales
- Exporting Features
- Referencing Data in Maps
- The ArcMap Table of Contents
- Use the Add Data button to add the layer named Streetmap Regional.lyr from the net_resources/esri_streetmap folder.
- Note that the Streetmap Regional layer is a Group Layer that contains a bunch of sub-layers, each referencing a dataset.
- Note that as you zoom in and out some layers appear and disappear. This is due to the visibility-scale property that you can examine and change from the General properties of each layer.
- Let's say we want to extract the major highways from the Streetmap Dataset to our compilation. We don't want to extract all the streets in the world, so we will zoom in to the Boston area.
- If necessary, we will right-click the Major Highways layer and click Visible Scale Range > Clear Scale Range so that we can see it.
- Now you can right click the Highways layer and use Data > Export Data to export just the data within current view extent into your compilation/gis/streetmap folder. You may need to create this folder. When this finishes, it will ask you if you want to add this layer to your display, click Yes.
- Now you see that you have managed to copy the features you need, but you have not got the portrayal, which in this case includes not only a classification of road line symbols, but also an elaborate symbolization of road labels and shields.
- Drag the Highways Layer out of the Streetmap Group, and alter its source properties to point to your copy of the Major Highways layer in your compilation.
- Save this new layer portrayal to your compilation/gis/streetmap folder.
Absolute and Relative Path Names
You are now almost ready to go. You have created your compilation folder, and you have gathered some data from the web and organized it within your compilation. We have also captured the portrayal information that controls the graphical display of these data. And we have a compilation map document that references all of this. So we are almost ready to copy this compilation to our backup media and share it with our colleagues. There is just one more problem to solve!
If we copy our folder to another computer, it may not be in exactly the same place. That is, our collaborator may want to copy it to a folder other than c:\temp\yourusername. And if this happens, the file references between the map document and the data will be broken and instead of seeing nice portrayals of data, they will see the Dreaded Red Exclamation Marks! Your final step before copying your work to your portable media is to set your map's file references to Relative Path Names, and then to check the source path for all of the files referenced on your map.
The Map Document and its Layers are Portrayals that Reference Datasets using Filesystem Paths. These path references may become invalid if the map document or the data are moved. Since we want our data compilations and maps to be portable, this is considered a BAD THING.
Absolute, or Full Path References are specific about what disk the data are on. They will work even if the map file is moved. If the data are moved, however, an Absolute Reference will become invalid. Relative Path References remain valid if the map document and the data are moved together and maintain the same relative position within a folder context.
Therefore, if we want to be able to move our data from one computer or disk to another, we need to learn how to maintain a stable relative relationship between our map documents and our data and to store relative path references in our map documents.
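The difference between the two kinds of reference can be demonstrated with plain string handling, without ArcMap at all. The sketch below uses Python's standard ntpath module, which applies Windows path rules on any platform. The folder locations and the two helper names are hypothetical; this illustrates the general idea only and does not claim to show how a map document stores its references.

```python
import ntpath  # Windows path rules, usable on any platform

def relative_reference(map_folder, dataset_path):
    """Path to the dataset as seen from the folder holding the map document."""
    return ntpath.relpath(dataset_path, start=map_folder)

def resolve_reference(map_folder, reference):
    """Turn a stored reference back into a full path, given where the map now lives."""
    return ntpath.normpath(ntpath.join(map_folder, reference))

# Hypothetical locations, before and after moving the whole compilation.
old_docs = r"c:\temp\me\my_compilation\docs"
old_data = r"c:\temp\me\my_compilation\gis\massgis\OPENSPACE_POLY.shp"
new_docs = r"d:\projects\my_compilation\docs"

rel = relative_reference(old_docs, old_data)
moved = resolve_reference(new_docs, rel)
stale = resolve_reference(new_docs, old_data)  # an absolute reference ignores the move
```

Here rel is ..\gis\massgis\OPENSPACE_POLY.shp, so after the move it resolves to the copy under d:\projects\my_compilation, while the absolute reference still points back at c:\temp\me, where the data may no longer exist.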
Saving your Map Document with all of its Portrayals to your Compilation
You have seen how layer files save portrayal information. It is also possible to save the whole map document with all of its layers to the docs folder of our compilation. To make sure that the compilation will work on another computer, we need to make sure that the map documents in our compilation folder do not reference datasets that aren't part of our compilation. For this, we will use the Source View of our ArcMap table of contents to view datasets according to what folders they reside in. Using this view we can eliminate any of the layers that are not in our compilation's gis folder. Then we set the Data Source Properties in ArcMap so that the document uses Relative Path Names to reference its data sources. Then we save the map document into the docs folder of our data compilation.
References:
- Referencing Data on the Map
- The ArcMap Table of Contents
- Pathnames explained: Absolute, relative, UNC, and URL
Always Check your References!!!
- Before closing your map, set your Table of Contents to Source mode.
- Look for data that are referenced from file system locations that aren't within your data collection folder.
- Delete any data sources in your map that aren't in your portable folder and move datasets as necessary.
- Set your Document Properties -> Data Source Options to: Save Relative Path References
- Save your map document to the docs folder in your user folder, and/or the top-level docs folder of your compilation.
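The second check in the list above, looking for data referenced from outside the collection folder, comes down to a simple path comparison. The sketch below assumes you have typed or pasted the source paths shown in the Source view into a list; it does not read the map document itself, and outside_collection is a hypothetical helper name.

```python
import ntpath  # Windows path rules, usable on any platform

def outside_collection(collection_folder, source_paths):
    """Return the source paths that do not live under the collection folder."""
    root = ntpath.normcase(ntpath.normpath(collection_folder))
    outside = []
    for source in source_paths:
        full = ntpath.normcase(ntpath.normpath(source))
        try:
            inside = ntpath.commonpath([root, full]) == root
        except ValueError:  # different drives, or a mix of absolute and relative
            inside = False
        if not inside:
            outside.append(source)
    return outside
```

Anything the function returns is a dataset that would show up as a red exclamation mark on a collaborator's computer unless it is first copied into the compilation and the layer's source is repointed.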
Conclusion and Your Assignment
You now have the fundamentals of working with ArcMap and organizing a concise and complete data compilation! Now you should find an area that you are interested in for your class project, investigate data sources that are available on the web, and begin your compilation, focusing first on those layers that will be useful for portraying the basic context of your site. One very good set of data and portrayals will be the ESRI Maps and Data CDs that ship with the ArcMap software.
ESRI Streetmap is a product that makes it very easy to make maps of anywhere in the US. Streetmap is a single CD which contains a very elaborate system of GIS data that has been packaged in a hierarchical set of ArcGIS layers which get more detailed as you zoom in. This provides a very easy source for GIS data at many scales.
Users at the GSD will find the ESRI Streetmap CD shared on the network at:
l:\public\geo\esridata_9
Collecting data from streetmap or the other esridata products uses all of the techniques covered in this tutorial. There are a couple of additional tips that you will find useful:
References:
- Managing Group Layers
To load ESRI Streetmap you may need to use the Connect to Folder button.
Street Reference streetmap has almost every street in the country.
Other layers, like hydrography, appear as you zoom out.
At the Regional Scale you have more general detail.
On out to National-Level of detail.
International and Other Data
If you are interested in data that is not on the ESRI Streetmap CD, such as the locations of schools and hospitals in the US, or basemap data for Europe, or general basemap data for the rest of the world, you should check out the data layers provided with ArcGIS in the ESRI Maps and Data CDs. Click here for a listing of the layers available on these CDs.
