Introduction and Data Types

What is a GIS?

The US Government says...

".. a system of computer software and procedures designed to support the capture, management, manipulation, analysis, and display of spatially referenced data for solving complex planning and management problems." Antenucci, 1991, p. 7

Let's see what the US Geological Survey says....

How does a GIS move us toward understanding?

·       How much support can hardware and software give to the brain? (implying that our brain is an expert pattern recognition machine)

·       How much better is a GIS at the task of finding spatial patterns than the brain?


What does a GIS do?

  1. collect, store, organize, and distribute data
    common to see data files in the 10-100 megabyte range, up to gigabytes of remote sensing data.
    USGS Seamless data or The National Atlas
  2. criteria matching
    "I need to find a place for an outdoor recreation specialist at the Forest Service that is...
    on public land
    with gentle slope
    with permeable soils (for the privy!)
    possessing nice views of the Blue Ridge
    amongst shade trees
    and within 50 m of a canoe-able river...
    so that she can build a campsite there."
  3. allows exploration of relationships among data layers
    "...high yield forage grass is most common on which rock types?"
    "does population density relate to water quality?"
  4. allows scenario testing
    "...if we raised the smokestack to 990 ft, would the effluent bypass the inversion layer?"
    "how about moving it to here?... or here?..."
    CommunityViz, a commercial ArcGIS extension for testing land use scenarios
  5. serves as a data handler for other analyses
    e.g., passing geologic and topographic data to an erosion model of the Appalachians, or passing water quality and groundwater levels to a groundwater flow model
    these models are typically written in other languages (C++, Fortran, etc.) that can access the GIS.
    1. A flood inundation model (HEC-RAS flow model coupled with GIS layer of watershed)
    2. A forest fire plume model
  6. aids visualization
  • which improves understanding and pattern recognition
  • & facilitates public participation in alternative scenarios
  • & coordinates group decision making
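
Criteria matching (item 2 above) amounts to a cell-by-cell logical AND across data layers. Here is a minimal sketch in plain Python, assuming every layer has already been converted to a grid with the same extent and cell size. The layer names, the tiny 3 x 3 grids, and the 10-degree threshold for "gentle slope" are all invented for illustration; a real GIS would do the same thing with its own overlay tools.

```python
# Hypothetical 3 x 3 layers on a shared grid (all values invented).
public_land = [[1, 1, 0],
               [1, 1, 0],
               [0, 1, 1]]    # 1 = public land, 0 = private
slope_deg = [[3, 12, 4],
             [5, 2, 30],
             [8, 6, 7]]      # slope in degrees
near_river = [[0, 1, 1],
              [1, 1, 1],
              [0, 0, 1]]     # 1 = within 50 m of a canoe-able river

MAX_SLOPE = 10  # assumed definition of "gentle slope"


def match_criteria(public, slope, river, max_slope=MAX_SLOPE):
    """Return a grid of booleans that is True where every criterion is met."""
    rows, cols = len(public), len(public[0])
    return [[public[r][c] == 1
             and slope[r][c] <= max_slope
             and river[r][c] == 1
             for c in range(cols)]
            for r in range(rows)]


suitable = match_criteria(public_land, slope_deg, near_river)

# List the (row, column) positions of the cells that pass every test.
candidates = [(r, c)
              for r, row in enumerate(suitable)
              for c, ok in enumerate(row) if ok]
print(candidates)  # [(1, 0), (1, 1), (2, 2)]
```

Each extra criterion (soils, views, shade trees) would simply be one more layer and one more test inside `match_criteria`.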


A GIS is a good way to put "pure" research to work

    • ease of extrapolating findings from research-scale plots, sampled areas, or experimental investigations
    • increased layperson participation in scenario testing
    • heightened comprehension using visualization
But a GIS is not.....
    • easy to use
    • just a drafting (CAD) program
      it must have analytical capabilities
    • able to make decisions
      that is the job of the user/interpreter... moreover, data is not information
    • free of field work ("ground truthing") and data collection
    • cheap in terms of human effort, computer resources, and data acquisition


Course Objectives

    • learn fundamental concepts of GIS and remote sensing including the electromagnetic spectrum, map projections, and nature of geospatial data
    • understand and apply simple to complex analyses of geospatial data
    • gain familiarity with computer architecture, file systems and programming
    • make maps and present findings to lay and scientific audiences
    • develop a dogged persistence in the solution of complex computer algorithms by overcoming the errors and pitfalls inherent in the process

This is NOT a course in ArcGIS or ERMapper, although you will at times think it is. They are just the tools with which you learn GIS and Remote Sensing analyses. However, along the way you will

    • learn how to find and eliminate errors and difficulties with files, folders, and programming (Theobald calls it "one part logic and one part intuition")
    • & make web pages
    • & use the web to find answers to questions about spatial data processing


GIS Components

  • Hardware (other than the computer)
    • data storage (lots of it)
    • media reader (now CDs and DVDs, formerly 9-track tapes for satellite data)
    • digitizer (but increasingly, people get their data from large-scale scanners, which we now have in the library)
    • scanners (plus raster to vector software), which can be regular size or large-format
      ...and the software that turns the colors into lines (roads) and/or filled polygons (geologic map units)
    • large-format printers (one in Geology, one in the Library)
    • Global Positioning System (GPS) receivers

    • (sometimes with a laptop or palm pilot running a GIS for instantaneous mapping where YOU ARE the digitizer)
  • Software
    • the GIS (which may be a series of components to do various analyses and manipulations)
    • Remote Sensing (to rectify (warp, georeference) satellite and air photograph data and to analyze it for display and/or classification)
    • AddOn: modeling software (Mike McGlue '99 used an EPA program that ran on top of ArcView called BASINS to model water quality in Rockbridge County)
    • ArcIMS: map "servers" which connect a web user to the GIS for "on-the-fly" map creation and display. (Geoffrey Marshall, '02 senior thesis in computer science)


The three types of GIS Data (spatial, attribute, meta)

  1. Spatial data
    1. vector data
      1. Point Data -- layers of points (or "events"), each described by x,y coordinates (lat,long; east, north)
      2. Line/Polyline Data -- layers that are described by x,y points (nodes, events) and lines (arcs) between points (line segments and polylines)
      3. Polygon Data -- layers of closed line segments enclosing areas that are described by attributes
        Polygon data can be "multipart" like the islands of the state of Hawaii.
    2. Raster data (grids of numbers describing, e.g., elevation, population, herbicide use, etc.)
    3. Images or pictures such as remote sensing data or scans of maps or other photos. This is a special "grid" where the number in each cell describes what color to paint or the spectral character of the image in that cell. (To be used, the "picture" must be placed on a coordinate system, i.e., "rectified" or "georeferenced".)
    4. TINs - Triangular Irregular Networks - used to discretize continuous data

    Demo in ArcGIS

    1. Start ArcGIS from the desktop or program list
    2. Copy the folder "Q:\courses\geolgis\sharedwork\Demo\intro" to your "Q:\students\your_username\GIS\demo" folder
    3. open from that copied folder the ArcMap file called 3_types_of_data.mxd
    4. examine the three types of spatial data
  2. Attribute Data are non-spatial characteristics that are connected by tables to points, lines, events on lines, and polygons (and in some cases GRID cells)
    • A point, vector or raster geologic map might describe a "rock unit" on a map with a single number, letter or name, but the associated attribute table might have
      • age
      • lithology
      • percent quartz
      • etc, for each rock type on the map.
    • most GIS programs can either plot the polygon by the identifier or by one of the attributes
      Image of

    • Image of

     The example above shows two ways to portray census data in Virginia: each county/city can be drawn with its own name and color, or a different field from the attribute table can be used to color the map.

    Try it yourself in ArcGIS

    1. close maps
    2. Open the ArcMap file Q:\students\username\GIS\Demo\intro\map_attributes.mxd
    3. right click on the name of the layer in the table of contents "Va_counties.shp" and choose open attribute table to see the range of data types.
    4. change the way the data are plotted using the symbology tab of the layer properties box.
  3. Metadata
    • metadata are the most forgotten type
    • ArcView is very poor at it (writes some stuff to a log file, but that's it)
    • absolutely necessary if you're going to use data, or if someone is going to use your data later (or your information)
    • contains information about
      1. scale
      2. accuracy
      3. projection/datum
      4. data source
      5. manipulations
      6. how to acquire data

You will be keeping metadata in ArcGIS using ArcCatalog's metadata feature.
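
To see how the three types of data fit together, here is a minimal sketch in plain Python. It is not how ArcGIS stores anything internally; the `Feature` class, the field names, and every value (rock unit, scale, projection, and so on) are hypothetical and chosen only to mirror the lists above.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One vector feature: geometry plus an id that links to the attribute table."""
    fid: int
    kind: str    # "point", "polyline", or "polygon"
    parts: list  # list of parts; each part is a list of (x, y) tuples


# 1. Spatial data: a point, a polyline, and a multipart polygon (two islands).
features = [
    Feature(1, "point", [[(3.0, 4.0)]]),
    Feature(2, "polyline", [[(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]]),
    Feature(3, "polygon", [
        [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],  # first island
        [(5, 5), (6, 5), (6, 6), (5, 6), (5, 5)],  # second island
    ]),
]

# 2. Attribute data: non-spatial values, joined to a feature by its id.
attributes = {
    3: {"unit": "Xg", "age": "Paleozoic", "lithology": "granite",
        "percent_quartz": 30},
}

# 3. Metadata: information about the dataset as a whole (invented values).
metadata = {
    "scale": "1:24,000",
    "accuracy": "+/- 12 m",
    "projection_datum": "UTM zone 17N, NAD83",
    "data_source": "hypothetical field mapping",
    "manipulations": ["digitized", "reprojected"],
    "how_to_acquire": "contact the data owner",
}


def ring_area(ring):
    """Shoelace area of one closed ring of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(feature):
    """Sum the areas of all parts (assumes separate rings with no holes)."""
    return sum(ring_area(part) for part in feature.parts)


hawaii_like = features[2]
print(polygon_area(hawaii_like))                 # 5.0
print(attributes[hawaii_like.fid]["lithology"])  # granite
```

The geometry answers "where", the attribute table answers "what", and the metadata answers "how good is it and where did it come from".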


Spatial Data

The attributes of spatial data are of several types; some are suited to raster representation and some to vector (after Theobald, 2007).

  • Qualitative or nominal data - discrete (1=basalt, 2=granite, etc for a geological map)
  • Ordinal or rank data - discrete (low, medium, high; implies a quantity but is in "bins" or discrete categories)
  • interval - continuous (example from Theobald: temperature)
  • ratio - continuous (hillslope angle, which could be measured/calculated to any precision and reported in floating point values or integer values)
  • cyclic - continuous (with a break at one or more points, like compass direction or the "aspect" of a hillslope)
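
Cyclic data need special care because of that break. A minimal sketch in plain Python (the function names and the two aspect values are invented for illustration): averaging two north-facing aspects as ordinary numbers gives a south-facing answer, whereas averaging them as unit vectors gives north.

```python
import math


def naive_mean(angles_deg):
    """Ordinary arithmetic mean, which ignores the break at 0/360 degrees."""
    return sum(angles_deg) / len(angles_deg)


def circular_mean(angles_deg):
    """Mean direction in degrees (0 to 360), computed from unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


aspects = [350.0, 10.0]  # two slopes that both face roughly north

print(naive_mean(aspects))     # 180.0, i.e. due south, which is wrong
print(circular_mean(aspects))  # very close to 0 (or 360), i.e. due north
```

The circular mean is undefined when the vectors cancel exactly (for example, aspects of 0 and 180), so real code would need to check for that case.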

Choosing the format for continuous vs discrete data types

Vector storage is better for discrete data and raster storage is better for continuous data.





  • Vector
    • acquired by digitizing or by scanning with vectorization
    • less data storage volume
    • greater boundary precision
    • complex analyses are easier
    • "overlays" rapidly increase complexity and data storage needs
  • Raster
    • substantially easier manipulation
    • acquired by scanner or remote sensing
    • higher data storage requirements (8-32 bits per cell × rows × columns), but compression (run length encoding, quad trees) helps
    • decreased boundary precision
    • no increase in complexity with son and daughter maps
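
To get a feel for raster storage and for why run length encoding helps, here is a minimal sketch in plain Python. The function names are made up for this example, the cell depth is passed in as an assumed number of bits, and real raster formats add headers and use more elaborate compression.

```python
def raster_size_bytes(rows, cols, bits_per_cell):
    """Uncompressed size of a grid in bytes, ignoring any file header."""
    return rows * cols * bits_per_cell // 8


def run_length_encode(row):
    """Collapse a row of cell values into [value, run_length] pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same value as before: extend the run
        else:
            runs.append([value, 1])   # new value: start a new run
    return runs


def run_length_decode(runs):
    """Expand [value, run_length] pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row


# A 5000 x 5000 grid of 32-bit cells is about 100 MB before compression.
print(raster_size_bytes(5000, 5000, 32))  # 100000000

# A row from a categorical grid (e.g., rock type codes) compresses well.
row = [1, 1, 1, 1, 2, 2, 7, 7, 7, 7]
encoded = run_length_encode(row)
print(encoded)                            # [[1, 4], [2, 2], [7, 4]]
print(run_length_decode(encoded) == row)  # True
```

Run length encoding pays off for discrete data with long runs of identical values; for continuous data such as elevation, where neighboring cells rarely match exactly, it saves little.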


 Demo in ArcGIS

1. Open the ArcMap document grid_v_poly.mxd

2. Change the transparency of the top layer, or swipe it away

3. Here the grid cell size is 200 m


Topology of vector datasets

  • topology describes the spatial relationships of points, nodes, and lines
  • areas usually won't "close" because of digitizing errors

  • Image of

  • digitizing errors can be corrected by having a "snapping tolerance" that causes points you want to become nodes to automatically align to one another (to the required seven decimal places!), while still allowing polygon or polyline traverses or chains.
  • In the diagrams below 1 and 2 have the same topology but different geometry

  • and 1 and 3 have the same geometry but different topology

    Image of
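
A snapping tolerance can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the algorithm any particular GIS uses; the function name, the coordinates, and the tolerance of 0.1 map units are invented.

```python
import math


def snap_points(points, tolerance):
    """Snap each point to the first earlier node within `tolerance`.

    Returns (nodes, index): the list of distinct node coordinates, and for
    each input point the position of its node in that list.
    """
    nodes = []
    index = []
    for x, y in points:
        for i, (nx, ny) in enumerate(nodes):
            if math.hypot(x - nx, y - ny) <= tolerance:
                index.append(i)  # close enough: reuse the existing node
                break
        else:
            nodes.append((x, y))  # nothing nearby: this point is a new node
            index.append(len(nodes) - 1)
    return nodes, index


# A digitized ring whose last vertex misses the first one by a few hundredths.
ring = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.03, -0.02)]

nodes, index = snap_points(ring, tolerance=0.1)
closed = index[0] == index[-1]
print(index, closed)  # [0, 1, 2, 3, 0] True
```

In this sketch `index` holds the topology (which node connects to which) and `nodes` holds the geometry (where each node is). Moving a node's coordinates changes the geometry but leaves `index`, and therefore the topology, untouched.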


Analysis Algorithms
You're going to use a visual algorithm to tell me about the analyses that you're contemplating for your work. Here's the basic layout:

Image of

Make an algorithm for frog habitat that has the following elements:

  • within 50 m of a river
  • on acid soils
  • that are flooded regularly
  • and that are contiguous with a pine forest (for egg laying).

Your data are Elevation, Soils, & Forest Type
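
Each box in such a diagram eventually becomes an operation on a data layer. As a hint, here is one possible reading of the frog problem as a minimal plain-Python sketch. Everything in it is an assumption made for illustration: the 50 m cell size, the idea that river and flood-prone cells can be picked out by elevation thresholds, the use of soil pH for "acid", and all of the grid values. With 50 m cells, "within 50 m" is approximated as "within one cell".

```python
RIVER_STAGE = 100  # assumed: cells at or below this elevation are river
FLOOD_STAGE = 102  # assumed: cells at or below this elevation flood regularly
ACID_PH = 5.5      # assumed threshold for "acid" soils

# Hypothetical 3 x 4 layers on a shared grid of 50 m cells.
elevation = [[100, 101, 103, 106],
             [100, 101, 102, 105],
             [100, 102, 104, 107]]
soil_ph = [[6.5, 5.0, 5.0, 5.0],
           [6.5, 5.2, 5.1, 6.0],
           [6.5, 6.8, 5.0, 5.0]]
forest = [["none", "oak", "pine", "pine"],
          ["none", "oak", "pine", "pine"],
          ["none", "oak", "oak", "oak"]]


def where(grid, test):
    """Reclassify a grid into True/False using a test on each cell value."""
    return [[test(value) for value in row] for row in grid]


def grow(mask):
    """Buffer by one cell: True for every True cell and its 8 neighbours."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for rr in range(max(0, r - 1), min(rows, r + 2)):
                    for cc in range(max(0, c - 1), min(cols, c + 2)):
                        out[rr][cc] = True
    return out


def overlay_and(*masks):
    """Cell-by-cell AND of any number of masks."""
    return [[all(cells) for cells in zip(*rows)] for rows in zip(*masks)]


near_river = grow(where(elevation, lambda z: z <= RIVER_STAGE))
flooded = where(elevation, lambda z: z <= FLOOD_STAGE)
acid = where(soil_ph, lambda ph: ph < ACID_PH)
near_pine = grow(where(forest, lambda f: f == "pine"))

habitat = overlay_and(near_river, flooded, acid, near_pine)
habitat_cells = [(r, c)
                 for r, row in enumerate(habitat)
                 for c, ok in enumerate(row) if ok]
print(habitat_cells)  # [(0, 1), (1, 1)]
```

The pattern is reclassify, buffer, then overlay. Your own diagram may derive the river or the flooded area differently, and that choice is exactly what the visual algorithm is meant to make explicit.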

what'd you get?  Here's my attempt


Source: GIS Development