Geocoding and PROC GMAP - Tools for Presenting Spatial Data

advertisement
NESUG 2007
Hands-On Workshops
Geocoding and PROC GMAP - Tools for Presenting Spatial Data
Michael Eberhart, MPH, Department of Public Health, Philadelphia, PA
ABSTRACT
This paper describes how to use SAS/GIS® and PROC GMAP to create spatial databases, geocode address-level
data and generate maps to display geographic trends in disease data. Topics discussed include the basic concepts of
U.S. Census Bureau TIGER/Line® files, importing TIGER/Line® data to create spatial datasets using the SAS/GIS
TIGER Import function, and storing spatial data in SAS®. The basic steps of address data preparation and batch
geocoding, creating traditional SAS datasets for map creation, using PROC GMAP to display map datasets in a
variety of formats, and plotting geocoded cases using the Annotate Facility are also discussed. Additional topics
include coordinate systems and map projections, projecting annotate data, choropleth maps, SAS/GRAPH output
(ACTIVEX, ODS), creating map output files in HTML drill down for the web, loop processing with NULL datasets, and
creating customized map legends.
GETTING STARTED
SAS® provides a geographic information system (GIS) component that can be used to organize and analyze spatial
data. Spatial data refers to anything that can be referenced based on a physical location, such as census tracts, zip
codes, and street addresses. Geocoding is the process of adding spatial information to existing data based on a
street address. Geocoding attempts to match an address in a SAS dataset with spatial information in a SAS spatial
database. If a match is found, the coordinates for the address location (x,y) are added to the observation. Additional
information about the address location (census tract) can also be added to the address dataset. Before you can
begin geocoding with SAS®, you must first create a spatial database. Using the SAS/GIS Spatial Data Importing
tool, you can use U.S. Census TIGER/Line® files to create a map database. First, you must find and download the
TIGER/Line® files for the county, or counties, of interest.
TIGER/LINE DATA
The term TIGER® refers to the Topologically Integrated Geographic Encoding and Referencing system used by the
U.S. Census Bureau. The U.S. Census website contains TIGER/Line® files for all counties in the United States.
Each TIGER/Line file set contains a series of data files that contain spatial information for geographic features such
as roads, rail lines and rivers, as well as boundary lines for census tracts, census blocks and counties. The data
include digital information such as location in latitude and longitude, the names and types of features, address ranges
(from-to, left-right), and relationships between features (e.g. where rails cross streets, or census blocks are contained
within census tracts). To access these files, go to the web address below:
http://www.census.gov/geo/www/tiger/
Follow the links for the 2006 TIGER/Line data, click on the hyperlink for the desired state, and click on the FIPS
number file to download the data. The FIPS number is 5 digits, and represents the state code (2 digits) followed by
the county code (3 digits). You must know the county number in order to select the correct download link. The
example described here uses Philadelphia County in Pennsylvania, and the FIPS code is 42101. Unzip the
downloaded file and save the TIGER files in a folder. The following files will be included in the folder:
TGR42101.RT1
TGR42101.RT2
TGR42101.RT4
TGR42101.RT5
TGR42101.RT6
TGR42101.RT7
TGR42101.RT8
TGR42101.RTA
TGR42101.RTC
TGR42101.RTE
TGR42101.RTH
TGR42101.RTI
TGR42101.RTM
TGR42101.RTP
TGR42101.RTR
TGR42101.RTS
TGR42101.RTT
TGR42101.RTZ
TGR42101.MET
Some or all of these files may be included for other counties depending on its geographic features. The content of
these files is beyond the scope of this paper; however it should be noted that the Record Type 1 file (.RT1) is the
reference file that SAS uses when importing TIGER data, and the Metadata file (.MET) contains information on data
1
NESUG 2007
Hands-On Workshops
sources and a description of the data included in building the files. For complete information regarding the contents
of TIGER files see http://www.census.gov/geo/www/tiger/tiger2006se/TGR06SE.pdf.
CREATING A SPATIAL DATABASE
SAS stores spatial data in multiple SAS datasets which all work together in one logical file. There are three main
types of data sets in a spatial database – chains, nodes and details. Chains are basically line segments. A chain has
a node on each end. Chains are connected or linked by nodes. Detail data denotes the curvature of a chain. The
chains, nodes and details form all of the features that are contained in the spatial database. For the purposes of this
paper, we are only concerned with the chains, nodes and details that form segments of streets that contain
addresses. To create a spatial database for geocoding addresses to a spatial location, import TIGER files using
SAS/GIS.
In the Editor window, assign a LIBNAME for the map and spatial data files.
/*Set libnames for map data*/
libname mapdata 'c:\nesug07\maps';
From the Solutions menu, select Analysis and then select Geographic Information System. This will open a new
SAS/GIS window. From here you can either open an existing map or create a new map using the GIS Spatial Data
Importing window. To import your TIGER files, select IMPORT from the FILE menu.
The default IMPORT TYPE is TIGER.
Use the navigation button to point to the TIGER basic data record file name – the file with .RT1 extension.
In the Map Entries section of the window, select the library name, create a catalog name and map name.
The default action is CREATE.
o
The action options are CREATE, REPLACE and UPDATE. Use create to generate a new map file,
replace to recreate an existing map file, and update to add other contiguous counties to an existing
map.
2
NESUG 2007
Hands-On Workshops
In the Spatial Data Sets section of the window, select the library name and create a name for the spatial
data files.
o The action options are CREATE, REPLACE and APPEND. Use create to generate a new spatial
data file, replace to recreate an existing spatial data file, and append to add other contiguous
counties to an existing file.
Click IMPORT to import the map data.
When importing is complete, click CLOSE to open the new map.
The map will include several layers. Layers can be turned “on” or “off” by checking the tick box next to the
layer name.
GEOCODING ADDRESS DATA
Assigning a spatial location based on a standardized street address can be accomplished either through the
GEOCODE option in the SAS/GIS tools menu or by using the batch geocoder. For either method, a prepared SAS
address dataset is required. The dataset must include one address variable including the address number, direction,
name and type (e.g. 500 S. Broad St). The dataset must also include a city variable and a state variable. A zip code
variable is desired, but not required.
Import the dataset containing the addresses to be geocoded. Create a variable that contains all of the elements of
the street address – in this example, four string variables are concatenated and compressed using the COMPBL
functions to remove extra spaces. The variables city and state are created.
/*Get case data for geocoding*/
PROC IMPORT OUT= WORK.geocode
DATAFILE= "c:\nesug07\data\diseaseA.dbf"
DBMS=DBF REPLACE;
GETDELETED=NO;
RUN;
3
NESUG 2007
Hands-On Workshops
/*Create address variables for geocoding*/
data geocode; set geocode;
address=compbl(strtno||strtdir||' '||strtname||strtave);
city='Philadelphia';
state='Pennsylvania';
run;
Manual Geocode
For geocoding from the SAS/GIS interface, select the GEOCODE option from the TOOLS menu. A new window will
open indicating that the active map is the spatial map entry, and the zip code data are in SASHELP.ZIPCODE. Set
the WORK library to WORK and tick the recreate data option.
In the Address Data Set box, use the pointer to locate the input SAS address dataset. SAS will automatically identify
what it thinks are the appropriate variables for ADDRESS, CITY, STATE, ZIP and PLUS4. If alternate variables are
desired, use the down arrow to select variables from a list. If you want to include additional values, highlight one or
more in the Values To Add box. When parameters are set, click GEOCODE. The address dataset will be recreated
with the spatial location (in latitude/longitude) of the data.
Batch Geocode
Batch geocoding is initiated using the %GCBATCH macro from the Editor window. The macro launches the
geocoding facility and supplies a series of values that determine what gets geocoded, how addresses are matched,
what spatial data is used for matching, and how the output file is generated. While processing the address data, the
geocoding facility writes a message to the log indicating its progress. As addresses are matched, the coordinates of
the address location are added to the address data set.
Additional values contained in the spatial data, such as census tract or census block, can also be added to the
address data set for matched cases. If an address cannot be matched to the spatial data but the address includes a
zipcode, the X and Y coordinates of the zipcode centroid are added instead of the exact coordinates of the address.
4
NESUG 2007
Hands-On Workshops
To enhance matching, the geocoding process converts address components to uppercase and attempts to
standardize address components such as street direction and street type values. These standardized versions of the
address are also added to the address data set. The variables M_ADDR, M_CITY, M_STATE, M_ZIP, and M_ZIP4
refer to the values that were actually matched during the geocoding process.
The following values are accepted by the %GCBATCH macro:
the name of the dataset to geocode (geod=)
the names of the address variables (av=, cv=, sv=, zv=)
the map name (mname=)
the names of any additional values to add to the address data set (pv=)
command to recreate the dataset (newdata=yes)
The code below will launch the geocoding facility using MAPDATA.TIGER.PHILA as the spatial reference map. It will
geocode addresses in WORK.GEOCODE using ADDRESS as the street address variable (av), CITY as the city
variable (cv), STATE as the state variable (sv), and ZIP1 as the zipcode variable (zv). It will recreate (newdata=yes)
the address dataset and include TRACT as an additional variable (pv).
/*Batch geocode addresses*/
%gcbatch(geod=geocode,
/* Name of
av=address,
cv=city,
sv=state,
zv=zip1,
/*
pv=tract,
newdata=yes,
mname=mapdata.tiger.phila);
dm 'af c=sashelp.gis.geocodeb.scl';
the addresses dataset*/
/* street address variable*/
/* city variable */
/* state variable*/
zipcode variable*/
/*variable added to address data*/
/*recreate dataset*/
/* Name of the map entry */
/* Invokes the batch geocoder.*/
EXPORT MAP
In order to display maps created in SAS/GIS using SAS/GRAPH, a map file must be exported into a traditional SAS
dataset that can be referenced by PROC GMAP. To do this, SAS provides an experimental sample program called
the GIS Exporter. The program and documentation can be downloaded from:
http://support.sas.com/rnd/datavisualization/mapsonline/html/tools.html
Select the GIS2SAS Exporter – 13jul04 Update (Request Download) link and follow the steps to download. This
download requires a username/password login. As stated in the program code, this is an experimental set of routines
for use only with Version 8 or above. The files are provided by SAS “as is” without warranty or support of any kind.
Once the files are downloaded, the GISEXPORT.CPT file must be taken out of transport format using PROC
CIMPORT.
libname mapdata 'c:\nesug07\maps';
LIBNAME EXPORT 'C:\NESUG07\MAPS';
FILENAME CPTFILE 'C:\nesug07\gisexport_13jul04\gisexport_13jul04.cpt';
PROC CIMPORT FILE=CPTFILE LIB=EXPORT;
RUN;
Before invoking the GIS exporter, a series of macro variables must be assigned to define parameters of the export.
The macro variables identify the location of the map data (library, catalogue and member name), the location of the
spatial data (library and file name), the target location of the exported map (library and file name), the target location
of the annotate dataset created by the GIS exporter (library and file name), the location of the graphics output, and
the name of the map layer to be exported.
5
NESUG 2007
Hands-On Workshops
/* define the parameters for the export */
%let MAPLIB=mapdata; /*libref of SAS/GIS map*/
%let MAPCAT=tiger;
/*catalog of SAS/GIS map*/
%let MAPNAME=phila;
/*name of SAS/GIS map*/
%let SPALIB=mapdata; /*libref of spatial data*/
%let SPANAME=phila;
/*name of spatial data*/
%let GRALIB=mapdata; /*libref of output map file*/
%let GRANAME=ptract; /*name of output map file*/
%let ANNLIB=mapdata; /*libref of output annotate dataset*/
%let ANNNAME=annotate;
/*name of output annotate dataset*/
%let GRAFLIB=SASUSER; /*libref of graphics output*/
%let GRAFOUT=philamap;
/*name of graphics output*/
%let LAYER2=tract;
/*name of map layer to export*/
%let EXTRA=;
/*names of additional variables to maintain*/
/* Perform the export*/
dm 'af c=export.export.exportv81.scl; run;' graph1;
When the GIS Exporter completes, it generates a graph file using PROC GMAP. This graph uses the exported SAS
dataset (in this case a map of census tract polygons) and a very complex annotate dataset generated by the
exporter. The annotate dataset draws line for the other layers in the original map. Creating a graph file without the
annotate dataset will allow you to see the exported map of census tracts boundaries.
SAS/GRAPH – PROC GMAP
Traditional map datasets can be used to produce two-dimensional (choropleth) or three-dimensional (block, surface
or prism) maps using the GMAP procedure. This paper will focus on choropleth maps. Choropleth maps can be
used to highlight geographic variation between areas of the map using colors and/or patterns. Once you have
created a traditional map dataset using the GIS Exporter, you can create graphs of the map using PROC GMAP.
The code and output for a simple choropleth map of census tracts is found below. However, you will notice that the
map appears different than it did in SAS/GIS. The reason for the difference is that the map is not projected. The
map dataset created by the GIS exporter uses latitude and longitude for its spatial values. Latitude and longitude are
coordinates for a spherical surface. Map projections convert these values to present spatial data with less distortion.
6
NESUG 2007
Hands-On Workshops
/*Presenting maps with PROC GMAP*/
title1 'Unprojected Map';
proc gmap data=mapdata.ptract map=mapdata.ptract;
id tract;
choro tract / nolegend;
run;
quit;
MAP PROJECTION
Map projections determine how a round surface (earth) is projected, or displayed, on a flat surface. All map
projections contain some distortion, but using the right projection can help limit the amount of visual distortion. Map
projections are not critical when displaying maps that cover only a small area. However, maps that cover large areas
will have much more distortion, and may even appear backwards if not properly projected. Projection is also critical
for any map that crosses the equator or meridian. The GPROJECT procedure converts latitude and longitude into
Cartesian coordinates which allows you to project maps in such a way as to minimize this distortion. For most maps
of the US, the default projection should suffice. As a default, SAS uses the Albers projection to display maps on a flat
surface.
The GPROJECT procedure requires an input dataset (data=), an output dataset (out=) and an ID statement. The ID
statement refers to a variable in the dataset that identifies a unit area, and each unit area is evaluated separately.
The DEGREE option specifies that the latitude and longitude are in degrees (the default is radians), and the
EASTLONG option specifies that longitude values increase to the east.
The code below creates a projected output map dataset called PROJTRACT and creates a graph of the projected
map using PROC GMAP.
/*Project map using default - Albers*/
proc gproject data=mapdata.ptract
out=projtract degree eastlong;
id segment;
run;
7
NESUG 2007
Hands-On Workshops
/*display projected map*/
title 'Projected Map';
proc gmap data=projtract map=projtract;
id tract;
choro tract / nolegend;
run;
quit;
PLOTTING GEOCODED DATA ON THE MAP
For the first GMAP example, the choropleth map will be used as background, or spatial reference for the geocoded
cases. Since GMAP does not draw the data points, they are placed on the map using the annotate facility. The
annotate facility can be used to enhance graphics output in a variety of ways. An annotate dataset can be created to
add additional text, highlight graph elements, enhance legend features, or anything else that will help end-users
understand what the graph is meant to convey. The annotate dataset in this example will be used to add the
geocoded cases to the map using a color-coded symbol.
Annotating is as simple as creating a SAS dataset with a few commands. In this case, the point locators (x,y) for the
cases are already in the data, and several annotate variables are specified using the length statement: function,
color, style, position and text. The values for other variables - xsys, ysys, hsys and when – are set in the data step.
The value of ‘color’ is set based on the sex of the case. The annotate option (annotate=) of the PROC GMAP
statement identifies the annotate dataset.
Annotate Variables:
XSYS and YSYS indicate which coordinate system to use. In this example, coordinate system 2 is
used. Coordinate system 2 refers to the data area and the absolute value of x,y. For more information
regarding coordinate systems refer to the SAS help tool.
when indicates when annotate graphics are drawn – ‘a’ indicates that annotate graphics are drawn
after the procedure output and ‘b’ indicates that they are drawn before the procedure output.
8
NESUG 2007
Hands-On Workshops
position indicates where a text string will be drawn in relation to the x,y coordinates. A value of ‘E”
indicates the text should be centered one-half cell below the position of the x,y coordinate. The default
is 5 (centered).
style applies to the ‘label’ function and indicates the text font.
function indicates the action to be taken – options include MOVE, LABEL, BAR, and PIE (as well as
several others).
color and size indicate the color and size of the text (or the BAR, PIE, etc.). The SIZE is interpreted
based on the function – for TEXT, size represents the height of the text.
The annotate dataset is created from the geocoded output dataset. This dataset contains all of the address data, the
x and y coordinates from the geocoding process, as well as other data describing the cases. When annotating the
case locations, male cases will be indicated by red markers and female cases will be indicated by blue markers. The
annotate dataset will also be limited to cases with a _status_=’found’, so that only cases with a matched address are
plotted.
/*annotate dataset */
data anno; set geocode;
length function style color text $ 8 ;
xsys='2';
ysys='2';
hsys='3';
when='A';
function = 'label';
style='special';
text='L';
if sex='F' then color='blue';
if sex='M' then color='red';
size=5;
position='E';
if _status_='found';
segment=1;
run;
Because the annotate data must be projected using the same projection criteria as the map, the GPROJECT
procedure is run on the annotate dataset as well. The variable segment=1 is created so that all of the elements are
evaluated together, just as was done in the map projection.
proc gproject data=anno out=annop degree eastlong;
id segment;
run;
/*projected map with annotated cases plotted*/
title1 h=14pt f=swissb j=l "Geographic Distribution of Disease A Cases";
title2 h=14pt f=swissb j=l "Philadelphia, 2006";
pattern1 value=empty;
proc gmap data=projtract map=projtract anno=annop;
id tract;
choro segment /
coutline=black
nolegend;
run;
quit;
9
NESUG 2007
Hands-On Workshops
ADDING LEGEND AND NORTH ARROW
The above GMAP examples were executed using the NOLEGEND option. Legends, like any other graphics element,
take up space in the output area and affect the size of the map. Since the geocoded cases were placed on the map
as annotations, they would not be considered legend elements. The legend elements are levels of the CHORO
variable, in this example, segment. However, a legend that describes the annotated cases can be created using a
combination of a legend statement and some additional annotating. A north arrow is also a typical map element that
is not an option in GMAP, but annotation can handle that as well (if you have an image file).
/*annotate for legend*/
data legend;
length function style color text$ 8;
retain xsys '5' ysys '5' when 'A';
function='move'; x=70;y=18;output;
function='label';style='special';
size=2;position='5';color='red';text='L';output;
function='move'; x=75;y=18;output;
function='label';style='swissb';size=.8;position='5';color='black';
text='Male';output;
function='move'; x=70;y=14;output;
function='label';style='special';
size=2;position='5';color='blue';text='L';output;
function='move'; x=76;y=14;output;
function='label';style='swissb';size=.8;position='5';color='black';
text='Female';output;
function='move'; x=4;y=4;output;
function='image'; style='fit';imgpath='c:\nesug07\maps\n_arrow.bmp';
x=10;y=16;output;
run;
10
NESUG 2007
Hands-On Workshops
For this annotate dataset, the xsys and ysys coordinate system is set to ‘5’ which represents the procedure output
area. The annotate dataset begins by moving to a position in the procedure output area that is located inside the
empty legend box (this box will be created in the PROC GMAP). A red triangle is drawn to match the map symbol for
male cases, and the label “Male” is placed in black text. Annotate moves to another position inside the empty legend
box and a blue triangle is drawn and labeled “Female”. A north arrow image is placed using the annotate function
‘image’ with style=’fit’ and imgpath=’pathname’. The style=’fit’ options positions the image to ‘fit’ in the space allotted.
/*combine cases and legend*/
data anno2; set annop legend;run;
/*Map with titles and legend*/
title1 h=14pt f=swissb j=l "Geographic Distribution of Disease A Cases";
title2 h=14pt f=swissb j=l "Philadelphia, 2006";
pattern1 value=empty;
legend1 across=1
down=4
cborder=black
position=(bottom right inside)
mode=share
label=none
offset=(-2cm,)
value=(h=8pt f=swissb justify=left 'Census Tract');
proc gmap data=projtract
map=projtract anno=anno2;
id tract;
choro segment /
coutline=black
legend=legend1;
run;
quit;
LEGEND - The LEGEND option is included in the GMAP, however most of the legend elements will be added using
the annotate facility. Therefore, the legend is included in the GMAP leaving space to annotate legend values for the
geocoded cases.
Definition of Legend Features:
across=1 defines the number of columns across for the legend entries
down=4 defines the number of rows for the legend entries (these two options [across= and down=] allow for
great flexibility in how a legend is displayed)
position=(bottom right inside) places the legend below the graph, justified right, inside of the graph output area
cborder=black creates a border around the legend
mode=share places the legend in the procedure output area allows other graphic elements to share the same
space (other options include RESERVE and PROTECT)
offset=(-2cm,) places the legend 2cm to the left of the default position for bottom/right/inside (the offset
function allows you to place the legend anywhere you want)
value= sets the height, font and text of the legend value(s)
11
NESUG 2007
Hands-On Workshops
GMAP WITH DRILLDOWN
Another way to present spatial data is to use choropleth to color areas of the map based on a range of values, and
have drill-down capability to provide more specific data for the geographic unit. The next example uses geocoded
case data that has been assigned a census tract during the geocoding process. The case data is summarized by
census tract, and the summarized data is merged with census population data to calculate a disease rate. The rate
data is then merged with the map data to create a choropleth of the rates. Also, PROC PRINT is used to generate a
detail report for each census tract. A variable is created in the new map dataset that will be used to link the reports to
the map using the HTML option of GMAP.
The first few steps are to assign librefs for the input datasets and summarize the case data by census tract. PROC
SUMMARY creates an output dataset that contains one observation for each level of the class variable (tract). The
variable _freq_ contains the total number of cases for each census tract.
/*Library where geocoded case data resides*/
libname disa 'c:\nesug07\data';
/*Library where census population data resides*/
libname census "c:\nesug07\census";
/*Library where map data resides*/
libname mapdata 'c:\nesug07\maps';
/*Summarize geocoded cases by census tract*/
proc summary nway data=disa.geocases;
class tract; output out=a;run;
In order to calculate rates of disease, the summarized case data are merged with the census population data. Both
datasets are sorted by the merge variable (tract), and then merged in a data step creating the output file MER1. The
variable rate is created by dividing the frequency by the population and multiplying by 10,000. This calculates the
rate of disease A per 10,000 population for each census tract.
12
NESUG 2007
Hands-On Workshops
/*Merge census totals with census pop for rate calculation*/
proc sort data=census.ctpop2;by tract;run;
proc sort data=a;by tract;run;
data mer1; merge census.ctpop2 a;by tract;run;
data mer1; set mer1;
rate=(_freq_/pop)*10000;
if _freq_=. then _freq_=0 and rate=0;
if _freq_=0 then rate=0;
run;
After the census tract rates have been calculated, the rate data are merged with the census tract map dataset. Both
files are sorted by tract and merged in a data step.
proc sort data=mapdata.ptract; by tract;run;
proc sort data=mer1; by tract; run;
data philarate; merge mapdata.ptract mer1;
by tract;
run;
In order to create drill-down links, a variable must be created that ODS HTML will use as a reference. The link is
established using “HREF=” followed by a filename in a variable called ‘ctlink’. The value of the variable ‘centxt’, a text
version of the census tract number, is used to create different filenames for each census tract.
/* Assign values to the HTML variable. */
data philarate;set philarate;
ctlink=compress('HREF=Census_Tract_Number_'||centxt||'.html');
run;
Now that the link has been set in the map dataset, the filenames referenced by the links must be created. Drill-down
links are typically used to present more detailed information. This example uses PROC PRINT to generate a simple
report for each census tract that contains the tract number, the population, the number of cases and the rate. Since
381 different files will be generated, a macro is created that is executed using a data _null_ step. The macro
“createfiles” receives the value of “tract” from the data step execution. The value of “tract” is used to uniquely name
each file to match the value of “ctlink”, to identify the census tract number in the title statement, and to limit the PROC
PRINT in the WHERE statement.
%macro createfiles(tract);
filename outfile "c:\nesug07\output\Census_Tract_Number_&tract..html" mod;
ods listing close;
ods html close;
ods html file=outfile (no_top_matter) anchor='second';
goptions reset=all device=gif gunit=pt
ftext=swiss ftitle=swissb htitle=14 htext=12 ctext=black;
Title1 j=c "Disease A Cases and Rates for Census Tract &tract";
proc print data=mer1 label noobs;
var centxt pop _freq_ rate;
label centxt='Census Tract'
pop='Population (2000)'
_freq_='Number of Cases'
rate='Rate per 10,000 persons';
where centxt="&tract";
run;
quit;
ods html close;
ods listing;
%mend;
The _null_ data step will execute the macro for each unique value of “centxt” and create 381 output files.
13
NESUG 2007
Hands-On Workshops
data _null_; set mer1;
by centxt;
if first.centxt then do;
call execute('%createfiles('||centxt||')');
end;
run;
The final step is to create the map using ODS HTML and PROC GMAP. When creating a choropleth map with a
continuous variable like rate, SAS will automatically create groups of values for the color scheme. You can control
the number of groups using the LEVELS= option. In this example, the LEVELS= options is set to 4 and four pattern
statements are used to control the color scheme.
goptions reset=all;
filename odsout 'c:\nesug07\output';
ods html path=odsout
file='census tract map.html';
title1 h=14pt f=swissb j=l "Incidence of Disease A Cases in Philadelphia";
title2 h=14pt f=swissb j=l "by Census Tract";
pattern1 value=solid color=red;
pattern2 value=solid color=green;
pattern3 value=solid color=yellow;
pattern4 value=solid color=blue;
legend1 across=1
down=4
cborder=black
position=(bottom right inside)
mode=share
label=(f=swissb h=8pt position=top "Rate per 10,000")
offset=(-2cm,)
value=(h=8pt f=swissb justify=right);
proc gmap data=philarate
map=philarate
;
id tract;
choro rate /html=ctlink
levels=4
coutline=gray
legend=legend1
;
format _numeric_ comma6.2;
run;
quit;
ods html close;
ods listing;
The output map is created as an HTML file. When opened in a browser, the web page opens with ‘active content’.
As the cursor is moved across the map, links to the drill-down files are visible at the bottom of the page. Clicking on
any of the census tract areas opens a new web page with the detail report for that census tract.
14
NESUG 2007
Hands-On Workshops
GMAP WITH ACTIVEX
PROC GMAP can be used to produce output in any of the formats supported by device drivers in SAS. Some of the
devices available include ACTIVEX, BMP, JPEG, Portable Network Graphic (PNG), TIFF and others depending on
your version and environment. The maps produced in the preceding examples all used the default device driver
PSCOLOR. Changing the device driver name to ACTIVEX in the GOPTIONS statement will create the same output
map with tool tips for the values of census tract and the rate for each census tract. The ACTIVEX device also allows
the user to change the map background and style, and label geographic features.
Other drivers can be used to create high-resolution files for documents or presentation slides. Each driver has its
own set of defaults (text sizes, resolution, height and width), and minor adjustments are often necessary when
changing from one device to another. Also, some features are not supported by all device drivers.
/*ACTIVEX GMAP*/
goptions device=activex;
filename odsout 'c:\nesug07\output';
ods html path=odsout
file='census tract map activex.html';
title1 h=14pt f=swissb j=l "Incidence of Disease A Cases in Philadelphia";
title2 h=14pt f=swissb j=l "by Census Tract";
pattern1 value=solid color=red;
pattern2 value=solid color=green;
pattern3 value=solid color=yellow;
pattern4 value=solid color=blue;
15
NESUG 2007
Hands-On Workshops
legend1 across=1
down=4
cborder=black
position=(bottom right inside)
mode=share
label=(f=swissb h=8pt position=top "Rate per 10,000")
offset=(-2cm,)
value=(h=8pt f=swissb justify=right);
proc gmap data=philarate
map=philarate
;
id tract;
choro rate /html=ctlink
levels=4
coutline=gray
legend=legend1
;
format _numeric_ comma6.2;
run;
quit;
ods html close;
ods listing;
16
NESUG 2007
Hands-On Workshops
REFERENCES
Odem, E. and Massengill D. “Cheap Geocoding: SAS/GIS® and Free TIGER® Data”, available online at
http://support.sas.com/rnd/papers/sugi30/CheapGeocoding.pdf
SAS Mapsonline: http://support.sas.com/rnd/datavisualization/mapsonline/html
SAS Institute Inc., SAS 9.1.3 Help and Documentation, Cary, NC: SAS Institute Inc., 2000-2004.
US Census, TIGER®, TIGER/Line®, and TIGER®-Related Products, http://www.census.gov/geo/www/tiger/
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Michael Eberhart, MPH
Philadelphia Department of Public Health
1101 Market Street
8th Floor
Philadelphia, PA 19107
Work Phone: 215-685-4766
Email: michael.eberhart@phila.gov
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
17
Download