Statistical Analysis & Dissemination of Census Data

UNSD-CELADE Regional Workshop on Census Cartography for the 2010 Latin America’s census round
Statistical Analysis and Dissemination of Census Data
Outline
• The Power of Maps
    Overview & Example
• Dynamic Census Atlases
    Overview & Examples
• Spatial Analysis Techniques
    Introduction and Example
• Digital Geographic Data for Dissemination
    Overview & Cost and Benefits
• Digital Data Dissemination Strategies and Users
    Overview of Users
Anyone or anything can be associated with a known location in the world.
CHILE: HOUSING AND POPULATION CENSUS DISTRICTS 2002
Tsunami Affected Areas in Gizo, Solomon Islands
The power of maps
• Maps communicate a concept or an idea.
• Maps are often meant to support textual information.
• Maps appeal to the viewer’s curiosity.
• Maps summarize large amounts of information concisely.
• Maps can be used for description, exploration, confirmation, and tabulation.
• Maps encourage comparisons:
    Between different areas on the same map: where are population densities highest?
    Between different maps: is child mortality higher in the districts of province A than in province B?
    Between maps of different variables for the same area: where and by how much do literacy rates for males and females differ in the districts?
    Between maps for different time periods: did fertility rates decline since the last census?
Dynamic census atlases

An alternative to a static census atlas.

Publishing a digital map and database together with mapping software allows users to produce custom maps of census indicators.

A dynamic atlas normally includes digital boundary files at a lower resolution than the full census database, to allow fast drawing and low disk usage.

The closely integrated attribute table should contain only a selected number of census indicators.

Densities and ratios that are appropriate for mapping should already be calculated.
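As a minimal sketch of what "already calculated" could mean in practice, the snippet below derives a population density and a sex ratio for each unit before the attribute table is shipped with the atlas. The unit names, field names and values are hypothetical and only illustrate the idea.

```python
# Hypothetical attribute table for three census units (illustrative values only).
units = [
    {"name": "District A", "population": 12000, "area_km2": 40.0,
     "males": 6000, "females": 6000},
    {"name": "District B", "population": 4500, "area_km2": 90.0,
     "males": 2300, "females": 2200},
    {"name": "District C", "population": 30000, "area_km2": 15.0,
     "males": 14700, "females": 15300},
]


def add_indicators(unit):
    """Return a copy of the unit with mapping-ready indicators added."""
    out = dict(unit)
    # Persons per square kilometre: suitable for a choropleth map,
    # unlike the raw population count.
    out["density_per_km2"] = unit["population"] / unit["area_km2"]
    # Sex ratio expressed as males per 100 females.
    out["sex_ratio"] = 100.0 * unit["males"] / unit["females"]
    return out


atlas_table = [add_indicators(u) for u in units]

for row in atlas_table:
    print(row["name"], round(row["density_per_km2"], 1), round(row["sex_ratio"], 1))
```

Precomputing these fields means the atlas user never has to divide counts by areas or by other counts before mapping.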
Dynamic census atlases

The data provider should therefore provide an easy-to-use package
together with the boundaries and data.

The use of that package should require minimal training and
experience.

The application should be “plug-and-play”: after installation, the user
should immediately be able to produce maps.

Drill-down options for different geographic selections

Interactive area delineation options (e.g. select schools in a district)
A screenshot of Ukraine’s dynamic census atlas
Spatial Analysis Techniques
• The main use of spatial analysis is in census products and services.
• Techniques include buffering, linear interpolation, point pattern analysis, and cartograms.
• All offer functionality beyond standard thematic (choropleth) mapping, with many tools now available in both commercial and open-source software programs.
Spatial Analysis Techniques
• Some prevalent forms of spatial analysis especially useful with population data include:
    Queries
    Distance measurements
    Transformations
    Buffering
    Point-in-polygon analysis
    Polygon overlay analysis
Spatial Analysis Techniques

Queries:

Often this is the first step in an analysis: one creates a subset of units, such as populated places with certain characteristics, which lets the user check how typical an observation is against other observations.

Queries use a GIS program to answer simple questions posed by the user, with no changes in the database and no new data produced.

An example of a query using geocoded census data is: select all towns with a population greater than 1,000 persons. The attributes of these towns can then be summarized, for instance to compare their total fertility rates against those of smaller towns and villages, and the results mapped.

The term exploratory data analysis refers to investigations of patterns and trends in data using such techniques as querying.
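The town query described above can be sketched in a few lines. The place names, populations and fertility rates below are hypothetical, and the unweighted mean is just one possible way to summarize the selected places.

```python
# Hypothetical geocoded populated places (illustrative values only).
places = [
    {"name": "P1", "population": 2500, "tfr": 2.0},
    {"name": "P2", "population": 800, "tfr": 3.0},
    {"name": "P3", "population": 1500, "tfr": 2.4},
    {"name": "P4", "population": 300, "tfr": 3.4},
]

# The query only selects records; the source table is left unchanged.
large = [p for p in places if p["population"] > 1000]
small = [p for p in places if p["population"] <= 1000]


def mean_tfr(group):
    """Unweighted mean total fertility rate of a group of places."""
    return sum(p["tfr"] for p in group) / len(group)


print("Selected towns:", [p["name"] for p in large])
print("Mean TFR, towns over 1,000 persons:", round(mean_tfr(large), 2))
print("Mean TFR, smaller places:", round(mean_tfr(small), 2))
```

The two summary values could then be joined back to the places and mapped.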
Area delineation

E.g. interactive determination of school districts with the same number of children in each school grade by aggregating census dissemination areas
Spatial Analysis Techniques
• Distance measurements

Easily done with all GIS programs, using the centroids (or
center points) of cities, towns, and villages.

An analysis can be done to select villages located more
than a kilometer from a school, clinic, or water source.

These can then be further analyzed using the attribute
information for the populated places themselves.
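A minimal sketch of such a distance selection, assuming village centroids and school locations are available as latitude/longitude pairs. The coordinates are hypothetical, and the haversine great-circle formula is used here as one common way to measure distance on a sphere; a GIS package may use a different method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical centroids (latitude, longitude), for illustration only.
villages = {
    "V1": (-33.450, -70.660),
    "V2": (-33.470, -70.660),
    "V3": (-33.451, -70.661),
}
schools = [(-33.450, -70.665)]


def nearest_km(point, facilities):
    """Distance from a point to its nearest facility."""
    return min(haversine_km(point[0], point[1], f[0], f[1]) for f in facilities)


# Villages located more than one kilometre from any school.
far_villages = [name for name, pt in villages.items() if nearest_km(pt, schools) > 1.0]
print(far_villages)
```

The selected villages can then be examined further through their census attributes.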
Spatial Analysis Techniques
• Transformations

Methods of spatial analysis that use simple geometric, arithmetic or logical rules to create new datasets.

Transformations can include operations that convert raster into vector data, or a stream of GPS coordinates into a route or a boundary.

Of all the transformational techniques, buffering is the best known and most important.
Spatial Analysis Techniques
• Buffering (transformation)

Involves building a new data layer by identifying all areas
that are within a certain specified distance of the original features.

Buffering can be performed on points, lines and polygons
and can be weighted by attribute values.

Buffering can be used to model travel time, for instance, by
creating a “catchment area” around a particular feature
such as a school or a clinic.

This provides a measure of accessibility that can be
mapped across the extent of a country.
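As an illustration of a line buffer, the sketch below tests which settlements fall within a fixed distance of a river and sums their population. It assumes projected (planar) coordinates in metres; the river geometry, settlement names and populations are hypothetical. Rather than constructing the buffer polygon, it uses the equivalent test "distance to the line is at most the buffer width".

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(p, line, distance):
    """True if p lies within `distance` of any segment of the polyline."""
    return any(point_segment_distance(p, a, b) <= distance
               for a, b in zip(line, line[1:]))


# Hypothetical river (polyline) and settlements, coordinates in metres.
river = [(0, 0), (1000, 0), (1000, 1000)]
settlements = [
    {"name": "S1", "xy": (500, 200), "population": 400},
    {"name": "S2", "xy": (500, 800), "population": 900},
    {"name": "S3", "xy": (1200, 900), "population": 250},
]

exposed = [s for s in settlements if within_buffer(s["xy"], river, 300)]
exposed_population = sum(s["population"] for s in exposed)
print([s["name"] for s in exposed], exposed_population)
```

A GIS package would normally build the buffer as a polygon layer so that it can also be drawn and overlaid on other layers.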
“Is near to”: Buffer Operations
• Point buffer
• Affected area around a hospital
• Catchment area of a water source
Buffer Operations
• Line buffer
• How many people live near the polluted river?
• What is the area impacted by highway noise?
Buffer Operations
• Polygon buffer
• Area around a reservoir where development should not be permitted
Spatial Analysis Techniques
• Point-in-polygon analysis

Determines whether a point lies inside or outside a polygon.

Can be used to compare geocoded village centroids lying inside and outside hazardous areas such as tropical storm tracks or earthquake zones.

• Polygon overlay analysis

Involves comparison between the locations of two different polygonal data layers.

For example, the boundaries of two administrative districts could be compared to troubleshoot errors in the field enumeration process.
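The point-in-polygon test can be sketched with the classic ray-casting rule: a point is inside if a horizontal ray from it crosses the polygon boundary an odd number of times. The hazard zone and village centroids below are hypothetical planar coordinates, and points lying exactly on the boundary are not treated specially in this sketch.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order, without repeating
    the first vertex at the end.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical hazard zone (rectangle) and village centroids.
hazard_zone = [(0, 0), (10, 0), (10, 5), (0, 5)]
villages = {"V1": (2, 2), "V2": (12, 3), "V3": (9, 4.5)}

inside = [name for name, (x, y) in villages.items()
          if point_in_polygon(x, y, hazard_zone)]
outside = [name for name in villages if name not in inside]
print("inside:", inside, "outside:", outside)
```

The census attributes of the two groups of villages can then be compared.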
Spatial Analysis Techniques

Spatial interpolation

A spatial analysis method designed to fill in values that lie between
observations

A variety of methods including inverse-distance weighting and kriging
are used to estimate the values of unsampled sites

These methods are based on Tobler’s first law of geography: nearby objects are more similar than distant objects.

Kriging: an interpolation technique that uses a set of control points to obtain statistically unbiased estimates of a spatially varying quantity, such as surface elevation or yield measurements.

In kriging, the general properties of a surface are modeled to estimate the missing parts of the surface.
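Inverse-distance weighting, the simpler of the two methods named above, can be sketched as follows. The sample coordinates and values are hypothetical, and the exponent of 2 is a common default rather than a value taken from the slides.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    `samples` is a list of (x, y, value) observations. Closer samples
    receive larger weights, in line with Tobler's first law.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical observations (x, y, value), planar coordinates.
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0)]

print(round(idw(5, 0, samples), 2))
print(idw(0, 0, samples))
```

The estimate always lies between the smallest and largest sampled values, which is one reason kriging is preferred when the surface is expected to exceed the observed range.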
Example of linear interpolation creating contours
Spatial Analysis Techniques

Thiessen polygons

Have the unique property that each polygon contains only one input point (e.g. a settlement), and any location within a polygon is closer to its associated point than to the point of any other polygon.

This method assumes that the value at any unsampled location is equal to that of the nearest sampled point.

Thiessen polygons illustrated
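Because every location in a Thiessen polygon is closer to its own input point than to any other, polygon membership can be found without drawing the polygons: simply look up the nearest input point. The sketch below does this for hypothetical settlements and values in planar coordinates.

```python
import math

# Hypothetical input points (settlements) and their sampled values.
sites = {"A": (0, 0), "B": (10, 0), "C": (5, 8)}
site_values = {"A": 120, "B": 80, "C": 200}


def nearest_site(x, y, sites):
    """Name of the input point whose Thiessen polygon contains (x, y)."""
    return min(sites, key=lambda name: math.hypot(x - sites[name][0],
                                                  y - sites[name][1]))


def thiessen_estimate(x, y):
    """Value assigned to (x, y): that of the nearest sampled point."""
    return site_values[nearest_site(x, y, sites)]


print(nearest_site(1, 1, sites), thiessen_estimate(1, 1))
print(nearest_site(5, 6, sites), thiessen_estimate(5, 6))
```

The resulting surface is a step function: values change abruptly at polygon boundaries instead of varying smoothly.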
Areas of influence

Commuting distances: daily commuter flows
Spatial Analysis Techniques

Descriptive summaries are a spatial equivalent of descriptive statistics (such as mean and standard deviation) that represent the essence of a dataset in one or two numbers.

A center of population is the two-dimensional equivalent of a statistical mean. It is computed as the population-weighted average of the x and y coordinates of populated points.

Point pattern or cluster analysis examines the distribution of points in space irrespective of their actual locations to determine whether patterns are random, clustered, or dispersed.

Hot spots are places where high values are surrounded by high values; cold spots are places where low values are surrounded by low values. Both are particularly useful for identifying populations at risk.
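The center of population can be computed directly as a weighted mean. The sketch below assumes planar coordinates and hypothetical populations; for data in latitude and longitude over a large area, a projection or a spherical method would be needed instead.

```python
def center_of_population(points):
    """Population-weighted mean of coordinates.

    `points` is a list of (x, y, population) tuples in planar coordinates.
    """
    total = sum(pop for _, _, pop in points)
    cx = sum(x * pop for x, _, pop in points) / total
    cy = sum(y * pop for _, y, pop in points) / total
    return cx, cy


# Hypothetical populated points (x, y, population).
populated_points = [(0.0, 0.0, 100), (10.0, 0.0, 300), (0.0, 10.0, 100)]

center = center_of_population(populated_points)
print(center)
```

Here the center is pulled toward the most populous point, as a weighted mean should be.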
Spatial Analysis Techniques
• Cartograms

Sometimes used to display census results.

The areas of the original polygons are expanded or contracted based on their attribute values, such as population size or voting habits.
Modelling: smoothing

Evolution of the population between two censuses
Digital Geographic Data for Dissemination


Demand for digital databases extracted from the census agency’s
digital geographic master database will only increase.
Census data are an important input in policy planning and
academic analysis in many fields.

Health service provision, educational resource allocation,
design of utilities and infrastructure, and electoral planning are
some applications where government agencies require spatially
referenced small area population statistics.

Commercial users employ such data for marketing applications
and location decisions.
Digital Geographic Data for Dissemination
• Benefits and costs

Benefits: unsurpassed detail and precision; the potential use of census data in numerous applications, especially when overlaid on other geographic data such as terrain; and the relative ease of management and storage of thousands of units.

Costs: expense in processing and data management, possible data disclosure issues, and quality control. The costs of metadata production should be factored into the equation as well.
Digital Data Dissemination Strategies and Users
• The wide range of potential users of disaggregated census data means that the NSO needs to pursue a multi-leveled digital data dissemination strategy.
• Broadly, we can distinguish between the following types of users:
    Advanced GIS users
    Computer literate users
    Novice users
Digital Data Dissemination Strategies and Users

Advanced GIS users

Sometimes called data extractors or “power users”.

Work easily with large datasets and can use FTP to access them.

Require extensive metadata.

They will want access to spatial and attribute information in a
comprehensive digital geographic format.

The census office needs to supply comprehensive documentation
on the geographic parameters used for the geographic database as
well as on the individual census variables.

The spatial information should be distributed in an open geographic
format that can be easily converted into any number of
commercial GIS formats.
Digital Data Dissemination Strategies and Users

Computer literate users

Government, commercial or private sector users who want to be able to
browse the thematic information in a census database spatially.

Want to produce thematic maps and thus need to be able to perform simple
manipulation of cartographic parameters.

Simple analytical functions such as aggregation of census units to custom-designed regions should also be possible.

This group of users is best served with a comprehensive, pre-packaged
application that is designed for a commercial or freely available desktop
mapping package.

Documentation requirements are somewhat smaller, since the users are
unlikely to change the geographic parameters of the database or perform
more advanced GIS operations.
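The aggregation of census units to custom-designed regions mentioned above can be sketched as a simple grouped sum. The unit codes, region names, fields and values are hypothetical; the lookup table stands in for whatever region assignment the user draws or selects in the mapping package.

```python
from collections import defaultdict

# Hypothetical census units with count variables (illustrative values only).
census_units = [
    {"unit": "EA-01", "population": 500, "households": 120},
    {"unit": "EA-02", "population": 700, "households": 170},
    {"unit": "EA-03", "population": 300, "households": 80},
]

# Hypothetical user-defined assignment of units to custom regions.
region_of = {"EA-01": "North", "EA-02": "North", "EA-03": "South"}


def aggregate(units, lookup, fields):
    """Sum the given count fields of census units by custom region."""
    totals = defaultdict(lambda: dict.fromkeys(fields, 0))
    for u in units:
        region = lookup[u["unit"]]
        for f in fields:
            totals[region][f] += u[f]
    return dict(totals)


regions = aggregate(census_units, region_of, ["population", "households"])
print(regions)
```

Only counts should be summed this way. Ratios and densities for the new regions need to be recomputed from the aggregated counts rather than averaged across units.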
Digital Data Dissemination Strategies and Users

Novice users

Largely want to view pre-prepared maps on a computer and
perhaps perform some basic queries

Best data distribution strategy is often to produce a self-contained
digital census atlas

This atlas could consist of a series of static map images, for
example, in the form of a slide show

Or it could be a very simple mapping interface with pre-designed
map views that allow basic queries

Both static maps and a simple map interface can be made
accessible through the Internet
THANK YOU FOR YOUR ATTENTION