Research on Volunteered Geographic Information
Michael F. Goodchild
University of California, Santa Barbara
Geographic information
• Linking facts to locations within the
geographic domain
– geospatial
– analogous spaces
• Center for Spatial Studies
• Geographic information systems
– remote sensing
– GPS
– legacy of map data
– tracking
– volunteered geographic information
Geographic information science
• Fundamental issues raised by these
technologies
• Ontology and representation
– coding the geographic world
– what to leave out
• Uncertainty
– measuring the differences between databases
and reality
– problems of vagueness
– propagation
GIScience and social networks
• Social networks are constrained by
geography
– the need for physical proximity
– cultures are geographically defined
• Theories of spatial interaction
– physical proximity paramount
• Theories of social networking
– “the death of distance”
• Need for new theory
– SmallWorlds
User-generated content
• Trivial to georeference information
– geotagging
– map mashups
• Trivial to make maps
– online, open-source software
• Significant alternative to traditional sources
– things that were never mapped
– traditional sources unsustainable
popvssoda.com
www.flickr.com
www.wikimapia.org
The story so far
• The modern era
– authoritative production of geographic information
• official naming
• guarantees of accuracy (or inaccuracy)
– need for economies of scale
• cost of entry
• aerial photography, analytic stereoplotters
• advanced skills
• printing
– generic products
• multiple purposes
• long-lived, emphasizing static phenomena
The end of the modern era
• Growing demands
– geographic information to support Web services
– wayfinding
– public decision-making
– management
• Legislatures less willing to fund
– efforts to make the user pay
– constraints on the US federal government
• Meltdown in the costs of entry
• Software replacing the need for skills
– soft photogrammetry
– basic cartography
– “anyone can make a map”
Neogeography
• “In other words, the old geography involves a
prescribed role/interaction between the four main
components, namely the audience, the information,
the presenter and the subject, which are common to
most standard practises of learning. In
NeoGeography, there are however no such
boundaries on roles, ownership, and interactions of
these four components.” Rana and Joliveau, Journal
of Location-Based Services
• The citizen as both consumer and producer of
geographic information
A distant mirror
• The Waldseemüller map
– St Dié-des-Vosges, 1507
– a name that stuck
Research questions
• Who’s doing it?
• About what?
• Quality
Who’s doing it?
• Long-tail distributions
– Pareto scaling
• 3 Wikimapia leaders 140,000+ each
• IP addresses
• Inference from postings
Articles with geotags (Robinson projection)
# of articles per unit area (log scale, 0.1° resolution)
988,522 articles, 103,291 distinct locations
Wikipedia authorship
• Registered authors
– Only username required
– Name, email, etc. optional
– IP address kept hidden
• Anonymous authors
– IP address made public
– But nothing else
Contributions to “Copenhagen Opera House”
# of Contributions   Username or IP            Most Recent
18                   Dybdahl                   18-Sep-2005
6                    85.233.237.71 (anon)      12-Jan-2008
3                    Viva-Verdi                8-Sep-2006
1                    Hemmingsen                3-Jan-2007
4                    81.62.92.47 (anon)        15-Apr-2006
1                    Thue                      28-Feb-2006
2                    Ghent                     30-Apr-2006
3                    Valentinian               7-Jan-2007
3                    83.77.92.205 (anon)       10-Apr-2006
3                    130.226.234.229 (anon)    29-Sep-2007
2                    86.149.109.196 (anon)     15-Oct-2007
2                    Uppland                   24-Dec-2005
2                    87.48.100.222 (anon)      12-Jan-2006
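The long-tail pattern noted under "Who's doing it?" can be checked against the contribution counts listed for this one article. The following is a minimal sketch; the helper name `top_share` is hypothetical, and the counts are simply those shown above for the listed contributors, not a full revision history.

```python
# Contribution counts copied from the "Copenhagen Opera House" listing.
counts = [18, 6, 3, 1, 4, 1, 2, 3, 3, 3, 2, 2, 2]

def top_share(counts, k):
    """Fraction of all contributions made by the k most active contributors."""
    ranked = sorted(counts, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

for k in (1, 2, 5):
    print(f"top {k}: {top_share(counts, k):.0%} of {sum(counts)} contributions")
```

A single contributor accounting for a large share of the total, followed by many contributors with one to three edits each, is the shape a long-tail distribution would produce.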
University of California, Santa Barbara
135 anonymous authors with 719 revisions; signature distance = 533 km
64% of articles at 2,000 km or less
???
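The slide does not define "signature distance". As a rough sketch, it can be read as a contribution-weighted mean great-circle distance between an article's location and the locations inferred for its anonymous authors. That weighting is an assumption made here for illustration, and the function names and sample coordinates below are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def signature_distance_km(article, authors):
    """Contribution-weighted mean distance from the article to its authors.

    article: (lat, lon); authors: list of (lat, lon, n_contributions).
    The weighting scheme is an assumption, not the slide's definition.
    """
    total = sum(n for _, _, n in authors)
    weighted = sum(n * haversine_km(article[0], article[1], lat, lon)
                   for lat, lon, n in authors)
    return weighted / total

# Made-up locations, for illustration only.
article = (55.68, 12.60)
authors = [(55.70, 12.55, 6), (47.40, 8.50, 4), (51.50, -0.10, 2)]
print(round(signature_distance_km(article, authors)), "km")
```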
Cyberscape: Placemarks in post-Katrina New Orleans
Flooding Reports (via Scipionus) in New Orleans, Sept. 2005
Who was able to use, or interested in using, this new technology?
Which places were they interested in?
Crutcher and Zook. 2009. GeoForum
What are they doing it about?
• <x,Z,z(x)>
• Framework data
– common themes that support wayfinding,
georeferencing
– Federal Geographic Data Committee
• geodetic control
• property ownership
• administrative boundaries
• Earth imagery
• topography
• hydrography
• transportation
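The tuple <x,Z,z(x)> at the top of this slide can be sketched as a small record type. The reading used here (x as a location, Z as a property, z(x) as the value of that property at the location) is an assumption, since the slide does not spell it out; the class name `GeoAtom` and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class GeoAtom:
    x: Tuple[float, float]  # location, here (lat, lon); assumed interpretation
    Z: str                  # the property being recorded
    z: Any                  # value of Z at x, i.e. z(x)

# Illustrative values only: a placename and an elevation at the same location.
facts = [
    GeoAtom(x=(34.41, -119.85), Z="place name", z="campus"),
    GeoAtom(x=(34.41, -119.85), Z="elevation_m", z=15.0),
]

for fact in facts:
    print(fact.x, fact.Z, fact.z)
```

Under this reading, each framework theme corresponds to a family of properties Z, and a volunteered contribution is one more fact linked to a location.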
www.openstreetmap.org
The gazetteer
• The “names layer”
– named features, points of interest
– the interface to geographic information
– Wikimapia
Beyond the framework
• Things that have never been mapped
– where your friends are
– cultural heritage
– graffiti, trash
• Time-critical information
– emergencies
Emergency management
• Recent fires in Santa Barbara
– Zaca Fire (July 07)
• burned for 2 months
• no houses lost
– Gap Fire (July 08)
• burned for 7 days
• no houses lost
– Tea Fire (November 08)
• burned for 2 days
• 230 houses lost
– Jesusita Fire (May 09)
• burned for 2 days
• 75 houses lost
Hits      Source
595673    Jesusita Fire (Ethan)
188308    SBC Jesusita Fire Santa Barbara, CA (Robert O'Connor - fire news blog)
89214     Jesusita Fire Map (Randy - Independent.com)
67525     Jesusita Fire in Santa Barbara - LA Times map (Los Angeles Times)
27777     Map of burned homes in Santa Barbara (Los Angeles Times)
26330     Jesusita Fire Evacuation Areas: Approximation (COSB)
25454     Santa Barbara 'Jesusita Fire' (ABC7 Eyewitness News)
19592     Jesusita Fire - Santa Barbara (lanewspace)
2446      Santa Barbara Damaged Homes 2008 (Los Angeles Times, note: mapped for comparison with Jesusita)
2048      Jesusita Fire (longhairedhippy)
1314      Santa Barbara Fire Evacuation (Gary)
962       Jesusita Fire in Santa Barbara (ABC30 Action News)
788       Wildfire ~ Santa Barbara (Buffalo)
505       Closure map - Jesusita Fire in Santa Barbara (Los Angeles Times)
461       Untitled (Matthew, note: discovered via google.com.mx)
396       Jesusita Fire Structure Damage (Paul Bartsch)
31
Lessons learned
• Authoritative information
– must be verified by officials
– too slow for the Tea and Jesusita Fires
• Asserted information
– carries risk of false positives
• false rumor of Tea Fire in Mission Canyon
• some unnecessary evacuations
– people are willing to accept false positives
– lack of authoritative information amounts to false negatives
– false negatives are far less acceptable than false positives
• there were some posted false negatives
LA Times May 8 2009
Emphasis on the easy stuff
• Placenames, streets, pictures
– georeferencing
– well-defined reference systems and objects
• Free production by citizens replacing
authoritative production
• Do other types of geographic information
require experts?
– a catalog of types
The FGDC framework layers
• Transportation
– basic network
• rapid updates
– citizens as probes
• real-time congestion
– air quality
• Hydrography
– water quality
• Elevation
– adequate authoritative sources
• Orthoimagery
– cost of entry
• Cadastral
– legal issues
• Administrative units
– legal issues
• Geodetic control
– expertise in geodesy
Thematic layers
• Weather and climate
– tradition of amateur observers
– GLOBE
• Biota
– Christmas Bird Count
– e-flora
– phenology
• Soils
– Natural Resource Conservation Service
– mapping for agricultural advice
The soil map
• An area-class map
– irregular areas denoting uniform soil type
– lengthy descriptions of types
– made by highly trained experts
– sample points
– interpolation from ground observation and aerial photography
– every point assigned to a single class
– expressed in a unique mapping c = f(x)
• What is the nature of the expertise?
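The mapping c = f(x) on this slide, where every point receives exactly one class, can be sketched as a point-in-polygon lookup. This is a minimal illustration, not how soil surveys are actually produced: the class names and square areas are hypothetical, and the ray-casting test is one common way to decide which area contains a point.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon [(x0, y0), ...]."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def soil_class(x, y, areas):
    """c = f(x): return the single class whose area contains the point."""
    for name, polygon in areas:
        if point_in_polygon(x, y, polygon):
            return name
    return None  # point falls outside the mapped region

# Hypothetical area-class map with two adjacent square areas.
areas = [
    ("loam", [(0, 0), (2, 0), (2, 2), (0, 2)]),
    ("clay", [(2, 0), (4, 0), (4, 2), (2, 2)]),
]

print(soil_class(1, 1, areas), soil_class(3, 1, areas))
```

The lookup itself is mechanical; the expertise the slide asks about lies in drawing the areas and defining the classes.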
Diagram: inputs linked to the application/use case
• Analysis of sample soils
• Aerial photography
• Historical records of crop performance
• Expert knowledge
• Covariates, e.g. elevation, climate, parent material
• Scale and accuracy issues
• Application knowledge
Data quality
• Traditional mapping guarantees bounds on
inaccuracy
– quality can be surprisingly poor
– legacy data
• OSM studies show VGI compares well
• Geographic context
• Crowdsourcing metrics
www.flickr.com
earth.google.com
nationalmap.gov
Authority and assertion
• Authority
– inaccuracies are guaranteed
– formal testing programs
– metadata
• Assertion
– inaccuracies are undocumented
– no metadata
– data about popular places tend to be more
accurate
– inaccuracies often less than legacy authoritative
data
Jesus and Allah
BLUE = (more Jesus than Allah); RED = (more Allah than Jesus).
Size of the bubble shows the magnitude of the difference
Crandall et al. 2009. Mapping the world’s photos.
http://www.cs.cornell.edu/~crandall/papers/mapping09www.pdf
Tracks inferred from Flickr postings
(http://www.cs.cornell.edu/~crandall/papers/mapping09www.pdf)
Future plans
• Conflation with traditional sources
– comparison of quality
– different emphases
• Methods of analysis and modeling for VGI
• VGI in remote regions
– digital divide