ERDC/CERL TR-08-DRAFT
Best-practice Methods for Open-source
Human Geography Data Compilation and
Integration
Azerbaijan and Turkey: Data Development Efforts
Construction Engineering
Research Laboratory
Marina V. Drigo and Lynndee A. Kemmet
Approved for public release; distribution is unlimited.
September 2013
Center Directed Research
ERDC/CERL TR-08-DRAFT
September 2013
Best-practice Methods for Open-source Human
Geography Data Compilation and Integration
95th Civil Affairs Brigade Data Development Efforts
Dr. Charles R. Ehlschlaeger and Mr. Jeffrey A. Burkhalter
Construction Engineering Research Laboratory
U.S. Army Engineer Research and Development Center
2902 Newmark Drive
Champaign, IL 61822
Ms. Marina V. Drigo
The PERTAN Group
44 East Main Street, Suite 403
Champaign, IL 61820
Ms. Lynndee A. Kemmet
Network Science Center at West Point
Thayer Hall Room 119
West Point, NY 10996
Final report
Approved for public release; distribution is unlimited.
Prepared for U.S. Army Corps of Engineers
Washington, DC 20314-1000
Under Work Unit D34502
Monitored by Construction Engineering Research Laboratory
U.S. Army Engineer Research and Development Center
2902 Newmark Drive, Champaign, IL 61822
Abstract: Development of human geography data for stability operations
around the world is one of the primary interests of Civil Affairs units.
There is a need for consistent and reliable tools and methods for compiling
and integrating open-source human geography data to assist Civil Affairs
teams in mission planning prior to and during deployment.
ERDC researchers are engaged with the 95th Civil Affairs Brigade to develop
best-practice methods for compiling and integrating human geography data
down to the neighborhood scale, using Azerbaijan and Turkey as case studies.
This paper describes open-source online and public tools suitable for data
collection and integration. Data collection and integration follow the
sixteen (16) data collection themes set by the National
Geospatial-Intelligence Agency's (NGA) Human Geography Working Group
(HGWG).
DISCLAIMER: The contents of this report are not to be used for advertising, publication, or promotional purposes.
Citation of trade names does not constitute an official endorsement or approval of the use of such commercial products.
All product names and trademarks cited are the property of their respective owners. The findings of this report are not to
be construed as an official Department of the Army position unless so designated by other authorized documents.
DESTROY THIS REPORT WHEN NO LONGER NEEDED. DO NOT RETURN IT TO THE ORIGINATOR.
Table of Contents
Preface
1 Introduction
2 Dataset Types and Sources
2.1 Overview of Datasets by Type
2.2 Overview of Dataset Sources
3 Data Search Methodology
3.1 Search for Baseline and Foundational Datasets
3.2 Search for Specialized Datasets
4 Data Integration from Disparate Sources
5 Global Sources for Human Geography Data
5.1 Collection of Four or More Human Geography Themes
5.2 Communications and Media
5.3 Demographic and Human Population Measures
5.4 Economy
5.5 Education
5.6 Ethnicity
5.7 Language
5.8 Land: Cultural Terrain
5.9 Land: Ownership
5.10 Land: Use and Cover
5.11 Medical and Health
5.12 Organizations
5.13 Religion
5.14 Significant Events
5.15 Social Groups
5.16 Transportation Use
5.17 Water Supply and Control
6 COCOM Sources for Human Geography Data Themes
6.1 AFRICOM Sources
6.1.1 Collection of Four or More Human Geography Themes
6.1.2 Communications and Media
6.1.3 Demographic and Human Population Measures
6.1.4 Economy
6.1.5 Education
6.1.6 Ethnicity
6.1.7 Language
6.1.8 Land: Cultural Terrain
6.1.9 Land: Ownership
6.1.10 Land: Use and Cover
6.1.11 Medical and Health
6.1.12 Organizations
6.1.13 Religion
6.1.14 Significant Events
6.1.15 Social Groups
6.1.16 Transportation Use
6.1.17 Water Supply and Control
6.2 CENTCOM Sources
6.2.1 Collection of Four or More Human Geography Themes
6.2.2 Communications and Media
6.2.3 Demographic and Human Population Measures
6.2.4 Economy
6.2.5 Education
6.2.6 Ethnicity
6.2.7 Language
6.2.8 Land: Cultural Terrain
6.2.9 Land: Ownership
6.2.10 Land: Use and Cover
6.2.11 Medical and Health
6.2.12 Organizations
6.2.13 Religion
6.2.14 Significant Events
6.2.15 Social Groups
6.2.16 Transportation Use
6.2.17 Water Supply and Control
6.3 EUCOM Sources
6.3.1 Collection of Four or More Human Geography Themes
6.3.2 Communications and Media
6.3.3 Demographic and Human Population Measures
6.3.4 Economy
6.3.5 Education
6.3.6 Ethnicity
6.3.7 Language
6.3.8 Land: Cultural Terrain
6.3.9 Land: Ownership
6.3.10 Land: Use and Cover
6.3.11 Medical and Health
6.3.12 Organizations
6.3.13 Religion
6.3.14 Significant Events
6.3.15 Social Groups
6.3.16 Transportation Use
6.3.17 Water Supply and Control
6.4 PACOM Sources
6.4.1 Collection of Four or More Human Geography Themes
6.4.2 Communications and Media
6.4.3 Demographic and Human Population Measures
6.4.4 Economy
6.4.5 Education
6.4.6 Ethnicity
6.4.7 Language
6.4.8 Land: Cultural Terrain
6.4.9 Land: Ownership
6.4.10 Land: Use and Cover
6.4.11 Medical and Health
6.4.12 Organizations
6.4.13 Religion
6.4.14 Significant Events
6.4.15 Social Groups
6.4.16 Transportation Use
6.4.17 Water Supply and Control
6.5 SOUTHCOM Sources
6.5.1 Collection of Four or More Human Geography Themes
6.5.2 Communications and Media
6.5.3 Demographic and Human Population Measures
6.5.4 Economy
6.5.5 Education
6.5.6 Ethnicity
6.5.7 Language
6.5.8 Land: Cultural Terrain
6.5.9 Land: Ownership
6.5.10 Land: Use and Cover
6.5.11 Medical and Health
6.5.12 Organizations
6.5.13 Religion
6.5.14 Significant Events
6.5.15 Social Groups
6.5.16 Transportation Use
6.5.17 Water Supply and Control
7 Azerbaijan Sources
7.1 Collection of Four or More Human Geography Themes
7.2 Communications and Media
7.3 Demographic and Human Population Measures
7.4 Economy
7.5 Education
7.6 Ethnicity
7.7 Language
7.8 Land: Cultural Terrain
7.9 Land: Ownership
7.10 Land: Use and Cover
7.11 Medical and Health
7.12 Organizations
7.13 Religion
7.14 Significant Events
7.15 Social Groups
7.16 Transportation Use
7.17 Water Supply and Control
8 Turkey Sources
8.1 Collection of Four or More Human Geography Themes
8.2 Communications and Media
8.3 Demographic and Human Population Measures
8.4 Economy
8.5 Education
8.6 Ethnicity
8.7 Language
8.8 Land: Cultural Terrain
8.9 Land: Ownership
8.10 Land: Use and Cover
8.11 Medical and Health
8.12 Organizations
8.13 Religion
8.14 Significant Events
8.15 Social Groups
8.16 Transportation Use
8.17 Water Supply and Control
9 Conclusions
Preface
This study was conducted for the Director, Engineer Research and
Development Center, under Project D34502, “Rapid Model Prototyping for
Infrastructure and Essential Services.” The technical monitor was [T.M.
Name].
The work was performed by the Land Heritage and Resource Conservation
Branch (CN-C) of the Environmental Division (CN), U.S. Army Engineer
Research and Development Center – Construction Engineering Research
Laboratory (ERDC-CERL). At the time of publication, Dr. Christopher M.
White was Chief, CEERD-CN-C; Dr. Michelle Hanson was Chief, CEERD-CN;
and Dr. Bert Davis was the Technical Director for Geospatial Research
and Engineering. The Deputy Director of ERDC-CERL was Dr. Kirankumar
Topudurti and the Director was Dr. Ilker Adiguzel.
The Commander and Executive Director of ERDC was COL Kevin J. Wilson,
and the Director was Dr. Jeffery P. Holland.
1 Introduction
Development of human geography data for stability operations around the
world is one of the primary interests of Civil Affairs units. There is a need
for consistent and reliable tools and methods for compiling and integrating
open-source human geography data to assist Civil Affairs teams in
mission planning prior to and during deployment.
ERDC researchers are engaged with the 95th Civil Affairs Brigade to develop
best-practice methods for compiling and integrating human geography data
down to the neighborhood scale, using Azerbaijan and Turkey as case studies.
This paper describes open-source online and public tools suitable for data
collection and integration. Data collection and integration follow the
sixteen (16) data collection themes set by the National
Geospatial-Intelligence Agency's (NGA) Human Geography Working Group
(HGWG).
This paper is organized as follows. Section 2 describes dataset types as
defined by NGA (baseline, foundational, and specialized) and briefly outlines
data sources. Section 3 describes the methodology for searching for and
collecting open-source data. Section 4 (forthcoming) outlines problems
associated with integrating data from disparate sources into a common
dataset and provides recommendations. Section 5 describes global data sources
(i.e., those covering most of the world) and classifies them according to the
16 Human Geography themes. Section 6 describes data sources covering
the areas of responsibility of each of the COCOMs (AFRICOM, CENTCOM,
EUCOM, PACOM, and SOUTHCOM) and also classifies them according to
the 16 Human Geography themes. Sections 7 and 8 follow the same approach
for Azerbaijan and Turkey, respectively.
Although every effort was made to organize all data sources according to
the 16 Human Geography themes, many data sources cover multiple
themes. To avoid unnecessary repetition, a special category, “Collection of
Four or More Human Geography Themes,” precedes the list of Human
Geography themes in Sections 5 through 8.
2 Dataset Types and Sources
In recent years, the growth of free online data sources has increased
public access to, and control of, economic, demographic, and other types of
spatial and tabular data. In developing countries, traditional gaps in
technical capacity, statistical sophistication, and public transparency have
been mitigated by increased investments in capacity, but large gaps in data
and analysis remain when compared to what is readily available in the
United States.
What is often lacking is readily available spatial data, including geocoded
surveys or surveys with highly resolved geographically identifying
information, particularly in less developed regions and states. In general,
data are more plentiful in main cities. Developing countries often lack
an internal spatial data infrastructure, that is, a framework of data,
metadata, users, and tools that interact to create and use spatial data in a
coordinated way. Even where digital capability exists, local
government agencies often consider high-resolution GIS data a security
risk and choose not to make the data available. Also significant for future
data gathering is the difference in state statistical capacity, which varies
by region and country.
The World-Wide Human Geography Data (WWHGD) Working Group,
https://wwhgd.org/, is currently organizing the description of the most
important data layers for Social Cultural Analysis (SCA). These SCA data
layers will be organized within a Human Geography Data Dictionary
(HGDD) and a Human Geography Entity Catalog (HGEC). The data layers
are organized by themes, of which 16 exist as of May 8, 2013. These
themes, which the WWHGDWG calls sub-models, are Communications and
Media, Demographic and Human Population Measures, Economy, Education,
Ethnicity, Social Groups, Organizations, Language, Land:
Use and Cover, Land: Cultural Terrain, Land: Ownership, Medical and
Health, Religion, Significant Events, Transportation Use, and Water Supply
and Control. The sub-models can be downloaded, after subscribing to
the WWHGDWG, at http://wwhgd.org/content/human-geography-standards-working-group-hgwg-sub-models.
2.1 Overview of Datasets by Type
Availability of high-quality spatial data is critical to developing human
geography datasets. NGA has identified three types of spatial data:
foundational, baseline, and specialized.
Foundational data layers contain information spread throughout the spatial
and temporal domain. Examples include population density maps, isopleth
weather maps, and similar layers that represent information across
geographic extents.
Baseline data layers locate information in geographic space. Examples of
useful baseline data include administrative boundaries, transportation
networks, names and locations of settlements, street networks, and
essentially anything else that can be pinpointed on a map and used as a
baseline.
Specialized data contain detailed mission-specific information, much of it
in tabular or narrative form. Because specialized data are often neither
geo-tagged nor temporally tagged, they require baseline data for
geo-referencing. Examples of specialized data include listings of hospital
equipment by hospital, school children enumerations by district or city,
type of crop grown by parcel, population censuses, and various surveys.
When specialized data are combined with baseline data, geo-referenced maps
can be created on topics such as population density, nutrition, childhood
mortality, poverty, and basic needs.
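The combination of specialized and baseline data described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used in this study: the district names, coordinates, and enrollment counts are invented placeholders, and `normalize` and `georeference` are hypothetical helper names. The sketch attaches coordinates from a baseline layer to tabular records that carry only a place name, and sets aside records that have no baseline match.

```python
# Hypothetical baseline layer: place name -> (latitude, longitude).
# All names and values are made-up placeholders for illustration.
BASELINE = {
    "district a": (40.10, 47.50),
    "district b": (40.60, 46.90),
}

# Hypothetical specialized table: no coordinates, only a place name.
SPECIALIZED = [
    {"district": "District A", "school_children": 1200},
    {"district": "district b ", "school_children": 800},
    {"district": "District C", "school_children": 300},
]


def normalize(name):
    """Lower-case a place name and collapse whitespace so variants match."""
    return " ".join(name.lower().split())


def georeference(records, baseline, key="district"):
    """Split records into those matched to baseline coordinates and the rest."""
    matched, unmatched = [], []
    for record in records:
        coords = baseline.get(normalize(record[key]))
        if coords is None:
            unmatched.append(record)
        else:
            lat, lon = coords
            matched.append({**record, "lat": lat, "lon": lon})
    return matched, unmatched


matched, unmatched = georeference(SPECIALIZED, BASELINE)
print(len(matched), "georeferenced;", len(unmatched), "without a baseline match")
```

Keeping the unmatched records, rather than silently dropping them, makes gaps in the baseline layer visible so that place-name variants can be reviewed by hand.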
Population censuses are expensive and, for most countries, occur at most
once a decade. Surveys, by contrast, use much smaller samples and, even
when correctly sampled, cannot give precise estimates for small areas,
particularly rural and less populated regions. To create timely, highly
resolved spatial analyses, census data can be combined with smaller, more
topical surveys using small area estimation techniques to produce region-
or country-wide high-resolution estimates of demographic factors. Data as
supplied can also be aggregated to regional units and joined with baseline
data.
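One simple member of the small area estimation family is a sample-size-dependent composite estimator, which blends a direct survey estimate with a census-based (synthetic) estimate for each area. The sketch below is only an illustration of that idea; the report does not state which technique was used. The function name `composite_estimate`, the constant `k`, and all numbers are assumptions made for the example.

```python
def composite_estimate(direct, n_sample, synthetic, k=50.0):
    """Blend a direct survey estimate with a census-based synthetic estimate.

    The weight n / (n + k) grows with the survey sample size n, so areas
    with few respondents lean on the synthetic value. k is an assumed
    tuning constant, not a value taken from the report.
    """
    if n_sample < 0 or k <= 0:
        raise ValueError("n_sample must be >= 0 and k must be > 0")
    weight = n_sample / (n_sample + k)
    return weight * direct + (1.0 - weight) * synthetic


# Made-up inputs:
# (area, direct survey estimate, survey sample size, synthetic estimate)
areas = [
    ("urban district", 0.20, 450, 0.25),
    ("rural district", 0.40, 50, 0.30),
    ("remote district", 0.00, 0, 0.35),
]

estimates = {name: composite_estimate(d, n, s) for name, d, n, s in areas}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

With these inputs, the well-sampled urban district stays close to its survey value, while the remote district, which has no survey respondents, falls back entirely on the synthetic estimate.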
2.2 Overview of Dataset Sources
Many private and public initiatives collect, create, and disseminate
spatial and tabular data products. International bodies, such as the Food
and Agriculture Organization of the United Nations (FAO), were among the
first to collate international spatial information to improve access to and
use of spatial data. Other international organizations and research
institutions that collect and/or disseminate data for different countries
include the U.S. Geological Survey (USGS) and the World Bank. These
organizations generally provide free access to the data through their
websites, though it is frequently necessary to request the data and obtain
approval first.
Local agencies, i.e., agencies located in a study area, also collect and
disseminate spatial and tabular data. Local agencies fall into two
categories: public agencies and private organizations. Examples of private
institutions include banks, schools, churches, and power companies.
Examples of public agencies include local equivalents of U.S. departments
and agencies (e.g., the Department of Defense, the Department of Education,
and the U.S. Census Bureau). As noted earlier, local agencies in developing
countries do not always have the capability to collect and disseminate
high-quality, high-resolution data, may choose not to disclose such
data for security reasons, or may not make the data easily accessible. For
example, while city governments may publish an interactive map on their
website, there is frequently no user-friendly way to download the displayed
data in GIS or any other format.
Crowd-sourced spatial data, or citizen-collected data, have become
increasingly popular in recent years due to the availability of the Internet
and mobile phones. OpenStreetMap is one such example and is increasingly
used by many organizations (e.g., for developing transportation
applications). Citizen-collected data may be problematic due to the lack of
standardized metadata and to device-introduced and human-introduced errors,
but they are typically the most up-to-date source and provide the
freshest insight into spatial infrastructure. Another problem is that
coverage of non-western countries may be very incomplete, even for urban
areas.
Finally, media (newspapers) and social media data (Twitter) are a valuable
source for collecting data (frequently geocoded), including in real time. As
with any other citizen-collected data, data collection may be impeded by
limited internet availability and the level of users’ online activity, as well as by the fact
that the majority of data are likely to be in a native language (not English).
3 Data Search Methodology
This section describes methods used to search for open-source geographic
base, foundational and specialized data that can be found through the Internet.
The search for open-source data was conducted primarily with the Google
and Yahoo search engines. Google served as the primary search engine
and Yahoo as the secondary one, because the
relevance rankings returned by the two engines may differ. Additionally, the
Google Scholar search engine was used to search for academic publications and references, primarily on topics of religion, politics, ethnicity
and linguistics.
3.1 Search for Baseline and Foundational Datasets
The search for baseline GIS data (administrative boundaries, transportation and river networks and others) can be done by typing appropriate
search terms into a search engine, for example, “Azerbaijan GIS data”. It is
also useful to include more specific terms in a search, for example, “Azerbaijan transportation GIS data”, and to alternate search terms, for example, “geographic” instead of “GIS”. Adding words such as “free” or
“open-source” helps exclude commercial GIS sources. Global datasets
containing baseline data for the entire world can be quickly identified in
this manner.
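The search-term variations described above can be generated systematically rather than typed by hand. The following is a minimal sketch, not the procedure used in this study: the function name build_queries and the default term lists are illustrative assumptions, and the resulting strings would still need to be entered into a search engine manually or through whatever interface the engine permits.

```python
from itertools import product


def build_queries(country,
                  themes=("", "transportation", "administrative boundaries"),
                  data_terms=("GIS", "geographic"),
                  qualifiers=("", "free", "open-source")):
    """Return every combination of qualifier, country, theme and data term.

    The default term lists are illustrative assumptions, not a list taken
    from the report.
    """
    queries = []
    for qualifier, theme, term in product(qualifiers, themes, data_terms):
        parts = [qualifier, country, theme, term, "data"]
        # Skip empty slots so that "" yields the shorter, more general query.
        queries.append(" ".join(part for part in parts if part))
    return queries


if __name__ == "__main__":
    for query in build_queries("Azerbaijan"):
        print(query)
```

Keeping the qualifier, theme and data-term lists separate makes it easy to add translated terms when the search is repeated in the language of the country in question.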
The search for free geographical data turns up not only individual websites
with data, but also websites that serve as a reference to external data
sources. Many universities maintain spatial data portals
through a library or individual departments (search terms like “university
GIS spatial data” usually turn up the necessary information). Reference websites may also be created by individual citizen enthusiasts or
as part of a larger research project.
Such reference websites may or may not be updated on a regular basis and
many (though not all) referenced data sources overlap. Reference websites
are convenient as they provide a ready-to-use list of geographic data,
which is usually characterized by theme (e.g., ecology, human geography,
land use etc.) and location (world, country, and/or city datasets). As already noted, reference websites are not necessarily updated and links to
the data may be broken or outdated. Additionally, referenced data may be
of questionable quality, outdated or lacking metadata.
The search for foundational data themes (e.g., population density maps,
land use maps etc.) is similar to the search for baseline data themes as described above. Similarly, it is recommended to execute searches using a
variety of search terms with different search engines. Reference websites
almost always include links to foundational data themes on a global scale.
Global foundational themes can be in a raster or vector format. When in a
raster format, the data are usually available at the resolution of at least one
kilometer.
3.2 Search for Specialized Datasets
The search for specialized data themes (e.g., data on traffic accidents) can
be more cumbersome for international data sources, since non-western
countries often either do not collect specialized data or do not make such
data freely available to the public. In most cases, specialized data that were
collected at a neighborhood scale are made available only in aggregate
form at the first or second administrative level due to privacy issues. Specialized data can be available in English, but usually they are published in
the language of the country in question.
The search for specialized data themes using web search engines is generally not productive unless the search is conducted using the language of
the country in question. In the latter case, there is a better chance of returning relevant search results.
It is recommended to start the search for specialized data themes by consulting the major government agency responsible for collecting statistical
information in a given country (such an agency can be identified through an internet search; additionally, the United States Bureau of Labor
Statistics provides a list of international statistical agencies at
http://www.bls.gov/bls/other.htm/). While there may be issues associated
with the trustworthiness of data collected by governments, government statistics are frequently the major source of information. Statistics from the
major government agency usually cover a great variety of topics ranging
from environment to socio-economic factors. Data are usually available at
the first, second and third administrative levels.
State statistical agencies typically provide economic and demographic data
in the form of censuses, household surveys, health/HIV surveys and commodity and price surveys at a variety of administrative levels. Availability
of data and data acquisition processes differ for individual countries; frequently, the raw files and datasets are made publicly accessible. State statistical agencies in developing countries often lack robust data search
tools, which can make the process of navigation and data search less efficient when compared with the developed world.
The major statistical agency typically serves as a nexus for other places
that may have data. However, it is often useful to consult other governmental agencies and departments as they occasionally provide some additional data in the form of reports and maps on their websites. Examples of
such agencies include the equivalents of the Department of Health, the
Department of Defense, the Department of Education, the Department of
Agriculture and other Departments in the United States. A list of governmental agencies can be compiled by running a web search or consulting
Wikipedia (http://www.wikipedia.org/), which is a good source for basic
information on a given country’s political and administrative organization.
Additional ways to find specialized data on a neighborhood level include
searching official websites of cities and administrative areas at level 2 or
finer. A list of these sources can be obtained with the help of Wikipedia,
which often publishes links to official websites of cities and other administrative entities. However, this method of searching for data can be time-consuming and unproductive unless it is automated using web-scraping
software. Official websites of cities and other administrative entities can
vary greatly in terms of quantity and quality of information but sometimes
they publish scanned maps, provide links to interactive maps, and provide
reports and other kinds of data. The primary challenge with these sources
is that official websites tend to provide information in their native language only, and even if an English version is available, it tends to be much
less complete than the native-language version.
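The automation mentioned above can be as simple as scanning each official web page for links to downloadable maps, tables and reports. The sketch below is one possible approach using only the Python standard library; it is not the web-scraping software used in this study. The list of file extensions, the class and function names, and the example.org addresses are assumptions chosen for illustration, and the page HTML is supplied as a string so that the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed list of extensions that suggest a downloadable dataset or report.
DATA_EXTENSIONS = (".kml", ".kmz", ".shp", ".zip", ".csv", ".xls", ".xlsx", ".pdf")


class DataLinkParser(HTMLParser):
    """Collect absolute URLs of links whose target looks like a data file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Compare in lower case so that "REPORT.PDF" is also recognized.
        if href and href.lower().endswith(DATA_EXTENSIONS):
            self.links.append(urljoin(self.base_url, href))


def find_data_links(html, base_url):
    """Return data-file links found in the HTML text of one page."""
    parser = DataLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    sample_page = (
        '<a href="/maps/districts.kml">Districts</a>'
        '<a href="about.html">About</a>'
        '<a href="http://example.org/report.PDF">Report</a>'
    )
    for link in find_data_links(sample_page, "http://city.example.org/"):
        print(link)
```

In practice the HTML would be downloaded for each official website in the list compiled from Wikipedia, and the extension list would be adjusted to the file types a given country's agencies actually publish.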
Specialized data include various surveys (e.g., surveys on political attitudes) that are done by individual researchers, research groups and institutions. Survey reports are generally produced on a country level, and
while raw data are usually available free of charge, in most cases it is necessary to request special permission to use the data due to confidentiality
issues. Many surveys can be found via the web by using various combinations
of search terms. Google Scholar is useful for finding academic publications
that provide analyses of the country’s issues (e.g., religion). Academic
studies that are based on a survey will provide a reference to their data
source. Individual researchers may be able to share their data. It must be
noted that smaller surveys are usually done within a city or a smaller administrative area and hence are not nationally representative.
Besides Google Scholar (which usually provides references only), academic
publications can be searched and obtained through university libraries or
other libraries that have access to academic databases. University libraries
usually have access to the ProQuest dissertation database; unpublished
dissertations and theses are another useful source of data references.
Specialized data can be further obtained from newspapers, blogs, forums,
and other social media. These sources can be found via web search and by
following any further references published on their sites. One example of
specialized data found through media sites is the number of protesters
as reported by official news and as estimated by non-government-affiliated
experts. Analysis of the media environment can provide useful information
about events and accompanying attitudes of the population in a near real-time setting. Aside from Facebook and Twitter, data can be retrieved
through news agencies, blogs and forums. News articles almost always
include information related to the geographic location of the event, and specific geographic information may also be found in blogs and forums.
4 Data Integration from Disparate Sources
<forthcoming>
5 Global Sources for Human Geography Data
5.1 Collection of Four or More Human Geography Themes
5.1.1 The Economist
The Economist (http://www.economist.com) covers political and other
news, as well as blogs and debates.
5.1.2 Topix
Topix (http://www.topix.com) aggregates and delivers updated news from
various sources, including forums.
5.1.3 The New York Times
The New York Times (http://topics.nytimes.com) provides current news,
as well as archived articles and commentaries.
5.1.4 World Bank
World Bank’s datasets (http://data.worldbank.org/data-catalog) provide a
variety of national level thematic indicators. Access to raw data requires
registration, but reports on a national level are freely available. For Azerbaijan, seven surveys are available: Azerbaijan - Global Financial Inclusion
(Global Findex) Database 2011 (by Development Research Group, Finance
and Private Sector Development Unit - World Bank); Azerbaijan - Enterprise Survey 2002, 2005 and 2009 (by World Bank, European Bank for
Reconstruction and Development); Azerbaijan - Financial Literacy Survey
2009 (by Azerbaijan Micro-finance Association); Azerbaijan - Multiple Indicator Cluster Survey 2000 (by State Statistical Committee of the Azerbaijan Republic, UNICEF Multiple Indicator Cluster Surveys); and Azerbaijan - Survey of Living Conditions 1995 (by Social Studies Center,
Institute of Sociology and Political Science (SORGU) and the World Bank).
5.1.5 United Nations
The United Nations datasets (http://data.un.org/) provide data from its constituent agencies on population, Millennium Development Goals, mortality and other social and economic information on a country level.
5.1.6 Internal Displacement Monitoring Centre
The Internal Displacement Monitoring Centre (http://www.internal-displacement.org) provides information and analysis on the situation and background of internally displaced populations (IDPs) worldwide, with social and
economic data on IDPs being most readily available on a country level.
5.1.7 WikiMapia
WikiMapia (www.wikimapia.org) is a collaborative project, where users
can create their own or update existing map data worldwide. WikiMapia’s
data may be extracted as a Google Earth file with a .kml extension through an
application programming interface (API). Data coverage differs by individual country, and urban areas are typically covered more extensively.
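Once a .kml file has been obtained, the named point features it contains can be listed with a few lines of code before the file is loaded into a GIS. The following is a minimal sketch that assumes the file uses the standard KML 2.2 namespace and stores locations as Point placemarks; the actual structure of a WikiMapia export may differ (for example, it may contain polygons), and the sample placemark below is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace; assumed to match the file being read.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Invented sample content used only to demonstrate the parser.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Sample school</name>
      <Point><coordinates>49.8671,40.4093,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Placemark without a point</name>
    </Placemark>
  </Document>
</kml>"""


def read_placemarks(kml_text):
    """Return name, longitude and latitude for every Point placemark."""
    root = ET.fromstring(kml_text)
    records = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            ".//kml:Point/kml:coordinates", default="", namespaces=KML_NS
        )
        if not coords.strip():
            continue  # skip placemarks that carry no point geometry
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(value) for value in coords.strip().split(",")[:2])
        records.append({"name": name.strip(), "lon": lon, "lat": lat})
    return records


if __name__ == "__main__":
    for record in read_placemarks(SAMPLE_KML):
        print(record)
```

Note that KML lists longitude before latitude, which is the reverse of the order commonly used in text sources.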
5.1.8 OpenStreetMap
OpenStreetMap (http://www.openstreetmap.org/) creates and distributes
free geographic data for the world; users are allowed to make changes
to the maps and to add new content by uploading GPS data. Available data
layers may include settlements, railway stations, transportation networks,
water features, miscellaneous points of interest (schools, hotels, banks, ATMs,
etc.), land use, natural reserves and vegetation.
5.1.8.1 GeoFabrik.de
GeoFabrik.de (http://download.geofabrik.de/openstreetmap/) creates extracts of OpenStreetMap data.
5.1.8.2 BBBike.org
BBBike.org (http://download.bbbike.org/osm/) creates extracts of OpenStreetMap data.
5.1.8.3 GIS-Lab
GIS-Lab (http://gis-lab.info/projects/osm_shp/region) creates extracts of
OpenStreetMap data.
5.1.9 USGS Earth Explorer
The USGS Earth Explorer (http://earthexplorer.usgs.gov/) has a variety of
aerial, satellite, and radar map projects, including digital elevation data
and water body data for different uses.
5.1.10 EDENext Data Portal
The EDENext Data Portal (http://www.edenextdata.com) provides datasets on a variety of topics, primarily related to climate change, biodiversity and agriculture. Links to additional global datasets are available at
http://www.edenextdata.com/?q=content/global-gis-datasets-links-0.
5.1.11 Global Administrative Areas Database
The Global Administrative Areas Database (GADM, http://www.gadm.org)
provides access to administrative boundaries and to hydrologic, road, railroad,
port, airport, and populated places data.
5.1.12 Natural Earth
Natural Earth (http://www.naturalearthdata.com) provides access to administrative boundaries and to hydrologic, road, railroad, port, airport, and
populated places data.
5.1.13 DIVA-GIS
DIVA-GIS (http://www.diva-gis.org/Data) provides access to administrative boundaries and to hydrologic, road, railroad, port, airport, and populated
places data.
5.1.14 Second Administrative Level Boundaries Database
The Second Administrative Level Boundaries (SALB, http://www.unsalb.org/)
database provides access to standardized maps of subnational administrative boundaries, which are widely used in other research projects.
5.1.15 Food and Agriculture Organization (FAO) of the United Nations GeoNetwork
FAO GeoNetwork (http://www.fao.org/geonetwork/srv/en/main.home)
provides access to georeferenced databases, interactive maps, and satellite
imagery. The Global Administrative Units Database
(http://www.fao.org/geonetwork/srv/en/metadata.show?id=12691) provides access to first- and second-level administrative units, and to lower
levels, if available. FAO also provides access to the Relational World Database II (RWDB2), which can be accessed by entering the search term.
RWDB2 is a collection of accurate second level and in some cases third
and fourth level administrative unit shapefiles, rivers, roads and other
administrative data.
5.1.16 International Center for Tropical Agriculture
The International Center for Tropical Agriculture (CIAT, www.ciat.cgiar.org)
focuses on agriculture, food security and climate change research in Asia,
Africa, Latin America and the Caribbean. It provides access to data,
models and web mapping tools (Tools and Resources tabs at
http://dapa.ciat.cgiar.org/).
5.1.17 World Values Survey
The World Values Survey (www.worldvaluessurvey.org) provides access to surveys in 87 societies including some major cities. Questions cover socioeconomic status, demographics, and values related to religion, race, gender, government, politics and others. The surveys span
1981-2014; the most recent surveys will be released in 2014.
5.1.18 PreventionWeb
PreventionWeb (http://www.preventionweb.net/) is a project of the UN
Office for Disaster Risk Reduction. It provides access to data in vector and
raster format on cyclones, droughts, earthquakes, fires, floods, landslides,
tsunamis and volcanoes.
5.1.19 Pew Research Center
Pew Research Center (http://www.pewresearch.org/) conducts annual
surveys on various topics, e.g., religion, inequality, corruption, freedom,
attitudes towards current political leaders and others. The following projects cover countries other than the United States: the Global Attitudes Project
and the Religion and Public Life Project.
5.1.20 SETA Foundation for Political, Economic and Social Research
SETA (http://setav.org/) is a non-profit research agency conducting work
on national, regional and interregional issues. Their reports may be useful
for assessing situations on a national and a sub-national level.
5.1.21 Carbon Monitoring for Action (CARMA)
CARMA (http://carma.org/) is associated with the Confronting Climate
Change Initiative at the Center for Global Development
(http://www.cgdev.org). It is a global database with the best available estimates of CO2 emissions from power plants around the world and the identities of
the firms that own them.
5.1.22 The Guardian
The Guardian (http://www.theguardian.com/) makes a variety of data
available for many countries, in addition to providing news. The data can be
found through their Datastore (http://www.theguardian.com/data) and
Datablog (http://www.theguardian.com/news/datablog). Coverage is most detailed for the UK, but other countries are included as well. The data are
frequently in an interactive map format.
5.2 Communications and Media
5.2.1 The Electoral Knowledge Network
The Electoral Knowledge Network (http://aceproject.org/) provides data
on the electoral process in countries around the world. The database cites its data sources and makes an effort to verify data. The data cover a wide
array of categories related to electoral systems, including information on
voting regulations, regulations pertaining to political parties and campaigns and regulations relating to the media in elections.
5.2.2 Reporters without Borders
The Reporters without Borders organization (http://en.rsf.org/) has developed
press freedom rankings for 179 countries. It provides information on
freedom of the press and access to uncensored information.
5.2.3 The Committee to Protect Journalists
The Committee to Protect Journalists (http://www.cpj.org/) also publishes reports on the state of press freedom worldwide.
5.2.4 The DIMES Project
The DIMES Project (http://www.netdimes.org/) is an open-source distributed scientific research project studying the connectivity, structure and
topology of the Internet. The data are collected with the help of volunteers
(a volunteer installs the DIMES software on their computer, which then
collects the data and sends it over in a manner similar to Berkeley’s
SETI@home project). The latest data (published monthly) date to April
2012. The data can be used to map the density of internet connectivity.
5.3 Demographic and Human Population Measures
5.3.1 SEDAC
Columbia University’s Socio-Economic Data and Applications Center
(SEDAC, http://sedac.ciesin.columbia.edu) provides grids of local population, population density, population change as well as urban extents.
5.4 Economy
5.4.1 SEDAC
Columbia University’s Socio-Economic Data and Applications Center
(SEDAC, http://sedac.ciesin.columbia.edu) provides data on unsatisfied
basic needs, poverty and food security.
5.5 Education
5.6 Ethnicity
5.6.1 Joshua Project
The Joshua Project (http://www.joshuaproject.net) provides descriptions
and statistical summaries of ethnic groups, languages spoken and types of
religions. The purpose of the project is to emphasize groups with the fewest followers of Christianity.
5.6.2 International Conflict Research
The International Conflict Research group (www.icr.ethz.ch) provides access to geo-referenced ethnic groups (GREG) and geo-referenced ethnic
power relations (GeoEPR) datasets (http://www.icr.ethz.ch/data/other).
5.6.3 People Groups
People Groups (http://peoplegroups.org/) is similar to the Joshua Project. The purpose of the project is to identify groups with the most and the fewest followers of evangelical Christianity. The project provides statistical summaries on people groups, including approximate location,
language spoken, religion and ethnic affiliations.
5.7 Language
5.7.1 UNESCO
The UNESCO Atlas of the World’s Languages in Danger
(http://www.unesco.org/culture/languages-atlas/index.php) provides the
number of language speakers and classifies languages as safe, vulnerable,
definitely/severely/critically endangered and extinct.
5.7.2 Ethnologue
The Ethnologue: Languages of the World (http://www.ethnologue.com)
project provides the number of language speakers, lists dialect names, describes language use, gives statistical summaries as well as language maps
(for selected regions).
5.7.3 Joshua Project
The Joshua Project (http://www.joshuaproject.net) provides descriptions
and statistical summaries of ethnic groups, languages spoken and types of
religions. The purpose of the project is to emphasize groups with the fewest followers of Christianity.
5.7.4 Lingvarium Project
Lingvarium Project (http://lingvarium.org/index.shtml) provides data on
linguistic geography as well as historical distribution of linguistic groups.
5.7.5 People Groups
People Groups (http://peoplegroups.org/) is similar to the Joshua Project. The purpose of the project is to identify groups with the most and the fewest followers of evangelical Christianity. The project provides statistical summaries on people groups, including approximate location,
language spoken, religion and ethnic affiliations.
5.8 Land: Cultural Terrain
5.9 Land: Ownership
5.10 Land: Use and Cover
5.10.1 IUCN Red List of Threatened Species
The IUCN Red List of Threatened Species (www.iucnredlist.org) provides
assessments for almost 70,000 species, with about 40,000 species
mapped on a global scale. The IUCN Red List provides data on the distribution of sea grasses, amphibians, reptiles, mammals and marine fish as well
as mangroves and coral reefs.
5.10.2 BirdLife International
BirdLife International
(http://www.birdlife.org/datazone/info/spcdownload) provides data on
distribution of threatened bird species. The data can be accessed with
permission only.
5.10.3 Lincoln Institute of Land Policy
The Lincoln Institute of Land Policy (http://www.lincolninst.edu/) conducted
a study on land use and land use change in major cities worldwide. The dataset ‘Atlas of Urban Expansion’ is available for download with an accompanying report. The dataset includes land use raster files for selected cities.
5.10.4 Project Quicksilver
Project Quicksilver (http://forecast.io/quicksilver/) features a real-time
map of global temperature. According to the authors, this is an experimental project which may have unresolved issues (e.g., temperature over
the oceans has the lowest resolution and accuracy). Data can be downloaded
hourly in TIFF format, with a resolution of 0.05 degrees.
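A resolution stated in degrees can be compared with resolutions stated in kilometers by converting it for the latitude of the study area. The sketch below uses a spherical-Earth approximation with a mean radius of 6371.0088 km, which is an assumption adequate for a rough comparison only; the function name cell_size_km is illustrative. Under this approximation, a 0.05-degree cell measures roughly 5.6 km north to south, and its east-west extent shrinks with the cosine of the latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def cell_size_km(resolution_deg, latitude_deg):
    """Approximate (north-south, east-west) raster cell size in kilometers."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = resolution_deg * km_per_degree
    # Meridians converge toward the poles, so east-west distance scales
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    # 40.4 degrees north is used here as an approximate latitude for Baku.
    ns, ew = cell_size_km(0.05, 40.4)
    print(f"north-south: {ns:.2f} km, east-west: {ew:.2f} km")
```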
5.11 Medical and Health
5.11.1 HIV Spatial Resource Repository
HIV Spatial Resource Repository
(http://www.hivspatialdata.net/?page=data) provides information on spatially explicit HIV-related data.
5.11.2 Demographic and Health Surveys
USAID’s Measure Demographic and Health Surveys (DHS) program
(http://www.measuredhs.com/) provides survey data on health and health
services. The DHS program includes 67 surveys from 36 countries, including latitude and longitude coordinates of surveyed communities. An additional resource is the HIV Spatial Resource Repository located at
http://www.hivspatialdata.net/?page=data.
5.11.3 SEDAC
Columbia University’s Socio-Economic Data and Applications Center
(SEDAC, http://sedac.ciesin.columbia.edu) provides data on infant mortality rates and prevalence of child malnutrition.
5.11.4 World Health Organization
The World Health Organization (http://www.who.int/) collects data on various
health-related topics. These include the World Health Survey
(http://www.who.int/healthinfo/survey/en/index.html), the Global Tobacco Surveys
(http://www.who.int/tobacco/surveillance/survey/en/index.html) and the Global
School-based Student Health Survey
(http://www.cdc.gov/gshs/index.htm).
5.12 Organizations
5.13 Religion
5.13.1 Joshua Project
The Joshua Project (http://www.joshuaproject.net) provides descriptions
and statistical summaries of ethnic groups, languages spoken and types of
religions. The purpose of the project is to emphasize groups with the fewest followers of Christianity.
5.13.2 People Groups
People Groups (http://peoplegroups.org/) is similar to the Joshua Project. The purpose of the project is to identify groups with the most and the fewest followers of evangelical Christianity. The project provides statistical summaries on people groups, including approximate location,
language spoken, religion and ethnic affiliations.
5.14 Significant Events
5.14.1 Amnesty International
Amnesty International (http://www.amnesty.org/) provides annual
reports on human rights conditions in individual countries. These are short, readable reports that highlight significant events in human rights occurring
each year.
5.14.2 GDELT Event Database
The Global Database of Events, Language and Tone (GDELT,
http://gdelt.utdallas.edu/) provides georeferenced worldwide data on
human societal-scale behavior and beliefs, as extracted from news and social media archives. The entire dataset has over a quarter-billion records
dating back to January 1979. Dataset updates occur daily.
5.14.3 Electoral Geography 2.0
Electoral Geography 2.0 (http://www.electoralgeography.com/new/en/)
is a blog dedicated to collecting and mapping data on elections worldwide.
The data come from many sources, including newspapers and Wikipedia.
The website also provides links to similar projects.
5.15 Social Groups
5.16 Transportation Use
5.17 Water Supply and Control
6 COCOM Sources for Human Geography Data Themes
6.1 AFRICOM Sources
6.1.1 Collection of Four or More Human Geography Themes
6.1.2 Communications and Media
6.1.3 Demographic and Human Population Measures
6.1.4 Economy
6.1.5 Education
6.1.6 Ethnicity
6.1.6.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.1.7 Language
6.1.7.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.1.8 Land: Cultural Terrain
6.1.9 Land: Ownership
6.1.10 Land: Use and Cover
6.1.11 Medical and Health
6.1.12 Organizations
6.1.13 Religion
6.1.13.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.1.14 Significant Events
6.1.15 Social Groups
6.1.16 Transportation Use
6.1.17 Water Supply and Control
6.2 CENTCOM Sources
6.2.1 Collection of Four or More Human Geography Themes
6.2.1.1 Radio Free Europe Radio Liberty
Radio Free Europe Radio Liberty (http://www.rferl.org) describes itself as
an agency working in countries without a free press and providing access
to uncensored news and debates.
6.2.2 Communications and Media
6.2.3 Demographic and Human Population Measures
6.2.4 Economy
6.2.5 Education
6.2.6 Ethnicity
6.2.6.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.2.7 Language
6.2.7.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.2.8 Land: Cultural Terrain
6.2.9 Land: Ownership
6.2.10 Land: Use and Cover
6.2.10.1 The Interactive Agricultural Ecological Atlas of Russia and Neighboring Countries
The Interactive Agricultural Ecological Atlas of Russia and Neighboring
Countries (www.agroatlas.ru), funded by the USDA Agricultural Research
Service and Office of International Research Programs, provides spatial
data on crops and crop wild relatives, as well as diseases, pests, weeds and
environment (climate, soils, vegetation). The data are in a MapInfo format, which can be converted into ESRI shapefiles using the ArcGIS Data Interoperability extension.
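Where the ArcGIS extension is not available, the same conversion can be performed with the open-source GDAL/OGR toolkit, which reads MapInfo files and writes ESRI shapefiles. This is an alternative to the method named above, not the method used in this study. The sketch below wraps GDAL's ogr2ogr command-line program; it assumes GDAL is installed separately, and the file names crops.tab and crops.shp are hypothetical.

```python
import shutil
import subprocess


def build_ogr2ogr_command(source_path, target_path):
    """Build the call: ogr2ogr -f <output format> <destination> <source>."""
    return ["ogr2ogr", "-f", "ESRI Shapefile", target_path, source_path]


def convert_to_shapefile(source_path, target_path):
    """Run the conversion; raise if GDAL's ogr2ogr is not installed."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr (GDAL) was not found on the PATH")
    subprocess.run(build_ogr2ogr_command(source_path, target_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than running it.
    print(" ".join(build_ogr2ogr_command("crops.tab", "crops.shp")))
```

Note that ogr2ogr expects the destination before the source, which is the reverse of most copy commands.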
6.2.11 Medical and Health
6.2.12 Organizations
6.2.13 Religion
6.2.13.1 Gulf2000 Project
Gulf2000 Project (www.gulf2000.columbia.edu) provides linguistic, ethnic, religious and cultural maps.
6.2.14 Significant Events
6.2.15 Social Groups
6.2.16 Transportation Use
6.2.17 Water Supply and Control
6.3 EUCOM Sources
6.3.1 Collection of Four or More Human Geography Themes
6.3.1.1 Osservatorio Balcani e Caucaso
Osservatorio Balcani e Caucaso (http://www.balcanicaucaso.org) provides
news and analysis of social and political changes in South-East Europe,
Turkey and the Caucasus.
6.3.1.2 European Union External Action
European Union External Action (http://eeas.europa.eu) delivers news on
the external relations of the European Union.
6.3.1.3 Portal on Central Eastern and Balkan Europe
Portal on Central Eastern and Balkan Europe by IECOB & AIS (PECOB,
http://www.pecob.eu) is primarily a collection of printed and online news
resources.
6.3.1.4 Marilisa Lorusso's Blog
Marilisa Lorusso's Blog (http://marilisalorusso.blogspot.com/) describes
the main events (primarily political, but also economic and social)
in Georgia, Armenia and Azerbaijan.
6.3.1.5 The Caucasus Research Resource Centers
The Caucasus Research Resource Centers (www.crrccenters.org) is a program of the Eurasia Foundation funded by the Carnegie Corporation of
New York, which conducts research in Armenia, Azerbaijan and Georgia.
Their research methods include desk reports and surveys on such issues as
corruption, religious beliefs, household skills, social cohesion and political
attitudes, to name a few. Studies that include Azerbaijan are the Caucasus Barometer, an annual household survey on social and economic issues and political attitudes, and the Social Capital, Media and Gender Survey. The data are nationally representative and can be aggregated to a larger geographic region
(e.g., southwest, northeast etc.).
6.3.1.6 Eurofound
Eurofound (http://www.eurofound.europa.eu/index.htm) is the European
Union agency conducting research in the areas of social and economic
change. It conducts the following surveys: the European Quality of Life
Survey, the European Working Conditions Survey and the European Company Survey. The surveys cover EU member and candidate countries; are
nationally representative; and are done in multiple waves. The surveys
cover a broad range of indicators, both objective and subjective.
6.3.1.7 European Social Survey
The European Social Survey (http://www.europeansocialsurvey.org/) is
conducted biennially and covers such topics as the attitudes, beliefs and behaviors of people. Example questions include those on politics and government, social life, terrorism, religion, economy and others. The Turkish survey was conducted in 2008.
6.3.1.8 Eurobarometer
Eurobarometer programme
(http://ec.europa.eu/public_opinion/index_en.htm;
http://www.gesis.org/en/eurobarometer/home/) conducts surveys on
such topics as social situation, health, culture, information technology, environment, the Euro, defense and others.
6.3.1.9 European Environment Agency
The European Environment Agency (http://www.eea.europa.eu/) is the European Union agency responsible for conducting research and disseminating information on the environment. Available datasets include national
emissions, water quantity and quality, natural protected areas, land cover
and others. Some datasets cover parts of the countries adjacent to Europe.
6.3.2 Communications and Media
6.3.3 Demographic and Human Population Measures
6.3.4 Economy
6.3.5 Education
6.3.6 Ethnicity
6.3.7 Language
6.3.8 Land: Cultural Terrain
6.3.9 Land: Ownership
6.3.10 Land: Use and Cover
6.3.10.1 The Interactive Agricultural Ecological Atlas of Russia and Neighboring Countries
The Interactive Agricultural Ecological Atlas of Russia and Neighboring
Countries (www.agroatlas.ru), funded by the USDA Agricultural Research
Service and the Office of International Research Programs, provides spatial
data on crops and crop wild relatives, as well as diseases, pests, weeds and
environment (climate, soils, vegetation). The data are in MapInfo format, which can be converted into ESRI shapefiles using the ArcGIS Data Interoperability extension.
6.3.10.2 The European Soil Data Centre
The European Soil Data Centre provides a thematic data infrastructure for
soils (http://eusoils.jrc.ec.europa.eu/). While soil data are available in a
digital format (the European Soil Databases), the data are only for 27 European Union countries.
6.3.11
Medical and Health
6.3.12
Organizations
6.3.13
Religion
6.3.14
Significant Events
6.3.15
Social Groups
6.3.16
Transportation Use
6.3.17
Water Supply and Control
6.4
PACOM Sources
6.4.1
Collection of Four or More Human Geography Themes
6.4.2
Communications and Media
6.4.3
Demographic and Human Population Measures
6.4.4
Economy
6.4.5
Education
6.4.6
Ethnicity
6.4.7
Language
6.4.8
Land: Cultural Terrain
6.4.9
Land: Ownership
6.4.10
Land: Use and Cover
6.4.11
Medical and Health
6.4.12
Organizations
6.4.13
Religion
6.4.14
Significant Events
6.4.15
Social Groups
6.4.16
Transportation Use
6.4.17
Water Supply and Control
6.5
SOUTHCOM Sources
6.5.1
Collection of Four or More Human Geography Themes
6.5.2
Communications and Media
6.5.3
Demographic and Human Population Measures
6.5.4
Economy
6.5.5
Education
6.5.6
Ethnicity
6.5.7
Language
6.5.8
Land: Cultural Terrain
6.5.9
Land: Ownership
6.5.10
Land: Use and Cover
6.5.11
Medical and Health
6.5.12
Organizations
6.5.13
Religion
6.5.14
Significant Events
6.5.15
Social Groups
6.5.16
Transportation Use
6.5.17
Water Supply and Control
7
Azerbaijan Sources
7.1
Collection of Four or More Human Geography Themes
7.1.1
State Statistical Committee of the Republic of Azerbaijan
The State Statistical Committee of the Republic of Azerbaijan at
http://www.stat.gov.az covers such topics as demographics (population,
gender, labor market, education, science, culture, health, and crime),
economy (agriculture, forestry, fishery, industry, energy, construction,
trade, transport, telecommunications and postal services, finances, and tourism),
and other topics (food, entrepreneurship, environmental protection, and
the information society). Data are available at the national, district (rayon),
economic region (an aggregation of several districts) and urban/rural levels. While data are most readily offered at the national and economic region
levels, district-level information constitutes a significant percentage of all
available data.
The State Committee provides some historical data (usually general population data) as well as more recent data extrapolated to 2010-2012 on the
basis of the 2009 Census. While the website has nearly identical sections in
English and Azerbaijani, the Azerbaijani section has additional files with data for each district (however, the Nakhchivan economic region is not broken into districts).
The site additionally features an electronic library with reports on selected
topics (e.g., children). Some of the reports are freely available while others
are for sale only. Most of the reports are in English though some are in
Azerbaijani. Since the State Committee publishes statistical yearbooks, the
reports should be available through university libraries in the US.
7.1.2
City of Baku
The City of Baku statistical website (http://www.baku.azstat.org) provides the
same kind of statistical information as the State Statistical Committee of
the Republic of Azerbaijan.
7.1.3
Ministry of Culture and Tourism of the Republic of Azerbaijan
The Ministry of Culture and Tourism of the Republic of Azerbaijan
(http://mct.gov.az) provides information in several languages, including
English. It links to GoMap (www.gomap.az), an online navigator
developed by the commercial company SINAM (www.sinam.net). GoMap is intended to promote tourism and thus
features a wide variety of data: administrative buildings, hotels and lodging, educational and medical institutions, entertainment, points of interest, industrial and natural sites, and others. While the navigator can be used for trip
planning, there is no easy way to download and save data
of interest.
7.1.4
Ministry of Labor and Social Protection of the Republic of
Azerbaijan
The Ministry of Labor and Social Protection of the Republic of Azerbaijan
(http://mlspp.gov.az/) features an interactive map
(http://inforoom.mlspp.gov.az/) of districts. The map covers most districts (excluding the Nakhchivan region) and for each district, the following information is included: general information (e.g., number of enterprises, number of secondary/primary/higher education schools etc.),
population (including the number of internally displaced persons), land
area (including total area of cultivated land, area of pastures etc.), the
names of main agricultural and economic crops, and poultry production.
While the publication year of the data is not listed, the data are presumably recent.
7.1.5
Gov.az
The website at www.gov.az provides a list of links to local government
websites for each administrative district. All websites are in Azerbaijani
and are built using the same template, and they feature the same categories of interest: economy, education, health, culture and sports. However,
the availability of information differs for each district and in several cases
is absent. Information on education and health usually includes a name
and an address of an educational institution and of a hospital or a clinic.
Using other information, such as a detailed street map and/or Google
Earth maps, it may be possible to indicate geographic coordinates of each
institution and place them on a map. As for the economy, the information
can vary from purely descriptive (e.g., stating that a given region specializes
in meat production) to quantitative. While some websites organize any
quantitative information in an easy-to-read format (e.g., in a table), many
simply insert such information throughout a body of text.
7.1.6
Forum azeri.net
A forum in the Azerbaijani language (http://forum.azeri.net/) has some general discussion topics, such as dating, the internet (how to earn money using the
internet), cooking, culture and others.
7.1.7
Ans Press
Ans Press is a news site (http://www.anspress.com/index.php?lng=ru) in
Russian.
7.1.8
Apa News Agency
Apa News Agency is a news site (http://en.apa.az/) in English.
7.1.9
Azertag
Azertag is a national news agency (http://azertag.com/en) specializing in
official government news.
7.1.10
Day.az
Day.az – Today.az is a news site (http://today.az/) in English.
7.1.11
Novosti Azerbaijan
Novosti Azerbaijan - Azerbaijan News is a news site (http://novosti.az/) in
Russian and Azeri.
7.2
Communications and Media
7.3
Demographic and Human Population Measures
7.3.1
State Social Protection Fund of the Republic of Azerbaijan
The State Social Protection Fund of the Republic of Azerbaijan
(www.sspf.gov.az) features an interactive map with information on how
many people receive pensions by district. The information dates to the beginning of 2012 and is broken into three categories: 1) people receiving
pensions due to age; 2) people receiving pensions due to disability;
and 3) people receiving "ABI" pensions (rendered as "OBI" by the Google Translate tool; the meaning of this abbreviation is not clear).
7.4
Economy
7.4.1
State Social Protection Fund of the Republic of Azerbaijan
The State Social Protection Fund of the Republic of Azerbaijan
(www.sspf.gov.az) features an interactive map with information on how
many people receive pensions by district. The information dates to the beginning of 2012 and is broken into three categories: 1) people receiving
pensions due to age; 2) people receiving pensions due to disability;
and 3) people receiving "ABI" pensions (rendered as "OBI" by the Google Translate tool; the meaning of this abbreviation is not clear).
7.4.2
Centralized Information System on Mass Payments
The Centralized Information System on Mass Payments of the Central
Bank of the Republic of Azerbaijan provides information on banks: name
of the bank; name, code and address of the branch; number of operators at
the branch (http://info.apus.az/?p=banks). It also provides information
on service providers, including the name of the provider; the name and
code of the branches; and the number of individual and business subscribers (http://info.apus.az/?p=merchants). Additional data include financial
information (in Azerbaijani currency) for each bank and service provider:
daily average transactions with cash and payment cards by month for
2008-2013. Taken together, this information can serve as an indicator of
economic development at the district level.
7.4.3
“Azerenergy” JSC
“Azerenergy” JSC (www.azerenerji.gov.az) is the biggest power producer
in Azerbaijan. It makes available a map of the country's current and prospective
power lines in .jpg format.
7.5
Education
7.5.1
Azerbaijan Republic Education Portal
The Azerbaijan Republic Education Portal at
http://portal.edu.az/index.php?r=article/item&id=222&mid=6&lang=en
features an interactive map of schools by district
(http://portal.edu.az/index.php?r=schoolmap&lang=en#list). For each
district, a list of schools as well as the name of a settlement, where a school
is located, is available. It is possible to use this information to create a map
of school locations (as well as school types, such as primary vs. secondary)
by settlement. Currently, no additional information is available for each
school, though it is possible that the Education Portal may decide to include it in the future.
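The settlement-level mapping described above can be sketched as a simple gazetteer lookup. The Python example below is a minimal illustration under stated assumptions, not the procedure used in this study: the function name locate_schools, the record fields, and the sample names and coordinates are hypothetical placeholders, and a real gazetteer would have to be obtained separately.

```python
def locate_schools(schools, gazetteer):
    """Attach settlement coordinates to school records.

    Both inputs are lists of dicts. Records are matched on the pair
    (district, settlement), ignoring case and surrounding whitespace.
    All field names are assumptions made for this sketch.
    """
    index = {}
    for place in gazetteer:
        key = (place["district"].strip().casefold(),
               place["settlement"].strip().casefold())
        index[key] = (place["lon"], place["lat"])

    located, unmatched = [], []
    for school in schools:
        key = (school["district"].strip().casefold(),
               school["settlement"].strip().casefold())
        if key in index:
            lon, lat = index[key]
            located.append({**school, "lon": lon, "lat": lat})
        else:
            # Keep unmatched records for manual review (e.g., spelling variants).
            unmatched.append(school)
    return located, unmatched


# Placeholder records; names and coordinates are illustrative only.
example_schools = [
    {"name": "School 1", "district": "District A",
     "settlement": "Settlement X", "type": "primary"},
    {"name": "School 2", "district": "District A",
     "settlement": "Settlement Y", "type": "secondary"},
]
example_gazetteer = [
    {"district": "District A", "settlement": "Settlement X",
     "lon": 48.0, "lat": 40.5},
]

located, unmatched = locate_schools(example_schools, example_gazetteer)
print(len(located), "located;", len(unmatched), "need manual review")
```

Returning the unmatched records separately is a deliberate choice in this sketch, since transliteration differences between a portal and a gazetteer are likely to leave some settlements unresolved.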
7.6
Ethnicity
7.7
Language
7.8
Land: Cultural Terrain
7.9
Land: Ownership
7.10 Land: Use and Cover
7.10.1
State Land Surveying Institute
The State Land Surveying Institute (http://www.dyli.az/en/) provides detailed topographic maps of municipalities in each of two districts: Agsu (47
municipalities) and Samaxi (48 municipalities). The maps are in a .jpg
format and must be converted into a GIS format.
7.10.2 Real Estate Cadastre and Technical Inventory Center of the State
Committee on Property of the Republic of Azerbaijan
The Real Estate Cadastre and Technical Inventory Center of the State
Committee on Property of the Republic of Azerbaijan (http://kadastr.az)
has links to several maps of Baku City districts, though these maps are
quite small and can only be used for general reference.
7.10.3 State Committee for Architecture and Urban Planning of
Azerbaijan Republic
The State Committee for Architecture and Urban Planning of Azerbaijan
Republic (http://www.arxkom.gov.az) has links to cadastral/topographic
maps of major cities in 24 districts in .jpg format. These digital maps are
detailed and are of high enough resolution to be converted into georeferenced maps.
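One simple way to georeference such an image, assuming it is north-up (unrotated) and the map coordinates of its outer edges can be read from the map itself, is to place an ESRI world file (.jgw) next to the .jpg. The sketch below computes the six world-file values. The function names and the example extent are hypothetical, and scanned maps that are rotated or distorted would instead need control-point georeferencing in GIS software.

```python
def world_file_values(width_px, height_px, xmin, ymin, xmax, ymax):
    """Return the six world-file values for a north-up image.

    The extent (xmin, ymin, xmax, ymax) is the map-coordinate bounding box
    of the image's outer edges; it is an input the analyst must supply.
    Values are returned in world-file line order: A, D, B, E, C, F.
    """
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    a = (xmax - xmin) / width_px     # pixel width in map units
    e = -(ymax - ymin) / height_px   # pixel height; negative because rows run downward
    c = xmin + a / 2.0               # x of the center of the upper-left pixel
    f = ymax + e / 2.0               # y of the center of the upper-left pixel
    return [a, 0.0, 0.0, e, c, f]    # rotation terms D and B are zero for north-up


def world_file_text(values):
    """Format the six values as the text of a world file, one per line."""
    return "\n".join(f"{v:.10f}" for v in values) + "\n"


# Illustrative extent only: a 1000 x 500 pixel image covering 10 x 5 map units.
print(world_file_text(world_file_values(1000, 500, 0.0, 0.0, 10.0, 5.0)))
```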
7.10.4
Baku Cartographic Factory
Baku Cartographic Factory (http://bkf.az) provides excerpts of digital atlases with information on ecology, topography, distribution of flora/fauna,
historical landmarks etc. Digital versions of atlas maps can be used to create corresponding georeferenced maps and paper versions of the same atlases may be available through university libraries in the US.
7.11 Medical and Health
7.12 Organizations
7.13 Religion
7.13.1
Ministry of Culture and Tourism of the Republic of Azerbaijan
The Ministry of Culture and Tourism of the Republic of Azerbaijan also
provides a list of 510 religious communities, officially registered by the
State Committee for Work with Religious Communities (located at
http://www.scwra.gov.az/). Data on religious communities include a
name, a description (whether it is a mosque, a church or other) and a location (usually, but not always, down to the village level). With the help of a
gazetteer, a detailed map of religious communities can be created.
7.13.2
State Committee for Work with Religious Communities
The State Committee for Work with Religious Communities
(http://www.scwra.gov.az/) has launched an interactive map of religious
institutions with such information as physical or historical description.
The map is of limited use as there are no options to download and save an
entire dataset to file.
7.14 Significant Events
7.15 Social Groups
7.16 Transportation Use
7.17 Water Supply and Control
7.17.1
Azersu OJSC
Azersu OJSC (www.azersu.az) is a company supplying drinking water and
sanitation services. For each administrative district, it provides the following information: number of served residential areas; number of served
subscribers (population and non-population); names of the water sources;
number of water reservoirs; length of pipelines and length of sewerage
network.
7.17.2 Azerbaijan Amelioration and Water Management Open Joint Stock
Company
Azerbaijan Amelioration and Water Management Open Joint Stock Company (www.mst.gov.az) lists information in Azerbaijani, English and Russian, but most of the information is in the Azerbaijani section. A
survey of the site revealed that the only data useful for conversion into a geographic format is an amelioration map of the country in .jpg format.
8
Turkey Sources
8.1
Collection of Four or More Human Geography Themes
8.1.1
MetroPOLL Strategic and Social Research Center
MetroPOLL Strategic and Social Research Center
(http://www.metropoll.com.tr/) conducts public opinion surveys on topics such as trust in the government, political attitudes, perceptions of Turkey’s problems and others. Reports are freely available and summarize the
results at the national level. The English version of the website does not
provide details regarding the sampling methodology or the availability of
microdata (the Turkish version may be more informative).
8.1.2
BiLGESAM | Wise Men Center for Strategic Studies
BiLGESAM | Wise Men Center for Strategic Studies
(http://www.bilgesam.org/en/index.php?option=com_content&view=frontpage&Itemid=1)
is a research center that addresses global problems in
relation to Turkey. Several reports based on two national surveys are
available. The reports mainly cover such topics as attitudes towards Kurds
and expectations for a new constitution.
8.1.3
Turkish Statistical Institute
Turkish Statistical Institute (http://www.turkstat.gov.tr/Start.do) is the
major provider of official statistics on numerous topics such as demography, crime, economy and others. Many statistics are available at sub-national levels, such as regions, sub-regions, provinces and districts
(http://tuikapp.tuik.gov.tr/Bolgesel/menuAction.do?dil=en).
8.1.4
Türkiye.gov.tr
Türkiye.gov.tr, the Turkish government’s site for a wide array of e-data
sources, is available at https://www.turkiye.gov.tr/. However, this site is in
Turkish. Also, accessing data through official government data sources often requires a Turkish ID number, similar to a Social Security
Number in the U.S.
8.1.5
Hurriyet Daily News
Hurriyet Daily News (http://www.hurriyetdailynews.com/) contains domestic (Turkish) as well as world news coverage. It is the oldest current
English-language daily newspaper in Turkey.
8.1.6
Posta
Posta (http://www.posta.com.tr/) is a daily Turkish newspaper covering
domestic (Turkish) and international news. It is available in Turkish.
8.1.7
General Command for Mapping
General Command for Mapping
(http://www.hgk.msb.gov.tr/english/index.php) is an organization responsible for developing maps related to astronomy, topography, cadaster,
geology and other areas. However, all maps except for the most basic ones
are for purchase only. Additionally, some maps may be available to governmental agencies only (presumably Turkish).
8.2
Communications and Media
8.2.1
ICTA (Information and Communications Technologies Authority)
ICTA (http://btk.gov.tr/) is a national communications regulatory authority. The website is in both English and Turkish, though the Turkish version
appears to be fuller. The Turkish version provides links to statistical data
and reports. None of the data are in a GIS-ready format. However, statistics
on communications are available at the province level for the years 2007-2012 in .xls format and can be joined to an appropriate administrative layer in GIS.
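A table join of this kind often fails on spelling differences between sources (for example, Turkish characters in one table and plain ASCII in another). The sketch below shows one way to normalize province names before joining. It assumes the spreadsheet has already been read into a list of records; the function names, field names, and sample values are hypothetical and not taken from the ICTA data.

```python
import unicodedata


def normalize_name(name):
    """Fold a place name to a plain lowercase form for matching."""
    # Dotless i (U+0131) has no Unicode decomposition, so map it by hand.
    name = name.replace("\u0131", "i")
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks (cedilla, breve, diaeresis, dot above).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def join_statistics(admin_units, stats_rows, name_field="province"):
    """Join statistics rows to administrative units on a normalized name.

    Returns (joined, missing): joined units carry the statistics columns,
    and missing lists the unit names that had no matching row.
    """
    stats_by_name = {normalize_name(row[name_field]): row for row in stats_rows}
    joined, missing = [], []
    for unit in admin_units:
        row = stats_by_name.get(normalize_name(unit[name_field]))
        if row is None:
            missing.append(unit[name_field])
        else:
            extra = {k: v for k, v in row.items() if k != name_field}
            joined.append({**unit, **extra})
    return joined, missing


# Illustrative records only; identifiers and values are placeholders.
units = [{"province": "İstanbul", "unit_id": "U1"},
         {"province": "Ağrı", "unit_id": "U2"}]
rows = [{"province": "ISTANBUL", "subscribers": 100}]
print(join_statistics(units, rows))
```

In a desktop GIS the same join would normally be done on an official province code where both tables carry one; name matching is a fallback for sources that publish names only.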
8.3
Demographic and Human Population Measures
8.3.1
Institute of Population Studies at Hacettepe University
Institute of Population Studies at Hacettepe University
(http://www.hips.hacettepe.edu.tr/eng/index.html) conducts research related to the demographic, social, economic, cultural and medical aspects of
population studies. It conducts the Turkish Demographic and Health Survey
every five years. Its other surveys include, most recently, the
Turkey National Maternal Mortality Study (2005), the Turkey Migration and
Internally Displaced Persons Survey (2005), and the National Research on
Domestic Violence Against Women in Turkey Survey (2008). Summary
reports are freely available and can be used to extract some georeferenced
data, but microdata need to be requested from the Institute.
8.3.2
General Directorate of Prisons and Detention Houses
General Directorate of Prisons and Detention Houses
(http://www.cte.adalet.gov.tr/) provides only general statistics by year on
the imprisoned population. The site is in Turkish only. It provides additional
links to other justice and crime-related institutions, which may be useful.
In particular, it provides a link to the Department of Probation
(http://www.cte-ds.adalet.gov.tr/), which has an interactive map featuring
the locations of probation offices and their addresses. The data are not readily
available for GIS input but can be collected from the interactive map.
8.4
Economy
8.4.1
Republic of Turkey Ministry of Development
Republic of Turkey Ministry of Development (http://www.dpt.gov.tr/) is
the former State Planning Organization (reorganized in 2011). The agency
is responsible for conducting studies, developing policies and doing other
work with regard to social, economic and cultural areas of development. It
provides statistical data in these fields at the national level (in Turkish
and English). The former State Planning Organization website is located at
http://www.devplan.org/ and may have additional data available in the form of reports. Specifically, a report on building construction and parcel
statistics provides data at the sub-national level for 2009. The report is in
PDF format and the data need to be prepared for GIS input.
8.4.2
Turkish Patent Institute
Turkish Patent Institute (http://www.turkpatent.gov.tr/;
http://www.tpe.gov.tr/) is an intellectual property organization. It provides data on patents and associated information primarily in the form of
reports. Some statistics are available in a table format (.csv) at the national
level and can easily be integrated into GIS.
8.4.3
Ministry of Labor and Social Security
Ministry of Labor and Social Security (http://www.csgb.gov.tr/) provides
reports and statistics on workers, union members, strikes and wages. Most of the information
is at the national level but some may be available at the sub-national level.
The site requires knowledge of Turkish. The data are in .pdf format,
which requires preprocessing prior to GIS input.
8.4.4
Social Security Agency of the Republic of Turkey
Social Security Agency of the Republic of Turkey (http://www.sgk.gov.tr/)
provides information in several languages, but the Turkish section is the most
informative. It features an interactive map of provinces with information
on the number of employees, the population receiving different types of social security assistance and similar information. The data are not readily available for GIS input and need to be collected from the interactive map. Additional data may be available in the form of reports and other publications.
8.4.5
General Directorate of Petroleum Affairs
General Directorate of Petroleum Affairs (http://www.pigm.gov.tr/) is a
petroleum sector agency. Turkish section of the site is the most informative. It provides information on oil and natural gas exploration on a national level. Some information may be available by a geographic region
(provinces). The data are in .xls format and can be easily used for GIS input.
8.4.6
Ministry of Environment and Urbanization, Air Quality Monitoring
Network
Ministry of Environment and Urbanization, Air Quality Monitoring Network (http://www.havaizleme.gov.tr/) provides air quality data as recorded by monitoring stations. Data are available for each station on a daily, weekly and
monthly basis. Data can be obtained from an interactive map or
generated as a report for each station. Information for each station includes longitude and latitude, so it is possible to use it to generate data in
GIS format.
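Because each station record carries its own longitude and latitude, converting a station report into point data is straightforward. The sketch below writes station records as a GeoJSON FeatureCollection. GeoJSON is used here only as one convenient GIS-readable format, not one prescribed by the data provider, and the function name, field names and sample values are hypothetical.

```python
import json


def stations_to_geojson(stations, lon_field="lon", lat_field="lat"):
    """Convert station records (dicts) into a GeoJSON FeatureCollection.

    Every field other than the two coordinate fields is carried over as a
    feature property. GeoJSON positions are ordered longitude, latitude.
    """
    features = []
    for station in stations:
        properties = {k: v for k, v in station.items()
                      if k not in (lon_field, lat_field)}
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(station[lon_field]),
                                float(station[lat_field])],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Illustrative record only; the station name, location and reading are placeholders.
example = [{"station": "Station A", "lon": 32.5, "lat": 39.9, "pm10": 41}]
print(json.dumps(stations_to_geojson(example), indent=2))
```

The resulting text can be saved with a .geojson extension and opened directly in most desktop GIS packages.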
8.4.7
General Directorate of Electrical Power Survey and Development
General Directorate of Electrical Power Survey and Development
(http://www.eie.gov.tr/) provides data on energy-related issues (e.g., consumption in kWh) at the third administrative level for 2010-2011. The information
is presented in the form of an interactive map
(http://www.eie.gov.tr/il_enerji.aspx). Additional projects include solar
energy, wind energy, and hydroelectric energy potential atlases. Data from
these atlases are presented in the form of interactive maps as well as published reports at the third administrative level. None of these data are readily available for GIS input.
8.5
Education
8.5.1
Republic of Turkey Ministry of National Education
Republic of Turkey Ministry of National Education
(http://www.meb.gov.tr/english/indexeng.htm) publishes a statistical
bulletin on educational indicators. The latest available bulletin covers the
years of 2012-2013. Most data are available on the 3rd administrative level
(provinces). The website has Turkish and English sections; the Turkish section
is the most informative.
8.5.2
Republic of Turkey Ministry of National Education, General
Directorate of Secondary Education
Republic of Turkey Ministry of National Education, General Directorate of
Secondary Education (http://ogm.meb.gov.tr/) provides statistics on secondary education schools (in Turkish only). The data are presented in a
form of an online table; by clicking on the school’s name a user can see
that school’s statistics for selected years (2010 is the most recent). The
same information is available for download in Microsoft Access database
format though the database’s macros appear to be broken and need to be
fixed in order to view the data. Assuming the data are fixed, it should be
possible to create geographic data for GIS input by using schools’ locations
(a school location appears to be represented as an address and as a province or 3rd administrative level).
8.6
Ethnicity
8.6.1
The Kurdish Institute of Paris
The Kurdish Institute of Paris (http://www.institutkurde.org/) is an independent organization supporting activities aimed at contributing to the
knowledge pool about the Kurdish community, its language and culture.
The website has links to online publications, conference proceedings and
other publications. There is no data directly suitable for GIS input but
publications and reports provide a general background and some information extracted from them may be useful for GIS.
8.7
Language
8.8
Land: Cultural Terrain
8.9
Land: Ownership
8.9.1
Deed Inquiry: Inquiry TAKBIS Land Registry and Cadastre, Land
Purchase Event
Deed Inquiry (http://www.takbis.org/) provides data on land ownership
in Turkey. It seems geared more toward providing information to potential
land purchasers. As with many Turkish data sources, access to the search
features requires payment of a fee and/or a Turkish ID.
Land ownership data in Turkey are also incomplete. However, with funding
support from the World Bank, Turkey has undertaken a project that will
eventually map all land parcels and their ownership.
8.9.2
General Directorate of Land Registry and Cadastre
General Directorate of Land Registry and Cadastre
(http://www.tkgm.gov.tr/) provides data on land ownership in Turkey. As
with many Turkish data sources, access to the search features requires
payment of a fee and/or a Turkish ID. Land ownership data
in Turkey are also incomplete. However, with funding support from the
World Bank, Turkey has undertaken a project that will eventually map all
land parcels and their ownership.
8.10 Land: Use and Cover
8.10.1
Ministry of Forestry and Water Affairs of Republic of Turkey
Ministry of Forestry and Water Affairs of Republic of Turkey
(http://cbs.ormansu.gov.tr/) has launched a geoportal, which allows users to
obtain a variety of forestry-related data. Presumably, the data can be
downloaded in ESRI shapefile, raster and/or geodatabase format. However, the site requires a working knowledge of Turkish (the English version is
less complete and many of its links are broken). In September 2013, several attempts to download data from the geoportal were unsuccessful.
It is not clear whether the requests did not go through, whether the geoportal
is not yet fully operational, or whether it is necessary to become a registered
user first.
8.10.2
General Directorate of Combating Desertification and Erosion
General Directorate of Combating Desertification and Erosion
(http://www.cem.gov.tr/) is a part of the Ministry of Forestry and Water
Affairs. The agency is dedicated to soil conservation, flood control, and the protection and development of natural resources. The website is in Turkish
only. It provides data in the form of reports and graphs, many of which are
at the national level. Some data may be available at the sub-national level and
need to be extracted from the reports. No GIS-ready data are available;
any data extracted from the reports need to be converted into a proper
format.
8.10.3
General Directorate of Forestry
General Directorate of Forestry (http://web.ogm.gov.tr/) provides statistics on forest conditions, forest fires and similar topics in the form of published reports at the national level. Some data may be available at the sub-national
level, but they need to be extracted from the published reports and prepared
for GIS input. An English version of the site is available but it is less
complete than the Turkish version.
8.11 Medical and Health
8.11.1
Institute of Population Studies at Hacettepe University
Institute of Population Studies at Hacettepe University
(http://www.hips.hacettepe.edu.tr/eng/index.html) conducts research related to the demographic, social, economic, cultural and medical aspects of
population studies. It conducts the Turkish Demographic and Health Survey
every five years. Its other surveys include, most recently, the
Turkey National Maternal Mortality Study (2005), the Turkey Migration and
Internally Displaced Persons Survey (2005), and the National Research on
Domestic Violence Against Women in Turkey Survey (2008). Summary
reports at the national level are freely available and can be used to extract
some georeferenced data, but microdata need to be requested from the Institute.
8.11.2
Northern Cyprus Ministry of Health of the Republic of Turkey
Northern Cyprus Ministry of Health of the Republic of Turkey
(http://www.saglikbakanligi.com/) provides information in Turkish only.
Additionally, the way the website is set up makes it difficult to use Google
Translate services. The website provides health-related statistics from
2002 through 2012. The data for each year comes in different formats
(e.g., published reports and .xls/.csv files).
8.12 Organizations
8.13 Religion
8.13.1
The Presidency of Religious Affairs
The Presidency of Religious Affairs (http://www.diyanet.gov.tr/) is a
branch of government responsible for regulating religious services. Only
a Turkish version of the website is available. It provides several statistical
tables (primarily at the national level, though some cover the third
administrative level) on the number of religious organizations, people attending religious schools, the number of religious and non-religious personnel,
and others.
8.14 Significant Events
8.14.1
The Official Gazette of Turkey
The Official Gazette of Turkey (http://www.resmigazete.gov.tr/) documents the actions of government, such as legislation, treaties, executive
orders, judiciary decisions and other official announcements. These data
mostly cover actions of the executive branch of the government. Editions
are available back to 1921, and the website has a search feature.
Use of the website requires a working knowledge of Turkish.
8.14.2
The Grand National Assembly of Turkey
The Grand National Assembly of Turkey (http://www.tbmm.gov.tr/) provides documentation of its activities. There is some National Assembly data as far back as 1920. Much of the data consists of pdf files, in Turkish, of
Assembly meetings, similar to the Congressional Record of the United
States.
8.14.3
Indiegogo
A protest movement in June of 2013 in Turkey against demolition of Gezi
Park in Istanbul transformed into a movement against the ruling AK Party
that united Turks across society. A documentary film
(http://www.indiegogo.com/projects/istanbul-united-the-movie) is in
progress that examines how this political movement even united football
fans from three competing teams. The documentary seeks to examine how
the political situation in Turkey has made it possible for “ultra” football
fans to put aside their differences in sports to work together for political
change. This is an unusual data source but potentially insightful.
8.14.4
BELGEnet
BELGEnet (http://www.belgenet.net/) provides data on elections by district since 1954. The latest dataset dates to 2007.
8.15 Social Groups
8.16 Transportation Use
8.17 Water Supply and Control
8.17.1
State Hydraulic Works
State Hydraulic Works () is a state agency responsible for Turkey’s water
resources. The website is in Turkish only. It appears that some basic statistics are available (e.g., the amount of groundwater at a given reservoir).
Additionally, an interactive map related to surface water is available at
http://rasatlar.dsi.gov.tr/. It does not appear that any of the data can be
easily downloaded and integrated into GIS.
9
Conclusions
Developing countries often lack an internal spatial data infrastructure, that is,
a coordinated framework of data, metadata, users and tools for creating and
using spatial data. In some cases, developing countries choose not to disclose their data, especially at finer geographic levels. Open-source data for Azerbaijan and Turkey are often not available below the district level (third administrative level).
However, Azerbaijan and Turkey are actively working on improving their
digital mapping capabilities. Spatial crowd-sourced data from projects like
OpenStreetMap is becoming a valuable tool in geospatial research in order
to quickly assess on-the-ground conditions; however, the potential flaws of
such data must be understood and a hybrid approach to data integration
must be undertaken.
Statistical agencies of Azerbaijan and Turkey publish a variety of social
and economic data though it must be kept in mind that many developing
countries lack a certain degree of transparency and may alter statistical
numbers for political purposes. For these reasons, it is useful to additionally consult independent sources when possible. Occasionally, data of high
quality may be obtained from independent researchers and scholars,
though this will require establishing working relationships with these individuals and/or institutions.