
Assessing the feasibility of
micro-data access
Atle Alvheim
Assistant Director
Norwegian Social Science Data Services
Luxembourg 26 - 27 October 2006
Norwegian Social Science Data Services (NSD)
www.nsd.uib.no
nsd@nsd.uib.no
There is a lack of “Tools for thought”
“Much more time went into finding or obtaining information than into digesting it”
Dr. J.C.R. Licklider
Maximize: (time spent on digesting and thinking) / (time spent on finding and accessing)
Situation today
• Only a fraction of data resources available on-line
• Lack of standardization
• Poor integration between data and metadata
• Institutional, legal, and commercial obstacles
Situation tomorrow?
• All empirical data available on-line
• An integrated gateway to be used to integrate and locate relevant resources
• The ability to browse, visualize, and analyze data on-line
• Hyperlinks from data to relevant scientific publications and resources
• Empirical feedback system to build the collective memory of a data collection
What are we looking for?
Data
Data
Sharing
Metadata
Tools
Why are metadata important?
Unlabeled stuff
Labeled stuff
The bean example is taken from: A Manager’s
Introduction to Adobe eXtensible Metadata Platform,
http://www.adobe.com/products/xmp/pdfs/whitepaper.pd
The functions of metadata
Finding
Sharing
Understanding
Assessing
Data Documentation Initiative (DDI)
An international XML-based standard for the content,
presentation, transport, and preservation of documentation
Benefits of the DDI Approach
• Interoperability: Codebooks can be exchanged and transported seamlessly, and applications can be written to work with these homogeneous documents.
• Richer content: Providing the data analyst with broader knowledge about a given collection.
• Single document - multiple purposes: The codebook contains all of the information necessary to produce several different types of output.
• On-line subsetting and analysis: DDI documents are easily imported into online analysis systems, rendering datasets more readily usable for a wider audience.
• Precision in searching: Field-specific searches across documents and studies are enabled.
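The field-specific searching mentioned above can be illustrated with a minimal Python sketch. It assumes two tiny, invented codebook fragments whose element names (codeBook, stdyDscr, titl, dataDscr, var, labl) follow DDI Codebook conventions but which are heavily simplified and not schema-valid. The function name search_variable_labels is hypothetical and not part of any DDI tool.

```python
import xml.etree.ElementTree as ET

# Invented, simplified codebooks (assumption: real DDI files are namespaced
# and far richer; these fragments only mimic the element names).
CODEBOOKS = {
    "study_a": """
<codeBook>
  <stdyDscr><citation><titlStmt>
    <titl>Hypothetical Survey A</titl>
  </titlStmt></citation></stdyDscr>
  <dataDscr>
    <var name="imm1"><labl>Attitude towards immigrants</labl></var>
    <var name="age"><labl>Age of respondent</labl></var>
  </dataDscr>
</codeBook>""",
    "study_b": """
<codeBook>
  <stdyDscr><citation><titlStmt>
    <titl>Hypothetical Study of Immigrants B</titl>
  </titlStmt></citation></stdyDscr>
  <dataDscr>
    <var name="trust"><labl>Trust in parliament</labl></var>
  </dataDscr>
</codeBook>""",
}


def search_variable_labels(codebooks, term):
    """Search only the variable-label field across all codebooks.

    Returns a list of (study title, variable name, label) tuples.
    """
    needle = term.lower()
    hits = []
    for source in codebooks.values():
        root = ET.fromstring(source.strip())
        title = root.findtext("stdyDscr/citation/titlStmt/titl", default="")
        for var in root.iterfind("dataDscr/var"):
            label = var.findtext("labl", default="")
            if needle in label.lower():
                hits.append((title, var.get("name"), label))
    return hits


for hit in search_variable_labels(CODEBOOKS, "immigrants"):
    print(hit)
```

Because the search is restricted to variable labels, the second study is not returned even though its title contains the search term. A plain full-text search could not make that distinction.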
A life-cycle model of data
[Figure: Combined life cycle model. Stages: Study Concept, Data Collection, Data Processing, Data Archiving, Data Distribution, Data Discovery, Data Analysis, Repurposing]
A common European data portal
• Metadata is all about communication
• Madiera: A set of tools, + an idea:
Data is a kernel that facilitates a “discussion”
• Maybe future libraries consist of datasets with
linked or derived knowledge-products: books,
papers, tables, wikis, etc.
• Could we imagine libraries of hypotheses?
• Libraries of questions and discussions more
than of answers?
What were the specified MADIERA objectives?
• The development of an integrated and effective distributed
social science portal to facilitate access to a range of data
archives and disparate resources (WP3)
• The development of specific add-ons to existing virtual data
library technologies, in particular data location technology
(WP4)
• The employment of a multi-lingual thesaurus to break the
language barriers to the discovery of key resources (WP5)
• An extensive programme to add content, both at the
data/information and knowledge levels (WP6)
• Extensive training of data providers and users to encourage
the continuous growth of the infrastructure (WP7)
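How a multi-lingual thesaurus can break language barriers in discovery might be sketched as follows. This is only an illustration under stated assumptions: the concept identifiers, the three-language coverage, and the function name expand_query are invented here and do not describe the thesaurus actually used in the project.

```python
# Invented mini-thesaurus: one entry per concept, one preferred term
# per language (assumption: a real thesaurus has many more languages,
# synonyms and hierarchical relations).
THESAURUS = {
    "C001": {"en": "immigration", "no": "innvandring", "de": "Einwanderung"},
    "C002": {"en": "unemployment", "no": "arbeidsledighet", "de": "Arbeitslosigkeit"},
}


def expand_query(term, thesaurus=THESAURUS):
    """Map a search term in any covered language to all language variants.

    Returns the term unchanged (as a one-element list) if no concept matches.
    """
    needle = term.casefold()
    for labels in thesaurus.values():
        if needle in (label.casefold() for label in labels.values()):
            return sorted(labels.values(), key=str.casefold)
    return [term]


print(expand_query("innvandring"))
```

A researcher searching in Norwegian would in this way also reach studies documented only in English or German, because the portal searches for every variant of the matched concept.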
A Web of the Social Sciences
• Building on a distributed model where data and resources are stored and maintained locally
• For the end user the system will appear as an integrated system
• A virtual data library offering global access to locally supported data holdings
[Diagram labels: End users, Data providers]
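The distributed model can be sketched in a few lines of Python: each archive keeps and searches its own holdings, and the portal only forwards the query and merges the answers. The class names, the keyword-based search, and all study titles are assumptions made for illustration. The archive acronyms are borrowed from the slides, but the holdings shown are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    title: str
    keywords: frozenset


class LocalArchive:
    """Stands in for one archive's locally maintained server."""

    def __init__(self, name, studies):
        self.name = name
        self._studies = list(studies)

    def search(self, keyword):
        kw = keyword.lower()
        return [s for s in self._studies if kw in s.keywords]


class Portal:
    """Single entry point that queries every archive and merges the results."""

    def __init__(self, archives):
        self.archives = list(archives)

    def search(self, keyword):
        results = []
        for archive in self.archives:
            for study in archive.search(keyword):
                results.append((archive.name, study.title))
        return results


# Invented holdings, for illustration only.
portal = Portal([
    LocalArchive("NSD", [Study("Hypothetical election study",
                               frozenset({"voting", "immigration"}))]),
    LocalArchive("FSD", [Study("Hypothetical welfare survey",
                               frozenset({"welfare"}))]),
    LocalArchive("UKDA", [Study("Hypothetical attitudes survey",
                                frozenset({"immigration"}))]),
])

for archive_name, title in portal.search("Immigration"):
    print(archive_name, "-", title)
```

The end user sees one result list, while each data provider remains responsible for its own holdings. A real portal would query remote servers over the network rather than in-memory objects.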
What, then, is necessary to develop useful procedures?
• Metadata standards lift data from digits to research information
• Technical solutions, software: information and access systems, in addition to analysis and download possibilities
• Political agreements, conditions for data access
• Economic agreements, logging, audits
EXAMPLE: A common resource
European Social Survey (ESS)
europeansocialsurvey.org
ess.nsd.uib.no
An academically driven social survey designed to
chart and explain the interaction between Europe's
changing institutions and the attitudes, beliefs and
behaviour patterns of its diverse populations.
ANOTHER EXAMPLE: Aggregate data
The determinants of active civic participation
at European and national level (CIVICACTIVE)
nsd.uib.no/civicactive
A third example: a common entry-point
madiera.net
The MADIERA project has developed an effective
infrastructure for the European social science
community by integrating data with other tools,
resources and products of the research process.
A scheme
[Diagram: A Finnish researcher, a Swiss researcher, an Irish researcher]
• A registration procedure: register with home archive
• Look up and access data across holdings
[Diagram: Data on Finland (a geographic area); Eurobarometer (a data collection); Attitudes towards immigrants (a problem area)]
A “Data-archive Political Context” for 20+ national archives
I. There might be money involved
• Is the data a free or commercial good?
• There are categories of users: what about non-academic use, non-CESSDA use?
• Who is to fix the prices?
II. Varying access rules. The crossing of national borders
• What laws apply? Who sets the rules?
• Who is responsible? What sanctions are available?
III. There are some “Common good” data
• Eurobarometers, Value studies, ISSP, ESS, Comparative collections
• Could best be provided from one single point (?)
• Charging? Access conditions? Double storage?
IV. It is a good thing to have national archives: it enhances the amount of data available and improves accessibility.
• Need justification and visibility
Madiera: A common portal for all of Europe, ++
Portal functionality:
• Link many local servers
• Search and browse possibilities
• Translation possibilities (ELSST)
__________________
Local servers: NSD, FSD, SSD, ZA, DANS, DDA, UKDA
• Standardised software and standardised documentation: all use the “NESSTAR Publisher” / DDI
• Politics: Coordinated access rules
[Diagram labels: Politics, Money, Download]