Luxembourg Income Study (LIS) asbl data access Thierry Kruten

Luxembourg Income Study (LIS) asbl
17, rue des Pommiers
L-2343 Luxembourg-City
Tel: +(352) 26 00 30 20
Fax: +(352) 26 00 30 30
Email: kruten@lisproject.org
Web: www.lisproject.org
Thierry Kruten
Luxembourg, 26-27 October 2006
OECD Conference: Assessing the feasibility of microdata access
LISSY - A Remote Access System
What for?
• Testing the effectiveness of existing policies and setting new ones requires
individual-level, possibly sensitive, data, which have to be gathered, managed
and disseminated
• A remote access system lets researchers analyze microdata from a remote
location by submitting statistical queries. The microdata are not physically
distributed and remain under the control of the agency, while the users of the
system can perform analyses at their own place of work
The challenge
• The trade-off between a user-friendly system and a safe system is clearly
present for remote access systems. Disclosure control limits the detail of the
output
confidentiality, user-friendliness and feasibility
Confidentiality
• Depends on the data providers’ requirements
• Queries are restricted based on the output they produce. They are filtered
either automatically or manually (review queue), query by query, and a query
history database is maintained (log files)
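The filtering and logging described above can be sketched roughly as follows. This is an illustration only: the slides do not say which criteria LISSY applies, so output length is used here as a stand-in rule, and `MAX_OUTPUT_LINES`, `route_output` and `log_query` are hypothetical names.

```python
import io
import json
import time

# Hypothetical threshold: very long output may be a listing of individual records.
MAX_OUTPUT_LINES = 2000


def route_output(output, max_lines=MAX_OUTPUT_LINES):
    """Return "release" for ordinary output, "review" for output needing manual review."""
    if len(output.splitlines()) > max_lines:
        return "review"
    return "release"


def log_query(stream, user_id, syntax, decision):
    """Append one JSON record per query to a writable text stream (the query history)."""
    record = {
        "time": time.time(),
        "user": user_id,
        "syntax": syntax,
        "decision": decision,
    }
    stream.write(json.dumps(record) + "\n")
    return record


# Usage: an in-memory stream stands in for the log file.
log = io.StringIO()
decision = route_output("mean income = 123\n")
log_query(log, "jdoe", "summarize income", decision)
```

An append-only log with one record per query makes it possible to review a user's query history as a whole, not only each query in isolation.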
User-friendliness
• Oriented towards users, mainly academic researchers
• LIS users are mainly interested in more advanced statistical analyses and like
to use specific software packages
• A relevant aspect of the system is speed. LISSY outputs are returned within a
couple of minutes and the system is available 24 hours a day, 7 days a week
(geographical constraint)
Feasibility
• The most important aspect of feasibility is the level of security that can be
attained (IT audit)
• A technological answer, through the implementation of a secure architecture
(firewall, electronic authorization via password, restrictions on some IT
formats such as HTTP encapsulation, etc.)
• But also the capacity to maintain and administer the system.
The automatic or manual handling and evaluation of queries determines the
labour intensiveness of the remote access facility. Other tasks, such as the
management of the user database (updates, pledges) or the technical
administration of the system (back-ups, technical maintenance, numerous
statistics on system usage), are also labour intensive
• These aspects are all directly tied to budgetary constraints: together they are
one of the most important items of the LIS budget
Confidentiality policy
• Use of the data is restricted to social science research purposes only.
• Access is limited to researchers working for an academic, government or
non-profit organization.
• Users must register with LIS and sign a pledge to obey the rules
governing the use of the data.
• Registered users receive a "user account" and "password" that are strictly
personal. Therefore, listings generated from user requests are returned
ONLY to the email address that the user registered when applying for
access.
• Under no circumstances shall a registered user make any attempt to
locate or list survey information in order to identify individual persons.
• No direct access of any kind is permitted to the data or the LIS network.
Users agree not to attempt in any way to copy individual records by
listing them in program output or in any other fashion.
Implementation
How Lissy works
• The operational system consists of a series of
software components that work together to
receive, process and return statistical requests
• Users submit their statistical requests in the form
of SAS, SPSS or STATA programs to LIS by
Internet email
• The email requests contain the syntax written by the
user for the specific statistical package used and a
standardized header identifying the user
* user = (your user id)
* password = (your password)
* package = (SPSS, SAS or STATA)
* project = (database used)
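As an illustration of how such a header could be separated from the statistical syntax, here is a minimal sketch. It assumes the four header lines come first in the email body and follow the `* key = value` layout shown above; `parse_request` and the sample values are hypothetical, not LISSY's actual implementation.

```python
import re

REQUIRED_FIELDS = ("user", "password", "package", "project")
ALLOWED_PACKAGES = {"SPSS", "SAS", "STATA"}

# Header lines look like "* user = jdoe"; surrounding whitespace is ignored.
HEADER_LINE = re.compile(r"^\*\s*(\w+)\s*=\s*(.*?)\s*$")


def parse_request(body):
    """Split an email body into (header dict, statistical syntax)."""
    header = {}
    lines = body.strip().splitlines()
    index = 0
    while index < len(lines):
        match = HEADER_LINE.match(lines[index])
        # Stop at the first line that is not one of the four header fields,
        # so that ordinary "*" comment lines in the syntax are left alone.
        if match is None or match.group(1).lower() not in REQUIRED_FIELDS:
            break
        header[match.group(1).lower()] = match.group(2)
        index += 1
    missing = [field for field in REQUIRED_FIELDS if field not in header]
    if missing:
        raise ValueError("missing header fields: " + ", ".join(missing))
    package = header["package"].upper()
    if package not in ALLOWED_PACKAGES:
        raise ValueError("unsupported package: " + header["package"])
    header["package"] = package
    return header, "\n".join(lines[index:])


# Usage with made-up credentials and syntax.
sample = (
    "* user = jdoe\n"
    "* password = secret\n"
    "* package = stata\n"
    "* project = LIS\n"
    "summarize income\n"
)
header, syntax = parse_request(sample)
```

Rejecting a request with a missing or unknown field before any processing keeps malformed jobs away from the batch processors.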
• The heart of the system is the job control component
(Post Office). It manages the entire access mechanism.
  - It retrieves the email requests from the mail server
  - It prepares these requests for processing by checking all security
    issues: clearly identifying the user, checking for the use of illegal
    statistical commands, and checking for sequences of commands,
    variables or any other combinations that are not allowed
  - It returns any job that breaches security to the sender along with an
    error message explaining the violation
  - It distributes the requests to the batch processor computers
  - It returns the statistical results to the proper (registered) user email
    addresses
  - It sends suspicious output to the review queue for manual review
    instead of returning results to the user
  - And finally it maintains critical databases needed for the overall
    operation.
All the components of the system are physically separated and at no
moment is a user in direct contact with the data
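The pre-processing checks performed by the job control component might look roughly like the sketch below. Everything in it is an assumption made for illustration: the slides do not list the illegal commands, so `FORBIDDEN_COMMANDS` is a made-up rule set, and the `registry` dictionary (with plain-text passwords, for brevity only) stands in for the user database.

```python
import hmac
import re
from dataclasses import dataclass

# Hypothetical rule set: commands that could list or export individual records.
FORBIDDEN_COMMANDS = {
    "STATA": ("list", "browse", "save", "outfile"),
    "SAS": ("proc print", "proc export"),
    "SPSS": ("list", "save outfile"),
}


@dataclass
class Decision:
    action: str  # "run" or "reject"
    reason: str  # explanation sent back to the user on rejection


def check_request(user_id, password, package, syntax, registry):
    """Identify the user, then scan the syntax for forbidden commands."""
    account = registry.get(user_id)
    if account is None or not hmac.compare_digest(
        account["password"].encode(), password.encode()
    ):
        return Decision("reject", "unknown user or wrong password")
    if package not in FORBIDDEN_COMMANDS:
        return Decision("reject", "unsupported package: " + package)
    for command in FORBIDDEN_COMMANDS[package]:
        # Match the command only at the start of a line, ignoring case.
        pattern = r"^\s*" + re.escape(command) + r"\b"
        if re.search(pattern, syntax, flags=re.IGNORECASE | re.MULTILINE):
            return Decision("reject", "illegal command: " + command)
    return Decision("run", "ok")


def recipient_for(user_id, registry):
    """Results go to the registered address, never to the sender of the email."""
    return registry[user_id]["email"]


# Usage with a made-up registry entry.
registry = {"jdoe": {"password": "secret", "email": "jdoe@example.org"}}
ok = check_request("jdoe", "secret", "STATA", "summarize income\n", registry)
bad = check_request("jdoe", "secret", "STATA", "use data\n  list in 1/10\n", registry)
```

A simple start-of-line match like this would miss command abbreviations and commands hidden inside macros or loops, which is one reason a manual review queue for suspicious output remains useful alongside automatic checks.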
Luxembourg Income Study
Any questions are welcome …
“When you can measure what you are speaking about and express it in numbers
you know something about it. But when you cannot measure it or express it in
numbers, your knowledge is of a meager and unsatisfactory kind.”
Sir William Thomson, Lord Kelvin
British mathematician and physicist (Belfast, 1824 - Netherhall, 1907)
Luxembourg Income Study (LIS)
A variety of support services ….
• LIS workshops (Work Hard - Play Hard!)
  - LIS conducts annual 10-day training workshops for pre- and post-doctoral
    researchers, designed to introduce young scholars to comparative research in
    income distribution, poverty, and labour market outcomes using the LIS
    database
  - Courses combine lectures with assistance and guidance in using
    the LIS database to explore research issues chosen by the participants
• “Visiting scholars program”
  - Grants to the LIS data archives of micro-data and to the relevant data
    documentation are offered. The LIS staff is available for consultation,
    assistance and possible collaboration
  - Direct on-site access will be allowed for datasets whose providers have
    given us their consent for such access