Luxembourg Income Study (LIS) asbl
17, rue des Pommiers
L-2343 Luxembourg City
Tel.: +(352) 26 00 30 20
Fax: +(352) 26 00 30 30
Email: kruten@lisproject.org
Web: www.lisproject.org

Thierry Kruten
Luxembourg, 26-27 October 2006
OECD Conference: Assessing the feasibility of microdata access

LISSY - A Remote Access System

What for?
• Testing the effectiveness of existing policies and setting new ones requires individual-level, possibly sensitive data, which have to be gathered, managed and disseminated.
• A remote access system lets researchers analyze microdata from a remote location by submitting statistical queries. The microdata are not physically distributed and remain under the control of the agency, while users of the system can perform analyses at their own place of work.

The challenge
• The trade-off between a user-friendly system and a safe system is clearly present in remote access systems. Disclosure control limits the detail of the output.

Confidentiality, user-friendliness and feasibility

Confidentiality
• Depends on the data providers' requirements.
• Queries are restricted on the basis of the user output. They are filtered either automatically or manually (review queue), query by query, and a query history database has been set up and is maintained (log files).

User-friendliness
• Oriented towards users, mainly academic researchers.
• LIS users are mainly interested in more advanced statistical analyses and like to use specific software packages.
• Speed is a relevant aspect of the system. LISSY outputs are returned within a couple of minutes, and the system is available 24 hours a day, 7 days a week.
(geographical constraint)

Feasibility
• The most important aspect of feasibility is the level of security that can be attained (IT audit).
• A technological answer, through the implementation of a secure architecture (firewall, electronic authorization via password, restrictions on some IT formats such as HTTP encapsulations, etc.).
• But also a capacity for the maintenance and administration of the system. Whether queries are handled and evaluated automatically or manually determines the labour intensiveness of the remote access facility. Other tasks, such as the management of the users database (update, pledge) or the technical administration of the system (back-up, technical maintenance, numerous statistics concerning system usage), are also labour intensive.
• These aspects are all directly tied to budgetary constraints: they are among the most important items of the LIS budget.

Confidentiality policy
• Use of the data is restricted to social science research purposes only.
• Access is limited to researchers working for an academic, government or nonprofit organization.
• Users must register with LIS and sign a pledge to obey the rules governing the use of the data.
• Registered users receive a "user account" and "password" that are strictly personal. Therefore, listings generated from user requests will be returned ONLY to the email address that was registered by the user when applying for access.
• Under no circumstances shall a registered user make any attempt to locate and list any survey information in order to identify individual persons.
• No direct access of any kind is permitted to the data or the LIS network.
• Users agree not to attempt in any way to copy individual records, whether by listing them in program output or in any other fashion.

Implementation

How Lissy works
• The operating system consists of a series of software components which work together to receive, process and return statistical requests.
• Users submit their statistical requests in the form of SAS, SPSS or STATA programs to LIS via the Internet mail system.
• The email requests contain the syntax created by the user for the specific statistical package used, and a standardized header identifying the user:
* user = (your user id)
* password = (your password)
* package = (SPSS, SAS or STATA)
* project = (database used)
• The heart of the system is the job control component (Post Office). It manages the entire access mechanism:
• It retrieves the email requests from the mail server.
• It prepares these requests for processing by checking for all security issues: clearly identifying the user, checking for the use of illegal statistical commands, and checking for sequences of commands or variables, or any other combinations, that are not allowed.
• It returns any job that breaches security to the sender, along with an error message explaining the violation.
• It distributes the requests to the batch processor computers.
• It returns the statistical results to the proper (registered) user email addresses.
• It sends suspicious output to the review queue for manual review instead of returning results to the user.
• Finally, it maintains the critical databases needed for the overall operation.
• All the components of the system are physically separated, and at no moment is a user in direct contact with the data.

Luxembourg Income Study
Any questions are welcome …

"When you can measure what you are speaking about and express it in numbers you know something about it. But when you cannot measure it or express it in numbers, your knowledge is of a meager and unsatisfactory kind."
Lord Kelvin (Sir William Thomson), British mathematician and physicist (Belfast, 1824 - Netherhall, 1907)

Luxembourg Income Study (LIS)
A variety of support services …

• LIS workshops (Work Hard - Play Hard!)
• LIS conducts annual training workshops: a 10-day pre- and post-doctoral workshop designed to introduce young scholars to comparative research in income distribution, poverty, and labour market outcomes using the LIS database.
• Courses include a mixture of lectures, and assistance and direction in using the LIS database to explore research issues chosen by the participants.
• "Visiting scholars program"
• Grants to the LIS data archives of microdata and to the relevant data documentation are offered. The LIS staff is available for consultation, assistance and possible collaboration.
• Direct on-site access will be allowed for datasets whose providers have given us their consent for such access.
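
Illustrative sketch (not the LISSY implementation). The standardized request header and the job control checks described under "How Lissy works" can be pictured with a few lines of Python. This is a minimal sketch under stated assumptions: the function names `parse_request` and `screen_request`, the contents of `FORBIDDEN_COMMANDS`, the user id `jdoe` and the sample Stata lines are all hypothetical and chosen only for illustration. The slides do not say which commands LISSY blocks or how it parses mail.

```python
import re

# The four header fields and the three packages come from the slides.
REQUIRED_FIELDS = ("user", "password", "package", "project")
ALLOWED_PACKAGES = {"SAS", "SPSS", "STATA"}

# ASSUMPTION: a made-up blocklist; the real list of illegal commands is not given.
FORBIDDEN_COMMANDS = {"list", "print", "export"}

# Matches header lines of the form "* key = value".
HEADER_LINE = re.compile(r"^\*\s*(\w+)\s*=\s*(.*)$")


def parse_request(body):
    """Split an email body into (header dict, statistical syntax).

    Only the leading "* key = value" lines are treated as header;
    the first non-blank line that does not match ends the header.
    """
    header = {}
    lines = body.splitlines()
    pos = 0
    while pos < len(lines):
        stripped = lines[pos].strip()
        match = HEADER_LINE.match(stripped)
        if match:
            header[match.group(1).lower()] = match.group(2).strip()
        elif stripped:
            break
        pos += 1
    return header, "\n".join(lines[pos:])


def screen_request(header, syntax):
    """Return an error message if the job must be rejected, else None."""
    missing = [name for name in REQUIRED_FIELDS if not header.get(name)]
    if missing:
        return "missing header field(s): " + ", ".join(missing)
    if header["package"].upper() not in ALLOWED_PACKAGES:
        return "unsupported package: " + header["package"]
    for line in syntax.splitlines():
        words = line.split()
        # Simplification: only the first word of each line is inspected.
        if words and words[0].lower() in FORBIDDEN_COMMANDS:
            return "command not allowed: " + words[0]
    return None


example = """* user = jdoe
* password = secret
* package = STATA
* project = LIS

summarize dpi
list hid dpi
"""

header, syntax = parse_request(example)
print(screen_request(header, syntax))  # prints: command not allowed: list
```

In this sketch a rejected job yields an error message that could be mailed back to the sender, while `None` means the request could be passed on to a batch processor. A real screen would also need to authenticate the user and catch disallowed sequences of commands or variables, which a first-word check cannot do.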