Orange - Big Data Paris 2015

Data for Development - D4D
Challenge
STRICTLY CONFIDENTIAL
As part of our approach on Telco Data, we are developing
pilot initiatives on Open Data
Internal use (CORE BUSINESS): Orange to use data to better serve our clients and run our operations more efficiently
B2C (USER CENTRIC SERVICES): offers to empower users to use their own data
B2B (DATA BASED SERVICES): offer datasets and services to 3rd parties, while controlling the use of data
Open Data: give 3rd parties access to datasets without controlling the use of data
Orange in Côte d’Ivoire
Data 4 Development – an Open Data Challenge from
Orange
 Data 4 Development is an Open Data challenge from Orange to the world
research community, to help the development of Côte d’Ivoire society
─ Exploit network signalling data from Orange Ivory Coast
─ Address poverty, medical help, food and water distribution, spread of diseases,
traffic congestion, early signals of crisis, …
─ Improve society, support NGOs and better public policies
 In collaboration with the University of Louvain (UCL) and MIT for the
evaluation of results:
─ Criteria: Scientific approach, Societal impact and Data Visualisation
─ Evaluation Committee with members from: Bouake University, Global Pulse
(United Nations), GSMA, MIT, Orange R&D, UCL, World Economic Forum
 Launched June 2012, submissions mid Feb 2013, results announced at the
NetMob conference at MIT on 1st of May 2013
─ A “world premiere” by Orange…
The D4D Committee will evaluate submissions and
offer 4 awards (see www.D4D.orange.com)
We strongly anonymised data, going further than today’s
local requirements, aiming at defining an Industry Best
Practice
4 large data sets, highly anonymised, were made
accessible to the research teams, after a project
submission process
 4 data sets, based on 5 months of CDRs (Call Detail Records), from Dec 2011 to April 2012
─ Dynamic population densities by volumes of all calls per hour between all
antennas
─ Geographic mobility of anonymous client samples over the whole period
  ─ Coarse trajectories (prefecture level) of a 10% client sample
  ─ Fine trajectories (blurred antenna level) of a biweekly 10% client sample
─ 2-degree social communication graphs of an anonymous week-by-week 1%
random client sample
 Highly anonymised, with pre-release tests run to further reduce privacy risks
─ The data sets were provided to 2 “Friendly Test teams” who tried to “Hack” them
─ University Pierre et Marie Curie, University of Cambridge
─ Results of the attempts were taken into account to strengthen the anonymisation
 Controlled access to the data set ensured alignment with the project goals
─ Release of data sets to researchers required a project submission to the D4D
Challenge and the signature of Terms & Conditions controlling the use of data
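To make the data set descriptions above more concrete, here is a minimal Python sketch of two of the aggregations they imply: hourly call volumes between antenna pairs, and a 2-degree communication graph around one client. It is an illustration only, not Orange's actual processing or anonymisation pipeline. The record layout (caller, callee, origin antenna, destination antenna, timestamp), the sample rows and the function names `hourly_antenna_volumes` and `ego_network` are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical CDR rows (assumed layout, not the real D4D schema):
# (caller_id, callee_id, origin_antenna, destination_antenna, ISO timestamp)
cdrs = [
    ("u1", "u2", 10, 20, "2011-12-05T08:15:00"),
    ("u3", "u2", 10, 20, "2011-12-05T08:40:00"),
    ("u2", "u4", 20, 30, "2011-12-05T09:05:00"),
    ("u1", "u3", 10, 10, "2011-12-05T09:30:00"),
]

def hourly_antenna_volumes(rows):
    """Count calls per (hour, origin antenna, destination antenna).

    Client identifiers are ignored, so the output holds only aggregates.
    """
    volumes = Counter()
    for _caller, _callee, src, dst, ts in rows:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        volumes[(hour.isoformat(), src, dst)] += 1
    return volumes

def ego_network(rows, ego, degrees=2):
    """Return the clients reachable from `ego` within `degrees` communication hops."""
    neighbours = defaultdict(set)
    for caller, callee, *_ in rows:
        # Treat each call as an undirected communication link.
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
    reached, frontier = {ego}, {ego}
    for _ in range(degrees):
        # Expand one hop, keeping only clients not seen before.
        frontier = {n for u in frontier for n in neighbours[u]} - reached
        reached |= frontier
    return reached

volumes = hourly_antenna_volumes(cdrs)
for key, count in sorted(volumes.items()):
    print(key, count)
print(sorted(ego_network(cdrs, "u4", degrees=2)))
```

Aggregating per antenna pair and per hour removes individual identifiers from the first data set. The sketch does not attempt the sampling, blurring or re-identification tests mentioned on this slide.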
We received over 260 requests, from all over the
world…
Northwestern University
Columbia University
University of Cambridge
University of Cambridge
University of Illinois at Chicago
Harvard Medical School & Boston Children's Hospital
University of California, San Diego
City University of Hong Kong
University of San Francisco
Aalto University, Finland
University College London
Technische Universität Berlin
Bell Labs
Beijing Jiaotong University
University College Dublin
AT&T Labs
Freie Universität Berlin
Universidade da Beira Interior
MIT
FEUP Portugal
Northwestern Polytechnical University
University of Ljubljana, Slovenia
Pennsylvania State University
University College Dublin
City University of Hong Kong
University of Modena and Reggio Emilia, Italy
IBM Research
Fraunhofer IAIS, Germany
Bogazici University
Tokyo Institute of Technology, Japan
Universidad Politécnica de Madrid
Krasnow Institute for Advanced Study
University of Amsterdam
Université de Bouaké
Université Paris Est
University of Birmingham
Duke University
National University of Ireland
Bell Labs, Alcatel-Lucent
University of New Brunswick
Technische Universität Berlin
Hebrew University of Jerusalem
Colorado School of Mines
Aix-Marseille Université
Ecole Polytechnique Fédérale de Lausanne
SungKyunKwan University
Nanyang Technological University
University of Minnesota
Technische Universität Darmstadt
IBM Research
Freie Universität Berlin
Hong Kong University of Science and Technology
University of Namur
IIIT-Delhi
University of Colorado Denver
University of Oviedo, Spain
Georgia Institute of Technology
University of Trento
INSA Lyon, France
University of San Francisco
University of Southampton
Infosys Ltd.
Universidade Federal de Minas Gerais
Qatar Computing Research Institute
Oracle Labs
Columbia University
University of Cambridge
Linköping University
University of Sao Paulo, Brazil
Etc…
Resulting in Orange sending over 250 invitations to compete, with Terms and Conditions to sign
Out of which: 160 teams signed the Terms and Conditions and downloaded the databases
Which led to: 83 submissions of high-quality scientific papers and visualisation projects, on time for the contest deadline of February 15th 2013
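As a rough reading of this funnel, the short Python sketch below computes each stage as a share of the previous one. The counts are those quoted on the slides. Since "over 260" and "over 250" are lower bounds, the first two ratios are only approximate. The stage labels and the helper `step_rates` are names chosen for the example.

```python
# Counts quoted on the slides; "over 260" and "over 250" are lower bounds,
# so ratios that involve them are approximate.
funnel = [
    ("requests received", 260),
    ("invitations sent", 250),
    ("teams signed and downloaded", 160),
    ("submissions on time", 83),
]

def step_rates(stages):
    """Return each stage's count as a fraction of the previous stage's count."""
    return {
        name: count / prev_count
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    }

rates = step_rates(funnel)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of previous stage")
```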
What did we hope to find?
Source: visuals extracted from NetMob 2011
We got results in a wide variety of domains
 Country & Communities understanding
 Emergency responses
 City and Transport planning
 Tourism and event analysis
 Population statistics
 Health improvements
 Epidemic models
 Economic indicators
 Alerting/preventing violence
 Countries comparison
 Algorithms that respect privacy
 …
 … and Orange operational performance improvement suggestions
Illustration: Ivory Coast map from GroupHuit and Orange R&D, created from D4D data sets
Full results will be announced during NetMob 2013
www.netmob.org
Learning so far…
 Telco Data can make a difference to people’s lives and contribute to
improving public policies
 Innovation/thought leadership and Orange’s business development
─ Contribute to the image of Orange
─ Contribute to our BigData business development
─ Develop internal momentum and enrich R&D network to go further
 Relatively easy to sell internally and to implement, because
─ In line with Strategic direction around personal data
─ Development theme is an “easy sell”
─ Not very expensive
─ Working with Best-in-class Universities and partners
 Learned how to deal with open data in a potentially sensitive area
─ Strong Privacy and Competitive insights constraints
─ Data quality
─ Accept letting go of some direct potential value, and taking some risks
─ Rich knowledge creation
Thank you - Merci
contacts:
Orange Group Marketing: nicolas.decordes@orange.com
Orange Labs: jacques.raguenez@orange.com