Crowdsourcing for Business: An Emerging Paradigm

Shourya Roy
Area Manager, Human Computation
Xerox Research Centre India, Bangalore
shourya.roy@xerox.com

Workshop on Social Computing, IIT Kharagpur
5th Oct, 2012

Crowdsourcing: What is it?

The act of taking a task traditionally performed by an employee or contractor and outsourcing it to an undefined, generally large group of people in the form of an open call.

Examples: digitization, image labeling, user studies, machine translation evaluation, logo design, EDA simulation, innovation contests, ...

Handwriting Recognition Problem*

Instructions to Crowd:
1. Make progress towards deciphering this handwriting
2. Put words which you are unsure about in parentheses

Many tasks are easy/feasible/doable for humans, but difficult/challenging/impossible for computer programs.

Services Thrust – Xerox Confidential –

Examples (1/5)
Examples (2/5)
Examples (3/5)
Examples (4/5)

It Has Long Existed

• Humans were the first “computers,” used for math computations
• 9 examples of crowdsourcing, before ‘crowdsourcing’ existed: http://bit.ly/mXFdRp

Internet and Mobile Have Made it More Common and Promising

Increasing Activities and Popularity

• Increasing Popularity as Depicted by Google Trends
• Crowdsourcing on Google Scholar Over the Last Few Years
• “2M contributors who does more than 4PY of work on an average day!” -- CEO

Changing Demographics

What is the Problem?

Given a computational problem, design a solution using human computers and automated computers.

Why is it Different?

• Humans are in the loop (and are not guinea pigs)
• Humans are actively computing (not merely carriers of sensors)
• The main doer is the human (and not machines, as in assembly lines)
• The outcome is determined by an algorithm (and not by the natural dynamics of the crowd)

Where is Research?
• Quality Estimation and Assurance: redundancy and voting; gold data; joint estimation of worker quality and task difficulty; symbiosis with machine learning
• Complex Tasks: no discrete answer; exploration and exploitation; crowd workflows
• Task Design: optimize cost, quality and time; infinite completion time; real time
• Incentive and Motivation: payment vs. non-payment; optimal payment; payment and quality
• Market Design: reputation mechanism; monitoring and feedback; task discovery
• Behavioral Aspects: noisy behaviour; non-reproducible

An Emerging Research Field

An Interdisciplinary Research Field

That’s Alright, but Xerox!!?

We have transformed…

                         2009                    2011
Orientation              Technology-led          Services-led
Services share           ~25%                    ~50%
Revenue                  $15.2 billion           ~$23 billion
Market Opportunity       $132 billion            $500 billion+
Services Leadership In   Document Outsourcing    Document Outsourcing, Business Process Outsourcing, Information Tech Outsourcing

… into the world’s leading enterprise for Business Process and Document Management

Xerox Revenue by Business Segment*

* http://www.fastcompany.com/magazine/161/ursula-burns-xerox

Is Crowdsourcing a Viable Alternative to Outsourcing?
Outsourcing is
• Focus on the core business while partnering with 3rd-party vendors to tackle the non-core operations
• Data and process migration by smart use of technology
• Heavily human intensive; typically with the help of technology

Crowdsourcing is
• Tasks requiring human intelligence and skills
• Large distributed workforce, enabled by computing technologies, executing tiny pieces of work requiring human intelligence

Data Entry by Crowd

• We started by considering a typical outsourced process (data entry)
• The objective is to understand a process in detail and identify implications for crowdsourcing
• Digitisation of insurance forms and medical records for US-based insurance companies
• Typing in, validation/correction of information

Features That Make the Form Digitization Process Amenable to Crowdsourcing

• Relatively low-skill data entry work, known as ‘key what you see’
• Already an outsourced process requiring a low level of interactivity between sequential steps
• Strong workflow tool to manage work, which flows through a series of system and human steps
    Between sites
    Between sequential tasks
    Between agents (given their known skill set)

Findings from Work-Practice Study (1/2 and 2/2)

Workplace Ecology: Data security is physical, technical and social.
    Crowdsourcing: Physical and social enforcement are lost and control of the workforce is reduced. Technical solutions are needed.

Skills and Knowledge: 1) ‘Key what you see’ data entry actually involves an extensive rule set. 2) Form difficulty is situational. 3) Non-standard means non-standard.
    Crowdsourcing: Situation-based incentives and support for learning.

Being a Corporate Employee: Pay alone is not enough to achieve the SLA. Agents are made accountable.
    Crowdsourcing: Reduced accountability could increase rejections of difficult work.

Making the Workflow Work: Push model of work.
    Crowdsourcing: Pull model of work raises coordination and completion issues.
Collaborative Working: Work is not collaborative at the workflow level, but it is at the claim level (floorwalkers and colleagues).

Conclusion

• Crowdsourcing is an emerging research area
• It requires expertise and research competencies from a number of disciplines
• Crowdsourcing can be applied in various domains to solve problems more effectively
• Finally, a large fraction of the crowd comes from India
• Focused research and technologies will be highly relevant
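Backup: Quality Estimation Sketch

Two of the quality-assurance ideas listed under "Where is Research?" (redundancy and voting; gold data) can be illustrated with a minimal Python sketch. This is not the speaker's method and not the joint estimation of worker quality and task difficulty named in the same item; it is a simple assumption-laden illustration. The function names, the worker and task identifiers, and the labels are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Redundancy and voting: return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def worker_accuracy(answers, gold):
    """Gold data: fraction of known-answer tasks each worker got right.

    answers: list of (worker, task, label) tuples (hypothetical schema)
    gold:    dict mapping a task id to its known true label
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, task, label in answers:
        if task in gold:
            seen[worker] += 1
            correct[worker] += int(label == gold[task])
    return {w: correct[w] / seen[w] for w in seen}


def weighted_vote(answers, accuracy, default=0.5):
    """Aggregate labels per task, weighting each vote by gold accuracy.

    Workers with no gold history get the assumed default weight.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for worker, task, label in answers:
        scores[task][label] += accuracy.get(worker, default)
    return {task: max(by_label, key=by_label.get)
            for task, by_label in scores.items()}


# Hypothetical data: g1 and g2 are gold tasks, t1 is a real task.
answers = [
    ("w1", "g1", "cat"), ("w2", "g1", "dog"), ("w3", "g1", "dog"),
    ("w1", "g2", "dog"), ("w2", "g2", "cat"), ("w3", "g2", "dog"),
    ("w1", "t1", "cat"), ("w2", "t1", "dog"), ("w3", "t1", "dog"),
]
gold = {"g1": "cat", "g2": "dog"}

acc = worker_accuracy(answers, gold)
t1_labels = [label for _, task, label in answers if task == "t1"]
plain = majority_vote(t1_labels)            # two of three workers say "dog"
weighted = weighted_vote(answers, acc)["t1"]  # the reliable worker says "cat"
print(plain, weighted)
```

In this toy data the plain majority on t1 is "dog", while weighting by gold accuracy favours the one worker who answered every gold task correctly and yields "cat". Which answer is right in practice depends on how representative the gold tasks are.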