Crowdsourcing: Different Concepts and Platforms
Reshmi De
Bhargabi Chakrabarti

What is Crowdsourcing?
Obtaining services, ideas, content, etc. by soliciting contributions from an online community.
Crowdsourcing can apply to a wide range of activities. It can involve dividing tedious work into small tasks that are outsourced to a crowd, but it can also apply to specific requests, such as crowdvoting, crowdfunding, a broad-based competition, or a general search for answers, solutions, or a missing person.

Crowdsourcing Typology
In his 2013 book, Crowdsourcing, Daren C. Brabham puts forth a problem-based typology of crowdsourcing approaches. These types are:
● Knowledge Discovery & Management - for information management problems where an organization mobilizes a crowd to find and assemble information. Ideal for creating collective resources.
● Distributed Human Intelligence Tasking - for information management problems where an organization has a set of information in hand and mobilizes a crowd to process or analyze the information. Ideal for processing large data sets that computers cannot easily handle.
● Broadcast Search - for ideation problems where an organization mobilizes a crowd to come up with a solution to a problem that has an objective, provable right answer. Ideal for scientific problem solving.
● Peer-Vetted Creative Production - for ideation problems where an organization mobilizes a crowd to come up with a solution to a problem whose answer is subjective or dependent on public support. Ideal for design, aesthetic, or policy problems.

Crowdfunding
Crowdfunding is the process of funding a project through many people each contributing a small amount in order to reach a certain monetary goal. Contributions may be donations or may buy equity in the project.
A well-known crowdfunding tool is Kickstarter, which is the biggest website for funding creative projects.
It has raised over $100 million, despite its all-or-nothing model, which requires a project to reach its proposed monetary goal before it receives any of the money.
Crowdrise brings together volunteers to fundraise in an online environment.

Crowdvoting
Crowdvoting occurs when a website gathers a large group's opinions and judgment on a certain topic.
The Iowa Electronic Market is a prediction market that gathers crowds' views on politics and tries to ensure accuracy by having participants pay money to buy and sell contracts based on political outcomes.
Threadless.com selects the t-shirts it sells by having users provide designs and vote on the ones they like, which are then printed and made available for purchase. Although the company is small, thousands of members provide designs and vote on them, so the website's products are truly created and selected by the crowd rather than by the company alone.
Some of the most famous examples have made use of social media channels: Domino's Pizza, Coca-Cola, Heineken and Sam Adams have thus crowdsourced a new pizza, bottle design, beer or song, respectively.[29]

Crowdsearching
Chicago-based startup crowdfynd utilizes a version of crowdsourcing best termed crowdsearching, which differs from microwork in that there is no obligatory payment for taking part in the search.
Its platform uses geographic location anchoring to build a virtual search party of smartphone and internet users to find a lost item, pet or person, as well as to return a found item, pet or property.

Citizen science
● Also known as crowd science, crowd-sourced science, or networked science.
● Scientific research conducted, in whole or in part, by amateur or nonprofessional scientists, often by crowdsourcing and crowdfunding.
● Sometimes called "public participation in scientific research."
Citizen-science activities can take many forms.
Example: Citizen scientists can help gather data that will be analyzed by professional researchers.
The American Association of Variable Star Observers has gathered data on variable stars for educational and professional analysis since 1911 and promotes participation beyond its membership on its Citizen Sky website.
On BugGuide.Net, an online community of naturalists who share observations of arthropods, amateurs and professional researchers contribute to the analysis.

Popular Platforms
● CrowdForge
● ARTigo
● CrowdFlower
● Crowd4U
● Quadrant of Euphoria
● CrowdDB
● TurKit
● Turkomatic
● Amazon Mechanical Turk

CrowdForge
Task breakdown, roughly inspired by the MapReduce programming model:
● partition - split a problem into sub-problems
● map - solve a small unit of work
● reduce - combine multiple results into one
Only the map task involves human intelligence.

ARTigo: Social Image Tagging
● An online gaming platform providing several GWAPs (games with a purpose)
● Provided by the University of Munich, Germany
● Aims to supply artworks with tags, to automatically build up an artwork search engine, to scientifically investigate the reception of artworks, and finally to provide an artwork learning environment
● Users learn about art while playing the various games offered on the platform

CrowdFlower
● Uses crowdsourcing techniques to provide a wide range of enterprise solutions.
● Has over 50 labor channel partners, among them Amazon Mechanical Turk and TrialPay.
● Peer review helps maintain high accuracy levels.

Crowd4U
● A project for developing an open crowdsourcing platform for academic purposes.
● The project started in 2010 and the first system was launched in 2011. As of December 2012, it was deployed in 14 universities in Japan.
● Various crowdsourcing projects, such as those for information retrieval, library problems, and disaster relief, are under way with Crowd4U.
● It supports a declarative programming language named CyLog for writing crowdsourcing applications.
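
As a rough illustration of the partition / map / reduce breakdown listed under CrowdForge above, the flow can be sketched as follows. This is only a sketch under stated assumptions, not CrowdForge's actual API: the function names (partition, run_flow, simulated_worker, join_paragraphs) are hypothetical, and the "worker" is a plain function standing in for a human answering a small task.

```python
from typing import Callable, List


def partition(problem: str) -> List[str]:
    """Split a problem into sub-problems (assumed here: ';'-separated topics)."""
    return [part.strip() for part in problem.split(";") if part.strip()]


def run_flow(
    problem: str,
    map_task: Callable[[str], str],
    reduce_task: Callable[[List[str]], str],
) -> str:
    """Partition the problem, map each sub-problem, then reduce the results."""
    sub_problems = partition(problem)
    # In a real deployment each map call would be a small task sent to a worker.
    partial_results = [map_task(sub) for sub in sub_problems]
    return reduce_task(partial_results)


def simulated_worker(sub_problem: str) -> str:
    """Hypothetical stand-in for a human worker completing one map task."""
    return f"[paragraph about {sub_problem}]"


def join_paragraphs(results: List[str]) -> str:
    """Reduce step: combine multiple results into one."""
    return "\n".join(results)


article = run_flow("history; geography; economy", simulated_worker, join_paragraphs)
print(article)
```

Keeping the map step as a parameter mirrors the slide's point that it is the step requiring human intelligence: swapping simulated_worker for a function that posts a task and waits for an answer would leave the rest of the flow unchanged.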

Quadrant of Euphoria
A crowdsourcing platform for Quality of Experience (QoE) assessment in network and multimedia studies, which features low cost, participant diversity, meaningful and interpretable QoE scores, subject consistency assurance, and a burdenless experiment process.

CrowdDB: Answering Queries with Crowdsourcing
● Some queries cannot be answered by machines alone. Processing such queries requires human input to provide information that is missing from the database, to perform computationally difficult functions, and to match, rank, or aggregate results based on fuzzy criteria.
● CrowdDB uses human input via crowdsourcing to process queries that neither database systems nor search engines can adequately answer.
● It uses SQL both as a language for posing complex queries and as a way to model data.

TurKit
● TurKit is a Java/JavaScript API for running iterative tasks on Mechanical Turk.
● You can safely re-execute TurKit programs without re-running costly side effects on Mechanical Turk, like creating new HITs, while still writing your program in a straightforward imperative manner; there is no need to unravel the program into a state machine.
● TurKit is open source and is hosted on Google Code, where you can download the source code: http://code.google.com/p/turkit/source/checkout

AMAZON MECHANICAL TURK (AMT, MTURK)

Background - "The Turk"
● Automated chess player built in 1770
● ...had a human hiding inside it

AMT
● Launched Nov 2, 2005
● Initially used to solve Amazon's in-house problems that required human judgment and intelligence
● Amazon soon realized this was a unique service and offered it as a web service

AMT - Stat

AMT
● HIT (Human Intelligence Task)
● Requesters/Developers
● Workers

Designing HITs
Requesters can specify:
● Task
● Keyword
● Expiration date
● Reward
● Time allotted
● Qualification

Creating HITs
● createHITs() - method of the RequesterService class (com.amazonaws.mturk.service.axis.RequesterService).
● mturk.properties file - contains the requester credentials needed for creating the HITs.
● .question file - contains the questions in XML format.
● .properties file - specifies how long the HIT will remain active, how many assignments it has, etc.

Creating multiple HITs
Problem 1: Site categorization (e.g., Search Engine, News Site, Online Retailer, Others?)
URL1 - www.google.com
URL2 - www.amazon.com
URL3 - www.reuters.com
etc.
Need to create separate (but similar) HITs for each URL.

Solution:
Step 1 - Specify the URLs in the .input file.
Step 2 - Specify the categorization template as HTML in the .question file.
Step 3 - The .question file is in XML format. The $urls variable, which is defined as a field in the .input file, is included in the .question file.
Step 4 - The number of assignments is specified in the .properties file.

Getting the response back
● .success file - contains the unique ID of each HIT created.
● getHITTypeResults(success) - method of the RequesterService class; the input is an object of the HITDataInput class and the return type is an object of the HITTypeResults class.
● writeResults() - method of the HITTypeResults class.

Sentiment Projects in MTurk
● Create a sentiment question, specify the number of Worker responses, and upload your data (in .csv format; tags can be added).
● Aggregated results are sent to the requester to show how strong the sentiment is for each item.
● AMT sends an email when your project is completed.
● Good for comparing results and for evaluation.
Sentiment Projects in MTurk - Instructions for workers
Sentiment Projects in MTurk - Scale of Rating

Criticism - AMT
Because HITs are typically simple, repetitive tasks and users are often paid only a few cents to complete them, some have criticized Mechanical Turk as a "digital sweatshop".
Because workers are paid as contractors rather than employees, requesters do not have to file forms for, or pay, payroll taxes, and they avoid laws regarding minimum wage, overtime, and workers' compensation. Workers, though, must report their income as self-employment income.
Some requesters have taken advantage of workers by having them do the tasks and then rejecting their submissions in order to avoid paying. Amazon.com does not monitor the service and refers all complaints to the poster of the HIT.
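
Returning to the "Creating multiple HITs" steps earlier in the deck, the idea of one template plus an input file of URLs can be sketched as below. This is a minimal sketch and does not call the Mechanical Turk SDK: the tab-delimited input layout, the wording of the question, and the names INPUT_FILE, QUESTION_TEMPLATE and expand_hits are all assumptions made for illustration. Only the $urls variable and the example URLs and categories come from the slides.

```python
import csv
import io
from string import Template
from typing import List

# Assumed layout of the .input file: a header row naming the field ("urls"),
# followed by one value per line.
INPUT_FILE = "urls\nwww.google.com\nwww.amazon.com\nwww.reuters.com\n"

# Hypothetical question text; $urls is replaced once per input row.
QUESTION_TEMPLATE = Template(
    "Which category best describes the site $urls? "
    "(Search Engine, News Site, Online Retailer, Other)"
)


def expand_hits(input_text: str, template: Template) -> List[str]:
    """Produce one question per input row by filling in the template."""
    rows = csv.DictReader(io.StringIO(input_text), delimiter="\t")
    return [template.substitute(row) for row in rows]


questions = expand_hits(INPUT_FILE, QUESTION_TEMPLATE)
for question in questions:
    print(question)
```

Each generated question would correspond to one separate (but similar) HIT, which is why the slides keep the per-URL data in the input file and everything shared in the template.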