Applications of Intelligent Systems and Robotics in Service of Society
Raj Reddy
Carnegie Mellon University, Pittsburgh
Jan 9, 2007
Keynote Speech at IJCAI 2007, Hyderabad, India

Outline of the Talk
• Needs of Developing Economies: Access to Knowledge, Education and Healthcare, etc.
• 3-Minute Introduction to AI: What is it and how it can help
• The role of AI in enabling
  - access to knowledge and knowhow
  - access to libraries
  - access to education and learning
  - access to health care
• Unfinished research agenda of AI

Needs of the People with Per Capita Income of Less Than $1 a Day
• Access to entertainment: watch any movie or TV show when desired
• Telemedicine: providing links to doctors and treatment at a distance
• Access to information about hygiene and safe water, helping to reduce infant mortality
• Life-long learning independent of the limitations of language, distance, age and physical disabilities
• Price discovery
• Marketing assistance using eBay-like auction exchanges
• Find jobs, e.g. monster.com
• They need AI and IT but not Word, Excel and PowerPoint

Barriers to Entry: The Digital Divide
• Connectivity Divide
  - Access to free Internet for basic services?
• Computer Access Divide
  - Accessibility: Less than a 5-minute walk?
  - Affordability: Costing less than a cup of coffee per day?
• Digital Literacy Divide
• Language Divide
• Literacy Divide
• Content Divide
  - Access to information and knowledge
  - Access to health care
  - Access to education and learning
  - Access to jobs
  - Access to entertainment
  - Access to improved quality of life

A 3-Minute Introduction to AI: What is it and how it can help
• Review why the world’s poor have more to gain in relative terms from the effective use of IT and AI technology

Artificial Intelligence attempts to make computers do things which would require intelligence in people, i.e.
any activity which requires the use of the human brain

A Historical View of Advances in AI
• 1950s: Theorem Proving; Chess
• 1960s: Problem Solving; Language Understanding; Question Answering
• 1970s: Speech; Vision; Expert Systems
• 1980s: Robotics; Knowledge Based Systems
• 1990s: Language Translation; Search
• 2000s: Systems that Learn with Experience

Some Application Domains
• Web Search: Google, Yahoo, MSN
• Intelligent car
• Financial planning
• Manufacturing control
• System diagnosis
• NL communicator
• Writing assistant
• Knowledge-based simulation
• Games
• Household robot

Requirements for Intelligence
• Learn from experience
• Exploit vast amounts of knowledge
• Exhibit Goal Directed Behavior
• Tolerate error and ambiguity in input
• Communicate with natural language
• Operate in real time, and
• Use symbols (and abstractions)

AI Problem Domains & Attributes
• Domains, in order: Puzzles, Chess, Theorem Proving, Expert Systems, Natural Language, Motor Processes, Speech, Vision
• Knowledge Content: from Poor (Puzzles) to Rich (Vision)
• Data Rate: from Low (Puzzles) to High (Vision)
• Response Time: from Hours (Puzzles) to Real Time (Vision)

Lessons from AI Experiments
• Bounded Rationality implies Opportunistic Search
• An Expert becomes a World Class Expert only after spending at least 15 years of intensive practice, and knows 70,000 ± 20,000 patterns
• Search Compensates for Lack of Knowledge
• Knowledge Compensates for Lack of Search
• A Physical Symbol System is Necessary and Sufficient for Intelligent Action

How Can AI Help?
• Intelligent Systems in support of
  - Access to Knowledge and Knowhow
  - Learning and Education
  - Health
• Robotics for Accident Avoiding Cars, Landmine Detection, and Disaster Recovery

Enabling Access to Knowledge and Information
Village Google: Access to Knowledge for Use in a Village
• Access to Essential Information and Advice
  - Medical, Agriculture, FAQ indexed and searchable
  - Interactive access to Doctors, Rescue Personnel
  - Price discovery, crop disease information, weather prediction
• Lifelong Learning and Education
• Agricultural Information
• Access to Markets and Jobs
• Disaster Relief and Management
• Access to Newspapers, Radio and TV
• Entertainment and Amusement
• Communications
  - Video Phone, IP Telephone, Instant Messaging
  - Video Email, Voice Email, Text Email

The Vision of a Global Knowledge Network
• Create a Knowledge Network that connects experts to the people who need help, e.g., farmers in villages
• End-users interact at Village Knowledge Centers
  - Equipped with a networked computer and basic A/V equipment
  - Staffed by a Knowledge Officer
• Humans are intrinsic to Knowledge Networks (raw information ≠ knowledge!)
• Domain experts provide answers to previously unanswered questions
• Answers are converted into an “encyclopedia-on-demand” video documentary at higher-level centers and dubbed into local languages in each country
• Also available for direct access browsing by literate and networked users

System Overview

Multi-level Information Flow - An example scenario
• An illiterate farmer goes to a Village Knowledge Officer (with a computer connected to the FAO multimedia database) and asks a question in his or her local language
• The KO retrieves an answer from the local multilingual database within minutes 80-90% of the time
• For the remaining 10-20% of the time, the KO puts the question up to a higher-level office and gets an answer back, typically in less than 24 hrs
• 100s of domain experts populate the databases, both as part of their jobs and as volunteers (say, 2 questions per week)
• Hierarchical structure spanning districts, regions, countries, etc.
• Outside experts interact with higher-level Knowledge Officers
• Builds up an ever-increasing multimedia database
• Can provide static (e.g., best-practices) as well as dynamic (e.g., weather, prices, etc.)
information
• Innovative mechanisms and processes for information digitization, exchange, analysis, and dissemination

Knowledge Officers and Domain Experts
• World: Knowledge Management & Coordination (global)
• Nation: Knowledge Management & Coordination (national level)
• State: Verification of Query-Answer Relevance and RFP to domain experts
• District: Translation, Information Retrieval
• Village: AV data collection, Transliteration and Transcription, Information Retrieval
• Domain experts: Volunteer to answer at least 2 questions a week (or part of job responsibility)

Roles of Knowledge Officers
• Village (3,000 people): Transcription (and possibly Transliteration)
  - Records question of the end-user in audio-video format.
  - Enters text transcription of the question.
  - Searches local language database for answer.
  - Need not be knowledgeable in English.
• District (300,000 people): Translation and Information Retrieval
  - Enters translation of questions.
  - Searches multilingual database for answer.
  - Sends answer after translation to lower level.
  - If question is not among FAQs or automated system, sends to higher level.
• Region/Nation (30M people): Verification & RFP from Experts
  - Picks questions of critical nature and validates the answer provided at lower level.
  - If critical or unanswered question, puts up request to experts even if not paid for by end-user.
• (Sub)continent (0.3B people): Knowledge Management & Coordination
  - Same as next level up, but with the range of analyses broadened to the region/subcontinent level.
• Global (3 Billion people): Knowledge Analysis and Inference
  - Brings experts to where their knowledge is needed.
  - Mobilization of resources towards their need.
  - Identifies and triggers initiatives to control “epidemic”-like problems.
(All numbers shown are for rural, developing country populations = beneficiaries)

The AI Challenges in Creating a Global Knowledge Network
• Farmers are typically not able to tap into existing networks
  - Often illiterate
  - Rarely have relevant information or even communications accessible
• Today’s Internet and existing databases/portals are primarily intended for users who are literate in English and can synthesize their solutions from multiple sources

Internet Bill of Rights (Jaime Carbonell, 1994)
• Get the right information (e.g. search engines)
• To the right people (e.g. categorizing, routing)
• At the right time (e.g. Just-in-Time: task modeling, planning)
• In the right language (e.g. machine translation)
• With the right level of detail (e.g. summarization)
• In the right medium (e.g. access to information in non-textual media)

Relevant Technologies
• “…right information”: search engines
• “…right people”: classification, routing
• “…right time”: anticipatory analysis
• “…right language”: machine translation
• “…right level of detail”: summarization
• “…right medium”: speech input and output

“…right information”: Search Engines

The Right Information
Right Information from future Search Engines
• How to go beyond just “relevance to query” (all) and “popularity”
• Eliminate massive redundancy, e.g. for the query “web-based email”
  - Should not result in multiple links to different Yahoo sites promoting their email, or even non-Yahoo sites discussing just Yahoo email.
  - Should result in a link to Yahoo email, one to MSN email, one to Gmail, one that compares them, etc.
• First show trusted info sources and user-community-vetted sources
  - At least for important info (medical, financial, educational, …), I want to trust what I read.
  - E.g., for new medical treatments: first info from hospitals, medical schools, the AMA, medical publications, etc., and NOT from Joe Shmo’s quack practice page or from the National Enquirer.
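
The redundancy-elimination goal above (one link per email provider rather than several near-duplicate Yahoo links) can be sketched as a greedy re-ranking. The sketch below uses Maximal Marginal Relevance, which the talk names on a later slide without giving details; it is only an illustration, not the method used by any particular search engine. The Jaccard token-overlap similarity, the weight `lam = 0.5`, the result names, their relevance scores and token sets, and the function names `jaccard` and `mmr_rerank` are all assumptions made up for this example.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity in [0, 1]; an illustrative stand-in."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_rerank(relevance: Dict[str, float],
               tokens: Dict[str, Set[str]],
               k: int,
               lam: float = 0.5) -> List[str]:
    """Greedily pick k results, trading relevance against redundancy."""
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(doc: str) -> float:
            # Redundancy = similarity to the closest already-selected result.
            redundancy = max(
                (jaccard(tokens[doc], tokens[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[doc] - (1.0 - lam) * redundancy
        # Sorting first makes tie-breaking deterministic (by name).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical results for the query "web-based email" (made-up scores).
relevance = {
    "yahoo_mail": 0.95,
    "yahoo_mail_promo": 0.93,
    "yahoo_mail_review": 0.90,
    "msn_mail": 0.85,
    "gmail": 0.84,
}
tokens = {
    "yahoo_mail": {"yahoo", "email", "free", "web"},
    "yahoo_mail_promo": {"yahoo", "email", "free", "signup"},
    "yahoo_mail_review": {"yahoo", "email", "review", "web"},
    "msn_mail": {"msn", "email", "hotmail", "web"},
    "gmail": {"gmail", "email", "google", "storage"},
}

# Pure relevance ranking: three Yahoo pages.
top_by_relevance = sorted(relevance, key=relevance.get, reverse=True)[:3]
# MMR ranking: one page per provider.
top_by_mmr = mmr_rerank(relevance, tokens, k=3)

print(top_by_relevance)  # ['yahoo_mail', 'yahoo_mail_promo', 'yahoo_mail_review']
print(top_by_mmr)        # ['yahoo_mail', 'gmail', 'msn_mail']
```

With `lam = 1.0` the function reduces to plain relevance ranking; lowering it penalizes results that resemble ones already chosen. Preferring trusted sources could be layered on top, for example by adding a trust term to the score, but that is not shown here.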
Beyond Pure Relevance in IR
• Current Information Retrieval technology only maximizes relevance to the query
• What about information novelty, timeliness, appropriateness, validity, comprehensibility, density, medium, ...?
  - Novelty is approximated by non-redundancy!
• We really want to maximize relevance to the query, given the user profile and interaction history:
  $P(U(f_i, \ldots, f_n) \mid Q \,\&\, \{C\} \,\&\, U \,\&\, H)$
  where Q = query, {C} = collection set, U = user profile, H = interaction history
• ...but we don’t yet know how. Darn.

Maximal Marginal Relevance vs. Standard Information Retrieval
[Figure: documents retrieved around a query, comparing MMR with standard IR]

“…right people”: Text Categorization

The Right People
• User-focused search is key
• If a 7-year-old is working on a school project about taking good care of one’s heart and types in “heart care”, she will want links to pages like “You and your friendly heart”, “Tips for taking good care of your heart”, “Intro to how the heart works”, etc., NOT the latest New England Journal of Medicine article on “Cardiological implications of immuno-active proteases”.
• If a cardiologist issues the query, exactly the opposite is desired
• Search engines must know their users better, and the user tasks
• Social affiliation groups for search and for automatically categorizing, prioritizing and routing incoming info or search results
• New machine learning technology allows for scalable high-accuracy hierarchical categorization.
• Examples of social affiliation groups: Family group, Organization group, Country group, Disaster-affected group, Stockholder group

Text Categorization
• Assign labels to each document or web page
• Labels may be topics such as Yahoo categories: finance, sports, News > World > Asia > Business
• Labels may be genres: editorials, movie reviews, news
• Labels may be routing codes: send to marketing, send to customer service

Text Categorization Methods
• Manual assignment (as in Yahoo)
• Hand-coded rules (as in Reuters)
• Machine Learning (dominant paradigm)
  - Words in text become predictors
  - Category labels become “to be predicted”
  - Predictor-feature reduction (SVD, χ², …)
  - Apply any inductive method: kNN, NB, DT, …

“…right timeframe”: Just-in-Time - no sooner or later

Just in Time Information
• Get the information to the user exactly when it is needed
  - Immediately when the information is requested
  - Prepositioned if it requires time to fetch & download (e.g. HDTV video); requires anticipatory analysis and pre-fetching
• How about “push technology” for, e.g., stock alerts, reminders, breaking news?
  - Depends on user activity:
    Sleeping, Do Not Disturb, or in a meeting: wait your chance
    Reading email: now if info is urgent, later otherwise
    Group info before delivering (e.g. show 3 stock alerts together)

“…right language”: Translation

Access to Multilingual Information
• Language Identification (from text, speech, handwriting)
• Trans-lingual retrieval (query in 1 language, results in multiple languages)
  - Requires more than query-word out-of-context translation (see Carbonell et al 1997 IJCAI paper) to do it well
• Full translation (e.g. of web page, of search results snippets, …)
  - General reading quality (as targeted now)
  - Focused on getting entities right (who, what, where, when mentioned)
• Partial on-demand translation
  - Reading assistant: translation in context while reading an original document, by highlighting unfamiliar words, phrases, passages.
• On-demand Text to Speech
• Transliteration

“…in the Right Language”
• Knowledge-Engineered MT
  - Transfer rule MT (commercial systems)
  - High-Accuracy Interlingual MT (domain focused)
• Parallel Corpus-Trainable MT
  - Statistical MT (noisy channel, exponential models)
  - Example-Based MT (generalized G-EBMT)
  - Transfer-rule learning MT (corpus & informants)
• Multi-Engine MT
  - Omnivorous approach: combines the above to maximize coverage & minimize errors

“…right level of detail”: Summarization

Right Level of Detail
• Automate summarization with hyperlink one-click drilldown on user-selected section(s).
• Purpose Driven: summaries are in service of an information need, not one-size-fits-all (as in Shaom’s outline and the DUC NIST evaluations)
• EXAMPLE: A summary of a 650-page clinical study can focus on
  - effectiveness of the new drug for the target disease
  - methodology of the study (control group, statistical rigor, …)
  - deleterious side effects, if any
  - target population of the study (e.g. acne-suffering teens, not eczema-suffering adults)
  …depending on the user’s task or information query

Information Structuring and Summarization
• Hierarchical multi-level pre-computed summary structure, or on-the-fly drilldown expansion of info.
  - Headline: <20 words
  - Abstract: 1% or 1 page
  - Summary: 5-10% or 10 pages
  - Document: 100%
• Scope of Summary
  - Single big document (e.g. big clinical study)
  - Tight cluster of search results (e.g. vivisimo)
  - Related set of clusters (e.g. conflicting opinions on how to cope with Iran’s nuclear capabilities)
  - Focused area of knowledge (e.g. What’s known about Pluto? Lycos has a good project in this via Hotbot)
  - Specific kinds of commonly asked information (e.g. synthesize a bio on person X from any web-accessible info)

Document Summarization: Types of Summaries
• INDICATIVE, for filtering (Do I read further?)
  - Query-relevant (focused): filter search engine results
  - Query-free (generic): short abstracts
• CONTENTFUL, for reading in lieu of full doc
  - Query-relevant (focused): solve problems for busy professionals
  - Query-free (generic): executive summaries

“…right medium”: Finding information in Non-textual Media

Indexing and Searching Non-textual (Analog) Content
• Speech → text (speech recognition)
• Text → speech (TTS: FESTVOX by far most popular high-quality system)
• Handwriting → text (handwriting recognition)
• Printed text → electronic text (OCR)
• Picture → caption key words (automatically) for indexing and searching
• Diagrams, tables, graphs, maps → caption key words (automatically)

AI and Access to Libraries
The Million Book Digital Library Project

One Step at a Time…
• Million Book DL: only about 1% of all the world’s books
  - Harvard University: 12M
  - Library of Congress: 30M
  - OCLC catalog: 42M
  - All Multilingual Books: ~100M
• At the rate of digitization of the last decade it would take 100 years!

Million Book Project: Issues
• Time
  - At one page per second (20,000 pages per day shift), it will take 100 years (200 working days per year) to scan a million books of 400 pages each (400 million pages / 20,000 pages per day = 20,000 working days)
• Cost
  - 100M books at US$100 per book would cost $10B
  - Even in India and China the cost will be $1B
  - The annual cost is currently expected to be close to $10M per year, with support from US, India and China.
• Selection
  - Selection of appropriate books for scanning is time consuming and expensive

Million Book Project: Issues (cont)
• Logistics
  - Each container holds 10,000 to 20,000 books. Shipping and handling costs about $10,000
• Meta Data
  - Accessing and/or creating metadata requires professionals trained in library science
• Optical Character Recognition Technology
  - Essential for searching, translation and summarization
  - Many languages don’t have OCR

Million Book Project: Status
• 18 Centers in India
• 22 Centers in China
• 1 Center in Egypt
• 15 Centers in Poland
• Planned: Australia
• Over 1,400,000 books scanned
• Over 250,000 accessible on the web

Title: Rig Veda
Author: Pandit Sriram Sharma Acharya
Language: Sanskrit
Subject: Philosophy
Publisher: Sanskriti Sansthan Bareli
Abstract: Rig Veda is the oldest of the Vedas.
The Rig Veda is the oldest book in Sanskrit or any Indo-European language. Many great Yogis and scholars who have understood the astronomical references in the hymns date the Rig Veda as before 4000 B.C., perhaps as early as 12,000 B.C. Modern western scholars date it around 1500 B.C., though recent archaeological finds in India (like Dwaraka) now appear to require a much earlier date.

Title: Elementary Treatise on the Wave-Theory of Light
Author: Humphrey Lloyd, D.D., D.C.L.
Language: English
Subject: Physics
Publisher: Longmans, Green & Co
Year: 1873
Abstract: This book deals with the various aspects of the wave theory of light. It is a critical work which contains an analytical discussion of the most recent researches in Optics. It presents a clear and connected view of the subject.

Title: Mudalayiram Mulamum
Author: Periya Jeeyar
Language: Tamil
Subject: Religion
Publisher: Sri Vaishnava Sampirathaya Sanjeevikiri Sabayai
Year: 1909
Abstract: This volume is written in Tamil. It provides a detailed account of the origin of Vaishnava and is written by Periya Jeeyar.

Title: Gulzar-A-Badesha
Author: Khader Badesha
Language: Urdu
Subject: Literature
Publisher: Namipress, Chennai
Year: 1919
Abstract: Literature

Title: Jawahar Ali Joyviyah
Author: Dr.Ilyas lomas
Language: Arabic
Subject: Metrology
Publisher: Bakri and Issa
Year: 1876
Abstract: It is a book on Metrology, a study of measurements.

Title: Structure Des Molecules
Author: Victor Henri
Language: French
Subject: Chemistry
Publisher: Taylor and Francis
Year: 1925
Abstract: This is a unique book that explicates, in detail, the structure of molecules and touches upon certain specific characteristics of molecules with particular reference to Benzene.

Million Book Project: AI Research Challenges
• Multilingual Information Retrieval
• Translation
• Summarization
• Reading Assistant using Multi Lingual Speech Synthesis and Translation (e.g.
for newspaper DL)
• Easy to use interfaces for Billions
• Providing Access to Billions every day
  - Distributed Cached Servers in every region

AI and Education

Intermediate Examination 2006: Urban – Rural Divide
[Chart: rural vs. urban percentages for Passing, First division, More than 75%, More than 90%, First division (maximum of a district), First division (minimum of a district)]

Intermediate Examination 2006: Differences in Performance of Different Social Groups – Percent Failing
[Chart: percent failing by social group: FC, BC, SC, ST, Muslim, Others, Total]

Intermediate Examination 2006: Differences in Performance of Different Social Groups
[Chart: percent scoring 75% or more and more than 90%, by social group: FC, BC, SC, ST, Muslim, Others, Total]

Performance in EAMCET 2006: Rural Urban Divide
[Chart: percent share of rural vs. urban students for: average of Math+Sci greater than 94.5%; EAMCET rank less than 5,000; EAMCET rank less than 10,000; EAMCET rank less than 50,000]

Large Variation in School Quality
• No. of schools where NOT a SINGLE student got more than 75% marks and more than 50% of all taking the exam failed: 360 in 2004, and 965 in 2006
• Intensity of problem is almost twice in rural areas compared to urban areas

Large Variation in College Quality
• Even the bright fail!
  - 1345 students who got more than 90% in Math in SSC failed in either math A or B in year I or year II
  - Of these 1345, 222 had >90% in two subjects and 53 in three subjects
• 253 colleges where the failing rate is more than 75%
• 239 colleges where not a single student gets more than 75%
• 829 colleges where less than 5% of students pass with more than 75% (state avg.
is 22%)
• Intensity of problem is almost twice for colleges in rural areas compared to colleges in urban areas

Problems with Current System
• Focus on national best with consequent neglect of local best
  - Urban students with access to tuition and coaching get the highest ranks in national tests
• Schools in remote villages
  - Lack of quality teachers
  - No coaching centers
  - Deprived of competitive atmosphere
• No system to nurture talent who do best in such difficult situations
• Financial issues often prohibit the brightest rural students from attending the best universities

Problems with Current System (Cont)
• Lack access to quality colleges
• Lack proper guidance, motivation and peer groups
• Inadequate support from families
• Poverty prevents access to coaching classes, tutoring, etc.
• Poverty compels them to seek work for livelihood rather than proceed to college, which is essential for reaching their full potential

Current System: Admission to Engineering and Medicine
• Coaching for 11th and 12th (costs 60K to 120/240K): Kota, Hyderabad, Delhi
• Unaffordable to many
• Teaching to test
• Not broad education
• Revised pattern of JEE seems not to diminish the importance of coaching

Focus During Formative Years
• Right guidance and environment during formative years
• This is what the famous mathematician Hardy says about the mathematical genius Srinivasa Ramanujan: the years between eighteen and twenty-five are the critical years in a mathematician’s career, and the real tragedy is not that Ramanujan died early, but that during these years his genius was misdirected, sidetracked, and to some extent even distorted

Problems with Current System
• Wastage of precious time commuting (a lot of time going to and fro, maybe 1-4 hours a day)
• Only two semesters in a year
• Lack of focus on development of soft skills, a key to success in today’s highly competitive job market
• Imperfect credit market for higher secondary education
  - Have you heard of a bank loan for “coaching classes” for 11th and 12th, JEE, EAMCET, AIEEE?

How AI can Help?
• Creating a New Affirmative Action Plan for the Socially Disadvantaged?
  - Data Mining: Local Best instead of National Best
• Intelligent Tutoring Systems (AI Meets Cognitive Science): Variable Duration Learning
  - Online Reading Tutors
  - Online Math Tutors
• Intelligent Monitoring Systems
  - Early Detection of Promising Students and Problem Students through Progress Monitoring
  - Process Improvement

AI and Development of Soft Skills
• Soft skills have become key to success in today’s highly competitive job market
• Develop Intelligent Tutoring Systems for:
  - Communication skills/language proficiency
  - Interpersonal Interaction and Negotiation
  - Personality traits/sociability
  - Teamwork
  - Work ethic
  - Courtesy
  - Self-discipline, self-esteem and self-confidence
  - Presentation skills

AI and Healthcare

PCtvt UI Design for Use by Illiterate Persons
• An illiterate person needs a more powerful PC than a PhD!
  - If not e-mail, use voice-mail
  - Replace Text Help by Video Help
• Radically simple design
  - One minute learning time
  - Two click model
  - Three modes of communication: Video, Audio and Text
  - Both Synchronous and Asynchronous
• All-Iconic interfaces
• Multiple input modalities
  - TV-remote, Speech I/O, Keyboard, Mouse or Cell phone

AI and eLearning
Give a man a fish and you will feed him for a day. Teach a man to fish and you will feed him for life. (Old Chinese Proverb -- Lao Tzu)
• How to teach an illiterate villager who has never seen a computer to effectively use PCtvt?
  - Self-evident, intuitive interfaces
  - Two clicks to most applications
  - Learning time – less than five minutes to happiness
• Just in Time learning
  - Immersive Interactive Simulated Environments
  - Short video clips: Instant access to information through vast video digital libraries in local languages
  - Interactive Problem Solving
• Intensive programs for educating the local expert, the Village Information Officer
  - Teach the Teacher Programs

A Call to Action to AI Researchers In India
India Has 21 Official Languages!
We need to Break the Language Barrier!
• Language barriers can significantly slow down economic growth
• Globalization requires cross-border and cross-language communication
• Eliminate cultural and social barriers
• Access to rare (and potentially beneficial) knowledge requires eliminating the language divide
• Preservation of minority languages, cultures and heritage

Unfinished Research Agenda for AI
• spoken language understanding,
• dialog modeling,
• multimedia synthesis and language generation,
• multi-lingual indexing and retrieval,
• language translation, and
• summarization.

Next Steps
• Create technologies and solutions for overcoming the language barrier
• Create toolkits for rapid acquisition of new language capabilities
  - Character codes, optical character recognition, speech recognition, speech synthesis, translation, search engines, text mining, summarization, language tutoring, etc.
• Capture data, information and knowledge from the masses
• Make fundamental advances in language processing algorithms, e.g.,
  - Deal with 1000 times more data
  - Conceptual advances in semantic information retrieval

The Educational Plan
• Training a generation of researchers to explore many techniques in many languages
• Training innovators and entrepreneurs in applications of language technology
• Training scholars in each country to be expert in language technology
• Training individuals in foreign languages and cultures

The Research Plan: Analogy to Human Genome Project
• Meticulous core-science based fundamentals
• Researcher toolkits for known methodologies
• Architecture supporting diversity of methodologies
• Long planning horizon to support development of novel and radical approaches
• Quantitative evaluation against a standard of steadily accumulating improvements in performance

Impact and Benefits
• greater participation in the global economy
• preserve local languages and cultures
• promote greater communication and understanding among states and individuals
• With over 100 orphan languages, each
country of the world needs these tools in its own enlightened self-interest
• International focus and multinational involvement will establish India as a world leader in this important technology

Conclusions
• As we enter the Second 50 Years of AI R&D, we need to ask how our work can help Society at large and People at the bottom of the pyramid in particular
• Proactive Development of
  - Intelligent Systems for Access to Knowledge and Knowhow, Learning and Education, and Health
  - Robotics for Accident Avoiding Cars, Landmine Detection, and Disaster Rescue and Recovery