Towards More Natural Functional Programming Languages
Brad A. Myers
Human-Computer Interaction Institute
School of Computer Science
Carnegie Mellon University
http://www.cs.cmu.edu/~bam
bam@cs.cmu.edu

The User Interface of Programming Languages
• Programming is a human activity
• Want to improve the ability of people to program
• It makes sense to look at the human side
ICFP’02   Brad Myers   CMU - HCI Institute

“Millions for compilers but hardly a penny for understanding human programming language use. Now, programming languages are obviously symmetrical, the computer on one side, the programmer on the other. In an appropriate science of computer languages, one would expect that half the effort would be on the computer side, understanding how to translate the languages into executable form, and half on the human side, understanding how to design languages that are easy or productive to use.... The human and computer parts of programming languages have developed in radical asymmetry.”
(Allen Newell and Stuart Card, 1985)

What is “Usability”?
• Usability = “The effectiveness, efficiency, and satisfaction with which users can achieve tasks in a particular environment of a product.”
• Components:
  • Learnability: Easy to learn so users can get started rapidly.
  • Effectiveness: Experts can use it effectively and with high productivity.
  • Low error rate: Users make few errors.
  • Satisfaction: Pleasant to use. No frustrations for users.
• Similar to motivations for functional languages

Why Human Computer Interaction?
• The field of Human Computer Interaction studies how to improve and evaluate usability
• Data, knowledge that can guide designs
  • To make systems more usable
• Techniques for evaluating usability
  • So claims can be substantiated
  • So improvements can be made

Who are the Programmers?
• Not just professional programmers anymore
• By 2005, 55 million end-user programmers
  • Compared to only 2.75 million professional programmers

Design of New Languages
• How to make design decisions?
  • Based on math, logic, type theory
  • Designer’s intuition or sense of aesthetics
  • Similarity to other languages
    • But many have known problems
• Key concept: If you care about usability:
  • Can leverage what is known and what can be learned about people to guide design decisions

What we are doing...
Studying the People

Examples of Problems
• “The men and women here raise your hands!”
    if ((isMan x) && (isWoman x)) then (raise_hands x) else ()
  • (This issue with “and” applies to other natural languages as well.)
• “Buy a paint that is not red or blue”
    if (((not (isRed) || isBlue)) x) then buy x else ()
• Research shows that these differences between natural languages and computer languages hurt understanding

My Research Goals
• Make programming significantly easier to learn and more effective for non-professional programmers and beginners
• Try to provide a more objective basis for usability decisions for programming language design
• Apply results of Empirical Studies of Programmers, Psychology of Programming, and Human-Computer Interaction to programming language design
• New studies
• Design new programming languages and environments based on these results

Multiple Criteria
• Focusing on learnability and naturalness for beginners
• Less emphasis:
  • Scalability
  • Provability
  • Efficiency
  • Mathematical or Logic properties
  • Similarity to other familiar languages
  • etc.

Gentle Slope Systems
[Figure: Difficulty of Use plotted against Program Complexity and Sophistication, built up over several slides. Labels so far: Programming in C++, MFC, Functional Languages?]
[Figure: later builds of the Gentle Slope Systems chart (Difficulty of Use against Program Complexity and Sophistication). Labels: Visual Basic, Basic, UI libraries, C Programming / C++ Programming, MFC, Programming in C++, Functional Languages?, My Goal]

What is “Natural Programming”?
• Attempt to make programming closer to the way people think
  • Make programming “more natural”
• First, have to find out how people think about algorithms, data structures, etc.
• Note: Not “natural language”
  • Still creating a formal language

Why Might Being Natural be Good?
• “Programming is the process of transforming a mental plan into one that is compatible with the computer.” (Jean-Michel Hoc)
• So process might be easier if transformation is smaller
• Closeness of mapping
  • "The closer the programming world is to the problem world, the easier the problem-solving ought to be.… Conventional textual languages are a long way from that goal." (Green and Petre)

Why Might Being Natural be Good?
• Example:
  • Inserting item into 3rd place of high score list
  • Conventional Languages: Loop, starting at end of array, shuffle items down, then insert

Why Might Being Natural be Good?
• Directness (as in “Direct Manipulation”)
  • “Distance between one's goals and the actions required by the system to achieve those goals.” (Hutchins, Hollan and Norman)
• Example: vs.
  • VB: Let Shape1.FillColor = &H00FF00FF&
  • ML: SetColor ( Shape1, 0x00FF00FF )

Background Research
• Empirical Studies of Programmers, Psychology of Programming, and HCI results are not being used in the design of new languages
  • 30 years of research on what makes languages hard to learn and error-prone
  • Java / C# looping, etc.
• Summarized in our comprehensive tech report:
  • John Pane and Brad Myers, “Usability Issues in the Design of Novice Programming Systems” TR# CMU-CS-96-132. Aug, 1996. http://www.cs.cmu.edu/~pane/cmu-cs-96-132.html

Examples of Problems Identified
• Promote Locality and Avoid Hidden Dependencies
  • Type definitions often are far from the use
• Code readability is of key importance
  • Don’t try to reduce keystrokes if it makes the code less readable
• Inheritance and object-oriented design are very difficult
• Beware of misleading appearances
  • When novices and experts mis-read code
• Avoid subtle distinctions in syntax
  • E.g., a=b vs. a==b; () vs. [] vs. {}; >= vs. => vs. ->

More Examples
• People expect consistency with external representations and usage (math, English, etc.):
  • People will search for an analogue in their experience that is similar to the syntax
  • “and”; a=a+1; a=2 vs. 2=a; 1<a<10
  • ML: ~ is unary negative, - for subtraction: 5 - ~2
  • So, if different meaning, should have different presentation
• Significant difficulties in finding bugs due to invisible data, dependencies, and control flow

HCI Methods for Analyzing Languages
• Analyze languages as user interfaces
• Green’s Cognitive Dimensions
  • Green and Petre, 1996, “Usability Analysis of VP Environments: A ‘Cognitive Dimensions’ Framework”. Journal of VL&C, 7(2): 131-174
  • 13 dimensions
• Nielsen’s heuristic analysis principles
  • Nielsen, J., Usability Engineering.
    1993, Boston: Academic Press
  • 10 principles
• Can also perform usability studies for specific issues

Consistency
• Both a Cognitive Dimension and a Heuristic Analysis principle
• C++ uses the word "static" to mean at least 3 different things
• In C++, can use int a,b; to define globals or locals, but not as procedure parameters
  • Should be able to copy code and use the same code elsewhere
• In Visual Basic, to assign something you use “=” unless it is an object, in which case you use “Set” and “=”
  • "foo = 15" vs. "Set foo = object"
• ML: “fun f x = 0” vs. “case e of x => 0”

Error-Proneness
• HCI Principle = Prevent errors
• C and C++ array bounds errors
• Requiring the "break" in each branch of C, C++ switch statements causes many errors (still in Java, fixed in C#)
• Small typos can result in compilable programs that perform incorrectly, e.g.,
  • "=" for "=="
  • "x-=3" vs. "x=-3"
  • fun f(SOME _)=... (a constructor pattern) vs. fun f(SOME_)=... (a variable)

Good Error Messages
• Should be: clear, helpful, precise, constructive
  • Not “syntax error”
• In C++, so much flexibility, compiler often doesn’t know where error is
• Similar problems with type inference systems
  • SML/NJ:
      stdIn:30.1-30.4 Error: operator and operand don't agree [tycon mismatch]
        operator domain: ?.t
        operand:         ?.t
        in expression:
          f B

Closeness of Mapping
• HCI principle = Speak the User's Language
• Expressions of algorithms close to the way users think of them
• Also, syntactic issues:
  • C++ uses "void" to mean "none", "char" to mean 8-bit number, ...
  • Visual Basic uses "Dim" to declare variables and "wend" to end while loops
  • Arrays start at 0 whereas people think of counting from 1
  • Case sensitivity

Viscosity
• Resistance to local change
• To change parameters of a function in C++, have to edit .h file and .cpp file, plus all call sites
• Changing an “if” statement into a “do” statement was difficult in early structure editors
• VLs are very difficult due to layout issues
  • May have to reposition all lines and boxes to make room and neaten resulting drawing
  • May need to disconnect and reconnect many wires
• Need for correct indenting may make Haskell programs resistant to editing
  • But good editor can help

Less is More
• HCI principle (“keep it simple”)
• C, C++ have 16 levels of precedence that have to be memorized, some of which are left-associative and some are right-associative.
  • Consider: a=b+=c=+d*e+++f==g which is a legal statement in C++ and C
• Deep nesting in functional languages
  • “Too many parentheses”

Help the user get started with the system
• Small things should be simple
• Programs that do small things must still often be very large, e.g., creating a window containing a single red rectangle
  • The 2 pages needed in Motif to do “Hello World”
  • “zero” lines in Visual Basic
• In Java, it still requires:
    class HelloWorldApp {
      public static void main(String[] args) {
        System.out.println("Hello World!");
      }
    }
  • Note 3 kinds of parentheses, 9 special words
• ML: print "Hello World!"

Other Issues
• Many more, see: http://www.cs.cmu.edu/~NatProg/langeval.html
• You can send me examples from each other’s systems!
• But these are mainly good for analysis
• Given a design question, how to answer it?

Our Research
• Lots of gaps in prior research on people and programming
• Develop knowledge that can be used in design
• Ph.D.
  thesis of John Pane
  • Available at: http://www.cs.cmu.edu/~pane/thesis/
• Evaluate:
  • How people express algorithms and think about tasks
  • Vocabulary and notations used
• Related to the HCI principles of “know the user”, “task analysis”, and “closeness of mapping”

Our Studies so far
• How people naturally express programming concepts and algorithms
  1) Nine scenes from PacMan
  2) Transforming and calculating data in a spreadsheet
• Specific issues of language design
  3) Selecting specific objects from a group (“and”, “or”, “not”)

Experimental Design
• Question should not bias the answer
  • So use pictures instead of textual descriptions
• Concentrate on kids, non-programmers
  • Subjects should not be “tainted” by existing programming languages
• Tested that the results generalize to adults and programmers

Study 1
• Usually Pacman moves like this.
• Now let's say we add a wall.
• Pacman moves like this.
• Not like this.
• Do this: Write a statement that summarizes how I (as the computer) should move Pacman in relation to the presence or absence of other things.

Second Study
• Whether similar results from other domains and with adults
• Developed 11 questions with scenarios using spreadsheets
  • To test database access and operations
  • More conventionally “computational”

Example Question, 2nd Study
Question 4
• Describe in detailed steps what the computer should do to categorize these people into 2 groups of ‘Gold’ and ‘Black’.

No.  First name  Last name   Group
1    Sandra      Bullock
2    Bill        Clinton
3    Cindy       Crawford
4    Tom         Cruise
5    Bill        Gates
6    Whitney     Houston
7    Michael     Jordan
8    Jay         Leno
9    David       Letterman
10   Will        Smith
No.  First name  Last name   Group
1    Sandra      Bullock     Gold
2    Bill        Clinton     Gold
3    Cindy       Crawford    Gold
4    Tom         Cruise      Gold
5    Bill        Gates       Black
6    Whitney     Houston     Gold
7    Michael     Jordan      Gold
8    Jay         Leno        Black
9    David       Letterman   Black
10   Will        Smith       Gold

Results
• Rule-based style
  • “If PacMan loses all his lives, its game over.”
• Some use of Constraint style:
  • “Pacman cannot go through a wall.”
• Aggregate operations instead of iterations
  • “The monsters turn blue and run away”
  • “Subtract 20,000 from all elements in Round 2”
  • These tend to eliminate control structures

More Results
• The words “AND” and “THEN” often used for sequencing instead of as a logical operator
  • “The monsters turn color and start to back up.”
• Boolean expression (AND, OR) not common
• Usually had mutually exclusive rules
  • “If I press the up arrow, PacMan goes up. If I press the down arrow, PacMan goes down, …”
• General case first, then exceptions
  • “When you encounter a ghost, it should kill you. But if you get a big pill first you can eat them.”

Yet More Results
• Most arithmetic used natural language style
  • “When PacMan eats a big dot, the score goes up 100.”
• Operations suggest data as lists, not arrays
  • People don’t make space before inserting
• Objects normally moving
  • “If PacMan hits a wall, he stops.”
  • so objects remember their own state
• 2/3 of the first study subjects drew pictures
  • Usually to define the initial state

Third Study: Select Objects from a Group
• Concentrate on a known problematic area
• Use of AND, OR, NOT
  • Often eliminated from Web searching
    • Newsweek reports that fewer than 6% of users manage to use “and”, “or”, “+”, “-”
  • Still dominant in all programming languages
• First: generate queries given results
• Then, answer queries
• Form-based and Textual formats
  • Order was counter-balanced

Generate Queries
[Figure]

Answer Queries
[Figure]

Results
• Using “unless” did not help accuracy
  • “select
    the objects that are blue unless the objects are square” vs. “select the objects that match blue and not square”
• “And” was a Boolean conjunction sometimes
  • “select the objects that match blue and circle” vs. “select the objects that match blue and the objects that match circle”
• Precedence of “not” varied
  • “select the objects that match not red and square”
    • 64% interpreted as “(not red) and square”
  • “select the objects that match not triangle or green”
    • 67% interpreted as “not (triangle or green)”

More results
• 2-D forms helped for generation
  • 94% correct with match forms, vs. 85% correct with text (p<.0001)
  • (blue and not square) or (circle and not green)

Implications for New Languages
• For increased usability for novices:
  • Use event-based style for dynamic events
  • Work to minimize the need for control structures and variables
  • Provide operations on groups of objects
  • Data structures that combine the capabilities of lists + arrays + sets
  • Support simple arithmetic in natural language style (“add 1 to score”)
  • Avoid the use of the word “and” altogether

New Language and System: HANDS
• Human-centered Advances for Novices to Develop Software
• Video

Properties of HANDS
• Goal: Allow children age 10 to create interactive games and simulations
• Event based computation model
• Metaphor of agent manipulating cards
• All data is visible as properties of the cards
• All operations work on singletons or lists
• Programming in the small (in the tiny)
• No distinction in syntax
• Can generate lists on the fly with queries
• Minimize need for control structures
• Minimize need for local variables

More Properties of HANDS
• Verbose Language
• No precedence
• Does use parentheses
  • But just one kind!
• Environment provides lots of help with syntax and graphics
• Tries to be extremely consistent, and also applies other HCI rules
  • For example, combines IF, CASE (switch), and COND (from Lisp) into one construct
• Easier to read

Conclusions
• Much more research needed on the human side of programming
• Usability of languages and environments can be improved
• Claims about usability can be tested
  • Languages can be evaluated using HCI principles and techniques
• If you want a usable and learnable programming language, there are data and techniques available that can help.

Credits
• Support for this research has come in part from the National Science Foundation under Grant No. IRI-9900452 and Grant No. IIS-9817527
• For more information, see: http://www.cs.cmu.edu/~NatProg
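
Sketch: competing readings of the third-study queries

The third study reports that participants disagreed about how “not” and “and” combine in queries such as “select the objects that match not red and square” and “select the objects that match blue and circle”. The Python sketch below is only an illustration of why the readings give different answers. The Shape class, the sample objects, and the select helper are all hypothetical and are not taken from the study materials or from HANDS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    color: str
    form: str


# Hypothetical sample objects; the study's actual stimuli are not reproduced here.
SHAPES = [
    Shape("red", "square"),
    Shape("red", "circle"),
    Shape("blue", "square"),
    Shape("blue", "circle"),
    Shape("green", "triangle"),
]


def select(predicate, shapes=SHAPES):
    """Return the shapes that satisfy the predicate, keeping the original order."""
    return [s for s in shapes if predicate(s)]


# "select the objects that match not red and square"
# Reading 1: "(not red) and square"
tight_not = select(lambda s: s.color != "red" and s.form == "square")
# Reading 2: "not (red and square)"
loose_not = select(lambda s: not (s.color == "red" and s.form == "square"))

# "select the objects that match blue and circle"
# Reading 1: "and" as Boolean conjunction (objects that are both blue and circles)
and_as_conjunction = select(lambda s: s.color == "blue" and s.form == "circle")
# Reading 2: "and" as combining two groups (the blue objects plus the circles)
and_as_union = select(lambda s: s.color == "blue" or s.form == "circle")

for label, result in [
    ("(not red) and square", tight_not),
    ("not (red and square)", loose_not),
    ("blue and circle, as conjunction", and_as_conjunction),
    ("blue and circle, as union", and_as_union),
]:
    print(f"{label}: {[(s.color, s.form) for s in result]}")
```

On this sample, the two readings of each query select different sets of objects, which is the kind of mismatch the study measured. A language for novices could avoid the problem by making the grouping explicit, for example with the 2-D match forms mentioned in the results.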