NATIONAL OPEN UNIVERSITY OF NIGERIA
FACULTY OF SCIENCES
DEPARTMENT OF COMPUTER SCIENCE

COURSE CODE: CIT 332
COURSE TITLE: Survey of Programming Languages

National Open University of Nigeria
University Village, Plot 91 Jabi Cadastral Zone
Nnamdi Azikiwe Expressway, Jabi, Abuja

Lagos Office: 14/16 Ahmadu Bello Way, Victoria Island, Lagos

Departmental e-mail: computersciencedepartment@noun.edu.ng
NOUN e-mail: centralinfo@noun.edu.ng
URL: www.nou.edu.ng

First Printed 2022
ISBN: 978-058-557-5
All Rights Reserved
Printed by: NOUN PRESS, January 2022

COURSE GUIDE

Course Code: CIT 332
Course Title: Survey of Programming Languages
Course Team: Dr. Kehinde Sotonwa (Developer/Writer); Dr. Greg Onwodi (Content Editor); Dr. Francis B. Osang (HOD/Internal Quality Control Expert)

CONTENTS

Introduction
What You Will Learn in this Course
Course Aim
Course Objectives
Working through this Course
Course Justification
Course Materials
Course Requirements
Study Units
Textbooks and References
Assignment File
Presentation Schedule
Assessment
Tutor-Marked Assignments (TMAs)
Final Examination and Grading
Course Marking Scheme
Course Overview
How to Get the Most from this Course
Tutors and Tutorials

Introduction

CIT 332: Survey of Programming Languages is a second-semester course. It is a four-credit degree course available to 300-level students of Computer Science and all related disciplines studying towards a Bachelor of Science in Computer Science and related fields. The course is divided into five modules and 12 study units. It presents an overview of programming languages and a brief survey of individual languages, and it introduces language description, declarations and types, abstraction mechanisms, and modules in programming languages.

What You Will Learn in this Course

The overall aim of this course is to give you a general overview of programming languages: their evolution, generations, characteristics, the taxonomy of the different levels of programming languages, and the differences between these levels. The course is designed to describe the fundamental concepts of programming languages by discussing the design issues of the various language constructs, examining the design choices for these constructs in some of the most common languages, and critically comparing design alternatives. It covers the fundamental syntactic and semantic concepts underlying modern programming languages; different modern programming languages such as C/C++, Java, LISP and PROLOG; their syntax, bindings and scope, data types and type checking, functional schemes, expressions and assignments, control structures, program statements and program units; modules in programming languages; and, finally, language comparison.

Course Aim

This course aims to take a step further in teaching you a sound approach to surveying programming languages. It is hoped that this knowledge will enhance the expertise of programmers and developers and help students become familiar with popular programming languages and the advantages they have over each other.
Course Objectives

It is important to note that each unit has specific objectives. Study them carefully before proceeding to subsequent units, and refer back to them in the course of your study to assess your progress. Always review the unit objectives after completing a unit; in this way, you can be sure that you have done what is required of you by the end of the unit. The general objective of the course, as an integral part of the Bachelor's degree in Computer Science at the National Open University of Nigeria, is to enable you to:

• Demonstrate understanding of the evolution of programming languages and relate how this history has led to the paradigms available today.
• Identify at least one outstanding and distinguishing characteristic for each of the programming paradigms covered in this unit.
• Evaluate the tradeoffs between the different paradigms, considering such issues as space efficiency, time efficiency (of both the computer and the programmer), safety, and power of expression.
• Describe the importance and power of abstraction in the context of virtual machines.
• Explain the benefits of intermediate languages in the compilation process.
• Evaluate the tradeoffs between reliability and writability.
• Compare and contrast compiled and interpreted execution models, outlining the relative merits of each.
• Describe the phases of program translation from source code to executable code and the files produced by these phases.
• Explain the differences between machine-dependent and machine-independent translation and where these differences are evident in the translation process.
• Explain formal methods of describing syntax (Backus-Naur Form, context-free grammars, and parse trees).
• Describe the meanings of programs (dynamic semantics, weakest preconditions).
• Identify and describe the properties of a variable, such as its associated address, value, scope, persistence, and size.
• Explain data types: primitive types, character string types, user-defined ordinal types, array types, associative arrays, and pointer and reference types.
• Demonstrate different forms of binding, visibility, scoping, and lifetime management.
• Demonstrate the difference between overridden and overloaded subprograms.
• Explain functional side effects.
• Demonstrate the difference between pass-by-value, pass-by-result, pass-by-value-result, pass-by-reference, and pass-by-name parameter passing.
• Explain the difference between static binding and dynamic binding.
• Discuss the evolution, history, program structure, and features of some commonly used programming language paradigms such as C/C++, Java, and PROLOG.
• Examine, evaluate, and compare these languages.
• Increase the capacity of computer science students to express ideas.
• Improve their background for choosing appropriate languages.
• Increase their ability to learn new languages.
• Better understand the significance of programming language implementation.
• Contribute to the overall advancement of computing.

Working through this Course

To complete this course, you are required to study all the units, the recommended textbooks, and other relevant materials. Each unit contains some self-assessment exercises and tutor-marked assignments, and at certain points in the course you are required to submit the tutor-marked assignments. There is also a final examination at the end of the course. Stated below are the components of this course and what you have to do.

Course Justification

Any serious study of programming languages requires an examination of some related topics, among which are formal methods of describing the syntax and semantics of programming languages and their implementation techniques. The need to use programming languages to solve our day-to-day problems grows every year.
Students should become familiar with popular programming languages and the advantages they have over each other, and should be able to tell which programming language solves a particular problem best. The theoretical and practical knowledge acquired from this course will give students a foundation from which they can appreciate the relevance and interrelationships of different programming languages.

Course Materials

The major components of the course are:

1. Course Guide
2. Study Units
3. Textbooks
4. Assignment Files
5. Presentation Schedule

Course Requirements

This is a compulsory course for all computer science students in the University. In view of this, students are expected to participate in all the course activities and have a minimum of 75% attendance to be eligible to write the final examination.

Study Units

There are 5 modules and 12 study units in this course. They are:

Module 1 Overview of Programming Languages
Unit 1 History of Programming Languages
Unit 2 Brief Survey of Programming Languages
Unit 3 The Effects of Scale on Programming Methodology

Module 2 Language Description
Unit 1 Syntactic Structure
Unit 2 Language Semantics

Module 3 Declaration and Types
Unit 1 The Concept of Types
Unit 2 Overview of Type-Checking

Module 4 Abstraction Mechanisms
Unit 1 Concepts of Abstraction
Unit 2 Parameterization Mechanisms

Module 5 Modules in Programming Languages
Unit 1 Object-Oriented Language Paradigm
Unit 2 Functional Language Paradigms
Unit 3 Logic Language Paradigms

Textbooks and References

Bjørner, Dines and Cliff B. Jones (1978). The Vienna Development Method: The Meta-Language. Lecture Notes in Computer Science 61. Berlin, Heidelberg, New York: Springer.

Gabbrielli, Maurizio and Simone Martini (2010). Programming Languages: Principles and Paradigms. Springer-Verlag London.

Sebesta, Robert W. (2011). Concepts of Programming Languages, Tenth Edition. University of Colorado at Colorado Springs. Pearson.
Teufel, Bernd (1991). Organization of Programming Languages. Springer-Verlag, Wien.

Scott, Michael L. (2006). Programming Language Pragmatics. Morgan Kaufmann.

Anoemuah, Rosemary Aruorezi (2018). Michael and Cecilia Ibru University.

Tucker, Allen B. and Robert Noonan (2006). Programming Languages: Principles and Paradigms. McGraw-Hill Higher Education.

Harle, Robert (2009/10). Object Oriented Programming. IA NST CS and CST, Lent.

Aho, A. V., J. E. Hopcroft, and J. D. Ullman (2007). The Design and Analysis of Computer Algorithms. Boston: Addison-Wesley.

Kasabov, N. K. (1998). Foundations of Neural Networks, Fuzzy Systems and Knowledge Engineering. The MIT Press, Cambridge.

Bezem, M. (2010). A Prolog Compendium. Department of Informatics, University of Bergen, Bergen, Norway.

Rowe, N. C. (1988). Artificial Intelligence through Prolog. Prentice-Hall.

Folorunso, Olusegun. Organization of Programming Languages. Department of Computer Science, University of Agriculture, Abeokuta.

Bramer, M. (2005). Logic Programming with Prolog. Springer.

Prolog Development Center (2001). Visual Prolog Version 5.x: Language Tutorial. Copenhagen, Denmark.

Bergin, Thomas J. and Richard G. Gibson, eds. (1996). History of Programming Languages-II. New York: ACM Press.

Christiansen, Tom and Nathan Torkington (1997-1999). Perlfaq1 Unix Manpage. Perl 5 Porters.

"Java History." http://ils.unc.edu/blaze/java/javahist.html. Cited March 29, 2000.

"Programming Languages." McGraw-Hill Encyclopedia of Science and Technology. New York: McGraw-Hill, 1997.

Wexelblat, Richard L., ed. (1981). History of Programming Languages. New York: Academic Press.

A History of Computer Programming Languages (onlinecollegeplan.com).

A History of Computer Programming Languages (brown.edu).

"A Brief History of Programming Languages." http://www.byte.com/art/9509/se7/artl9.htm. Cited March 25, 2000.

Myers, Jeremy. "A Short History of the Computer." http://www.softlord.com/comp/. Cited March 25, 2000.
IT Lecture 1: History of Computers.pdf (webs.com). Miss N. Nembhard (1995).

Roberts, Eric S. (1995). The Art and Science of C. Addison-Wesley, Reading.

Why Study Programming Languages? (lmu.edu), https://cs.lmu.edu/ray/notes/whystudyprogramminglanguages/.

6 Reasons Why You Should Learn Basic Programming. UoPeople, https://www.uopeople.edu/blog/6-reasons-why-you-should-learn-basic-programming/.

Cajic, Dino (2017). Programming Language Concepts Test Questions/Answers, https://dinocajic.blogspot.com/2017/02/programminglanguage-concepts-test.html.

50+ Best Programming Interview Questions and Answers in 2021 (hackr.io), https://hackr.io/blog/programming-interview-questions.

ide.geeksforgeeks.org, 5th Floor, A-118, Sector-136, Noida, Uttar Pradesh 201305.

Louden. Programming Languages: Principles and Practices. Cengage Learning.

Tucker. Programming Languages: Principles and Paradigms. Tata McGraw-Hill.

Horowitz, E. Programming Languages, 2nd Edition. Addison-Wesley.

Programming Language Evaluation Criteria, Part 1: Readability, by ngwes. Medium.

Language Evaluation Criteria (gordon.edu).

Boudreau, Emmett. What is a Programming Paradigm? An Overview of Programming Paradigms. Towards Data Science.

Learning Programming Methodologies: Absolute Beginners. Tutorials Point, Simply Easy Learning.

Questions-on-principle-of-programming-language.pdf (oureducation.in), https://blog.oureducation.in/wp-content/uploads/2013/Questions-on-principle-of-programming-language.pdf.

Umre, Jayesh. Language Evaluation Criteria, PPL.

Chapter_3_Von_Neumanns_Architecture.pdf (weebly.com), https://viennaict.weebly.com.

Scott, Michael L. Programming Language Pragmatics, Third Edition. Morgan Kaufmann Publishers, Elsevier.
Semantics in Programming Languages. INFO4MYSTREY, BestMark Mystery 2020 (info4mystery.com), https://www.info4mystery.com/2019/01/semantics-in-programming-languages.html.

Bennett, J. P. (1990). Introduction to Compiling Techniques. Berkshire, England: McGraw-Hill.

Pyster, A. (1988). Compiler Design and Construction. New York, NY: Van Nostrand Reinhold.

Tremblay, J. and P. Sorenson (1985). The Theory and Practice of Compiler Writing. New York, NY: McGraw-Hill.

Semantic Analysis in Compiler Design. GeeksforGeeks, https://www.geeksforgeeks.org/semantic-analysis-in-computer-design/.

Compiler Design: Semantic Analysis (tutorialspoint.com), https://www.tutorialspoint.com/compiler_design_semantic_analysis.html.

Sewell, Peter (2009). Semantics of Programming Languages. Computer Science Tripos, Part 1B, Computer Laboratory, University of Cambridge.

Davis, Chris. UTD: Describing Syntax and Semantics. cid021000@utdallas.edu.

Hennessy, M. (1990). The Semantics of Programming Languages. Wiley. Out of print, but available on the web at http://www.cogs.susx.ac.uk/users/matthewh/semnotes.ps.gz.

Pierce, B. C. (2002). Types and Programming Languages. MIT Press.

DeNero, John. Composing Programs. Based on the textbook Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Jay Sussman; licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Winskel, G. (1993). The Formal Semantics of Programming Languages. MIT Press. An introduction to both operational and denotational semantics; recommended for the Part II Denotational Semantics course.

Plotkin, G. D. (1981). A Structural Approach to Operational Semantics. Technical Report DAIMI FN-19, Aarhus University.

Programming Languages: Types and Features. Chakray, https://www.chakay.com/programminlanguages-types-and-features.

Appel, A. W. (1998). Modern Compiler Implementation in Java. Cambridge University Press, Cambridge.

Gosling, J., B. Joy, G.
Steele, and G. Bracha (2005). The Java Language Specification, 3/E. Addison-Wesley, Reading. Available online at http://java.sun.com/docs/books/jls/index.html.

Hopcroft, J. E., R. Motwani, and J. Ullman (2001). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading.

Laneve, C. (1998). La Descrizione Operazionale dei Linguaggi di Programmazione. Franco Angeli, Milano (in Italian).

Morris, C. W. (1938). Foundations of the Theory of Signs. In Writings on the Theory of Signs, pages 17-74. Mouton, The Hague.

Programming Data Types & Structures | Computer Science (teachcomputerscience.com), https://teachcomputerscience.com/programming-data-types/.

Safyan, Michael. Beginner's Guide to Passing Parameters | Codementor, https://www.codementor.io/@michaelsafyan/computer-different-ways-to-pass-parameters-du107xfer.

Stratton. Parameter Passing | Computing (glowscotland.org.uk), https://blogs.glowscotland.org.uk.

Parameter Passing Techniques in C/C++. GeeksforGeeks, https://www.geeksforgeeks.org/.

Parameter Passing: Implementation (Computational Constructs). Higher Computing Science Revision, BBC Bitesize, https://www.bbc.co.uk/.

Bolton, David (2017). Definition of Parameters in Computer Programming (thoughtco.com).

Modern Programming Language, Lectures 18-24: LISP Programming.

LISP: Data Types (tutorialspoint.com).

Hauskrecht, Milos. Introduction to LISP. CS 2740: Knowledge Representation, Lecture 2.

Gabriel and Steele. The Evolution of Lisp. ACM History of Programming Languages.

Touretzky, David S. Common LISP: A Gentle Introduction to Symbolic Computation.

History of Python Programming Language. Python Pool.

Pramanick, Sohom. History of Python. GeeksforGeeks.

Perl Programming Language: History, Features, Applications, Why Learn? Answersjet.

History of Python (tutorialspoint.com).

History of Python Programming Language. JournalDev.
Root, Dan. A Brief History of the Python Programming Language. Python in Plain English.

History and Development of Python Programming Language (analyticsinsight.net).

The ALGOL Programming Language (umich.edu).

McMillan, Mike. Programming History: The Influence of Algol on Modern Programming Languages (Part 1). Better Programming.

History of Algol 68 | Programming Language (wordpress.com).

Classification of Programming Languages, https://www.javatpoint.com.

https://www.w3schools.in/category/javatutorial.

Difference between High-Level Language and Low-Level Language, https://www.tutorialspoint.com/difference-between-high-level-language-and-low-level-language.

https://byjus.com/gate/difference-between-high-level-and-low-level-languages/.

What is a Low-Level Language? Definition from Techopedia.

https://www.tutorialandexample.com/input-devies-of-computer/.

Java Syntax: A Complete Guide to Master Java. DataFlair, https://data-flair.training/blog/basic-java-syntax/.

Soulie, Juan. C++ Language Tutorial, http://www.cplusplus.com/doc/tutorial/.

Assignment File

The assignment file will be given to you in due course. In this file, you will find all the details of the work you must submit to your tutor for marking. The marks you obtain for these assignments will count towards the final mark for the course. Altogether, there are 12 tutor-marked assignments for this course.

Presentation Schedule

The presentation schedule included in this course guide provides you with important dates for the completion of each tutor-marked assignment. You should therefore endeavor to meet the deadlines.

Assessment

There are two aspects to the assessment of this course: first, the tutor-marked assignments; and second, the written examination. You are therefore expected to take note of the facts, information, and problem-solving skills gathered during the course. The tutor-marked assignments must be submitted to your tutor for formal assessment in accordance with the deadlines given.
The work submitted will count for 40% of your total course mark. At the end of the course, you will need to sit for a final written examination, which will account for 60% of your total score.

Tutor-Marked Assignments (TMAs)

There are 12 TMAs in this course, and you need to submit all of them. The best five will be counted, and the total marks for these five assignments will make up 40% of your total course mark. Assignment questions for the units in this course are contained in the Assignment File. You should be able to complete your assignments from the information and materials contained in your set textbooks, reading, and study units. However, you may wish to use other references to broaden your viewpoint and provide a deeper understanding of the subject. When you have completed each assignment, send it to your tutor as soon as possible and make certain that it gets to your tutor on or before the stipulated deadline. If for any reason you cannot complete your assignment on time, contact your tutor before the assignment is due to discuss the possibility of an extension. Extensions will not be granted after the deadline, except in extraordinary cases.

Final Examination and Grading

The final examination for the course will carry 60% of the total marks available for this course. The examination will cover every aspect of the course, so you are advised to revise all your corrected assignments before the examination.

This course endows you with the status of both a teacher and a learner. This means that you teach yourself and that you learn, as your learning capabilities allow. It also means that you are in a better position to determine and ascertain the what, the how, and the when of your learning. No teacher imposes any method of learning on you. The course units are designed accordingly, with the introduction following the contents, then a set of objectives, then the dialogue, and so on.
The objectives guide you as you go through the units to ascertain your knowledge of the required terms and expressions.

Course Marking Scheme

The following table shows the course marking scheme.

Table 1: Course Marking Scheme

Assessment          Marks
Assignments 1-12    12 assignments; the best 5 count for 8% each (5 x 8% = 40% of course marks)
Final Examination   60% of overall course marks
Total               100% of course marks

Course Overview

This table indicates the units, the number of weeks required to complete them, and the assignments.

Unit  Title of the Work                                  Weeks    Assessment (End of Unit)
      Course Guide
Module 1: Overview of Programming Languages
1     History of Programming Languages                   Week 1   Assessment 1
2     Brief Survey of Programming Languages              Week 2   Assessment 2
3     The Effects of Scale on Programming Methodology    Week 3   Assessment 3
Module 2: Language Description
1     Syntactic Structure                                Week 4   Assessment 4
2     Language Semantics                                 Week 5   Assessment 5
Module 3: Declaration and Types
1     The Concept of Types                               Week 6   Assessment 6
2     Overview of Type-Checking                          Week 7   Assessment 7
Module 4: Abstraction Mechanisms
1     Concepts of Abstraction                            Week 8   Assessment 8
2     Parameterization Mechanisms                        Week 9   Assessment 9
Module 5: Modules in Programming Languages
1     Object-Oriented Language Paradigm                  Week 10  Assessment 10
2     Functional Language Paradigms                      Week 11  Assessment 11
3     Logic Language Paradigms                           Week 12  Assessment 12

HOW TO GET THE BEST FROM THIS COURSE

In distance learning, the study units replace the university lecturer. This is one of the great advantages of distance learning: you can read and work through specially designed study materials at your own pace, and at a time and place that suit you best. Think of it as reading the lecture instead of listening to a lecturer. In the same way that a lecturer might set you some reading to do, the study units tell you when to read your set books or other material.
Just as a lecturer might give you an in-class exercise, your study units provide exercises for you to do at appropriate points.

Each of the study units follows a common format. The first item is an introduction to the subject matter of the unit and how the unit is integrated with the other units and the course as a whole. Next is a set of learning objectives. These objectives let you know what you should be able to do by the time you have completed the unit, and you should use them to guide your study. When you have finished a unit, go back and check whether you have achieved the objectives. If you make a habit of doing this, you will significantly improve your chances of passing the course. Remember that your tutor's job is to assist you; when you need help, don't hesitate to call and ask your tutor to provide it.

1. Read this Course Guide thoroughly.
2. Organize a study schedule. Refer to the 'Course Overview' for more details. Note the time you are expected to spend on each unit and how the assignments relate to the units. Whatever method you choose to use, decide on it and write in your own dates for working on each unit.
3. Once you have created your own study schedule, do everything you can to stick to it. The major reason that students fail is that they lag behind in their course work.
4. Turn to Unit 1 and read the introduction and the objectives for the unit.
5. Assemble the study materials. Information about what you need for a unit is given in the overview at the beginning of each unit. You will almost always need both the study unit you are working on and one of your set books on your desk at the same time.
6. Work through the unit. The content of the unit has been arranged to provide a sequence for you to follow. As you work through the unit, you will be instructed to read sections from your set books or other articles. Use the unit to guide your reading.
7.
Review the objectives for each study unit to confirm that you have achieved them. If you feel unsure about any of the objectives, review the study material or consult your tutor.
8. When you are confident that you have achieved a unit's objectives, you can then start on the next unit. Proceed unit by unit through the course and try to pace your study so that you keep yourself on schedule.
9. When you have submitted an assignment to your tutor for marking, do not wait for its return before starting on the next unit. Keep to your schedule. When the assignment is returned, pay particular attention to your tutor's comments, both on the tutor-marked assignment form and on the assignment itself. Consult your tutor as soon as possible if you have any questions or problems.
10. After completing the last unit, review the course and prepare yourself for the final examination. Check that you have achieved the unit objectives (listed at the beginning of each unit) and the course objectives (listed in this Course Guide).

TUTORS AND TUTORIALS

There are 12 hours of tutorials provided in support of this course. You will be notified of the dates, times, and location of these tutorials, together with the name and phone number of your tutor, as soon as you are allocated a tutorial group. Your tutor will mark and comment on your assignments, keep a close watch on your progress and on any difficulties you might encounter, and provide assistance to you during the course. You must mail or submit your tutor-marked assignments to your tutor well before the due date (at least two working days are required). They will be marked by your tutor and returned to you as soon as possible. Do not hesitate to contact your tutor by telephone or e-mail if you need help. The following are circumstances in which you would find help necessary.
Contact your tutor if:

• You do not understand any part of the study units or the assigned readings.
• You have difficulty with the self-tests or exercises.
• You have a question or problem with an assignment, with your tutor's comments on an assignment, or with the grading of an assignment.

You should try your best to attend the tutorials. This is the only chance to have face-to-face contact with your tutor and to ask questions which are answered instantly. You can raise any problem encountered in the course of your study. To gain the maximum benefit from course tutorials, prepare a question list before attending them. You will learn a lot from participating actively in discussions.

TABLE OF CONTENTS

Module 1 Overview of Programming Languages
Unit 1 History of Programming Languages
Unit 2 Brief Survey of Programming Languages
Unit 3 The Effects of Scale on Programming Methodology

Module 2 Language Description
Unit 1 Syntactic Structure
Unit 2 Language Semantics

Module 3 Declaration and Types
Unit 1 The Concept of Types
Unit 2 Overview of Type-Checking

Module 4 Abstraction Mechanisms
Unit 1 Concepts of Abstraction
Unit 2 Parameterization Mechanisms

Module 5 Modules in Programming Languages
Unit 1 Object-Oriented Language Paradigm
Unit 2 Functional Language Paradigms
Unit 3 Logic Language Paradigms

MODULE 1 OVERVIEW OF PROGRAMMING LANGUAGES

Unit 1 History of Programming Languages
Unit 2 Brief Survey of Programming Languages (Procedural Languages, Object-Oriented Languages, Functional Languages, Declarative (Non-Algorithmic) Languages, and Scripting Languages)
Unit 3 The Effects of Scale on Programming Methodology

UNIT 1 HISTORY OF PROGRAMMING LANGUAGES

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 What is a Programming Language?
    3.2 Classification of Programming Languages
        3.2.1 First Generation
        3.2.2 Second Generation
        3.2.3 Third Generation
        3.2.4 Fourth Generation
        3.2.5 Fifth Generation
    3.3 Characteristics of Each Generation
        3.3.1 Computer Characteristics and Capabilities for 1GL
        3.3.2 Computer Characteristics and Capabilities for 2GL
        3.3.3 Computer Characteristics and Capabilities for 3GL
        3.3.4 Computer Characteristics and Capabilities for 4GL
        3.3.5 Computer Characteristics and Capabilities for 5GL
    3.4 Taxonomy of Programming Languages
        3.4.1 Low-Level Languages
        3.4.2 Middle-Level Languages
        3.4.3 High-Level Languages
        3.4.4 Higher-Level Languages
        3.4.5 Translation Used by Programming Languages
        3.4.6 Some General Comments on High-Level Languages
        3.4.7 Difference between Low-Level and High-Level Languages
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Ever since the invention of Charles Babbage's difference engine in 1822, computers have required a means of instructing them to perform a specific task. This means is known as a programming language.
Computer languages were first composed of a series of steps to wire a particular program; these morphed into a series of steps keyed into the computer and then executed; later, these languages acquired advanced features such as logical branching and object orientation. The computer languages of the last fifty years have come in two stages: the first major languages, and the second major languages, which are in use today. Most computer programming languages were inspired by or built upon concepts from previous programming languages. Today, while older languages still serve as a strong foundation for new ones, newer computer programming languages make programmers' work simpler. Businesses rely heavily on programs to meet all of their data, transaction, and customer service needs. Science and medicine need accurate and complex programs for their research. Mobile applications must be updated to meet consumer demands. All of these new and growing needs ensure that computer programming languages, both old and new, will remain an important part of modern life.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• Define programming languages
• Distinguish between the different generations of programming languages
• Describe the characteristics of each generation of programming languages
• Explain the different levels of programming languages

3.0 MAIN CONTENT

3.1 What is a Programming Language?

Before getting into computer programming, let us first understand what computer programs are and what they do. A computer program is a sequence of instructions written in a computer programming language to perform a specified task on the computer. The two important terms in this definition are:

• Sequence of instructions
• Computer programming language

To understand these terms, consider a situation in which someone asks you for directions to the nearest hospital. You would direct the person along the exact route.
You would use a human language to say something like the following: first go straight; after half a kilometer, take a left at the red light; then drive about one kilometer and you will find the hospital or clinic on the right. Here, you have used the English language to give several steps to be taken to reach the hospital. If they are followed in the following sequence, you will reach the hospital:

Step 1: Go straight
Step 2: Drive half a kilometer
Step 3: Take a left
Step 4: Drive about one kilometer
Step 5: Look for the hospital on your right side

Now, try to map the situation to a computer program. The above sequence of instructions is actually a human program written in the English language, which instructs on how to reach the hospital from a given starting point. This same sequence could have been given in Spanish, Hindi, Arabic, or any other human language, provided the person seeking directions knows the language. Now consider a computer program, which is a sequence of instructions written in a computer language to perform a specified task on the computer. Following is a simple program:

print "Hello, World!"

The above computer program instructs the computer to print "Hello, World!" on the computer screen. A computer program is also called computer software, which can range from two lines to millions of lines of instructions.
• Computer program instructions are also called program source code, and computer programming is also called program coding.
• A computer without a computer program is just a dumb box; it is programs that make computers active.

Computer programming is the process that professionals use to write code that instructs how a computer, application or software program performs. At its most basic, computer programming is a set of instructions to facilitate specific actions. A programming language is a computer language that is used by programmers (developers) to communicate with computers.
It is a set of instructions written in a specific language to perform a specific task. It is made up of a series of symbols that serve as a bridge allowing humans to translate their thoughts into instructions computers can understand. Humans and machines process information differently, and programming languages are the key to bridging the gap between people and computers; this is why the first computer programming language was created.

3.2 Classification of Programming Languages by Generation

Generations of computers have seen changes based on evolving technologies. With each new generation, computer circuitry, size, and parts have been miniaturized, the processing speed doubled, memory got larger, and usability and reliability improved. Note that the timeline specified for each generation is tentative and not definite; the generations are actually based on evolving chip technology rather than any particular time frame. The five generations of computers are characterized by the electrical current flowing through the processing mechanisms listed below:
• Vacuum tubes
• Transistors
• Integrated circuits
• Microprocessors
• Artificial intelligence

3.2.1 First Generation (1GL)

A machine-level programming language was used to program first-generation computers. Originally, no translator was used to compile or assemble the first-generation language. Computers operate on a binary level, that is, on sequences of 0's and 1's, so in the early days of computer programming binary machine code was used to program computers. Binary programming means that the program directly reflects the hardware structure of the computer. Sequences of 0's and 1's cause certain actions in processing, control, or memory units, i.e. programming was done at the flip-flop level. It was recognized very quickly that programming computers at the binary machine code level is very troublesome.
Thus, it is obvious that binary programming could not successfully be applied to solve complex problems.

3.2.2 Second Generation Languages (2GL)

Assembly languages: the obvious drawbacks of binary programming were reduced by the introduction of second generation languages (2GL). These languages allowed mnemonic abbreviations as symbolic names, and the concepts of commands and operands were introduced. A programmer's work became much easier, since symbolic notation and addressing of instructions and data were now possible. Compilation systems, called assemblers, were developed to translate assembly language (symbolic) programs into machine code. Assembly languages still reflect the hardware structure of the target machine, not at the flip-flop level but at the register level, i.e. the abstraction has changed from the flip-flop to the register level. The instruction set of the target computer directly determines the scope of an assembly language. With the introduction of linking mechanisms, separate assembly of code became possible and the first steps towards program structuring were recognizable, although the term structured programming cannot be used for assembly code. The major disadvantage of assembly languages is that programs are still machine dependent and, in general, only readable by their authors.

3.2.3 Third Generation Languages (3GL)

Third generation languages/high level languages/problem oriented languages: these languages allow control structures which are based on logical data objects: variables of a specific type. They provide a level of abstraction allowing machine-independent specification of data, functions or processes, and their control. A programmer can now focus on the problem to be solved without being concerned with the internal hardware structure of the target computer.
When considering high level programming languages, there are four groups of programming languages: imperative, declarative, object oriented and functional languages.

3.2.4 Fourth Generation Languages (4GL)

Fourth generation languages/non-procedural languages deal with the following two fields, which have become more and more important: database and query languages, and program or application generators. The steadily increasing usage of software packages like database systems, spreadsheets, statistical packages, and other (special purpose) packages makes it necessary to have a medium of control available which can easily be used by non-specialists. In fourth generation languages the user describes what he wants to be solved, instead of how he wants to solve a problem, as is done using procedural languages. In general, fourth generation languages are not only languages, but interactive programming environments. An example is SQL (Structured Query Language): a query language for relational databases based on Codd's requirements for non-procedural query languages. Another example is NATURAL, which emphasizes a structured programming style. Program or application generators are often based on a certain specification method and produce an output (e.g. a high level program) from an appropriate specification. A great number of fourth generation languages already exist.

3.2.5 Fifth Generation Languages (5GL)

A 5GL is a programming language based around solving problems using constraints given to the program rather than using an algorithm written by a programmer. 5GLs allow computers to have their own ability to think, and their own inferences can be worked out by using the programmed information in large databases. 5GLs gave birth to the dream of robots with AI and fuzzy logic. The fifth generation is based on the concept of artificial intelligence.
It uses the concept that rather than solving a problem algorithmically, an application can be built to solve it based on some constraints, i.e., we make computers learn to solve any problem. Parallel processing and superconductors are used for this type of language to make real artificial intelligence. The advantages of this generation are that machines can make decisions, programmer effort to solve a problem is reduced, and it is easier than 3GL or 4GL to learn and use. Examples are: PROLOG, LISP, etc.

3.3 Characteristics of Each Generation

3.3.1 Computer Characteristics and Capabilities for 1GL
• Size – Relatively big; equivalent to a room.
• Speed – Slow; hundreds of instructions per second.
• Cost – Very high.
• Language – Machine and Assembly Language.
• Reliability – High failure rate; failure of circuits per second.
• Power – High power consumption, and it generated much heat.

Trends and Developments in Computer Hardware
• Main Component – Based on vacuum tubes.
• Main Memory – Magnetic drum.
• Secondary Memory – Magnetic drum and magnetic tape.
• Input Media – Punched cards and paper tape.
• Output Media – Punched cards and printed reports.
• Examples – ENIAC, UNIVAC, Mark-I, Mark-III, IBM 700 series, IBM 701 series, IBM 709 series, etc.

3.3.2 Computer Characteristics and Capabilities for 2GL
• Size – Smaller than first generation computers.
• Speed – Relatively fast compared to the first generation; thousands of instructions per second.
• Cost – Slightly lower than the first generation.
• Language – Assembly Language and high level languages like FORTRAN, COBOL, BASIC.
• Reliability – Failure of circuits per days.
• Power – Low power consumption.

Trends and Developments in Computer Hardware
• Main Component – Based on transistors.
• Main Memory – Magnetic core.
• Secondary Memory – Magnetic tape and magnetic disk.
• Input Media – Punched cards.
• Output Media – Punched cards and printed reports.
• Examples – IBM-7000, CDC 3000 series, PDP-1, PDP-3, PDP-5, PDP-8, ATLAS, IBM 7094, etc.

3.3.3 Computer Characteristics and Capabilities for 3GL
• Size – Smaller than second generation computers; desk-size minicomputers.
• Speed – Relatively fast compared to the second generation; millions of instructions per second (MIPS).
• Cost – Lower than the second generation.
• Language – High level languages like PASCAL, COBOL, BASIC, C, etc.
• Reliability – Failure of circuits in weeks.
• Power – Low power consumption.

Trends and Developments in Computer Hardware
• Main Component – Based on integrated circuits (IC).
• Primary Memory – Magnetic core.
• Secondary Memory – Magnetic tape and magnetic disk.
• Input Media – Key-to-tape and key-to-disk.
• Output Media – Printed reports and video displays.
• Examples – IBM-370 series, CDC 7600 series, PDP (Programmed Data Processor) 11, etc.

3.3.4 Computer Characteristics and Capabilities for 4GL
• Size – Typewriter-size microcomputers.
• Speed – Relatively fast compared to the third generation; tens of millions of instructions per second.
• Cost – Lower than the third generation.
• Language – High level languages like C++, KL1, RPG, SQL.
• Reliability – Failure of circuits in months.
• Power – Low power consumption.

Trends and Developments in Computer Hardware
• Main Component – Large scale integrated (LSI) semiconductor circuits, called MICROPROCESSORS or chips, and VLSI (Very Large Scale Integration).
• Main Memory – Semiconductor memory like RAM and ROM; cache memory is used as primary memory.
• Secondary Memory – Magnetic disk, floppy disk, and optical disk (CD, DVD).
• Input Media – Keyboard.
• Output Media – Video displays, audio responses and printed reports.
• Examples – CRAY 2, IBM 3090/600 series, IBM AS/400/B60, etc.

3.3.5 Computer Characteristics and Capabilities for 5GL
• Size – Credit-card-size microcomputers.
• Speed – Billions of instructions per second.
• Cost – Slightly lower than the first generation.
• Language – Artificial Intelligence (AI) languages like LISP, PROLOG, etc.
• Reliability – Failure of circuits in years.
• Power – Low power consumption.

Trends and Developments in Computer Hardware
• Main Component – Based on ULSI (Ultra Large Scale Integration) circuits, also called the parallel processing method.
• Memory – Optical disk and magnetic disk.
• Input Media – Speech input, tactile input.
• Output Media – Graphics displays, voice responses.
• Examples – Laptops, palmtops, notebooks, PDAs (Personal Digital Assistants), etc.

3.4 Taxonomy of Programming Languages

Till now, thousands of programming languages have come into existence, each with its own specific purpose. All of these languages vary in the level of abstraction they provide from the hardware: a few provide little or no abstraction at all, while others provide a very high abstraction. On the basis of the level of abstraction, there are two types of programming languages: low-level languages and high-level languages. The first two generations are called low-level languages; the next three generations are called middle level languages, very high-level languages and higher level languages. Both types provide a set of instructions to a system for performing certain tasks. Though both have specific purposes, they vary in various ways. The primary difference between low and high-level languages is that any programmer can understand, compile, and interpret a high-level language more feasibly than the machine can, whereas machines are capable of understanding a low-level language more feasibly than human beings can. Below is the hierarchy of levels of programming languages.
3.4.1 Low-Level Language

A low-level language is a programming language that provides no abstraction from the hardware, and it is represented in 0 or 1 form, which are the machine instructions. The languages that come under this category are machine level language and assembly language. A low-level programming language provides little or no abstraction from a computer's instruction set architecture: commands or functions in the language map closely to the processor's instructions. Generally, this refers to either machine code or assembly language. Low-level languages are considered to be closer to computers; in other words, their prime function is to operate, manage and manipulate the computing hardware and components. Programs and applications written in a low-level language are directly executable on the computing hardware without any interpretation or translation. It is a hardware dependent language. Each processor has its own instruction set, and these instructions are patterns of bits; a class of processors using the same structure is specified by an instruction set. Any instruction can be divided into two parts: the operator (or opcode) and the operand. The starting bits are the operator (opcode), whose role is to identify the kind of operation that is required to be performed. The remaining bits are the operand, whose purpose is to show the location of the activity. Programs are written in various programming languages like C, C++, Java, Python, etc. The computer is not capable of understanding these programming languages; therefore, programs are compiled through a compiler that converts them into machine language. Below is a sample of how machine language is executed.

Features of Low Level Languages
• Fast processing: All the computer instructions are directly recognized. Hence, programs run very quickly.
• No need of a translator: Machine code is the natural language directly understood by the computer, so translators are not required. Assembly language, on the other hand, requires an assembler for conversion from mnemonics to equivalent machine code.
• Error prone: The instructions are written using 0's and 1's, which makes writing them a herculean task. Hence, there are more chances of error-prone code in these languages. Errors are reduced in assembly language when compared to machine language.
• Difficult to use: The binary digits (0 and 1) are used to represent the complete data or instructions, so it is a tough task for human beings to memorize all the machine codes. Assembly language is easier to use when compared to machine language.
• Difficulty in debugging: When there is a mistake within the logic of the program, it is difficult to find the error (bug) and debug the machine language program.
• Difficult to understand: It is very difficult to understand because it requires substantial knowledge of machine code for different system architectures.
• Lack of portability: They are computer dependent, such that a program written for one computer cannot be run on a different computer, because every computer has a unique architecture.
• In-depth knowledge of computer architecture: To program in assembly languages, the programmer needs to understand the computer architecture, just as for machine languages.

3.4.2 Middle-Level Language

The middle-level language lies in between the low level and high-level languages. C is a middle-level language: by using the C language, the user is capable of doing system programming (writing an operating system) as well as application programming. Java and C++ are also middle-level languages. The middle-level programming language interacts with the abstraction layer of a computer system.
It serves as the bridge between the raw hardware and the programming layer of the computer system. The middle-level language is also known as the intermediate programming language or pseudo-language. A middle-level language can also be the output of translating source code written in a high-level language. This kind of middle-level language is designed to improve the translated code before the processor executes it. The improvement schedule helps to adjust the source code according to the computational framework of the target machine. The CPU does not directly execute the source code of the middle-level language; it needs interpretation into binary code for execution. See the diagram of the overlapping of middle level languages below:

Overlapping of Middle Level Language

Features of Middle-Level Languages
• They have the feature of accessing memory directly using pointers.
• They use system registers for fast processing.
• They support high-level language features such as a user friendly nature.

3.4.3 High-Level Language

A high-level language is a programming language that allows a programmer to write programs which are independent of a particular type of computer. The high-level languages are considered high-level because they are closer to human languages than machine level languages. When writing a program in a high-level language, the whole attention can be paid to the logic of the problem. A compiler is required to translate a high-level language into a low-level language, as seen in the diagram below.
A program in a high-level language must be translated before execution. High-level languages deal with variables, arrays, objects, complex arithmetic or Boolean expressions, subroutines and functions, loops, threads, locks, etc. The high-level languages are closer to human languages and far from machine languages; they resemble human language, and the machine is not able to understand them directly. Consider this program:

    #include <stdio.h>

    int main() {
        printf("hello");
        return 0;
    }

This is an example of the C language, which is a middle-level language because it has features of both low and high-level languages. A human can understand this example easily, but the machine is not able to understand it without a translator. Every high-level language uses a different syntax. Some languages are designed for writing desktop software programs, and other languages are used for web development. The diagram below shows how a high level language is translated to a low level language for execution.

Features of High-Level Languages
• Third generation languages: High level languages are 3rd generation languages that are major refinements of the 2nd generation languages.
• Understandability: Programs written in a high-level language are easily understood.
• Debugging: Errors can be easily located in these languages because of the presence of standard error handling features.
• Portability: Programs written on one computer architecture can run on another. This also means that they are machine independent.
• Easy to use: Since they are written in English-like statements, they are easy to use.
• Problem-oriented languages: High level languages are problem oriented, such that they can be used for specific problems to be solved.
• Easy maintenance: We can easily maintain programs written in these languages.
• Translator required: There is always a need to translate a high-level language program to machine language.
3.4.4 Higher Level Programming Languages

These are fifth generation languages (5GLs). Fifth generation systems (5GSs) are characterized by large scale parallel processing (many instructions being executed simultaneously), different memory organizations, and novel hardware operations predominantly designed for symbol manipulation. A popular example of a 5GL is Prolog. These types of languages require the design of interfaces between human beings and computers to permit effective use of natural language and images.

3.4.5 Translators used by Programming Languages

Programming languages are artificial languages, and they use translators for the computer system to convert them to usable forms. There are three levels of programming languages: Machine Language (ML), Low Level Language (also known as Assembly Language) and High Level Language. The translators used by these languages differ, and they have been classified into: compiler, interpreter and assembler. Machine language does not use a translator. Low level language uses a translator called an assembler: an assembler converts a program written in low level language (assembly language) to machine code. High level language uses a compiler, an interpreter or both. Our emphasis in this course is on some selected high level languages. Some examples of HLLs include C, C++, Delphi, PASCAL, FORTRAN, Scala, Python, PERL, QBASIC and so on. The languages that will be considered most are: C, Java, Python and VBScript. All these languages are considered high-level languages because they must be processed with the help of a compiler or interpreter before code execution. Source code written in scripting languages like Perl and PHP can be run by an interpreter. These translators convert the high-level code into binary code so that the machine can understand it, as seen in the diagram below. The compiler is translator software.
This software translates a high-level program into its equivalent machine language program; the compiler produces a set of machine language instructions for every program in a high-level language. Below is a sample of compilation from a high level language to machine language.

The linker is used for large programs in which we create separate modules for different tasks. When we call a module, the whole job is to link to that module, and the program is processed. We can use a linker for huge software rather than storing all the lines of program code in a single source file.

The interpreter is the high-level language translator that takes one statement of the high-level language program at a time and translates it into a machine level language instruction, as seen in the diagram below. The interpreter immediately executes the resulting machine language instruction. The compiler translates the entire source program into an object program, but the interpreter translates line by line. The high-level language is easy to read, write, and maintain as it is written in English-like words. These languages were designed to overcome the main limitation of low-level languages, i.e., portability: the high-level language is portable, i.e., machine-independent.

3.4.6 General Comments on some High Level Programming Languages

• C is a compiled procedural, imperative programming language made popular as the basis of Unix. The language is very popular and is found useful in building mathematical and scientific applications.
• C++ is a compiled programming language that is based on C, with support for object-oriented programming. It is one of the most widely-used programming languages currently available. It is often considered to be the industry-standard language of game development, but is also very often used to write other types of computer software applications.
• C# is an object-oriented programming language developed by Microsoft as part of their .NET initiative, and later approved as a standard by ECMA and ISO. C# has a procedural, object oriented syntax based on C++ that includes aspects of several other programming languages (most notably Delphi, Visual Basic, and Java) with a particular emphasis on simplification.
• FORTRAN is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing. It was originally developed by International Business Machines Corporation (IBM) in the 1950s for scientific and engineering applications. FORTRAN has come in different versions over the years; we have FORTRAN II, FORTRAN 77, Fortran 90 and so on.
• VBScript (Visual Basic Script) was developed by Microsoft with the intention of developing dynamic web pages. It is a client-side scripting language like JavaScript. VBScript is a light version of Microsoft Visual Basic, and its syntax is very similar to that of Visual Basic. If you want to make your webpage more lively and interactive, you can incorporate VBScript in your code.
• Java is an object oriented interpreted programming language. It has gained popularity in the past few years for its ability to be run on many platforms, including Microsoft Windows, Linux, Mac OS, and other systems. It was developed by Sun Microsystems.
• Pascal is a general-purpose structured language named after the famous mathematician and philosopher Blaise Pascal.
• BASIC (Beginner's All-purpose Symbolic Instruction Code) was mostly used when microcomputers first hit the market, in the 1970s. Later, it was largely replaced by other languages such as C, Pascal and Java. Whilst some commercial applications have been written in BASIC, it has typically been seen as a language for learning/teaching programming rather than as a language for serious development.
• PHP is a programming language with a focus on web design and a C-like syntax.

3.4.7 Difference Between Low Level and High Level Languages

• Basic: High-level languages are programmer-friendly languages that are manageable, easy to understand and debug, and widely used in today's times. Low-level languages are machine-friendly languages that are very difficult for human beings to understand but easy for machines to interpret.
• Ease of Execution: High-level languages are very easy to execute; low-level languages are very difficult to execute.
• Process of Translation: High-level languages require the use of a compiler or an interpreter for their translation into machine code. A low-level language requires an assembler for directly translating the instructions into machine language.
• Efficiency of Memory: High-level languages have a very low memory efficiency, meaning they consume more memory than any low-level language. Low-level languages have a very high memory efficiency, consuming less memory than any high-level language.
• Portability: High-level languages are portable from any one device to another. A user cannot port low-level languages from one device to another.
• Comprehensibility: High-level languages are human-friendly and thus very easy for any programmer to understand and learn. Low-level languages are machine-friendly and thus very difficult for any human to understand and learn.
• Dependency on Machines: High-level languages do not depend on machines. Low-level languages are machine-dependent and thus very difficult for a normal user to understand.
• Debugging: It is very easy to debug high-level languages; a programmer cannot easily debug low-level languages.
• Maintenance: High-level languages have a simple and comprehensive maintenance technique; it is quite complex to maintain any low-level language.
• Usage: High-level languages are very common and widely used for programming in today's times. Low-level languages are not very common nowadays for programming.
• Speed of Execution: High-level languages take more time for execution compared to low-level languages, because they require a translation program. The translation speed of low-level languages is very high.
• Abstraction: High-level languages allow a higher abstraction. Low-level languages allow very little abstraction or no abstraction at all.
• Need of Hardware Knowledge: With high-level languages, one does not require knowledge of hardware for writing programs. With low-level languages, knowledge of hardware is a prerequisite to writing programs.
• Facilities Provided: High-level languages do not provide various facilities at the hardware level. Low-level languages are very close to the hardware and help in writing various programs at the hardware level.
• Ease of Modification: The process of modifying programs is very difficult with high-level programs, because every single statement in them may execute a bunch of instructions. The process of modifying programs is very easy in low-level programs, since statements map directly to processor instructions.
• Examples: Some examples of high-level languages include Perl, BASIC, COBOL, Pascal, Ruby, etc. Some examples of low-level languages include machine language and assembly language.

4.0 SELF-ASSESSMENT EXERCISE(S)
1. What is a programming language?
2. List the major components of each computer generation.
3. Explain the different generations of computers you know.
4. Name and explain another criterion by which languages can be judged.
5. In your opinion, what major features would a perfect programming language include?
6. Describe the advantages and disadvantages of low level, middle level and high level languages.

5.0 CONCLUSION

The study of programming languages is valuable for some important reasons: it gives insight into the generations of programming languages, and it enables you to know the background of the computer as a whole and the components attached to each generation.
In terms of speed, programs written in low-level languages are faster than those written in middle and high-level languages. This is because these programs do not need to be interpreted or compiled; they interact directly with the registers and memory. On the other hand, programs written in a high-level language are relatively slower. All these were discussed in this unit.

6.0 SUMMARY

This unit has explained what a programming language is, the classification and explanation of the different programming language generations, and the basic components of each computer programming generation. You also saw the different characteristics of each programming language generation in terms of computer characteristics and capabilities, and the trends and developments in computer hardware for the different generations. There are clear differences between high-level, mid-level, and low-level programming languages. We can also point out that each type of programming language is designed to serve its specific purpose; for this reason, the purpose each programming language serves is what makes one type of programming language better and preferred over another.

7.0 TUTOR-MARKED ASSIGNMENT
1. Categorize each level of programming language under the different generations of today.
2. Discuss the capabilities that distinguish one generation from another.
3. Explain the development of computer hardware in each generation of programming languages.
4. Describe the advantages and disadvantages of some programming environments you have used.
5. What is the name of the category of programming languages whose structure is dictated by the von Neumann computer architecture?
6. What two programming language deficiencies were discovered as a result of the research in software development in the 1970s?
7. What are some features of specific programming languages you know whose rationales are a mystery to you?
8. Was the first high-level programming language you learned implemented with a pure interpreter, a hybrid implementation system, or a compiler? (You may have to research this.)
8.0 REFERENCES/FURTHER READING
1. A History of Computer Programming Languages (brown.edu)
2. "A Brief History of Programming Languages." http://www.byte.com/art/9509/se7/artl9.htm. Cited March 25, 2000.
3. "A Short History of the Computer." http://www.softlord.com/comp/. Jeremy Myers. Cited March 25, 2000.
4. Bergin, Thomas J. and Richard G. Gibson, eds. History of Programming Languages-II. New York: ACM Press, 1996.
5. Christiansen, Tom and Nathan Torkington. Perlfaq1 Unix Manpage. Perl 5 Porters, 1997-1999.
6. "Java History." http://ils.unc.edu/blaze/java/javahist.html. Cited March 29, 2000.
7. "Programming Languages." McGraw-Hill Encyclopedia of Science and Technology. New York: McGraw-Hill, 1997.
8. Wexelblat, Richard L., ed. History of Programming Languages. New York: Academic Press, 1981.
9. A History of Computer Programming Languages (onlinecollegeplan.com)
10. IT Lecture 1 History of Computers.pdf (webs.com). Prepared by Miss N. Nembhard. Source: Roberts, Eric S. The Art and Science of C. Addison-Wesley Publishing Company. Reading: 1995.
UNIT 2 BRIEF SURVEY OF PROGRAMMING LANGUAGES
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Language Paradigms
3.1.1 Procedural Languages
3.1.2 Object-Oriented Languages
3.1.3 Functional Languages
3.1.4 Declarative/Non-Algorithmic Languages
3.1.5 Scripting Languages
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
When programs are developed to solve real-life problems like inventory management, payroll processing, student admissions, examination result processing, etc., they tend to be huge and complex.
The approach to analyzing such complex problems, planning for software development and controlling the development process is called programming methodology. New software development methodologies (e.g. object-oriented software development) led to new paradigms in programming and, by extension, to new programming languages. A programming paradigm is a pattern of problem-solving thought that underlies a particular genre of programs and languages; it is the concept to which the methodology of a programming language adheres.
2.0 OBJECTIVES
At the end of this unit, you should be able to:
• introduce several different paradigms of programming
• explain the history and evolution of different language paradigms
• gain experience with these paradigms by using example programming languages
• describe fundamental properties of programming languages
• better understand the significance of programming language implementation
• understand how to break problems of different paradigms into smaller units and place them into different approaches.
3.0 MAIN CONTENT
3.1 Language Paradigms
A paradigm is a model or world view. Paradigms are important because they define a programming language and how it works. A great way to think about a paradigm is as a set of ideas that a programming language can use to perform tasks in terms of machine code at a much higher level. These different approaches can be better in some cases and worse in others. A great rule of thumb when exploring paradigms is to understand what they are good at. While it is true that most modern programming languages are general-purpose and can do just about anything, it might be more difficult to develop a game, for example, in a functional language than in an object-oriented language. Many people classify languages into these main paradigms:
3.1.1 Procedural/Imperative Languages
These are mostly influenced by the von Neumann computer architecture.
The problem is broken down into procedures, or blocks of code that each perform one task. All procedures taken together form the whole program. This approach is suitable only for small programs that have a low level of complexity. Typical elements of such languages are assignment statements, data structures and type bindings, as well as control mechanisms; active procedures manipulate passive data objects. Example: for a calculator program that does addition, subtraction, multiplication, division, square root and comparison, each of these operations can be developed as a separate procedure. The key concepts of imperative programming languages are:
• Variables
• Commands
• Procedures
• Data abstraction.
In the main program, each procedure would be invoked on the basis of the user's choice. E.g. FORTRAN, Algol, Pascal, C/C++, C#, Java, Perl, JavaScript, Visual BASIC.NET.
3.1.2 Object-Oriented Languages
Object-oriented languages treat objects as representing data as well as procedures. Data structures and their appropriate manipulation processes are packed together to form a syntactical unit. Here the solution revolves around entities or objects that are part of the problem. The solution deals with how to store data related to the entities, how the entities behave and how they interact with each other to give a cohesive solution. Example: if we have to develop a payroll management system, we will have entities like employees, salary structure, leave rules, etc., around which the solution must be built. The key concepts of OOP are:
• Objects
• Classes
• Subclasses
• Inheritance
• Inclusion polymorphism.
Object-oriented programming has been widely accepted in the computing world because objects give a very natural way to model both real-world and cyber-world entities. Classes and class hierarchies lead to highly suitable as well as reusable units for constructing large programs.
Object-oriented programming fits well with object-oriented analysis and design and hence supports the development of large software systems. E.g. SIMULA 67, SMALLTALK, C++, Java, Python, C#, Perl, Lisp or EIFFEL.
3.1.3 Functional/Applicative Languages
These languages have no assignment statements. Their syntax is closely related to the formulation of mathematical functions; thus, functions are central to functional programming languages. Here the problem, or the desired solution, is broken down into functional units. Each unit performs its own task and is self-sufficient. These units are then stitched together to form the complete solution. Example: payroll processing can have functional units like employee data maintenance, basic salary calculation, gross salary calculation, leave processing, loan repayment processing, etc. The key concepts of functional programming are:
• Expressions, because their purpose is to compute new values from old ones. This is the very essence of functional programming.
• Functions, because functions abstract over expressions.
• Parametric polymorphism, because it enables a function to operate on values of a family of types. In practice, many useful functions are naturally polymorphic, and parametric polymorphism greatly magnifies the power and expressiveness of a functional language.
• Data abstraction, a key concept in the more modern functional languages such as ML and HASKELL. Data abstraction supports separation of concerns, which is essential for the design and implementation of large programs.
• Lazy evaluation, based on the simple notion that an expression whose value is never used need never be evaluated.
E.g. LISP, Scala, Haskell, Python, Clojure, Erlang. It may also include OO (Object Oriented) concepts.
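The paradigms described so far can be contrasted on one small task. The sketch below is our own illustration (not from the course text), written in Python because it supports several paradigms: the same computation, summing the squares of the even numbers in a list, is expressed in a procedural style, a functional style, and an object-oriented style.

```python
# Illustrative sketch: one task expressed in three paradigm styles.
# Task: sum the squares of the even numbers in a list.

def sum_even_squares_procedural(numbers):
    """Procedural style: step-by-step commands updating a variable."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n * n   # assignment statement mutates state
    return total

def sum_even_squares_functional(numbers):
    """Functional style: a single expression built from functions, no mutation."""
    return sum(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

class Accumulator:
    """Object-oriented style: data and its operations packed into one unit."""
    def __init__(self):
        self.total = 0
    def add(self, n):
        if n % 2 == 0:
            self.total += n * n

data = [1, 2, 3, 4]
acc = Accumulator()
for n in data:
    acc.add(n)

print(sum_even_squares_procedural(data))  # 20
print(sum_even_squares_functional(data))  # 20
print(acc.total)                          # 20
```

All three produce the same result; what differs is how the problem is decomposed, which is exactly the distinction the paradigms above describe.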
3.1.4 Declarative/Logic/Rule-Based (Non-Algorithmic) Languages
Facts and rules (the logic) are used to represent information (or knowledge), and a logical inference process is used to produce results. In contrast, control structures are not explicitly defined in a program; they are part of the programming language (the inference mechanism). Here the problem is broken down into logical units rather than functional units. Example: in a school management system, users have well-defined roles like class teacher, subject teacher, lab assistant, coordinator, academic in-charge, etc., so the software can be divided into units depending on user roles. Each user can have a different interface, permissions, etc. E.g. PROLOG, PERL; this may also include OO concepts. The key concepts of logic programming are therefore:
• Assertions
• Horn clauses
• Relations
3.1.5 Scripting Languages
A scripting language or script language is a programming language for a runtime system that automates the execution of tasks that would otherwise be performed individually by a human operator. Scripting languages are usually interpreted at runtime rather than compiled. They are a popular family of programming languages that allow frequent tasks to be performed quickly. Early scripting languages were generally used for niche applications and as "glue languages" for combining existing systems. With the rise of the World Wide Web, a range of scripting languages emerged for use on web servers. Since scripting languages simplify the processing of text, they are ideally suited to the dynamic generation of HTML pages. A user can learn to code in a scripting language quickly, as not much knowledge of web technology is required. Scripting is highly efficient, with a limited number of data structures and variables to use, and helps in adding visualization interfaces and combinations in web pages. There are different libraries which are part of different scripting languages.
They help in creating new applications in web browsers and are different from normal programming languages. Examples are Node.js, JavaScript, Ruby, Python, Perl, bash, PHP, etc. The key concepts of scripting languages include:
• High-level string processing: all scripting languages provide very high-level support for string processing. The ubiquitous nature of textual data such as e-mail messages, database queries and results, and HTML documents necessitated high-level string processing.
• High-level graphical user interface support: high-level support for building graphical user interfaces (GUIs) is vital to ensure loose coupling between the GUI and the application code. This is imperative because a GUI can evolve rapidly as usability problems are exposed.
• Dynamic typing: many scripting languages are dynamically typed. Scripts must be able to pass data to and from subsystems written in different languages when used as glue, and scripts often process heterogeneous data, whether in forms, databases, spreadsheets, or web pages.
4.0 SELF-ASSESSMENT EXERCISE(S)
1. What was the first functional language?
2. Name three languages in each of the following categories: von Neumann, functional, object-oriented.
3. Name two logic languages.
4. Name two widely used concurrent languages.
5.0 CONCLUSION
This unit has surveyed the main programming language paradigms: procedural, object-oriented, functional, declarative and scripting. Each paradigm breaks a problem down in its own way, each has its own key concepts, and each is supported by its own family of languages.
6.0 SUMMARY
All programming paradigms have their benefits to both education and ability. Functional languages historically have been very notable in the world of scientific computing.
Of course, looking at a list of the most popular languages for scientific computing today, it is obvious that they are all multi-paradigm. Object-oriented languages also have their fair share of great applications. Software development, game development, and graphics programming are all great examples of where object-oriented programming is a great approach to take. The biggest lesson to take from all of this information is that the future of software and programming languages is multi-paradigm. It is unlikely that anyone will be creating a purely functional or purely object-oriented programming language anytime soon. This is not such a bad thing, as there are weaknesses and strengths to every programming approach, and a lot of true optimization is performing tests to see which methodology is more efficient or better than the other overall. This also reinforces the idea that everyone should know multiple languages from multiple paradigms. With paradigms merging through the power of generics, one never knows when one might run into a programming concept from an entirely different programming language.
7.0 TUTOR-MARKED ASSIGNMENT
1. What are the three fundamental features of an object-oriented programming language?
2. What language was the first to support object-oriented programming?
3. What distinguishes declarative languages from imperative languages?
4. What is generally considered the first high-level programming language?
5. Why aren't concurrent languages listed as a category?
8.0 REFERENCES/FURTHER READING
1. What Is A Programming Paradigm? An overview of programming paradigms, by Emmett Boudreau, Towards Data Science.
2. Learning Programming Methodologies: Absolute Beginners, Tutorials Point, Simply Easy Learning.
3. Michael L. Scott, Programming Language Pragmatics, Third Edition, Morgan Kaufmann Publishers, Elsevier.
4. Robert W. Sebesta. Concepts of Programming Languages. Addison Wesley, 2011.
5. Aho, A. V., J.
E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Boston: Addison-Wesley, 2007.
6. Questions-on-principle-of-programming-language.pdf (oureducation.in), https://blog/oureducation.in/wp-content/uploads/2013/Questions-on-principle -ofprogramming-language.pdf#
UNIT 3 THE EFFECTS OF SCALE ON PROGRAMMING METHODOLOGY
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Major Reasons for Studying Concepts of Programming Languages
3.2 Roles of Programming Languages
3.3 Language Evaluation Criteria
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
It is natural for students to wonder how they will benefit from the study of programming language concepts. After all, many other topics in computer science are worthy of serious study. The following is what we believe to be a compelling list of potential benefits of studying concepts of programming languages.
2.0 OBJECTIVES
At the end of this unit, you should be able to:
• increase your capacity to use different constructs and develop effective algorithms
• improve your use of existing programming languages and choose languages more intelligently
• learn new languages more easily and design your own language
• encounter fascinating ways of programming you might never have imagined before.
3.0 MAIN CONTENT
3.1 Reasons for Studying the Concepts of Programming Languages
The following is a compelling list of potential benefits of studying concepts of programming languages.
• Increased ability to express ideas/algorithms: in natural language, the depth at which people think is influenced by the expressive power of the language they use. In a programming language, the complexity of the algorithms that people implement is influenced by the set of constructs available in that language. The language in which developers work places limits on the kinds of control structures, data structures, and abstractions they can use, thus limiting the forms of algorithms they can construct. Awareness of a wider variety of programming language features can reduce such limitations in software development. Programmers can increase the range of their software development thought processes by learning new language constructs. In other words, the study of programming language concepts builds an appreciation for valuable language features and constructs and encourages programmers to use them, even when the language they are using does not directly support such features and constructs.
• Improved background for choosing appropriate languages: many programmers use the language with which they are most familiar, even though it is poorly suited for their new project. It is ideal to use the most appropriate language. If these programmers were familiar with a wider range of languages and language constructs, they would be better able to choose the language with the features that best address the problem. However, it is preferable to use a feature whose design has been integrated into a language than to use a simulation of that feature, which is often less elegant, more cumbersome, and less safe.
• Increased ability to learn new languages: for instance, knowing the concepts of object-oriented programming (OOP) makes learning Java significantly easier, just as knowing the grammar of one's native language makes it easier to learn another language.
Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far easier to see how these concepts are incorporated into the design of the language being learned. It is therefore essential that practicing programmers know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language descriptions and evaluations, as well as promotional literature for languages and compilers.
• Better understanding of the significance of implementation: this leads to an understanding of why languages are designed the way they are, and to an ability to use a language more intelligently, as it was designed to be used. We can become better programmers by understanding the choices among programming language constructs and the consequences of those choices. Certain kinds of program bugs can be found and fixed only by a programmer who knows some related implementation details. Such knowledge allows visualization of how a computer executes various language constructs and provides hints about the relative efficiency of alternative constructs that may be chosen for a program. For example, programmers who know little about the complexity of the implementation of subprogram calls often do not realize that a small subprogram that is frequently called can be a highly inefficient design choice.
• Better use of languages that are already known: many contemporary programming languages are large and complex. It is uncommon for a programmer to be familiar with and use all of the features of a language he or she uses. By studying the concepts of programming languages, programmers can learn about previously unknown and unused parts of the languages they already use and begin to use those features.
• The overall advancement of computing: the study of programming language concepts should inform the choice of languages, so that better languages would eventually squeeze out poorer ones.
3.2 Roles of Programming Languages
Programming languages evolve and eventually pass out of use. ALGOL from 1960 is no longer used; it was replaced by Pascal, which in turn was replaced by C++ and Java. In addition, the older languages still in use have undergone periodic revisions to reflect changing influences from other areas of computing. As a result, newer languages reflect a composite of the experience gained in the design and use of older languages. Having stated that, the roles of programming languages are:
• To improve computer capability: computers have evolved from the small, old and costly vacuum-tube machines of the 1950s to the supercomputers and microcomputers of today. At the same time, layers of operating system software have been inserted between the programming language and the underlying computer hardware. These factors have influenced both the structure and the cost of using the features of high-level languages.
• Improved applications: computer use has moved rapidly from the original concentration on military, scientific, business and industrial applications, where the cost could be justified, to computer games, artificial intelligence, robotics, machine learning and applications in virtually every area of human activity. The requirements of these new application areas influence the designs of new languages and the revision and extension of older ones.
• Improved programming methods: language designs have evolved to reflect our changing understanding of good methods for writing large and complex programs. They have also reflected the changing environment in which programming is done.
• Improved implementation methods: the development of better implementation methods has affected the choice of features in new languages.
• Standardization: the development of languages that can be implemented easily on a variety of computer systems has made it easy for programs to be transported from one computer to another. This has provided a strong conservative influence on the evolution of language designs.
3.3 Language Evaluation Criteria
• Expressivity: the ability of a language to clearly reflect the meaning intended by the algorithm designer (the programmer). An "expressive" language permits an utterance to be compactly stated and encourages the use of statement forms associated with structured programming (usually "while" loops and "if-then-else" statements).
• Well-definedness: the language's syntax and semantics are free of ambiguity, internally consistent and complete. The implementer of a well-defined language should have, within its definition, a complete specification of all the language's expressive forms and their meanings. The programmer, by the same virtue, should be able to predict exactly the behavior of each expression before it is actually executed.
• Data types and structures: the ability of a language to support a variety of data values (integers, reals, strings, pointers, etc.) and non-elementary collections of these.
• Readability: one of the most important criteria for judging a programming language is the ease with which programs can be read and understood. Maintenance was recognized as a major part of the software life cycle, particularly in terms of cost, and since ease of maintenance is determined in large part by the readability of programs, readability became an important measure of the quality of programs and programming languages.
• Overall simplicity: the overall simplicity of a programming language strongly affects its readability. A language with a large number of basic constructs is more difficult to learn than one with a smaller number.
• Modularity: modularity has two aspects: the language's support for sub-programming and the language's extensibility in the sense of allowing programmer-defined operators and data types. By sub-programming we mean the ability to define independent procedures and functions (subprograms) that communicate via parameters or global variables with the invoking program.
• Input-output facilities: in evaluating a language's input-output facilities we look at its support for sequential, indexed, and random-access files, as well as its support for database and information retrieval functions.
• Portability: a language which has portability is one which is implemented on a variety of computers; that is, its design is relatively machine-independent. Languages which are well defined tend to be more portable than others.
• Efficiency: an efficient language is one which permits fast compilation and execution on the machines where it is implemented. Traditionally, FORTRAN and COBOL have been relatively efficient languages in their respective application areas.
• Orthogonality: orthogonality in a programming language means that a relatively small set of primitive constructs can be combined in a relatively small number of ways to build the control and data structures of the language. Furthermore, every possible combination of primitives is legal and meaningful.
• Pedagogy: some languages have better pedagogy than others; that is, they are intrinsically easier to teach and to learn, they have better textbooks, they are implemented in better program development environments, and they are widely known and used by the best programmers in an application area.
• Generality: a language is useful in a wide range of programming applications. For instance, APL has been used in mathematical applications involving matrix algebra and in business applications as well.
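Several of these criteria, notably expressivity, readability and simplicity, can be seen in miniature by writing the same routine twice. The sketch below is our own illustrative Python example (the function names are invented for this illustration, not taken from the course text): a compact, expression-heavy version and a longer, more explicitly readable version of the same computation.

```python
# Illustrative sketch: the same task written two ways.
# Task: compute the average of the positive values in a list.

# Version 1: compact and expressive, but denser to read at a glance.
def avg_pos_v1(xs):
    ps = [x for x in xs if x > 0]
    return sum(ps) / len(ps) if ps else 0.0

# Version 2: longer, but every step is explicit, which aids
# readability and maintenance at the cost of more code.
def avg_pos_v2(values):
    positives = []
    for value in values:
        if value > 0:
            positives.append(value)
    if len(positives) == 0:
        return 0.0
    total = 0
    for value in positives:
        total = total + value
    return total / len(positives)

print(avg_pos_v1([3, -1, 5]))  # 4.0
print(avg_pos_v2([3, -1, 5]))  # 4.0
```

Neither version is simply "better": evaluating them means trading expressivity and compactness against readability, which is exactly the kind of judgement the criteria above are meant to support.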
The table below shows the language evaluation criteria and the characteristics that affect them.
4.0 SELF-ASSESSMENT EXERCISE(S)
1. Why is readability important to writability?
2. How is the cost of compilers for a given language related to the design of that language?
3. Itemize the reasons for studying programming language concepts.
4. Discuss four (4) of the reasons mentioned.
5.0 CONCLUSION
This unit has explained the different reasons for studying concepts of programming languages: it enables students to know languages more intimately, makes programming easier to learn, enhances problem-solving skills, helps in choosing the language appropriate for a particular project, and creates opportunities for invention and innovation, which eases the learning of new languages. Among the most important criteria for evaluating languages are readability, writability, reliability, and overall cost. These will be the basis on which we examine and judge the various language features discussed.
6.0 SUMMARY
Like any language, spoken or written, it is easier to learn earlier in life. Computer languages teach us logical skills in thinking, processing and communicating. Combined with creative vision, some of the most influential products were designed around a programming language. The best inventions are born where skills and creativity meet. After all, who better than the young generations of the world to imagine the future of our technology? One of the benefits of learning how to code at a young age is enhanced academic performance. Learning how this technology works will prepare children for a quickly advancing world in which computers and smartphones are utilized for almost every function of our daily lives. Evaluation results are likely to suggest that your program has strengths as well as limitations, and should not be a simple declaration of program success or failure.
Evidence that your program is not achieving all of its ambitious objectives can be hard to swallow, but it can also help you learn where best to put limited resources. A good evaluation is one that is likely to be replicable, meaning that the same evaluation could be conducted again and get the same results. The higher the quality of the evaluation design, its data collection methods and its data analysis, the more accurate its conclusions and the more confident others will be in its findings.
7.0 TUTOR-MARKED ASSIGNMENT
1. What makes a programming language successful?
2. Enumerate and explain the various types of errors that can occur during the execution of a computer program.
3. Explain an algorithm. What are some of its important features?
4. What do you understand by machine code?
5. What do you mean by the "beta version" of a computer program?
6. What are some features of specific programming languages you know whose rationales are a mystery to you?
7. What common programming language statement, in your opinion, is most detrimental to readability?
8. Explain the different aspects of the cost of a programming language.
9. In your opinion, what major features would a perfect programming language include?
10. Write an evaluation of some programming language you know, using the criteria described.
8.0 REFERENCES/FURTHER READING
1. Michael L. Scott. Programming Language Pragmatics. Morgan Kaufmann, 2006.
2. Rowe N.C. (1988) Artificial Intelligence through Prolog. Prentice-Hall.
3. Olusegun Folorunso, Organization of Programming Languages, The Department of Computer Science, University of Agriculture, Abeokuta.
4. Why Study Programming Languages?
(lmu.edu), Hipster Hacker, https://cs.Imu.edu/ray/notes/whystudyprogramminglanguages/
5. Reasons Why You Should Learn BASIC Programming | UoPeople, https://www.uoppeople.edu/blog/6-reasons-why-you-should-learn-basic-programming/
6. Programming Language Concepts Test Questions/Answers, Dino Cajic (dinocajic.blogspot.com), https://dinocajic.blogspot.com/2017/02/programminglanguage-concepts-test,html
7. 50+ Best Programming Interview Questions and Answers in 2021 (hackr.io), https://hackr.io/blog/programming-interview-questions
8. ide.geeksforgeeks.org, 5th Floor, A-118, Sector-136, Noida, Uttar Pradesh - 201305
9. Bernd Teufel, Organization of Programming Languages, Springer-Verlag/Wien, 1991
10. Sebesta: Concepts of Programming Languages, Pearson Education
11. Louden: Programming Languages: Principles and Practices, Cengage Learning
12. Tucker: Programming Languages: Principles and Paradigms, Tata McGraw-Hill
13. E. Horowitz: Programming Languages, 2nd Edition, Addison Wesley
14. Programming Language Evaluation Criteria Part 1: Readability | by ngwes | Medium
15. Language Evaluation Criteria (gordon.edu)
16. Language Evaluation Criteria PPL by Jayesh Umre
MODULE 2 LANGUAGE DESCRIPTION
Unit 1 Syntactic Structure
Unit 2 Language Semantics
UNIT 1 SYNTACTIC STRUCTURE
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Syntactic Structure
3.1.1 Language Recognizer
3.1.2 Language Generator
3.1.3 Syntactic Ambiguity
3.2 Abstract Syntax Tree
3.2.1 Parsing
3.2.2 Parse Trees
3.2.3 Parser
3.2.4 Types of Parser
3.3 Expression Notations
3.3.1 Overloaded Operators
3.3.2 Short-Circuit Evaluation
3.3.3 Categories of Expression
3.3.4 Operator Precedence
3.4 Lexical Syntax
3.5 Grammar for Expressions and Variant Grammars
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading
1.0 INTRODUCTION
The task of providing a concise yet understandable description of a programming language is difficult but essential to the language's success, as some languages have suffered the problem of having many slightly different dialects, a result of a simple but informal and imprecise definition. One of the problems in describing a language is the diversity of the people who must understand the description. Among these are initial evaluators, implementers, and users. Most new programming languages are subjected to a period of scrutiny by potential users, often people within the organization that employs the language's designer, before their designs are completed. These are the initial evaluators. The success of this feedback cycle depends heavily on the clarity of the description. Programming language implementers obviously must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect when executed. The difficulty of the implementers' job is, in part, determined by the completeness and precision of the language description. Finally, language users must be able to determine how to encode software solutions by referring to a language reference manual. Textbooks and courses enter into this process, but language manuals are usually the only authoritative printed information source about a language. The study of programming languages, like the study of natural languages, can be divided into examinations of syntax and semantics. The syntax of a programming language is the form of its expressions, statements, and program units, while its semantics is the meaning of those expressions, statements, and program units.
For example, the syntax of a Java while statement is:

while (boolean_expr) statement

The semantics of this statement form is that when the current value of the Boolean expression is true, the embedded statement is executed; control then implicitly returns to the Boolean expression to repeat the process. Otherwise, control continues after the while construct.

Although they are often separated for discussion purposes, syntax and semantics are closely related. In a well-designed programming language, semantics should follow directly from syntax; that is, the appearance of a statement should strongly suggest what the statement is meant to accomplish. Describing syntax is easier than describing semantics, partly because a concise and universally accepted notation is available for syntax description, but none has yet been developed for semantics.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• carry out syntactic analysis;
• learn a new grammar and the syntax of grammars;
• understand generative grammars;
• apply the rules of a grammar in order to define the logical meaning as well as the correctness of sentences.

3.0 MAIN CONTENT

3.1 Syntactic Analysis

A language, whether natural such as English or artificial such as Java, is a set of strings of characters from some alphabet. The strings of a language are called sentences or statements. The syntax rules of a language specify which strings of characters from the language's alphabet are in the language. English, for example, has a large and complex collection of rules for specifying the syntax of its sentences. By comparison, even the largest and most complex programming languages are syntactically very simple.

Syntax is the set of rules that define what the various combinations of symbols mean. This tells the computer how to read the code.
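As a concrete, runnable companion to the while description earlier in this section, here is a small Java sketch; the class name, method, and values are ours, chosen for illustration:

```java
// A minimal, runnable illustration of the Java while form:
// while (boolean_expr) statement
public class WhileDemo {
    // Sums the integers 1..n. The embedded statement executes only
    // while the Boolean expression holds; control then returns to the
    // test, and continues past the loop once the expression is false.
    public static int sumUpTo(int n) {
        int sum = 0;
        int i = 1;
        while (i <= n) {   // boolean_expr
            sum += i;      // embedded statement
            i++;
        }
        return sum;        // reached when the expression becomes false
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(5)); // prints 15
    }
}
```

The syntax is the visible form of the loop; the semantics is the repeated test-then-execute behaviour the comments describe.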
Syntax refers to a very specific set of words in a very specific order when we give the computer instructions. This order and this strict structure are what enable us to communicate effectively with a computer. Syntax is to code what grammar is to English or any other natural language. A big difference, though, is that computers are exacting in how we structure that grammar, or our syntax. This syntax is why we call programming "coding". Each programming language uses different words and a different structure for how we give it information to get the computer to follow our instructions.

Syntax analysis is a task performed by a compiler which examines whether the program has a proper associated derivation tree or not. The syntax of a programming language can be described using the following formal and informal techniques:

• Lexical syntax defines the rules for basic symbols: identifiers, literals, punctuators and operators.
• Concrete syntax specifies the real representation of programs with the help of lexical symbols, that is, its alphabet.
• Abstract syntax conveys only the vital program information.

The syntax of a programming language is used to signify the structure of programs without considering their meaning. It emphasizes the structure and layout of a program together with its appearance. It involves a collection of rules which validate the sequence of symbols and instructions used in a program. In general, languages can be formally defined in two distinct ways: by recognition and by generation.

3.1.1 Language Recognizer

The syntax analysis part of a compiler is a recognizer for the language the compiler translates. In this role, the recognizer need not test all possible strings of characters from some set to determine whether each is in the language. Rather, it need only determine whether given programs are in the language.
In effect then, the syntax analyzer determines whether the given programs are syntactically correct. The structure of syntax analyzers, also known as parsers, is discussed later in this unit. A language recognizer is like a filter, separating legal sentences from those that are incorrectly formed.

3.1.2 Language Generator

A language generator is a device that can be used to generate the sentences of a language. At first glance, a generator seems to be a device of limited usefulness as a language descriptor. However, people prefer certain forms of generators over recognizers because they can more easily read and understand them. By contrast, the syntax-checking portion of a compiler (a language recognizer) is not as useful a language description for a programmer because it can be used only in trial-and-error mode. For example, to determine the correct syntax of a particular statement using a compiler, the programmer can only submit a speculated version and note whether the compiler accepts it. On the other hand, it is often possible to determine whether the syntax of a particular statement is correct by comparing it with the structure of the generator. There is a close connection between formal generation and recognition devices for the same language; this connection led to the development of formal language theory.

3.1.3 Syntactic Ambiguity

Syntactic ambiguity is a property of sentences which may be reasonably interpreted in more than one way, or reasonably interpreted to mean more than one thing. Ambiguity may or may not involve one word having two parts of speech, or homonyms. Syntactic ambiguity arises not from the range of meanings of single words, but from the relationship between the words and clauses of a sentence, and the sentence structure implied thereby. When a reader can reasonably interpret the same sentence as having more than one possible structure, the text is equivocal and meets the definition of syntactic ambiguity.
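To make the recognizer-as-filter idea concrete, here is a minimal sketch of ours: a recognizer for a toy language whose only sentences are well-formed identifiers (a letter followed by letters or digits). The class name and the toy language are our own choices for illustration.

```java
// A toy language recognizer: it accepts exactly the strings that are
// in the language (well-formed identifiers) and rejects all others,
// acting as a filter over candidate sentences.
public class IdentifierRecognizer {
    public static boolean accepts(String s) {
        if (s.isEmpty() || !Character.isLetter(s.charAt(0))) {
            return false;              // reject: bad or missing first symbol
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                return false;          // reject: illegal character
            }
        }
        return true;                   // accept: string is in the language
    }

    public static void main(String[] args) {
        System.out.println(accepts("oldsum")); // true
        System.out.println(accepts("2fast"));  // false
    }
}
```

A generator for the same toy language would instead emit identifiers; the recognizer only answers yes or no for strings presented to it.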
3.2 Abstract Syntax Tree

3.2.1 Parsing

Parsing is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar. Parsing can also be used as a linguistic term, especially in reference to how phrases are divided up in garden-path sentences.

[Figure: overview of the parsing process]

3.2.2 Parse Trees

One of the most attractive features of grammars is that they naturally describe the hierarchical syntactic structure of the sentences of the languages they define. These hierarchical structures are called parse trees. For example, the parse tree in the diagram below shows the structure of a derived assignment statement. Every internal node of a parse tree is labeled with a nonterminal symbol; every leaf is labeled with a terminal symbol. Every subtree of a parse tree describes one instance of an abstraction in the sentence.

[Figure: a parse tree for the simple statement A = B * (A + C)]

3.2.3 Parser

In computing, a parser is one of the components of an interpreter or compiler which checks for correct syntax and builds a data structure: often some kind of parse tree, abstract syntax tree or other hierarchical structure implicit in the input tokens. The parser often uses a separate lexical analyzer to create tokens from the sequence of input characters. Parsers may be programmed by hand or may be (semi-)automatically generated (in some programming languages) by a tool.

3.2.4 Types of Parser

The task of the parser is essentially to determine if and how the input can be derived from the start symbol of the grammar.
This can be done in essentially two ways:

• Top-down parsing: Top-down parsing can be viewed as an attempt to find leftmost derivations of an input stream by searching for parse trees using a top-down expansion of the given formal grammar rules. Tokens are consumed from left to right. Inclusive choice is used to accommodate ambiguity by expanding all alternative right-hand sides of grammar rules. Examples include the recursive descent parser and the LL parser (Left-to-right, Leftmost derivation).
• Bottom-up parsing: A parser can start with the input and attempt to rewrite it to the start symbol. Intuitively, the parser attempts to locate the most basic elements, then the elements containing these, and so on. LR parsers are examples of bottom-up parsers. Another term used for this type of parser is shift-reduce parsing.

Parsing is the process of recognizing an utterance (a string in natural languages) by breaking it down to a set of symbols and analyzing each one against the grammar of the language. Most languages have the meanings of their utterances structured according to their syntax, a practice known as compositional semantics. As a result, the first step to describing the meaning of an utterance in language is to break it down part by part and look at its analyzed form (known as its parse tree in computer science, and as its deep structure in generative grammar), as discussed earlier.

3.3 Expression Notations

Expressions are the fundamental means of specifying computations in a programming language. An expression is a combination of operators, constants and variables. An expression may consist of one or more operands, and zero or more operators, to produce a value. Programming languages generally support a set of operators that are similar to operations in mathematics. A language may contain a fixed number of built-in operators (e.g.
+ - * = in C and C++), or it may allow the creation of programmer-defined operators (e.g. Haskell). Some programming languages restrict operator symbols to special characters like + or :=, while others also allow names like div.

[Figure: a sample expression]

3.3.1 Overloaded Operators

In some programming languages an operator may be ad hoc polymorphic, that is, have definitions for more than one kind of data (such as in Java, where the + operator is used both for the addition of numbers and for the concatenation of strings). Such an operator is said to be overloaded. In languages that support operator overloading by the programmer but have a limited set of operators, operator overloading is often used to define customized uses for operators.

3.3.2 Short-Circuit Evaluation

Short-circuit evaluation, also called minimal evaluation, denotes the semantics of some Boolean operators in some programming languages in which the second argument is only executed or evaluated if the first argument does not suffice to determine the value of the expression: when the first argument of the AND function evaluates to false, the overall value must be false; and when the first argument of the OR function evaluates to true, the overall value must be true. In some programming languages, like Lisp, the usual Boolean operators are short-circuit. In others, like Java and Ada, both short-circuit and standard Boolean operators are available.

3.3.3 Categories of Expressions

• Boolean Expression: an expression that returns a Boolean value, either true or false. This kind of expression can be used only in the IF-THEN-ELSE control structure and the parameter of the display condition command. A relational expression has two operands and one relational operator. The value of a relational expression is Boolean.
In programming languages that include a distinct Boolean data type in their type system, like Java, these operators return true or false, depending on whether the conditional relationship between the two operands holds or not.

• Numeric Expression: an expression that returns a number. This kind of expression can be used in numeric data fields. It can also be used in functions and commands that require numeric parameters.
• Character Expression: an expression that returns an alphanumeric string. This kind of expression can be used in string data fields (format type Alpha). It can also be used in functions and commands that require string parameters.
• Relational Expression: a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality (e.g., 5 = 5) and inequalities (e.g., 4 ≥ 3). An expression created using a relational operator forms what is known as a relational expression, or a condition. Relational operators are also used in technical literature instead of words. Relational operators are usually written in infix notation, if supported by the programming language, which means that they appear between their operands (the two expressions being related).

3.3.4 Operator Precedence

When several operations occur in an expression, each part is evaluated and resolved in a predetermined order called operator precedence. Parentheses can be used to override the order of precedence and force some parts of an expression to be evaluated before other parts. Operations within parentheses are always performed before those outside. Within parentheses, however, normal operator precedence is maintained. When expressions contain operators from more than one category, arithmetic operators are evaluated first, comparison operators are evaluated next, and logical operators are evaluated last.
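The layering just described (arithmetic first, then comparison, then logical) can be observed directly in a small Java sketch of ours; note that Java spells the logical operators &&, ||, ! rather than And, Or, Not:

```java
// Demonstrates precedence layering: arithmetic binds tighter than
// comparison, which binds tighter than logical operators.
public class PrecedenceDemo {
    public static boolean mixed() {
        // Parsed as ((2 + (3 * 4)) > 10) && (1 < 2):
        // arithmetic (3 * 4 = 12, 2 + 12 = 14), then comparison
        // (14 > 10, 1 < 2), then the logical &&.
        return 2 + 3 * 4 > 10 && 1 < 2;
    }

    public static int withParentheses() {
        // Parentheses override precedence: (2 + 3) * 4, not 2 + (3 * 4).
        return (2 + 3) * 4;
    }

    public static void main(String[] args) {
        System.out.println(mixed());            // true
        System.out.println(withParentheses());  // 20
    }
}
```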
Comparison operators all have equal precedence; that is, they are evaluated in the left-to-right order in which they appear. Arithmetic and logical operators are evaluated in the following order of precedence:

Arithmetic (highest precedence first):
  Exponentiation (^)
  Negation (-)
  Multiplication and division (*, /)
  Integer division (\)
  Modulus arithmetic (Mod)
  Addition and subtraction (+, -)
  String concatenation (&)

Comparison:
  Equality (=)
  Inequality (<>)
  Less than (<)
  Greater than (>)
  Less than or equal to (<=)
  Greater than or equal to (>=)
  Is

Logical (highest precedence first):
  Not
  And
  Or
  Xor
  Eqv
  Imp

When multiplication and division occur together in an expression, each operation is evaluated as it occurs from left to right. Likewise, when addition and subtraction occur together in an expression, each operation is evaluated in order of appearance from left to right. The string concatenation operator (&) is not an arithmetic operator, but in precedence it falls after all arithmetic operators and before all comparison operators. The Is operator is an object reference comparison operator. It does not compare objects or their values; it checks only whether two object references refer to the same object.

3.3.5 Operators in Sample Programming Languages like C/C++ and Java

Operators are the foundation of any programming language. Thus, the functionality of the C, C++ and Java programming languages is incomplete without the use of operators. We can define operators as symbols that help us to perform specific mathematical and logical computations on operands. In other words, we can say that an operator operates on operands. For example, consider the statement below:

c = a + b;

Here, '+' is the operator, known as the addition operator, and 'a' and 'b' are operands. The addition operator tells the compiler to add both of the operands 'a' and 'b'. C/C++ has many built-in operator types, and they can be classified as:
• Arithmetic Operators: These are the operators used to perform arithmetic/mathematical operations on operands. Examples: (+, -, *, /, %, ++, --). Arithmetic operators are of two types:
  1. Unary Operators: operators that operate on a single operand. For example: (++, --)
  2. Binary Operators: operators that operate on two operands. For example: (+, -, *, /)
• Relational Operators: Relational operators are used for comparison of the values of two operands. For example: checking whether one operand is equal to the other operand or not, whether an operand is greater than the other operand or not, etc. Some of the relational operators are: (==, >, <, >=, <=).
• Logical Operators: Logical operators are used to combine two or more conditions/constraints, or to complement the evaluation of the original condition in consideration. The result of the operation of a logical operator is a Boolean value, either true or false.
• Bitwise Operators: Bitwise operators are used to perform bit-level operations on the operands. The operands are first converted to bit level, and then the calculation is performed. Mathematical operations such as addition, subtraction and multiplication can be performed at bit level for faster processing.
• Assignment Operators: Assignment operators are used to assign a value to a variable. The left-side operand of the assignment operator is a variable, and the right-side operand is a value. The value on the right side must be of the same data type as the variable on the left side; otherwise the compiler will raise an error. Different types of assignment operators are shown below:

"=": This is the simplest assignment operator. It assigns the value on the right to the variable on the left. For example:

a = 10;
b = 20;
ch = 'y';

"+=": This operator is a combination of the '+' and '=' operators.
It first adds the value on the right to the current value of the variable on the left, and then assigns the result to the variable on the left.
Example: (a += b) can be written as (a = a + b)
If the value initially stored in a is 5, then (a += 6) gives 11.

"-=": This operator is a combination of the '-' and '=' operators. It first subtracts the value on the right from the current value of the variable on the left, and then assigns the result to the variable on the left.
Example: (a -= b) can be written as (a = a - b)
If the value initially stored in a is 8, then (a -= 6) gives 2.

"*=": This operator is a combination of the '*' and '=' operators. It first multiplies the current value of the variable on the left by the value on the right, and then assigns the result to the variable on the left.
Example: (a *= b) can be written as (a = a * b)
If the value initially stored in a is 5, then (a *= 6) gives 30.

"/=": This operator is a combination of the '/' and '=' operators. It first divides the current value of the variable on the left by the value on the right, and then assigns the result to the variable on the left.
Example: (a /= b) can be written as (a = a / b)
If the value initially stored in a is 6, then (a /= 2) gives 3.

Examples on some mathematical operators

Given that P = 23 and Q = 5, find: (i) R = P % Q (ii) R1 = P / Q

Solution:
(i) R = P % Q
Note that the symbol % stands for modulus (remainder). Substituting the values of P and Q into the equation, R = P % Q means the remainder when P is divided by Q:
R = 23 % 5
R = 3, since 23 divided by 5 produces 4 remainder 3.

(ii) R1 = P / Q (integer division)
R1 = 23 / 5
R1 = 4

3.4 Lexical Syntax

A lexical analyzer is essentially a pattern matcher. A pattern matcher attempts to find a substring of a given string of characters that matches a given character pattern. Pattern matching is a traditional part of computing.
One of the earliest uses of pattern matching was with text editors, such as the ed line editor, which was introduced in an early version of UNIX. Since then, pattern matching has found its way into some programming languages, for example Perl and JavaScript. It is also available through the standard class libraries of Java, C++, and C#.

A lexical analyzer serves as the front end of a syntax analyzer. Technically, lexical analysis is a part of syntax analysis. A lexical analyzer performs syntax analysis at the lowest level of program structure. There are three reasons why lexical analysis is separated from syntax analysis:

• Simplicity: Techniques for lexical analysis are less complex than those required for syntax analysis, so the lexical-analysis process can be simpler if it is separate. Also, removing the low-level details of lexical analysis from the syntax analyzer makes the syntax analyzer both smaller and less complex.
• Efficiency: Although it pays to optimize the lexical analyzer, because lexical analysis requires a significant portion of total compilation time, it is not fruitful to optimize the syntax analyzer. Separation facilitates this selective optimization.
• Portability: Because the lexical analyzer reads input program files and often includes buffering of that input, it is somewhat platform dependent. However, the syntax analyzer can be platform independent. It is always good to isolate the machine-dependent parts of any software system.

An input program appears to a compiler as a single string of characters. The lexical analyzer collects characters into logical groupings and assigns internal codes to the groupings according to their structure. These logical groupings are named lexemes, and the internal codes for categories of these groupings are named tokens. Lexemes are recognized by matching the input character string against character string patterns.
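Pattern matching of this kind is easy to demonstrate with Java's standard java.util.regex library. The identifier pattern below is a simplified one chosen by us for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Recognizes lexemes by matching the input string against a character
// pattern, here a simplified identifier pattern: a letter followed by
// letters or digits.
public class LexemeMatcher {
    private static final Pattern IDENT = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    public static List<String> identifiers(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = IDENT.matcher(input);
        while (m.find()) {
            found.add(m.group()); // each match is one identifier lexeme
        }
        return found;
    }

    public static void main(String[] args) {
        // Digits-only substrings such as 100 do not match the pattern.
        System.out.println(identifiers("result = oldsum - value / 100;"));
    }
}
```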
Although tokens are usually represented as integer values, for the sake of readability of lexical and syntax analyzers, they are often referenced through named constants. Consider the following example of an assignment statement:

result = oldsum - value / 100;

Following are the tokens and lexemes of this statement:

Token       Lexeme
IDENT       result
ASSIGN_OP   =
IDENT       oldsum
SUB_OP      -
IDENT       value
DIV_OP      /
INT_LIT     100
SEMICOLON   ;

Lexical analyzers extract lexemes from a given input string and produce the corresponding tokens. In the early days of compilers, lexical analyzers often processed an entire source program file and produced a file of tokens and lexemes. Now, however, most lexical analyzers are subprograms that locate the next lexeme in the input, determine its associated token code, and return them to the caller, which is the syntax analyzer. So, each call to the lexical analyzer returns a single lexeme and its token. The only view of the input program seen by the syntax analyzer is the output of the lexical analyzer, one token at a time.

The lexical-analysis process includes skipping comments and white space outside lexemes, as they are not relevant to the meaning of the program. Also, the lexical analyzer inserts lexemes for user-defined names into the symbol table, which is used by later phases of the compiler. Finally, lexical analyzers detect syntactic errors in tokens, such as ill-formed floating-point literals, and report such errors to the user.

Lexical units are considered the building blocks of programming languages. The lexical structure of all programming languages is similar and normally includes the following kinds of units:

• Identifiers: names that can be chosen by programmers to represent objects like variables, labels, procedures and functions. Most programming languages require that an identifier start with an alphabetic letter, optionally followed by letters, digits and some special characters.
• Keywords: names reserved by the language designer, used to form the syntactic structure of the language.
• Operators: symbols used to represent operations. All general-purpose programming languages should provide certain minimum operators, such as mathematical operators like +, -, *, /, relational operators like <, ≤, ==, >, ≥, and logic operators like AND, OR, NOT, etc.
• Separators: symbols used to separate lexical or syntactic units of the language. Space, comma, colon, semicolon and parentheses are used as separators.
• Literals: values that can be assigned to variables of different types. For example, integer-type literals are integer numbers, character-type literals are any character from the character set of the language, and string-type literals are any string of characters.
• Comments: any explanatory text embedded in the program. Comments start with a specific keyword or separator. When the compiler translates a program into machine code, all comments are ignored.

There are three approaches to building a lexical analyzer:

• Write a formal description of the token patterns of the language using a descriptive language related to regular expressions. These descriptions are used as input to a software tool that automatically generates a lexical analyzer. Many such tools are available; the oldest of them, named lex, is commonly included as part of UNIX systems.
• Design a state transition diagram that describes the token patterns of the language and write a program that implements the diagram.
• Design a state transition diagram that describes the token patterns of the language and hand-construct a table-driven implementation of the state diagram.

A state transition diagram, or just state diagram, is a directed graph. The nodes of a state diagram are labeled with state names. The arcs are labeled with the input characters that cause the transitions among the states.
An arc may also include actions the lexical analyzer must perform when the transition is taken. State diagrams of the form used for lexical analyzers are representations of a class of mathematical machines called finite automata. Finite automata can be designed to recognize members of a class of languages called regular languages. Regular grammars are generative devices for regular languages. The tokens of a programming language form a regular language, and a lexical analyzer is a finite automaton.

3.5 Grammar for Expressions and Variant Grammars

The formal language-generation mechanisms, usually called grammars, are commonly used to describe the syntax of programming languages. A formal grammar is a set of rules of a specific kind for forming strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax. A grammar describes only the form of the strings, and not their meaning or what can be done with them in any context. A formal grammar is a set of rules for rewriting strings, along with a "start symbol" from which rewriting must start. Therefore, a grammar is usually thought of as a language generator. However, it can also sometimes be used as the basis for a recognizer.

The grammar of a language establishes the alphabet and lexicon. Then, by means of a syntax, it defines those sequences of symbols corresponding to well-formed phrases and sentences. A grammar mainly consists of a set of rules for transforming strings. To generate a string in the language, one begins with a string consisting of a single start symbol. The production rules are then applied in any order, until a string that contains neither the start symbol nor designated nonterminal symbols is produced. The language formed by the grammar consists of all distinct strings that can be generated in this manner.
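The generation process just described can be sketched mechanically. The toy grammar below, with rules S → aSb and S → ba, is our own choice of example; the sketch derives a string by applying rule 1 a fixed number of times and then rule 2 once:

```java
// Uses a grammar as a generator: starting from the start symbol S,
// apply rule 1 (S -> aSb) n times, then rule 2 (S -> ba) once.
// The result is the terminal string a^n b a b^n.
public class GrammarGenerator {
    public static String derive(int n) {
        String sentential = "S";                         // start symbol
        for (int i = 0; i < n; i++) {
            sentential = sentential.replace("S", "aSb"); // rule 1
        }
        // Rule 2 removes the last nonterminal; derivation is complete.
        return sentential.replace("S", "ba");
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            System.out.println(derive(n)); // ba, abab, aababb, aaababbb
        }
    }
}
```

Each choice of n yields a distinct sentence of the language, exactly as the text describes.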
Any particular sequence of production rules applied to the start symbol yields a distinct string in the language. If there are multiple ways of generating the same single string, the grammar is said to be ambiguous. A grammar is usually thought of as a language generator. However, it can also sometimes be used as the basis for a recognizer: a function in computing that determines whether a given string belongs to the language or is grammatically incorrect. To describe such recognizers, formal language theory uses separate formalisms, known as automata theory. One of the interesting results of automata theory is that it is not possible to design a recognizer for certain formal languages.

3.5.1 The Syntax of Grammars

In the classic formalization of generative grammars, a grammar G consists of the following components:

• a finite set N of non-terminal symbols, none of which appear in strings formed from G;
• a finite set Σ of terminal symbols that is disjoint from N;
• a finite set P of production rules, each rule of the form
  (Σ ∪ N)* N (Σ ∪ N)* → (Σ ∪ N)*
  where * is the Kleene star operator and ∪ denotes set union. That is, each production rule maps from one string of symbols to another, where the first string (the "head") contains at least one non-terminal symbol. In the case that the second string (the "body") consists solely of the empty string, i.e. it contains no symbols at all, it may be denoted with a special notation (often λ, e or ε) in order to avoid confusion;
• a distinguished symbol S ∈ N, the start symbol.

A grammar is formally defined as the tuple (N, Σ, P, S). Such a formal grammar is often called a rewriting system or a phrase structure grammar in the literature. A grammar can be formally written as a four-tuple, i.e.
G = (N, Σ, S, P), where N is the set of variables or non-terminal symbols, Σ is the set of terminal symbols, known as the alphabet, S is the start symbol, and P is the set of production rules. For instance: N = {S, A}, Σ = {λ, a}.

Example 1: Assuming the alphabet consists of a and b, the start symbol is S, and we have the following production rules:

1. S → aSb
2. S → ba

then we start with S, and can choose a rule to apply to it. If we choose rule 1, we obtain the string aSb. If we choose rule 1 again, we replace S with aSb and obtain the string aaSbb. If we now choose rule 2, we replace S with ba and obtain the string aababb, and are done. We can write this series of choices more briefly, using symbols:

S → aSb → aaSbb → aababb

The language of the grammar is then the infinite set {a^n b a b^n | n ≥ 0} = {ba, abab, aababb, aaababbb, ...}, where a^k is a repeated k times (and n in particular represents the number of times production rule 1 has been applied).

Example 2: Consider the grammar G where N = {S, B}, Σ = {a, b, c}, S is the start symbol, and P consists of the following production rules:

1. S → aBSc
2. S → abc
3. Ba → aB
4. Bb → bb

This grammar defines the language L(G) = {a^n b^n c^n | n ≥ 1}, where a^n denotes a string of n consecutive a's. Thus the language is the set of strings that consist of one or more a's, followed by the same number of b's and then by the same number of c's.

Solution:
S →2 abc
S →1 aBSc →2 aBabcc →3 aaBbcc →4 aabbcc
S →1 aBSc →1 aBaBScc →2 aBaBabccc →3 aBaaBbccc →3 aaBaBbccc →3 aaaBBbccc →4 aaaBbbccc →4 aaabbbccc

Example 3: Consider G = ({S, A, B}, {a, b}, S, P), with N = {S, A, B}. Produce 2 sample strings from the grammar, and show that a²b⁴ is contained in the language.

P:
1. S → AB
2. A → Aa
3. B → Bb
4. A → a
5. B → b

Solution:
S →1 AB →4 aB →5 ab
S →1 AB →2 AaB →3 AaBb →3 AaBbb →3 AaBbbb →4 aaBbbb →5 aabbbb = a²b⁴

Example 4: Analyze the structure of the grammar G = {V, T, S, P}, where V is used in place of N and T is used in place of Σ, and produce 4 sample strings from the grammar.
G = (V, T, S, P), V = {A, S, B}, T = {λ, a, b}

P:
1. S → ASB
2. A → a
3. B → b
4. S → λ

Solution:
S →4 λ
S →1 ASB →2 aSB →3 aSb →4 ab
S →1 ASB →4 AB →2 aB →3 ab
S →1 ASB →2 aSB →4 aB →3 ab

3.5.2 Noam Chomsky and John Backus

In the middle to late 1950s, two men, Noam Chomsky and John Backus, developed the same syntax description formalism, which subsequently became the most widely used method for describing programming language syntax. Chomsky described four classes of generative devices, or grammars, that define four classes of languages. Two of these grammar classes, named context-free and regular, turned out to be useful for describing the syntax of programming languages. The forms of the tokens of programming languages can be described by regular grammars. The syntax of whole programming languages, with minor exceptions, can be described by context-free grammars. It is remarkable that BNF is nearly identical to Chomsky's generative devices for context-free languages, called context-free grammars; context-free grammars are often referred to simply as grammars. Furthermore, the terms BNF and grammar are used interchangeably.

The difference between these grammar types is that they have increasingly strict production rules and can express fewer formal languages. Two important types are context-free grammars (Type 2) and regular grammars (Type 3). The languages that can be described with such grammars are called context-free languages and regular languages, respectively. Although much less powerful than unrestricted grammars (Type 0), which can in fact express any language that can be accepted by a Turing machine, these two restricted types of grammars are most often used because parsers for them can be efficiently implemented.

3.5.3 Types of Grammars and Automata

The following table summarizes each of Chomsky's four types of grammars, the class of language it generates, the type of automaton that recognizes it, and the form its rules must have.
• Type-0 grammars (unrestricted grammars) include all formal grammars.
• Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages.
• Type-2 grammars (context-free grammars) generate the context-free languages.
• Type-3 grammars (regular grammars) generate the regular languages.

Grammar Type | Grammar | Language Accepted | Automaton | Example
Type 0 | Unrestricted grammar | Recursively enumerable language (needs counters, registers, selection and memory to recognize the grammar) | Turing machine | (aⁿcᵐbⁿ)ᵏ
Type 1 | Context-sensitive grammar | Context-sensitive language (needs counters, registers and selection to recognize the grammar) | Linearly bounded automaton | aⁿcᵐbⁿ
Type 2 | Context-free grammar | Context-free language (needs a counter and register to recognize the grammar) | Push-down automaton | aⁿbⁿ
Type 3 | Regular grammar | Regular language (needs only counters to recognize the grammar) | Finite-state automaton | (ab)ⁿ

4.0 SELF-ASSESSMENT EXERCISE(S)

1. Discuss the general overview of the programming process.
2. Explain the two types of parser.
3. Obtain a grammar that generates the language L(G) = {aⁿbⁿ⁺¹ : n ≥ 0}.
4. Using the parse tree figure, show a parse tree and a leftmost derivation for each of the following statements:
   a. A = A * (B + (C * A))
   b. B = C * (A * C + B)
   c. A = A * (B + (C))

5.0 CONCLUSION

Three different approaches to implementing programming languages are compilation, pure interpretation, and hybrid implementation. The compilation approach uses a program called a compiler, which translates programs written in a high-level programming language into machine code. Pure interpretation systems perform no translation; rather, programs are interpreted in their original form by a software interpreter. Hybrid implementation systems translate programs written in high-level languages into intermediate forms, which are interpreted.
All three of the implementation approaches just discussed use both lexical and syntax analyzers. A formal grammar is a set of rules of a specific kind for forming strings in a formal language. It has four components that form its syntax, and a set of operations that can be performed on it, which form its semantics. An attribute grammar is a descriptive formalism that can describe both the syntax and static semantics of a language. Finally, we discussed different types of grammars and derived some grammars in the exercise questions.

6.0 SUMMARY

Syntax analysis is a common part of language implementation, regardless of the implementation approach used. Syntax analysis is normally based on a formal syntax description of the language being implemented. A context-free grammar, which is also called BNF, is the most common approach for describing syntax. Syntax analyzers have two goals: to detect syntax errors in a given program and to produce a parse tree, or possibly only the information required to build such a tree, for a given program. Syntax analyzers are either top-down, meaning they construct leftmost derivations and a parse tree in top-down order, or bottom-up, in which case they construct the reverse of a rightmost derivation and a parse tree in bottom-up order. Parsers that work for all unambiguous grammars have complexity O(n³). However, parsers used for implementing syntax analyzers for programming languages work on subclasses of unambiguous grammars and have complexity O(n). A lexical analyzer is a pattern matcher that isolates the small-scale parts of a program, which are called lexemes. Lexemes occur in categories, such as integer literals and names. These categories are called tokens. Each token is assigned a numeric code, which along with the lexeme is what the lexical analyzer produces.
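As an illustrative sketch of this idea (my own code, not from the course text, with made-up numeric token codes: 10 for integer literals, 11 for names, 21 for =, 25 for +), a minimal pattern-matching lexical analyzer that pairs each lexeme with its token code might look like this in Java:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a lexical analyzer: it isolates lexemes and tags
// each with a numeric token code. The codes are invented for this example.
public class MiniLexer {

    static List<String> lex(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }   // skip blanks
            int start = i;
            if (Character.isDigit(c)) {                          // integer literal
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add("10:" + src.substring(start, i));
            } else if (Character.isLetter(c)) {                  // name
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add("11:" + src.substring(start, i));
            } else if (c == '=') { out.add("21:="); i++; }       // assignment
            else if (c == '+')   { out.add("25:+"); i++; }       // add operator
            else throw new IllegalArgumentException("lexical error at: " + c);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lex("sum = 47 + x"));
        // [11:sum, 21:=, 10:47, 25:+, 11:x]
    }
}
```

Each output entry is a (token code, lexeme) pair, which is exactly what the summary above says a lexical analyzer produces.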
There are three distinct approaches to constructing a lexical analyzer: using a software tool to generate a table for a table-driven analyzer, building such a table by hand, and writing code to implement a state diagram description of the tokens of the language being implemented. The state diagram for tokens can be reasonably small if character classes are used for transitions, rather than having transitions for every possible character from every state node. Also, the state diagram can be simplified by using a table lookup to recognize reserved words. Backus-Naur Form and context-free grammars are equivalent metalanguages that are well suited for the task of describing the syntax of programming languages. Not only are they concise descriptive tools, but also the parse trees that can be associated with their generative actions give graphical evidence of the underlying syntactic structures. Furthermore, they are naturally related to recognition devices for the languages they generate, which leads to the relatively easy construction of syntax analyzers for compilers for these languages. An attribute grammar is a descriptive formalism that can describe both the syntax and static semantics of a language. Attribute grammars are extensions to context-free grammars. An attribute grammar consists of a grammar, a set of attributes, a set of attribute computation functions, and a set of predicates, which together describe static semantics rules.
7.0 Tutor-Marked Assignment

1. Describe the operation of a general language generator.
2. Describe the operation of a general language recognizer.
3. What is the primary use of attribute grammars?
4. Which of the following sentences are in the language generated by this grammar: baab, bbbab, bbaaaaa, bbaab?
5. List the four classes of grammar.
6. What are three reasons why syntax analyzers are based on grammars?
7. Explain the three reasons why lexical analysis is separated from syntax analysis.
8. What are the primary tasks of a lexical analyzer?
9. Describe briefly the three approaches to building a lexical analyzer.
10. What is a state transition diagram?

8.0 References/Further Reading

1. Learning Programming Methodologies: Absolute Beginners, Tutorials Point, Simply Easy Learning.
2. Michael L. Scott, Programming Language Pragmatics, Third Edition, Morgan Kaufmann Publishers, Elsevier.
3. Robert W. Sebesta, Concepts of Programming Languages. Addison-Wesley, 2011.
4. Aho, A. V., J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms. Boston: Addison-Wesley, 2007.
5. Composing Programs by John DeNero, based on the textbook Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Jay Sussman, licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
6. Introduction to Linguistic Theory: Adam Szczegielniak, Syntax: The Sentence Pattern of Languages, copyright in part: Cengage Learning, https://scholar.harvard.edu/files/adam/files/syntax.ppt.pdf
7. An Introduction to Syntactic Analysis and Theory, Hilda Koopman, Dominique Sportiche and Edward Stabler: https://linguistic.ucla.edu/people/stabler/isat.pdf
8. Tech Differences online: Specifying Syntax.
9. UTD: Describing Syntax and Semantics: Dr. Chris Davis, cid021000@utdallas.edu
UNIT 2 LANGUAGE SEMANTICS

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 Informal Semantics
        3.1.1 Operational Semantics
        3.1.2 Denotational Semantics
        3.1.3 Axiomatic or Logical Semantics
    3.2 Types of Semantic Analysis
        3.2.1 Static Semantics
        3.2.2 Dynamic Semantics
    3.3 Semantic Analyzer
    3.4 Semantic Errors
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination. Semantic analysis delves deeper, checking whether those tokens form a sensible set of instructions in the programming language. Whereas any noun phrase followed by some verb phrase makes a syntactically correct English sentence, a semantically correct one has subject-verb agreement and proper use of gender, and its components go together to express an idea that makes sense. For a program to be semantically valid, all variables, functions, classes, etc. must be properly defined; expressions and variables must be used in ways that respect the type system; access control must be respected; and so forth. Semantic analysis is the front end's penultimate phase and the compiler's last chance to weed out incorrect programs. We need to ensure the program is sound enough to carry on to code generation.
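To make the distinction concrete, here is a small illustrative Java sketch (my own example, not from the course text). Every statement in it is syntactically well formed, but the commented-out lines would be rejected by the compiler's semantic checks; only the type-correct code remains active:

```java
// Illustrative sketch: syntactically valid statements that fail (or pass)
// semantic analysis. The rejected lines are commented out so the file
// still compiles; the reasons are typical javac-style complaints.
public class SemanticDemo {

    static int wellTyped() {
        int count = 5;              // properly declared and typed
        // int bad = "five";        // rejected: type mismatch (String to int)
        // total = count + 1;       // rejected: variable 'total' is undeclared
        // boolean b = count;       // rejected: int is not boolean in Java
        return count + 1;           // uses count in a type-correct way
    }

    public static void main(String[] args) {
        System.out.println("count + 1 = " + wellTyped());
    }
}
```

Each commented line parses into a perfectly good syntax tree; it is the semantic analyzer, consulting declarations and the type system, that rules it out.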
2.0 OBJECTIVES

At the end of this unit, you should be able to:

• Be familiar with rule-based presentations of the operational semantics and type systems for some simple programming languages
• Be able to prove properties of an operational semantics using various forms of induction
• Understand the fundamental semantic issues of variables, and the nature of names and special words in programming languages
• Know that semantic analysis judges whether the syntax structure constructed in the source program derives any meaning or not
• Be familiar with some operationally-based notions of semantic equivalence of program phrases and their basic properties

3.0 MAIN CONTENT

3.1 Informal Semantics

The term semantics in a programming language refers to the relationship between the syntax and the model of computation. It emphasizes the interpretation of a program, so that the programmer can understand the program easily and predict the outcome of its execution. An approach known as syntax-directed semantics is used to map syntactical constructs to the computational model with the help of a function. The task of semantic analysis is to accept the declarations and statements of a semantically correct program. There are the following styles of semantics.

3.1.1 Operational Semantics

Operational semantics determines the meaning of a program in terms of the computation steps that are necessary for an idealized execution. Some definitions use structural operational semantics, in which the intermediate states are described on the basis of the language itself; others use an abstract machine, making use of more ad-hoc mathematical constructions. By an operational semantics of a programming language, one usually understands a set of rules by which its expressions, statements, programs, etc., are evaluated or executed.
These rules tell how a possible implementation of the programming language should work, and it is not difficult to build an interpreter for a language in any programming language simply by following its operational semantics and translating it into the implementation language.

3.1.2 Denotational Semantics

Denotational semantics determines the meaning of a program as an element of some abstract mathematical structure, e.g. by regarding programming constructs as language-specific mathematical functions. The denotational approach is to define programs in terms of (recursive) functions, i.e. a program denotes a function. Thus, the denotational semantics of a certain programming language describes the mapping from programs written in that language to the functions they denote. The function described by a program is composed of other functions representing individual program constructs.

3.1.3 Axiomatic or Logical Semantics

Axiomatic semantics defines the meaning of a program indirectly, by providing axioms of logic that characterize the program; compare with specification and verification. Axiomatic semantics is also based on mathematical logic. The approach provides rules of inference (the axioms) which show the change of data after the execution of a certain sentence of the language. The typical situation is that the execution (i.e. the transformation of data) takes place provided that the data satisfies certain conditions, and the inference rules are used to deduce results that also satisfy some conditions.

Example: {P} S {Q}

where S is a program (or a part of a program) written in a certain programming language, and P and Q are logical expressions (assertions, using standard mathematical notation together with logical operators) describing conditions on the program variables used in S.
The meaning is: "If assertion P is true when control is at the beginning of program S, then assertion Q will be true when control is at the end of program S." The language of the assertions is predicate logic; the logical expressions P and Q are called predicates. Considering the transformation of predicates leads to axiomatic semantics. A simple example can be given in the following way, where we assume a and b to be integers:

{ b > 10 } a := 10 + 4*b { a > 20 }

meaning that if b > 10 before executing the assignment a := 10 + 4*b, then the value of a will be greater than 20 after the assignment is executed.

3.2 Types of Semantic Analysis

Semantic analysis involves the following types: static and dynamic semantics.

3.2.1 Static Semantics

The static semantics defines restrictions on the structure of valid texts that are hard or impossible to express in standard syntactic formalisms. For compiled languages, static semantics essentially includes those semantic rules that can be checked at compile time. Examples include checking that every identifier is declared before it is used (in languages that require such declarations) or that the labels on the arms of a case statement are distinct. Many important restrictions of this type, like checking that identifiers are used in the appropriate context (e.g. not adding an integer to a function name), or that subroutine calls have the appropriate number and type of arguments, can be enforced by defining them as rules in a logic called a type system. Other forms of static analysis, like data flow analysis, may also be part of static semantics. Newer programming languages like Java and C# have definite assignment analysis, a form of data flow analysis, as part of their static semantics.

3.2.2 Dynamic Semantics

Once data has been specified, the machine must be instructed to perform operations on the data.
For example, the semantics may define the strategy by which expressions are evaluated to values, or the manner in which control structures conditionally execute statements. The dynamic semantics (also known as execution semantics) of a language defines how and when the various constructs of a language should produce a program behavior. There are many ways of defining execution semantics. Natural language is often used to specify the execution semantics of languages commonly used in practice. A significant amount of academic research has gone into formal semantics of programming languages, which allow execution semantics to be specified in a formal manner. Results from this field of research have seen limited application to programming language design and implementation outside academia.

3.3 Semantic Analyzer

The semantic analyzer uses the syntax tree and the symbol table to check whether the given program is semantically consistent with the language definition. It gathers type information and stores it in either the syntax tree or the symbol table. This type information is subsequently used by the compiler during intermediate-code generation.

3.4 Semantic Errors

Some of the semantic errors that the semantic analyzer is expected to recognize:

• Type mismatch
• Undeclared variable
• Reserved identifier misuse
• Multiple declaration of a variable in a scope
• Accessing an out-of-scope variable
• Actual and formal parameter mismatch

4.0 SELF-ASSESSMENT EXERCISE(S)

1. List some semantic errors that you know.
2. What is the main function of a semantic analyzer?
3. Itemize the functions of semantic analysis.
4. In tabular form, state the differences between syntax and semantics.

5.0 CONCLUSION

• A brief introduction to three methods of semantic description: operational, denotational, and axiomatic.
• Operational semantics is a method of describing the meaning of language constructs in terms of their effects on an ideal machine.
• In denotational semantics, mathematical objects are used to represent the meanings of language constructs. Language entities are converted to these mathematical objects with recursive functions.
• Axiomatic semantics, which is based on formal logic, was devised as a tool for proving the correctness of programs.

6.0 Summary

Static and dynamic semantics were discussed along with different types of semantic errors; binding and scope were discussed with their types as well; and finally, the fundamental semantic issues of variables and the nature of names and special words in programming languages were analyzed.

7.0 Tutor-Marked Assignment

1. Some programming languages are typeless. What are the obvious advantages and disadvantages of having no types in a language?
2. What determines whether a language rule is a matter of syntax or of static semantics?
3. Why is it impossible to detect certain program errors at compile time, even though they can be detected at run time?
4. In what ways are reserved words better than keywords?

8.0 References/Further Reading

1. Semantics in Programming Languages - INFO4MYSTREY, BestMark Mystery2020 (info4mystery.com), https://www.info4mystery.com/2019/01/semantics-inprofgramming-languages.html
2. A. Aho, R. Sethi, J. Ullman, Compilers: Principles, Techniques, and Tools. Reading, MA: Addison-Wesley, 1986.
3. J.P. Bennett, Introduction to Compiling Techniques. Berkshire, England: McGraw-Hill, 1990.
4. A. Pyster, Compiler Design and Construction. New York, NY: Van Nostrand Reinhold, 1988.
5. J. Tremblay, P. Sorenson, The Theory and Practice of Compiler Writing. New York, NY: McGraw-Hill, 1985.
6. Semantic Analysis in Compiler Design, GeeksforGeeks, https://www.geeksforgeeks.org/semantic-analysis-in-computer-design/
7. Compiler Design Semantic Analysis (tutorialspoint.com), https://www.tutorialspoint.com/compiler_design_semantic_analysis.html
8. Semantics of Programming Languages, Computer Science Tripos, Part 1B, Peter Sewell, Computer Laboratory, University of Cambridge, 2009.
9. Specifying Syntax.
10. UTD: Describing Syntax and Semantics: Dr. Chris Davis, cid021000@utdallas.edu
11. Hennessy, M. (1990). The Semantics of Programming Languages. Wiley. Out of print, but available on the web at http://www.cogs.susx.ac.uk/users/matthewh/semnotes.ps.gz
12. Pierce, B. C. (2002). Types and Programming Languages. MIT Press.
13. Composing Programs by John DeNero, based on the textbook Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Jay Sussman, licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
14. Pierce, B. C. (ed.) (2005). Advanced Topics in Types and Programming Languages. MIT Press.
15. Winskel, G. (1993). The Formal Semantics of Programming Languages. MIT Press. An introduction to both operational and denotational semantics; recommended for the Part II Denotational Semantics course.
16. Plotkin, G. D. (1981). A Structural Approach to Operational Semantics. Technical Report DAIMI FN-19, Aarhus University.
MODULE 3 DECLARATION AND TYPES

UNIT 1 THE CONCEPT OF TYPES

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 Primitive Data Types
        3.1.1 Common Data Types
    3.2 Data Types in Programming Languages
    3.3 Type Conversion in Languages
        3.3.1 Type Conversion in C
        3.3.2 Type Conversion in Java
    3.4 Declaration Model
        3.4.1 Variables
        3.4.2 Naming Conventions
    3.5 Variable Declaration in Languages
        3.5.1 Implicit Variable Declaration
        3.5.2 Explicit Variable Declaration
        3.5.3 Variable Declaration in C Programming Language
        3.5.4 Variable Declaration in C++ Programming Language
        3.5.5 Variable Declaration in Java Programming Language
        3.5.6 The Basic Data Types Associated with Variables in Languages
        3.5.7 Data Structures in Java
    3.7 Binding in Programming Languages
        3.7.1 Dynamic Binding
        3.7.2 Static Binding
        3.7.3 Demonstration of Binding in C and Java
        3.7.4 Visibility
    3.8 Scope
        3.8.1 Static Scope
        3.8.2 Dynamic Scope
        3.8.3 Referencing
    3.9 Flow Control Structure in Programming Languages
        3.9.1 Sequence Logic or Sequential Flow
        3.9.2 Selection Logic or Conditional Flow
        3.9.3 Iteration Logic or Repetitive Flow
    3.10 Runtime Consideration in Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Various possibilities exist to define structured data types to express real-world problems. But we know that complex real-world problems require abstraction not only in terms of data structures but also in terms of operations on data objects of such structured types. This means that programming languages should provide constructs for the definition of abstract data types. The fundamental ideas are that data and the appropriate operations on it belong together, and that implementation details are hidden from those who use the abstract data types.
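The fundamental idea just stated, that data and its operations belong together while the representation stays hidden, can be sketched as a tiny abstract data type in Java. This is an illustrative example, not from the course text; the class name and capacity are my own choices:

```java
// Illustrative sketch of an abstract data type: the representation (an int
// array and a top index) is private, so users manipulate the data only
// through the declared operations push, pop, and isEmpty.
public class IntStack {
    private int[] items = new int[8];   // hidden representation
    private int top = 0;

    public void push(int v) {
        if (top == items.length) {      // grow the hidden array when full
            int[] bigger = new int[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, items.length);
            items = bigger;
        }
        items[top++] = v;
    }

    public int pop() {
        if (top == 0) throw new IllegalStateException("empty stack");
        return items[--top];
    }

    public boolean isEmpty() { return top == 0; }

    public static void main(String[] args) {
        IntStack s = new IntStack();
        s.push(1); s.push(2); s.push(3);
        System.out.println(s.pop());    // 3
        System.out.println(s.pop());    // 2
    }
}
```

Because the array is private, a user of IntStack cannot corrupt the structure directly; this is exactly the encapsulation that the introduction describes.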
This is the concept that data and the appropriate operations on it should form a syntactic unit (i.e. the concept of abstract data types). Every programming language uses different word sets in different orders, which means that each programming language uses its own syntax. But, no matter the programming language, computers are really exacting in how we structure our syntax. A programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about computational processes. Programs serve to communicate those ideas among the members of a programming community. Thus, programs must be written for people to read, and only incidentally for machines to execute. When we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three such mechanisms:

• primitive expressions and statements, which represent the simplest building blocks that the language provides,
• means of combination, by which compound elements are built from simpler ones, and
• means of abstraction, by which compound elements can be named and manipulated as units.

In programming, we deal with two kinds of elements: functions and data. Informally, data is stuff that we want to manipulate, and functions describe the rules for manipulating the data. Thus, any powerful programming language should be able to describe primitive data and primitive functions, as well as have some methods for combining and abstracting both functions and data. To develop any instruction, there are some elements that are needed and are essentially present in all languages.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• Know features to declare new types.
• Know features to name new types and to specify operations to manipulate data objects of these new types.
• Know features to hide implementation details about both the objects and the operations.
• Understand control flow, arithmetic expressions, and design issues in arithmetic expressions.
• Understand control flow and execution sequence in different programming languages.
• Understand subprograms in different programming languages as the fundamental building blocks of programs.
• Explore design concepts including parameter-passing methods, local referencing environments, overloaded subprograms, generic subprograms, and the aliasing and side-effect problems.
• Explore programming language constructs that support data abstraction, and discuss the general concept of abstraction in programming and programming languages.

3.0 MAIN CONTENT

3.1 Primitive Data Types

A primitive data type is a classification identifying one of various types of data, such as floating point, integer, or Boolean, that determines the possible values for that type, the operations that can be done on values of that type, and the way values of that type can be stored. It is a basic data type which is provided by a programming language as a basic building block, and a built-in data type for which the programming language provides built-in support. Most languages allow more complicated composite types to be constructed recursively starting from the basic types. Some common classifications of data types are listed below.

3.1.1 Common Data Types

• Integer: a whole number that can have a positive, negative, or zero value. It cannot be a fraction, nor can it include decimal places. It is commonly used in programming, especially for increasing values. Addition, subtraction, and multiplication of two integers result in an integer. However, division of two integers may result in either an integer or a decimal.
The resulting decimal can be rounded off or truncated in order to produce an integer.
• Character: any number, letter, space, or symbol that can be entered in a computer. Every character occupies one byte of space.
• String: used to represent text. It is composed of a set of characters that can include spaces and numbers. Strings are enclosed in quotation marks to identify the data as strings, and not as variable names, nor as numbers.
• Floating Point Number: a number that contains decimals. Numbers that contain fractions are also considered floating-point numbers.
• Varchar: as the name implies, a varchar is a variable character, on account of the fact that the memory storage has variable length. Each character occupies one byte of space, plus 2 additional bytes for length information.
• Array: a kind of list that contains a group of elements which can be of the same data type, such as integer or string. It is used to organize data for easier sorting and searching of related sets of values. An array is a collection of items stored at contiguous memory locations. The idea is to store multiple items of the same type together. This makes it easier to calculate the position of each element by simply adding an offset to a base value, i.e., the memory location of the first element of the array (generally denoted by the name of the array). The base value is index 0, and the difference between the two indexes is the offset. Each variable, called an element, is accessed using a subscript (or index). All arrays consist of contiguous memory locations. The lowest address corresponds to the first element and the highest address to the last element.

3.2 Data Types in Programming Languages

Programming languages have support for different data types. This is why they are able to handle the input values supplied by the users of programs developed using such languages.
Some programming languages categorize the kinds of data they handle into simple and composite data. The simple data types are generally termed primitives. The data types supported differ from one programming language to another. For instance, the C programming language supports the following examples of data: simple types such as integer, long integer, short integer, float, double, character, and Boolean; and composite data types including enumeration, structure, array, and string. The various data types are used to specify and handle the kind of data to be used in a programming problem. These data types are declared in a program through a process called variable declaration.

3.3 Type Conversion in Languages

Different programming languages support different kinds of data types. At times in programming, there may be a need to convert data from one type to another. The term used for describing this process is Type Conversion. Java and C++ (and some other languages) are good examples of programming languages that support type conversion.

3.3.1 Type Conversion in C

The type conversion process in C is basically converting one data type to another to perform some operation. The conversion is done only between those data types for which conversion is possible, e.g. char to int and vice versa.

1) Implicit Type Conversion

This type of conversion is usually performed by the compiler when necessary, without any commands by the user. Thus it is also called "Automatic Type Conversion". The compiler usually performs this type of conversion when a particular expression contains more than one data type. In such cases either type promotion or demotion takes place.
Example:

int a = 20;
double b = 20.5;
a + b;          /* a is promoted to double; the result is 40.5 */

char ch = 'a';
int i = 13;
ch + i;         /* ch is promoted to int; the result is 110 */

2) Explicit Type Conversion

Explicit type conversion rules out the use of the compiler for converting one data type to another; instead, the user explicitly defines within the program the data type of the operands in the expression. The example below illustrates how explicit conversion is done by the user.

Example:

double da = 4.5;
double db = 4.6;
double dc = 4.9;
/* explicitly defined by user */
int result = (int)da + (int)db + (int)dc;
printf("result = %d", result);

Expected output (when these lines of code are run in a C environment):

result = 12

Thus, in the above example we find that the output is 12 because in the result expression the user has explicitly cast the operands (variables) to the integer data type. Hence, there is no implicit conversion of data type by the compiler.

3.3.2 Type Conversion in Java

When you assign a value of one data type to another, the two types might not be compatible with each other. If the data types are compatible, then Java performs the conversion automatically, known as Automatic Type Conversion; if not, they need to be cast or converted explicitly. For example, assigning an int value to a long variable.

Automatic Type Conversion in Java

Widening conversion takes place when two data types are converted automatically. This happens when:

• The two data types are compatible.
• We assign a value of a smaller data type to a bigger data type.

For example, in Java the numeric data types are compatible with each other, but no automatic conversion is supported from a numeric type to char or boolean. Also, char and boolean are not compatible with each other.
class Test
{
    public static void main(String[] args)
    {
        int i = 100;

        // automatic type conversion
        long l = i;

        // automatic type conversion
        float f = l;

        System.out.println("Int value " + i);
        System.out.println("Long value " + l);
        System.out.println("Float value " + f);
    }
}

Narrowing or Explicit Conversion in Java

If we want to assign a value of a larger data type to a smaller data type, we perform explicit type casting or narrowing.

• This is useful for incompatible data types where automatic conversion cannot be done.
• Here, target-type specifies the desired type to convert the specified value to.
• char and number types are not compatible with each other. Let's see what happens when we try to convert one into the other.

// Java program to illustrate incompatible data
// types for explicit type conversion
public class Test
{
    public static void main(String[] argv)
    {
        char ch = 'c';
        int num = 88;
        ch = num;
    }
}

The above lines of code will result in an error as shown below:

7: error: incompatible types: possible lossy conversion from int to char
        ch = num;
             ^
1 error

How to do Explicit Conversion?
Example:

    // Java program to illustrate explicit type conversion
    class Test {
        public static void main(String[] args) {
            double d = 100.04;
            long l = (long)d;    // explicit type casting: fractional part lost
            int i = (int)l;      // explicit type casting
            System.out.println("Double value " + d);
            System.out.println("Long value " + l);
            System.out.println("Int value " + i);
        }
    }

    // Java program to illustrate conversion of int and double to byte
    class Test {
        public static void main(String args[]) {
            byte b;
            int i = 257;
            double d = 323.142;
            System.out.println("Conversion of int to byte.");
            b = (byte) i;    // i % 256
            System.out.println("i = " + i + " b = " + b);
            System.out.println("\nConversion of double to byte.");
            b = (byte) d;    // d % 256
            System.out.println("d = " + d + " b = " + b);
        }
    }

3.4 Declaration Models

3.4.1 Variables

Variables in programming describe how data is represented, which can range from a very simple value to a complex one. The value they contain can change depending on conditions. When creating a variable, we also need to declare the data type it contains, because the program will use different types of data in different ways. Programming languages define data types differently. Data can hold a very simple value, like the age of a person, or something very complex, like a student's track record of performance over a whole year. A variable is a symbolic name given to some known or unknown quantity or information, for the purpose of allowing the name to be used independently of the information it represents. Compilers have to replace variables' symbolic names with the actual locations of the data. While the variable's name, type, and location generally remain fixed, the data stored in the location may be altered during program execution.
For example, almost all languages differentiate between 'integers' (whole numbers, e.g. 12), 'non-integers' (numbers with decimals, e.g. 0.24), and 'characters' (letters of the alphabet or words). In programming languages, we can distinguish between different type levels which, from the user's point of view, form a hierarchy of complexity, i.e. each level allows new data types or operations of greater complexity.

• Elementary level: Elementary (sometimes also called basic or simple) types, such as integers, reals, booleans, and characters, are supported by nearly every programming language. Data objects of these types can be manipulated by well-known operators, like +, -, *, or /, at the programming level. It is the task of the compiler to translate the operators into the correct machine instructions, e.g. fixed-point and floating-point operations. See the diagram below:

[Figure: Type levels in programming languages]

• Structured level: Most high-level programming languages allow the definition of structured types which are based on simple types. We distinguish between static and dynamic structures. Static structures are arrays, records, and sets, while dynamic structures are a bit more complicated, since they are recursively defined and may vary in size and shape during the execution of a program. Lists and trees are dynamic structures.

• Abstract level: Programmer-defined abstract data types are a set of data objects with declared operations on these data objects. The implementation or internal representation of abstract data types is hidden from the users of these types to avoid uncontrolled manipulation of the data objects (i.e. the concept of encapsulation).

3.4.2 Global Variable

A global variable is a variable that is accessible from other classes outside the program or class in which it is declared. Different programming languages have various ways of declaring global variables, when the need arises.
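As an illustration of 3.4.2, a "global" variable in Java is usually modelled as a public static field, so that a single shared copy is visible from any class. The sketch below uses hypothetical class names; they are not part of the course text.

```java
// Sketch: a "global" in Java is a public static field on some class.
class Counter {
    // one shared copy, accessible from anywhere as Counter.count
    public static int count = 0;

    public static void increment() {
        count++;               // any class may read or modify it
    }
}

public class GlobalDemo {
    public static void main(String[] args) {
        Counter.increment();
        Counter.increment();
        System.out.println(Counter.count);  // prints 2
    }
}
```

Because every class sees the same copy, such variables should be used sparingly, for exactly the readability reasons discussed later under binding and scope.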
3.4.3 Local Variable

A local variable is a variable that is not accessible from other classes outside the program or class in which it is declared. These are variables that are used within the current program unit (or function), in contrast to global variables, which are available to all of a program's functions.

3.4.4 Naming Conventions

Unlike their mathematical counterparts, programming variables and constants commonly take multiple-character names, e.g. COST or total. Single-character names are most commonly used only for auxiliary variables; for instance, i, j, k for array index variables. Some naming conventions are enforced at the language level as part of the language syntax and involve the format of valid identifiers. In almost all languages, variable names cannot start with a digit (0-9) and cannot contain whitespace characters. Whether, which, and when punctuation marks are permitted in variable names varies from language to language; many languages only permit the underscore (_) in variable names and forbid all other punctuation. In some programming languages, specific (often punctuation) characters (known as sigils) are prefixed or appended to variable identifiers to indicate the variable's type. Case-sensitivity of variable names also varies between languages, and some languages require the use of a certain case in naming certain entities; most modern languages are case-sensitive, while some older languages are not. Some languages reserve certain forms of variable names for their own internal use; in many languages, names beginning with two underscores ("__") often fall under this category.

3.5 Variable Declaration in Languages

There are various kinds of programming languages. Some languages do not require that variables be declared in a program before being used, while others require that variables be declared. We can therefore have implicit and explicit variable declaration.
Variables can also be classified as global or local, depending on the level of access from within the program.

3.5.1 Implicit Variable Declaration

Programming languages in which the variables used in a program need not be declared are said to support implicit variable declaration. A good example of a programming language that supports this style is Python: a programmer may declare, or choose not to declare, the variables that he intends to use in a program.

3.5.2 Explicit Variable Declaration

Programming languages in which the variables used in a program must be declared are said to support explicit variable declaration. Examples of such languages are Java, C, C++, PASCAL, FORTRAN, and many others. It is important for a programmer to always declare variables before using them in languages like C, C++, VBScript and Java. Depending on the data types to be used in a program, examples of variable declarations in some selected languages are shown below.

3.5.3 Variable Declaration in C Programming Language

    int num1, num2, result;
    float score1, score2, score3;
    double x1, x2, x3, x4, total;
    bool x, y;

3.5.4 Variable Declaration in C++ Programming Language

    int num1, num2, result;
    float score1, score2, score3;
    double x1, x2, x3, x4, total;
    bool x, y;

3.5.5 Variable Declaration in Java Programming Language

    int num1, num2, result;
    float score1, score2, score3;
    double x1, x2, x3, x4, total;
    boolean x, y;

(Note that in Java the Boolean type is spelled boolean, not bool.)

3.5.6 The Basic Data Types Associated with Variables in Languages such as C and Java

These include the following:

• int - integer: a whole number.
• float - floating point value: i.e. a number with a fractional part.
• double - a double-precision floating point value.
• char - a single character.
• byte - a small integer (in Java, an 8-bit value).
• void - valueless special purpose type which we will examine closely in later sections.
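The declarations in 3.5.3 to 3.5.5 can be combined into one small runnable sketch. The variable names below are illustrative, not prescribed by the course text; note the float literal suffix and the boolean spelling that Java requires.

```java
// Sketch: explicit declaration in Java using the basic types above.
public class DeclarationDemo {
    public static void main(String[] args) {
        int num1 = 7, num2 = 3, result;   // whole numbers
        float score1 = 4.5f;              // float literals need the f suffix
        double total = 10.25;             // double-precision value
        char grade = 'A';                 // a single character
        boolean passed = true;            // Java spells it boolean, not bool

        result = num1 + num2;             // every variable is declared before use
        System.out.println(result + " " + score1 + " " + total
                           + " " + grade + " " + passed);
    }
}
```

In an implicitly declared language such as Python, the same variables would simply be assigned without any type annotations; the explicit declarations above are what lets the Java compiler type-check the program statically.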
Note: In C++ and Java, modifiers or specifiers are used to indicate whether a variable is global or local.

3.5.7 Data Structures in Java

It has earlier been pointed out that Java has a wide range of data structures. The data structures provided by the Java utility package are very powerful and perform a wide range of functions. These data structures consist of the following interface and classes:

• Enumeration
• BitSet
• Vector
• Stack
• Dictionary
• Hashtable
• Properties

3.7 Binding in Programming Languages

We may start this part by asking the question "What is binding in programming languages?" Recall that a variable is the storage location for the storing of data in a program. Binding simply means the association of attributes with program entities. Binding can be static or dynamic; C and Java are examples of programming languages that support binding. Dynamic binding allows greater flexibility, but at the expense of readability, efficiency and reliability. Binding describes how a variable is created and used (or "bound") by and within the given program and, possibly, by other programs as well. There are two types of binding: dynamic and static.

3.7.1 Dynamic Binding

Dynamic binding (also known as dynamic dispatch) is the process of mapping a message to a specific sequence of code (a method) at runtime. This is done to support the cases where the appropriate method cannot be determined at compile time. It occurs first during execution, or can change during execution of the program.

3.7.2 Static Binding

Static binding occurs before run time and remains unchanged throughout program execution.

3.7.3 Demonstration of Binding in C and Java

As an example, let us take a look at the program below, implemented in C.
    int i, x = 0;

    void main() {
        for (i = 1; i <= 50; i++)
            x += do_something(x);
    }

The same implementation can also be done in Java, as follows:

    public class Example {
        static int i, x = 0;

        public static void main(String[] args) {
            for (i = 1; i <= 50; i++) {
                x += do_something(x);
            }
        }
    }

In line 1 of the C program we have used three names: int, i and x. One of them represents a type, while the others represent the declaration of two variables. The specification of the C language defines the meaning of the keyword int; the properties related to this specification are bound when the language is defined. Other properties are left out of the language definition. An example is the range of values for the int type: the implementation of a compiler can choose a particular range for the int type that is the most natural for a specific machine. The type of the variables i and x in the first line is bound at compilation time. In the loop body, the program calls the function do_something, whose definition can be in another source file. This reference is resolved at link time, when the linker tries to find the function definition while generating the executable file. At loading time, just before the program starts running, the memory locations for main, do_something, i and x are bound. Some bindings occur while the program is running, i.e. at runtime; an example is the succession of values assigned to i and x during the execution of the program.

Initialization as a Concept in Programming Languages

Initialization is the binding of a variable to a value at the time the variable is bound to storage. For instance, in Java, C and C++, a variable can be declared and initialized at the same time, as follows:

    float x1 = 0.0, x2 = 0.0;
    int num1 = 0, num2 = 0, total = 0, averagevalue = 0;
    bool p = false, r = false, q = false;   /* boolean in Java */

3.8 Scope

The scope of a variable describes where in a program's text the variable may be used, while the extent (or lifetime) describes when in a program's execution a variable has a (meaningful) value.
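As a further illustration of bindings that occur at runtime (3.7.1), the sketch below shows dynamic dispatch in Java: which speak() method runs is bound at run time from the object's actual class, not from the variable's static type. The class names here are hypothetical examples, not part of the course text.

```java
// Sketch of dynamic binding (dynamic dispatch) in Java.
class Animal {
    String speak() { return "..."; }
}

class Dog extends Animal {
    @Override
    String speak() { return "woof"; }   // this binding is chosen at runtime
}

public class DispatchDemo {
    public static void main(String[] args) {
        Animal a = new Dog();            // static type Animal, dynamic type Dog
        System.out.println(a.speak());   // prints "woof", not "..."
    }
}
```

By contrast, the type of the reference a is bound statically, at compile time; only the choice of method body is deferred to execution.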
Scope is a lexical aspect of a variable. Most languages define a specific scope for each variable (as well as any other named entity), which may differ within a given program. The scope of a variable is the portion of the program code for which the variable's name has meaning and for which the variable is said to be "visible". Scope is of two types: static and dynamic.

3.8.1 Static Scope

The static scope of a variable is the most immediately enclosing block, excluding any enclosed blocks where the variable has been re-declared. The static scope of a variable in a program can be determined by simply studying the text of the program; it is not affected by the order in which procedures are called during the execution of the program.

3.8.2 Dynamic Scope

The dynamic scope of a variable extends to all the procedures called thereafter during program execution, until the first procedure to be called that re-declares the variable.

3.8.3 Referencing

The referencing environment is the collection of variables which can be used. In a statically scoped language, one can only reference the variables in the static referencing environment. A function in a statically scoped language does have dynamic ancestors (i.e. its callers), but cannot reference any variables declared in those ancestors.

3.9 Flow Control Structures in Programming Languages

Different programming languages have support for flow control. Flow control constructs such as if-then-else, while-do, and do-while take as arguments a Boolean expression and one or more action statements. The actions are either flow control constructs or assignment operations, which can themselves be constructed recursively. A Java or C++ class written to provide an if-then-else construct takes three arguments in its constructor: a Boolean result object and two action objects, each constructed using one or more primitive constructs.
The evaluate() method of the if-then-else class performs the if-then-else logic upon invocation with appropriate arguments. Most flow control constructs can be provided in a similar manner. In programming languages, complex program logic can be constructed as a hierarchy of objects corresponding to the primitive constructs. Such a conglomerate of objects, each individually performing a simple task and cooperating to achieve a complex one, is the key design principle in this technology and is ideally suited for distributed processing. Other programming languages like C, PASCAL, DELPHI, Python, FORTRAN and Scala, among others, have their own syntaxes for handling flow control. Flow control structures are generally used when there is a need to reach some conclusion based on a set of conditions. Control structures allow the programmer to define the order of execution of statements. The availability of mechanisms that allow such control makes programming languages powerful in their usage for the solution of complex problems. Usually problems cannot be solved just by sequencing some expressions or statements; rather, they require in certain situations decisions, depending on some conditions, about which of several alternatives has to be executed, and/or the repetition or iteration of parts of a program an arbitrary number of times, possibly also depending on some conditions. Control structures are simply a way to specify the flow of control in programs. Any algorithm or program can be made clearer and more understandable if it uses self-contained modules called logic or control structures. A control structure analyzes and chooses the direction in which a program flows based on certain parameters or conditions. There are three basic types of logic, or flow of control:

3.9.1 Sequential Logic (Sequential Flow)

Sequential logic, as the name suggests, follows a serial or sequential flow in which the flow depends on the series of instructions given to the computer.
Unless new instructions are given, the modules are executed in the obvious sequence. The sequence may be given explicitly, by means of numbered steps, or implicitly, following the order in which the modules are written. Most processing, even for some complex problems, will generally follow this elementary flow pattern.

[Figure: Sequential Flow Control]

3.9.2 Selection Logic (Conditional Flow)

Selection logic involves a number of conditions or parameters which select one out of several written modules. The structures which use this type of logic are known as conditional structures. These structures can be of three types:

Single Alternative: This structure has the form:

    If (condition), then:
        [Module A]
    [End of If structure]

Double Alternative: This structure has the form:

    If (condition), then:
        [Module A]
    Else:
        [Module B]
    [End of If structure]

Multiple Alternatives: This structure has the form:

    If (condition A), then:
        [Module A]
    Else if (condition B), then:
        [Module B]
    ..
    Else if (condition N), then:
        [Module N]
    [End of If structure]

In this way, the flow of the program depends on the set of conditions that are written. This can be better understood from the following flow chart:

[Figure: Double Alternative Control Flow]

3.9.3 Iteration Logic (Repetitive Flow)

Iteration logic employs a loop, which involves a repeat statement followed by a module known as the body of the loop. The two types of these structures are:

Repeat-For Structure: This structure has the form:

    Repeat for i = A to N by I:
        [Module]
    [End of loop]

Here, A is the initial value, N is the end value and I is the increment. The loop ends when i passes N; i increases or decreases according to whether I is positive or negative, respectively.

[Figure: Repeat-For Flow]

Repeat-While Structure: This also uses a condition to control the loop.
This structure has the form:

    Repeat while condition:
        [Module]
    [End of loop]

This requires a statement that initializes the condition controlling the loop, and there must also be a statement inside the module that changes this condition, leading to the end of the loop.

[Figure: Repeat-While Flow]

3.10 Runtime of a Program

Run time, also called execution time, is the time during which a program is running (executing), in contrast to other program lifecycle phases such as compile time, link time and load time. When a program is to be executed, a loader first performs the necessary memory setup and links the program with any dynamically linked libraries it needs, and then execution begins, starting from the program's entry point. Some program debugging can only be performed (or is more efficient or accurate when performed) at runtime. Logic errors and array bounds violations are examples of such errors.

3.10.1 Runtime Consideration in Programming Languages

Runtime programming is about being able to specify program logic during application execution, without going through the code-compile-execute cycle. Programmers are expected to consider the run-time behaviour of their choice of implementation in the chosen programming language.

4.0 SELF-ASSESSMENT EXERCISE

1. Write a program in the language of your choice that behaves differently if the language used name equivalence than if it used structural equivalence.

5.0 CONCLUSION

In this unit, various primitive data types and variable declarations were discussed, and examples of variable declaration were shown in some programs. Type conversion, control flow structures, binding and scope were discussed with their types as well.
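The three flows of control described in 3.9 (sequence, selection and iteration) can be sketched together in one short Java program. The example is illustrative only; the names and values are not from the course text.

```java
// Sketch: the three basic control flows in Java.
public class ControlFlowDemo {
    // selection logic: double alternative (if-else)
    static String classify(int n) {
        if (n % 2 == 0) {
            return "even";
        } else {
            return "odd";
        }
    }

    // iteration logic: a repeat-for loop summing 1..n
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {   // ends when i passes n
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        // sequential logic: statements run in the order written
        int n = 5;
        System.out.println(classify(n));  // prints "odd"
        System.out.println(sumTo(n));     // prints 15
    }
}
```

A repeat-while version of sumTo would move the initialization before a while loop and the increment into its body, exactly as the Repeat-While structure above requires.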
6.0 SUMMARY

The data types of a language are a large part of what determines that language's style and usefulness. The primitive data types of most imperative languages include numeric, character and floating-point types; many languages also provide strings and arrays built on these. Case sensitivity and the relationship of names to special words represent design issues of names. Variables are characterized by the sextuple: name, address, value, type, lifetime, scope. Binding is the association of attributes with program entities; expressions fall into categories such as Boolean, character, numeric and relational; and control structures are of three types - sequence, selection and iteration - all of which were discussed in this unit.

7.0 TUTOR-MARKED ASSIGNMENT

1. What are the advantages and disadvantages of dynamic scoping?
2. Define static binding and dynamic binding.
3. Define lifetime, scope, static scope, and dynamic scope.
4. What is the definition of control structure?
5. Rewrite the following pseudocode segment using a loop structure in the specified languages:

        k = (j + 13) / 27
    loop: if k > 10 then goto out
        k = k + 1
        i = 3 * k - 1
        goto loop
    out: ...

   a. C
   b. C++
   c. Java
   d. C#
   e. Python

8.0 REFERENCES/FURTHER READING

1. Programming Languages: Types and Features, Chakray, https://www.chakay.com/programmin-languages-types-and-features.
2. A. Aho, R. Sethi, J. Ullman. Compilers: Principles, Techniques, and Tools. Reading, MA: Addison-Wesley, 1986.
3. A. W. Appel. Modern Compiler Implementation in Java. Cambridge University Press, Cambridge, 1998.
4. J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, 3/E. Addison-Wesley, Reading, 2005. Available online at http://java.sun.com/docs/books/jls/index.html.
5. J. E. Hopcroft, R. Motwani, and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, 2001.
6. C. Laneve.
La Descrizione Operazionale dei Linguaggi di Programmazione. Franco Angeli, Milano, 1998 (in Italian).
7. C. W. Morris. Foundations of the Theory of Signs. In Writings on the Theory of Signs, pages 17-74. Mouton, The Hague, 1938.
8. G. D. Plotkin. A Structural Approach to Operational Semantics. Technical Report DAIMI FN-19, University of Aarhus, 1981.
9. G. Winskel. The Formal Semantics of Programming Languages. MIT Press, Cambridge, 1993.
10. Programming Data Types & Structures | Computer Science, https://teachcomputerscience.com/programming-data-types/.
11. Beginner's Guide to Passing Parameters | Codementor, Michael Safyan, https://www.codementor.io/@michaelsafyan/computer-different-ways-to-passparameters-du107xfer.
12. Parameter Passing | Computing, Stratton, https://blogs.glowscotland.org.uk.
13. Parameter Passing Techniques in C/C++, GeeksforGeeks, https://www.geeksforgeeks.org/.
14. Parameter Passing - Implementation (Computational Constructs), Higher Computing Science Revision, BBC Bitesize, https://www.bbc.co.uk/.
15. Definition of Parameters in Computer Programming, David Bolton, 2017, thoughtco.com.

UNIT 2 OVERVIEW OF TYPE-CHECKING

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 Type Checking
        3.1.1 Strongly Typed and Weakly Typed
        3.1.2 Type Compatibility
    3.2 Garbage Collection
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

For the discussion of type checking, the concept of operands and operators is generalized to include subprograms and assignment statements. Subprograms will be thought of as operators whose operands are their parameters. The assignment symbol will be thought of as a binary operator, with its target variable and its expression being the operands.
2.0 OBJECTIVES

At the end of this unit, you should be able to explain how two data objects are related:

• by forming the left and right sides of an operator;
• by forming the left and right sides of an assignment statement; and
• by being actual and formal parameters.

3.0 MAIN CONTENT

3.1 Type Checking

Type checking is the activity of ensuring that the operands of an operator are of compatible types. A compatible type is one that either is legal for the operator or is allowed under language rules to be implicitly converted, by compiler-generated code (or the interpreter), to a legal type. This automatic conversion is called a coercion. For example, if an int variable and a float variable are added in Java, the value of the int variable is coerced to float and a floating-point add is done. A type error is the application of an operator to an operand of an inappropriate type. For example, in the original version of C, if an int value was passed to a function that expected a float value, a type error would occur (because compilers for that language did not check the types of parameters). If all bindings of variables to types are static in a language, then type checking can nearly always be done statically. Dynamic type binding requires type checking at run time, which is called dynamic type checking. Some languages, because of their dynamic type binding, allow only dynamic type checking. It is better to detect errors at compile time than at run time, because the earlier correction is usually less costly. The penalty for static checking is reduced programmer flexibility: fewer shortcuts and tricks are possible, though such techniques are now generally recognized to be error-prone and detrimental to readability. Type checking is complicated when a language allows a memory cell to store values of different types at different times during execution.
In these cases, type checking, if done, must be dynamic and requires the run-time system to maintain the type of the current value of such memory cells. So, even though all variables are statically bound to types in some languages, not all type errors can be detected by static type checking.

3.1.1 Strongly Typed and Weakly Typed

A programming language is strongly typed if type errors are always detected. This requires that the types of all operands can be determined, either at compile time or at run time. The importance of strong typing lies in its ability to detect all misuses of variables that result in type errors. A strongly typed language also allows the detection, at run time, of uses of incorrect type values in variables that can store values of more than one type. A good example of a strongly typed language is LISP. Weakly typed languages, by contrast, have weak type checks and allow expressions between various different types; their looser typing rules may produce unpredictable or even erroneous results, or may perform implicit type conversion at runtime. C and C++ are weakly typed languages because both include union types, which are not type checked. However, the program designer can choose to implement as strict type checking as desired, by checking for type compatibility during construction of each of the primitive constructs.

3.1.2 Type Compatibility

In most languages a value's type must be compatible with that of the context in which it appears. In an assignment statement, the type of the right-hand side must be compatible with that of the left-hand side. The types of the operands of + must both be compatible with some common type that supports addition (integers, real numbers, or perhaps strings or sets).
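These compatibility rules, and the coercions described in 3.1, can be seen in a short Java fragment. This is a sketch; the variable names are illustrative.

```java
// Sketch: coercion and compatibility in Java assignments and operators.
public class CoercionDemo {
    public static void main(String[] args) {
        int i = 3;
        double d = 0.5;
        double sum = i + d;      // i is coerced to double before the add
        long big = i;            // widening assignment: int -> long is implicit
        int back = (int) 7.9;    // narrowing needs an explicit cast; fraction lost
        System.out.println(sum + " " + big + " " + back);
    }
}
```

Writing `int bad = d;` instead would be rejected at compile time: the right-hand side's type (double) is not assignment-compatible with the left-hand side's type (int) without an explicit cast.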
In a subroutine call, the types of any arguments passed into the subroutine must be compatible with the types of the corresponding formal parameters, and the types of any formal parameters passed back to the caller must be compatible with the types of the corresponding arguments. Type checking is the practice of ensuring that data objects which are somehow related are of compatible types. Two objects are related by forming the left and right sides of an operator, by forming the left and right sides of an assignment statement, and by being actual and formal parameters. Consistency checks which are made before the execution of a source program (i.e. by the compiler) are said to be static checks, while checks performed during the execution of a source program are called dynamic checks (or run-time checks). Checking the syntax is an example of a static check, while type checks are an example of checks which can often be done statically, and which sometimes must be done dynamically.

3.2 Garbage Collection

Explicit reclamation of heap objects is a serious burden on the programmer and a major source of bugs (memory leaks and dangling references). The code required to keep track of object lifetimes makes programs more difficult to design, implement, and maintain. An attractive alternative is to have the language implementation notice when objects are no longer useful and reclaim them automatically. Automatic reclamation (otherwise known as garbage collection) is more or less essential for functional languages: delete is a very imperative sort of operation, and the ability to construct and return arbitrary objects from functions means that many objects that would be allocated on the stack in an imperative language must be allocated from the heap in a functional language, to give them unlimited extent. Over time, automatic garbage collection has become popular for imperative languages as well.
It can be found in, among others, Clu, Cedar, Modula-3, Java, C#, and all the major scripting languages. Automatic collection is difficult to implement, but the difficulty pales in comparison to the convenience enjoyed by programmers once the implementation exists. Automatic collection also tends to be slower than manual reclamation, though it eliminates any need to check for dangling references. Garbage collection presents a classic tradeoff between convenience and safety on the one hand and performance on the other. Manual storage reclamation, implemented correctly by the application program, is almost invariably faster than any automatic garbage collector. It is also more predictable: automatic collection is notorious for its tendency to introduce intermittent "hiccups" in the execution of real-time or interactive programs.

4.0 SELF-ASSESSMENT EXERCISE(S)

1. What purpose(s) do types serve in a programming language?
2. What is garbage collection? How does it work?

5.0 CONCLUSION

This unit discussed how types serve two principal purposes: they provide implicit context for many operations, freeing the programmer from the need to specify that context explicitly, and they allow the compiler to catch a wide variety of common programming errors. A type system consists of a set of built-in types and a mechanism to define new types. Type compatibility determines when a value of one type may be used in a context that "expects" another type. A language is said to be strongly typed if it never allows an operation to be applied to an object that does not support it; a language is said to be statically typed if it enforces strong typing at compile time.

6.0 SUMMARY

Strong typing is the concept of requiring that all type errors be detected; the value of strong typing is increased reliability. Type theories have been developed in many areas.
In computer science, the practical branch of type theory defines the types and type rules of programming languages. Set theory can be used to model most of the structured data types in programming languages. Techniques for garbage collection - the automatic recovery of memory - were also treated, briefly presenting collectors based on reference counting, mark-and-sweep, mark-and-compact and copying.

7.0 TUTOR-MARKED ASSIGNMENT

1. What does it mean for a language to be strongly typed? Statically typed? What prevents, say, C from being strongly typed?
2. Name two important programming languages that are strongly but dynamically typed.
3. Does garbage ever arise in such programming languages? Why or why not?
4. Why was automatic garbage collection so slow to be adopted by imperative programming languages?

8.0 REFERENCES/FURTHER READING

1. Semantics in Programming Languages, INFO4MYSTREY, BestMark Mystery, 2020, https://www.info4mystery.com/2019/01/semantics-inprofgramming-languages.html.
2. A. Aho, R. Sethi, J. Ullman. Compilers: Principles, Techniques, and Tools. Reading, MA: Addison-Wesley, 1986.
3. J. P. Bennett. Introduction to Compiling Techniques. Berkshire, England: McGraw-Hill, 1990.
4. A. Pyster. Compiler Design and Construction. New York, NY: Van Nostrand Reinhold, 1988.
5. J. Tremblay, P. Sorenson. The Theory and Practice of Compiler Writing. New York, NY: McGraw-Hill, 1985.
6. Semantic Analysis in Compiler Design, GeeksforGeeks, https://www.geeksforgeeks.org/semantic-analysis-in-computer-design/.
7. Compiler Design - Semantic Analysis, https://www.tutorialspoint.com/compiler_design_semantic_analysis.html.
8. Semantics of Programming Languages, Computer Science Tripos, Part 1B, Peter Sewell, Computer Laboratory, University of Cambridge, 2009.
9. Specifying Syntax - UTD: Describing Syntax and Semantics, Dr. Chris Davis, cid021000@utdallas.edu.
10. Hennessy, M. (1990).
The Semantics of Programming Languages. Wiley. Out of print, but available on the web at http://www.cogs.susx.ac.uk/users/matthewh/semnotes.ps.gz
11. Pierce, B. C. (2002). Types and Programming Languages. MIT Press.
12. Composing Programs by John DeNero, based on the textbook Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Jay Sussman, licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
13. Pierce, B. C. (ed.) (2005). Advanced Topics in Types and Programming Languages. MIT Press.
14. Winskel, G. (1993). The Formal Semantics of Programming Languages. MIT Press. An introduction to both operational and denotational semantics; recommended for the Part II Denotational Semantics course.
15. Plotkin, G. D. (1981). A Structural Approach to Operational Semantics. Technical Report DAIMI FN-19, Aarhus University.

MODULE 4 ABSTRACTION MECHANISMS

UNIT 1 CONCEPT OF ABSTRACTION

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 Procedure Mechanisms
        3.1.1 Characteristics
        3.1.2 Procedure Implementations
    3.2 Functions and Subroutine Mechanisms
        3.2.1 Advantages of Subroutine
        3.2.2 Disadvantages
    3.3 Iteration Mechanisms
        3.3.1 Strategies to Consider for Iteration
        3.3.2 Iteration Relationship with Recursion
        3.3.3 Characteristics for Iteration
4.0 Self-Assessment Exercise
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Data abstraction may also be defined as the process of identifying only the required characteristics of an object while ignoring the irrelevant details. The properties and behaviors of an object differentiate it from other objects of similar type and also help in classifying/grouping the objects. In programming we apply the same meaning of abstraction by making classes that are not associated with any specific instance.
The abstraction is done when we need only to inherit from a certain class, without needing to instantiate objects of that class. In such a case the base class can be regarded as "incomplete". Such classes are known as abstract base classes.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• explore programming language constructs that support data abstraction;
• discuss the general concept of abstraction in programming and programming languages;
• understand subprograms in different programming languages as the fundamental building blocks of programs;
• explore design concepts including parameter-passing methods, local referencing environments, overloaded subprograms, generic subprograms, and the aliasing and side-effect problems.

3.0 MAIN CONTENT

An Abstract Data Type is a user-defined data type that satisfies the following two conditions:

• The representation of, and operations on, objects of the type are defined in a single syntactic unit.
• The representation of objects of the type is hidden from the program units that use these objects, so the only operations possible are those provided in the type's definition.

Abstract classes and abstract methods:

• An abstract class is a class that is declared with an abstract keyword.
• An abstract method is a method that is declared without an implementation.
• An abstract class may or may not have all abstract methods; some of them can be concrete methods.
• A method defined abstract must always be redefined in the subclass, thus making overriding compulsory, or else the subclass itself must be made abstract.
• Any class that contains one or more abstract methods must also be declared with an abstract keyword.
• There can be no object of an abstract class; that is, an abstract class cannot be directly instantiated with the new operator.
• An abstract class can have parameterized constructors, and the default constructor is always present in an abstract class.
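The rules above can be illustrated with a short Java sketch. The class names Shape and Circle are hypothetical examples chosen for illustration, not from the course text:

```java
// A minimal sketch of an abstract base class, using hypothetical
// Shape/Circle names for illustration.
abstract class Shape {
    protected String name;
    Shape(String name) { this.name = name; }   // parameterized constructor is allowed

    abstract double area();                    // abstract method: no implementation

    String describe() {                        // concrete methods are also allowed
        return name + " with area " + area();
    }
}

class Circle extends Shape {
    private final double r;
    Circle(double r) { super("circle"); this.r = r; }
    @Override
    double area() { return Math.PI * r * r; }  // overriding the abstract method is compulsory
}

public class AbstractDemo {
    public static void main(String[] args) {
        // Shape s = new Shape("?");  // would not compile: abstract classes
        //                            // cannot be instantiated with new
        Shape s = new Circle(1.0);    // but a Shape variable may refer to a subclass object
        System.out.println(s.describe());
    }
}
```

Trying to instantiate Shape directly, or to omit area() in Circle without declaring Circle abstract, produces a compile-time error, which is exactly the "incomplete base class" behavior described above.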
Advantages of Abstraction

• It reduces the complexity of viewing things.
• It avoids code duplication and increases reusability.
• It helps to increase the security of an application or program, as only the important details are exposed to the user.

3.1 Procedure Mechanisms

Today, procedures are still used to write multiply occurring code only once in a program, but the concept of abstraction in the usage of procedures has become more important, since procedures are now used to modularize complex problems. Procedures are a mechanism to control and reduce the complexity of programming systems by grouping certain activities together into syntactically separated program units. Therefore, the multiple usage of program code is no longer a criterion for separating code; rather, procedures increasingly represent source code which is used only once within a program, but which performs some basic operation. This is shown below:

[Figure: Abstraction concepts using procedures — code reduction and complexity reduction]

3.1.1 Characteristics

Although procedures might be slightly different in their usage and implementation in different programming languages, there exist some common characteristics. Among them are:

• Procedures are referred to by a name. They usually have local variable declarations, as well as some parameters forming the communication interface to other procedures and the main program. In addition to this, functions are bound to a particular type, i.e. the type of result they produce.
• A procedure has only one entry point (FORTRAN makes an exception, since it allows the execution of a subprogram to begin at any desired executable statement using the ENTRY statement; but the overall concept is the same as for single-entry procedures).
• The procedure call mechanism allocates storage for local data structures.
• The procedure call mechanism transfers control to the called instance and suspends the calling instance during the execution of the procedure (i.e. no form of parallelism or concurrency is allowed).
• The procedure return mechanism deallocates storage for local data structures.
• The procedure return mechanism transfers control back to the calling instance when the procedure execution terminates.

3.1.2 Procedure Implementation

The principal actions associated with a procedure call/return mechanism are as follows:

• The status of the calling program unit must be saved.
• Storage for local variables must be allocated.
• Parameters must be passed according to the applied method.
• Access links to non-local variables must be established.
• Control must be transferred to the called procedure, e.g. by loading the address of the first instruction of the procedure.
• Before terminating, local values must be moved to actual parameters according to the passing method.
• Storage for local variables must be deallocated.
• Control must be transferred back to the calling unit, e.g. by restoring the address of the instruction that must be returned to on leaving the procedure.

3.2 Functions and Subroutine Mechanisms

This element of programming allows a programmer to gather a snippet of code into one location and use it over and over again. The primary purpose of a function is to take a number of arguments and perform some calculation on them, after which it returns a single result. Functions are useful where you need to do complicated calculations whose result may or may not subsequently be used in an expression. Subroutines, by contrast, may return several results. Calls to subroutines cannot be placed in an expression; in the main program a subroutine is activated by a CALL statement which includes the list of inputs and outputs, enclosed in parentheses, called the arguments of the subroutine.
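The function/subroutine contrast can be sketched in Java, where a subroutine is expressed as a void method. The names FuncVsSub, add, and minMax are hypothetical illustrations, not from the course text:

```java
public class FuncVsSub {
    // A function: takes arguments and returns a single result,
    // so the call can appear inside an expression.
    static int add(int a, int b) { return a + b; }

    // A subroutine-style method: returns no value; its several results
    // are communicated through an output parameter (here, an array).
    static void minMax(int[] data, int[] out) {
        out[0] = data[0]; out[1] = data[0];
        for (int x : data) {
            if (x < out[0]) out[0] = x;   // smallest element
            if (x > out[1]) out[1] = x;   // largest element
        }
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3) * 10);   // function call used inside an expression
        int[] out = new int[2];
        minMax(new int[]{4, 1, 7}, out);      // subroutine-style call: a whole statement
        System.out.println(out[0] + " " + out[1]);
    }
}
```

Note that the minMax call is a complete statement, mirroring the CALL-statement style described above, while add(2, 3) participates directly in the expression add(2, 3) * 10.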
Both follow certain rules for defining names, for example (in early FORTRAN) names of no more than six characters, starting with a letter; the name should also differ from names used for variables and functions. A subroutine (also called procedure, function, routine, method, or subprogram) is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code. A subroutine is often coded so that it can be started ("called") several times and/or from several places during a single execution of the program, including from other subroutines, and then branch back (return) to the next instruction after the "call" once the subroutine's task is done.

A subroutine may be written so that it expects to obtain one or more data values from the calling program (its parameters or arguments). It may also return a computed value to its caller (its return value), or provide various result values or output parameters. Indeed, a common use of subroutines is to implement mathematical functions, in which the purpose of the subroutine is purely to compute one or more results whose values are entirely determined by the parameters passed to the subroutine. (Examples might include computing the logarithm of a number or the determinant of a matrix.)

3.2.1 Advantages of Subroutines

• Decomposition of a complex programming task into simpler steps: this is one of the two main tools of structured programming, along with data structures.
• Reducing the duplication of code within a program.
• Enabling the reuse of code across multiple programs.
• Hiding implementation details from users of the subroutine.

3.2.2 Disadvantages

• The invocation of a subroutine (rather than using in-line code) imposes some computational overhead in the call mechanism itself.
• The subroutine typically requires standard housekeeping code, both at entry to and exit from the function (the function prologue and epilogue, usually saving general-purpose registers and the return address as a minimum).

3.3 Iteration Mechanisms

Iteration is the technique of marking out a block of statements within a computer program for a defined number of repetitions; that block of statements is said to be iterated. An iterator is a mechanism for control abstraction that allows a procedure to be applied iteratively to all members of a collection in some definite order. The elements of the collection are provided by the iterator one at a time, but the policy that selects the next element from the collection is hidden from the user of the iterator and implemented by the iterator. Below is an example of iteration; the line of code between the brackets of the for loop will iterate three times:

a = 0
for i from 1 to 3 {    // loop three times
    a = a + i          // add the current value of i to a
}
print a                // the number 6 is printed (0 + 1 = 1; 1 + 2 = 3; 3 + 3 = 6)

It is permissible, and often necessary, to use values from other parts of the program outside the bracketed block of statements to perform the desired function. In the example above, the loop body uses the value of i as it increments. An iteration abstraction is an operation that gives the client an arbitrarily long sequence of values. Ideally, it computes the elements in the sequence lazily, so it does not do work computing elements that might never be used if the client does not look that far into the sequence. There are four strategies to consider for iteration:

• different interfaces for iteration
• the iterator pattern
• implementing Iterator
• coroutine iterators

3.3.1 Strategies to Consider for Iteration

Return an array: a straightforward approach is to just return a big array containing all the values that might be wanted.
This is not a terrible approach, but the entire array needs to be computed up front, and if the number of elements in the sequence is very large, the array will just be too big. Also, returning an array invites implementations that expose the rep when the underlying implementation stores the elements in an array. To avoid committing the implementer to using arrays, we can alternatively add observer methods for doing iteration, but with an array-like interface:

class Collection<T> {
    ...
    int numElements();
    T getElement(int n);
    ...
}

Given a collection c, we can write a loop using this interface:

for (int i = 0; i < c.numElements(); i++) {
    T x = c.getElement(i);
}

The downside of this is that it is often hard to implement random access to the nth element without simply computing the whole array of results.

Iterator state in the collection: another idea is to have the collection know about the state of the iteration, with an interface like this:

class Collection<T> {
    ...
    /** Set the state of the iteration back to the first element. */
    void resetIteration();
    /** Return the next element in the sequence of iterated values. */
    T next();
    ...
}

This approach is tempting, but has serious problems. It makes it hard to share the object across different pieces of code, because there can be only one client trying to iterate over the object at a time; otherwise, they will interfere with each other.

Iterator pattern: the standard solution to the problem of supporting iteration abstraction is what is known as the iterator pattern; it is also known as cursor objects or generators. The idea is that since we cannot have the collection keep track of the iteration state directly, we instead put that iteration state into a separate iterator object.
Then, different clients can have their own iterations proceeding on the same object, because each will have its own separate iterator object. This is what the interface to that object looks like:

interface Iterator<T> {
    /** Whether there is a next element in the sequence. */
    boolean hasNext();
    /** Advance to the next element in the sequence and return it.
        Throw NoSuchElementException if there is no next element. */
    T next();
    /** Remove the current element. */
    void remove();
}

Then we can provide iteration abstractions by defining operations that return an object that implements the Iterator interface. For example, the Collection class has an operation iterator for iterating through all the elements of a collection:

class Collection<T> {
    Iterator<T> iterator();
}

To use this interface, a for loop is used, as in the following example:

Collection<String> c;
for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
    String x = i.next();
    // use x here
}

Java even has syntactic sugar for writing a loop like this:

Collection<String> c;
for (String x : c) {
    // use x here
}

Under the covers, exactly the same thing is happening when this loop runs, even though you do not have to declare an iterator object i. Notice there is another operation in the interface, remove, which changes the underlying collection to remove the last element returned by next. Not every iterator supports this operation, because it does not make sense for every iteration abstraction to remove elements. Because all Java collections provide an iterator method, iterating over the elements of a collection can be done without needing to know what kind of collection it is, or how iteration is implemented for that collection, or even that the iterator comes from a collection object in the first place. That is the advantage of having an iteration abstraction.
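As a concrete, runnable sketch of the iterator pattern, here is a hypothetical Range collection (not from the course text) whose iterator yields the integers from lo up to hi - 1. The for-each loop over it desugars to exactly the iterator()/hasNext()/next() calls described above:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// A hypothetical collection of the integers lo .. hi-1.
class Range implements Iterable<Integer> {
    private final int lo, hi;
    Range(int lo, int hi) { this.lo = lo; this.hi = hi; }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int curr = lo;   // the iteration state lives in the iterator,
                                     // not in the collection
            public boolean hasNext() { return curr < hi; }
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return curr++;
            }
        };
    }
}

public class RangeDemo {
    public static void main(String[] args) {
        int sum = 0;
        // desugars to: Iterator<Integer> i = r.iterator(); while (i.hasNext()) ...
        for (int x : new Range(1, 5)) sum += x;
        System.out.println(sum);
    }
}
```

Because each call to iterator() returns a fresh iterator object, two clients can iterate over the same Range at the same time without interfering, which is precisely the problem the "iterator state in the collection" design could not solve.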
Specifications for Iterators

Specifications for iteration abstractions are similar to ordinary function specifications, but there are a couple of issues specific to iteration. One is the order in which elements are produced by the iterator: it is useful to specify whether there is or is not an ordering. The second issue is which operations can be performed during iteration. Usually these consist of the observers; however, sometimes observers have hidden side effects that conflict with iterators.

Implementing Iterator

Java Iterators are a nice interface for providing iteration abstraction, but they do have a downside: they are not that easy to implement correctly. There are several problems for the implementer to confront:

• The iterator must remember where it is in the iteration state and be able to restart at any point. Part of the solution is that the iterator has to have some state.
• The methods next and hasNext do similar work, and it can be tricky to avoid duplicating work across the two methods. The solution is to have hasNext save the work it does so that next can take advantage of it.
• The underlying collection may be changed, either by invoking its methods or by invoking the remove method on this or another iterator. Mutations except through the current iterator invalidate the current iterator. The built-in iterators throw a ConcurrentModificationException in this case.

Example: a list iterator. Suppose we have a linked list with a header object of class LinkedList (in this example, it is a parameterized class with a type parameter T).
class LinkedList<T> {
    ListNode<T> first;
    Iterator<T> iterator() { return new LLIter<T>(first); }
}

class ListNode<T> {
    T elem;
    ListNode<T> next;
}

The Iterator interface is implemented here by a separate class LLIter:

class LLIter<T> implements Iterator<T> {
    ListNode<T> curr;
    LLIter(ListNode<T> first) { curr = first; }
    boolean hasNext() { return (curr != null); }
    T next() {
        if (curr == null) throw new NoSuchElementException();
        T ret = curr.elem;    // return the current element, then advance
        curr = curr.next;
        return ret;
    }
    void remove() {
        if (curr != null) {
            // oops! can't implement without a pointer to
            // the previous node.
        }
    }
}

Notice that we can only have one iterator object if any iterator object is mutating the collection. But if no iterators are mutating the collection, and there are no other mutations to the collection from other clients, then there can be any number of iterators. It is actually possible to implement iterators that work in the presence of mutations, but this requires that the collection object keep track of all the iterators that are attached to it, and update their state appropriately whenever a mutation occurs. This example showed how to implement iterators for a collection class, but we can implement iteration abstractions for other problems. For example, suppose we wanted to do some computation on all the prime numbers. We could define an iterator object that iterates over all primes:

class Primes implements Iterator<Integer> {
    int curr = 1;
    boolean hasNext() { return true; }   // the sequence of primes never ends
    Integer next() {
        while (true) {
            curr++;
            boolean prime = true;
            for (int i = 2; i * i <= curr; i++) {
                if (curr % i == 0) { prime = false; break; }
            }
            if (prime) return curr;
        }
    }
}

for (Iterator<Integer> i = new Primes(); i.hasNext(); ) {
    int p = i.next();
    // use p
}

Coroutine iterators: some languages support an easier way to implement iterators, as coroutines. You can think of these as methods that run on the side of the client and send back values whenever they want, without returning.
Instead of return, a coroutine uses a yield statement to send values to the client. This makes writing iterator code simple (the following is pseudocode, not legal Java):

class TreeNode {
    TreeNode left, right;
    int val;
    int elements() iterates(result) {
        foreach (left != null && int elt = left.elements()) yield elt;
        yield val;
        foreach (right != null && int elt = right.elements()) yield elt;
    }
}

Using function objects: a final way to support iteration abstraction is to wrap up the body of the loop that you want to run on each iteration as a function object. Here is how that approach would look as an interface:

class Collection<T> {
    void iterate(Function<T> body);
}

interface Function<T> {
    /** Perform some operation on elem. Return true if the loop should
     *  continue. */
    boolean call(T elem);
}

The idea is that iterate calls body.call(v) for every value v to be sent to the client, at exactly the same places that a coroutine iterator would use yield. The client provides an implementation of Function<T>.call that does whatever it wants to do with the element v. So instead of writing:

for (int x : c) {
    print(x);
}

this is what it looks like:

c.iterate(new MyLoopBody());
...
class MyLoopBody implements Function<Integer> {
    boolean call(Integer x) {
        print(x);      // print the element itself
        return true;
    }
}

3.3.2 Iteration Relationship with Recursion

Iteration and recursion both repeatedly execute a set of instructions. Recursion is when a statement in a function calls itself repeatedly, while iteration is when a loop repeatedly executes until the controlling condition becomes false. The primary difference between recursion and iteration is that recursion is a process always applied to a function, whereas iteration is applied to a set of instructions that we want to have executed repeatedly.

3.3.3 Characteristics for Iteration

• Iteration uses a repetition structure.
• An infinite loop occurs with iteration if the loop condition test never becomes false, and infinite looping uses CPU cycles repeatedly.
• An iteration terminates when the loop condition fails.
• An iteration does not use the stack, so it is faster than recursion.
• Iteration consumes less memory.
• Iteration makes the code longer.

4.0 SELF-ASSESSMENT EXERCISE

1. Discuss the characteristics of procedure mechanisms.
2. List the principal actions associated with a procedure call/return mechanism.

5.0 CONCLUSION

This unit discussed abstract classes and methods, and the advantages and disadvantages of abstraction. We then discussed the procedure mechanism as an abstraction, its characteristics, and procedure implementation as the principal actions associated with a procedure call/return mechanism. The unit has also taken the students through the functions and subroutines mechanism, the advantages and disadvantages of subroutines, and finally the iteration mechanism.

6.0 SUMMARY

A subprogram definition describes the actions represented by the subprogram. Subprograms can be either functions or procedures. We have seen a couple of good options for declaring and implementing iteration abstractions. Recursion defines an operation in terms of simpler instances of itself; it depends on procedural abstraction. Iteration repeats an operation for its side effect(s).

7.0 TUTOR-MARKED ASSIGNMENT

1. What are the design issues for subprograms?
2. Write a program, using the syntax of whatever language you like, that produces different behavior depending on whether pass-by-reference or pass-by-value-result is used in its parameter passing.
3. Write a program to calculate a factorial number by iteration and by recursion, to show the relationship or difference between the two.
4. Differentiate between a function and a subroutine.
5. What are the advantages of abstraction?
6. What are the advantages and disadvantages of subroutines?

8.0 REFERENCES/FURTHER READING

1. Programming Languages: Types and Features, Chakray, https://www.chakay.com/programmin-languages-types-and-features.
2. A. V. Aho, R.
Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison Wesley, Reading, 1988.
3. A. W. Appel. Modern Compiler Implementation in Java. Cambridge University Press, Cambridge, 1998.
4. J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, 3/E. Addison Wesley, Reading, 2005. Available online at http://java.sun.com/docs/books/jls/index.html.
5. J. E. Hopcroft, R. Motwani, and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, 2001.
6. C. Laneve. La Descrizione Operazionale dei Linguaggi di Programmazione. Franco Angeli, Milano, 1998 (in Italian).
7. C. W. Morris. Foundations of the theory of signs. In Writings on the Theory of Signs, pages 17–74. Mouton, The Hague, 1938.
8. G. D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI FN-19, University of Aarhus, 1981.
9. G. Winskel. The Formal Semantics of Programming Languages. MIT Press, Cambridge, 1993.
10. Programming Data Types & Structures, Computer Science (teachcomputerscience.com), https://teachcomputerscience.com/programming-data-types/
11. Beginner's Guide to Passing Parameters, Michael Safyan, Codementor, https://www.codementor.io/@michaelsafyan/computer-different-ways-to-passparameters-du107xfer.
12. Parameter Passing, Computing (glowscotland.org.uk), Stratton, https://blogs.glowscotland.org.uk
13. Parameter Passing Techniques in C/C++, GeeksforGeeks, https://www.geeksforgeeks.org/
14. Parameter passing – Implementation (computational constructs), Higher Computing Science Revision, BBC Bitesize, https://www.bbc.co.uk/
15. Definition of Parameters in Computer Programming (thoughtco.com), David Bolton, 2017.
UNIT 2 PARAMETERIZATION MECHANISMS

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
    3.1 Parameter Passing Method
        3.1.1 Function Parameters
        3.1.2 Value of Using Parameters
    3.2 Parameter Passing Techniques
        3.2.1 Pass-by-value
        3.2.2 Pass-by-reference
    3.3 Activation Records
    3.4 Memory/Storage Management
        3.4.1 Static Allocation
        3.4.2 Dynamic/Stack Allocation
        3.4.3 Heap Allocation
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Different programming languages employ various parameter passing mechanisms. Most subroutines are parameterized: they take arguments that control certain aspects of their behavior, or specify the data on which they are to operate. Parameter names that appear in the declaration of a subroutine are known as formal parameters. Variables and expressions that are passed to a subroutine in a particular call are known as actual parameters.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• explore the design concepts, including parameter-passing methods and the local referencing environment;
• know the ways in which parameters are transmitted to and/or from called subprograms.

3.0 MAIN CONTENT

3.1 Parameter Passing Method

There are some issues to consider when using local variables instead of global variables:

• How can the value of a variable in one subprogram be made accessible to another subprogram?
• What happens if the value of a variable needed in more than one subprogram is to change within one subprogram but not in the main program?

The solution to these issues is to use parameter passing. Parameter passing allows the values of local variables within a main program to be accessed, updated and used within multiple subprograms without the need to create or use global variables. A parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments.
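The formal/actual distinction just introduced can be sketched in Java (the class and method names FormalActual and repeat are hypothetical illustrations):

```java
public class FormalActual {
    // msg and n are the formal parameters in the declaration.
    static String repeat(String msg, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(msg);
        return sb.toString();
    }

    public static void main(String[] args) {
        // "ab" and 3 are the actual parameters (arguments) of this call.
        System.out.println(repeat("ab", 3));
    }
}
```

Each call may supply different actual parameters for the same formal parameters, which is exactly what makes the subroutine reusable.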
An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call can be assigned to the corresponding parameters. Parameters identify values that are passed into a function. For example, a function to add three numbers might have three parameters. A function has a name, and it can be called from other points of a program. When that happens, the information passed is called an argument.

3.1.1 Function Parameters

Each function parameter has a type followed by an identifier, and each parameter is separated from the next parameter by a comma. The parameters pass arguments to the function. When a program calls a function, all the parameters are variables. The value of each of the resulting arguments is copied into its matching parameter in a process called pass by value. The program uses parameters and returned values to create functions that take data as input, make a calculation with it and return the value to the caller.

3.1.2 Value of Using Parameters

• Parameters allow a function to perform tasks without knowing the specific input values ahead of time.
• Parameters are indispensable components of functions, which programmers use to divide their code into logical blocks.

The Difference Between Parameters and Arguments

The terms parameter and argument are sometimes used interchangeably. However, parameter refers to the type and identifier, and arguments are the values passed to the function. In the following C++ example, int a and int b are parameters, while 5 and 3 are the arguments passed to the function.
#include <iostream>
using namespace std;

int addition(int a, int b)
{
    int r;
    r = a + b;
    return r;
}

int main()
{
    int z;
    z = addition(5, 3);
    cout << "The result is " << z;
    return 0;
}

3.2 Parameter Passing Techniques

There are a number of different ways a programming language can pass parameters:

• Pass-by-value
• Pass-by-reference
• Pass-by-value-result
• Pass-by-name

The most common are pass-by-value and pass-by-reference.

3.2.1 Pass-by-value

This method uses in-mode semantics. Changes made to a formal parameter do not get transmitted back to the caller. Any modifications to the formal parameter variable inside the called function or method affect only the separate storage location and will not be reflected in the actual parameter in the calling environment. Example of C++ code:

#include <iostream>
#include <cstdlib>

void AssignTo(int param)
{
    param = 5;
}

int main(int argc, char* argv[])
{
    int x = 3;
    AssignTo(x);
    std::cout << x << std::endl;
    return 0;
}

In a pass-by-value system, the statement AssignTo(x) creates a copy of the argument x. This copy has the same value as the original argument (hence the name "pass-by-value"), but it does not have the same identity. Thus the assignment within AssignTo modifies a variable that is distinct from the supplied parameter, and the output in this passing style is 3 and not 5.
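The same in-mode behavior can be sketched in Java, where primitive parameters are always passed by value (the class and method names here are hypothetical illustrations):

```java
public class PassByValueDemo {
    // param is a copy of the caller's variable; assigning to it
    // has no effect outside this method.
    static void assignTo(int param) {
        param = 5;
    }

    public static void main(String[] args) {
        int x = 3;
        assignTo(x);
        System.out.println(x);  // still 3: only the copy was changed
    }
}
```

Unlike C++, Java offers no `&` reference declarator for primitives, so there is no way to make assignTo overwrite the caller's x directly.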
Example:

#include <iostream>
#include <cstdlib>

// use of the ampersand (&) implies a reference parameter
void AssignTo(int& param)
{
    param = 5;
}

int main(int argc, char* argv[])
{
    int x = 3;
    AssignTo(x);
    std::cout << x << std::endl;
    return 0;
}

In a pass-by-reference system, the function invocation does not create a separate copy of the argument; rather, a reference to the argument (hence the name "pass-by-reference") is supplied to the AssignTo function. Thus, the assignment in AssignTo modifies not just the variable named param, but also the variable x in the caller, causing the output to be 5 and not 3. However, in the C language the default is to pass a parameter by value. An example of parameter passing by value in C is shown below:

#include <stdio.h>

// C function using pass by value
void doit(int x)
{
    x = 5;
}

//
// Test function for passing by value (i.e., making a copy)
//
int main()
{
    int z = 27;
    doit(z);
    printf("z is now %d\n", z);
    return 0;
}

Example of passing by reference in C:

#include <stdio.h>

//
// Pure C code using a pointer as a reference parameter
//
void doit(int *x)
{
    *x = 5;
}

//
// Test code for passing a variable by reference.
// Note the use of the & (ampersand) in the function call.
//
int main()
{
    int z = 27;
    doit(&z);
    printf("z is now %d\n", z);
    return 0;
}

Parameter Passing in Java

To aid the understanding of the learner, we explain the parameter passing mechanisms in Java as follows. In most cases, parameter passing in Java is always pass-by-value; however, the context changes depending upon whether we are dealing with primitives or objects. This statement means that the two styles occur under the following conditions:

• For primitive types, parameters are passed by value.
• For object types, the object reference is passed by value.

Passing Object References

In Java, all objects are dynamically stored in heap space under the hood.
These objects are referred to by references, called reference variables. A Java object, in contrast to primitives, is stored in two stages: the reference variables are stored in stack memory, and the object that they are referring to is stored in heap memory. Whenever an object is passed as an argument, an exact copy of the reference variable is created, which points to the same location of the object in heap memory as the original reference variable. As a result of this, whenever we make any change to the object in the method, that change is reflected in the original object. However, if we assign a new object to the passed reference variable, then that change is not reflected in the original reference.

Pass-by-Reference in Java

Here, we explain pass-by-reference further. When a parameter is passed by reference, the caller and the called operate on the same object: the unique identifier of the object is sent to the method, and any changes to the parameter's instance members will result in that change being made to the original value. The fundamental concepts in any programming language are "values" and "references". In Java, primitive variables store the actual values, whereas non-primitives store reference variables which point to the addresses of the objects they are referring to. Both values and references are stored in the stack memory. That is, arguments in Java are always passed by value.

3.3 Activation Records

The control stack is a run-time stack which is used to keep track of the live procedure activations, i.e. it is used to find out the procedures whose execution has not yet been completed. When a procedure is called (activation begins), its name is pushed onto the stack, and when it returns (activation ends), its name is popped. An activation record is used to manage the information needed by a single execution of a procedure.
An activation record is pushed onto the stack when a procedure is called and popped when control returns to the caller. The diagram below shows the contents of an activation record:

Return value
Actual parameters
Control link
Access link
Saved machine status
Local data
Temporaries

• Return Value: used by the called procedure to return a value to the calling procedure.
• Actual Parameters: used by the calling procedure to supply parameters to the called procedure.
• Control Link: points to the activation record of the caller.
• Access Link: used to refer to non-local data held in other activation records.
• Saved Machine Status: holds information about the status of the machine just before the procedure is called.
• Local Data: holds the data that is local to the execution of the procedure.
• Temporaries: stores the values that arise in the evaluation of an expression.

3.4 Memory Management

Memory management is one of the functions of the interpreter associated with an abstract machine. It manages the allocation of memory for programs and for data: it determines how they must be arranged in memory, how long they may remain there, and which auxiliary structures are required to fetch information from memory. The different ways to allocate memory are:

• Static storage allocation
• Dynamic/stack storage allocation
• Heap storage allocation

3.4.1 Static Storage Allocation

Static memory management is performed by the compiler before execution starts. Statically allocated memory objects reside in a fixed zone of memory (determined by the compiler) and remain there for the entire duration of the program's execution, because the compiler can decide the amount of storage needed by each data object. Thus it is easy for a compiler to identify the address of such data. In static allocation, names are bound to storage locations.
Typical elements for which it is possible to allocate memory statically are global variables, which can be stored in a memory area fixed before execution begins because they are visible throughout the program. The object code instructions produced by the compiler can be considered another kind of static object, given that they normally do not change during the execution of the program, so in this case also memory will be allocated by the compiler. Constants are other elements that can be handled statically (in the case in which their values do not depend on other values which are unknown at compile time). Finally, various compiler-generated tables, necessary for the runtime support of the language (for example, for handling names, for type checking, for garbage collection), are stored in reserved areas allocated by the compiler. If memory is created at compile time, then it is created in the static area, and only once. Static allocation does not support dynamic data structures: memory is created at compile time and deallocated only after program completion. The drawbacks of static storage allocation are that the size and position of data objects must be known at compile time, that recursive procedures are not supported, and that data objects whose size becomes known only at run time cannot be accommodated.

Static Memory Management

3.4.2 Dynamic/Stack Storage Allocation

Most modern programming languages allow block structuring of programs. Blocks, whether in-line or associated with procedures, are entered and left using a LIFO scheme: when a block A is entered, and then a block B is entered, it is necessary to leave B before leaving A. It is therefore natural to manage the memory space required to store the information local to each block using a stack.
Example:

A: { int a = 1;
     int b = 0;
     B: { int c = 3;
          int b = 3;
        }
     b = a + 1;
   }

At runtime, when block A is entered, a push operation allocates a space large enough to hold the variables a and b, as shown in the diagram below.

Allocation of an activation record for block A

When block B is entered, we have to allocate a new space on the stack for the variables c and b (recall that the inner variable b is different from the outer one); the situation after this second allocation is shown in the diagram below.

Allocation of an activation record for blocks A and B

When block B exits, on the other hand, it is necessary to perform a pop operation to deallocate from the stack the space that had been reserved for the block. The situation after such a deallocation and after the assignment is shown in the diagram.

Organization after the Execution of the Assignment

Analogously, when block A exits, it will be necessary to perform another pop to deallocate the space for A as well. The case of procedures is analogous. The memory space, allocated on the stack, dedicated to an in-line block or to an activation of a procedure is called the activation record, or frame. Note that an activation record is associated with a specific activation of a procedure (one is created when the procedure is called) and not with the declaration of a procedure. The values that must be stored in an activation record (local variables, temporary variables, etc.) are indeed different for the different calls on the same procedure. The stack on which activation records are stored is called the runtime (or system) stack. It should finally be noted that, to improve the use of runtime memory, dynamic memory management is sometimes also used to implement languages that do not support recursion.
If the average number of simultaneously active calls to the same procedure is less than the number of procedures declared in the program, using a stack will save space, for there will be no need to allocate a memory area for each declared procedure, as must be done in the case of entirely static management.

Activation Records for In-line Blocks

The structure of a generic activation record for an in-line block is shown in the diagram below. The various sectors of the activation record contain the following information:

Organization of an Activation Record for an In-line Block

Intermediate results: When calculations must be performed, it can be necessary to store intermediate results, even if the programmer does not assign an explicit name to them. For example, the activation record for the block:

{ int a = 3;
  b = (a + x) / (x + y);
}

could take the form shown in the table below, where the intermediate results (a+x) and (x+y) are explicitly stored before the division is performed.

An Activation Record with Space for Intermediate Results

The need to store intermediate results on the stack depends on the compiler being used and on the architecture to which one is compiling. On many architectures they can be stored in registers.

Local variables: Local variables, which are declared inside blocks, must be stored in a memory space whose size depends on the number and type of the variables. This information is in general recorded by the compiler, which is therefore able to determine the size of this part of the activation record. In some cases, however, there can be declarations which depend on values known only at runtime (this is, for example, the case for dynamic arrays, present in some languages, whose dimensions depend on variables which are only instantiated at execution time). In these cases, the activation record also contains a variable-length part which is defined at runtime.
Dynamic chain pointer: This field stores a pointer to the previous activation record on the stack (that is, to the last activation record created). This information is necessary because, in general, activation records have different sizes. It is also called the dynamic link or control link. The set of links implemented by these pointers is called the dynamic chain.

Activation Records for Procedures

The case of procedures and functions is analogous to that of in-line blocks, but with some additional complications due to the fact that, when a procedure is activated, a greater amount of information must be stored to manage the control flow correctly. The structure of a generic activation record for a procedure is shown in the table below.

Structure of the Activation Record for a Procedure

A function, unlike a procedure, returns a value to the caller when it terminates its execution. Activation records for the two cases are therefore identical, with the exception that, for functions, the activation record must also keep track of the memory location in which the function stores its return value. Let us examine the various fields of an activation record:

• Intermediate results, local variables, dynamic chain pointer: The same as for in-line blocks.
• Static chain pointer: Stores the information needed to implement the static scope rules.
• Return address: Contains the address of the first instruction to execute after the call to the current procedure/function has terminated execution.
• Returned result: Present only in functions. Contains the address of the memory location where the subprogram stores the value to be returned when it terminates. This memory location is inside the caller's activation record.
• Parameters: The values of the actual parameters used to call the procedure or function are stored here.

The organization of the different fields of the activation record varies from implementation to implementation. The dynamic chain pointer and, in general, every pointer to an activation record points to a fixed (usually central) area of the activation record. The addresses of the different fields are obtained, therefore, by adding a negative or positive offset to the value of the pointer. Variable names are not normally stored in activation records: the compiler substitutes references to local variables with addresses relative to a fixed position in (i.e., an offset into) the activation record for the block in which the variables are declared. This is possible because the position of a declaration inside a block is fixed statically, so the compiler can associate every local variable with an exact position inside the activation record. In the case of references to non-local variables, too, it is possible to use mechanisms that avoid storing names and therefore avoid having to perform a runtime name-based search through the activation-record stack in order to resolve a reference. Finally, modern compilers often optimize the code they produce and save some information in registers instead of in the activation record. In any case, for greater clarity, in the examples we assume that variable names are stored in activation records. To conclude, note that all the observations made about variable names, their accessibility and their storage in activation records can be extended to other kinds of denotable objects.

3.4.3 Heap Storage Allocation

Heap allocation is the most flexible allocation scheme. Allocation and deallocation of memory can be done at any time and any place, depending upon the user's requirements. Heap allocation is used to allocate memory to variables dynamically and to reclaim that memory when the variables are no longer used.
Heap management is a specialized topic in data structure theory, and there is generally some time and space overhead associated with the heap manager. For efficiency reasons, it may be useful to handle blocks of particular small sizes as a special case, as follows:

• For each size of interest, keep a linked list of free blocks of that size.
• If possible, fill a request for size s with a block of size s', where s' is the smallest size greater than or equal to s.
• When a block is deallocated, return it to the appropriate linked list.
• For larger blocks of storage, use the general heap manager.

Consider for example the following C fragment:

int *p, *q;                 /* p, q pointers to integers */
p = malloc (sizeof (int));  /* allocates the memory pointed to by p */
q = malloc (sizeof (int));  /* allocates the memory pointed to by q */
*p = 0;                     /* dereferences and assigns */
*q = 1;                     /* dereferences and assigns */
free(p);                    /* deallocates the memory pointed to by p */
free(q);                    /* deallocates the memory pointed to by q */

Given that the memory deallocation operations are performed in the same order as the allocations (first p, then q), the memory cannot be reclaimed in LIFO order, so a stack is not adequate.

Heap management methods fall into two main categories, according to whether the memory blocks are considered to be of fixed or variable length.

Fixed-Length Blocks

The heap is divided into a certain number of elements, or blocks, of fairly small fixed length, linked into a list structure called the free list, as seen in the diagram below:

Free List in a Heap with Fixed-Size Blocks

At runtime, when an operation requires the allocation of a memory block from the heap (for example using the malloc command), the first element of the free list is removed from the list, the pointer to this element is returned to the operation that requested the memory, and the pointer to the free list is updated so that it points to the next element.
When memory is freed or deallocated (for example using free), on the other hand, the freed block is linked again to the head of the free list. The situation after some memory allocations is seen in the diagram below.

Free List for a Heap of Fixed-Size Blocks after Allocation of Some Memory (grey blocks are allocated, i.e. in use)

Conceptually, therefore, the management of a heap with fixed-size blocks is simple, provided that it is known how to identify and reclaim easily the memory that must be returned to the free list.

Variable-Length Blocks

In the case in which the language allows the runtime allocation of variable-length memory spaces, for example to store an array of variable dimension, fixed-length blocks are no longer adequate. In fact, the memory to be allocated can have a size greater than the fixed block size, and the storage of an array requires a contiguous region of memory that cannot be allocated as a series of separate blocks. In such cases, a heap-based management scheme using variable-length blocks is used. This type of management uses different techniques, mainly defined with the aim of improving memory occupancy and execution speed of the heap management operations. These operations are performed at runtime and therefore impact the execution time of the program. The two characteristics are difficult to reconcile, and good implementations tend towards a rational compromise. In particular, as far as memory occupancy is concerned, the goal is to avoid the phenomenon of memory fragmentation. So-called internal fragmentation occurs when a block of a size strictly larger than that requested by the program is allocated: the portion of unused memory internal to the block will clearly be wasted until the block is returned to the free list. But this is not the most serious problem, because external fragmentation is worse.
This occurs when the free list is composed of blocks of relatively small size for which, even if the sum of the total available free memory is sufficient, the free memory cannot be effectively used. The diagram below shows an example of this problem. If we have blocks of size x and y on the free list and we request the allocation of a block of greater size, our request cannot be satisfied, despite the fact that the total amount of free memory is greater than the amount of memory requested.

External Fragmentation

Memory allocation techniques therefore tend to compact free memory, merging contiguous free blocks in such a way as to avoid external fragmentation. To achieve this objective, merging operations must be invoked, which increase the load imposed by the management methods and therefore reduce efficiency. Two methods used for reclaiming heap storage are as follows:

• Garbage Collection: When all access paths to an object have been destroyed but the data object continues to exist, the object is said to be garbage. Garbage collection is a technique used to reuse the space of such objects. In garbage collection, we first mark all the active (reachable) objects; all the remaining elements are garbage and are returned to the free-space list.

• Reference Counting: With reference counting, an attempt is made to reclaim each element of heap storage immediately after it can no longer be accessed. Each memory cell on the heap has a reference counter associated with it that contains a count of the number of values that point to it. The count is incremented each time a new value points to the cell and decremented each time a value ceases to point to it. When the counter becomes zero, the cell is returned to the free list for further allocation.
Properties of Heap Allocation

There are various desirable properties of heap allocation:

• Space Efficiency: A memory manager should minimize the total heap space needed by a program.
• Program Efficiency: A memory manager should make good use of the memory subsystem to allow programs to run faster, as the time taken to execute an instruction can vary widely depending on where objects are placed in memory.
• Low Overhead: Memory allocation and deallocation are frequent operations in many programs, so these operations must be as efficient as possible; that is, the overhead (the fraction of execution time spent performing allocation and deallocation) must be minimized.

4.0 SELF-ASSESSMENT EXERCISE(S)

1. There are a number of different ways a programming language can pass parameters; enumerate them and discuss the most common one.
2. What is the function of the activation record in a program?
3. List the three (3) types of memory allocation.

5.0 CONCLUSION

We examined parameter passing and discussed the most common mechanisms with some code samples. We also discussed the activation record and memory management: the main techniques for both static and dynamic memory management, illustrating the reasons for dynamic memory management using a stack and those that require the use of a heap. We have illustrated in detail stack-based management and the format of activation records for procedures and in-line blocks.

6.0 SUMMARY

Formal parameters are the names that subprograms use to refer to the actual parameters given in subprogram calls. Actual parameters can be associated with formal parameters by position or by keyword. Parameters can have default values. Subprograms can be either functions, which model mathematical functions and are used to define new operations, or procedures, which define new statements.
Local variables in subprograms can be stack dynamic, providing support for recursion, or static, providing efficiency and history-sensitive local variables. In the case of heap-based management, we saw some of the more common techniques for its handling, both for fixed- and variable-sized blocks, as well as the fragmentation problem and some methods which can be used to limit it.

7.0 TUTOR-MARKED ASSIGNMENT

1. What are the three semantic models of parameter passing?
2. What are the modes, the conceptual models of transfer, the advantages, and the disadvantages of the pass-by-value, pass-by-result, pass-by-value-result, and pass-by-reference parameter-passing methods?
3. Describe the ways that aliases can occur with pass-by-reference parameters.
4. Discuss the different types of memory allocation and explain how they are used in a program.

8.0 REFERENCES/FURTHER READING

1. Storage Allocation, javatpoint.
2. A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, 1988.
3. R. P. Cook and T. J. LeBlanc. A symbol table abstraction to implement languages with explicit scope control. IEEE Trans. Softw. Eng., 9(1):8–12, 1983.
4. B. Randell and L. J. Russell. Algol 60 Implementation. Academic Press, London, 1964.
5. C. Shaffer. A Practical Introduction to Data Structures and Algorithm Analysis. Addison-Wesley, Reading, 1996.
6. Programming Languages: Types and Features, Chakray, https://www.chakay.com/programmin-languages-types-and-features.
7. A. W. Appel. Modern Compiler Implementation in Java. Cambridge University Press, Cambridge, 1998.
8. J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, 3/E. Addison-Wesley, Reading, 2005. Available online at http://java.sun.com/docs/books/jls/index.html.
9. J. E. Hopcroft, R. Motwani, and J. Ullman.
Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, 2001.
10. C. Laneve. La Descrizione Operazionale dei Linguaggi di Programmazione. Franco Angeli, Milano, 1998 (in Italian).
11. C. W. Morris. Foundations of the theory of signs. In Writings on the Theory of Signs, pages 17–74. Mouton, The Hague, 1938.
12. G. D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI FN-19, University of Aarhus, 1981.
13. G. Winskel. The Formal Semantics of Programming Languages. MIT Press, Cambridge, 1993.
14. Programming Data Types & Structures, Computer Science (teachcomputerscience.com), https://teachcomputerscience.com/programming-data-types/
15. Beginner's Guide to Passing Parameters, Michael Safyan, Codementor, https://www.codementor.io/@michaelsafyan/computer-different-ways-to-passparameters-du107xfer.
16. Parameter Passing, Stratton, Computing (glowscotland.org.uk), https://blogs.glowscotland.org.uk
17. Parameter Passing Techniques in C/C++, GeeksforGeeks, https://www.geeksforgeeks.org/
18. Parameter passing - Implementation (computational constructs), Higher Computing Science Revision, BBC Bitesize, https://www.bbc.co.uk/
19. Definition of Parameters in Computer Programming (thoughtco.com), David Bolton, 2017.
MODULE 5 MODULES IN PROGRAMMING LANGUAGES

UNIT 1 OBJECT ORIENTED PROGRAMMING PARADIGMS

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Object Oriented Programming
3.2 Basic Concepts of Object Oriented Programming
3.2.1 Sample Object Oriented Programming Program in C/C++
3.2.2 C/C++ Program Structure
3.3 Features of Object Oriented Programming
3.4 Benefits of Object Oriented Programming Language
3.5 Limitations of Object Oriented Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Modular programming is the process of subdividing a computer program into separate subprograms. A module is a separate software component; it can often be used in a variety of applications and functions with other components of the system. Some programs might have thousands or millions of lines, and such programs become quite difficult to manage, as there might be many syntax or logical errors present; the concept of modular programming was introduced to manage programs of this kind. Each sub-module contains everything necessary to execute only one aspect of the desired functionality. Modular programming emphasizes breaking large programs into small problems, to increase the maintainability and readability of the code and to make the program handy for making changes in future or correcting errors. When modularizing a program, the following must be decided:

• the limitations (responsibilities) of each and every module;
• the way in which the program is to be partitioned into different modules;
• the communication among the different modules of the code, for proper execution of the entire program.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• understand Object Oriented programming, together with an extended discussion of the primary design issues for inheritance and dynamic binding;
• describe the support for object oriented programming in Smalltalk, C/C++, Java, C#, Ada 95, Ruby and so on, and the implementation of dynamic binding of method calls to methods in object oriented languages.

3.0 MAIN CONTENT

3.1 Object Oriented Programming

The main motivation behind inventing the object oriented approach was to remove the drawbacks encountered in the procedural approach. The object oriented paradigm treats data as a central element in program development and holds it tightly, rather than allowing it to move freely around the system. It ties the data to the functions that operate on it, and hides and protects it from accidental updates by external functions. The object oriented programming paradigm allows decomposition of a system into a number of entities called objects, and then ties properties and functions to these objects. An object's properties can be accessed only by the functions associated with that object, although the functions of one object can access the functions of other objects in some cases, using access specifiers. Object oriented methods enable us to create sets of objects that work together to produce software that is easier to understand and that models its problem domain better than software produced using traditional techniques. Software produced using the object oriented paradigm is easier to adapt to changing requirements, easier to maintain, organized into modules of functionality, better designed, more robust, and performs the desired work efficiently. Object orientation techniques work more efficiently than traditional techniques for the following reasons:

• A higher level of abstraction: the top-down approach supports abstraction at the function level, while the object oriented approach supports abstraction at the object level.
• A seamless transition among the different software development phases: the same language is used for all phases, which reduces the level of complexity and redundancy and makes software development clear and robust.
• Good programming practice: the subroutines and attributes of a class are held together tightly.
• Improved reusability: object orientation supports inheritance, thanks to which classes can be built from each other, so only the differences and enhancements between classes need to be designed and coded. All the previous functionality remains as it is and can be used without change.

3.2 Basic Concepts of Object Oriented Programming

• Objects: Objects are real or abstract items that contain data defining the object together with the methods that can manipulate that data. An object is thus a combination of data and methods.
• Classes: A class is a group of objects that have the same properties and behaviour and the same kinds of relationships and semantics. Once a class has been defined, we can create any number of objects belonging to that class; objects are variables of the class. Each object is associated with data of the class type with which it was created; a class is thus a collection of objects of a similar type.
• Encapsulation: Encapsulation is the process of wrapping up data and functions into a single unit. It is the most striking feature of a class. The data is not accessible to the outside world, and only those functions which are wrapped in the class can access it. It provides the interface between the object's data and the program.
• Abstraction: Abstraction means representing the essential features without the background details and explanation. Classes use the concept of abstraction: they define a list of abstract attributes, such as name, age and gender, together with the operations on these attributes, and they encapsulate all the essential properties of the object.
• Inheritance: Inheritance is the property whereby one class extends another class's properties, adding further methods and variables. The original class is called a superclass, and the class that extends its properties is called a subclass.
As the subclass contains all the data of the superclass plus its own, it is more specific.

• Polymorphism: Polymorphism means the ability to take more than one form. An operation may exhibit different behaviour in different instances; the behaviour depends on the types of data used in the operation.
• Messaging: An object oriented system consists of a set of objects that communicate with each other. Objects communicate with one another by sending and receiving data, much the same way as people pass messages to one another. A message to an object is a request for the execution of a method and will therefore invoke a method in the receiving object that generates the desired result.

3.2.1 Sample Object Oriented Program in C/C++

The C++ programming language was created, designed and developed by a Danish computer scientist named Bjarne Stroustrup at Bell Telephone Laboratories in Murray Hill, New Jersey, in 1979. He wanted a flexible and dynamic language similar to C, with all its features, but with the addition of stricter type checking, basic inheritance, default function arguments, classes, inlining, etc. C++ is both a procedural and an object oriented programming language, and combines low-level and high-level language features, so it is regarded as an intermediate (middle-level) language. C++ is sometimes called a hybrid language because it is possible to write object oriented and procedural code in the same program. This has caused some concern that some C++ programmers are still writing procedural code but are under the impression that it is object oriented, simply because they are using C++; often the result is an amalgamation of the two. This usually causes most problems when the code is revisited or the task is taken over by another coder. C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language.
C++ is one of the most popular programming languages, and its application domains include systems software (such as Microsoft Windows), application software, device drivers, embedded software, high-performance server and client applications, and entertainment software such as video games. It was designed in a UNIX system environment to enable programmers to improve the quality of the code they produce, thus making reusable code easier to write. C++ offers a broad range of OOP features such as multiple inheritance, strong typing, dynamic memory management, templates, polymorphism, exception handling and overloading. C++ also supports the usual features, such as a variety of data types including strings, arrays and structures, full input and output facilities, data pointers and type conversion. C++ continues to be used and is one of the preferred programming languages for developing professional applications. C++ has dynamic memory allocation but does not have garbage collection, which allows programs to misuse and leak memory. It also supports raw memory pointers and pointer arithmetic, which are powerful but dangerous features. Early implementations followed the tradition of translating C++ into C, while others are native compilers. C++ is sometimes called the mother of modern high-level languages. An example of a C++ compiler is the GNU C/C++ compiler (GCC).

3.2.2 C/C++ Program Structure

Let us look at a simple program that prints the words "Hello World".

#include <iostream>
using namespace std;

// main() is where program execution begins
int main() {
    cout << "Hello World"; // prints Hello World
    return 0;
}

Let us look at the various parts of the above program:

• The C++ language defines several headers, which contain information that is either necessary or useful to your program. For this program, the header <iostream> is needed.
• The line using namespace std; tells the compiler to use the std namespace.
Namespaces are a relatively recent addition to C++.
• The next line, '// main() is where program execution begins', is a single-line comment available in C++. Single-line comments begin with // and stop at the end of the line.
• The line int main() is the main function, where program execution begins.
• The next line, cout << "Hello World";, causes the message "Hello World" to be displayed on the screen.
• The next line, return 0;, terminates the main() function and causes it to return the value 0 to the calling process.

Semicolons and Blocks in C++: In C++, the semicolon is a statement terminator. That is, each individual statement must be ended with a semicolon. It indicates the end of one logical entity. For example, the following are three different statements:

x = y;
y = y + 1;
add(x, y);

A block is a set of logically connected statements that are surrounded by opening and closing braces. For example:

{
   cout << "Hello World"; // prints Hello World
   return 0;
}

C++ does not recognize the end of the line as a terminator. For this reason, it does not matter where you put a statement in a line. For example:

x = y;
y = y + 1;
add(x, y);

is the same as

x = y; y = y + 1; add(x, y);

3.3 Features of Object Oriented Programming

• Programs are divided into simple elements referred to as objects.
• Focus is on properties and functions rather than procedure.
• Data is hidden from external functions.
• Functions operate on the properties of an object.
• Objects may communicate with each other through a function called messaging.
• OOP design follows the bottom-up approach.

3.4 Benefits of Object Oriented Programming Language

• OOP makes it very easy to partition the work in a project based on objects.
• It is possible to map the objects in the problem domain to those in the program.
• The principle of data hiding helps the programmer to build secure programs which cannot be invaded by the code in other parts of the program.
• By using inheritance, we can eliminate redundant code and extend the use of existing classes.
• The message passing technique used for communication between objects makes the interface descriptions with external systems much simpler.
• The data-centred design approach enables us to capture more details of a model in an implementable form.
• OOP languages allow the programmer to break the program down into bite-sized problems that can be solved easily (one object at a time).
• The new technology promises greater programmer productivity, better quality of software and lower maintenance cost.
• OOP systems can be easily upgraded from small to large systems.
• It is possible for multiple instances of objects to co-exist without any interference.

3.5 Limitations of Object Oriented Programming Language

• The length of programs developed using an OOP language is much larger than with the procedural approach. Since the program becomes larger in size, it requires more time to execute, which leads to slower execution of the program.
• We cannot apply OOP everywhere, as it is not a universal language. It is applied only when it is required, and it is not suitable for all types of problems.
• Programmers need brilliant design and programming skill, along with proper planning, because using OOP is a little bit tricky.
• OOP takes time to get used to. The thought process involved in object-oriented programming may not be natural for some people.
• Everything is treated as an object in OOP, so before applying it we need to be able to think clearly in terms of objects.

4.0 SELF-ASSESSMENT EXERCISE(S)

1. What is object-oriented programming (OOP)?
2. What are the basic concepts of OOP? Explain each one of them.

5.0 CONCLUSION

Object orientation is so-called because this method sees the things that are part of the real world as objects.
In this unit, we have explained the concept of object-oriented programming, presented a sample program in C/C++ as an intermediate (middle-level) language, and explained its program structure. The features, benefits and limitations of object oriented programming were also looked into for proper understanding.

6.0 SUMMARY

There has been a general but deep examination of the object-oriented paradigm, introduced as a way of obtaining abstractions in the most flexible and extensible way. The paradigm is characterized by the presence of: encapsulation of data; a compatibility relation between types which we call subtype; a way to reuse code, which we refer to as inheritance and which can be divided into single and multiple inheritance; and techniques for the dynamic dispatch of methods. These four concepts were viewed principally in the context of languages based on classes. We then discussed the implementation of single and multiple inheritance and dynamic method selection. The unit concluded with a study of some aspects relevant to the type systems in object-oriented languages: subtype polymorphism; another form of parametric polymorphism, which is possible when the language admits generics; and the problem of overriding of co- and contravariant methods. The object-oriented paradigm, in addition to representing a specific class of languages, is also a general development method for system software, codified by semiformal rules based on the organization of concepts using objects and classes.

7.0 TUTOR MARKED ASSIGNMENT

1. What limitations does OOP have?
2. What is polymorphism? Describe how it is supported in the coding language of your choice.
3. How is polymorphism different from inheritance in OOP?
4. What is interfacing in OOP?

8.0 REFERENCES/FURTHER READING

K. Arnold, J. Gosling, and D. Holmes. The Java Programming Language, 4th edition. Addison-Wesley Longman, Boston, 2005.
G. Birtwistle, O. Dahl, B. Myhrhaug, and K. Nygaard. Simula Begin. Auerbach Press, Philadelphia, 1973. ISBN: 0-262-12112-3.
G. Bracha. Generics in the Java programming language. Technical report, Sun Microsystems, 2004. Available online at java.sun.com/j2se/1.5/pdf/generics-tutorial.pdf.
G. Castagna. Covariance and contravariance: Conflict without a cause. ACM Trans. Prog. Lang. Syst., 17(3):431–447, 1995. citeseer.ist.psu.edu/castagna94covariance.html.
A. Goldberg and D. Robson. Smalltalk-80: The Language and Its Implementation. Addison-Wesley Longman, Boston, 1983. ISBN: 0-201-11371-6.
J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, 3rd edition. Addison Wesley, Reading, 2005. Available online at http://java.sun.com/docs/books/jls/index.html.
T. Lindholm and F. Yellin. The Java Virtual Machine Specification, 2nd edition. Sun and Addison-Wesley, Reading, 1999.
J. Meyer and T. Downing. Java Virtual Machine. O'Reilly, Sebastopol, 1997.
K. Nygaard and O.-J. Dahl. The development of the SIMULA languages. In HOPL-1: The First ACM SIGPLAN Conference on History of Programming Languages, pages 245–272. ACM Press, New York, 1978. doi:10.1145/800025.808391.
R. Pressman. Software Engineering: A Practitioner's Approach, 7th edition. McGraw Hill, New York, 2009.
R. B. Smith and D. Ungar. Programming as an experience: The inspiration for Self. In ECOOP '95: Proc. of the 9th European Conf. on Object-Oriented Programming, pages 303–330. Springer, Berlin, 1995. ISBN: 3-540-60160-01.
Object oriented programming paradigm | Basic Concepts and Features (educba.com).
What is Object Oriented Paradigm in Java? - The Java Programmer.
Bloch Stephen (2011). Survey of Programming Languages, retrieved from home.adelphi.edu.
Deitel Paul and Deitel Harvey (2010). C How to Program, 6th Edition. Pearson Education, Inc., Upper Saddle River, New Jersey 07458.
Krishnamurthi Shriram (2003).
Programming Languages: Application and Interpretation, retrieved from https://cs.brown.edu/~sk/Publications/Books/ProgLangs/2007-04-26/plai-2007-04-26.pdf.
Deitel & Deitel (2015). Java: How to Program (Early Objects), 9th Edition. Paul J. Deitel, Deitel & Associates, Inc.
TutorialsPoint (n.d.). Java Tutorial, retrieved from https://www.tutorialspoint.com/java/index.htm.
Guru99 (n.d.). Introduction to VBScript, retrieved from https://www.guru99.com/introduction-to-vbscript.html.
Geeks for Geeks (n.d.). Semantic analysis in compiler design, retrieved from https://www.geeksforgeeks.org/semantic-analysis-in-compiler-design/.
Kahanwal (2013). Abstraction level taxonomy of programming language frameworks. Retrieved from https://arxiv.org/abs/1311.3293.
David A. Watt (1990). Programming Language Design Concepts. University of Glasgow, with contributions by William Findlay.
Utah University (n.d.). Functions in the C Programming Language, retrieved from https://www.cs.utah.edu/~germain/PPS/Topics/C_Language/c_functions.html.

UNIT 2 FUNCTIONAL PROGRAMMING PARADIGMS

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Functional Programming Languages
    3.1.1 Categories of Functional Programming Language
    3.1.2 Concepts of Functional Programming Language
    3.1.3 Characteristics of Functional Programming Language
    3.1.4 Benefits of Functional Programming Language
    3.1.5 Advantages of Functional Programming Language
    3.1.6 Limitations of Functional Programming Language
4.0 Self-Assessment Exercise
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

Functional programming is a programming paradigm in which we try to bind everything in a pure, mathematical-function style. It is a declarative style of programming. Its main focus is on what to solve, in contrast to an imperative style where the main focus is on how to solve. It uses expressions instead of statements.
An expression is evaluated to produce a value, whereas a statement is executed to assign variables.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• mimic mathematical functions;
• minimize side effects by isolating them from the rest of the software code;
• separate side effects from the rest of the logic, which can make a program easier to maintain, test, debug, extend and refactor;
• be declarative and treat applications as the result of pure functions composed with one another.

3.0 MAIN CONTENT

3.1 Functional Programming Languages

Functional programming is based on lambda calculus. Lambda calculus is a framework developed by Alonzo Church to study computations with functions, and it is the foundation of functional programming. It was developed in the 1930s for functional application, definition, and recursion. LISP was the first functional programming language; McCarthy designed it in 1960. In the late 70s, researchers at the University of Edinburgh defined ML (Meta Language). In the early 80s, the Hope language added algebraic data types for recursion and equational reasoning. In 2004 the functional language Scala appeared. Lambda calculus can be called the smallest programming language in the world: it gives the definition of what is computable (anything that can be computed by lambda calculus is computable), and it is equivalent to the Turing machine in its ability to compute. It provides a theoretical framework for describing functions and their evaluation, and it forms the basis of almost all current functional programming languages.

Functional programming (also called FP) is a way of thinking about software construction by creating pure functions. It avoids the concepts of shared state and mutable data observed in OOP. Functional languages place the emphasis on expressions and declarations rather than on the execution of statements.
Therefore, unlike procedures which depend on a local or global state, an output value in FP depends only on the arguments passed to the function. The objective of any FP language is to mimic mathematical functions; however, the basic process of computation is different in functional programming. Some of the most prominent functional programming languages are: Lisp, Python, Haskell, SML, Clojure, Scala, Erlang, Clean, F#, Scheme, XSLT, SQL and Mathematica.

3.1.1 Categories of Functional Programming Language

Functional programming languages are categorized into two groups:

Pure Functional Languages: These functional languages support only the functional paradigms. A function is called a pure function if it always returns the same result for the same argument values and has no side effects, such as modifying an argument (or a global variable) or outputting something. The only result of calling a pure function is the return value. Examples of pure functions are strlen(), pow(), sqrt(), etc. A pure function is a function whose inputs are all declared as inputs, none of them being hidden; likewise, its outputs are all declared as outputs. Pure functions act on their parameters, and a pure function that returns nothing is pointless, since its return value is its only effect. Moreover, a pure function offers the same output for the same parameters. Example:

function pure(a, b) {
   return a + b;
}

Impure Functional Languages: These functional languages support the functional paradigms as well as imperative-style programming; examples are printf(), rand(), time(), etc. Impure functions have hidden inputs or outputs, which is what makes them impure. Impure functions cannot be used or tested in isolation, as they have dependencies. Example:

int z;
function notPure() {
   z = z + 10;
}

3.1.2 Concepts of Functional Programming

• Pure functions: These functions have two main properties. First, they always produce the same output for the same arguments, irrespective of anything else. Secondly, they have no side effects, i.e.
they do not modify any arguments, local/global variables or input/output streams, as said earlier. The latter property is called immutability. A pure function's only result is the value it returns, and pure functions are deterministic. Programs written using functional programming are easy to debug, because pure functions have no side effects or hidden I/O. Pure functions also make it easier to write parallel/concurrent applications. When code is written in this style, a smart compiler can do many things: it can parallelize the instructions, wait to evaluate results until they are needed, and memoize results, since the results never change as long as the input does not change. Example of a pure function:

sum(x, y)          // sum is a function taking x and y as arguments
   return x + y    // sum returns the sum of x and y without changing them

• Recursion: There are no "for" or "while" loops in functional languages. Iteration in functional languages is implemented through recursion. Recursive functions repeatedly call themselves until the base case is reached. Example of a recursive function:

fib(n)
   if (n <= 1) return 1;
   else return fib(n - 1) + fib(n - 2);

• Referential transparency: In functional programs, variables once defined do not change their values throughout the program. Functional programs do not have assignment statements; if we have to store some value, we define a new variable instead. This eliminates any chance of side effects, because any variable can be replaced with its actual value at any point of execution. The state of any variable is constant at any instant. Example:

x = x + 1  // this changes the value assigned to the variable x,
           // so the expression is not referentially transparent

• Functions are first-class and can be higher-order: First-class functions are treated as first-class variables. First-class variables can be passed to functions as parameters, can be returned from functions or can be stored in data structures.
Higher-order functions are functions that take other functions as arguments, and they can also return functions. Example:

show_output(f)          // function show_output is declared taking
                        // another function f as its argument
   f();                 // calling the passed function

print_gfg()             // declaring another function
   print("hello gfg");

show_output(print_gfg)  // passing one function to another function

• Variables are immutable: In functional programming, we can't modify a variable after it has been initialized. We can create new variables, but we can't modify existing variables, and this really helps to maintain state throughout the runtime of a program. Once we create a variable and set its value, we can have full confidence that the value of that variable will never change.

• Modularity: Modular design increases productivity. Small modules can be coded quickly and have a greater chance of re-use, which surely leads to faster development of programs. Apart from that, the modules can be tested separately, which helps you to reduce the time spent on unit testing and debugging.

• Maintainability: Maintainability is a simple term which means FP programming is easier to maintain, as you don't need to worry about accidentally changing anything outside the given function.

3.1.3 Characteristics of Functional Programming

• The functional programming method focuses on results, not the process.
• The emphasis is on what is to be computed.
• Data is immutable.
• Functional programming decomposes the problem into functions.
• It is built on the concept of mathematical functions, which use conditional expressions and recursion to perform the calculation.
• It does not support iteration constructs like loop statements, or conditional statements like if-else.

3.1.4 Benefits of Functional Programming

• Allows you to avoid confusing problems and errors in the code.
• FP code is easier to test, execute, unit-test and debug.
• Parallel processing and concurrency.
• Hot code deployment and fault tolerance.
• Offers better modularity with shorter code.
• Increased developer productivity.
• Supports nested functions.
• Functional constructs like lazy maps and lists, etc.
• Allows effective use of lambda calculus.

3.1.5 Advantages of Functional Programming Language

• Bug-free code: Functional programming does not support state, so there are no side-effect results and we can write error-free code.
• Efficient parallel programming: Functional programming languages have no mutable state, so there are no state-change issues. One can program functions to work in parallel as "instructions". Such code supports easy reusability and testability.
• Efficiency: Functional programs consist of independent units that can run concurrently. As a result, such programs are more efficient.
• Supports nested functions: Functional programming supports nested functions.
• Lazy evaluation: Functional programming supports lazy functional constructs like lazy lists, lazy maps, etc.

As a downside, functional programming requires a large memory space: as it does not have state, you need to create new objects every time to perform actions. Functional programming is used in situations where we have to perform lots of different operations on the same set of data.

• Lisp is used for artificial intelligence applications like machine learning, language processing, modelling of speech and vision, etc.
• Embedded Lisp interpreters add programmability to some systems, such as Emacs.

3.1.6 Limitations of Functional Programming

• The functional programming paradigm is not easy, so it is difficult for a beginner to understand.
• Code can be hard to maintain, as many objects evolve during coding.
• It needs lots of mocking and extensive environmental setup.
• Re-use is very complicated and needs constant refactoring.
• Objects may not represent the problem correctly.
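The functional concepts surveyed in this unit (pure functions, recursion instead of loops, higher-order functions, immutable bindings) can also be approximated in C++, the sample language of the previous unit. A minimal sketch; the function names are illustrative, not from the text:

```cpp
#include <functional>

// Pure function: its result depends only on its arguments
// and it has no side effects.
int sum(int x, int y) { return x + y; }

// Recursion replaces loops: the Fibonacci function from the
// pseudocode above, written as a recursive C++ function.
int fib(int n) {
    if (n <= 1) return 1;
    return fib(n - 1) + fib(n - 2);
}

// Higher-order function: takes another function as an argument
// and applies it twice to the given value.
int apply_twice(const std::function<int(int)>& f, int x) {
    return f(f(x));
}
```

For example, sum(2, 3) evaluates to 5, fib(5) to 8, and apply_twice([](int x) { return x + 1; }, 10) to 12; declaring a local binding as const int a = sum(2, 3); mimics the immutable variables described above, since a can never be reassigned.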
4.0 SELF-ASSESSMENT EXERCISE(S)

1. Discuss the concepts of functional programming languages.
2. Enumerate the features, benefits and limitations of functional languages.
3. List some examples of functional languages you know.

5.0 CONCLUSION

In this unit, we have explained functional programming languages in detail: we discussed the concepts of functional languages, gave some examples of functional languages, and showed what the structure of a functional language looks like in a program. The characteristics, features, advantages and disadvantages of functional languages were also discussed.

6.0 SUMMARY

The functional programming paradigm, in its pure form, is a computational model which does not contain the concept of a modifiable variable. Computation proceeds by the rewriting of terms that denote functions. The main concepts are: abstraction, which is a mechanism for passing from an expression denoting a value to one denoting a function; application, a mechanism dual to abstraction, by which a function is applied to an argument; and computation by rewriting, or reduction, in which an expression is repeatedly simplified until a form that cannot be further reduced is reached.

7.0 TUTOR MARKED ASSIGNMENT

1. Is the define primitive of Scheme an imperative language feature? Why or why not?
2. Deliberate, with examples, on the categories of functional programming languages and show how they can be differentiated in a program.
3. Write purely functional Scheme functions to:
   (a) return all rotations of a given list. For example, (rotate '(a b c d e)) should return ((a b c d e) (b c d e a) (c d e a b) (d e a b c) (e a b c d)) (in some order);
   (b) return a list containing all elements of a given list that satisfy a given predicate. For example, (filter (lambda (x) (< x 5)) '(3 9 5 8 2 4 7)) should return (3 2 4).
4. What are the differences between a functional language and an object oriented language?

8.0 REFERENCES/FURTHER READING

J. Backus.
Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Commun. ACM, 21(8):613–641, 1978. doi:10.1145/359576.359579.
H. Barendregt. The Lambda Calculus: Its Syntax and Semantics. Elsevier, Amsterdam, 1984.
A. Church. The Calculi of Lambda Conversion. Princeton Univ. Press, Princeton, 1941.
G. Cousineau and M. Mauny. The Functional Approach to Programming. Cambridge Univ. Press, Cambridge, 1998.
R. Hindley and P. Seldin. Introduction to Combinators and Lambda-Calculus. Cambridge Univ. Press, Cambridge, 1986.
P. J. Landin. The mechanical evaluation of expressions. The Computer Journal, 6(4):308–320, 1964.
J. McCarthy. Recursive functions of symbolic expressions and their computation by machine, part I. Commun. ACM, 3(4):184–195, 1960. doi:10.1145/367177.367199.
R. Milner and M. Tofte. Commentary on Standard ML. MIT Press, Cambridge, 1991.
R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). MIT Press, Cambridge, 1997.
S. Peyton-Jones. The Implementation of Functional Programming Languages. Prentice Hall, New York, 1987. Online at http://research.microsoft.com/Users/simonpj/papers/slpj-book-1987/index.htm.
J. D. Ullman. Elements of ML Programming. Prentice-Hall, Upper Saddle River, 1994.
https://www.guru99.com/functional-programming-tutorial.html
https://www.infoworld.com/article/3613715/what-is-functional-programming-a-practical-guide.html
https://www.tutorialspoint.com/functional_programming/functional_programming_introduction.htm
https://en.wikipedia.org/w/index.php?title=Functional_programming&action=edit
Pure Functions - GeeksforGeeks.
https://www.devmaking.com/learn/programming-paradigms/functional-paradigm/

UNIT 3 LOGIC PROGRAMMING PARADIGMS

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Logic Programming Languages
    3.1.1 Logic Programming Terms
    3.1.2 Logic Programming Division
    3.1.3 Sample Logic Program in Prolog
    3.1.4 Structure of Prolog Program
3.2 Characteristics of Logic Programming Language
3.3 Features of Logic Programming Language
3.4 Advantages/Benefits of Logic Programming Language
4.0 Self-Assessment Exercise(s)
5.0 Conclusion
6.0 Summary
7.0 Tutor-Marked Assignment
8.0 References/Further Reading

1.0 INTRODUCTION

The logic programming paradigm includes both theoretical and fully implemented languages, of which the best known is Prolog. Even if there are big differences of a pragmatic and, for some, of a theoretical nature between these languages, they all share the idea of interpreting computation as logical deduction.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• adopt the approach that has characterized the rest of the text while examining this paradigm;
• acquire enough basis for understanding and, in a short time, mastering this and other logic programming languages;
• examine these concepts while trying to limit the theoretical part.

3.0 MAIN CONTENT

3.1 Logic Programming Languages

Prolog (Programming in Logic) is a declarative programming language that is based on the ideas of logic programming.
Prolog arose from the idea of making logic look like a programming language, allowing the logic to be controlled by a programmer, and of advancing research in theorem proving. Logic programming is a programming paradigm that is a road-map to machine learning languages. It tends to impose a certain view of the world on its users. When the paradigm is applied in problem domains that deal with the extraction of knowledge from basic relations and facts, the logic paradigm fits extremely well; in the more general areas of computation, it seems less natural. In the logic (declarative) paradigm, programmers do not focus on the development of an algorithm to achieve a goal; rather, they focus only on the goals of the computation. They focus only on the logic, the what, that has to be achieved. The technique is referred to as resolution, and involves the interpreter searching through the set of all possible solutions. Computer scientists are using logic programming methods to try to allow machines to reason, because this method is useful for knowledge representation. The logic paradigm is based on first-order predicate logic.

3.1.1 Logic Programming Terms

Two basic terms used in the logic paradigm are:

• Logic: used to represent knowledge.
• Inference: used to manipulate knowledge.

3.1.2 Logic Program Division

A logic program is divided into three sections:

• a series of definitions/declarations that define the problem domain;
• statements of relevant facts;
• a statement of goals in the form of a query.

In logic programming, the logic used to represent knowledge is the clausal form, a subset of first-order predicate logic. First-order logic is used because of its understandable nature, and because it can represent all computational problems. The resolution inference system is used to manipulate knowledge; it is required for proving theorems in clausal-form logic.
In the logic paradigm, a program consists of a set of predicates and rules of inference. Predicates are fact-based statements, e.g. "water is wet". Rules of inference, on the other hand, are statements such as "if X is human then X is mortal". There are multiple logic programming languages. The most common is Prolog (Programming in Logic), which can also interface with other programming languages such as Java and C. On top of being the most popular logic programming language, Prolog was also one of the first such languages, with the first Prolog programs created in the 1970s. Prolog was developed using first-order logic, also called predicate logic, which allows for the use of variables rather than propositions. Prolog utilizes artificial intelligence (AI) techniques to help form its conclusions and can quickly process large amounts of data. Prolog can be run with or without manual inputs, meaning it can be programmed to run automatically as part of data processing. Logic programming, and especially Prolog, can help businesses and organizations through:

• Natural language processing: Natural language processing (NLP) allows for better interactions between humans and computers. NLP can listen to human language in real time, then process and translate it for computers; this allows technology to understand natural language. However, NLP is not limited to spoken language: it can also be used to read and understand documentation, both in physical print and from word-processing programs. NLP is used by technologies such as Amazon Alexa and Google Home to process and understand spoken instructions, as well as by email applications to filter spam emails and warn of phishing attempts.

• Database management: Logic programming can be used for the creation, maintenance, and querying of NoSQL databases.
Logic programming can create databases out of big data. The programming can identify which information has been marked as relevant and store it in the appropriate area. Users can then query these databases with specific questions, and logic languages can quickly sift through all of the data, run analyses, and return the relevant result with no additional work required by the user.

• Predictive analysis: With large data sets, logic languages can search for inconsistencies or areas of differentiation in order to make predictions. This can be useful for identifying potentially dangerous activities or for predicting failures of industrial machines. It can also be used to analyze photos and make predictions about the images, such as predicting the identity of objects in satellite photos, or recognizing the patterns that differentiate craters from regular land masses.

3.1.3 Sample Logic Program in Prolog

The logic programming paradigm takes a declarative approach to problem-solving. It is based on formal logic. A logic program isn't made up of instructions; rather, it is made up of facts and clauses. The system uses everything it knows and tries to construct the world in which all of those facts and clauses are true. For instance: Socrates is a man, all men are mortal, and therefore Socrates is mortal. The following simple Prolog program expresses this instance (note that in Prolog, names beginning with an upper-case letter are variables, so constants such as socrates must be written in lower case):

man(socrates).
mortal(X) :- man(X).

The first line can be read: socrates is a man. It is a base clause, which represents a simple fact. The second line can be read: X is mortal if X is a man; in other words, "all men are mortal". This is a clause, or rule, for determining when its input X is mortal. (The symbol ":-", sometimes called a turnstile, is pronounced "if".) We can test the program by asking the question:

?- mortal(socrates).

that is, "Is Socrates mortal?" (The "?-" is the computer's prompt for a question.)
Prolog will respond yes. Another question we may ask is:

?- mortal(X).

That is, "Who (X) is mortal?" Prolog will respond X = socrates.

To give you another idea: Paul is Bhoujou's and Lizy's father, and Sarahyi is Bhoujou's and Lizy's mother. Now, if someone asks a question like "who is the father of Bhoujou and Lizy?" or "who is the mother of Bhoujou and Lizy?", we can teach the computer to answer these questions using logic programming. Example in Prolog (again writing the names in lower case, since they are constants):

/* We're defining family tree facts */
father(paul, bhoujou).
father(paul, lizy).
mother(sarahyi, bhoujou).
mother(sarahyi, lizy).

/* We'll ask questions of Prolog */
?- mother(X, bhoujou).
X = sarahyi

Explanation: father(paul, bhoujou). defines that Paul is Bhoujou's father. With the query, we are asking Prolog: what value of X makes this statement true? X should be sarahyi to make the statement true, so Prolog will respond X = sarahyi.

Example:

domains
   being = symbol
predicates
   animal(being)           % all animals are beings
   dog(being)              % all dogs are beings
   die(being)              % all beings die
clauses
   animal(X) :- dog(X).    % all dogs are animals
   dog(fido).              % fido is a dog
   die(X) :- animal(X).    % all animals die

In logic programming the main emphasis is on the knowledge base and the problem. The execution of the program is very much like the proof of a mathematical statement, e.g., the sum of the first N numbers in Prolog:

predicates
   sum(integer, integer)
clauses
   sum(0, 0).
   sum(N, R) :-
      N1 = N - 1,
      sum(N1, R1),
      R = R1 + N.

3.1.4 Structure of Prolog Program

The general structure of Prolog programs can be summarized by the following four statement types:

• Facts are declared describing objects and their relationships.
• Rules governing the objects and their relationships are then defined.
• Questions are then asked about the objects and their relationships.
• Commands are available at any time for system operations.
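For contrast with the declarative family-tree program above, the same facts can be stored and queried imperatively in C++ (the sample language of this course); a minimal sketch with illustrative names. Where Prolog derives the answer to ?- mother(X, bhoujou). by resolution, here the search through the facts must be spelled out by hand:

```cpp
#include <string>
#include <vector>

// The family-tree facts, stored as (parent, child) pairs.
struct Fact { std::string parent, child; };

const std::vector<Fact> mother_facts = {
    {"sarahyi", "bhoujou"},
    {"sarahyi", "lizy"}
};

// An imperative "query": who is the mother of the given child?
// The programmer must write the search algorithm explicitly,
// whereas in Prolog the interpreter searches for a matching fact itself.
std::string mother_of(const std::string& child) {
    for (const Fact& f : mother_facts)
        if (f.child == child) return f.parent;
    return "";  // no fact matches the query
}
```

Calling mother_of("bhoujou") returns "sarahyi", mirroring the Prolog answer X = sarahyi; the difference is precisely the one stressed in this unit: the Prolog program states only the facts and the goal, while the C++ version also encodes how the answer is found.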
3.2 Characteristics of Logic Programming

• Discipline and idea
• Automatic proofs within artificial intelligence
• Based on axioms, inference rules, and queries
• Program execution becomes a systematic search in a set of facts, making use of a set of inference rules

3.3 Features of Logical Programming

• Logical programming can be used to express knowledge in a way that does not depend on the implementation, making programs more flexible, compressed and understandable.
• It enables knowledge to be separated from use, i.e. the machine architecture can be changed without changing programs or their underlying code.
• It can be altered and extended in natural ways to support special forms of knowledge, such as meta-level or higher-order knowledge.
• It can be used in non-computational disciplines relying on reasoning and precise means of expression.

3.4 Advantages/Benefits of Logic Programming

• The system solves the problem, so the programming steps themselves are kept to a minimum.
• Proving the validity of a given program is simple.
• The code is easy to implement.
• Debugging is easy.
• Since programs are structured using true/false statements, they can be developed quickly.
• As it is based on thinking, expression and implementation, it can be applied in non-computational programs too.
• It supports special forms of knowledge, such as meta-level or higher-order knowledge, as it can be altered.

4.0 SELF ASSESSMENT EXERCISE(S)

1. What is a Logic Programming Language?
2. Explain some of the features of Logic Programming Languages.

5.0 CONCLUSION

In this unit you have been introduced to how programmers do not focus on the development of an algorithm to achieve a goal, but rather only on the goals of the computation, which makes logic programming a road map to machine-learning languages.
The primary characteristics of the logic programming paradigm were discussed; a sample logic program, and some features and benefits of logic programming, were also explained.

6.0 SUMMARY

The logic programming method, also known as predicate programming, is based on mathematical logic. Instead of a sequence of instructions, software programmed using this method contains a set of principles, which can be understood as a collection of facts and assumptions. All inquiries to the program are processed, with the interpreter applying these principles and previously defined rules to them in order to obtain the desired result.

7.0 TUTOR-MARKED ASSIGNMENT

1. Write in Prolog a program that computes the length (understood as the number of elements) of a list and returns this value in numeric form. (Hint: consider an inductive definition of length and use the is operator to increment the value in the inductive case.)
2. List the principal differences between a logic program and a Prolog program.
3. Discuss, with the aid of a diagram, how the three major divisions of a logic program uphold the essence of logic programming.
4. Given the following logic program:
   member(X, [X| Xs]).
   member(X, [Y| Xs]) :- member(X, Xs).
   State the result of evaluating the goal: member(f(X), [1, f(2), 3]).
5. Given the following PROLOG program (recall that X and Y are variables, while a and b are constants):
   p(b) :- p(b).
   p(X) :- r(b).
   p(a) :- p(a).
   r(Y).
   State whether the goal p(a) terminates or not; justify your answer.
6. Explain in detail how logic programming has helped businesses and organizations in their routines.

8.0 REFERENCES/FURTHER READING

1. K. Apt. From Logic Programming to Prolog. Prentice Hall, New York, 1997.
2. S. Ceri, G. Gottlob, and L. Tanca. Logic Programming and Databases. Springer, Berlin, 1989.
3. K. L. Clark. Predicate logic as a computational formalism. Technical Report Res. Rep. DOC 79/59, Imperial College, Dept. of Computing, London, 1979.
4. H. C. Coelho and J. C. Cotta.
Prolog by Example. Springer, Berlin, 1988.
5. E. Eder. Properties of substitutions and unifications. Journal of Symbolic Computation, 1:31–46, 1985.
6. J. Herbrand. Logical Writings. Reidel, Dordrecht, 1971. Edited by W. D. Goldfarb.
7. R. A. Kowalski. Predicate logic as a programming language. Information Processing, 74:569–574, 1974.
8. J. W. Lloyd. Foundations of Logic Programming, 2nd edition. Springer, Berlin, 1987.
9. K. Marriott and P. Stuckey. Programming with Constraints. MIT Press, Cambridge, 1998.
10. A. Martelli and U. Montanari. An efficient unification algorithm. ACM Transactions on Programming Languages and Systems, 4:258–282, 1982.
11. J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1):23–41, 1965.
12. L. Sterling and E. Shapiro. The Art of Prolog. MIT Press, Cambridge, 1986.
13. https://www.freecodecamp.org/news/what-exactly-is-a-programming-paradigm/
14. https://www.cs.ucf.edu/~leavens/ComS541Fall97/hw-pages/paradigms/samplelogic.html
15. PrologPrograms (yorku.ca)