OS course – lecture 1
Amir Averbuch and Nezer J. Zaidenberg

Announcements 1
  Lecturer: Prof. Amir Averbuch
  Tutoring assistant: Mr. Tomer Margalit
  Ex. grader:
  The course format has changed since previous years. You are advised to attend lectures even if you have taken this course before.

Announcements 2
  The same course is given in parallel by Nezer Zaidenberg. The material is almost identical. The exercises are identical.
  Nezer’s material is at http://www.scipio.org/Courses/OS_Course.html
  There is a forum at http://oscourse.freeforums.org/

Scribe
  Following the success last term… 1-2 volunteers are needed as scribes.
  The scribe should attend every lecture and take notes.
  Once a month the scribe will meet Nezer to check and review the class notes.
  The notes will be published to the class to help prepare for the test and the homework.
  Scribe duty done well will count as one homework exercise submitted with a grade of 100 (so the scribe does get something for the effort).
  A second benefit is, of course, reviewing the material!
  People volunteering for scribe duty should contact Nezer via email: nzaidenberg@mac.com.

Grades
  50% exercises (the best 2 are 15% each, the worst 2 are 10% each)
  60% exam
  The exam will include multiple choice questions.
  You must pass both the exam and the homework to pass this course!
  You will have to defend (orally) each exercise you submit.

References
  The main book for this course is Operating System Concepts (6th-8th editions), Silberschatz, Galvin, Gagne
  Additional books (that will be noted in specific lectures)
    UNIX Filesystems by S. Pate
    Beej’s Guide to Network Programming (web guide – networking)
    Solaris Internals by Jim Mauro
    Understanding the Linux Kernel, 3rd edition (for kernel space)
    Advanced Programming in the UNIX Environment, 3rd edition, by W. Richard Stevens (for user space)
    Linux Kernel Development by Robert Love
    Dranger’s guide to FFMPEG and SDL (web guide – multimedia)

Course outline
  The goal of this course is to gain a theoretical and practical understanding of OS.
  The course is very practical, with several homework assignments and an exam based on the class and the homework.
  You will have to defend your work.
  The course will focus on the UNIX OS (in particular Linux), Windows and general principles of OS.
  The exercises will be mainly coding work or reviewing kernel and low-level user code.

What this course is not about
  Coding (to the extent of coding standards, best coding practices or efficiency)
  MPP (beyond simple sync algorithms)
  Network algorithms
  Encryption
  Etc.
  This course is about OS and OS-related programming only.

What is OS
  Once upon a time there were no operating systems.
  Developers used to develop the hardware and the software, together with everything needed to make the application run.
  As computers became more standard and tasks became common, several tasks stood out as needed by all (or practically all) software.
  Those common tasks became the “OS” and were supplied by 3rd party vendors.
  Commercial software used to require a specific OS.
  This became one of the first practices of “code reusability”.

Common OS tasks
  Memory management
  Hardware (driver) support
  Task scheduling
  File management
  User management (on multi-user systems)
  Software providers developed a single piece of software for all those components (and more): the OS.
  This software became the base for all further software development.

What the OS includes
  Core OS functions
    Process management
    User management
    File management
    Memory management
    Hardware management
    Network management
  Additional functions are added and removed according to political and marketing considerations.
    The initial UNIX implementation had a built-in C compiler (sold separately with almost all modern OSs).
    Most UNIX OSs come with a wide range of servers such as an ftp server, telnet server, daytime server, printing server and even an HTTP server, J2EE platform, etc.
    Many OSs also come with a GUI (Windows, OS X Aqua, X Windows) and other user tools (calculator, text processor etc.).
    Microsoft wished to add Internet Explorer, Windows Media Player etc. as part of the OS. However, the European Union prevented the inclusion of Media Player in all Windows versions sold in Europe… (antitrust reasons)
  The course will deal with “core OS” functions only. These services are usually managed by the OS “kernel”.
  We will focus on the UNIX environment: Linux, Xubuntu distribution.

What we will cover
  UNIX software development
  Process
  What is a process
  Memory management
  Opening files
  Multi tasking
  Multi process
  Multi threading
  Sync
  Memory management
  Network
  Kernel development
  IO devices (char device and block device)
  File systems

What is UNIX
  UNIX is a general term for lots of things.
  UNIX is the name of a historical OS developed in the 1970s at Bell Labs.
  It is also a trademark.
  UNIX may refer to a set of standards (POSIX) that defines an OS interface that many OSs follow.
  UNIX is also the general name for a family of proprietary OSs that were tested for standards compliance.
  None of these OSs is called UNIX, but usually their name has an X that gives them away (AIX, OS X, HP-UX, IRIX), with Solaris as the exception.
  Commercial UNIX OSs pay royalties to use the UNIX trademark.
  Since UNICES (plural of UNIX) share the same API and design principles, it is relatively easy (but not effortless) to port software from one UNIX to another.

UNIX-like OSs
  Open source OSs (such as Linux, FreeBSD, OpenBSD) that were not tested by a standards committee (but tend to have the same API) are often referred to as UNIX-like.
  Open source OSs also don’t pay royalties and cannot be officially named UNIX (due to trademark rules).
  Open source UNIX-like OSs have become more and more popular (especially Linux), and in many fields are more common than real “UNIX” boxes.
  Today, for most people, UNIX means an OS with a specific set of APIs and applications, regardless of standards testing (so many people will say Linux is as much of a UNIX as AIX, Solaris and OS X are).
  So in common jargon both UNIX and UNIX-like OSs are considered UNIX (at least for this course).

UNIX OS internals (IMPORTANT)
  The UNIX OS and the UNIX API are built on top of two basic principles:
    Process
    File
  Almost all OS services are implemented as either a process or a file (for example, the system logger is a “process” while a socket is a file).
  This is one of the first implementations of the interface design pattern.
  Throughout the course we will see how complex services are handled using these two basic concepts.

UNIX history
  UNIX (an acronym for “Uniplexed Information and Computing Service”, originally spelled UNICS) began as a pun on Multics (a now virtually extinct OS that allowed multiple users to run multiple tasks in a multi-process environment) and was the name of the OS developed at Bell Labs for their PDP-11 computers.
  The original UNIX developers wanted to build a better Multics.
  The OS was initially distributed in source form so that anybody could modify and customize it.
  The source later branched into two main trunks, SYSV and BSD (with both branches often borrowing code from each other).
  Most modern Unices are based on one of those branches.
  Commercial UNIX today is found mainly in the high end server, workstation and desktop markets.
  Open source UNIX tends to rule several market segments (Linux and Apache rule HTTP serving, and Linux runs on practically all wireless routers) and is a heavy favorite in several others (running database servers).

UNIX OS in the wild (partial list)
  SVR4 Unices
    Solaris and OpenSolaris – Unix by Sun Microsystems
    HP-UX – Unix by HP
    Linux – the most popular open source OS (sponsored by Google, Red Hat, IBM, Silicon Graphics, Novell and more)
    AIX – Unix by IBM
  BSD Unices
    OS X, Darwin and iPhoneOS – Apple’s OS (OS X is the proprietary version and Darwin is the open source version)
    FreeBSD, NetBSD, OpenBSD – open source BSD distributions
  NOT UNIX systems (and how much they support POSIX)
    Windows – not UNIX, but Cygwin (open source) and the Microsoft POSIX subsystems each provide a compatibility layer
    z/OS or MVS – IBM’s OS for mainframes – doesn’t follow UNIX at all but has a compatibility layer

UNIX API summary
  All UNIX and UNIX-like OSs follow the same API.
  All OSs that follow the UNIX API share similar design principles and concepts.
  As long as we program using the UNIX API it is relatively easy to port from one OS to another (and for a developer or administrator to move from one environment to another).
  Please note – some UNIX OSs also have proprietary “non POSIX” APIs to support more features (for example OS X Cocoa). Naturally, code using such features cannot be ported easily.

UNIX acceptance
  Supercomputers – 88.6% of the world’s top 500 supercomputers run Linux.
    99% run UNIX.
  Midrange servers (UNIX’s traditional role) – AIX is the most popular midrange server OS in the financial industry, while Solaris and HP-UX contend for the crown in telco enterprises.
  Low end servers – about 70-75% of web servers (and most J2EE platforms) run on Linux. Many network products (firewalls, switches, routers) are based on Linux.
  High end desktops and workstations – over 65% of workstations and desktops costing above 1000 USD run OS X.
  Smart phones – both iPhoneOS (OS X) and Google Android (Linux) are Unices.
  Databases – Sun was recently acquired by Oracle, and Solaris will be the OS used by Oracle’s Exadata. UNIX is also Tier-1 for most database vendors.
  Embedded systems – VxWorks (arguably the most popular embedded OS) is POSIX compliant. Linux dominates the STB and IGD markets.
  Storage – most storage products run on top of UNIX (such as Linux or AIX).
  SAP environments – are typically managed by UNIX hosts.
  Almost regardless of the industry or company you end up working for… the UNIX skills taught in this course will be a vital tool in your toolbox.

GNU source and the FSF
  GNU – acronym for “GNU’s Not UNIX”
  GNU – a public license granting permission to use, modify and redistribute code provided it remains under the GNU license
  FSF – acronym for “Free Software Foundation”
  FSF – distributes the source for most of the free UNIX applications (the UNIX look and feel) and promotes the GNU license
  Linux kernel – the “main functionality” of the OS, developed by volunteers (not the FSF, but the groups are connected) under the GNU license

The many types of free
  MIT license – allows you to do whatever you want with the code
  BSD license – allows modification and redistribution (the modification may not be free).
    You must keep the credits to the original author.
  GNU GPL – modifications (derived work) must be free
  GNU LGPL – modifications must be free, but linking with GNU source is not a modification (the license used by GNU libraries)
  Other free licenses – check on the FSF website
  Commercial open source – you may view the source but not use it commercially, or you may edit the source but your modifications belong to a specific company, or modifications may not be redistributed. These are not considered “free” (such as the Apple Public License, Sun Public License, Netscape Public License, MySQL).
  Dual license – the software is available under the GPL (if you modify it you must open the source of the modification); the same software is also available with closed source (you can modify it and not open the source) for paying customers. This is the business model of many open source companies.

Introducing Linux…

History of Linux
  In the early 1990s there were commercial UNIX (SYS V) and free UNIX (BSD).
  AT&T (copyright holder of SYS V) sued BSDi (BSD) for copyright infringement.
  BSD was crippled and many stopped using it.
  During the legal vacuum a young Finnish student, Linus Torvalds, started implementing UNIX for the Intel 80386 from scratch. The OS he developed was called Linux.
  Linus developed Linux “just for fun”.
  With BSD in a legal struggle and Linux proving itself as an efficient and scalable alternative, Linux started growing in popularity.
  Linux was adopted by the FSF for its GNU/Linux platform.

Linux today
  Linux today is the most widely used UNIX OS (arguably the most widely used OS), found in cell phones, netbooks, embedded systems, notebooks, desktops, workstations, servers and supercomputers.
  Linux runs on most architectures today (x86, Itanium, ARM, Power, MIPS, s390 and a lot more).
  It is distributed by commercial companies in what is known as a Linux distribution.

What is a Linux distribution
  A Linux distribution is a collection of software from several sources, compiled and branded by a single company.
  All Linux distributions include the Linux kernel and most GNU sources (hence “Linux distribution”…).
  Linux distributions by Red Hat, SuSE, Debian, Ubuntu and Slackware have become popular.
  In this course we will focus on Xubuntu 9.04.

Differences between distributions

Why and what is Xubuntu
  The popular Ubuntu with the Xfce desktop (i.e. not KDE or GNOME)
  The lowest footprint and hardware requirements of all modern Ubuntu flavours

The “founding fathers” of Linux (partial list)
  Linus Torvalds – Finnish – creator and maintainer. About 2% of the current Linux code (which is A LOT) is Linus’s.
  Alan Cox – Welsh
  Greg Kroah-Hartman – American
  Ingo Molnar – Hungarian
  Robert Love – American

Shaped UNIX (partial list)
  Bill Joy
  Eric S. Raymond
  Richard M. Stallman
  Ken Thompson
  Dennis Ritchie
  Marshall Kirk McKusick

But maybe we are moving too fast here…

Why do we need an OS in the first place
  Well, we don’t NEED an OS. But we want one.
  Almost every piece of software requires some set of services such as managing files, allocating memory and usually opening threads, syncing etc.
  Initially people implemented these from scratch with every new piece of software.
  The OS is the basic set of services which allows us to write our software.
  The strategy of rewriting the OS with each new piece of software has long been almost extinct.

OS code and user code
  OS code – usually invoked in “the kernel” – is the code that “makes the computer work”, including drivers, memory management etc.
  User code – most code we write.
  In this course we will use the terms userland and kernelspace to distinguish between them.
  We will write both kinds of code in the exercises.

Where does the OS “end”
  The OS API is often called “system calls”.
  Often when we write code in C we call library functions (such as printf, fopen, malloc) that provide us with “OS” services (such as producing output, opening a file, allocating memory).
  Those are not system calls but C wrappers around system calls (the C wrappers remain the same even on non-POSIX systems such as Windows).

Help request
  In UNIX, when we want brief help on something we can type man (something) and get the help page (try it, for example: man man).
  When the man page refers to a command line executable we will find it in section 1 of the manual.
  When the man page refers to a system call we will find it in section 2 of the manual.
  When the man page refers to a library function we will find it in section 3 of the manual.
  When a name is found in two or more sections we can request the right page by typing “man 1 write” or “man 2 write” (try it!).

Common convention
  When discussing an executable (such as ls) we will put (1) after the executable name (for example ls(1)).
  Similarly, when discussing system calls and library functions we will use 2 and 3 respectively (open(2), printf(3)).
  There may be some inconsistencies in sections between UNIX flavours (functions moving between sections 2 and 3) but it is rare.

Library code relation with OS code
  Library functions invoke OS code to complete their work.
  For example printf(3) (short for print formatted) parses the input and formats the text. Then it calls write(2).
  printf(3) is a library function (and not an OS function) while write(2) is a system call.

Some UNIX file-related system calls you should already know
  open(2)
  close(2)
  read(2)
  write(2)
  To refresh your memory about these functions, type “man 2 open” to get the C information for the function.
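
As a refresher, the four calls listed above can be exercised together in a few lines of C. This is only a minimal sketch under stated assumptions: the function name write_then_read and its parameters are hypothetical and chosen for illustration, the file is assumed to be a small regular file, and a short write is simply treated as an error rather than retried.

```c
#include <fcntl.h>      /* open(2) and the O_* flags */
#include <string.h>     /* strlen(3) */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read(2), write(2), close(2) */

/*
 * Hypothetical helper (not part of any library): write `text` to the
 * file at `path`, then read it back into `buf` as a NUL-terminated
 * string of at most bufsize - 1 bytes.
 * Returns the number of bytes read back, or -1 on any error.
 */
ssize_t write_then_read(const char *path, const char *text,
                        char *buf, size_t bufsize)
{
    if (bufsize == 0)
        return -1;

    /* open(2): create the file if needed, truncate it, write-only */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    /* write(2) may write fewer bytes than asked; treated as an error here */
    size_t len = strlen(text);
    ssize_t written = write(fd, text, len);
    if (written < 0 || (size_t)written != len) {
        close(fd);
        return -1;
    }
    if (close(fd) == -1)    /* close(2) can fail too */
        return -1;

    /* Reopen read-only and read the contents back with read(2) */
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t n = read(fd, buf, bufsize - 1);
    if (n == -1) {
        close(fd);
        return -1;
    }
    buf[n] = '\0';
    close(fd);
    return n;
}
```

Every call here is documented in section 2 of the manual; only strlen(3) is a library function. Calling the helper from main() with a path under /tmp and a short string should hand the same string back in buf.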

Example – man open
  (Note section 2 for a system call. This is a man page from OS X, which is a BSD system.)

  OPEN(2)                BSD System Calls Manual                OPEN(2)

  NAME          (open is the system call name, followed by what the system call does)
       open -- open or create a file for reading or writing

  SYNOPSIS      (required header to use this, then the C style function (prototype) declaration)
       #include <fcntl.h>

       int
       open(const char *path, int oflag, ...);

  DESCRIPTION   (brief function description)
       The file name specified by path is opened for reading and/or writing …

More from man open(2)
  … (description snipped)…

  RETURN VALUES (what the function returns)
       If successful… (snipped)

  ERRORS        (when something interesting/bad happens)
       The named file is opened… (snipped)

  SEE ALSO      (related man pages)
       chmod(2), close(2), dup(2), getdtablesize(2), lseek(2), read(2), umask(2), write(2)

  HISTORY       (standards etc.)
       An open() function call appeared in Version 6 AT&T UNIX

The difference between system and library function
  OS function                                        | Library function
  Lowest level interface                             | Usually calls an OS function
  Section 2 of man                                   | Section 3 of man
  System dependent and may not exist on different    | Programming language dependent
  systems (or used with different name or parameters)| (but not system dependent)
  Almost always bug free                             | Almost always bug free
  Almost always efficient                            | Almost always efficient
  Usually kernel space code                          | Usually user space code

fopen(3) implementation
    /* fopen as in Kernighan & Ritchie, section 8.5. _iob, OPEN_MAX, PERMS,
       _READ and _WRITE are defined by that example's own header, not by
       the system <stdio.h>. */
    FILE *fopen(char *name, char *mode)
    {
        int fd;
        FILE *fp;

        if (*mode != 'r' && *mode != 'w' && *mode != 'a')
            return NULL;
        for (fp = _iob; fp < _iob + OPEN_MAX; fp++)
            if ((fp->flag & (_READ | _WRITE)) == 0)
                break;                      /* found free slot */
        if (fp >= _iob + OPEN_MAX)          /* no free slots */
            return NULL;

        if (*mode == 'w')
            fd = creat(name, PERMS);        /* system call */
        else if (*mode == 'a') {
            if ((fd = open(name, O_WRONLY, 0)) == -1)   /* system call */
                fd = creat(name, PERMS);    /* system call */
            lseek(fd, 0L, 2);               /* system call: seek to end of file */
        } else
            fd = open(name, O_RDONLY, 0);   /* system call */
        if (fd == -1)                       /* couldn't access name */
            return NULL;
        fp->fd = fd;
        fp->cnt = 0;
        fp->base = NULL;
        fp->flag = (*mode == 'r') ? _READ : _WRITE;
        return fp;
    }
  creat, open and lseek are system calls!

Conclusion
  So OS code is the underlying layer of the library function.

Homework – (not for submission)
  Examine the printf(3) code – find the OS function call references.
  Follow all the functions printf calls!
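
Before reading the real printf(3) source, it may help to see the layering described earlier (format the text in user space, then call write(2)) in a few lines. The sketch below is a deliberately simplified stand-in and not the libc implementation: fd_printf is a hypothetical name, there is no stdio buffering, output longer than the fixed buffer is truncated, and an interrupted write is not retried.

```c
#include <stdarg.h>   /* va_list, va_start, va_end */
#include <stdio.h>    /* vsnprintf(3) */
#include <unistd.h>   /* write(2) */

/*
 * Hypothetical, simplified printf-like function (NOT how libc does it):
 *   1. library work: format the text into a user space buffer
 *   2. OS work: hand the bytes to the kernel with write(2)
 * Returns the number of bytes written, or -1 on error.
 */
int fd_printf(int fd, const char *fmt, ...)
{
    char buf[1024];
    va_list ap;

    /* Step 1: formatting happens entirely in user space */
    va_start(ap, fmt);
    int len = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof buf)
        len = (int)sizeof buf - 1;      /* output was truncated */

    /* Step 2: the system call; loop because write(2) may be partial */
    int done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, (size_t)(len - done));
        if (n == -1)
            return -1;
        done += (int)n;
    }
    return done;
}
```

Calling fd_printf(1, "%d files\n", 3) sends the formatted text to standard output (file descriptor 1). The real printf(3) adds buffering and many more code paths on top of this, which is what the homework asks you to trace.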