WWW / Internet
Aaron Bloomfield
CS 415
Fall 2005

Impact of Internet on Programming Languages

Formatting Languages

HTML Background
• HyperText Markup Language
• Tim Berners-Lee, 1989
• Subset/instance of SGML
• Developed to support the WWW
• Developed as a document-layout language

Markup Languages
• Used to encode formatting
• Markup tags embedded in text
• Output device interprets the tags
• Originally used in word processing

SGML
• Standard Generalized Markup Language
• Rules for defining the logical structure of a document
• Basis for creating markup languages
  – Essential info easily transferred

HTML
• HTML file processed and displayed by a Web browser
• Passive language
  – The program that displays the HTML code decides how to interpret the description

Tags
• Tell the Web browser how to format text on screen
• Always enclosed in angle brackets: <tag>
• Not case sensitive
• Extra whitespace does not matter

Anchors
• Allow jumping to different spots
• HREF
  – Inserts a hyperlink
    <A HREF="http://www.virginia.edu">UVA</A>
• NAME
  – Allows a spot in the document to be jumped to directly
    <A NAME="sec2">Section 2</A>

Images
• Only need the image URL
• SRC attribute of the IMG tag
• Images can be placed anywhere
  – ALIGN attribute

HTML Evolution
• W3C & IETF: HTML standards
• Internet Drafts: published, tested, commented on, and become Document Type Definitions
• Browser authors add to HTML
  – Support it in their browsers
  – Community accepts or abandons additions

PDF
• Portable Document Format
• The second generation of PostScript
• PDF files retain their formatting across different viewing environments
• Applications exist for converting documents to PDF
• Can contain hyperlinks

TeX
• Document preparation system
• Typesetting for math and technical material
• Very close control over the document

TeX: File Creation
• TeX file created with a regular text editor
• TeX converts it into a DVI (DeVice Independent) file
• DVI file read by another program
• Creating a DVI file allows one file to produce the same
output in different reading programs

LaTeX
• Macro package for TeX
• Used for document preparation
• Author does not format the document
  – Specify defaults for document classes
• You can create your own document classes

LaTeX Example
  \documentclass{article}
  \title{Latex Example}
  \begin{document}
  Body text goes here.
  \end{document}

PL Concepts
• Formatting languages are not PLs
• But their use employs:
  – Parsing
  – Generating output in a different form
  – Nesting

XML
the eXtensible Markup Language

What is XML?
• XML is a markup language used to represent data in a structured, portable manner
  – It allows programmers to use a standard parser for many different types of data without having to worry about small changes in document structure affecting their parsing code
  – It allows for easy interchange of data between heterogeneous systems
• Why do we care about XML? It's not a programming language.

Example: Gradebook
• Without XML:
    Doe, John 80 75 60 92
    Doe, Jane 81 90 54 84
• With XML:
    <grades>
      <student name="Doe, John">
        <grade name="hw1">80</grade>
        <grade name="hw2">75</grade>
        ...
      </student>
      ...
    </grades>

XML isn't perfect
• For some applications the tags make up a significant fraction of a document's size
  – In general, this doesn't matter
• XML by itself doesn't provide many services
  – Things like linking, templating, parsing, and defining schemas are provided by other languages
• It isn't really that revolutionary
  – It is very similar to SGML and ASN.1

XML DTD
  <?xml version="1.0"?>
  <!ELEMENT grades (student+)>
  <!ELEMENT student (grade+)>
  <!ATTLIST student name CDATA #REQUIRED>
  <!ATTLIST student id CDATA #IMPLIED>
  <!ELEMENT grade (#PCDATA)>
  <!ATTLIST grade name CDATA #REQUIRED>

Why do we use DTDs
• They show an agreement between two groups on a specific way of expressing some data.
  – There are tools that will verify a document against a DTD
• They don't let you enforce restrictions beyond the structure of the document
  – Without another tool there isn't a way to require that a grade tag only encloses a number.

Real-world uses of XML
• Data exchange is great, but how is XML really used?
  – RSS/RDF
  – FOAF
  – SOAP/XML-RPC
  – Content Management/Syndication
  – Streaming Video (SMIL)
  – Microsoft Word
  – ... and much, much more

Content Management
• Many businesses like to write content once, and format it for a variety of destinations
  – We want to have a version for the web (HTML), cellphones (WML), and print (SGML).
  – We don't want to have to make each version by hand
  – Generally it works like this:
    • A story is written and marked up in XML
    • CMS software uses stylesheets to create the different outputs from the same source
    • Everybody is happy

Style sheets for XML
• The most common style sheet language is XSLT: eXtensible Stylesheet Language Transformations.
  – It allows us to transform one XML document into another document.
  – Unlike XML, XSLT is a programming language
  – XSLT programs are themselves expressed in XML
  – XSLT is interpreted
  – XSLT is dynamically typed and type-safe, does dynamic type checking, and is statically scoped

Example: gradebook stylesheet
  <xsl:stylesheet version='1.0'
      xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
    <xsl:template match="/">
      <xsl:apply-templates />
    </xsl:template>
    <xsl:template match="student">
      <h1><xsl:value-of select="@name"/></h1>
      <table><xsl:apply-templates /></table>
    </xsl:template>

Gradebook continued
    <xsl:template match="grade">
      <tr>
        <td><xsl:value-of select="@name"/></td>
        <td><xsl:value-of select="."/></td>
      </tr>
    </xsl:template>
  </xsl:stylesheet>

More XSLT
• We can also do conditional processing
    <xsl:variable name="thisgrade" select="."/>
    <xsl:if test="$thisgrade &lt; 60">
      <h2><xsl:value-of select="."/></h2>
    </xsl:if>
    <xsl:if test="not ($thisgrade &lt; 60)">
      <xsl:value-of select="."/>
    </xsl:if>
• ...and other things
    <xsl:value-of select="sum(./grade) div count(./grade)" />

Internet Applications and Multimedia

History
• People like to be able to use their senses
• However, the space that digital pictures, sounds, and video required was far more than a disk could hold
• Around 1994, the Internet became more popular in ordinary households
• However, browsers were originally mostly text based and connections were too slow to support large files

Flash
• Macromedia created Flash, which allowed people to develop multimedia content in one simple program
• Flash uses vector graphics, which are very space efficient, and incorporates other methods for compressing media types
• Also provides streaming content, which is extremely useful to an Internet user

ActionScript
• ActionScript is Flash's scripting language
• From a programmer's point of view, it makes Flash a lot easier to work with
• The programmer has complete control over what is going on in their application
• Enhances one's ability to create an interactive application, be that a presentation, a game,
or a movie

ActionScript cont.
• ActionScript and Flash are primarily used today as a 2D graphics environment.
• However, people are also using ActionScript to create interactive forms on the web
• Another useful application: a company could store something on its main server and, when it has to give a presentation out in Seattle, access it there with real-time streaming and interactivity

More on ActionScript
• In ActionScript, we don't need to specify the type of a variable.
    favColor = "pink";
  sets the variable named 'favColor' to a string with the value "pink".
• Also, favColor can later be changed to another type. Following the first example,
    favColor = 1;
  sets the variable 'favColor' to the number 1.

More on ActionScript
• However, if you want a function to take only a certain type and return only a certain type, you can do something like this:
    function doThis(myWord:String):Number { }
• Also, ActionScript (being part of Flash) was designed with interaction in mind.
    on (release) { clip2._visible = false; }

More on ActionScript
• There are currently other languages which are intended to achieve the same results as ActionScript.
• The popular ones include JavaScript and the upcoming Avalon.
• Using ActionScript, a programmer can greatly enhance a presentation with sound, graphics, video, and overall layout

More on ActionScript
• Both ActionScript and JavaScript are based on the ECMA-262 standard (ECMA: the European Computer Manufacturers Association)
• One big difference is that ActionScript doesn't support browser-specific objects and commands.
• Microsoft plans to distribute Avalon as a feature of its future OS. It will support 2D and 3D vector graphics.
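The two typing ideas from the ActionScript slides above (variables that can be rebound to a value of another type, and optional type declarations on a function) can be sketched in Python, which has close analogues of both. This is only an illustration in a different language, not ActionScript: the names `fav_color` and `do_this` are hypothetical stand-ins for `favColor` and `doThis`, and the function body (returning the word's length) is an assumption, since the slide leaves the body empty. Note also that Python does not enforce annotations at run time; they are checked only by external tools.

```python
def do_this(my_word: str) -> int:
    """Hypothetical counterpart of doThis(myWord:String):Number.

    The annotations document the intended types; Python itself does not
    enforce them when the function is called.
    """
    return len(my_word)


# A variable carries no declared type; it simply refers to a value.
fav_color = "pink"
first_type = type(fav_color).__name__   # type of the value currently bound

# The same name can later be rebound to a value of another type.
fav_color = 1
second_type = type(fav_color).__name__

if __name__ == "__main__":
    print(first_type, second_type, do_this("pink"))
```

The design point is the same in both languages: types belong to values rather than to variable names, and declarations are an opt-in aid for the programmer.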
Internet Applications (Applets)

Applets
• A common misconception is that "applet" always means a Java applet, which isn't necessarily the case
• Something to note about these applications is that they can use the computing power of the server or of the user

Java Applets
• Even though we have stated that applets do not necessarily have to be Java, this is the predominant form
• One crucial feature of Java and the Java Virtual Machine is that it is machine-independent
• To do this, applets need to be able to access some information about the user's machine, but Java applets are not allowed to read or write files on the user's machine

Interpreted/Compiled
• Originally, Java applets for the Java Virtual Machine were intended to be interpreted
• This has its weaknesses, so there are two other methods that the JVM can use

More on the JVM
• One thing that the JVM can do is look at which part of the program is being used most heavily and convert that into machine code
• The other option is the Just-in-Time (JIT) compiler
• This takes the Java bytecode and converts it to instructions that can be directly performed by the processor (also using machine code)

More on the JVM
• Users of ActionScript like to point out that whatever they create will look the same on any machine (though you do need to download the Flash viewer), whereas JavaScript can be interpreted differently in different (especially the less mainstream) browsers
• Along the same lines, Java applets (as previously stated) were designed to be machine independent and could potentially even work on someone's cell phone

PERL & CGI
Practical Extraction and Report Language
(Pathologically Eclectic Rubbish Lister)

Agenda
• Brief History
• Language Overview
• Benefits of using Perl (Efficiency, Portability)
• Common Gateway Interface (CGI)

Brief History
• Developed by Larry Wall
• Perl 1.000 released in 1987
• Perl 3.000 released in 1989 under the GNU General Public License
• Current stable
release: Perl 5.8.1
• Today, Perl comes standard on most operating systems (e.g. Solaris, Red Hat, Mac OS X)
• Third-party companies offer pre-built Perl distributions for other OSs (e.g. Windows)

What is Perl?
• Interpreted language
• Wall originally intended Perl for:
  – Scanning arbitrary text files
  – Extracting information
  – Printing reports from extracted info
  – System management tasks
• Expression syntax borrows heavily from other languages (C, sed, awk, sh)

Wall on Perl
• "Perl is a language for getting your job done."
• "Perl is designed to make easy jobs easy, without making hard jobs impossible."

  $str = "Hello, world!";

  sub PrintHello {
      while (1) {
          if ($str =~ m/o..w/) {    # "o", any two characters, then "w"
              print "$str\n";
              $str = "Goodbye!";    # change the string so the loop terminates
          } else {
              print "NO MATCH\n";
              last;                 # Perl exits a loop with "last", not "break"
          }
      }
  }

  &PrintHello;

Language Overview - Types
• 3 primitive types: $scalars, @arrays, and %hashes (associative arrays)
• A $scalar variable may be a number or a string
  – Conversion between the two is automatic, depending on the operator used (e.g. == or eq)
• @arrays and %hashes are dynamically sized
• A %hash consists of a set of key-data pairs

Type Examples
  $helloStr = "Hello, world!";
  $someNum = 415;
  @myArray = ($helloStr, $someNum, 1, 2.2, "3");
  $myHash{"key"} = $data;    # a single hash element is accessed with $

Control Statements
• Conditional branching:
  – if (condition) { statement(s); }
  – elsif (condition) { statement(s); }
  – else { statement(s); }
  – unless (condition) { statement(s); }

Control Statements - cont'd
• Iteration:
  – while (condition) { statement(s); }
  – until (condition) { statement(s); }
  – for (initialize; condition; re-initialize) { }
  – foreach $scalar (@list) { }

Functions & Scopes
• Perl allows for recursion with proper scoping declarations
• Defining a subroutine:
    sub subname { statement(s); }
• Calling a subroutine:
    &subname;
    subname();
• Scoping keywords: my, our, and local

Regular Expressions
• One of Perl's primary purposes was to process text
• Perl has built-in pattern matching
  – useful for searching and manipulating
strings
• Perl's regular expressions are derived from established regular expressions found in popular tools like sed, awk, grep, and vi
• Regex operators:
    =~   matches
    !~   doesn't match
    m/PATTERN/
    s/PATTERN/REPLACEMENT/
    tr/SEARCHLIST/REPLACEMENTLIST/

Perl Benefits - Portability
• Perl programs are not platform dependent
  – Comes standard on many different operating systems
  – Avoids C's #ifdef directives
  – Does not guarantee that a program avoids platform-dependent tools
• Regexes are widely used (grep, sed, awk, ed, vi, emacs, etc.)
• Perl is a semantic superset of the above
  – Any regex that can be described in the above can also be described in Perl (but maybe using different characters)

Perl Benefits - Efficiency
• Use of hashes instead of linear searches through an array
• Use of foreach instead of for loops
  – No need for an extra operation on a counter
• Regexes for fast pattern matching when searching large amounts of data
• Use of the AutoLoader module
  – Defines functions the first time they're called

Common Gateway Interface (CGI)
• CGI script: a program on the web server which runs on demand to generate the content of a web page
• You can find CGI scripts everywhere on-line:
  – Used to access and manipulate databases
  – On-line shopping
• Popular CGI scripting languages:
  – Perl, PHP, ASP, Python
  – But really, anything that is executable on the server

What happens when a script runs?

CGI with Perl
• Why use Perl to create CGI scripts?
  – Free software
  – Fast and powerful string manipulation (imagine doing the same thing with C/C++)
  – Built-in types and functions that are useful for handling records and information (arrays and hashes)
  – Comes standard on almost all web servers
  – Standard CGI module

CGI Example - Server-side
  #!/usr/bin/perl
  use CGI ':standard';
  print header,
        start_html('Hello World'),
        b('And hello to you too'),
        end_html;

CGI Example - Client-side
  Content-type: text/html

  <HTML><HEAD><TITLE>Hello World</TITLE></HEAD><BODY>
  <B>And hello to you too</B>
  </BODY></HTML>

Perl/CGI forms

Example 1: Standard CGI script
• Purpose was to allow computations of the form n^e mod m
  – In particular, where the numbers involved were 500-digit integers
• Used in RSA encryption
  – Provided to my CS 202 class
• The script creates the HTML page
  – I've provided the HTML page separately to show what the HTML code looks like
• Note that the script is synchronous
  – It requires that the page be re-loaded with the new data

Example 2: AJAX script
• AJAX stands for Asynchronous JavaScript And XML
  – Does not require that the page be re-loaded to display new data
• Google Maps is an example of an AJAX application
• Two files involved
  – ajax.html: has the HTML and JavaScript necessary to load the content without reloading the page
  – ajax: the Perl script itself; it's pretty simple

URLs
• Example 1
  – Script can be run at http://www.cs.virginia.edu/cgi-bin/cgiwrap/cs415/modpow
  – Script can be viewed at http://www.cs.virginia.edu/~cs415/cgi-bin/modpow
• Example 2
  – ajax.html: http://www.cs.virginia.edu/~cs415/code/ajax.html
  – ajax script: http://www.cs.virginia.edu/~cs415/cgi-bin/ajax
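The computation behind Example 1, n^e mod m on very large integers, can be sketched as follows. This is not the course's Perl script (which is not reproduced in these slides); it is a minimal Python illustration of one standard technique, square-and-multiply, and the function name `mod_pow` is hypothetical. The idea is to reduce modulo m after every multiplication, so intermediate values never grow beyond roughly m squared even when n, e, and m have hundreds of digits.

```python
def mod_pow(n: int, e: int, m: int) -> int:
    """Return n**e mod m using binary (square-and-multiply) exponentiation.

    Illustrative sketch only; assumes e >= 0 and m >= 1.
    """
    if m < 1:
        raise ValueError("modulus must be at least 1")
    if e < 0:
        raise ValueError("exponent must be non-negative")

    result = 1 % m          # handles m == 1, where every result is 0
    base = n % m
    while e > 0:
        if e & 1:           # lowest bit of e set: multiply this power in
            result = (result * base) % m
        base = (base * base) % m   # square for the next bit of e
        e >>= 1
    return result


if __name__ == "__main__":
    # Small check, plus a large case compared with Python's built-in pow.
    print(mod_pow(4, 13, 497))
    big_n, big_e, big_m = 10**499 + 7, 10**499 + 9, 10**499 + 111
    print(mod_pow(big_n, big_e, big_m) == pow(big_n, big_e, big_m))
```

The loop runs once per bit of e, so a 500-digit exponent needs on the order of 1,700 iterations rather than an astronomically large number of multiplications. In practice Python's built-in three-argument `pow(n, e, m)` does the same job.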