Building a Web Browser
CS1316: Representing Structure and Behavior

Story
• How we access the Web from Java
  • URL is an object
  • Open a connection, then a stream.
  • Basically, treat it like a file.
• Creating a Web browser in Java
  • JEditorPane understands HTML (and Text and RTF)
  • Have to deal with hyperlinks as an event

URL is an object
URL objects represent (surprise!) URLs (Uniform Resource Locators).
They can be queried to get all kinds of information about the URL, including a connection to the object at the URL.

Getting the Content from a URL
To get the content from a URL:
• You first create a connection, which allows you to access the network.
• You then create the stream access for that URL, the same (hard) way we have before.

  URL url = new URL("http://www.cnn.com");
  URLConnection con = url.openConnection();
  BufferedReader stream =
    new BufferedReader(new InputStreamReader(con.getInputStream()));

But there are exceptions
Accesses to the network can (of course!) lead to network errors, so we have to deal with that possibility.
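One such error can show up before any network traffic at all: constructing a URL from a badly formed string throws a checked MalformedURLException. The sketch below illustrates that, together with the earlier point that a URL object can be queried for its parts. The class name UrlQueryDemo and the helper describe() are made up for this illustration, and the example address is only parsed, never fetched. Note that recent JDKs mark the URL(String) constructor as deprecated; it is used here only to match the slides.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlQueryDemo {

    /**
     * Hypothetical helper: returns "protocol|host|path" for a
     * well-formed URL string, or null if the string cannot be parsed.
     * No network connection is opened here.
     */
    static String describe(String s) {
        try {
            URL url = new URL(s);  // may throw MalformedURLException
            return url.getProtocol() + "|" + url.getHost() + "|" + url.getPath();
        } catch (MalformedURLException e) {
            // Checked exception: the compiler insists we handle it
            return null;
        }
    }

    public static void main(String[] args) {
        // prints http|www.cnn.com|/index.html
        System.out.println(describe("http://www.cnn.com/index.html"));
        // prints null (the string has no protocol)
        System.out.println(describe("not a url"));
    }
}
```

Opening the connection and the stream can fail too, which is why the constructor that follows wraps all three steps in a single try block.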
  /**
   * Open with a URL
   **/
  public WebPageReader(String s){
    // Create the URL and the connection to it
    try {
      url = new URL(s);
      con = url.openConnection();
      stream = new BufferedReader(new InputStreamReader(con.getInputStream()));
    }
    catch (Exception e) {
      System.out.println("An error opening the URL occurred.");
      System.out.println(e.getMessage());}
  }

Using the WebPageReader
Each time we call reader.nextLine(), we get the next line from the object.

> WebPageReader reader = new WebPageReader("http://www.yahoo.com")
> reader.getType()
"text/html"
> reader.readyToRead()
true
> reader.nextLine()
"<html><head>"
> reader.nextLine()
"<script language=javascript>"
> reader.nextLine()
"var now=new Date,t1=0,t2=0,t3=0,t4=0,t5=0,t6=0,hp=0,cc='',ylp='';t1=now.getTime();"
> reader.nextLine()
"</script>"
> reader.nextLine()
"<title>Yahoo!</title>"
> reader.nextLine()
"<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l r (cz 1 lz 1 nz 1 oz 1 vz 1) gen true for "http://www.yahoo.com" r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l r (n 0 s 0 v 0 l 0) gen true for "http://www.yahoo.com" r (n 0 s 0 v 0 l 0))'>"

Creating the WebPageReader

/**
 * WebPageReader class
 * Given a URL, can return information about that page.
 **/
import java.net.*;
import java.io.*;
import java.util.*;

public class WebPageReader {
  //// Fields
  private URL url;
  private URLConnection con;
  private BufferedReader stream;

Constructor

  /**
   * Open with a URL
   **/
  public WebPageReader(String s){
    // Create the URL and the connection to it
    try {
      url = new URL(s);
      con = url.openConnection();
      stream = new BufferedReader(new InputStreamReader(con.getInputStream()));
    }
    catch (Exception e) {
      System.out.println("An error opening the URL occurred.");
      System.out.println(e.getMessage());}
  }

Checking if the connection is working

  /**
   * A WebPageReader is ready to read if the stream is ready
   **/
  public boolean readyToRead(){
    try {return stream.ready();}
    catch (Exception e)
    {System.out.println("I/O error occurred.");
      System.out.println(e.getMessage());
      return false;}
  }

What's out there?

  /**
   * The type of the material at the other end of the URL is
   * the contentType from the URLConnection
   **/
  public String getType(){return con.getContentType();}
  // "text/html" is the MIME type for normal Web pages

Reading from that content

  /**
   * Next line is the next line from the material at the
   * other end of the URL. We read it like a file.
   * There is more material there as long as readyToRead() returns
   * true. We may also read a null when it's done.
   **/
  public String nextLine(){
    try {return stream.readLine();}
    catch (Exception e)
    {System.out.println("I/O error occurred.");
      System.out.println(e.getMessage());
      return null;}
  }

Building a Web Browser
Building a web browser in Java is very easy.
• Swing component JEditorPane understands HTML.
  • And plain text
  • And RTF (Rich Text Format, a format that Word and other word processors can generate)
• Does not understand JavaScript, CSS, etc.
  • Just plain HTML

> SimpleBrowser sb = new SimpleBrowser()

SimpleBrowser

/**
 * A Simple Web Browser
 * Uses a JEditorPane() which knows how to interpret HTML (and RTF and Text)
 **/
// Lots of imports!

We need all of these
for Swing, networking, I/O (Input/Output exceptions), and HTML processing.

import java.awt.*;
import java.awt.event.*;
import java.net.*;
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.html.HTMLFrameHyperlinkEvent;
import javax.swing.text.html.HTMLDocument;

SimpleBrowser

public class SimpleBrowser extends JFrame {

  /// Fields
  /** A field for the URL to be entered **/
  private JTextField urlField;
  private JEditorPane webpane;

Describing our UI: Assembled in Constructor
Top pane deals with URL specification:
• Label for entering URL
• Field for entering URL
Bottom part is the JEditorPane

JEditorPanes are very flexible
(From JDK JavaDoc)

Constructor: Building the UI

  /***
   * Most of the action is in the constructor.
   **/
  public SimpleBrowser(){
    super("Simple Browser");

    // Make a panel with a label and the URL field
    JPanel panel1=new JPanel();
    this.getContentPane().add(panel1,BorderLayout.NORTH);
    JLabel label1= new JLabel("URL:");
    panel1.add(label1,BorderLayout.EAST);
    urlField = new JTextField("http://www.cnn.com");

How we load URLs (upon enter key)
• event.getActionCommand() returns the string from the field, the one with the URL in it.
• JEditorPanes can read directly from a URL! Simply setPage(String url).

    urlField.addActionListener(
      new ActionListener() {
        public void actionPerformed(ActionEvent e)
        {
          String urlString = e.getActionCommand();
          try {
            webpane.setPage(urlString);
            urlField.setText(urlString);
          } catch (Exception e2) {
            System.out.println("I/O Error -- maybe bad URL?");
            System.out.println(e2.getMessage());}
        }
      });
    panel1.add(urlField,BorderLayout.CENTER);

Setting up the JEditorPane

    // Second part of the browser is the viewable pane
    webpane = new JEditorPane();
    webpane.setEditable(false);

Dealing with HyperLink
This is copy-pasted from JDK documentation.
Key observation: Dealing with a new kind of event and listener!

    // Make hyperlinks work (from 1.4JDK docs)
    webpane.addHyperlinkListener(
      new HyperlinkListener() {
        public void hyperlinkUpdate(HyperlinkEvent e) {
          if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED) {
            JEditorPane pane = (JEditorPane) e.getSource();
            if (e instanceof HTMLFrameHyperlinkEvent) {
              HTMLFrameHyperlinkEvent evt = (HTMLFrameHyperlinkEvent)e;
              HTMLDocument doc = (HTMLDocument)pane.getDocument();
              doc.processHTMLFrameHyperlinkEvent(evt);
            } else {
              try {
                pane.setPage(e.getURL());
              } catch (Throwable t) {
                t.printStackTrace();
              }
            }
          }}
      });

JEditorPane gets a scrollpane
JScrollPanes contain something that is scrolled; here, a JEditorPane.
We put the JScrollPane in the CENTER region so that it receives all the space the BorderLayout has left over after the top panel is placed.

    this.getContentPane().add(new JScrollPane(webpane), BorderLayout.CENTER);
    this.pack();
    this.setVisible(true);
  }

How we load pages

  public void loadPage(String urlString){
    try {
      webpane.setPage(urlString);
      urlField.setText(urlString);
    } catch (Exception e) {
      System.out.println("I/O Error -- maybe bad URL?");
      System.out.println(e.getMessage());}
  }

Dealing with exceptions in all these cases is required: setPage() can throw a checked IOException, and the compiler flags any call that neither catches nor declares it as an error.
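To see that requirement outside of a GUI, here is a minimal sketch of the same catch-and-report pattern applied to reading from a URL. It is not part of SimpleBrowser: the class CheckedLoadDemo and the method firstLine() are hypothetical names, the try-with-resources form is a newer idiom than the slides use, and a temporary file: URL stands in for a web page so that the example runs without a network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CheckedLoadDemo {

    /**
     * Hypothetical helper: returns the first line found at the URL,
     * or null if the URL is malformed or cannot be read.
     * Removing the catch block makes this method fail to compile,
     * because new URL(...), openStream() and readLine() all throw
     * checked exceptions.
     */
    static String firstLine(String urlString) {
        try {
            URL url = new URL(urlString);
            try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(url.openStream()))) {
                return in.readLine();
            }
        } catch (IOException e) {
            // Same reporting style as loadPage() in the slides
            System.out.println("I/O Error -- maybe bad URL?");
            System.out.println(e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // A local file plays the role of the web page (an assumption
        // made so the sketch needs no network access)
        Path page = Files.createTempFile("page", ".html");
        Files.write(page, "<html><head>\n".getBytes(StandardCharsets.UTF_8));

        // prints <html><head>
        System.out.println(firstLine(page.toUri().toURL().toString()));
        // reports the error, then prints null
        System.out.println(firstLine("no-such-scheme://example"));

        Files.delete(page);
    }
}
```

MalformedURLException is a subclass of IOException, so the single catch clause covers both the bad-URL case and read failures.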