Building a Web Browser CS1316: Representing Structure and Behavior

advertisement
Building a Web Browser
CS1316: Representing
Structure and Behavior
Story

How we access the Web from Java

Creating a Web browser in Java
• URL is an object
• Open a connection, then a stream.
• Basically, treat it like a file.
• JEditorPane understands HTML (and Text
•
and RTF)
Have to deal with hyperlinks as an event
URL is an object


URL objects
represent (surprise!)
URLs (Uniform
Resource Locators).
They can be queried
to get all kinds of
information about
the URL, including a
connection to the
object at the URL.
Getting the Content from a URL

To get the content from a URL:
• You first create a connection which allows you
•
to access the network.
You then create the stream access for that
URL—the same (hard) way we have before.
URL url = new URL(“http://www.cnn.com”);
URLConnection con = url.openConnection();
BufferedReader stream = new BufferedReader(new
InputStreamReader(con.getInputStream()));
But there are exceptions

Accesses to
the network
can (of
course!) lead
to network
errors, so we
have to deal
with that
possibility.
/**
* Open with a URL
**/
public WebPageReader(String s){
// Create the URL and the connection to it
try {
url = new URL(s);
con = url.openConnection();
stream = new BufferedReader(new
InputStreamReader(con.getInputStream()));
}
catch (Exception e) {
System.out.println("An error opening the URL
occurred.");
System.out.println(e.getMessage());}
}
Using the
WebPageReader
Each time we call
reader.nextLine(), we get
the next line from the object
> WebPageReader reader = new
WebPageReader("http://www.yahoo.co
m")
> reader.getType()
"text/html"
> reader.readyToRead()
true
> reader.nextLine()
"<html><head>"
> reader.nextLine()
"<script language=javascript>"
> reader.nextLine()
"var now=new
Date,t1=0,t2=0,t3=0,t4=0,t5=0,t6=0,hp
=0,cc='',ylp='';t1=now.getTime();"
> reader.nextLine()
"</script>"
> reader.nextLine()
"<title>Yahoo!</title>"
> reader.nextLine()
"<meta http-equiv="PICS-Label"
content='(PICS-1.1
"http://www.icra.org/ratingsv02.html" l r
(cz 1 lz 1 nz 1 oz 1 vz 1) gen true for
"http://www.yahoo.com" r (cz 1 lz 1 nz
1 oz 1 vz 1)
"http://www.rsac.org/ratingsv01.html" l
r (n 0 s 0 v 0 l 0) gen true for
"http://www.yahoo.com" r (n 0 s 0 v 0 l
0))'>"
Creating the WebPageReader
/**
* WebPageReader class
* Given a URL, can return information about that page.
**/
import java.net.*;
import java.io.*;
import java.util.*;
public class WebPageReader {
//// Fields
private URL url;
private URLConnection con;
private BufferedReader stream;
Constructor
/**
* Open with a URL
**/
public WebPageReader(String s){
// Create the URL and the connection to it
try {
url = new URL(s);
con = url.openConnection();
stream = new BufferedReader(new InputStreamReader(con.getInputStream()));
}
catch (Exception e) {
System.out.println("An error opening the URL occurred.");
System.out.println(e.getMessage());}
}
Checking if the connection is
working
/**
* A WebPageReader is ready to read if the stream is ready
**/
public boolean readyToRead(){
try {return stream.ready();}
catch (Exception e) {System.out.println("I/O error
occurred.");
System.out.println(e.getMessage());
return false;}
}
What’s out there?
/**
* The type of the material at the other end of the URL is
* the contentType from the URLConnection
**/
public String getType(){return con.getContentType();}
// “text/html” is the MIME type for normal Web pages
Reading from that content
/**
* Next line is the next line from the material at the
* other end of the URL. We read it like a file.
* There is more material there as long as readyToRead() returns
* true. We may also read a null when it's done.
**/
public String nextLine(){
try {return stream.readLine();}
catch (Exception e) {System.out.println("I/O error occurred.");
System.out.println(e.getMessage());
return null;}
}
Building a Web Browser


Building a web browser in Java is very easy.
Swing component JEditorPane understands
HTML.
•
•

And plain text
And RTF (Rich Text Format—format that Word and
other word processors can generate)
Does not understand JavaScript, CSS, etc.
•
Just plain HTML
> SimpleBrowser sb = new
SimpleBrowser()
SimpleBrowser
/**
* A Simple Web Browser
* Uses a JEditorPane() which knows how to interpret HTML (and RTF and Text)
**/
We need all of these
// Lots of imports!
for Swing, networking,
import java.awt.*;
I/O (Input/Output
import java.awt.event.*;
exceptions), and
import java.net.*;
HTML processing.
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;
import javax.swing.text.html.HTMLFrameHyperlinkEvent;
import javax.swing.text.html.HTMLDocument;
SimpleBrowser
public class SimpleBrowser extends
JFrame {
/// Fields
/** A field for the URL to be entered **/
private JTextField urlField;
private JEditorPane webpane;
Describing our UI:
Assembled in Constructor

Top pane deals with URL specification:

Bottom part is the JEditorPane
• Label for entering URL
• Field for entering URL
JEditorPane’s are very flexible
From JDK JavaDoc
Constructor: Building the UI
/***
* Most of the action is in the constructor.
**/
public SimpleBrowser(){
super("Simple Browser");
// Make a panel with a label and the URL field
JPanel panel1=new JPanel();
this.getContentPane().add(panel1,BorderLayout.NORTH);
JLabel label1= new JLabel("URL:");
panel1.add(label1,BorderLayout.EAST);
urlField = new JTextField("http://www.cnn.com");
How we load URLs
(upon enter key)
urlField.addActionListener(
new ActionListener() {
event.getActionCommand()
public void actionPerformed(ActionEvent e) returns the string from the
{
field—the one with the URL
String urlString = e.getActionCommand(); in it.
try {
JEditorPanes can read
webpane.setPage(urlString);
directly from URL! Simply
urlField.setText(urlString);
setPage(String url).
} catch (Exception e2) {
System.out.println("I/O Error -- maybe bad URL?");
System.out.println(e2.getMessage());}
}
});
panel1.add(urlField,BorderLayout.CENTER);
Setting up the JEditorPane
// Second part of the browser is the
viewable pane
webpane = new JEditorPane();
webpane.setEditable(false);
Dealing with HyperLink
// Make hyperlinks work (from 1.4JDK docs)
webpane.addHyperlinkListener(
new HyperlinkListener() {
public void hyperlinkUpdate(HyperlinkEvent e) {
if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED) {
JEditorPane pane = (JEditorPane) e.getSource();
if (e instanceof HTMLFrameHyperlinkEvent) {
HTMLFrameHyperlinkEvent evt = (HTMLFrameHyperlinkEvent)e;
HTMLDocument doc = (HTMLDocument)pane.getDocument();
doc.processHTMLFrameHyperlinkEvent(evt);
} else {
try {
pane.setPage(e.getURL());
This is copy-pasted from JDK
} catch (Throwable t) {
documentation. Key observation:
t.printStackTrace();
Dealing with a new kind of event
}
and listener!
}
}}
});
JEditorPane gets a scrollpane
this.getContentPane().add(new JScrollPane(webpane),
BorderLayout.CENTER);
this.pack();
this.setVisible(true);
JScrollPanes contain
}
something that is scrolled—
here, a JEditorPane.
We put the JScrollPane in
the Center so that it gets
emphasized in the
BorderLayout renderer.
How we load pages
public void loadPage(String urlString){
try {
webpane.setPage(urlString);
urlField.setText(urlString);
} catch (Exception e) {
System.out.println("I/O Error -- maybe bad URL?");
System.out.println(e.getMessage());}
}
Dealing with exceptions in all these cases is
required. The compiler flags these as errors.
Download