How the Web Works - Internet and Web Protocols.

What is the Web?








The World Wide Web is a system of Internet servers through which
several Internet protocols can be accessed using a single interface.
Most protocols on the internet are available through the web.
The web creates a user-friendly environment through which you can
access many services.
The Web can work with multimedia and advanced programming
languages.
The main protocol used by the web to transfer information is HyperText
Transfer Protocol (HTTP).
Hypertext documents contain links that connect to other documents or
files. The user can activate these links or 'hot spots' (through a mouse
button click, for example) and the target document will then be
transferred on to the client machine.
These 'hot spots' are created using the HyperText Markup Language
(HTML), which can turn text, pictures, and other elements into hyperlinks.
These links are the reason it is called the Web: an interconnected web
of hypertext documents.
Internet and Web – Client/Server Architecture

The Internet depends on the client-server architecture. Your computer
runs software called the client and it interacts with another piece of
software known as the server located on a remote computer.

The client is usually a browser. Browsers interact with the server using
sets of rules called protocols. These protocols ensure the accurate
transfer of data through requests from the browser and responses from
the server.

An example of client-server interaction is as follows: the client (browser)
requests an HTML file stored on the remote machine through the server
software. The server locates this file and passes it to the client. The client
then displays this file on your machine. In this case, the HTML page is
static. Static pages do not change until the developer modifies them.

There are many protocols available on the Internet. The World Wide Web,
which is a part of the Internet, brings all these protocols under one roof.
You can thus use HTTP, FTP, Telnet, email, and other services through your browser.
Example:
Suppose you have requested an HTML document from a remote computer
using a web browser. The browser searches for the remote computer and
when it locates it, your computer passes the request to a program called the
server running on this distant computer. The server then examines your
request and attempts to locate the HTML file on its hard disk. When the file is
located, the server sends this file to your computer. If this HTML document
has embedded image, video, and/or sound files, the information and the
content of such files are also passed to the browser.
When the data is received from the server, the client (a browser) displays
the HTML page. The client is completely responsible for the
display of the web page, with no involvement from the server's side. Once the
server sends the data to the requesting computer, it is finished with the
interaction. When all requested data has been received, the client-server
connection is closed. Thus, the next time this client asks for some information
from the server, the server treats it as a new request without any recollection of
previous requests. This means that client-server interaction is "stateless", with
every new request generating a new response.
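The request/response cycle described above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the page content, the handler name StaticPageHandler, and the helper fetch_twice are hypothetical, and the server runs on the local machine rather than on a remote computer. Each request opens a fresh connection, and the server answers it without remembering the previous one.

```python
import http.client
import http.server
import threading

# Hypothetical static page, standing in for an HTML file on the server's disk.
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class StaticPageHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Every request is answered from scratch; nothing is remembered
        # about earlier requests from the same client.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_twice():
    # Port 0 asks the operating system for any free port.
    server = http.server.HTTPServer(("127.0.0.1", 0), StaticPageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    bodies = []
    try:
        for _ in range(2):
            # A new connection per request: the server treats each one as new.
            connection = http.client.HTTPConnection("127.0.0.1", port)
            connection.request("GET", "/index.html")
            bodies.append(connection.getresponse().read())
            connection.close()
    finally:
        server.shutdown()
        server.server_close()
    return bodies
```

Calling fetch_twice() returns two identical copies of the page, one per independent request.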
Transferring Data on the Web

In order to view sites on the web, some type of data transfer must occur:
the web page you wish to view has to be transferred to your computer
before you can view it.

Usually this type of transfer starts with some sort of event.

These events can come from different sources. For example, when you
launch your web browser and type a web address, or click on a hyperlink, you
are generating an event that will transfer data from a server to your computer.

Other events can be generated from the instructions in a program. For
example, the various programs that help with the uploading and
downloading of data generate such events.
What is a Protocol?
Definition:
In simple terms, protocols are “Agreed-upon methods of communications
used by computers.”
OR
“When data is being transmitted between two or more devices something
needs to govern the controls that keep this data intact. A formal description
of message formats and the rules two computers must follow to exchange
those messages. Protocols can describe low-level details of machine-to-machine
interfaces (e.g., the order in which bits and bytes are sent across
wire) or high-level exchanges between application programs (e.g., the way in
which two programs transfer a file across the Internet).”
(Source: http://www.ichnet.org/glossary.htm)
Internet Protocols
Internet protocols (sets of rules) are used to transfer files or data from
one machine to another. All computers on the Internet communicate with
each other using the Transmission Control Protocol / Internet Protocol
(TCP/IP). Thus, data is sent from the server to the client (and vice versa)
using TCP/IP.
In most cases, the client is your browser and the server is a program running
on a different computer. You use the browser on your computer (called the client
machine in Internet lingo) to access the information on another computer
(called the server machine). This server machine can be located anywhere in
the world.
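As a rough sketch of data travelling between a client and a server over TCP/IP, the following uses Python's socket module. The function name tcp_round_trip and the reply text are assumptions made for illustration, and both ends run on the same machine here, whereas a real server could be anywhere in the world.

```python
import socket
import threading


def tcp_round_trip(message: bytes) -> bytes:
    # Server side: listen on a free local port (port 0 lets the OS choose).
    server = socket.create_server(("127.0.0.1", 0))
    port = server.getsockname()[1]

    def serve_once():
        connection, _address = server.accept()
        with connection:
            data = connection.recv(1024)
            connection.sendall(b"server received: " + data)

    worker = threading.Thread(target=serve_once)
    worker.start()

    # Client side: connect, send the message, then read until the server
    # closes its end of the connection.
    chunks = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)

    worker.join()
    server.close()
    return b"".join(chunks)
```

Higher-level protocols such as HTTP and FTP define what the bytes mean; TCP/IP only delivers them reliably between the two machines.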
Other protocols used on the Internet include:
• File Transfer Protocol (FTP): used in FTP applications, primarily
to upload and download files.
• HyperText Transfer Protocol (HTTP): employed on the World Wide
Web.
• Telnet: allows you to connect to another machine. Once
connected, your computer behaves like a terminal of the distant machine,
and you can use all the resources on it if you have the required permissions.
• Simple Mail Transfer Protocol (SMTP): used for email.
Internet Programming Languages


Although HTML is the most widespread language used on the web, many
programming languages are used in combination with HTML.
As the web has increased in size, so has the demand for more complex
programs and wider ranges of applications. Because of this, a number of
tools and languages are becoming standards on the web. These include:
1. CGI (Common Gateway Interface)
This allows the web server software to communicate with other programs
running on the server. These external programs are called CGI scripts or
CGI programs and are usually written in Perl or C. CGI programs are
generally used to process information submitted by a visitor via a form
on a web page.
2. JavaScript/JScript/VBScript
• JavaScript is a programming language that runs in the browser.
NOTE: JavaScript is not a subset of Java. In fact, the two languages
have little in common (they share some basic concepts, but the
syntax is different).
• JavaScript runs in the browser (client) and does not require any server
software. Thus, it is a client-side scripting language. Since all execution
takes place in the browser, JavaScript is responsible for most of the
interactivity on a web page. Form validation, image or text color
changes on mouseover, and mouse trails are all possible
through JavaScript.
3. Java
• Developed by Sun Microsystems, Java is a powerful, object-oriented
language. Many platform dependency issues have been ironed
out with the advent of Java.
• On the Internet, Java is most commonly seen in the form of
applets embedded in an HTML page. Applets are small Java programs
that run in a Java-compatible browser.
4. ASP
Active Server Pages (ASP) is a technology promoted by Microsoft. ASP
uses special tags, which can be embedded in the HTML code, to
generate dynamic web pages. ASP scripts run on the server, typically
IIS on Windows NT. ASP pages carry the .asp extension, which differentiates
them from plain HTML pages and instructs the web server to pass the
pages through the ASP interpreter.
5. PHP
PHP is a server-side scripting language similar to ASP. PHP code is
embedded inside the HTML page and can connect to databases to generate
dynamic HTML content.
6. XML
The eXtensible Markup Language is a markup language that
enables programmers to create customized tags. These customized tags
can provide functionality not available with HTML. XML
documents can be accessed using JSP, PHP etc.
URLs - What is a URL?






URL stands for Uniform Resource Locator, which means it is a uniform
(same everywhere) way to locate a file (resource) on the Internet.
The URL specifies where a file is located, and how to get it.
Every file on the Internet has a unique address. Web software, such as
your browser, uses the URL to retrieve a file from the computer on which it
resides.
The computer itself is identified by a numeric (IP) address, which is a set
of four numbers separated by periods (139.179.40.4).
These octets are difficult for web users to remember, so many (not all)
numeric addresses can be represented by an alphanumeric (text and numbers)
name. For example, www.ctp.bilkent.edu.tr
The Internet Domain Name System (DNS) translates the alphanumeric
name to its numeric value.
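The translation from a name to a numeric address can be tried from Python. This is a minimal sketch: resolve and is_dotted_quad are hypothetical helper names, and the address returned for any given name depends on your resolver and on the network at the time of the call.

```python
import socket


def resolve(host_name: str) -> str:
    # Ask the system resolver (which consults DNS when needed) for the
    # numeric IPv4 address that corresponds to a host name.
    return socket.gethostbyname(host_name)


def is_dotted_quad(address: str) -> bool:
    # True if the text looks like four numbers (0-255) separated by periods.
    parts = address.split(".")
    return len(parts) == 4 and all(
        part.isdigit() and 0 <= int(part) <= 255 for part in parts
    )
```

For example, is_dotted_quad("139.179.40.4") is true, while is_dotted_quad("www.ctp.bilkent.edu.tr") is false; resolve() converts a name of the second kind into an address of the first kind.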
URL Format:
Protocol://site address/path/filename
Example:
http://www.ctp.bilkent.edu.tr/~russell/outline106.htm
The structure of this URL is:
• Protocol: http
• Host computer name: www
• Domain name: ctp.bilkent
• Domain type: edu
• Country code: tr
• Path: /~russell
• File name: outline106.htm
As mentioned earlier, the protocol does not have to be http; it can also
be ftp, file, mailto, https, telnet etc.
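The URL format above can be taken apart programmatically. The sketch below uses urlsplit from Python's standard library; the function name split_url and the dictionary keys are chosen here only to mirror the terms used in this section.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    parts = urlsplit(url)
    # Separate the directory part of the path from the file name.
    directory, file_name = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "site_address": parts.hostname,
        "path": directory,
        "file_name": file_name,
    }
```

Applied to http://www.ctp.bilkent.edu.tr/~russell/outline106.htm, this gives the protocol http, the site address www.ctp.bilkent.edu.tr, the path /~russell and the file name outline106.htm.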
Site address:
• The site address consists of the host computer name, the domain name
and the domain type.
• The domain name should be descriptive for easy comprehension and is
usually the name of the organization or company.

There are various domain types. Some of them are listed below:
com: specifies commercial entities
net: highlights networks or network providers
org: organizations (usually non-profit)
edu: colleges and universities (education providers)
gov: government agencies
mil: military entities of the United States of America


For countries other than the U.S.A., the site address can be longer.
The general format of such addresses is:
machine name.domain name.domain type.country code
This represents a more localized domain name. The country code is a two-letter
extension standardized by the International Organization for Standardization
as ISO 3166. Some country codes are given below:

tr: Turkey
de: Germany
ca: Canada
jp: Japan
uk: United Kingdom

Domain types can also be different for different countries. For example, an
educational site can have the domain name www.school.ac.uk in the
United Kingdom. Thus ac (academic) is used instead of edu. Similarly
com is represented as co for Indian domain names.
Path name:
The path name specifies the hierarchical location of the file on the computer. For
instance, in http://www.ctp.bilkent.edu.tr/~russell/outline106.htm the file
outline106.htm is located in the ~russell subdirectory under the server's root
directory.
Port:
Browsers communicate with the server using entry points called ports. A port
is the name given to an endpoint of a logical connection, and each port is
identified by a port number. Associated with each protocol is a default port
number; for example, HTTP defaults to port 80.
The server administrator can configure the server to handle HTTP requests at a
different port. In such cases, the port number has to be supplied as a part of
the URL. The port number is placed after the site address, separated by a colon:
www.some-address.com:50
In this example, requests are directed to port 50. If the port number is
omitted, HTTP requests are directed to port 80.
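The rule "use the port in the URL if one is given, otherwise the protocol's default" can be sketched as follows. The DEFAULT_PORTS table and the function name effective_port are illustrative assumptions; the table lists only the default port numbers given in this section.

```python
from urllib.parse import urlsplit

# Default ports for a few protocols, as listed in this section.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def effective_port(url: str) -> int:
    parts = urlsplit(url)
    if parts.port is not None:
        # An explicit port after the colon overrides the default.
        return parts.port
    return DEFAULT_PORTS[parts.scheme]
```

With these definitions, effective_port("http://www.some-address.com:50/") gives 50, while effective_port("http://www.some-address.com/") falls back to 80.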
Common port numbers for other protocols include:
Port   Protocol
21     FTP
23     Telnet
25     Simple Mail Transfer Protocol (SMTP)
53     Domain Name Server (DNS)
80     Hyper Text Transfer Protocol (HTTP)
107    Remote Telnet Service (rtelnet)
110    Post Office Protocol – Version 3 (POP3)
HTTP Protocol - What is HTTP?
Computers on the World Wide Web use the HyperText Transfer Protocol to
talk with each other. HTTP provides a set of rules for accurate
information exchange. The communication between the client (your browser)
and the server (software running on a remote computer) involves requests
sent by the client and responses from the server.
Each client-server message, whether a request or a response, consists of
three main parts:
1. A response or request line
2. Header information
3. The body
A client connects to the server at port 80 (unless it has been changed by the
system administrator) and sends its request. The request line from the
client consists of a request method, the address of the file requested and the
HTTP version number.
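To make the three parts concrete, here is a small sketch that assembles and inspects a request message. The helper names build_request and parse_request_line are hypothetical, and the choice of HTTP/1.1 with a Host header is an assumption made for this example, since the text above does not name a specific version.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    # 1. Request line: method, address of the requested file, HTTP version.
    request_line = f"{method} {path} HTTP/1.1"
    # 2. Header information, one "Name: value" pair per line.
    headers = [f"Host: {host}"]
    if body:
        headers.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the body.
    head = "\r\n".join([request_line] + headers) + "\r\n\r\n"
    # 3. The body (empty for a simple GET request).
    return head.encode("ascii") + body


def parse_request_line(raw: bytes) -> tuple:
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version
```

For instance, build_request("GET", "/~russell/outline106.htm", "www.ctp.bilkent.edu.tr") produces a message whose first line is GET /~russell/outline106.htm HTTP/1.1, followed by the Host header and an empty body.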