How the Web Works - Faculty Personal Homepage

How Does One Describe the Internet?
• It is a bit like describing a city:
– Its location on the map
– The terrain
– Its architecture
– The kinds of people who live in it
– Its politics
– The business climate
– Its history
and so on.
What Really Is the Internet?
• The Internet is a network of networks.
• The Internet is a peer-to-peer network
• These computers communicate with one another in
a consistent fashion
• Users on one computer can access services from
other computers.
• You can access a wide variety of these services
• Each service can give you many kinds of
information.
• In summary, the Internet is: a) a way to move data
from one computer to another; b) a bunch of
protocols.
Three is One
• The Internet, technically, is also described as a Client/Server system.
• Essentially, all that we speak about concerning the Internet falls in one of
three categories: client, server or content.
• To illustrate: the software/hardware that we use to surf the Web, send
mail, or upload files is called the client. The hardware/software that stores
information in a format that Web client software can access, and that
responds to client requests, is called a Web server. Similarly, the
software/hardware that stores mail and forwards it to the client is referred
to as a mail server. Likewise, the system (and the software that runs on it)
which accepts uploaded files and allows authorised users/clients to
download files from it is an FTP server.
• Content, as the name implies, is the actual content of the Web page, the
mail document, or the file being uploaded/downloaded. Content can be
text, image, sound, video, animation and the like.
What really is a Service?
• On the Internet, you can use many methods to communicate with a
computer somewhere else on the Internet. These methods are called
services because they service your requests.
• A few of the most popular Internet services are:
–Email: Electronic mail
–FTP: File Transfer Protocol for transferring computer files
–WWW: World Wide Web
–Gopher: Searchable index, selectable index of documents
–USENET: Newsgroups with different subjects enable people with common
interest to share information
–Telnet: Remote login into computer networks
–Chat: Real-time communications between people on the Internet
Email
• The most popular Internet technology, with 70 million users
• Email has become the de facto standard of communication within
corporations and beyond
• Email works between disparate systems like PC, UNIX, Mac,
Mainframe, VAX, etc.
• The latest Email standards let users attach files (even audio, video,
animation, etc.) and include active URL addresses.
• The volume of data transferred is reaching billions of bytes every day.
Email (contd.)
• Advantages
– Standard way of communication for corporations
– Fewer interruptions during work
– Reply with a number of options
– There is no cost within the environment
– Less chance of miscommunication
– Saving messages for future retrieval and records
• Disadvantages
– You need to have a computer and network connection
– Less personal than voice
How E-Mail Works
• Like other Internet services, e-mail is yet another client-server system. It
uses SMTP (Simple Mail Transfer Protocol).
• As the next figure indicates, you use a mail client program to send a message to
a post office server (an SMTP server).
• The post office server identifies the recipient's address and sends the message
through the Internet to the mail server that handles mail for each recipient's
address.
• The mail server stores the message in the recipient’s mailbox
• The recipient uses an e-mail client program to request new messages from the
mail server.
• The mail server sends the messages in the recipient’s mailbox back to the
client.
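The store-and-forward steps above can be sketched as a small simulation. This is only an illustration of the roles, not real SMTP or POP3 code: the class names `PostOfficeServer` and `MailServer`, the addresses, and the domains are all hypothetical.

```python
from collections import defaultdict


class MailServer:
    """Holds messages per recipient until a client asks for them (the POP3-like role)."""

    def __init__(self, domain):
        self.domain = domain
        self.mailboxes = defaultdict(list)

    def deliver(self, recipient, message):
        # Store the message in the recipient's mailbox.
        self.mailboxes[recipient].append(message)

    def fetch(self, recipient):
        # Hand over everything that is waiting and empty the mailbox.
        messages = self.mailboxes[recipient]
        self.mailboxes[recipient] = []
        return messages


class PostOfficeServer:
    """Accepts outgoing mail and forwards it by recipient domain (the SMTP-like role)."""

    def __init__(self, mail_servers):
        self.routes = {server.domain: server for server in mail_servers}

    def send(self, sender, recipients, body):
        for recipient in recipients:
            # The part after "@" decides which mail server handles the address.
            domain = recipient.rsplit("@", 1)[1]
            message = {"from": sender, "to": recipient, "body": body}
            self.routes[domain].deliver(recipient, message)


# Hypothetical addresses, for illustration only.
example_org = MailServer("example.org")
post_office = PostOfficeServer([example_org])
post_office.send("alice@example.com", ["bob@example.org"], "Hello Bob")

inbox = example_org.fetch("bob@example.org")
print(inbox[0]["body"])
```

The sender's client only ever talks to the post office server, and the recipient's client only ever talks to the mail server, which is why the two people do not need to be online at the same time.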
How E-Mail Works
[Figure: Sender’s Mail Client -> Post Office Server (SMTP) -> Mail Server (POP3) -> Recipient’s Mail Client]
If the sender has an account on a system that does not
use SMTP, such as a Novell MHS (Message Handling
Service) e-mail network or an online service like AOL or
CompuServe, there’s an additional step between the
mail servers: a gateway that converts the address
information from SMTP format to the format used by
the other system, or from the other format to SMTP.
Hotmail, Yahoo Mail, Rocket Mail ….
• Hotmail and its cousins, which are all getting to be very popular
because they offer free e-mail accounts, basically use Web technology
to help you receive mail, send mail and the like.
• The disadvantage is that you have to wait longer to read every
individual mail; and this can be quite a frustrating experience if the
mail is plenty and the lines are slow--which they are anyway, most of
the time!
• The major advantage, however, is that these e-mail services offer
people in corporate organisations and people on-the-move access to
their mail from virtually anywhere they can access the WWW on the
Internet. Moreover, for those who would like their personal mail to be
private and beyond the reach of their colleagues in the office, these
services offer privacy because the mail is left on the mail
company’s server and not on the company server, where others in
the organisation could access private mail.
FTP: File Transfer Protocol
• The most common way of sending and receiving files on the
Internet
• You can store files in any format
– All formats including Microsoft Word, Word-perfect, Excel, Pictures,
Text, Raw data, etc.
• Companies provide FTP sites for downloading of evaluation
software, demos and beta software
• FTP runs on all popular platforms
• FTP can be run either via console, GUI or browser
• ftp://ds.internic.net/ ( ask your instructor to actually demonstrate
the sending and receiving of files via FTP).
Anonymous FTP
• FTP servers are fairly straightforward; when a server receives a file request
from an FTP client, it sends a copy of that file back to the client. Other
commands instruct the server to send the client a directory of files or to accept
an upload from the client.
• Most often you will use FTP to download files from public file archives on
FTP servers. These archives are sometimes known as “anonymous FTP”
archives, because they accept the word “anonymous” as a login name, with the
user’s e-mail address as a password.
• Not all FTP servers accept “anonymous” as a login name.
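An anonymous FTP session of the kind described above might look like the following minimal sketch, which uses Python's standard `ftplib` module. The function name `download_anonymous`, its parameters, and the default e-mail address are assumptions made for illustration; the host and file names must be supplied by the caller, and the server must actually allow anonymous logins.

```python
from ftplib import FTP


def download_anonymous(host, remote_name, local_path, email="guest@example.com"):
    """Log in as "anonymous" and copy one remote file to local_path.

    Returns the directory listing the server sent back.
    """
    with FTP(host, timeout=30) as ftp:
        # By convention the password is the user's e-mail address.
        ftp.login(user="anonymous", passwd=email)
        # Ask the server for a directory of files.
        listing = ftp.nlst()
        # Ask the server to send a copy of the file back to the client.
        with open(local_path, "wb") as out:
            ftp.retrbinary(f"RETR {remote_name}", out.write)
    return listing
```

A call would pass the archive's host name, the file wanted, and where to save it. A server that does not accept “anonymous” as a login name will reject the `login` step with an error.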
gopher://gopher.proper.com/11/pc
Gopher
• Uses client-server architecture to browse through the
directory and file information
• Automatically opens the application depending upon the
file once downloaded
• The menu can point to FTP archives, telnet services,
gopher servers, and more
• Weakness:
– Does not allow intermixing of graphics and text
– Does not allow links from certain positions
• ‘Gopherspace’ is the equivalent of the term “the web”.
• Web overcomes the weaknesses of Gopher
Usenet Newsgroups
• The most interactive, personal and fun of all the services on the
Internet.
• The content in Usenet is created by anyone who wants to talk.
• Thousands of Newsgroups available on every single subject you can
imagine
– soc.culture.india, comp.pc, comp.unix
• You can subscribe to selective newsgroups and get only the necessary
news.
• Netscape 2.0 has made handling Usenet news very simple and easy.
• VSNL, our Internet Service Provider (ISP), has just begun to support
this service.
• You may connect to many newsgroup servers and access newsgroups.
Telnet
• Remote login into computer networks and compute on the
remote computer
• Connection can be established using SLIP, PPP or
dedicated lines
• Usually available in the universities and Internet Service
Providers
• Weakness
– Only console applications can be run. No GUI support unless X terminals
are used
– Security risk because hackers can trap the IP address of the network
• Least used of the Internet services.
• Cannot use existing Web Browsers to Telnet.
World Wide Web
• The most graphical and powerful of the Internet
technologies
• Powerful linking features allow you to browse or surf,
hence the name browsers.
• Easiest to use. Just click and you will go from
one page to another, from one server to another without
geographical barriers
• World Wide Web = Text + Graphics + Multimedia +
Communications.
World Wide Web (contd.)
• Current Technologies include audio, video, 3D, Virtual
reality, secure transactions, plug-ins
• Near Future technologies: Web enabled applications,
Client-server, Electronic commerce.
• Uses Client-Server architecture
• Created in 1991-92 by Tim Berners-Lee at CERN, Geneva.
How the Web Works: HTTP
• The most interesting part of the way the Web works is its
simplicity. The transaction takes place in four basic phases,
all part of the underlying HTTP (HyperText Transfer
Protocol):
– Connection
– Request
– Response
– Close
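The four phases can be sketched with Python's standard library. So that the example runs without an Internet connection, it first starts a tiny local Web server; the `HelloHandler` class and the page it returns are made up for illustration, and a real browser does a good deal more than this.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A stand-in Web server that answers every GET with one small page."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Start the stand-in server on a free local port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# 1. Connection: open a TCP connection to the server.
sock = socket.create_connection((host, port), timeout=5)

# 2. Request: method, object, and protocol version, then a blank line.
sock.sendall(f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii"))

# 3. Response: read until the server has sent everything.
chunks = []
while True:
    data = sock.recv(4096)
    if not data:
        break
    chunks.append(data)
response = b"".join(chunks)

# 4. Close: the transaction is over.
sock.close()
server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

Only after the close does the client decide what to do with the bytes it received, such as rendering them as an HTML page.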
How the Web Works:
The Connection Phase
• In the connection phase, the Web client (for example
Netscape) attempts to connect with the server. This appears
on the status line of most browsers in the form of
Connecting to HTTP server. If the client can’t perform the
connection, nothing further happens. Usually, in fact, the
connection attempt times out, yielding an explanatory
message such as Unable to connect to server, try again, etc.
How the Web Works:
The Request Phase
• Once the connection to the HTTP server is established, the
client sends a request to the server. The request specifies
which protocol is being used (including which version of
HTTP, but it can also be FTP, NNTP, Gopher, etc.). Included
in the request is the method, which essentially is the
client’s command to the server. The most common method
is GET, which is basically a request to retrieve the object
in question.
How the Web Works:
The Response Phase
• Assuming the server can fulfill the request (it sends error
messages if it can’t), it then executes the response. You’ll
see this phase of the transaction in your browser’s status
line, usually in the form Reading Response. Like the
request, the response indicates the protocol being used, and
it also offers a reason line, which appears on the browser’s
status line. Depending on the browser, you’ll see exactly
what is going on at this point, usually represented by a
Transferring message.
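The first line of an HTTP response carries the protocol version, a numeric status code, and the reason text that the browser may show on its status line. A minimal sketch of splitting that line apart follows; the helper name `parse_status_line` is an assumption for illustration, not part of any library.

```python
def parse_status_line(line):
    """Split a response status line into (version, code, reason)."""
    text = line.decode("ascii").strip()
    # Split on the first two spaces only: the reason may itself contain spaces.
    version, code, reason = text.split(" ", 2)
    return version, int(code), reason


ok = parse_status_line(b"HTTP/1.0 200 OK\r\n")
missing = parse_status_line(b"HTTP/1.0 404 Not Found\r\n")
print(ok, missing)
```

A code of 200 means the server could fulfil the request. A code such as 404 is one of the error messages the server sends when it cannot.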
How the Web Works:
The Close Phase
• Finally, the connection is closed.
• At this stage, the browser springs into action. Effectively, it loads
and displays the requested data, saves the data to a file, or
launches a viewer or helper application if the need arises. If the
object is a text file, the browser will display it as a plain
ASCII document without hypertext. If it is a graphic image (such as a BMP file),
the browser will launch the graphics viewer specified in its
configuration settings. If it’s a sound or video file (AU, WAV,
MPEG, AVI, FLI, AIFF), the browser will launch a similarly
configured helper application or plug-in. Usually, however, the
browser displays an HTML document. These documents show
the graphics, links, icons, and formatting for which the Web has
become so famous.
How to find information on the
Web
• The most common way to find
information is using the following
services
– Yellow pages
• Yahoo, GNN
– Search Engines
• InfoSeek, WebCrawler, Alta-Vista, Lycos
Web Directories and Web Indexes
• A Web directory, like Yahoo, maintains a database of all the Web sites
by recording the company name and other important information from
the Web pages, like captions, etc.
• On the other hand, a Web index, like AltaVista, maintains exhaustive
information on every Web site by picking up all the important words and keywords from every single page of the site.
• A Web directory can be compared to the contents page of a book, and a Web index to its index pages.
• If you are looking for Ajmals then you can find it easily using a Web
index, but if you are looking for Hamrayn Centre, chances are that
Yahoo won’t find it for you. But, of course, Yahoo refers things it
cannot find to AltaVista.
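The difference between the two approaches can be sketched in a few lines. The pages, URLs, and entries below are invented for illustration and do not describe how Yahoo or AltaVista are actually built: the directory holds one hand-picked entry per site, while the index records every word of every page.

```python
from collections import defaultdict

# Hypothetical site content, for illustration only.
pages = {
    "http://ajmals.example/": "Ajmals home page",
    "http://ajmals.example/shops": "Our shop at Hamrayn Centre",
}

# A directory: one entry per site, keyed by the site's name.
directory = {"ajmals": "http://ajmals.example/"}


def build_index(pages):
    """An index: every word on every page points back to the pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


index = build_index(pages)

print(directory.get("hamrayn"))   # the directory has no such entry
print(index["hamrayn"])           # the index finds the page that mentions it
```

This is the contents-page versus index-pages distinction: the directory is short and curated, while the index is exhaustive and grows with every word on the site.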
Domain Name
• On the Internet every computer has a unique address and a
unique name
• The unique address is called the IP address. For example
205.184.60.1
(each of the four numbers is always between 0 and 255)
• The unique name is called the domain name. For example
webplaza.com
• Domain names enable us to easily remember the server or
network
• For example webplaza.com is easier to remember than
205.184.60.1
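The rule that each number lies between 0 and 255 can be checked with Python's standard `ipaddress` module. This is a small sketch; the helper name `is_valid_ipv4` is an assumption for illustration.

```python
import ipaddress


def is_valid_ipv4(text):
    """Return True if text is a dotted address with four numbers from 0 to 255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


address = ipaddress.IPv4Address("205.184.60.1")
parts = list(address.packed)  # the four numbers, one byte each

print(parts)
print(is_valid_ipv4("205.184.60.1"), is_valid_ipv4("205.184.60.256"))
```

Each number fits in one byte, which is why 255 is the upper limit.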
What’s in a Domain Name?
• When a domain name is specified, software converts that name to
an IP address by looking up the name in a table of addresses.
• If I want to know the address for the name xxx.yyy.zzz, I’ll ask the
computer at yyy.zzz because it knows the address of everything
that ends with yyy.zzz.
• If I don’t know the address of yyy.zzz, I’ll ask the computer at zzz
because it knows the address of everything that ends with zzz.
• The names in zzz are predefined by a committee.
• Some of the most common names in place of zzz are com, edu, and
net.
• “com” in webplaza.com is referred to as the top-level domain (sometimes loosely called the root domain).
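The lookup described above, asking zzz first, then yyy.zzz, then getting the address of xxx.yyy.zzz, can be sketched as a walk through nested tables. The table contents, the server labels, and the address 192.0.2.10 are all hypothetical; real name servers are far more involved.

```python
# Hypothetical tables: each "server" knows only the names directly beneath it.
ROOT = {"zzz": "server-for-zzz"}
SERVERS = {
    "server-for-zzz": {"yyy.zzz": "server-for-yyy.zzz"},
    "server-for-yyy.zzz": {"xxx.yyy.zzz": "192.0.2.10"},
}


def resolve(name):
    """Resolve a name from right to left, returning the address and the steps taken."""
    labels = name.split(".")
    table = ROOT
    steps = []
    answer = None
    # Start with the last label ("zzz") and add one label to the left each time.
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        answer = table[suffix]           # KeyError if the name is unknown
        steps.append((suffix, answer))
        table = SERVERS.get(answer, {})  # next, ask the server just found
    return answer, steps


address, steps = resolve("xxx.yyy.zzz")
print(address)
for suffix, answer in steps:
    print(suffix, "->", answer)
```

No single table has to know every name: each server is responsible only for the names that end with its own suffix.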
Top Level Domains
The top-level domains in the world are:
– .com (company)
– .edu (educational institutes)
– .gov (government)
– .int (international organizations)
– .mil (military)
– .org (organizations)
What’s in a Domain Name?
• The computer identified by a particular domain name is not
necessarily always the same.
– The server whose address is www.PCS Computer Academy.com today,
for example, may be a computer in New York, but that server may change
addresses tomorrow to a computer in Boston while keeping the same name.
– The physical location of the computer identified by the name is not
important.
• Moreover, one domain name can point to more than one IP
address.
– This feature helps server administrators create duplicates of their servers
to speed up access for Internet users.
Applying for an Internet Domain
Contact:
• Your local Service Provider
• Your country’s network Information Centre
http://www.apnic.net/
• The InterNIC
http://www.internic.net/
Applying for an Internet Domain in
India
domain-reg@sangam.ncst.ernet.in
or
Domain Registrar
National Centre for Software Technology
8th. Floor, Air India Building
Nariman Point
Bombay-400 021
Telephone: 91-22-2024641, 91-22-2836924
Fax: 91-22-6210139
Applying for an IP Address in
India
India Network Information Centre
ERNET Project
ECE Department
Indian Institute of Science
Bangalore 560 012
Telephone: 91-80-3312312, 91-80-3340855
Fax: 3347991
URLs or Uniform Resource Locators
• URL is the current method for specifying the addresses of
things on the Web.
• A URL tells the Web client the following three things:
– The type of Internet service that your client uses to get the item.
– The name of the computer on which the service resides.
– The request for the item you want (this part may be blank).
http://www.msn.com/tutorial/default.html
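The three parts of the example URL can be pulled apart with Python's standard `urllib.parse` module, as a quick illustration:

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.msn.com/tutorial/default.html")

service = parts.scheme     # the type of Internet service
computer = parts.hostname  # the computer on which the service resides
request = parts.path       # the request for the item (may be blank)

print(service, computer, request)
```

For a URL with nothing after the computer name, such as http://www.msn.com, the request part comes back as an empty string.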
WWW Browsers
• WWW browsers are applications which display Web Pages
– Netscape Navigator, Microsoft Internet Explorer, NCSA’s Mosaic
are some of the popular browsers or Web clients.
• They interpret HTML language, display graphics, play
audio and video, simulate virtual reality...
• Netscape and Microsoft are moving towards defining
browser-centric computing where the browser becomes the
OS itself for doing everything
Popular Web Browsers
• First GUI Web browser: Mosaic, from the NCSA. Still used
by many.
• Netscape Communicator (current version 4.0) is the most
popular Web browser. Netscape Communications claims
that it has sold 40 million copies.
• Microsoft’s free Internet Explorer 4.0 is catching up fast.
• Lynx is the most commonly used Web browser for text-based browsing (shell account users)
HTML
• Stands for Hyper Text Markup Language
• The basis for the World Wide Web, because different computer
systems can display the information in the same manner thanks to
the common HTML language they speak
• HTML is an application of SGML, the comprehensive standard
for documentation in large corporations
• HTML is easy to use, similar to shell scripts
• Proliferation of HTML WYSIWYG editors
• HTML 3.2 due for release
• HTML+, Enhanced HTML expected soon
Connecting to the Internet
• Things needed to connect to the Internet
– Computer
• PC, SUN, Mac or other
– ISP connection
• Dial-up connection
– Telephone connection, ISDN
• Dedicated leased lines
– T1, E1, ATM, SONET
– Software
• Email client
• WWW browser
• TCP/IP network software
Internet Service Providers
• Provide connection to the Internet, just like telephone
companies give connection to Telephone network.
• Connection Options:
– Dial-up Connection: Data over telephone lines, speeds
up to 33 Kbits/sec
– ISDN (Integrated Services Digital Network): Even
though it has been around for a long time, it is getting very
popular now. Speeds up to 128 Kbits/sec
Internet Service Providers
• Connection options:
– Leased Lines: The most popular way to connect
for bigger companies, Speeds start at 56
KBits/sec
– T1: Large companies or companies with huge
bandwidth requirements, 1.5 MBits/sec
– E1: Multinationals like IBM or Back-bone
providers / Large Internet Service Providers
like Netcom, PSInet or UUNET
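To get a feel for what the connection speeds listed above mean in practice, here is a rough calculation of how long a file takes to transfer. It is a simplified sketch: it assumes 1 Kbit = 1,000 bits and 1 Mbit = 1,000,000 bits, and it ignores protocol overhead and line noise, so real transfers will be slower. The 1,000,000-byte file size is just an example.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Ideal transfer time: 8 bits per byte, no overhead."""
    return size_bytes * 8 / bits_per_second


# Speeds taken from the connection options above.
links = {
    "Dial-up, 33 Kbits/sec": 33_000,
    "Leased line, 56 Kbits/sec": 56_000,
    "ISDN, 128 Kbits/sec": 128_000,
    "T1, 1.5 Mbits/sec": 1_500_000,
}

file_size = 1_000_000  # an example file of one million bytes

for name, speed in links.items():
    print(f"{name}: {transfer_seconds(file_size, speed):.1f} seconds")
```

The same file that occupies a dial-up line for about four minutes needs only a few seconds on a T1.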
Internet Presence Providers
• Concentrate on creating and hosting content
• Services include:
– Home Page creation including Graphics
– Integration of Databases with WWW
– Hosting, maintaining web sites
– Co-location of Server for WWW, FTP, etc.
• Internet Presence Providers are, most of the time,
the deciding factor of failure or success of
marketing the products on the Web
Dial-Up Options:
Accessing Internet Services via a Dial-up Terminal
[Figure labels: User’s Premises: RS232C, modem (M); PSTN; Provider’s Premises: modem (M), RS232C, Shell Provider server, TCP/IP, Internet]
Dial-Up Options:
Accessing an Internet Server via a TCP/IP enabled User system
[Figure labels: User’s Premises: TCP/IP, SLIP/PPP, dial-up modem (M); PSTN; Provider’s Premises: dial-up modem (M), SLIP/PPP, your PPP provider’s server, Internet]
PCS Computer Academy, Bangalore
Leased Line Option:
Accessing an Internet Server via a TCP/IP enabled User system
[Figure labels: User’s Premises: SLIP/PPP, HDLC, leased line modem (M); PSTN; Provider’s Premises: leased line modem (M), HDLC, SLIP/PPP, server, Internet]
Origins of the Internet
• Result of a US Department of Defense research project during
the 1970s, run by ARPA.
– ARPA = Advanced Research Projects Agency; its network was called ARPAnet
• Goal of the project:
– 1. Connecting dissimilar computer systems to
communicate
– 2. Route data through multiple communication paths so
that the network would be able to run even if many of
the computers or the connections between them failed.
Origins of the Internet
• In the 1980s, the National Science Foundation (NSF) started
promoting its own network, called NSFNET, using ARPA
technology and a high-speed backbone network.
• These networks increased emails and information sharing
between universities and research centers.
• To overcome the problem of connecting dissimilar
computers, by 1983, all computers on the ARPAnet were
required to use TCP/IP.
• This, along with many other changes, gave birth to the Internet.
• Any computer that uses the TCP/IP networking protocol
and is physically connected to another computer on the
Internet is itself on the Internet.
The NSFnet Backbone
• Some of the computers on the Internet are directly
connected to each other through the NSFnet backbone.
• The backbone is a series of cables and connecting hardware
that pass data at very high speeds (45 million bits per
second).
• About 10 sites throughout the US form the basis for this
backbone.
• Any computer connected directly or indirectly to the
backbone can be considered part of the Internet.
• As long as a single computer in one country is connected
to another computer that is connected to the backbone,
that country has access.
Who Runs the Internet
• No one owns or runs the Internet
• Every computer connected to the Internet is
responsible for its own part.
• The National Science Foundation is responsible
for maintaining only the backbone.
• If something doesn’t work, you do not complain to
the ‘management’ of the Internet. Instead you talk
to the system administrators of the computer you
are connected to.
Internet Technical Groups
• The Internet is not really a free-for-all with no one
guiding it.
• There are a few organizations who give the
Internet some structure while creating a minimum
number of restrictions.

Abbreviation   Name
IETF           Internet Engineering Task Force
IRTF           Internet Research Task Force
IAB            Internet Architecture Board
InterNIC       Central Naming Clearinghouse
Internet Technical Groups
• IETF: develops and maintains the Internet’s
communication protocols.
• IRTF: looks into long-term research problems.
• IAB: oversees the IETF and IRTF and ratifies any
major changes to the Internet that come from the
IETF.
The Internet Society
• The three groups discussed above mostly facilitate
the technical structure and details of the Internet.
• In 1992, the Internet Society (also called ISOC)
was formed to help connect the user-oriented
people with the technical people.
• It is the parent society to the IETF and appoints
the members of the IAB.
• Unlike the other societies the Internet Society
doesn’t control anything. It keeps its members
informed about the Internet.
Internet Demographics
• Sex
– Still dominated by men
• Age
– Wider use in 20 - 40 age group
• User growth
– Huge
• Income
– Affluent
• Country
– The USA still dominates and hence constitutes the prime target market
Internet Demographics: Age
[Chart: Age. Vertical axis 0 to 25; age groups shown include 16-20, 26-30 and 41-50.]
Internet Demographics: Gender
[Chart: Percentage by gender (Men, Women). Vertical axis 0 to 80.]
Internet Demographics: Internet Hosts Growth
[Chart: Growth (Millions), 1991 to 1996. Vertical axis 0 to 10.]
Internet Demographics: Annual Income
[Chart: Percentage by annual income bracket. Vertical axis 0 to 40.]
Country wide distribution of Networks

Country   Code   Number of Networks   Initial Connection
China     cn     8                    04/94
India     in     13                   11/90
UK        uk     1436                 04/89
USA       us     28470                07/88
Country wide distribution of Networks - Graph
[Chart legend: USA, Germany, UK, Canada, France, Japan, Australia, Netherlands, Others]
Network Distribution within USA

State           Code   Number of Networks   Rank
South Dakota    SD     15                   Lowest
Virginia        VA     1964                 4
New York        NY     2152                 3
Massachusetts   MA     2005                 2
California      CA     4832                 1
The Internet Phenomenon
• Biggest development since the original PC
• High levels of investment
• Rapid innovation
• A revolution in communication
Internet Today!
• Growth is close to 10% per month!
• 7 million host computers connected to the Internet
• 50 million world-wide have access to email on the
Internet
• Internet search and retrieval increasing by 1000%
annually
• 1200 Internet articles appearing every month
• 60,000 networks world-wide
What do you do on the Internet
• Search and retrieve documents
• Exchange e-mail (70 million email addresses)
• Download programs, demos and graphics
• Search databases of companies and government
• Read and respond to USENET groups (30,000
different topics)
• Real-time chat, web-phone and video conference
(Internet Relay Chat: recently VSNL has added
this service too).
What do you do on the Internet
• Browse and Search catalogs of goods and services,
and make purchases
• Distribute electronic publications
• Sell products and services
• Publish your company and products information
Using the Web for Doing
Business Online
• The most popular use of the web is for doing
business.
• The web is more interactive than
newspapers, magazines and TV.
• The web generated $200 billion of
revenues in 1995
• The growth of the web is 10% every
month.
Business use of the Web
• Communication
• Info. Management & Distribution
• Customer Service/Technical assistance
• Cost Containment
• Research
• Recruitment
• Marketing and Sales
Business use of the Web
• Communication
– Web is a multimedia communication
Powerhouse
– Regular use of text, graphics and sound
– Communication can be internal or
external to your enterprise
– Users can interact through data entry forms
and e-mail
Business use of the Web
• Info. Management & Distribution
– The Intranet is picking up and encompassing client-server and groupware
– Management and Distribution of information is widely
used between employees, work groups, offices, etc.
– Information can be updated regularly, e.g. price lists,
specifications, inventory, etc.
– Human resources information on-line with password
protection too!
Business use of the Web
• Customer Service & Technical Assistance
– One of the most successful uses of the web in
business is in the area of customer support and
technical assistance
– Technical support with FAQ’s
– Software patches and upgrades
– Database of bug resolutions
– Customer feedback
Business use of the Web
• Customer Service & Technical
Assistance
– Documentation of Products and Services
– Company product descriptions
– Announcements of special sales
– Price lists of products and services
– Specification sheets and Technical notes
– Pictures and drawings of products
Business use of the Web
• Cost Containment
– E-mail saves on communications costs.
– In marketing and advertising, reach a
large number of potential customers
rather inexpensively
– Intranets save on paper and printing costs
– Cut costs on Internet Commerce
transactions
Business use of the Web
• Research
– Use the web to locate existing
databases and other collections of
information for market research
– Business people can get a wealth of
information on web including stock
quotes, weather information, air/train
timings, up-to-date news, etc.
Business use of the Web
• Research
– A good web site can keep track of the
profiles of the customers who visit. This
can be valuable data for marketing your
products.
– Most universities have web sites and
keep their research data for access by the
public. This can be valuable information
for free.
Business use of the Web
• Recruitment
– Many businesses use the web to recruit
employees, consultants and contractors.
– Your web site can provide in-depth
information about your business to
potential employees.
– Résumé information may be sent via
email to recruiting agents.
Business use of the Web
• Marketing and Sales
– Is the most popular business use of the
web
– Advertising and Brand Name
Recognition
– Visibility
– Public Relations
– Press Releases
– Direct Sales