Lecture 24

Web Site Structure
When you decide to have a website for your business or personal interest, there are a number of
things you have to consider before you actually start building it. Website planning has
various steps:
1. Purpose of Website
The first step of website planning should be deciding on the purpose of the website. Determine
what it is that you wish to accomplish with the website. Taking the time to clearly define the
purpose of the website will affect how successfully you reach the goals you set for the project.
2. Determine Target Audience
Ask yourself, "Who is going to be looking at my site?" and "What technologies will my visitors
have?" When planning a website you need to assess who the target audience will be, what
technologies their systems will have, and what their level of computer experience is before you
can decide on your website technologies.
Determining your target audience during the website planning stage will give you a wealth of
information that can be used as the website is further developed. This information can be used
when deciding on which website technologies to incorporate, the type of website features you need
and what the target audience is looking for.
3. Website Technical Considerations
Ask yourself, "What technologies do I need?" The website technologies you will require will
depend on the type of website you are building and what type of audience you have decided to
target and accommodate. Have your list of website technologies required ready before you move
to the next step, securing hosting.
4. Website Hosting Costs
Website hosting costs are influenced by website planning. When planning a website be sure that
the web host has room to grow with your site. Are there features included with a slightly more
expensive hosting package that you will need in the future?
5. Website Budget
Ask yourself, "What is my budget?" When planning a website, budget can be a determining
factor as to what features the website will have. Seriously assessing what you can do yourself and
what you need help with will affect the website budget.
What is a Web Site?
A website, also written as Web site, web site, or simply site, is a set of related web pages served
from a single web domain. A website is hosted on at least one web server, accessible via a network
such as the Internet or a private local area network through an Internet address known as a Uniform
Resource Locator. All publicly accessible websites collectively constitute the World Wide Web.
A webpage is a document, typically written in plain text interspersed with formatting instructions
of Hypertext Markup Language (HTML, XHTML). A webpage may incorporate elements from
other websites with suitable markup anchors.
Webpages are accessed and transported with the Hypertext Transfer Protocol (HTTP), which may
optionally employ encryption (HTTP Secure, HTTPS) to provide security and privacy for the user
of the webpage content. The user's application, often a web browser, renders the page content
according to its HTML markup instructions onto a display terminal.
The pages of a website can usually be accessed from a simple Uniform Resource Locator (URL)
called the web address. The URLs of the pages organize them into a hierarchy, although
hyperlinking between them conveys the reader's perceived site structure and guides the reader's
navigation of the site. A site generally includes a home page with most of the links to the site's
web content, and supplementary about, contact and link pages.
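As a rough illustration of how a URL encodes a page's place in the site hierarchy, the following Python sketch splits an address into its parts using the standard urllib.parse module. The address www.example.com and its path are made-up placeholders, not a real site from this lecture.

```python
from urllib.parse import urlsplit

# Hypothetical address, used only for illustration.
url = "https://www.example.com/products/shoes/index.html"

parts = urlsplit(url)
scheme = parts.scheme   # the protocol used to fetch the page
host = parts.hostname   # the domain name of the web server

# Each path segment corresponds to one level of the page hierarchy.
levels = [segment for segment in parts.path.split("/") if segment]

print(scheme, host, levels)
```

Here the page index.html sits two levels below the site root, inside products and then shoes.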
Some websites require a subscription to access some or all of their content. Examples of
subscription websites include many business sites, parts of news websites, academic journal
websites, gaming websites, file-sharing websites, message boards, web-based email, social
networking websites, websites providing real-time stock market data, and websites providing
various other services (e.g., websites offering storing and/or sharing of images, files and so forth).
Types of Web Sites & Documents
Static Web Pages
Static web pages don’t change content or layout with every request to the web server. They change
only when a web author manually updates them with a text editor or web editing tool like Adobe
Dreamweaver. The vast majority of web sites use static pages, and the technique is highly
cost-effective for publishing web information that doesn’t change substantially over months or even
years. Many web content management systems also use static publishing to deliver web content.
In the CMS the pages are created and modified in a dynamic database-driven web-editing interface
but are then written out to the web server (“published”) as ordinary static pages. Static pages are
simple, secure, less prone to technology errors and breakdown, and easily visible by search
engines.
Dynamic Web Pages
Dynamic web pages can adapt their content or appearance depending on the user’s interactions,
changes in data supplied by an application, or as an evolution over time, as on a news web site.
Using client-side scripting techniques (JavaScript, Ajax techniques, Flash ActionScript), content can be
changed quickly on the user’s computer without new page requests to the web server. Most
dynamic web content, however, is assembled on the web server using server-side scripting
languages (ASP, JSP, Perl, PHP, Python). Both client- and server-side approaches are used in
multifaceted web sites with constantly changing content and complex interactive features.
Dynamic web pages offer enormous flexibility, but the process of delivering a uniquely assembled
mix of content with every page request requires a rapid, high-end web server, and even the most
capable server can bog down under many requests for dynamic web pages in a short time. Unless
they are carefully optimized, dynamic web content delivery systems are often much less visible to
search engines than static pages. Always ask about search visibility when considering the merits
of a dynamic web content system.
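The difference between static and server-side dynamic delivery can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function names serve_static and serve_dynamic, the page text, and the visitor data are all invented for the example, and no real web server is involved.

```python
from html import escape
from string import Template

# A static page: the same text is returned for every request.
STATIC_PAGE = "<html><body><p>Opening hours: 9 to 5.</p></body></html>"

# A dynamic page: a template that is filled in on each request.
PAGE_TEMPLATE = Template(
    "<html><body><p>Hello, $name. You have $count new messages.</p></body></html>"
)

def serve_static() -> str:
    """Return the stored page unchanged."""
    return STATIC_PAGE

def serve_dynamic(name: str, count: int) -> str:
    """Assemble the page from data supplied at request time."""
    # escape() keeps user-supplied text from being read as markup.
    return PAGE_TEMPLATE.substitute(name=escape(name), count=count)

print(serve_static())
print(serve_dynamic("Ana", 3))
```

The static function does no work per request, which is why static pages are cheap to serve. The dynamic function must run for every request, which is where the server load described above comes from.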
Web Content Management
Enterprise Web Content Management Systems
Web content management systems enable large numbers of nontechnical content contributors to
update and create new web pages with ease within the context of large, enterprise-wide web sites
that may contain thousands or even millions of pages of content. These systems offer some
variation on these three core features:
• Editorial workflow, an approval process, and access management for individual web authors
• Site management of pages, directories, content contributor accounts, and general system
  operations
• An interactive user interface, usually browser-based, that doesn’t require technical
  knowledge of the web, HTML, or CSS to create web content
In a typical CMS-driven web site, the web editing workflow is as follows:
1. A domain expert, local department staffer, or writer adds, updates, or otherwise modifies
the content of a page, using a web browser to access the CMS features and perform editing
and site management functions;
2. The finished content is routed by a series of notifications to the designated approver for
content in that area of the larger web site;
3. The approver reviews the new content and either releases it for publication or sends it back
for revision; and
4. The CMS assembles the approved content for publication. On larger web sites, the content is
typically published to a “live” server on the Internet at specified intervals during the day.
Most CMS products can also handle instant site updates if needed.
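The four workflow steps above can be modeled as a small state machine. The sketch below is an assumption-laden illustration in Python, not how any particular CMS product works: the state names, action names, and the Page class are all invented here.

```python
from dataclasses import dataclass, field

# Allowed transitions, mirroring the edit / review / publish steps.
TRANSITIONS = {
    "draft": {"submit": "in_review"},
    "in_review": {"approve": "approved", "reject": "draft"},
    "approved": {"publish": "published"},
    "published": {},
}

@dataclass
class Page:
    title: str
    state: str = "draft"
    history: list = field(default_factory=list)

    def apply(self, action: str) -> None:
        """Move the page to its next state, or refuse an invalid action."""
        allowed = TRANSITIONS[self.state]
        if action not in allowed:
            raise ValueError(f"cannot {action!r} a page in state {self.state!r}")
        self.history.append(self.state)
        self.state = allowed[action]

# A page that is sent back once for revision before being published.
page = Page("Admissions FAQ")
for action in ("submit", "reject", "submit", "approve", "publish"):
    page.apply(action)
print(page.state, page.history)
```

Keeping the transitions in one table means that content can never reach the "published" state without first passing through review and approval.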
The text, graphic, and site management tools in a CMS are designed to allow users with little or no
knowledge of HTML or CSS to create and manage sophisticated web content. Most large corporate,
enterprise, and university sites are now managed with a CMS in a decentralized editorial
environment where hundreds of individual authors, content approvers, editors, and media
contributors create most of the content for the enterprise’s sites.
Most enterprise CMS products use a database to store web content. Text and media files (graphics,
photos, podcasts, videos) are often stored as XML to facilitate reuse and enable flexible presentation
options, permitting content to be updated simultaneously on a variety of web pages. CMS products
use templates to provide a consistent user interface, enterprise identity branding, and typographic
presentation throughout the site. CMS templates increasingly are complex XSLT (Extensible
Stylesheet Language Transformations) files that modify and transform XML content into web pages
for viewing in conventional web browsers, in special formats for visually impaired readers, on
mobile devices like cell phones, and in convenient print formats.
Blogs
Owing to their ease of use and the ready availability of supporting software, web logs, or blogs,
are the most popular, inexpensive, and widespread form of web content management. Blog
software such as Blogger, Roller, or WordPress allows nontechnical users to combine text,
graphics, and digital media files easily into interactive web pages.
A blog is actually a simple CMS, typically designed to support three core features:
• Easy publication of text, graphics, and multimedia content on the web
• Built-in tools that enable blog readers to post comments (an optional feature)
• Built-in RSS features that allow subscribers to see when a blog site has been updated
The typical blog content genre is an online diary of life events (personal blogs) or short
commentary on a particular subject (politics, technology, specialized topics), but blog software can
easily be adapted to support collaborative work within social groups or internal and external
enterprise communications. For example, many universities have adopted blog software as a
simple CMS that allows nontechnical faculty and administrators to quickly post notices, emergency
announcements, and other timely material.
For a small (ten-to-twenty-page), special-purpose, small business, or department web site, a
blog-based site may be all you need to get up and running quickly with a set of friendly, nontechnical
editing tools and (usually) such built-in features as calendars, automated category and navigation
controls, and automatic RSS feeds. If the blog metaphor of posted-content-plus-reader-comments
doesn’t suit your purpose, turn off the comments features and you have a friendly web site
development and editing tool plus a lightweight CMS in one inexpensive package.
Wikis
A wiki is a specialized form of content-managed web site designed to support the easy
collaborative creation of web pages by groups of users. Wikis differ from blogs and other CMS
options in that wikis allow all users to change the content of the wiki pages, not just to post
comments about the content. Wikis such as the well-known Wikipedia online encyclopedia can be
publicly accessible and edited by any user, but wiki software can also be used to support more
private collaboration projects, where only members of the group can see and edit the wiki content.
Popular wiki tools like PBwiki, MediaWiki (used by Wikipedia), and JotSpot offer
search, browsing, and editing features, as well as account management and security features to
limit access to selected users. In wikis the changes to content are typically visible instantly after
changes are made, and the workflow model is “open,” without a formal approval process for new
content changes and additions. This open model allows fast progress and updates by many
contributors, but may not be suitable for projects that handle sensitive or controversial material
that is visible to the reading public on your enterprise intranet or the larger World Wide Web
audience.
RSS
Really Simple Syndication (RSS) is a great way to generate a set of “headlines” and web links that
can appear many places at once on the Internet or your local enterprise intranet. RSS is a family of
XML-based feed formats that can automatically provide an updated set of headlines, web links, or
short content snippets to many forms of Internet media. RSS can be read by a variety of display
software, including many email programs, major web browsers (Firefox, Internet Explorer, Opera,
Safari), specialized RSS aggregator software like Surfpack or FeedDemon, and web portal sites such
as iGoogle, MyYahoo!, and other customizable corporate and Internet portals. Most blog software
can generate RSS feeds to notify users of updated content, and there are many special-purpose RSS
feed authoring programs on the market. Once the RSS feed file is created by a blog or generated by
desktop RSS software and placed on a web server, the feed can be addressed with a conventional
URL (Uniform Resource Locator) just like a web page (http://whatever-site.com/my-rss-feed.xml).
Every time you update the RSS feed file your users see the new headlines in their email, web
browser, or portal page.
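To make the idea of a feed file concrete, here is a minimal sketch that builds an RSS 2.0 document with Python's standard xml.etree.ElementTree module. The function name build_feed, the channel title, and the headline are assumptions made up for the example; only the whatever-site.com address comes from the text above.

```python
import xml.etree.ElementTree as ET

def build_feed(title, link, items):
    """Return a minimal RSS 2.0 document as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description for the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Updates from {title}"
    # Each item is one "headline" with a link back to the full page.
    for headline, url in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = headline
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")

feed_xml = build_feed(
    "Campus News",  # hypothetical channel title
    "http://whatever-site.com/",
    [("Library hours extended", "http://whatever-site.com/news/library-hours")],
)
print(feed_xml)
```

Saving the printed text as my-rss-feed.xml on a web server would give readers a feed they can subscribe to. Adding another (headline, url) pair and regenerating the file is what "updating the feed" amounts to.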
What is a Domain Name
A domain name is an identification string that defines a realm of administrative autonomy,
authority, or control on the Internet. Domain names are formed by the rules and procedures of the
Domain Name System (DNS). Technically, any name registered in the DNS is a domain name.
Domain names are used in various networking contexts and application-specific naming and
addressing purposes. In general, a domain name represents an Internet Protocol (IP) resource, such
as a personal computer used to access the Internet, a server computer hosting a web site, or the
web site itself or any other service communicated via the Internet.
Domain names are organized in subordinate levels (subdomains) of the DNS root domain, which
is nameless. The first-level set of domain names are the top-level domains (TLDs), including the
generic top-level domains (gTLDs), such as the prominent domains com, info, net and org, and
the country code top-level domains (ccTLDs). Below these top-level domains in the DNS
hierarchy are the second-level and third-level domain names that are typically open for reservation
by end-users who wish to connect local area networks to the Internet, create other publicly
accessible Internet resources or run web sites. The registration of these domain names is usually
administered by domain name registrars who sell their services to the public.
A fully qualified domain name (FQDN) is a domain name that is completely specified in the
hierarchy of the DNS, having no omitted parts. Domain names are usually written in lowercase,
although labels in the Domain Name System are case-insensitive.
Top Level Domain
The top-level domains such as .com and .net and .org are the highest level of domain names of the
Internet. A top-level domain is also called a TLD. Top-level domains form the DNS root zone of
the hierarchical Domain Name System. Every domain name ends in a top-level or first-level
domain label.
Second Level Domain
Below the top-level domains in the domain name hierarchy are the second-level domain (SLD)
names. These are the names directly to the left of .com, .net, and the other top-level domains. As
an example, in the domain example.co.uk, co is the second-level domain.
Next are third-level domains, which are written immediately to the left of a second-level domain.
There can be fourth- and fifth-level domains, and so on, with virtually no limitation.
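The level numbering can be checked with a short Python sketch that splits a name into its labels and lists them from the top level down. The helper name domain_levels is invented for this example, and the www label is added to the lecture's example.co.uk name only to show a fourth level.

```python
def domain_levels(name: str) -> list[str]:
    """Split a domain name into labels, ordered from the top level down."""
    # Labels are case-insensitive; a trailing dot stands for the nameless root.
    labels = name.lower().rstrip(".").split(".")
    return list(reversed(labels))

levels = domain_levels("www.Example.co.uk")
for depth, label in enumerate(levels, start=1):
    print(f"level {depth}: {label}")
```

Reading the name from right to left gives uk as the top-level domain, co as the second-level domain, example as the third level, and www as the fourth.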
What is HTML
HyperText Markup Language (HTML) is the main markup language for creating web pages
and other information that can be displayed in a web browser. An HTML file is, first of all, a file
saved with a recognizable extension. Two extensions are valid: htm, the traditional format for
Microsoft Windows files, and html. If the file you are working on has already been saved, it
should have a valid extension already. Otherwise, after creating a text file, make sure you provide
one of these extensions when you save it as HTML.
An HTML document is an ASCII text file that contains embedded HTML tags. On a UNIX server,
it typically has a filename extension of .html. In general, the HTML tags are used to identify the
structure of the document and to identify hyperlinks (to be highlighted) and their associated URLs.
HTML identifies the structure of the document and it suggests the layout of the document. The
display capabilities of the Web browser determine the appearance of the HTML document on the
screen.
Using HTML you can identify:
• The title of the document
• The hierarchical structure of the document with header levels and section names
• Bulleted, numbered, and nested lists
• Insertion points for graphics
• Special emphasis for key words or phrases
• Preformatted areas of the document
• Hyperlinks and associated URLs
Structural HTML on its own, without style sheets or presentational attributes, cannot control the:
• Typeface used for any document component
• Point size of any specific font
• Width or height of the screen
• Centering, spacing, or line breaks of information, except in preformatted text
• Background, foreground, or highlight colors
These things all depend on the browser, which may allow the user to control them.
Autoflowing and Autowrapping
The most basic element in the HTML document is the paragraph. The Web browser flows all the
contents of the paragraph together from left to right and from top to bottom given the current
window or display size. This is called autoflowing. How you break lines in that paragraph in the
HTML is irrelevant when that page is displayed by a Web browser.
The Web browser wraps anything that doesn't fit on the current line, putting it on the next line. For
example, a paragraph that displays six lines long on an 8-inch wide window rewraps to be about
12 lines long if the user resizes the Web browser window to be half as wide. This is called
autowrapping.
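Autoflowing and autowrapping can be imitated with Python's standard textwrap module. This is only an analogy under stated assumptions: a real browser measures text in pixels rather than characters, and the render function and the widths of 60 and 30 characters are chosen arbitrarily for the example.

```python
import textwrap

# The line breaks in this source text are deliberately arbitrary.
source = """The Web browser flows all the
contents of the paragraph together from left
to right and from top
to bottom given the current window or display size."""

def render(text: str, width: int) -> list[str]:
    """Lay a paragraph out in a 'window' that is width characters wide."""
    flowed = " ".join(text.split())      # autoflow: ignore source line breaks
    return textwrap.wrap(flowed, width)  # autowrap: break at the window edge

wide = render(source, 60)
narrow = render(source, 30)

print("\n".join(wide))
print("-" * 30)
print("\n".join(narrow))
```

Halving the width produces more, shorter lines from exactly the same words, and changing where the lines break in source has no effect on either result.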
Your document will be read by both graphical and character-based Web browsers. Furthermore,
there will be display differences with graphical Web browsers given different screen resolutions.
So just because one browser breaks a line at one place, that doesn't mean others will do so at the
same place. Just remember that on the Web, you live in a world that is left-justified and flows from
top to bottom.
HTML Tag Syntax
When writing HTML, you add "tags" to the text in order to create the structure. These tags tell the
browser how to display the text or graphics in the document. HTML tags are encapsulated within
less-than (<) and greater-than (>) brackets. Some of the tags are single-element tags that can stand
by themselves. These are referred to as standalone tags. The syntax is simple:
<tag>
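A common standalone tag is <BR>, which forces a line break. Older documents also use <P> by
itself to separate paragraphs; in current HTML, <P> starts a paragraph and its ending tag </P> is
optional.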
Other tags are used in pairs. The beginning tag tells the Web browser to start the tag function and
the ending tag tells the Web browser to stop. The ending tag is created by adding a forward slash
(/) to the beginning tag. The syntax is:
<tag>object</tag>
The tag identifies the function that is being applied to the object. For example, if you wanted to
add special emphasis to a phrase, you would encapsulate the phrase with the <EM> tagging pair
as illustrated:
<EM>text to emphasize</EM>
Many of the standalone tags and the beginning tag of tagging pairs can have options, known as
attributes, included. So, to be complete, the syntax is:
<tag option1 option2 option3>
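To see how software reads this syntax, the sketch below feeds one line of HTML to Python's standard html.parser module and records what it finds. The TagLogger class and the sample line are assumptions written for this illustration. Note that this parser reports tag and option names in lowercase, because HTML tag names are not case-sensitive.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record each piece of markup the parser sees, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs holds the tag's options as (name, value) pairs.
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))

logger = TagLogger()
logger.feed('<P ALIGN="center">Read <EM>this</EM> first.<BR>')
logger.close()
for event in logger.events:
    print(event)
```

The EM pair produces both a start and an end event around its text, while the standalone BR tag produces a start event only.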
Document Construction Guidelines
Now let's look at the three tagging pairs used to create the highest level of structure in an HTML
document:
<HTML> entire HTML document </HTML>
<HEAD> document header information </HEAD>
<BODY> body of the HTML document </BODY>
The following is a skeletal HTML document that shows the required nesting of these three
tagging pairs:
<HTML>
<HEAD>
Head elements
</HEAD>
<BODY>
Body elements and content
</BODY>
</HTML>
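The nesting rule can be checked mechanically: every ending tag must match the most recently opened tag that is still open. The Python sketch below does this with a stack. It is a deliberately simplified check written for this lecture, not a full HTML validator: the names NestingChecker and is_well_nested are invented, and it assumes every tagging pair is closed explicitly, although real HTML lets some ending tags be omitted.

```python
from html.parser import HTMLParser

class NestingChecker(HTMLParser):
    """Track open tags on a stack and note any tag closed out of order."""

    # Standalone tags that never have an ending tag (partial list).
    VOID = {"br", "hr", "img", "meta", "link", "base"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

def is_well_nested(document: str) -> bool:
    checker = NestingChecker()
    checker.feed(document)
    checker.close()
    # Well nested means no mismatches and nothing left open.
    return not checker.errors and not checker.stack

good = "<HTML><HEAD></HEAD><BODY><P>Hello</P></BODY></HTML>"
bad = "<HTML><HEAD></BODY></HEAD></HTML>"
print(is_well_nested(good), is_well_nested(bad))
```

The second sample fails because </BODY> arrives while HEAD is still the innermost open element.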
The Header
The HTML header contains several notable items which include:
1. doctype - This gives a description of the type of HTML document this is. Strictly speaking, it
is placed before the <HTML> tag rather than inside the header.
2. meta name="description" - This gives a description of the page for search engines.
3. meta name="keywords" - This line sets keywords which search engines may use to find
your page.
4. title - Defines the name of your document for your browser.
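The sketch below shows roughly how a search engine or other program might read these header items, using Python's standard html.parser module. The HeadReader class and the sample page text are assumptions made up for the illustration; the doctype line is the standard HTML 4.01 Strict declaration.

```python
from html.parser import HTMLParser

class HeadReader(HTMLParser):
    """Collect the title and the named META values from a document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical sample page, for illustration only.
sample = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<title>HTML Document Structure</title>
<meta name="description" content="A sample page for this lecture.">
<meta name="keywords" content="html, structure, sample">
</head><body></body></html>"""

reader = HeadReader()
reader.feed(sample)
reader.close()
print(reader.title)
print(reader.meta)
```

None of these items appears in the page body. They describe the document to browsers and search engines rather than to the reader.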
Elements in the Header
Elements allowed in the HTML 4.0 strict HEAD element are:
• BASE - Defines the base location for resources in the current HTML document. Supports
  the TARGET attribute in frame and transitional document type definitions.
• LINK - Used to set relationships of other documents with this document.
• META - Used to set specific characteristics of the web page and provide information to
  readers and search engines.
• SCRIPT - Used to embed script in the header of an HTML document.
• STYLE - Used to embed a style sheet in the HTML document.
• TITLE - Sets the document title.
HTML BODY
The HTML body element will define the rest of the HTML page which is the bulk of your
document. It will include headers, paragraphs, lists, tables, and more.
An example body section:
<body text="#000000" bgcolor="#FFFFFF" link="#0000FF" vlink="#000080"
alink="#FF0000">
<h1 style="text-align: center">HTML Document Structure</h1>
<p> This is a sample HTML file. </p>
</body>
</html>
This example sets the text, background, and link colors directly with body attributes rather than
using style sheets.
The BODY Element Tags and Attributes
The <body> tag starts the BODY element and the </body> tag ends it. The BODY element contains
all of the content displayed on the web page. Its tags and attributes are:
<body> - Designates the start of the body. It can carry the following attributes:
• ONLOAD - Used to specify the name of a script to run when the document is loaded.
• ONUNLOAD - Used to specify the name of a script to run when the document exits.
• BACKGROUND="clouds.gif" - (Deprecated) Defines the name of a file to use for the
  background for the page. Alternatively, a background color can be specified as in the
  following line.
• BGCOLOR="white" - (Deprecated) Designates the page background color.
• TEXT="black" - (Deprecated) Designates the color of the page's text.
• LINK="blue" - (Deprecated) Designates the color of links that have not been visited.
• ALINK="red" - (Deprecated) Designates the color of a link while it is being activated
  (clicked).
• VLINK="green" - (Deprecated) Designates the color of visited links.
</body> - Designates the end of the body.