The Electronic Library and other
means of finding information on
the web
Patrick & Michael Krolak
See: Week 4 in Intralearn or
http://www.cs.uml.edu/~pkrolak/lab18/lab18.html
The Electronic Library
E-libraries are the electronic front
doors to the modern physical
library.
What is the e-library
• The e-library resides on the Internet and
allows the user access to the library’s
catalogs and electronic collections (24
hours a day – 7 days a week).
• The user can use it to locate the books in
the public access catalog.
• Access electronic books and journals.
• Access to newspaper archives such as the
Boston Globe, Boston Herald, etc.
Electronic Journals and Newspaper
Archive
• Most newspapers charge for articles from
their archives. The archives of the major state
newspapers are available through the e-lib.
• The Lexis Nexis sites have current
business, legal cases, and bios.
• Electronic journals can provide abstracts
and in some cases the full article.
Special Electronic Reference
Materials
Every library, whether public or personal, should have a
collection of readily available materials that contain
useful facts and data. Reference materials can be a
quick and accurate means of developing marketing
and business plans, determining business law and
custom, rates of exchange and interest, holidays, etc.
The dictionary can provide useful insights even for the
scholar.
Example of References
• Dictionaries, thesaurus, and similar word orientated materials:
• No college student, researcher, or writer should be without an online
collection of style guides, dictionaries, and related reference
materials.
• Dictionary, English Style Guides, and Reference Materials for the
Writer
• Thesaurus Roget's
• Online dictionaries of over 230 languages, thesaurus, and other
useful word and language tools, by yourdictionary.com
http://yourdictionary.com/
• Quotations:
• An extensive online reference for quotations, with a dictionary that
includes word pronunciations, etc. http://bartleby.com/
The Browser’s Bookmark and
Favorites Lists
• Bookmarks and favorite lists are simple
means of creating a set of useful reference
links…
• Can be used in linking personal sites,
frequently visited sports sites, and
business.
• Travel and entertainment
Public and University e-libs in
Massachusetts
• The Boston Public Library, http://www.bpl.org/
• The CLAMS library network for Cape
Cod: http://www.clamsnet.org/
• The Minute Man Library Network, a consortium of
libraries in the Eastern Massachusetts
region: Route 128 suburban public libraries
and academic institutions.
• Telnet and web links to Eastern Massachusetts
academic institutions.
The Library of Congress
• The Library of Congress (LOC) is the federal
government's main library and is accessible
electronically. The following are some important
links to the LOC:
• Library of Congress Online
Catalogs, http://lcweb.loc.gov/catalog/
• Thomas, the Congressional legislation,
record, and other information,
http://thomas.loc.gov/
• Library of Congress, Collection &
Services, http://www.loc.gov/library/
LOC online catalog of its holdings
• LOC attempts to have
a copy of every book
published.
• It has one of the
largest collections of
books and other
media in the world.
The Virtual Library
The virtual library is an online web site
created by professional librarians for
the professional researcher.
What is the Virtual Library
• It is a collection of sites and materials evaluated
for their content and usefulness and designed to
assist the specialist in finding material in an
efficient manner.
• The virtual library is a catalogue of special
collections of data, documents, and research
and reference tools for information found in web
sites.
• It does not have a physical library of books but
has only electronic documents.
• Finally, virtual libraries are also called Gateways,
Information Portals, and Cyber libraries
Examples of Virtual Libraries
• The WWW Virtual Library was started soon after
the WWW was created by Tim Berners-Lee and
a large group of volunteers. While extensive it is
not uniformly maintained. http://www.vlib.org/
• The University of Michigan's School of
Information, http://www.ipl.org/, seeks to learn
by doing, using its faculty and graduate
students to create a state-of-the-art information
system: an online collection that will grow and
follow leading-edge practices.
Special collections and databases
• Contain information about specific topics. Librarians and research
specialists examine and select material based on relevance,
organize them into databases, and/or create index catalogs.
Scientific data, presidential documents, patents, laws and legal
precedents are examples of special collections.
• Social Science Data Archives (an extensive annotated survey of
global data) http://www.spc.uchicago.edu/SocialClass/archives.html
• IBM Intellectual Property Network http://www.patents.com/ This site
can provide a good starting point but patents have many legal and
technical issues that require the use of research professionals and
highly specialized databases.
• US Patent Office: all US patents since 1790
Google’s Library Project
The library project is part of Google’s Print
Division. Working with publishers and major
university libraries, it intends to digitize
millions of books and periodicals and put them
online as part of its search engine.
Source: http://www.eweek.com/print_article2/0,1217,a=140962,00.asp
From Google’s Point of View
• What is the Library Project?
Google Print makes offline information
searchable. As part of this project, we're
now working to index the book collections
of several major research libraries and
make this content searchable through
Google Print alongside books provided by
publishers through our Publisher Program.
Goals of the Library Project
What is the goal of Google Print for Libraries?
This project's aim is simple: make it easier to
find relevant books. We hope to guide more
users to books – specifically books they might
not be able to find any other way – all while
carefully respecting authors' and publishers'
copyrights. Our ultimate goal is to work with
publishers and libraries to create a
comprehensive, searchable, virtual card catalog
of all books in all languages that helps users
discover new books and publishers find new
readers.
Source: http://print.google.com/googleprint/library.html
Participating Libraries
Each participating library has between 7 and 15 million
volumes.
1. Stanford University (entire collection)
2. Oxford University (digitize all volumes 1900
and before)
3. University of Michigan (entire collection)
4. Harvard University (pilot study of a subset of
the collection)
5. New York Public Library (pilot study of a subset
of the collection)
Strategy based on Copyright laws
Three classes based on copyright status:
1. Books out of copyright will be placed
online in their entirety.
2. Books in copyright whose publisher is a
partner will be placed online in their
entirety.
3. Books in copyright with no publisher
partnership will be given a short reference.
Progress Report
The Google library project is moving forward
but has encountered legal issues on
intellectual property rights.
E-books
The British Library
One of the great libraries of the world has
opened its collections of rare books to be
viewed over the web. It offers digital images of
the original volumes, with resource links for
the scholar and the general public.
Source: http://www.bl.uk/index.shtml
The Treasures of the British Library
• Scan the digital
images of the original
documents of the
British government
• Examine digital
images of rare books
that changed the
world and its
literature, i.e. your
own private tour of
the rare books room.
http://www.bl.uk/treasures/treasuresinfull.html
Searching for Information on
the web
1. Search engines
2. Meta-Search engines
Search Engines
Search engines have two parts:
1. The search, which sends out onto the Internet a software
program called a spider or bot (robot).
• It traces all the links and returns all the pages found.
• The pages are characterized by algorithms and stored in
databases.
2. The retrieval system, which takes a query and maps it
against the databases.
• The retrieval system rank orders the responses by relevance.
• Each search engine uses a unique technique for retrieval and
ranking.
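The two parts can be sketched in a few lines of Python. This is only an illustration, not how any real search engine works: the three-page "web" in PAGES, the names crawl, build_index, and search, and the word-count ranking are all assumptions made for the example.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("library catalog of books", ["b.example", "c.example"]),
    "b.example": ("electronic books and journals", ["c.example"]),
    "c.example": ("newspaper archive", ["a.example"]),
}

def crawl(start):
    """Part 1: the spider follows every link and returns the pages found."""
    seen, queue, found = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        found[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(found):
    """Characterize each page by its words and store word -> {url: count}."""
    index = {}
    for url, text in found.items():
        for word in text.lower().split():
            counts = index.setdefault(word, {})
            counts[url] = counts.get(url, 0) + 1
    return index

def search(index, query):
    """Part 2: map the query against the index and rank by matched-word count."""
    scores = {}
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] = scores.get(url, 0) + count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

index = build_index(crawl("a.example"))
print(search(index, "electronic books"))
```

Real engines differ mainly in the last step: each uses its own way of scoring and ordering the matching pages.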
Meta Search Engines
• Meta search engines are search engines that may use their
own resources to answer the question,
but mostly they form the query from the user input,
package it, and send it off to many other search engines
simultaneously (the process is called spawning), then
wait for the replies to come back.
• After a fixed time the meta takes the responses received
and pulls them together into a report.
• There are many ways to build a meta search on this
idea. Some let you search only the web; others search
newsgroups, newspapers, and scientific journals.
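The spawn, wait, and merge steps can be sketched in Python as follows. The three engine functions are hypothetical stand-ins for remote search engines, and the 0.15 second deadline is an arbitrary value chosen for the example; a real meta search engine would send network requests instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical stand-ins for remote search engines; each returns a list of URLs.
def engine_web(query):
    return ["web.example/" + query, "shared.example/" + query]

def engine_news(query):
    return ["shared.example/" + query, "news.example/" + query]

def engine_slow(query):
    time.sleep(0.5)  # simulates an engine that misses the deadline
    return ["slow.example/" + query]

def meta_search(query, engines, deadline=0.15):
    """Send the query to every engine at once, wait a fixed time, merge replies."""
    pool = ThreadPoolExecutor(max_workers=len(engines))
    futures = [pool.submit(engine, query) for engine in engines]
    done, _late = wait(futures, timeout=deadline)
    pool.shutdown(wait=False)  # do not block on engines that are still running
    report = []
    for future in futures:  # keep submission order so the report is stable
        if future in done and future.exception() is None:
            for url in future.result():
                if url not in report:  # drop duplicates across engines
                    report.append(url)
    return report

print(meta_search("library", [engine_web, engine_news, engine_slow]))
```

The slow engine's reply is simply left out of the report, which mirrors the fixed waiting time described above.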
Meta Search Engines
• Ask Jeeves -- frequently gets the answer in
the first pass. Jeeves allows queries in
natural language.
• Dogpile -- for its variety of sources (web,
newsgroups, newspapers)
• Ixquick
• Metacrawler
• ProFusion
Why is an understanding of how a
search engine works important?
• From the view of a user:
– The user wants to find the information with as few
downloads as possible.
– The easier to use and the more accurate the ranking
the better.
• From the view of a web site developer:
– The developer wants the site to be found in the first 5-10 ranked responses to a query.
– The merit of a web design is often based on the
search rankings. This requires a knowledge of how a given
search engine ranks a page.
Using Search Engines
Forming successful queries using Boolean logic:
• When searching a large-scale database, it is important to be
extremely precise in characterizing the query.
• Avoid using vague or common words that will only produce millions
of pages with little to do with the subject at hand.
• Boolean Logic is based on a proposition either being true or false.
Boolean logic has several operators among them: AND, OR, NOT
that can be used with propositions to determine the truth value of a
complex expression.
• The AND operator, written as a AND b, is true only when the
propositions a and b are both true, and false otherwise.
• OR, written as a OR b, is true when either a or b is true or when
they both are, and false when both are false.
• The negation of a proposition a written as, NOT a, is true when a is
false and false when a is true.
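The three definitions above can be checked with a short truth table. The sketch below uses Python's built-in and, or, and not operators; the function name truth_table is chosen only for this example.

```python
from itertools import product

def truth_table():
    """Return rows of (a, b, a AND b, a OR b, NOT a) for every combination."""
    return [(a, b, a and b, a or b, not a)
            for a, b in product([True, False], repeat=2)]

print("a", "b", "a AND b", "a OR b", "NOT a", sep="\t")
for row in truth_table():
    print(*row, sep="\t")
```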
How do we apply Boolean Logic to refine
our search to only cases of interest?
• Most search engines use some form of keywords and Boolean logic to refine the
query and improve the efficiency and results of the search.
• If you are looking for a proper name, a phrase, or another collection of words
that are normally found together, then enclose them in double quotes, e.g.
"President Gerald Ford".
• If one or more words must all be on the web page, then
use the logical And, e.g. President And Ford And "United States".
• If the web page may have different forms of the name, or titles, etc., then use the
logical Or, e.g. President Or "Vice President" Or Representative And "Gerald Ford".
• If the document should exclude a word or phrase, then use the logical Not, e.g.
"Gerald Ford" Not "Ford automotive" and Not "Ford car" and Not "Ford truck".
• While not Boolean logic, some search engines also allow operators such as NEAR and
FOLLOWED BY to indicate the relationship of words or phrases to
other words and phrases. Normally these relations specify which comes first or
whether a word is within a certain number of words of the first word. This concept is
called proximity logic.
• Not all search engines use the AND, OR, NOT notation; some, like Alta Vista, use
"+" for AND and "-" for NOT.
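As a rough sketch of how these rules filter pages, the Python below applies quoted phrases, And, Or, Not, and a simple NEAR test to three made-up sentences. The function names, the parameter names all_of, any_of, and none_of, and the sample documents are all assumptions for illustration; real search engines parse the query string themselves and differ in the details.

```python
import re

def words(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has_phrase(text, phrase):
    """True if the quoted phrase occurs as consecutive words in the text."""
    doc, target = words(text), words(phrase)
    return any(doc[i:i + len(target)] == target
               for i in range(len(doc) - len(target) + 1))

def matches(text, all_of=(), any_of=(), none_of=()):
    """AND over all_of, OR over any_of (if given), NOT over none_of."""
    return (all(has_phrase(text, p) for p in all_of)
            and (not any_of or any(has_phrase(text, p) for p in any_of))
            and not any(has_phrase(text, p) for p in none_of))

def near(text, first, second, distance):
    """Proximity: True if the two single words are within `distance` words."""
    doc = words(text)
    a = [i for i, w in enumerate(doc) if w == first.lower()]
    b = [i for i, w in enumerate(doc) if w == second.lower()]
    return any(abs(i - j) <= distance for i in a for j in b)

# Made-up sample pages for the example.
docs = [
    "President Gerald Ford addressed the United States Congress",
    "The Ford truck is a popular Ford car in the United States",
    "Vice President Gerald Ford took office in 1973",
]
hits = [d for d in docs
        if matches(d, all_of=["Gerald Ford"],
                   any_of=["President", "Vice President"],
                   none_of=["Ford truck", "Ford car"])]
print(hits)
print(near(docs[0], "President", "Ford", 2))
```

Only the first and third sentences survive the filter; the second is rejected both because it lacks the phrase "Gerald Ford" and because it contains excluded phrases.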
When in doubt ask a librarian:
• The librarian is a trained professional and
is well versed in using the various WWW
resources for finding answers to a vast
array of subjects.
• The librarian should be used for difficult
searches; but the student will wisely
observe, learn, and contemplate the
librarian's techniques, resources, and
methods.
Dogpile for finding non-text
based files
This material is now largely obsolete. Due to
security considerations, most sites do not
have non-text directories that are open to
search and file download. Dogpile still allows
searching for images, audio, and video.
Evaluating and Using the
Information Found
The fact that the information matches your
query is not the end of the trail. The material
may be totally worthless, deliberately
misleading, or even criminal in intent.
Did You Find Gold?
Concepts and questions to consider:
• Who is the author(s) and/or institution that created the
document? Does the author's supplied background support the
needed expertise? Is the organization reputable?
• Is the information timely? Is the information current, or
is it outdated or overtaken by events?
• Who is the target audience? Is it written for the general
public, a group with an emotional or political bias, or
a professional society?
Using the UMass Lowell
E-library