Conquering the Invisible Web - University of Wisconsin Law Library

Conquering the Invisible Web
Presented by Bonnie Shucha
University of Wisconsin Law Library
bjshucha@wisc.edu
August 10, 2005
The Problem
• Most searchers locate only about 0.03% (1 in 3,000) of the Web pages available to them
• Even advanced searchers using the largest search engines can access only about 16% of Web content
Diagrams from http://brightplanet.com/technology/deepweb.asp
Why?
• Because 84% of the information available on the Internet is found only on the “invisible Web,” a.k.a. the “deep Web,” and is not searchable using a general search engine such as Google
[Chart: Invisible 84%, Visible 16%]
Statistics from The Deep Web: Surfacing Hidden Value, http://www.press.umich.edu/jep/07-01/bergman.html
The Invisible Web
The Visible Web
• A visible Web page exists in “static,” or unchanging, form
• It exists as a “physical” file on a computer
  - Most are in .htm or .html format
  - Similar to a word-processed document in .doc or .wpd format
The Visible Web
Static Web pages are considered “visible” because standard search engines can index them and display them as search results
Indexing & the Visible Web
1. Search engine spider crawls the Web, starting with already indexed static pages
2. Spider encounters a link to a new static Web page and follows it (alternatively, a Webmaster registers the new static Web page with the search engine)
3. Spider adds the new Web page to the search engine’s index
4. Content is rendered “visible”
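The indexing flow above can be sketched as a breadth-first crawl. This is only a minimal illustration, not how any particular search engine works: the page names and the LINKS graph are hypothetical, and the Web is modeled as an in-memory dictionary rather than fetched over a network.

```python
from collections import deque

# Hypothetical static site: each page maps to the pages it links to.
LINKS = {
    "home.html": ["about.html", "articles.html"],
    "about.html": [],
    "articles.html": ["new-article.html"],
    "new-article.html": [],
    "orphan.html": [],  # no page links here
}


def crawl(links, already_indexed, registered=()):
    """Return the index after crawling outward from the known pages.

    already_indexed: pages the spider starts from.
    registered: pages a webmaster submitted directly to the search engine.
    """
    index = []
    queue = deque()
    for page in list(already_indexed) + list(registered):
        if page not in index:
            index.append(page)
            queue.append(page)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):  # spider encounters a link
            if target not in index:
                index.append(target)        # page is added to the index
                queue.append(target)        # spider follows the link later
    return index


print(crawl(LINKS, ["home.html"]))
print(crawl(LINKS, ["home.html"], registered=["orphan.html"]))
```

In the first call orphan.html never enters the index because no indexed page links to it. Registering it, as in the second call, is what makes it visible.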
The Invisible Web
• Invisible Web content is “dynamic,” or changing
  - Contains bits of information stored in a database and pulled together on the fly into a Web page at your request
• The page doesn’t exist until you request it
• Similar to a mail-merged document
The Invisible Web
• Database
  Author      | Title             | Publication
  B. Shucha   | Searching Smarter | Wisconsin Lawyer
  J. Doe      | Common Law        | Marquette Law Review
  J.Q. Public | Legal Tech Tips   | ABA Journal
• Dynamic Web Page
  Your search results
  1. B. Shucha, “Searching Smarter,” Wisconsin Lawyer.
  2. J.Q. Public, “Legal Tech Tips,” ABA Journal.
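As a rough sketch of how a dynamic page is pulled together on request, the example below loads the slide's three records into an in-memory SQLite database and builds a results listing only when a query arrives. The table name, the function names, and the title-keyword search are assumptions made for illustration; the slide does not say what query produced its two results.

```python
import sqlite3

# Rows copied from the slide's example database.
ROWS = [
    ("B. Shucha", "Searching Smarter", "Wisconsin Lawyer"),
    ("J. Doe", "Common Law", "Marquette Law Review"),
    ("J.Q. Public", "Legal Tech Tips", "ABA Journal"),
]


def build_database():
    """Create the example database in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (author TEXT, title TEXT, publication TEXT)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", ROWS)
    return conn


def results_page(conn, term):
    """Assemble a results page for one query; nothing is stored as a file."""
    cursor = conn.execute(
        "SELECT author, title, publication FROM articles "
        "WHERE title LIKE ? ORDER BY rowid",
        (f"%{term}%",),
    )
    lines = ["Your search results"]
    for number, (author, title, publication) in enumerate(cursor, start=1):
        lines.append(f"{number}. {author}, “{title},” {publication}.")
    return "\n".join(lines)


conn = build_database()
print(results_page(conn, "Smarter"))
```

The page text exists only as the return value of results_page. A spider that never submits a term never sees it, which is the sense in which such content is “physically” nonexistent.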
The Invisible Web
• Because this content is dynamic, or “physically” nonexistent, most search engines are unable to retrieve it, thereby rendering it “invisible”
Indexing & the Invisible Web
1. Spider crawls the Web, starting with already indexed static pages
2. Spider encounters a database
3. A query is required to access the “dynamic” data
4. Spider is incapable of generating a query
5. Spider stops and cannot index the data in the database
6. Content is rendered “invisible”
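The dead end described above can be illustrated with a toy spider. Everything here is hypothetical: the page names, the "query_form" marker, and the placeholder database record are invented for the sketch, and real crawlers are far more involved.

```python
# Hypothetical site: two static pages plus a search page that fronts a database.
PAGES = {
    "index.html": {"links": ["statutes.html", "search.html"]},
    "statutes.html": {"links": []},
    "search.html": {"links": [], "query_form": "briefs"},
}

# Placeholder database contents; a record is returned only in answer to a query.
DATABASES = {"briefs": {"992588": "placeholder record for docket 99-2588"}}


def spider(pages, start):
    """Index static pages by following links; stop at any query form."""
    indexed, blocked = [], []
    stack = [start]
    while stack:
        url = stack.pop()
        if url in indexed:
            continue
        indexed.append(url)  # the static page itself is visible
        page = pages[url]
        if "query_form" in page:
            # The spider cannot generate a query, so the data stays unindexed.
            blocked.append(page["query_form"])
        stack.extend(reversed(page["links"]))
    return indexed, blocked


def ask(databases, name, query):
    """What a person does: type a query into the database's search box."""
    return databases[name].get(query)


indexed, blocked = spider(PAGES, "index.html")
print(indexed)
print(blocked)
print(ask(DATABASES, "briefs", "992588"))
```

The search page itself ends up in the index, but the record behind it does not. Only the explicit query in ask retrieves it.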
The Invisible Web
• Other types of invisible Web content
  - Very recent static pages that haven’t yet been indexed
  - Password-protected data
Invisible Web Content
• 95% of invisible Web content is free and available to the public
• Quality of content often exceeds that of visible Web content
From The Deep Web: Surfacing Hidden Value, http://www.press.umich.edu/jep/07-01/bergman.html
Invisible Web Content
• Legal & Governmental Materials Available in the Public Domain
  - Case law
  - Statutes
  - Bills
  - Regulations
  - Patents
  - Briefs
  - Census data
  - Government reports
Invisible Web Content
• Business Data
  - SEC filings
  - Stock quotes
  - Company profiles
Invisible Web Content
• General Information
  - Address & phone directories
  - Flight schedules
  - Dictionaries
  - Maps
Invisible Web Content
• NOT freely available on the Web (usually)
  - For-profit publications
  - Public domain documents with editorial enhancements
  - Other material that is someone’s intellectual property
Finding Invisible Web Content
• To find ANY information, consider where an authoritative source might be found
  - Print?
  - Visible Web?
  - Invisible Web?
  - Subscription database?
  - Phone call?
• Next, consider the quickest, most cost-effective way to get the information
Finding Invisible Web Content
• If you determine that it may be available on the invisible Web, how do you find it?
By knowing where to look!
Finding Invisible Web Content
A great deal of excellent legal and business information is freely available on the Internet
Much of it is contained within databases and is, therefore, invisible to most conventional search engines
Finding Invisible Web Content
The most effective way to access this information is to use the database’s own search box
The search box is usually found on a static, visible Web page that is accessible using a conventional search engine
Finding Invisible Web Content
• Search Strategy
  - DON’T search for specific information using a conventional search engine
  - DO use a conventional search engine to search for a database that may contain the information you seek
  - THEN use the search box for that database to search for the specific information
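The first half of this strategy, searching for a database rather than for the answer, can be sketched as a simple keyword match. This is an illustrative stand-in for what a conventional search engine does, not a description of one: the CATALOG entries reuse three resource names from this presentation with paraphrased descriptions, and the word-overlap scoring is an assumption.

```python
import re

# Database names taken from this presentation; descriptions are paraphrased.
CATALOG = {
    "Wisconsin Briefs": "Wisconsin Supreme Court and Court of Appeals briefs",
    "GPO Access": "U.S. Code, C.F.R., and other Government Printing Office documents",
    "Thomas": "U.S. legislation and other congressional information",
}


def words(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_database(query, catalog=CATALOG):
    """Step 1: pick the database whose name and description share the most words with the query."""
    query_words = words(query)
    scores = {
        name: len(query_words & words(name + " " + description))
        for name, description in catalog.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(find_database("Wisconsin court of appeals brief"))
print(find_database("United States Code"))
```

Step 2, searching for the specific document, then happens in the search box of whichever database step 1 turns up, since that is the only place its records can be queried.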
Finding Invisible Web Content
“The point is that often the key to the answer is not locating the answer itself as the first step, but locating the right database in which to search for it.”
Diana Botluk, Mining Deeper into the Invisible Web, http://www.llrx.com/features/mining.htm
Search Exercises
• Attempt to locate the following:
1. Wisconsin Statute 758.01
2. My email address
3. Brief from WI Court of Appeals case, Docket 99-2588
4. The name of the person next to you
5. Subtitle C of the U.S. Internal Revenue Code
6. “Contract Law in Wisconsin” (State Bar of Wisconsin CLE Book)
Reviewing Search Exercises
• Were you able to find the information?
• Was it from an authoritative source?
• What was your search strategy?
• Why did you choose this strategy?
• What were the costs associated with your search?
• Did you choose the quickest, most cost-effective method to locate the information?
Group Exercise - #3
• Locate brief from WI Court of Appeals case, Docket 99-2588
• One strategy:
  - Open Google or another search engine
  - Search for a database that may contain the brief
    “Wisconsin briefs” or “Wisconsin court of appeals brief”
  - In the Wisconsin Briefs database, search for the brief (follow instructions)
    “992588”
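The search term on this slide is the docket number with its hyphen removed. A one-line helper can make that conversion. The function name is hypothetical, and whether the Wisconsin Briefs database accepts other forms is not stated here, so treat this as an assumption and follow the database's own instructions.

```python
def docket_search_term(docket):
    """Keep only the digits of a docket number, e.g. '99-2588' becomes '992588'."""
    return "".join(ch for ch in docket if ch.isdigit())


print(docket_search_term("99-2588"))
```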
Group Exercise - #5
• Locate Subtitle C of the U.S. Internal Revenue Code
• One strategy:
  - Open Google or another search engine
  - Search for a database that may contain the code
    “Internal Revenue Code”
  - On the Internal Revenue Code page, browse to Subtitle C
Group Exercise - #5
• Locate Subtitle C of the U.S. Internal Revenue Code
• Another strategy:
  - Open Google or another search engine
  - Search for a database that may contain the code
    “United States Code”
  - Search the GPO Access U.S. Code database
    “Internal Revenue Code subtitle C”
  - In the IRC section, note the citation to Subtitle C
  - Go back to the search box and enter the citation
    “26USC3101”
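The citation entered in the last step packs the title and section number together with no spaces or punctuation. The sketch below converts a conventionally written citation into that form. The function name and the set of accepted input spellings are assumptions, and the compact format is inferred only from the single example on this slide.

```python
import re

# Accepts spellings such as "26 U.S.C. § 3101" or "26 USC 3101".
CITATION = re.compile(
    r"\s*(\d+)\s*U\.?\s*S\.?\s*C\.?\s*(?:§+\s*)?(\d+[A-Za-z]?)\s*"
)


def usc_search_term(citation):
    """Return the compact form, e.g. '26 U.S.C. § 3101' becomes '26USC3101'."""
    match = CITATION.fullmatch(citation)
    if match is None:
        raise ValueError(f"not a recognized U.S. Code citation: {citation!r}")
    title, section = match.groups()
    return f"{title}USC{section}"


print(usc_search_term("26 U.S.C. § 3101"))
```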
Invisible Web Resources
• Federal Law
  - American Factfinder, http://factfinder.census.gov
    Population, housing, economic, and geographic data from the U.S. Census
  - FedStats, http://www.fedstats.gov
    Statistics from United States government agencies
  - FindLaw's Cases & Codes, http://findlaw.com/casecode/
    Links to databases of federal and state cases and legislation
Invisible Web Resources
• Federal Law
  - FindLaw's Supreme Court Center, http://supreme.lp.findlaw.com/supreme_court/resources.html
    Recent U.S. Supreme Court opinions, orders, briefs, docket, and more
  - GPO Access, http://www.gpoaccess.gov
    U.S. Code, C.F.R., and so on, from the Government Printing Office
  - Thomas, http://thomas.loc.gov
    U.S. legislation and other congressional information
Invisible Web Resources
• Wisconsin Law
  - WisBar State and Federal Legal Resources, http://www.wisbar.org/AM/Template.cfm?Section=Legal_Research
    Links to Wisconsin and federal resources
  - Wisconsin Briefs, http://library.law.wisc.edu/elecresources/databases/wb/index.php
    Supreme Court & Court of Appeals briefs
  - Wisconsin Legislative Drafting Records, http://library.law.wisc.edu/~draftingrecords
    Written materials, letters, and memoranda given to or created by the legislative drafting attorney
Invisible Web Resources
• Wisconsin Law
  - Wisconsin Legislature Infobases, http://folio.legis.state.wi.us/
    Wisconsin statutes, acts, bills, and more
  - Wisconsin Online Court Records, http://www.wicourts.gov/casesearch.htm
    Status information for Wisconsin cases (WSCCA.i & CCAP)
Invisible Web Resources
• Journals, News, and More
  - Badgerlink, http://www.badgerlink.net
    Scholarly & popular journals and newspapers
    Available to Wisconsinites
  - Legaltrac, http://wsll.state.wi.us/enterlt.html
    Index of legal periodicals
    Available to WSLL card holders
Invisible Web Resources
• Journals, News, and More
  - MPL Database for Remote Use, http://www.mpl.org/files/great/bookmark.cfm?Category=82
    D&B Million Dollar Database, CQ Researcher, etc.
    Available to Milwaukee PL card holders
  - Yahoo Search Subscriptions, http://search.yahoo.com/subscriptions
    WSJ, LexisNexis, Consumer Reports, etc.
    Text available by subscription
Invisible Web Resources
• General Invisible Web Directories
  - CompletePlanet, http://www.completeplanet.com
  - Direct Search, http://www.freepint.com/gary/direct.htm
  - ProFusion, http://www.profusion.com
  - Librarian's Index to the Internet, http://lii.org
Presentation based on the article:
Bonnie Shucha, Searching Smarter: Finding Legal Resources on the Invisible Web, Wisconsin Lawyer, September 2004, at 19, at http://tinyurl.com/dthen.
© Bonnie Shucha
Reference & Electronic Services Librarian
University of Wisconsin Law Library
bjshucha@wisc.edu
http://wisblawg.blogspot.com