Structure.ppt

Types & structures of
information resources
What is out there for searching and
what’s under the hood?
© Tefko Saracevic
Definitions
• resource – Encarta Dictionary
“Source of help
…somebody who or something that can be used
as a source of help or information
… adeptness at finding solutions to problems”
• database – Webopedia.com
“A collection of information organized in such a
way that a computer program can quickly select
desired pieces of data. You can think of a
database as an electronic filing system.”
Definitions (cont.)
• Information databases are organized by fields,
records, and files. A field is a single piece of
information; a record is one complete set of fields;
and a file is a collection of records. For example, a
telephone book is analogous to a file. It contains a
list of records, each of which consists of three
fields: name, address, and telephone number.
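The telephone book analogy can be sketched in a few lines of Python. This is only an illustration: the class name `Record`, the helper names, and the sample entries are made up for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Record:
    # One record = one complete set of fields
    name: str
    address: str
    telephone: str


# A file = a collection of records (here, a tiny made-up telephone book)
phone_book = [
    Record("Ada Example", "1 Main St", "555-0100"),
    Record("Ben Sample", "2 Oak Ave", "555-0101"),
]


def field_names(record_type):
    """List the fields that every record of this type carries."""
    return [f.name for f in fields(record_type)]


def find_by_field(file, field, value):
    """Return the records whose given field equals the value."""
    return [r for r in file if getattr(r, field) == value]
```

Calling `find_by_field(phone_book, "telephone", "555-0101")` picks out the one record whose telephone field matches.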
Relations
• Terminology can be confusing & not consistent - so
beware & do your own translation
– Provider: a producer of databases; there are a great many
providers covering many fields
• e.g. Dept. of Education produces ERIC – abstracts & indexes
educational materials (articles, reports)
– Vendors or aggregators: organizations or companies that
get databases from providers & organize them for
searching; there are a number of vendors; some providers
are their own vendors
• e.g. DIALOG gets over 400 databases from a variety of providers
(among them ERIC) & then organizes them for searching
Example of a vendor: DIALOG
• acquires databases from information providers at a
fee
• organizes content according to given structures
• describes the content
– done in Bluesheets, a most important search tool for you
• provides given searching capabilities
– you have to master them for effective searching
• creates some files of its own – e.g. super indexes
• provides you access at a fee
– there is no such thing as a free lunch
BTW – why DIALOG?
• Why do we use DIALOG for so many
exercises? Several reasons:
– oldest and largest surviving vendor
– most comprehensive set of databases
– has a well developed instructional program
– but most importantly: serves as a good test bed to
develop searching skills that are generalizable
– what you will systematically learn from using
DIALOG can be translated to all searching
• & you get an insight into problems with searching
Other vendors/aggregators
• A good number of other
vendors are around
• confusing?? wait, there is
more…
– the landscape is constantly
changing
– some available through RUL
– examples (examine!)
• LexisNexis
• Factiva
• ScienceDirect
• EBSCOhost
• Ingenta … and on
– some incorporate
databases from producers,
others create their own
databases from a myriad of
sources
Types of information databases
• Many types are
available:
– Bibliographic
– Numeric
– Full text
– Directory
– Image
– Sound
– Multimedia
– Real time
• Some that are in
DIALOG are also
available elsewhere or
on their own
• Some vendors have
exclusive rights to
some databases
• Many you can find in RUL
Examples of databases
• Over 200 available at RUL – examples that
are relevant to library and information science
• Library and Information Science Abstracts
• Library Literature and Information Science
• Information Science and Technology Abstracts
• ERIC
• IEEE Xplore
• ACM Digital Library
• but others also cover materials of interest, e.g.
– Web of Science
– INSPEC
a BIG problem
• In DIALOG & with some other vendors you can
search a number of databases at the same
time – so-called federated searching
– or in DIALOG search Dialindex – a meta index
of databases
• BUT in RUL & elsewhere there is no
federated searching
– you have to search each database separately
• someday there will be federated searching, but at
present do not hold your breath
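To make the idea concrete, here is a minimal sketch of federated searching: one query is run against several databases and the hits are merged, each tagged with its source. The database names `DB_A` and `DB_B`, the records, and both function names are hypothetical, and the simple substring match stands in for whatever a real vendor does.

```python
def search_one(records, term):
    """Search a single database: keep records whose title contains the term."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


def federated_search(databases, term):
    """Run the same query against every database and merge the hits."""
    hits = []
    for db_name, records in databases.items():
        for record in search_one(records, term):
            # Tag each hit with the database it came from
            hits.append((db_name, record["title"]))
    return hits


# Two made-up databases, for illustration only
databases = {
    "DB_A": [
        {"title": "Relevance in information retrieval"},
        {"title": "Cataloging rules"},
    ],
    "DB_B": [
        {"title": "Measuring relevance"},
    ],
}

results = federated_search(databases, "relevance")
```

Without federated searching you would have to call `search_one` yourself, once per database, and combine the results by hand.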
as you would
imagine …
Now on to structures –
getting under the hood
• Each database type has its own structure
– why? to describe various parts of content for
computers to recognize
• you can recognize that a section of a document is a title,
but a computer has to be told that a title is a title
• so that it can (among other things) search for terms in a title
when you request so
• Parts of documents (or objects in databases)
are labeled as to content or function
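A small sketch of why labels matter: once each part of a record carries a label, a program can search the title alone instead of the whole record. The labels (`title`, `author`, `abstract`), the two records, and the function names are assumptions made for the example, not any vendor's actual scheme.

```python
# Each part of a record carries a label telling the computer what it is
records = [
    {
        "title": "Searching online databases",
        "author": "Doe, J.",
        "abstract": "An overview of vendors.",
    },
    {
        "title": "Library users",
        "author": "Roe, R.",
        "abstract": "A study of searching behavior.",
    },
]


def search_in_field(records, label, term):
    """Look for the term only in the part carrying the given label."""
    term = term.lower()
    return [r for r in records if term in r.get(label, "").lower()]


def search_anywhere(records, term):
    """Look for the term in every labeled part of the record."""
    term = term.lower()
    return [r for r in records if any(term in str(v).lower() for v in r.values())]


title_hits = search_in_field(records, "title", "searching")
all_hits = search_anywhere(records, "searching")
```

Here the title-only search finds one record, while the search across all parts finds both.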
Labeling schemes
• Many structure schemes were developed that prescribed
what to label & what to call the label – meta languages
– by providers, vendors, organizations, authorities
– in different subjects, domains
– for different types of objects
• Meta tags are used on the web – to describe & index
– semantic web is in development, to further enable description
of and searching for meaning
• MARC is a form of meta language
• To use these schemes for effective searching you have no
choice but to get familiar with them
Transparency of structures
• In some databases description of structure is
readily available
– even though it may look forbidding, complicated
…
• good example: Bluesheets in DIALOG
• In others, structure is there but has to be
discovered by surmising
– even in
• But clever, appropriate use of structure in
searching is key to effective searching
Example:
file 438 Bluesheet
Library Literature and Information Science
Describes the
content of the file
file 438 fields
- each is searchable
Sample record:
indicates structure
file 438: fields in Basic Index
The Basic Index is searched by default –
examples of how to search fields in
the Basic Index
file 438: fields in
Additional Indexes
An Additional Index is searched by
indicating the field to be searched –
examples of how to search them
Neat trick:
If you want to search
the latest update only,
add to search
UD=9999
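The difference between a default index and indexes that must be named can be sketched as follows. This is a rough model, not DIALOG's implementation: which fields sit in which index is an assumption here (the real assignments for a file are listed on its Bluesheet), and the function and the records are made up.

```python
# Assumed split of fields between the two kinds of index (illustrative only)
BASIC_INDEX = ("title", "abstract")
ADDITIONAL_INDEXES = ("author", "year")


def search(records, term, field=None):
    """Search the basic index by default, or one named field if given."""
    term = term.lower()
    if field is None:
        targets = BASIC_INDEX          # default: no field indicated
    elif field in BASIC_INDEX or field in ADDITIONAL_INDEXES:
        targets = (field,)             # searcher indicated the field
    else:
        raise ValueError(f"unknown field: {field}")
    return [
        r for r in records
        if any(term in str(r.get(f, "")).lower() for f in targets)
    ]


records = [
    {"title": "Smith charts in engineering", "abstract": "Design notes.",
     "author": "Jones, A.", "year": 2001},
    {"title": "Library use", "abstract": "A survey.",
     "author": "Smith, B.", "year": 2003},
]

default_hits = search(records, "smith")                 # basic index only
author_hits = search(records, "smith", field="author")  # additional index
```

The same term gives different results: the default search finds the record with "Smith" in its title, while the author search finds the record written by Smith.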
file 438: fields in
Limit
Searches can be limited to cover
documents with given attributes –
examples of how to limit searches
file 438: additional
uses of structure
Results can be sorted or ranked by
given fields –
examples of how to sort or rank
results
file 438: options in
displaying of results
Results can be displayed in a
number of ways –
examples of available formats
But watch out!
In real life some
formats are free,
others cost $$$$!
Economics – tail that wags the whole dog
• In class DIALOG searching is free
– & you can use it for class exercises, nothing else
• In real life DIALOG (as every other vendor) has an
elaborate economic structure
– different files have different price tags for use
– time of use is calculated in DialUnits
• a Byzantine structure of charges beyond understanding
– in different files different formats have different
prices attached
• some are really hefty!
Where to find all about structure?
• In DIALOG in BlueSheets
– consult often! and again! and again! and again!
– files have similarities and differences in
structure – BlueSheets show that
• For other vendors:
– some have descriptions similar to BlueSheets
– some have to be dug up & surmised
– in some revelation comes from checking what
is available in advanced searching or in tips for
searching
Structure in search engines
• Mostly not readily apparent
– but all have capabilities to be used in searching
• Again: revelation comes from checking
what is available in Advanced Search, Search
Features, Search Tips, Help, & the like
• Most users do NOT take advantage of
using available structures in searching
– professional searchers do
• part of their tool kit & competencies
Example: structure from Advanced Search
Records
are
structured
at
minimum
by these
fields
Another example: structure from Advanced Search
Records
are also
structured
at
minimum
by these
fields
Similarities & differences
• All vendors & search
engines have a basic
search by default & an
advanced search
– but defaults &
advanced capabilities
differ & have to be
confirmed for each
– once you learn, you
will apply variations on
the theme
Similarities & differences …
• All vendors & search
engines have basic &
advanced Boolean-type search
capabilities
– but how it is done &
bells and whistles differ
– once you master
concepts you can then
do an AHA! when you
encounter a variation
& then translate
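The concept behind Boolean-type searching is the same everywhere, whatever the bells and whistles: AND, OR and NOT are set operations on the lists of documents that contain each term. A minimal sketch, with three made-up documents and a hypothetical `build_index` helper:

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Three made-up documents
docs = {
    1: "information retrieval systems",
    2: "library information services",
    3: "retrieval of images",
}
index = build_index(docs)

# information AND retrieval: documents containing both terms
both = index["information"] & index["retrieval"]
# information OR retrieval: documents containing either term
either = index["information"] | index["retrieval"]
# information NOT retrieval: documents with the first term but not the second
only_information = index["information"] - index["retrieval"]
```

How a given vendor or search engine spells these operators differs, but the translation back to these three set operations is the AHA! part.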
Similarities & differences …
• All vendors & search
engines rank output
results
– but how it is done differs
– DIALOG uses LIFO – Last
in First Out as default, but
also allows for other ways
– search engines use ranking
by relevance, clustering,
PageRank … criteria
• not easy to discern
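To see how the same hits come out in a different order under different ranking rules, here is a small sketch. The LIFO ordering follows the description above (last in, first out). The "relevance" ordering is a deliberately crude stand-in that just counts occurrences of the search term; real search engines use far more elaborate criteria. The records and function names are made up.

```python
# Made-up records; "added" is the order in which they entered the database
records = [
    {"id": "A", "added": 1, "text": "search search search engines"},
    {"id": "B", "added": 2, "text": "library catalogs"},
    {"id": "C", "added": 3, "text": "search tips"},
]


def find(records, term):
    """Return the records whose text contains the term as a word."""
    return [r for r in records if term in r["text"].split()]


def rank_lifo(hits):
    """Last in, first out: most recently added record comes first."""
    return sorted(hits, key=lambda r: r["added"], reverse=True)


def rank_by_term_count(hits, term):
    """Crude relevance stand-in: more occurrences of the term rank higher."""
    return sorted(hits, key=lambda r: r["text"].split().count(term), reverse=True)


hits = find(records, "search")
lifo_order = [r["id"] for r in rank_lifo(hits)]
relevance_order = [r["id"] for r in rank_by_term_count(hits, "search")]
```

The two rules return the same two records but in opposite order, which is why it pays to know which rule is the default.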
Similarities & differences …
• Most users
– do not know or care about
structure
– do not search beyond
default capabilities
– do not look beyond one or
two pages of results
– miss many potentially
relevant results
– do not know what is under
the hood
• Professional searchers
– know that structure is very
much connected to searching
– learn about & use available
structures
– understand defaults & use
advanced capabilities as
necessary
– know “tricks” for not missing
stuff or not getting too much
or too much junk
– explore in order to learn
what is under the hood
In conclusion!
Searching is
more art than
science,
but an art that
needs a lot of
knowledge of what
is behind it