e530 opening.ppt

Principles of Searching
[17:610:530] or ‘e530’ for short
Overview of the course
and a bit of history
© Tefko Saracevic
Table of contents
1. Summary of sundry
requirements
2. Basic definitions
3. Syllabus
4. Why? Rationale & objectives
5. What? Themes and topics
6. How? Goings on
7. A bit of history
1. Summary of sundry
requirements
Described in detail in:
“Before the start: what you need to
have and know, and how to get it”
and in the Syllabus
and in eCollege tutorials
(follow the links there)
Before
• Prerequisite courses: none
– but this course is a pre- or co-requisite for
many other courses
• Have a Rutgers University Computing
Services (RUCS) email account (NetID)
– full access to online resources in Rutgers
University Libraries (RUL) requires using
your Rutgers NetID
– get a RUL card for other library services
– but you can use any email address for
course communication
• Know how to use RUL
– particularly use from home & use of
electronic resources e.g. getting journal
articles
– many instructions on RUL site
• Have a DIALOG account
– will get it from the instructor
– will get other accounts as time goes by
Required competencies
• eCollege:
– please take the eCollege tutorial before
the course
• Email: (of course)
– be comfortable with it, including attachments
• Word & PowerPoint (also "of course")
– take tutorials, as necessary
– (I am still taking them when I need to
finesse something)
• Computer, internet, the web
– be comfortable, take tutorials
– e.g. logins, file transfer, download
• Rutgers has many computing
services for students,
– including myRutgers, a personalized
portal
– explore and use them
How to get them?
• A must: MLIS bootcamp tutorials
– created by MLIS students for "MLIS
students on some of the many
technical skills that they will need in
order to have a successful school
year." Even if you know them, review them!
• Required competencies could be
gained and sharpened through
MLIS and Rutgers tutorials, as well
as many other online tutorials
• Please review your competencies
through these tutorials!
– these topics will NOT be covered in
the course, but are assumed. FULLY!
• Links to all are in the course
documents mentioned above
Communication
• Email
– through eCollege email functions
• whole class
• one-on-one
• by group
• eCollege Chat
– discussion room or rooms
• groups have own chat room
• could be on different topics
• could be private
– ClassLive - a live chat room
• By phone
– instructor will provide times when available
• In person
– drop in to SCILS and see me, or let's meet
at some conference or event
Coursework
• Course Home Page
– announcements
– Course Checklist
• Class Lounge
– like a blogspace; use for blog
– introduce yourself
• Threaded discussion
– simulates class discussion
– depends on module & topic
• discuss, reply, comment…
• Dropbox
– submitting & retrieving
assignments
– graded assignments returned
Coursework (cont.)
• Journal
• place where you can make notes &
record thoughts
• option of sharing
• Document sharing
• uploading & downloading documents
by instructor & students
• but other documents from RUL
directly
• Webliography
• relevant sites submitted by instructor
& students; could be annotated
• Calendar
• schedule of course events
• Gradebook
• provides grades & comments
Student groups
• Groups of three or four colleagues
will be formed
– each group will have a letter
designation & a name you choose
– in addition to group chat room and
email you can work out among
yourselves a method for
communication, exchanges
• Why groups?
– foster easier and multiple exchanges
– form a small discussion assembly
– help each other, raise questions,
explain, discuss … outside of more
formal channels
• Groups will work together as necessary
& should cooperate as to exercises
• A group will present some of the results
together
2. Basic definitions
These are really basic,
and many more will be
presented during the
course and found in
readings
prin′ci′ple [prinsəp′l]
(noun)
(courtesy of Encarta Dictionary)
1. basic assumption
an important underlying law or
assumption required in a system
of thought
2. ethical standard
a standard of moral or ethical
decision-making
3. way of working
the basic way in which something
works
4. source
the primary source of something
All fit this course, but which one fits best?
sear′ching, search [surch]
(verb, noun, adjective)
1. penetrating or probing
observing acutely or examining
thoroughly
2. examine thoroughly
to look into, over, or through
something carefully in order to find
somebody or something
3. examine computer file
to examine a computer file, disk,
database, or network for particular
information
4. discover something by
examination
to discover, come to know, or find
something by examination
All fit, but no. 3 fits this course particularly well
3. Syllabus
Only an outline is given here.
The document is long and detailed.
Basic to everything we are doing.
Worth a periodic consultation,
particularly as to assignments, final
project, formats, bibliography,
etc etc etc
Content of syllabus
• Course description
– as in the catalog
• Rationale of the course – Why?
– motivation and justification for the course
• summarized in next section
– all course sections (modules) start with
a Why? and then go on to What? and
How?
• Before the start: what you need to
have and know, and how to get it
– we already went through this
• Course purpose and objectives
– summarized in section 4
• Organization of the course
– summarized in section 5
syllabus cont.
• Coursework
– summarized in section 6
– answers FAQ, so before asking it is
good to consult syllabus
• Method of assessment
– how you will be graded
• Academic integrity
– Rutgers policy and statements on
Student responsibility and Faculty
responsibility
– Plagiarism policy
• Bibliography
– readings and how to obtain them
4. Why? Rationale &
objectives
Now we are finally
getting to the stuff that
the course is all about
Why do we have this
course?
• Details in syllabus & course outline,
summary here
• Number & variety of information
resources is HUGE
– growing at a very high rate - called
“information explosion”
• Great many people search for
information
– few do it well
– even fewer know how well they are doing
• As professionals, librarians were
always concerned with searching for
information on behalf of users
– with the advent of electronic information
resources and the web, searching has
changed in many ways
Why? cont.
• Searching has become a complex
process involving interaction
between
people, information, & technology
• A professional understands the
complex processes & interactions
involved in searching and puts
them effectively into practice
• You are asking:
– How do I search effectively and
efficiently a variety of information
resources for users?
– How do I evaluate what was searched
and provided?
Course objectives
Integrated understanding of:
• Content: Subject, structure &
vocabularies of information resources
• Systems: Models of information retrieval
(IR) systems, search engines & digital
libraries as used in searching
• Human-human interaction: User
information seeking as the context for
searching; mediation & interviewing
• Human-computer interaction:
Principles for effective searching &
variations in search strategies & tactics
• Results: Alternatives in presentation of
results to users; evaluation of results
• Professional concerns: Ethical
norms & life-long learning.
In order to search you
need an understanding of:
Content: What is in the sources?
How is it organized?
Systems: Where? What kinds?
IR, web, dig libraries...
Human-human interaction:
How? You and the user
Human-computer interaction:
How? You and the computer
Results: What & how to
present; evaluate
Professional conduct: ethics …
Symbolically ...
[Diagram with labels: Content, System, HHI, HCI, Results, ?, Professionally]
What will the course
NOT do?
• Create professional searchers
or “extreme searchers” out of
you
• Make you an expert on
databases, systems,
information retrieval, search
engines, the web
What will the course
DO?
• Provide you with a practical &
theoretical foundation and
framework on basis of which
you can then:
– develop into a professional
searcher or technical assistant
to users
– grow & evolve with the field
– adjust to inevitable changes in
the world of searching
– eventually, depending on your
other courses & life-long
learning, become an expert
About the course
• It is demanding
– but so is searching as professional
work
• It is challenging
– but so is searching
• There is a lot of thinking
• There is a lot of work
• But there is a lot
– that can be learned
– that can be used in practice
• in other courses
– that will stay with you throughout your
career
– upon which you can build
• And the course is rewarding
– and so is searching professionally
5. What?
Themes and modules
Organization of the
course as to coverage
with emphasis on
modules as basic units
Organization
• Semester lasts 16 weeks
• Course has 16 modules – one for
each week of the semester
• Modules are grouped into themes
– there are 8 themes following
objectives:
A. At the start (module 1)
B. Content (modules 2 & 3)
C. Systems (modules 4, 5 & 6)
D. Human-computer interaction (modules 7, 8 & 9)
E. Human-human interaction (modules 10 & 11)
F. Results (modules 12 & 13)
G. Professional concerns (modules 14 & 15)
H. At the end (module 16)
Modules
Each module has an outline as
to:
• Title of the module
• Why? the rationale for
presenting this module and
questions you should ask
• What? a list of topics covered
in the module
• How? presentation and tasks
for the module
– elaborated in section 6
Topics covered
Theme A: AT THE START
Module 1. Overview of the course and
a bit of history
B. CONTENT
2. Types and structures of information
resources
3. Types and structures of
vocabularies
C. SYSTEMS
4. Information retrieval
5. Interaction in information retrieval
6. Search engines. Digital libraries
Topics covered (cont.)
D. HUMAN-COMPUTER
INTERACTION
7. Search techniques and
effectiveness
8. Advanced searching
9. Web search and the invisible
web
E. HUMAN-HUMAN
INTERACTION
10. Information seeking. User
modeling
11. Mediation between search
intermediaries and users
Topics covered (cont.)
F. RESULTS
12. Evaluation of search sources
and results
13. Presentation to users
G. PROFESSIONAL CONCERNS
14. Ethics. Competitive intelligence
15. Keeping up: sources for lifetime learning
H. AT THE END
16. Student presentations and
conclusions
6. How? Goings on
Coursework:
Ways and means we are
going about doing the
course
AND schedules
Mix
• The course is a mix of
– theory
– experimentation
– practice
• Why theory?
– base for further understanding &
professional development
• knowing theory separates learning from
“training”, a professional from a technician
or paraprofessional
– nothing more practical than a good
theory
– theory endures through changes in
systems & software
• theory makes learning new systems easier
– theory helps with understanding &
helps learning “stick”
Structure of coursework
• Each module has:
1. a lecture on the module topic
2. assignments as to readings
3. exercises for searching
4. tips for thought
• There is also a term project
– a semester long task focusing
on providing a search service
to a selected user
Schedule
• Assignments and exercises for
each module are done on a
weekly basis starting Monday,
due on the next Monday
• The semester long term project
is due on the Monday after the
last class week, with two
progress reports as scheduled
(1/3 and 2/3 into the semester)
• Schedule is provided on
course site
Lectures
• Each module has a lecture on
the topic
– lectures are in PowerPoint
– best viewed if downloaded & then
run on own computer
• go to Doc Sharing; Select View:
Lectures; & open, save from there
– most lectures contain some links to
other sites, providing further
explanation, examples, or
resources
– some lecture slides have notes
with further explanatory text
• terms/phrases that have a * (asterisk)
have associated notes
Assignments
• Assignments refer to READINGS
ONLY
– associated with module topic and
lecture
– some readings are required – they
have to be summarized and
summaries turned in
– other readings are for read-only
and discussion or reference
• Full citation to readings is in the
bibliography
• Readings are either at RUL, on
class web site, or the web
– sometimes you will have to search
• (after all this is a searching class!)
Summaries
(for required readings only)
• Provide a brief synthesis of main
ideas, facts AND
– possibly a critical review e.g
– relate to (points added for this):
• relevant personal, professional
experiences with library & information
services; examples
• translation/implication for practice
• other readings, topics, courses,
project, exercise and/or
• raise questions for discussion and
discuss with group
• Format, style:
• format as prescribed in syllabus
• but style of summaries is your choice
Tips for summaries
FORMAT:
• Start with heading as prescribed
– points deducted if not
• Use APA style
• Two to three pages maximum
• Use 12 point font
– single space
– 1 inch margins
• Submit on time
CONTENT:
• React to readings
• Tie in with practice
• Integrate w/ other knowledge, experience and course work
• Demonstrate thought & learning
• Include questions and criticism
• Do not merely summarize readings
Exercises
• Purpose: to obtain practical training
in a variety of systems
– the purpose is NOT to teach you a given
system, but to provide searching
experiences that can be generalized &
later sharpened, improved
• On a weekly basis as assigned
– using DIALOG, LexisNexis, web, search
engines, digital libraries …
– or search for answers for given
questions
– or use a variety of tactics & features
• Work cooperatively in groups
• At times independent of lecture topic
– but has its own logic in progression
Examples of first few
exercises
• Involves DIALOG*
• Take DIALOG tutorials
• LEARN & PRACTICE:
– Contents of databases
– Structure of databases &
records - BLUE SHEETS
– Basic search commands
– Basic output commands
– Logical operators, execution
– Truncation
– Searching in fields
– DIALINDEX; OneSearch
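To make the ideas of logical operators and truncation concrete before the DIALOG tutorials, here is a minimal sketch in Python. It is a toy illustration only: it is not DIALOG syntax and says nothing about how DIALOG works internally. The four records, the function names, and the use of a trailing "?" as this sketch's truncation marker are all assumptions made up for the example.

```python
# Toy illustration only: not DIALOG syntax and not how DIALOG is implemented.
from collections import defaultdict

# Hypothetical records, invented for this example.
RECORDS = {
    1: "library automation and online catalogs",
    2: "information retrieval in digital libraries",
    3: "retrieving chemical information online",
    4: "history of the public library",
}


def build_index(records):
    """Map each word to the set of record numbers that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index


def lookup(index, term):
    """Return record numbers for a term; a trailing '?' means truncation."""
    term = term.lower()
    if term.endswith("?"):
        stem = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(stem):
                hits |= ids
        return hits
    return set(index.get(term, set()))


INDEX = build_index(RECORDS)

# AND narrows: records that match both terms.
and_result = lookup(INDEX, "librar?") & lookup(INDEX, "online")
# OR broadens: records that match either term.
or_result = lookup(INDEX, "retriev?") | lookup(INDEX, "automation")
# NOT excludes: records that match the first term but not the second.
not_result = lookup(INDEX, "librar?") - lookup(INDEX, "digital")

print(sorted(and_result), sorted(or_result), sorted(not_result))
```

Each search term produces a set of record numbers, and the logical operators are simply set intersection, union, and difference on those sets. Truncation widens a single term ("librar?" picks up both "library" and "libraries") before the operators are applied.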
Tips for thought
• Informal
– questions, ideas to be pondered
on your own
– guidelines for further learning &
exploration on your own
– sometimes things to lighten up
• You can contribute
• Can be used in group
discussion
• But there is nothing that is
required, nothing to turn in
Term project purpose
• A reality exercise designed to
give you in-depth experience of
the kind you will encounter in your
professional life
– involves every aspect of
searching from start to end
• Experiences to be shared
among classmates, so that you
can learn from each other
• It will take time and effort, thus
do NOT procrastinate
Term project
• Select a specific user with an information
need that calls for an online search
– no family or significant others*
• Interview the user
– if necessary several times with
feedback
• Construct a user model
• Select resources for searching
• Construct search strategies &
conduct searching - reiterate
• Organize results for presentation
• Present results to user; evaluate
• Write a technical report
Term project
deliverables
There are two:
1. A report to the user
• suggest you follow presentation
guidelines as suggested in
module 13
• does NOT have to be presented
to the instructor or class – it is
between you and your user!
2. A technical report to the
instructor
• discussed next and in the
syllabus at length
Technical report
(details in the syllabus)
• Selection of user: who?
• User question & model
– what task? how much does the user know?
what topics? terminology? priorities?
• Mode & results of interviews
• Summary of search tactics &
approaches, dynamics
• Changes in user model, user
definition of problem
• Changes in searching & you
• Evaluation of your effort & learning
– what does or does not work?
– what effects of decisions?
– what would you do differently?
– this section VERY important!
7. A bit of history
A short chronology
rather than history
Antecedents
• Europe before WWII
– strong documentation movement
• Universal Decimal Classification,
indexing of scientific literature
• In the US right after WWII
concern about information
explosion, particularly in science
– Vannevar Bush’s classic article “As
we may think” in Atlantic Monthly in
1945 stirred imagination & funding
• problem: “the massive task of making
more accessible a bewildering store
of knowledge.”
• solution: use of new technology,
“Memex” as idealized model
• can you find it?
Beginnings
• NSF acts of 1950 & 1958 mandate
support for scientific information
– to this day supports research &
development in this area, including
digital libraries
– sparked involvement from many fields
& many funded projects
• 1951 Calvin Mooers coined term
“information retrieval” (IR)
• 1950’s mechanized IR systems
emerged
• Societies and conferences
emerged related to problems of IR
and broader issues
Onto the real world
• 1960s saw computer applications for
IR blossoming
• Also library automation emerged, incl.
MARC
• Late 1960’s: Medline, the online
version of MEDLARS (Nat. Libr. of
Medicine) came out
– this was online way before the internet &
web
• Early 1970’s: DIALOG and ORBIT
established – commercial online
vendors (ORBIT later merged into
other vendors)
• Professional searching grew at high
rate
Research
• In the 1960s Gerard Salton & his
students in computer science
pioneered research into advanced IR
methods
– addressed technical or system side of IR
– great many good results over decades
– it took decades before the results were applied
commercially, but today all vendors &
search engines use them
– continues to this day internationally
– particularly under TREC (Text Retrieval
Conference) (find it?)
• Research and IR still closely
connected
– source of advances
Research (cont.)
• 1970s & 80s also saw emergence of
research dealing with the human
(user) side of IR
– addressed users, use of information & IR
systems
– basic notions, such as relevance
• From the 1990s to the present, research areas include:
– interaction in IR, or human-computer
interaction
– information seeking
– human information behavior
• Human and system sides of research
do not mesh well
– still & unfortunately
Net
• Internet first went live in 1969 as
ARPANET, an inter-university net
– in 1983 switched to the TCP/IP protocol, still in
use today – i.e. present internet was born
– in 1990 became NSFnet, broadening reach
significantly
– in 1992 NSF pulled out & offered to broad
public & commercial use
• By 1980s it became a force
– by 1990’s it took the world
• In 1991 Tim Berners-Lee developed
world wide web
– in 1993 first browser developed (Mosaic to
become Netscape)
– became fastest growing & spreading
technology in history
• Search engines
– Yahoo launched in 1994 & Google in 1998
– affected searching enormously
– today over 3000 search engines in over
150 countries
Digital libraries
• Emerged in mid 1990s
• Involved
– massive research programs (still
going on)
– massive investments by libraries
• changed the library landscape
• particularly as to access & searching
– the two don’t communicate much
• Brought together IR & libraries
• Today vast international presence
– many institutions in addition to libraries
involved, e.g. museums, societies
• Major resource (& headache) for
searchers
– large variety of texts, images, sounds
digitized all over the world
Future?
A perspective: searching
is a journey of discovery
another perspective …
still another perspective