Perspectives of information science in the digital age
Tefko Saracevic, PhD
Rutgers University
USA
http://www.scils.rutgers.edu/~tefko
Information science:
“the science dealing with the efficient collection, storage, and retrieval of information” (Webster)
© Tefko Saracevic, Rutgers University
Organization
1. Big picture – problems, solutions, social place
2. Underlying stuff – theories, phenomena
3. Structure – what is inside stuff
4. Systems stuff – information retrieval, relevance
5. People stuff – users, use, seeking, context
6. Alliances, competition – the OUCH stuff
7. Digital libraries – whose are they anyhow?
8. Conclusions – Will we have a field stuff?
1. The big picture
Problems addressed
Bit of history: Vannevar Bush (1945)
Problem: “... the massive task of making more accessible of a bewildering store of knowledge.”
  still with us & growing
Basic problem of information science: information explosion
  today: PLUS communication explosion
… solution
Bush: “Memex ... association of ideas ... duplicate mental processes artificially.”
Technological fix to problem
Still with us: technological determinant
  tail that wags the dog
Problems & solutions: SOCIAL CONTEXT
Professional practice AND scientific inquiry related to:
Effective communication of knowledge records - ‘literature’ - among humans in the context of social, organizational, & individual need for and use of information.
“modeling the world of publications with a practical goal of being able to deliver their content to inquirers [users] on demand.” White & McCain
Taking advantage of modern information technology
Elaboration
Knowledge records = texts, sounds, images, multimedia ... literature in given domains
  content-bearing structures
  symbol manipulations are content neutral - infrastructural to inf. sc.
Communication = human-computer-literature interface
  study of inf. science is the interface between people & literatures
Inf. need, seeking, and use = raison d'être
Effectiveness = relevance, utility
General characteristics - leitmotifs
Interdisciplinarity - relations with a number of fields
Technological imperative - driving force, as in many modern fields
Information society - social context and role in evolution - shared with many fields
2. Underlying stuff
What is information?
Intuitively well understood, but formally????
Several viewpoints, models
Shannon: source-channel-destination
  grapes into wine
Cognitive: changes in cognitive structures
  water into wine
Social: context is the king
  whatever into wine to get drunk
$K(S) + \Delta I = K(S + \Delta S)$ (Brookes)
Information [structured information] when operating on a knowledge structure produces an effect whereby the knowledge structure is changed
Potential information added (Ingwersen)
Actually, it states the problem –
  “unoperational” in information systems
  involves mental events only
  constructivists rejected it
Information in inf science:
Three senses (from narrowest to broadest)
Inf. in terms of decision involving little or no cognitive processing
  signals, bits, straightforward data - e.g. inf. theory, economics
Inf. involving cognitive processing & understanding
  understanding, matching texts
Inf. also as related to situation, task, problem-at-hand: USERS, USE
For information science (incl. information retrieval):
  third, broadest interpretation
The biggest problem
MEASUREMENT
3. Structure
Specialties (White & McCain)
In descending order of author co-citation (120 authors, 24 years):
 experimental retrieval
 citation analysis
 practical retrieval
 bibliometrics
 library systems, automation
 user studies and theory
 scientific communication
 OPAC’s
 general - other disciplines
 indexing theory
 communication theory
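The ranking above rests on author co-citation: two authors are co-cited when the same citing paper references both. The slide does not describe how the counts were processed further, so the following is only a minimal sketch of the raw counting step. The function name `cocitation_counts` and the toy reference lists are hypothetical illustrations, not data from the White & McCain study.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of authors, how many citing papers cite both."""
    counts = Counter()
    for cited_authors in reference_lists:
        # De-duplicate and sort so each pair is counted once per citing
        # paper and always appears in the same (alphabetical) order.
        for pair in combinations(sorted(set(cited_authors)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each inner list holds the authors cited by one paper.
papers = [
    ["Salton", "Belkin", "Garfield"],
    ["Salton", "Belkin"],
    ["Garfield", "Small"],
]

counts = cocitation_counts(papers)
ranked = counts.most_common()  # author pairs in descending order of co-citation
```

Pairs that are co-cited often are treated as intellectually close. Grouping authors into specialties such as those listed above needs a further clustering step that this sketch leaves out.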
Structure or oeuvres
Two large sub-disciplines:
  “Domain” cluster: analytical study of literatures, their structure, communication, social context, uses
  Retrieval cluster: human-literature interface: IR systems (largest); interaction; library systems, OPACs, user studies
  within each: sub-clusters, eras
    e.g. Salton & post-Salton era
Largely not connected
  some authors in both, migrating
  BUT: lacking integrating works, authors, texts - big payout
Paradigm split in retrieval cluster
Split from early 80’s to date
  System-centered
    algorithms, TREC
    continue traditional IR model
  Human-(user)-centered
    cognitive, situational, user studies
    interaction models, some started in TREC
Calls for user-centered approaches & evaluation
But: most support for system work
  in the digital age support is for digital
Human vs. system
Human (user) side:
  often highly critical, even one-sided
  mantra of implications for design
  but does not deliver concretely
System side:
  mostly ignores user side & studies
  ‘tell us what to do & we will’
Issue NOT H or S approach
  even less H vs. S
  but how can H AND S work together
  major challenge for the future
4. Systems stuff
Information Retrieval
“IR: ... intellectual aspects of description of inf., ... search, ... & systems, machines...”
Calvin Mooers, 1951
How to provide users with useful information effectively?
For that objective:
1. How to organize information intellectually?
2. How to specify the search & interaction intellectually?
3. What techniques & systems to use effectively?
Streams in IR Res. & Dev.
1. Information science:
  Services, users, use;
  Human-computer interaction;
  Cognitive aspects
2. Computer science:
  Algorithms, techniques
  Systems aspects
3. Information industry:
  Products, services, Web
  Market aspects
Problems: ...relative isolation
  ...inadequate cooperation, transfer
IR successfully effected:
Emergence & growth of the INFORMATION INDUSTRY
Evolution of IS as a PROFESSION & SCIENCE
Many APPLICATIONS in many fields
  including on the Web – search engines
Improvements in HUMAN - COMPUTER INTERACTION
Evolution of INTERDISCIPLINARITY
IR has a long, proud history
Broadening of IR
OPACs (Online Public Access Catalogs)
Natural language processing
Summarization
Metadata representations
Text “understanding”
Hypertext, hypermedia
Multimedia - images, sounds ...

image IR, music IR
Many human-computer interactions
Web search engines
5. People stuff
Quite a few areas
Professional services
  in organization – moving toward knowledge management, competitive intelligence
  in industry – vendors, aggregators, Internet
Research
  user & use studies
  interaction studies
  broadening to information seeking studies, social context, collaboration
  relevance studies
  social informatics
User & use studies
Oldest area
  covers many topics, methods, orientations
  many studies related to IR
    e.g. searching, multitasking, browsing, navigation
Branching into Web use studies
  quantitative & qualitative studies
  emergence of webmetrics
Interaction
Traditional IR model concentrates on matching, not on user side & interaction
Several interaction models suggested
  Ingwersen’s cognitive, Belkin’s episode, Saracevic’s stratified model
  hard to get experiments & confirmation
Considered key to providing
  basis for better design
  understanding of use of systems
Web interactions a major new area
Relevance
Effectiveness in IR = relevance
  thus, relevance became a key notion
  and a key headache
A number of studies & reviews on:
  Nature: Framework, base?
  Manifestations: Contexts? Typologies?
  Behavior: Variables? Observations?
  Effects: Use? Evaluation?
Manifestations (types) of relevance
System or algorithmic relevance
  relation between query & objects (‘texts’) retrieved or failed to be retrieved
Topical or subject relevance
Cognitive relevance or pertinence
Situational relevance or utility
  relation between the situation, task or problem at hand & texts
Motivational or affective relevance
  intent, goals, & motivation of user & “texts”
Manifestations interact dynamically
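Of the manifestations listed, only system or algorithmic relevance can be computed directly from a query and a set of texts. The sketch below illustrates that one relation with a plain term-overlap (Jaccard) score. This measure is a stand-in chosen for brevity, not a method named in these slides, and the helper names `tokenize`, `overlap_score` and `rank` are hypothetical.

```python
def tokenize(text):
    """Lower-case a string and split it into a set of whitespace-separated terms."""
    return set(text.lower().split())


def overlap_score(query, text):
    """Jaccard overlap between query terms and text terms, in the range 0.0 to 1.0."""
    q, t = tokenize(query), tokenize(text)
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)


def rank(query, texts):
    """Return the texts ordered from highest to lowest overlap with the query."""
    return sorted(texts, key=lambda t: overlap_score(query, t), reverse=True)


# Hypothetical toy collection for illustration.
texts = [
    "relevance in information retrieval",
    "digital libraries research",
    "cataloging rules",
]
ordered = rank("digital libraries", texts)
```

A score of this kind says nothing about topical, cognitive, situational or affective relevance. Those depend on the user and the situation, which is why the slide separates the manifestations.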
Information seeking
Concentrates on broader context, not only IR or interaction: people as they move in life & work
Number of models provided
  e.g. Kuhlthau’s stages, Vakkari’s problem situation, task complexity
Includes studies of ‘life in the round,’ making sense, information encountering, work life, information discovery
Based on concept of social construction of information
6. Alliances, competition
Relations
With a number of fields...
Strongest:
1. Librarianship
2. Computer science
Librarianship
[Library is]... “contributing to the total communication system in society. Created to maximize the utility of graphic record for the benefits of society... it achieves that goal by working with the individual and through the individual it reaches society.”
J. H. Shera, 1972
Common grounds
IS & librarianship share:
Social role in information society
Concern with effective utilization of graphic & other types of records
Research problems related to a number of topics
Transfer to & from information retrieval
Differences
IS & librarianship differ in:
Selection & definition of many problems addressed
Theoretical questions & framework
Nature & degree of experimentation
Tools and approaches used
Nature & strength of interdisciplinary relations
One field or two?
Point of many debates
Suggest: TWO fields in strong interdisciplinary relations
Not a matter of “better” or “worse” - matters little
  common arguments between many fields
Differences matter in:
  problem selection & definition
  agenda, paradigms
  theory, methodology
  practical solutions, systems
Best example: IR & library automation
Which?
Librarianship. Information science
Library and information science
Libraryandinformationscience
Information science
Information sciences
Information

like in the “Information School”
Computer science
“systematic study of algorithmic processes that describe and transform information ... The fundamental question in computing is: ‘What can be (efficiently) automated?’”
Denning et al., 1989
IS & computer science
CS primarily about algorithms
IS primarily about information and its users and use
Not in competition, but complementary
Growing number of computer scientists active in IS – particularly in IR and digital libraries
Concentrating on
  advanced IR algorithms & techniques
  digital library infrastructure & various domains
  human computer interaction
Human-computer interaction (HCI)
“Human computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them.”
ACM SIGCHI, 1993
Another interdisciplinary area
  computer sc., cognitive sc., ergonomics, ...
Interaction and IS
Two streams:
  computer-human interaction
  human-computer interaction
Modern IR is interactive
  BUT: difference between retrieval engine & retrieval interface
Many studies on:
  machine aspects of interaction
  human variables in interaction
Problem: little feedback between the two
Interaction very hard to evaluate - few methods yet
7. Digital libraries
LARGE & growing area
“Hot” area in R&D
  a number of large grants & projects in the US, European Union, & other countries
  but “DIGITAL” big & “libraries” small
“Hot” area in practice
  building digital collections, hybrid libraries
  many projects throughout the world
Technical problems
Substantial - larger & more complex than anticipated:
  representing, storing & retrieving of library objects
    particularly if originally designed to be printed & then digitized
  operationally managing large collections - issues of scale
  dealing with diverse & distributed collections
    interoperability
  assuring preservation & persistence
  incorporating rights management
Digital Library Initiatives in the US (DLI)
Research consortia under National Science Foundation
  DLI 1: 1994-98, 3 agencies, $24M, six large projects
  DLI 2: 1999-2006, 8 agencies, $60+M, 77 large & small projects in various categories
‘digital library’ left undefined, to cover many topics & stretch ideas
  not constrained by practice
European Union
DELOS Network of Excellence on Digital Libraries
  many projects throughout European Union
    heavily technological
  many meetings, workshops
  resembles DLIs in the US
  well funded, long range
Research issues

understanding objects in DL
representing in many formats
non-textual materials
metadata, cataloging, indexing
conversion, digitization
organizing large collections
managing collections, scaling
preservation, archiving
interoperability, standardization
accessing, using
DL projects in practice
Heavily oriented toward institutions
Association of Research Libraries (ARL) database:
  427 DL projects in 13 countries
  374 in the US
    51% in universities; 24% federal government; 9% historical societies; 6% regional …
  84% are explicitly retrospective; 16% technological
  1 listed from DLI (Illinois)
  no connection with DLI projects
Agendas
Most DL research agenda is set from top down
  from funding agencies to projects
  imprint of the computer science community's interest & vision
Most DL practice agendas are set from bottom up
  from institutions, incl. many libraries
  imprint of institutional missions, interests & vision
    providing access to specialized materials and collections from an institution(s) that are otherwise not accessible
    covering in an integral way a domain with a range of sources
Connection?
DL research & DL practice presently are conducted
  mostly independent of each other,
  minimally informing each other,
  & having slight, or no connection
Parallel universes with few connections & little interaction
8. Conclusions
IS contributions
IS affected handling of inf. in society
Developed an organized body of knowledge & professional competencies
Applied interdisciplinarity
IR reached a mature stage
IR penetrated many fields & human activities
Stressed HUMAN in human-computer interaction
Challenges
Adjust to the growing & changing social & organizational role of inf. & related inf. infrastructure
Play a positive role in globalization of information
Respond to technological imperative in human terms
Respond to changes from inf. to communication explosion - bringing own experiences to resolutions, particularly to the INTERNET
Join competition with quality
Join DIGITAL with LIBRARIES
Juncture
IS is at a critical juncture in its evolution
Many fields, groups ... moving into information
  big competition
  entrance of powerful players
  fight for stakes
To be a major player IS needs to progress in its:
  research & development
  professional competencies
  educational efforts
  interdisciplinary relations
Reexamination necessary