ISB survey-CRADLE-2006-03-24

Information Seeking Behavior
of Scientists
Brad Hemminger
bmh@ils.unc.edu
School of Information and Library Science
University of North Carolina at Chapel Hill
Contributors
• Assisting Researchers
– Jackson Fox (web survey)
– Steph Adams (participant recruiter)
– Dihui Lu (initial descriptive statistical analysis)
– Billy Saelim (continued statistical analysis)
– Chris Weisen (Odum Institute, statistical consultant)
• Feedback on Survey Design
– UNC Libraries: Bill Burke (Botany), David Romito (Zoology), Jimmy
Dickerson (Chemistry), Zari Kamarei (Math/Physics)
– KT Vaughan (Health Sciences Library)
– Cecy Brown (University of Oklahoma)
• Supported by
– UNC Libraries
– Carolina Center for Genome Sciences
– Basic Science Department chairs
– RENCI P20 grant
Why Study Information Seeking
Behavior of Scientists
• Goal is to improve scholarly communications.
Other areas of my research involve
presentation aspects (visualization/computer
human interaction) and the storage and
communication of scholarly information
(digital libraries, institutional repositories,
virtual communities of practice).
• To do this we need to understand how people
search out and use information currently, and
why. As part of investigating this we found
that there has been a significant change in the
last 5-10 years.
• So we’re studying ISB both to understand it,
and to look at recent changes.
How to Study the Information Seeking
Behavior of Scientists?
• Survey
– Reach many people
– Address common questions
– Produce lots of feedback for libraries
– Quantitative, models of variance (“positivist” approach)
• Interviews
– In depth coverage of selected groups (bioinformatics)
– Use grounded theory and critical incident techniques to
capture more qualitative, contextual experiences
– Develop models of information processing and use
Survey--Long Term Plan
• Conduct an initial survey study at UNC.
Develop survey instrument and interview
methodologies that work here, but could easily
be applied on a larger scale.
• From the results of the initial UNC study, draft
national version (with feedback from national
sites).
• Run the national study. Set it up so that other sites
only have to recruit subjects; the entire survey
runs off the UNC website. We hope this yields
a large number of sites and participants at
minimal experimental cost.
Survey Sampling Technique
• Census
– Need to be able to reach all members
– Best if can get response from large segment of
population
– Results in potentially more input from wider audiences,
especially for the open comment questions.
– Subject to bias (only computer users take, etc.)
• Random sample
– Statistically, generally a better choice
– Higher cost and significantly more work due to
identifying and following up with individual subjects
Questions
• Questions were based on
– Prior studies with which we wished to correlate our
results. This is facilitated by authors who have
published their surveys (as appendices to papers, e.g.
Cecy Brown), and especially by those who have put
their collections of surveys online (e.g. Carol Tenopir).
– This allows us to compare results over time, as well as
to clarify current practices (for instance whether print or
electronic formats are used, breaking this out into two
questions: retrieval versus reading)
– Issues that our librarians were concerned
about
– Questions developed over several drafts that were
reviewed by representatives from all libraries on campus.
Survey Instrument Choices
• Paper
• Phone
• Email
• Web-based. While these can require more effort than
anticipated, if the number of survey respondents is
over several hundred it is generally more cost
effective*. This seemed the best choice since our pilot
survey was of several thousand subjects, and our
national survey was planned for tens of thousands.
Since we have web and database expertise we were
able to automate the process with minimal startup
costs.
*[Schonlau 2001, “Conducting Research Surveys via E-mail and the Web”].
Data Acquisition Details
• PHP Surveyor used for web based survey.
Another common choice at our school for
simpler surveys is Survey Monkey. PHP
Surveyor allowed us to ask multi-part
questions, and to constrain answers to
specific format responses.
• PHP Surveyor dumps data directly into MySQL
database.
• Data is cleaned up and then fed into SAS for
analysis. (Data cleaning is still a significant
manual effort! Examples were determining
Dept/CB and handling browsers that didn’t
validate datatypes on forms properly.)
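A minimal sketch of the kind of cleaning pass described above, assuming the responses have already been exported from MySQL as dictionaries. The field names (`dept`, `library_visits`) and the alias table are hypothetical, chosen only for illustration; the actual survey schema is not shown in these slides.

```python
# Hypothetical cleaning pass: normalize department names and coerce a
# numeric field that the web form failed to validate.
DEPT_ALIASES = {  # assumed alias table, not taken from the survey
    "chem": "Chemistry",
    "chemistry": "Chemistry",
    "bio": "Biology",
    "biology": "Biology",
}


def clean_row(row):
    """Return a cleaned copy of one response, or None if it is unusable."""
    cleaned = dict(row)
    dept = str(row.get("dept", "")).strip().lower()
    # An unknown department becomes None so it can be reviewed by hand.
    cleaned["dept"] = DEPT_ALIASES.get(dept)
    raw = str(row.get("library_visits", "")).strip()
    try:
        visits = int(raw)
    except ValueError:
        return None  # free text slipped through an unvalidated form field
    if visits < 0:
        return None
    cleaned["library_visits"] = visits
    return cleaned


rows = [
    {"dept": " Chem ", "library_visits": "12"},
    {"dept": "biology", "library_visits": "a few"},   # dropped: not a number
    {"dept": "Unknown Unit", "library_visits": "3"},  # kept, dept needs review
]
cleaned_rows = [r for r in (clean_row(row) for row in rows) if r is not None]
```

Rows that cannot be repaired automatically are dropped or flagged rather than guessed at, which mirrors the manual review step the slide mentions.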
Subjects and Recruitment
• Subjects are university faculty, grad
students and research staff.
• We approached all science department
chairs to get support first.
• Contact
– Initial contact was by email giving motivation for
study, indication of support by departments and
campus, and link to web-based survey.
– Follow-ups by letter, then two emails
– Flyers in departments, pizza party rewards
Look at Survey
902 participants from recruited
departments, which were classified as
either science or medicine.
Participation rate was 26%.
Participants by Department
Survey Analysis
• For the quantitative response variables
standard descriptive statistics (mean,
min, max, standard deviation) are
computed, and histograms are used to
visualize the distribution.
• Categorical variables are reported as
counts and percentages for each
category, and displayed as frequency
tables.
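The two kinds of summary described above can be sketched as follows. The category counts are taken from the Distance to Library slide; the quantitative sample passed to `describe` is hypothetical and only illustrates the call. The function names are assumptions for this sketch, not part of the study's SAS analysis.

```python
import statistics

# Counts taken from the "Distance to Library" slide.
distance_counts = {
    "Same building": 175,
    "1/4 mile": 570,
    "1/2 mile": 88,
    "1 mile or more": 69,
}


def frequency_table(counts):
    """Return {category: (count, percentage)}, percentages to 2 places."""
    total = sum(counts.values())
    return {k: (n, round(100 * n / total, 2)) for k, n in counts.items()}


def describe(values):
    """Mean, min, max and sample standard deviation of a numeric variable."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.stdev(values),
    }


table = frequency_table(distance_counts)
for category, (count, pct) in table.items():
    print(f"{category:<16}{count:>5}{pct:>8.2f}")

# Hypothetical quantitative responses (e.g. library visits per year).
summary = describe([0, 2, 5, 12, 30])
```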
Analysis: Correlations
• Categorical vs Categorical
– Chi-square
• Categorical vs Quantitative
– Analysis of Variance
• Quantitative vs Quantitative
– Correlation
• Examples are by dept analysis of other
features; age vs preferred interface
(Google or Library)
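As an illustration of the categorical vs categorical case, the sketch below computes a Pearson chi-square test of independence by hand for gender against field (science or medicine), using the counts from the Gender table in these slides. It applies no continuity correction and is only a sketch of the technique, not the SAS analysis actually run for the study.

```python
import math

# Observed counts from the Gender table: rows = Female, Male;
# columns = Science, Medicine.
observed = [[179, 280],
            [286, 157]]


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    # With one degree of freedom the chi-square tail probability
    # reduces to erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p = {p_value:.3g}")
```

The statistic comes out at roughly 59 on one degree of freedom, so the gender mix differs clearly between the science and medicine groups.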
Participants
Position                  Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
professor                      58        12.47        39          8.92     97      10.75
associate professor            23         4.95        41          9.38     64       7.10
assistant professor            40         8.60        46         10.53     86       9.53
research staff/adjunct         15         3.23        17          3.89     32       3.55
post graduate/fellow           46         9.89        37          8.47     83       9.20
others                         19         4.09        48         10.98     67       7.43
doctoral student              246        52.90       179         40.96    425      47.12
masters student                18         3.87        30          6.86     48       5.32
Gender    Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
Female        179        38.49       280         64.07    459      50.89
Male          286        61.51       157         35.93    443      49.11
Distance to Library
Distance to Library  Count  Percentage
Same building          175       19.40
1/4 mile               570       63.19
1/2 mile                88        9.76
1 mile or more          69        7.65
Simple Questions
• Ninety-one percent of the participants had
access to the internet in their office or lab.
• “Do you maintain a personal article collection?”
Most participants (85.4%) responded that
they did, while only 14.6% did not.
• “Do you maintain a personal bibliographic
database for print and/or electronic
references?” 52.2% of the participants did
maintain one, while 47.8% did not.
How often do you use…
Source                  Daily or weekly (%)  daily  weekly  monthly  quarterly  annually  never
book                                     24     60     157      241        223       148     73
journal                                  87    509     277       72         22         6     16
preprint                                 18     57     105      155        109        72    404
conference                                2      4      14       37        193       492    162
proceeding                                5     14      37       79        168       273    331
webpage                                  70    362     277      132         67        19     45
online database                          67    293     311      119         49        32     98
personal communication                   52    241     228      132        114        64    123
other                                     1      5       7        3          0         2    885
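The daily-or-weekly column in the table above appears to be the share of respondents who answered either daily or weekly. The sketch below recomputes that share from the raw counts for a few of the rows; the recomputed values agree with the slide to within about one percentage point (the slide's rounding is not entirely consistent). The helper name is an assumption made for this example.

```python
# Response counts per source, in the order
# daily, weekly, monthly, quarterly, annually, never (from the slide).
usage = {
    "book": [60, 157, 241, 223, 148, 73],
    "journal": [509, 277, 72, 22, 6, 16],
    "preprint": [57, 105, 155, 109, 72, 404],
    "webpage": [362, 277, 132, 67, 19, 45],
    "online database": [293, 311, 119, 49, 32, 98],
    "personal communication": [241, 228, 132, 114, 64, 123],
}


def daily_or_weekly_pct(counts):
    """Percentage of respondents who chose 'daily' or 'weekly'."""
    daily, weekly = counts[0], counts[1]
    return 100 * (daily + weekly) / sum(counts)


shares = {source: daily_or_weekly_pct(c) for source, c in usage.items()}
# Print sources from most to least frequently used.
for source, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{source:<24}{pct:5.1f}%")
```

Sorting by this share puts journals first, then web pages and online databases, with personal communication behind them.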
Most Important Individual Sources
Basic Science Journals                      Count  Medicine Journals                 Count
Science                                        99  Science                              45
Nature                                         90  Nature                               39
Cell                                           36  JAMA                                 38
Journal of the American Chemical Society       34  UpToDate                             30
Journal of Cell Biology                        20  New England Journal of Medicine      28
Journal of Biological Chemistry                19  Journal of Immunology                18
Analytical Chemistry                           18  American Journal of Epidemiology     17
PNAS                                           13  Cell                                 16
Journal of Neuroscience                        12  Lexi-Comp                            15
Evolution                                      11  Journal of Biological Chemistry      14
Neuron                                         11  Epidemiology                         13
Development                                    10  AIDS                                 12
Journal of Organic
12
Important Alerts
Basic Science Alerts  Count  Medical Alerts        Count
PubMed                   40  PubMed                   53
Faculty of 1000          27  Medscape                 11
ISI                      14  Nature                   10
ACS Journal Alert        11  Faculty of 1000           9
Nature                   10  PubCrawler                9
ScienceDirect             9  ISI                       7
Science                   7  ePocrates                 6
PubCrawler                4  ASHP                      5
Biomail                   3  NEJM                      5
COS                       3  MDLinx                    4
J Biol Chem               3  Science                   4
ACM                       2  ScienceDirect             4
ArXiv                     2  ADA Daily Knowledge       3
BMC alerts                2  JAMA                      3
Cancer Research           2  Kaiser listserv           3
Tools for Searching Information
Search tool type           Frequency  Percentage
Citation index database         1084       47.25
General web search engine        694       30.25
Fulltext digital library         156        6.80
Personal search tool             125        5.45
Knowledgebase web portal          93        4.05
Others                            69        3.01
Online or local database          52        2.27
Library collection                21        0.92
Types of Information Sources
Sources                                             Science  Medicine  Total
(electronic) library subscribed journal               20.17     19.89  20.03
(electronic) open (free) access journal or             7.86      9.29   8.57
  institutional repository or digital library
(print) library subscribed journal                     4.48      3.61   4.05
(electronic) web site (author's website)               4.36      3.31   3.89
(print) personally subscribed journal                  3.44      4.01   3.73
(print) copy of colleague's print copy                 1.07      5.00   3.00
(electronic) personal subscribed journal               3.10      2.65   2.88
(electronic) personal digital library                  2.89      1.97   2.43
(electronic) lab subscribed journal                    2.72      1.14   1.97
(electronic) copy of colleague's electronic copy       1.60      1.98   1.79
(print) lab subscribed journal                         2.05      0.79   1.43
(print) interlibrary loan                              0.59      0.55   0.57
(print) document delivery service                      0.13      0.19   0.16
other                                                  0.02      0.13   0.07
Articles in Personal Collection
Number of Articles  Print  Print (%)  Electronic  Electronic (%)
none                   45                    104
1-49                  154      21.24         259           38.89
50-99                 160      22.07         127           19.07
100-499               280      38.62         210           31.53
500-999                81      11.17          44            6.61
1000+                  50       6.90          26            3.90
Articles in Personal Article Collection
that have annotations
Percentage of entries with notes  Total count  Total percentage
<10%                                      327             36.25
11-20%                                     75              8.31
21-30%                                     82              9.09
31-40%                                     30              3.33
41-50%                                    126             13.97
51-60%                                     19              2.11
61-70%                                     26              2.88
71-80%                                    100             11.09
81-90%                                     47              5.21
>90%                                       70              7.76
Preferred Search Method
Method                                      Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
Electronic versions of databases/journals       443        95.27       429         98.17    872      96.67
Print versions of databases/journals             22         4.73         8          1.83     30       3.33
Preferred Viewing Method
Method                      Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
Both/it depends                 292        62.80       260         59.50    552      61.20
electronic (computer) only       63        13.55        52         11.90    115      12.75
print (hard copy) only          110        23.66       125         28.60    235      26.05
Number of Visits to the Library
in the past 12 Months
Visits    Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
0-2           101        21.72       107         24.49    208      23.06
3-5            75        16.13        99         22.65    174      19.29
6-10           77        16.56        71         16.25    148      16.41
11-20          84        18.06        55         12.59    139      15.41
21-50          85        18.28        67         15.33    152      16.85
51-100         34         7.31        19          4.35     53       5.88
101-200         7         1.51        13          2.97     20       2.22
>200            2         0.43         6          1.37      8       0.89
Reasons for Visiting the Library
Reason                                    Science  Science (%)  Medicine  Medicine (%)  Total  Total (%)
photocopy                                     256        22.54       274         22.81    530      22.68
get assistance from a librarian                65         5.72        96          7.99    161       6.89
use computers                                  59         5.19       112          9.33    171       7.32
perform searches                               81         7.13       117          9.74    198       8.47
read current journals or other materials      161        14.17       156         12.99    317      13.56
quiet reading space                           156        13.73       179         14.90    335      14.33
meeting                                        45         3.96        73          6.08    118       5.05
browse                                         99         8.71        60          5.00    159       6.80
pick up/drop off materials                    214        18.84       134         11.16    348      14.89
Factors Affecting Choice of Journal to
Publish In
Factor                                                 Science  Medicine  Total
Ability to include links, color, graphics, multimedia     1.38      1.24   2.31
audience                                                  3.52      3.38   4.45
author having to pay cost of publication                  1.51      1.54   2.53
availability on campus                                    1.79      1.88   2.83
editorial board                                           2.11      1.95   3.03
page charges for long articles or color figures           1.40      1.45   2.42
speed of publication                                      2.42      2.27   3.35
standing of journal in your field                         3.77      3.61   4.70
support of open access to journal articles                2.09      2.17   3.13
Google vs Library Search Page
• “Which interface would you rather use to begin
your search process?” with the possible
responses “Google search page” and “Your
library’s home page”. Overall, a slight majority
of users preferred Google (53.3%) over the
library page (46.7%); however, the difference
was substantially larger for basic science
researchers (Google 58.5% versus Library
41.5%) than for medical researchers
(Google 52.2% versus Library 47.8%).
Google vs Library Search Page
• This difference might have been larger had the
question asked which style or type
of interface the users preferred, as many
of the comments in the survey indicated
a strong preference for a single “meta”
search tool, where the user enters a
single search string and all content in all
resource collections is searched
(as opposed to manually
identifying resource collections and
searching them individually).
Summary
We never leave our chairs…
• Almost all information seeking and use
interactions occur on the researcher’s
computer in their office.
• As a result library visits have
dramatically declined, and the reasons
for visits to library have changed.
• Researchers read both in electronic and
print form, but print (paper) is still the
most preferred form.
Single Text Box + MetaSearch
• Researchers prefer a single text box for
initial searching, that covers all
resources.
• This is most evidenced by preference for
Google Scholar over library web page
interfaces.
More than just text
• Researchers are making increasing use of
content contained in online databases like
Genbank, or web pages of research labs.
• For the scientists in our survey this type of
access has surpassed personal
communications and is close to journal
articles in frequency of usage by
researchers.
Transformative Changes
• Transformative changes in collaborative group
communications have already taken place in
the consumer marketplace, and are finding
their way into scholarly communications.
Examples include folksonomies supporting
community tagging (Del.icio.us), comment and
review systems like Amazon’s rankings,
Flickr, etc. Similar changes are
in their initial stages for scholarly
communities, for instance Faculty of 1000 and
the Connotea application for online sharing of
bibliographic databases and annotations by
scientists.
What might the future hold?
• In the future, researchers may all maintain
their scholarly knowledge online and make it
accessible to others as they see fit. Having
scholars’ descriptions and annotations of the
digital scholarly materials, as well as the
materials themselves, available on the web will
allow online communities and community
review systems to blossom, just as the
availability of online journal articles has
transformed the basic information seeking of
science scholars today.
Future Work
• Upcoming papers from UNC survey
– Correlations, information seeking behavior predictions from
demographics
– By department/research area comparisons
– Review and reflection on major changes (with Cecy Brown, Don
King, Carol Tenopir)
– Textual analysis of library comments (Meredith Pulley)
– New work being proposed by other researchers using this data (if
you think the data from this study might help you in your research
come talk to me).
• National Study….about to begin…
• Interview Studies (labs, individuals)
bmh@ils.unc.edu