Smart Subjects: Application Independent Subject Recommendations
Tito Sierra
NCSU Libraries
Code4Lib 2007
Outline
• Concept
• Motivation
• Smart Subjects Applications
• How it Works
• Strengths and Weaknesses
• Future Plans
Smart Subjects Concept
Input:
• User search query
Output:
• A list of related library subjects
Basically a subject recommendation engine.
Example 1
Input:
music therapy
Output:
• Music
• Curriculum & Instruction
• Education
• Communication & Media
• Psychology
• Biochemistry
Example 2
Input:
asymptotic stability
Output:
• Bioinformatics & Biomathematics
• Statistics
• Mathematics, Science & Technology Education
• Mathematics
• Computer Science
• Aerospace Engineering
Example 3
Input:
illegal immigration
Output:
• Criminology
• Political Science
• Public Administration
• Biology
• Zoology?
• Industrial Engineering
Motivation
Search log analysis:
standard, international economic development, fines, dissertation abstracts, music therapy, ACM, wolfcopy, Oxford English Dictionary, audio, illegal immigration, schedule, interlibrary, datamonitor, chemistry, JAMA, CRC, photography, vision, wiley, ciation builder, job, academic search elite, ria, film studies, career development, sanborn maps, citation index, iee, history, industry analysis, scholarly journals, ethics, spss, petition, animal behavior, psych info, repository, ENR, diabetes, data, lrl, cancer, textbooks, wharton, Christian Science Monitor, ITTC, blah, PubMed, time magazine, nutrition, DVD, questia, conductive heat transfer, sage, newspaper
• Lots of topical subject queries in the long tail!
Motivation
Existing work:
• Subject Browse portal at NCSU
Subject Browse at NCSU
• Locally developed subject classification launched in Fall 2005
• 100 subject nodes in 12 top-level categories
• Subject nodes influenced by the university curriculum (e.g. Crop Science)
Smart Subjects Applications
• Quick Search integration
• OpenSearch interface
Quick Search Integration
OpenSearch Interface
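OpenSearch services are usually described by a URL template containing placeholders such as {searchTerms}, with a trailing "?" marking optional parameters. The sketch below shows how a client might fill such a template before requesting recommendations. The template URL, the `q` parameter name, and the `fill_template` helper are hypothetical; the slides do not give the actual Smart Subjects endpoint.

```python
import re
from urllib.parse import quote_plus

# Hypothetical template: the real Smart Subjects endpoint is not given in the slides.
TEMPLATE = "https://example.org/smartsubjects/opensearch?q={searchTerms}&count={count?}"


def fill_template(template, **params):
    """Replace OpenSearch-style {name} / {name?} placeholders with URL-encoded values."""

    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            return quote_plus(str(params[name]))
        if optional:
            # Optional parameters with no value are replaced by an empty string.
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


if __name__ == "__main__":
    print(fill_template(TEMPLATE, searchTerms="music therapy", count=6))
```

Any search application that can issue an HTTP request built this way could consume the recommendations, which is the point of exposing them through a generic interface.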
How it Works
1. Harvest available institutional data
• Course catalog descriptions
• Faculty publications citations
2. Create “text extract” representations for each academic department on campus
3. Index the text extracts
4. Retrieval interface queries indices
5. Retrieval algorithm crosswalks academic departments to library subject classification
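The five steps above can be sketched in a few lines of Python. This is an illustration only, not the project's implementation (which, per the slides, uses SWISH-E and PHP): the harvested snippets, the department-to-subject crosswalk, and the simple term-weighting score are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Steps 1-2 (hypothetical data): harvested text grouped by academic department.
HARVESTED = {
    "Music": ["Introduction to music theory and performance",
              "Music therapy in clinical settings"],
    "Psychology": ["Cognitive psychology and behavior therapy",
                   "Research methods in psychology"],
    "Mathematics": ["Asymptotic stability of differential equations",
                    "Linear algebra"],
}

# Step 5 (hypothetical data): crosswalk from departments to library subject nodes.
CROSSWALK = {
    "Music": ["Music"],
    "Psychology": ["Psychology"],
    "Mathematics": ["Mathematics", "Statistics"],
}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(harvested):
    """Steps 2-3: one text extract per department, indexed as term counts."""
    return {dept: Counter(tokenize(" ".join(snippets)))
            for dept, snippets in harvested.items()}


def recommend(query, index, crosswalk, limit=6):
    """Steps 4-5: score departments for the query, then map them to subjects."""
    n_depts = len(index)
    subject_scores = defaultdict(float)
    for dept, counts in index.items():
        score = 0.0
        for term in tokenize(query):
            # Number of departments whose extract contains the term.
            df = sum(1 for c in index.values() if term in c)
            if df and counts[term]:
                # Assumed scoring: term count weighted by rarity across departments.
                score += counts[term] * math.log(1 + n_depts / df)
        if score > 0:
            for subject in crosswalk.get(dept, []):
                subject_scores[subject] = max(subject_scores[subject], score)
    ranked = sorted(subject_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [subject for subject, _ in ranked[:limit]]


INDEX = build_index(HARVESTED)

if __name__ == "__main__":
    print(recommend("music therapy", INDEX, CROSSWALK))
```

Because the query is matched against department text rather than against a particular catalog or database, the same function could sit behind any search box.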
Technology Used
• SWISH-E for indexing
• PHP for retrieval processing/scoring
Strengths
• Application and collection independent
• Subject recommendations can be integrated in any library search application
• Encourages broader, serendipitous resource discovery
Weaknesses
• False positives (bad recommendations)
• Zero hits (no recommendations)
Future Plans
• Deploy new uses of Smart Subjects tool
• Database Advisor
• Increase the size of subject indices
• Article table of contents data
• Backlog of course descriptions
• Gauge interest for a community subject recommendation platform
More Information
Project Site:
http://www.lib.ncsu.edu/dli/projects/smartsubjects
Thanks!
Tito Sierra
NCSU Libraries
tito_sierra@ncsu.edu