VuFind Beyond MARC
discovering everything else
Demian Katz
VuFind Developer
demian.katz@villanova.edu
How VuFind Used to Work
• MARC records were loaded into Solr.
– Data parsed to fields for searching/faceting.
– Full binary record stored in “fullrecord” field.
• Solr was used for retrieving records.
• VuFind’s PHP code made heavy use of “fullrecord” data for building displays.
What’s wrong with that?
• MARC must die.
• Not all searchable documents are MARC.
• Code for pulling data from MARC is ugly.
Redesign Goals
• Centralize MARC-specific code so it can be easily replaced.
• Use stored Solr fields whenever possible.
• Allow arbitrary metadata formats to coexist peacefully.
• Make no assumptions about metadata content.
The Solution: Record Drivers
• A class interface for displaying a document retrieved from Solr.
• A new Solr field tells VuFind which Record Driver to instantiate for each document.
• A default Record Driver can be written to display a document based solely on stored Solr fields.
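The dispatch described above can be sketched as follows. This is a minimal Python illustration, not VuFind's actual PHP code. The field name `recordtype`, the `DRIVERS` registry, and the `driver_for` helper are all assumptions made for the example.

```python
class IndexRecord:
    """Default driver: relies only on stored Solr fields (hypothetical)."""

    def __init__(self, doc):
        self.doc = doc

    def get_title(self):
        return self.doc.get("title", "")


class MarcRecord(IndexRecord):
    """Format-specific driver (hypothetical); inherits the defaults."""


# Hypothetical registry mapping the driver field's value to a class.
DRIVERS = {"marc": MarcRecord}


def driver_for(doc, field="recordtype"):
    """Pick a driver class from the document's driver field.

    Falls back to the default IndexRecord when the field is missing
    or names a format that has no dedicated driver.
    """
    driver_class = DRIVERS.get(doc.get(field), IndexRecord)
    return driver_class(doc)


docs = [
    {"id": "1", "title": "A MARC record", "recordtype": "marc"},
    {"id": "2", "title": "A repository item", "recordtype": "dspace"},
]
drivers = [driver_for(d) for d in docs]
print([type(d).__name__ for d in drivers])
```

The fallback is what lets a new format be indexed before anyone has written a dedicated driver for it.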
One Key Design Decision
• What should the Record Driver class contain?
– Data-oriented methods (getTitle, getAuthor, etc.)
– Screen-oriented methods (getSearchResult, getStaffView, etc.)
The Answer: All of the Above
interface RecordInterface
    public getSearchResult()
    public getStaffView()
    …
class IndexRecord implements RecordInterface
    protected getAuthor()
    protected getTitle()
    …
class MarcRecord extends IndexRecord
    protected getAuthor()
    protected getTitle()
    …
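One way to read the diagram: the screen-oriented methods are the public contract, and they are built on data-oriented helpers that a subclass can override. The sketch below shows that layering in Python, not in VuFind's PHP. The method bodies, the `fields` dictionary, and the simplified `marc` dictionary (tag to value) are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class RecordInterface(ABC):
    """Screen-oriented methods that every record driver must provide."""

    @abstractmethod
    def get_search_result(self): ...

    @abstractmethod
    def get_staff_view(self): ...


class IndexRecord(RecordInterface):
    """Default driver: reads everything from stored index fields."""

    def __init__(self, fields):
        self.fields = fields

    # Data-oriented helpers (shown as protected in the diagram).
    def _get_title(self):
        return self.fields.get("title", "")

    def _get_author(self):
        return self.fields.get("author", "")

    # Screen-oriented methods are built on the helpers above.
    def get_search_result(self):
        return f"{self._get_title()} / {self._get_author()}"

    def get_staff_view(self):
        return "\n".join(f"{k}: {v}" for k, v in sorted(self.fields.items()))


class MarcRecord(IndexRecord):
    """Overrides a data-oriented helper; the screens are inherited."""

    def __init__(self, fields, marc):
        super().__init__(fields)
        self.marc = marc  # hypothetical, simplified MARC: tag -> value

    def _get_title(self):
        return self.marc.get("245", super()._get_title())


record = MarcRecord(
    {"title": "Indexed title", "author": "Katz, Demian"},
    {"245": "Title from MARC"},
)
print(record.get_search_result())
```

Because `MarcRecord` changes only `_get_title`, the search result display picks up the MARC title without any screen code being rewritten.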
Record Driver Benefits
• Large-scale changes are possible.
• Small-scale changes are easy.
• Allows object-specific behaviors.
• Eases maintenance of local customizations.
Next Problem…
• Where’s the data?
• MARC records traditionally come from an ILS export.
• SolrMarc traditionally takes care of populating VuFind’s Solr index.
Growing the Toolkit
• The toolkit approach is important!
• Problems to solve:
– Obtain records from remote sources
– Process harvested files
– Index arbitrary XML
Tool #1: OAI-PMH Harvester
• Purpose of tool: harvest metadata files from an OAI-PMH server into a directory.
• Key feature: ID manipulation.
• Key feature: delete support.
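The two key features can be illustrated on a single OAI-PMH response. The sketch below is Python and is not the harvester that ships with VuFind. It skips the HTTP requests and resumption tokens entirely, and the function name, the ID pattern, and the trimmed sample response are assumptions. The namespace and the `status="deleted"` header attribute come from the OAI-PMH 2.0 protocol.

```python
import re
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def split_records(response_xml, id_pattern, id_replacement):
    """Split one ListRecords response into saved and deleted records.

    Returns (to_save, deleted): to_save maps each rewritten ID to its
    metadata XML string; deleted lists the rewritten IDs whose header
    carries status="deleted".
    """
    root = ET.fromstring(response_xml)
    to_save, deleted = {}, []
    for record in root.iterfind(".//oai:record", OAI_NS):
        header = record.find("oai:header", OAI_NS)
        raw_id = header.findtext("oai:identifier", namespaces=OAI_NS)
        # ID manipulation: rewrite the repository's ID into a local one.
        record_id = re.sub(id_pattern, id_replacement, raw_id)
        if header.get("status") == "deleted":
            deleted.append(record_id)
            continue
        metadata = record.find("oai:metadata", OAI_NS)
        to_save[record_id] = ET.tostring(metadata[0], encoding="unicode")
    return to_save, deleted


# Trimmed, made-up response (a real one also carries datestamps etc.).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:123</identifier></header>
      <metadata>
        <dc xmlns="http://example.org/dc"><title>First item</title></dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:456</identifier>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

to_save, deleted = split_records(SAMPLE, r"^oai:example\.org:", "example-")
print(sorted(to_save), deleted)
```

A real harvester would write each entry of `to_save` to a file in the target directory and record `deleted` so the matching documents can be removed from the index.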
Tool #2: Batch Import Scripts
• Purpose of tool: process all metadata files in a directory.
• Easily achieved with Windows batch or Unix shell scripting.
• Several sample scripts ship with VuFind.
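The batch step amounts to a loop over a directory. The sketch below expresses that loop in Python instead of the batch or shell scripts mentioned above, and it is not one of the sample scripts that ship with VuFind. `import_directory` and the `import_one` callback are hypothetical names.

```python
import tempfile
from pathlib import Path


def import_directory(directory, import_one, pattern="*.xml"):
    """Run import_one on every matching file; return (done, failed) names."""
    done, failed = [], []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            import_one(path)
            done.append(path.name)
        except Exception as error:
            # Keep going so one bad file does not stop the whole batch.
            failed.append(path.name)
            print(f"skipped {path.name}: {error}")
    return done, failed


# Demonstration with throwaway files; the callback only reads each file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.xml", "a.xml", "notes.txt"):
        (Path(tmp) / name).write_text("<record/>")
    done, failed = import_directory(tmp, lambda path: path.read_text())
    print(done, failed)
```

In practice `import_one` would hand each file to whatever importer is in use, and files could be moved to a "processed" directory once they succeed.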
Tool #3: XSLT Importer
• Purpose of tool: with XSLT, map an XML document to a Solr document based on VuFind’s schema.
• Key feature: PHP integration.
• Key feature: Aperture support.
• Several sample XSLT documents ship with VuFind (DSpace, OJS, VuDL).
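The mapping step itself is easy to picture: source elements go in, and a Solr `<add>` document comes out. The sketch below performs that mapping in plain Python with ElementTree rather than XSLT, so it stands in for the idea and does not show the importer itself. The Solr field names (`id`, `recordtype`, `title`, `author`), the `FIELD_MAP` table, and the sample record are assumptions.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical mapping from source element to Solr field name.
FIELD_MAP = {DC + "title": "title", DC + "creator": "author"}


def to_solr_add(source_xml, record_id, record_type):
    """Build a Solr XML update message from a Dublin Core style record."""
    source = ET.fromstring(source_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def field(name, value):
        ET.SubElement(doc, "field", name=name).text = value

    field("id", record_id)
    field("recordtype", record_type)  # tells the UI which driver to use
    for element in source.iter():
        solr_name = FIELD_MAP.get(element.tag)
        if solr_name and element.text:
            field(solr_name, element.text.strip())
    return ET.tostring(add, encoding="unicode")


SOURCE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Beyond MARC</dc:title>
  <dc:creator>Katz, Demian</dc:creator>
  <dc:format>ignored: no mapping defined</dc:format>
</record>"""

print(to_solr_add(SOURCE, "demo-1", "dspace"))
```

An XSLT stylesheet expresses the same table of mappings declaratively, which is why a new source format mostly needs a new stylesheet and no new code.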
Parting Thoughts
• Understanding Record Drivers gives you a lot of control over VuFind.
• VuFind should be able to index practically anything with a bit of effort.
• Don’t be afraid to build your own tools!
More Information
• VuFind:
– http://vufind.org
• Demian Katz:
– demian.katz@villanova.edu