Primo and Omeka: turning local databases into harvestable repositories

Alexander J. Jerabek

Librarian

Information Technology

Service des bibliothèques

jerabek.alexander_j@uqam.ca

2014-05-01

Goal

Make peripheral special collections more accessible and more visible by integrating them into Primo

The Pouchet collection

1. Donation of 36,000 print documents and 20,050 vinyl records to the Music library

2. Primarily pedagogic or popular documents

3. Catalogued separately from the main catalogue, searchable in a local database (Access, .asp)

4. Ongoing work to catalogue all items

Problem

1. How to get existing records into Primo?

2. How to get new or modified records into Primo?

Local database ‘Palmaro’

Omeka

“Omeka is a free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions. Its “five-minute setup” makes launching an online exhibition as easy as launching a blog.” http://omeka.org/about

Omeka is a project of the Roy Rosenzweig Center for History and New Media, George Mason University.

Advantages of Omeka

1. Easy setup and maintenance

2. French interface

3. Does exactly what we need: create and update records and allow harvesting by Primo

4. Useful plugins

5. Create multiple users

6. Fits long-range plans for possible digitization

Disadvantages of Omeka

1. Not possible to make global changes to records

2. Dublin Core not always best fit for data

3. Not always easy to define default values

4. Not possible to export data

Omeka plugins

1. CSV Import

2. OAI-PMH Repository

3. Simple Vocab

4. Dublin Core Extended

5. Hide Elements

6. Collection Tree
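
The OAI-PMH Repository plugin is what makes the Omeka records harvestable. As a rough sketch of what a harvester receives, the Python below parses a ListRecords response in oai_dc format and pulls out identifiers, titles and creators. The sample response, its identifier and its values are made up for illustration; only the OAI-PMH and Dublin Core namespaces are standard.

```python
import xml.etree.ElementTree as ET

# Standard namespaces used by OAI-PMH responses carrying oai_dc metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response. A real one would come from a request
# of the form <repository base URL>?verb=ListRecords&metadataPrefix=oai_dc
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2014-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Exemple de titre</dc:title>
          <dc:creator>Nom, Prénom (par.)</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records(xml_text):
    """Return one dict per record: identifier plus Dublin Core titles and creators."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    for record in list_records(SAMPLE):
        print(record)
```

A real harvest would also have to follow resumption tokens across several responses, which this sketch leaves out.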

Prepare the staff

1. Create users

2. Write up procedures for creating records

3. Iterative process

4. Test runs in staging to find snags

Omeka admin

A few bugs

1. Initial diacritics are dropped

2. Cannot search on three-letter words

Import data into Excel

1. Tidy data as much as possible

   1. Filters in Excel

   2. Search and replace in TextPad

   3. Corrections using OpenRefine ( http://openrefine.org/ )

2. Add columns, constants (e.g. Format)

3. Crosswalk column headers to DC elements

4. Save as CSV UTF-8
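
The crosswalk and constant-column steps can also be scripted. The sketch below is one possible way to do it in Python, not the procedure used in the project: the local column names (Titre, Auteur, Editeur) and the Format value are assumptions, since the slides do not list the actual Palmaro columns.

```python
import csv
import io

# Hypothetical crosswalk: local column header -> Dublin Core element name.
# The real column names of the local database are not given in the slides.
CROSSWALK = {
    "Titre": "Dublin Core:Title",
    "Auteur": "Dublin Core:Creator",
    "Editeur": "Dublin Core:Publisher",
}

# Constant columns added to every row (assumed value, for illustration only).
CONSTANTS = {"Dublin Core:Format": "Musique en feuille"}


def crosswalk_rows(rows):
    """Rename mapped columns, drop unmapped ones, trim whitespace, add constants."""
    for row in rows:
        out = {dc: (row.get(local) or "").strip() for local, dc in CROSSWALK.items()}
        out.update(CONSTANTS)
        yield out


def convert(source, target, delimiter=";"):
    """Read a delimited export from `source` and write the crosswalked CSV to `target`."""
    reader = csv.DictReader(source, delimiter=delimiter)
    fieldnames = list(CROSSWALK.values()) + list(CONSTANTS)
    writer = csv.DictWriter(target, fieldnames=fieldnames, delimiter=delimiter)
    writer.writeheader()
    writer.writerows(crosswalk_rows(reader))


if __name__ == "__main__":
    sample = io.StringIO("Titre;Auteur;Editeur;Note\n Exemple ;Nom, Prénom;;x\n")
    result = io.StringIO()
    convert(sample, result)
    print(result.getvalue())
```

When writing to disk instead of an in-memory buffer, opening the target with `open(path, "w", encoding="utf-8", newline="")` gives the UTF-8 file that the last step calls for.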

Excel to CSV

Dataset import into Omeka

Omeka CSV import defaults

Column delimiter: ;

Tag delimiter: |

File delimiter: ,

Element delimiter: /
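
These settings determine how multi-valued cells must be packed in the CSV file. The Python sketch below shows one way to produce such a row, with several creators joined by the element delimiter and several tags joined by the tag delimiter. The record structure and the column headers are hypothetical, and the file delimiter is not used because the sample has no file column.

```python
import csv
import io

COLUMN_DELIMITER = ";"   # separates columns in the file
TAG_DELIMITER = "|"      # separates several tags inside one cell
ELEMENT_DELIMITER = "/"  # separates several values of one element inside one cell


def pack(values, delimiter):
    """Join several values into one cell, refusing values that contain the delimiter."""
    for value in values:
        if delimiter in value:
            raise ValueError(f"{value!r} contains the delimiter {delimiter!r}")
    return delimiter.join(values)


def write_rows(records, target):
    """Write records (dicts with 'title', 'creators', 'tags') as delimited text."""
    writer = csv.writer(target, delimiter=COLUMN_DELIMITER)
    writer.writerow(["Dublin Core:Title", "Dublin Core:Creator", "tags"])
    for rec in records:
        writer.writerow([
            rec["title"],
            pack(rec["creators"], ELEMENT_DELIMITER),
            pack(rec["tags"], TAG_DELIMITER),
        ])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows(
        [{"title": "Exemple",
          "creators": ["Nom A (comp.)", "Nom B (par.)"],
          "tags": ["chanson", "partition"]}],
        buffer,
    )
    print(buffer.getvalue())
```

Rejecting values that already contain a delimiter is a deliberate choice here: a stray "/" inside a name would otherwise be split into two elements on import.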

Data set import into Omeka

Data set import into Omeka

Data set import into Omeka

Setting up Primo

1. Set up a data source

2. Set up a scope

3. Set up a pipe

4. Create new local fields

5. Create new set of normalization rules

6. Tweak Primo interface

1. Set up a data source

2. Set up a scope

3. Set up a pipe

4. Create new local fields

1. lds08 : Parolier (lyricist)

2. lds09 : Compositeur (composer)

3. lds10 : Interprète (performer)

(see notes below for steps)

4. Rules for new local fields

Example: a new field for lyricist, based on the ‘(par.)’ marker
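
In Primo this logic lives in the normalization rules. The Python below is only an illustration of the idea, not Primo rule syntax: it sorts creator strings into the three local fields according to the role markers (par.), (comp.) and (interp.). The input names are placeholders.

```python
import re

# Role markers found in the creator strings, mapped to the local fields above.
ROLE_FIELDS = {"par.": "lds08", "comp.": "lds09", "interp.": "lds10"}

# Matches any parenthetical note, e.g. "(par.)", and captures its content.
MARKER = re.compile(r"\(([^()]*)\)")


def local_fields(creators):
    """Sort creator strings into local fields according to their role markers."""
    fields = {field: [] for field in ROLE_FIELDS.values()}
    for creator in creators:
        name = MARKER.sub("", creator).strip()
        for marker in MARKER.findall(creator):
            field = ROLE_FIELDS.get(marker.strip())
            if field and name not in fields[field]:
                fields[field].append(name)
    return fields


if __name__ == "__main__":
    print(local_fields(["Nom A (comp.) (par.)", "Nom B (interp.)"]))
```

A creator carrying several markers ends up in several fields, which is the situation described later for names that appear as composer, lyricist and performer at once.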

5. Create new normalization rules

Strip out parenthetical notes for display

(Record modified in Omeka)

5. Create new normalization rules

Add complementary information. Not:

Dublin Core:Publisher = Bibliothèque de Musique

Instead added:

Dublin Core:Description = Disponible au comptoir de prêt (“Available at the circulation desk”)

<display>

<ispartof>Musique en feuille no.10599, voir au comptoir de prêt de la Bibliothèque de Musique</ispartof>

5. Create new normalization rules

Added or modified a few elements to conform with our Aleph records

1. <display/type> = score

2. <search/general> = Musique en feuille

3. <search/searchscope> = ubibmusique

4. <facets/toplevel> = uqam_inst

5. <facets/library> = M
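
To make the section/field notation concrete, the sketch below builds the corresponding nested XML in Python. It assumes the usual PNX layout of a record element containing display, search and facets sections. In Primo itself these values are set through normalization rules, so this is an illustration of the resulting structure rather than of how the rules are written.

```python
import xml.etree.ElementTree as ET

# Section/field constants taken from the list above.
CONSTANTS = [
    ("display/type", "score"),
    ("search/general", "Musique en feuille"),
    ("search/searchscope", "ubibmusique"),
    ("facets/toplevel", "uqam_inst"),
    ("facets/library", "M"),
]


def add_constants(record, constants=CONSTANTS):
    """Add each section/field value to `record`, creating sections as needed."""
    for path, value in constants:
        section_name, field_name = path.split("/")
        section = record.find(section_name)
        if section is None:
            section = ET.SubElement(record, section_name)
        ET.SubElement(section, field_name).text = value
    return record


if __name__ == "__main__":
    record = add_constants(ET.Element("record"))
    print(ET.tostring(record, encoding="unicode"))
```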

6. Tweak Primo interface

No use for the location/request tab or for the more (SFX) tab. Hide them with CSS using the data source prefix:

ul.EXLResultTabs li.EXLRequestTab a[href*="BIBMUSIQUE"],
ul.EXLResultTabs li.EXLMoreTab a[href*="BIBMUSIQUE"] {display:none;}

HTML:

<ul class="EXLResultTabs …">

<li class="EXLRequestTab …">

<a href="display.do?tabs=requestTab ….doc=BIBMUSIQUE10478...">

<a href="display.do?tabs=moreTab...&doc=BIBMUSIQUE10478...">

A few problems, questions remain

Aznavour and Coulonges

The problem of Aznavour as (comp.), (interp.), (par.): leave the parenthetical elements in or remove them?

Vs

Aznavour and Coulonges

Aznavour

(include all facets)

Aznavour and Coulonges

The example of Georges Coulonges as (comp.), (par.): leave the parenthetical elements in or remove them?

Vs

A few problems, questions remain

Strip out parenthetical notes for facets and suggested new searches

In addition to ‘(par.)’ etc. we also have ‘(par. Fr.)’ and others. To get them all we used:
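
The expression itself is not reproduced in this text. As a minimal sketch of the kind of pattern that would cover these variants, the Python below assumes that every role note is a parenthetical beginning with one of the abbreviations par., comp. or interp., optionally followed by a qualifier such as "Fr.". The pattern and the function name are hypothetical.

```python
import re

# Hypothetical pattern: a parenthetical starting with a known role abbreviation,
# optionally followed by qualifiers such as "Fr.", plus the whitespace before it.
ROLE_NOTE = re.compile(r"\s*\((?:par|comp|interp)\.[^()]*\)", re.IGNORECASE)


def strip_role_notes(name):
    """Remove role notes like '(par.)' or '(par. Fr.)' and tidy the spacing."""
    return " ".join(ROLE_NOTE.sub("", name).split())


if __name__ == "__main__":
    for value in ["Nom A (par.)", "Nom A (par. Fr.) (comp.)", "Nom B (1920-1990)"]:
        print(repr(strip_role_notes(value)))
```

Anchoring the pattern on the known abbreviations, instead of removing every parenthetical, leaves other notes such as dates untouched.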

A few problems, questions remain

Currently there is no way to limit or prefilter to ‘Musique en feuille’; the searchable elements are incompatible with the visible elements

Resource type vs Format

Library vs Collection

Not a visible searchable scope option

Outcomes

1. Collection is available via Primo

2. Records are added or modified, then harvested nightly into Primo

3. Circulation stats increase dramatically

Future plans

1. Phase 2 of Pouchet collection, ~10k vinyl recordings

2. Horus : Law library annual reports database, 1500 records

3. Gestio : Management documentation centre collection of grey literature, technical papers, etc. 6000 records

4. Possibility of adding digital objects if sheet music is scanned, documents are digitized

Questions?

jerabek.alexander_j@uqam.ca