Putting Metadata to Work - Information Management Advisory Service

Case studies of practical
data management
Ben Kreunen
Technical Support Officer
University Digitisation Service
Putting Metadata to Work
Putting metadata to work
What?
• Print collection:
– Image management tool created as a spin off
from the data used to scan
• Thesis on demand
– Incorporating administrative data into the
scanning process to improve business processes
• Asset stocktake (pilot)
– From THEMIS to iPhone and back
• Creating a contact list from an org chart (concept)
– Linking business processes and data
Why?
• Time is money
– Save 10 seconds on a task performed 2,500
times and you save 1 working day
• Doing repetitive tasks sucks
• Doing repetitive tasks again sucks
• Reducing mental fatigue
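The arithmetic behind the "time is money" bullet can be checked with a short sketch. The 7-hour working day is an assumption; the slide does not say how long a working day is.

```python
# Time saved by shaving a few seconds off a repeated task.
SECONDS_SAVED_PER_TASK = 10   # from the slide
TIMES_PERFORMED = 2_500       # from the slide
HOURS_PER_WORKING_DAY = 7     # assumption: not stated in the slide


def hours_saved(seconds_per_task: float, repetitions: int) -> float:
    """Return the total time saved, in hours."""
    return seconds_per_task * repetitions / 3600


total_hours = hours_saved(SECONDS_SAVED_PER_TASK, TIMES_PERFORMED)
working_days = total_hours / HOURS_PER_WORKING_DAY
print(f"{total_hours:.1f} hours saved, about {working_days:.1f} working day(s)")
```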
How?
• Reduce the time it takes to do stuff
– Automatically enter related data
– Collect data that’s been entered somewhere else
– Select from lists rather than type
– Re-use data that’s been entered before
– Script repetitive processes
• Simplify interface design
– Only show the data you need at the time
– Visual feedback
Who?
• People who manage the process
• People who DO the process
• People with technical skills
• Working together!
Simple tools, clever connections
http://www.philohome.com/panobot2/panobot2.htm
Metadata:
• Data about data
How do we manage metadata?
How do we manage data?
How do we manage relational data?
How do we manage relational data efficiently?
– Excel?
– Access?
– Relational database?
What is data management?
• Making lists of stuff
• Finding things in lists of stuff
• Sharing lists of stuff
• Editing lists of stuff
• Combining lists of stuff
• etc.
Re-using data
Open source tools
Usability
Print Collection
Data Requirement
• There must be only one ID number to link each
image to a catalogue record
Before
• ~3,500 images on 4 external HDDs without an
index (2 TB)
• File names based on a partial accession number
• Online images served via KE EMU
• Duplicate accession numbers exist
• Number of duplicate IDs not known
• ~4,500 prints to be scanned
Preparation for scanning
• Prepare data
– Export data from EMU
– Create separate database to analyse/prepare
data
– Locate duplicate records (12)
– List existing images and calculate ID numbers
– Locate invalid file names (7)
– Copy master files to network storage (1 TB)
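The two checks in this list (duplicate records and invalid file names) could be scripted along the following lines. This is only a sketch: the slides do not show the accession number format or the naming rule, so the pattern and the sample values below are hypothetical.

```python
import re
from collections import Counter

# Hypothetical file-name rule: digits, optional ".digits" parts, ".tif" extension.
VALID_NAME = re.compile(r"^\d+(\.\d+)*\.tif$", re.IGNORECASE)


def find_duplicate_ids(ids):
    """Return the ID numbers that occur more than once, sorted."""
    counts = Counter(ids)
    return sorted(i for i, n in counts.items() if n > 1)


def find_invalid_names(file_names):
    """Return the file names that do not match the assumed naming rule."""
    return [name for name in file_names if not VALID_NAME.match(name)]


# Made-up sample data for illustration only.
catalogue_ids = ["1959.0001", "1959.0002", "1959.0002", "1960.0107"]
image_files = ["1959.0001.tif", "1959-0002.tif", "scan_final.tif"]

print(find_duplicate_ids(catalogue_ids))
print(find_invalid_names(image_files))
```

In practice the ID list would come from the EMU export and the file names from a directory listing of the master files.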
Scanning requirements
• Previous images scanned with colour chart
• Getty “standard”, i.e. multiple versions at different
sizes
• All versions of the existing images include the colour chart
(the archive master should have it; other versions
should be cropped)
• Archive master files = 50% of total file size
Planning
• What is the best way to capture a master image
and cropped version?
• Should a cropped version be created of the
existing images?
• Do we need to create a cropped version?
– Saves time digitising
– Reduces storage costs ~40%
Planning
• What is required to crop images on demand?
• Is it possible?
• Can a standard computer do it?
• What data do we need?
• How do we collect it?
• How are the images used?
• How are requests processed?
Mini Project
• ImageMagick + coordinates + batch file =
automated cropping on demand
• Hack techniques to collect data
• Raised awareness of other possible uses
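The "ImageMagick + coordinates + batch file" idea can be sketched as a script that turns a list of crop coordinates into one ImageMagick command per image. The original used a batch file; the Python version below, the folder names, the CSV layout and the sample coordinates are all assumptions for illustration. `-crop` with a `WxH+X+Y` geometry and `+repage` are standard ImageMagick options; `magick` is the ImageMagick 7 entry point (older installs use `convert`).

```python
import csv
import io
import shlex

# Hypothetical coordinate list: one row per image, values in pixels.
COORDS_CSV = """id,x,y,width,height
1959.0001,120,80,4000,3000
1959.0002,95,110,3800,2900
"""


def crop_command(image_id, x, y, width, height,
                 src_dir="masters", out_dir="cropped"):
    """Build an ImageMagick command that crops one master image.

    -crop takes a WxH+X+Y geometry; +repage resets the virtual canvas
    so the output carries no leftover page offset.
    """
    geometry = f"{width}x{height}+{x}+{y}"
    return ["magick", f"{src_dir}/{image_id}.tif",
            "-crop", geometry, "+repage",
            f"{out_dir}/{image_id}.jpg"]


def commands_from_csv(text):
    """Return one crop command per row of the coordinate list."""
    rows = csv.DictReader(io.StringIO(text))
    return [crop_command(r["id"], r["x"], r["y"], r["width"], r["height"])
            for r in rows]


for cmd in commands_from_csv(COORDS_CSV):
    # Printed rather than executed; pass cmd to subprocess.run to apply it.
    print(shlex.join(cmd))
```

Because only the master file and four numbers are stored, the cropped version never needs to exist until someone asks for it.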
Acquiring coordinates for cropping
Scanning issues
• Prints are delivered in random ID number order
– Locating 1 ID number in a list of 8,000…
What broke
• Not all ID numbers are unique
– modification of naming schema required
• “Modified” scanning procedure to deal with
annoyances was prone to the occasional error
– error, cause and solution identified by scanner
operator
• Image from previous project did not match
ID number
Helping others
• A small step to change our project work into a
tool to improve management of the image collection
– Crop, resize and format images on demand
– Fast response to deal with requests for
images
– Images more secure
– Images accessed using familiar identifiers
Helping others
• Runtime version of database to be given to collection manager
• Total software cost: $0
Helping others
Can we browse the images scanned to date?
Helping others
That’s great... Can you do the same thing for everything else you’ve scanned?
(currently 250,000 files)
Automating administrative processes
Sharing administrative data
Minimising data entry
Thesis on demand service
About the service
• Copy of a thesis is requested by a researcher/
academic library for research purposes
• Thesis is scanned (for a fee) and delivered to
client
– Print
– CD
– Cloudstor
• Recently relocated from the Baillieu to UDS
Challenges
• Incorporating administrative data and processes
• Multiple time frames depending on delivery
• Variable timing for delivery of theses
– accessed locally or from offsite archives
• Process is now split across 2 departments
The Request
Data entry
• Thesis details
– Scan barcode
– Automatic collection of required and optional
metadata
• Delivery method – check box
– Email address if Cloudstor
• Date request received
• Urgency – check box
Re-using data
• Date the item is to be scanned by is calculated from:
– Date received
– Delivery method
– Urgency
• Work list sorted by “completion status” and
“date due”
• Output filenames automatically generated from
metadata (author, year)
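A minimal sketch of how the data already entered could be re-used for the due date, the work list order and the output file name. The turnaround times are invented for illustration (the slides do not give the real service levels), as are the field names and the file-name pattern.

```python
import re
from datetime import date, timedelta

# Assumed turnaround times in days; the real figures are not in the slides.
TURNAROUND_DAYS = {"print": 10, "cd": 7, "cloudstor": 5}
URGENT_DAYS = 2


def scan_by_date(received, delivery_method, urgent=False):
    """Return the date an item should be scanned by."""
    days = URGENT_DAYS if urgent else TURNAROUND_DAYS[delivery_method.lower()]
    return received + timedelta(days=days)


def output_filename(author, year, extension="pdf"):
    """Build a file name from thesis metadata (author, year)."""
    safe_author = re.sub(r"[^A-Za-z0-9]+", "_", author).strip("_")
    return f"{safe_author}_{year}.{extension}"


def work_list(requests):
    """Sort requests so incomplete items come first, then by date due."""
    return sorted(requests, key=lambda r: (r["complete"], r["due"]))


# Made-up requests for illustration only.
requests = [
    {"author": "Smith, J.", "year": 1987, "complete": False,
     "due": scan_by_date(date(2013, 5, 6), "Cloudstor")},
    {"author": "Nguyen", "year": 2001, "complete": False,
     "due": scan_by_date(date(2013, 5, 6), "Print", urgent=True)},
]
for r in work_list(requests):
    print(r["due"], output_filename(r["author"], r["year"]))
```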
Re-using data
• File delivery is automated as much as possible:
– Copy and rename file to pickup folder
– Generate email message to notify Special
Collections and Repository team
– Load Cloudstor interface if selected as the
delivery method
• Entries for each form field generated and copied to
the clipboard
• Upload form completed with 8 mouse clicks
What broke
• Client queries could not be answered
immediately because of the split
– no direct access to our data
– daily export of a PDF report enables most
queries to be dealt with
• Not all theses have barcodes
• Not all theses are catalogued
Outcomes
• Improved client communication
• Improved communication between departments
• Reduced data entry
• Improved quality of metadata
• Simplified reporting based on administrative data
Local management with centralised data
Simplifying data entry
Synchronising authoritative data
THEMIS Asset stocktake
Issues
Error 401.303
Text box length exceeded. Refer to
KB1237 for assistance with this
error
Issues
• Data in THEMIS is out of date
• No direct access to update THEMIS
– Generates significant workload for 2
organisations
• Asset data from other sources (CMDB) is out of
date
• Previous updates incomplete
The Key(?)
• Excel “wizard” that can be imported into THEMIS
Usability
• Where is the data I need to see?
The Key
• Not user friendly BUT
• Consistent data structure for receiving and
updating data
• Create a local copy for collecting current data
• Populate with “static” data from THEMIS
• Compare “live” data with THEMIS
• Export current data to THEMIS
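The "compare live data with THEMIS" step amounts to diffing two lists keyed on the asset barcode. The sketch below assumes both sides have already been loaded into dictionaries; the field names and sample records are hypothetical, and nothing here reflects the actual THEMIS export layout.

```python
def compare_assets(themis, live):
    """Compare two {barcode: record} dicts and report the differences.

    Returns (missing, unexpected, changed):
      missing    - barcodes in the THEMIS export but not found on site
      unexpected - barcodes found on site but absent from the export
      changed    - {barcode: {field: (themis_value, live_value)}}
    """
    missing = sorted(set(themis) - set(live))
    unexpected = sorted(set(live) - set(themis))
    changed = {}
    for barcode in set(themis) & set(live):
        diffs = {field: (themis[barcode].get(field), value)
                 for field, value in live[barcode].items()
                 if themis[barcode].get(field) != value}
        if diffs:
            changed[barcode] = diffs
    return missing, unexpected, changed


# Made-up records for illustration only.
themis_export = {
    "100234": {"location": "Room 101", "serial": "C02ABC"},
    "100235": {"location": "Room 102", "serial": "C02DEF"},
}
stocktake = {
    "100234": {"location": "Room 205", "serial": "C02ABC"},
    "100999": {"location": "Room 101", "serial": "C02XYZ"},
}

missing, unexpected, changed = compare_assets(themis_export, stocktake)
print("Not found on site:", missing)
print("Not in THEMIS:", unexpected)
print("Changed:", changed)
```

Only the differences then need to be written back into the import spreadsheet, rather than re-keying every asset.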
The Pilot
• FileMaker 12 database to handle data
• Accessed via FileMaker Go on iPhone
• Integrate with CNS barcode app to scan
barcodes
• Streamline onsite data collection
Simplify data display
Potential Spin Offs
• Re-use data for local asset management
processes
• Warn me X weeks before a computer is due for
replacement
• How many computers are due for replacement in
X months?
• Auto-complete asset management forms
e.g. disposal
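The replacement warning is a simple date filter once each asset has a replacement date. A sketch, assuming a hypothetical `replace_on` field; the slides do not say how the replacement date is stored or derived.

```python
from datetime import date, timedelta

# Made-up assets for illustration only.
assets = [
    {"barcode": "100234", "replace_on": date(2014, 3, 1)},
    {"barcode": "100235", "replace_on": date(2013, 9, 15)},
    {"barcode": "100236", "replace_on": date(2013, 7, 1)},
]


def due_for_replacement(assets, weeks_ahead, today):
    """Return assets due on or before today + weeks_ahead, soonest first.

    Assets that are already overdue are included.
    """
    cutoff = today + timedelta(weeks=weeks_ahead)
    due = [a for a in assets if a["replace_on"] <= cutoff]
    return sorted(due, key=lambda a: a["replace_on"])


today = date(2013, 6, 1)
for weeks in (8, 26):
    due = due_for_replacement(assets, weeks, today)
    print(f"Due within {weeks} weeks: {[a['barcode'] for a in due]}")
```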
“Hacking” centralised data
Linking data management to process management
Data visualisation
Creating a contact directory
from an org chart
The concept
• An org chart is a list of positions linked to people
• A contact list is a list of people linked to contact
data
• The people who maintain org charts are often the
same people responsible for local contact lists
• What if I want a list of people sorted by where
they work?
The concept
• DO NOT update contact details locally
– Individuals must update their details in THEMIS
• Create links for Positions in org chart and link
reporting lines
• Link positions to usernames and lookup other details
• Export data for viewing
– GRAPHML for Org Chart
– XML, HTML or PDF for contact list
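The GraphML export can be sketched with the Python standard library: one node per position, one edge per reporting line, and the username stored as the only locally held link to a person. The position records, field names and key ids below are hypothetical; only the GraphML element names and namespace come from the GraphML format itself.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Made-up positions; "reports_to" holds the id of the supervising position.
positions = [
    {"id": "p1", "title": "Manager", "username": "asmith", "reports_to": None},
    {"id": "p2", "title": "Technical Support Officer", "username": "bjones",
     "reports_to": "p1"},
    {"id": "p3", "title": "Scanner Operator", "username": "cnguyen",
     "reports_to": "p1"},
]


def tag(name):
    """Qualify an element name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{name}"


def org_chart_graphml(positions):
    """Return GraphML text: a node per position, an edge per reporting line."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(tag("graphml"))
    # Declare the two data fields attached to every node.
    fields = {"title": "d0", "username": "d1"}
    for name, key_id in fields.items():
        ET.SubElement(root, tag("key"), {"id": key_id, "for": "node",
                                         "attr.name": name,
                                         "attr.type": "string"})
    graph = ET.SubElement(root, tag("graph"),
                          {"id": "orgchart", "edgedefault": "directed"})
    for p in positions:
        node = ET.SubElement(graph, tag("node"), {"id": p["id"]})
        for name, key_id in fields.items():
            ET.SubElement(node, tag("data"), {"key": key_id}).text = p[name]
    # Edges point from the supervising position to the reporting position.
    for p in positions:
        if p["reports_to"]:
            ET.SubElement(graph, tag("edge"),
                          {"source": p["reports_to"], "target": p["id"]})
    return ET.tostring(root, encoding="unicode")


print(org_chart_graphml(positions))
```

The same position list, joined to centrally managed contact details by username, would give the contact list sorted by where people work.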
Challenge
• It is technically possible for THEMIS to export an
XML data source for re-use
(Find an Expert)
• For various reasons it is not practical at this point
in time
• How do I collect centrally managed contact
information efficiently?
– Active Directory?
Raw data: It’s not pretty, but it’s useable
What I’ve learnt
• Many people know the problems but without a
technical solution nothing happens
• Working smarter requires everyone to work
together
– Managers, workers, technical people
• Know when to give up
• Working smarter is contagious
• IT support ≠ Technical support
Discussion/ Questions
© Copyright The University of Melbourne 2009