Developing a Generic Toolkit: Architecture and technology issues

ALLC/ACH Conference 2003
Chris Turner
Copyright, UCL
LEADERS: Linking EAD to Electronically
Retrievable Sources
Generic Criteria
• Achieving Reusability through:
– System independence
– Standardisation
– Availability & cost
– Support
– Sustainability
TVS Model
• The TVS model (Carvalho & Cordeiro, 2002)
– proposes a structured framework for the exploitation
of XML technologies
• Transport
– Data exchange, transfer between systems
• Validation
– Structure, semantics, basic data typing
• Services
– Reuse of search and display functionality
Web Services
• Web Services is an umbrella term for a set of standards
for communication between separate software programs.
They govern how data and instructions are exchanged
between programs that may run on different computers,
under different operating systems, and be written in
different languages. Web Services use XML as the message
format and HTTP as the transport protocol.
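As an illustration of XML as the message format, the following is a minimal sketch of how a client might build a SOAP 1.1 request using only the JAXP classes bundled with the JDK. The operation name `searchById`, the `svc:id` element and the namespace `urn:example:leaders-toolkit` are assumptions invented for this sketch; the real LEADERS message layout is defined by the toolkit's WSDL, which is not reproduced in these slides.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class SoapRequestSketch {
    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical service namespace, invented for this sketch.
    static final String SVC_NS = "urn:example:leaders-toolkit";

    /** Builds a SOAP request for a hypothetical "searchById" operation. */
    static String buildSearchByIdRequest(String id) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Envelope > Body > operation > parameter
        Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
        Element body = doc.createElementNS(SOAP_NS, "soap:Body");
        Element operation = doc.createElementNS(SVC_NS, "svc:searchById");
        Element idElement = doc.createElementNS(SVC_NS, "svc:id");
        idElement.setTextContent(id);

        operation.appendChild(idElement);
        body.appendChild(operation);
        envelope.appendChild(body);
        doc.appendChild(envelope);

        // Serialise the DOM tree to a string with an identity transform.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildSearchByIdRequest("doc-001"));
    }
}
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint; because the message is plain XML over HTTP, the receiving program need not be written in Java.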
System Components
User PC/Mac/X
Web
Browser
Server – Linux/Unix/Windows
LEADERS application
LEADERS Toolkit
Digitised/Encoded
Resources
Digitised/Encoded Resources
• EAD XML file – contains metadata, including index terms,
regarding the entire collection and its resources:
– Original documents
– Transcripts
– Images
• EAC XML files – descriptive metadata about people,
organisations and families
• TEI XML files – transcripts of original documents
• Image JPEG files – digitised images of original documents
System Components
User PC/Mac
Web
Browser
Server – Linux/Unix/Windows
LEADERS application
LEADERS Toolkit
Digitised/Encoded
Resources
Toolkit
• LEADERS Application
– WSDL: XML file describing the services, used by the application
– SOAP XML messages carry requests and responses between
application and toolkit
• Services, written in Java:
– Search name/place/topic indexes: return a browse list containing
the nearest match plus four entries above and below it in the index.
– Search by name, place, topic, date, individually or in combination:
return a hit list.
– Search by id: return a detailed object display.
• Search Engine
• XML Index document, derived from the EAD finding aid by a stylesheet
• Digitised/Encoded Resources
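The browse-list service described on this slide (nearest match plus four entries above and below) can be sketched as a binary search over a sorted index followed by a window around the hit. This is a hedged illustration, not the toolkit's code: the class and method names are hypothetical, the index is assumed to be an in-memory list sorted case-insensitively, and "nearest match" is assumed to mean the first entry at or after the query in alphabetical order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class BrowseListSketch {
    /**
     * Returns the nearest match to the query plus up to `context` entries
     * on each side. The index must be sorted with String.CASE_INSENSITIVE_ORDER.
     */
    static List<String> browse(List<String> sortedIndex, String query, int context) {
        if (sortedIndex.isEmpty()) {
            return new ArrayList<>();
        }
        int pos = Collections.binarySearch(sortedIndex, query, String.CASE_INSENSITIVE_ORDER);
        if (pos < 0) {
            // No exact hit: binarySearch returns -(insertionPoint) - 1.
            pos = -pos - 1;
            // A query past the last entry falls back to the last entry.
            if (pos == sortedIndex.size()) {
                pos = sortedIndex.size() - 1;
            }
        }
        // Clamp the window to the bounds of the index.
        int from = Math.max(0, pos - context);
        int to = Math.min(sortedIndex.size(), pos + context + 1);
        return new ArrayList<>(sortedIndex.subList(from, to));
    }

    public static void main(String[] args) {
        // Invented sample index of names.
        List<String> names = Arrays.asList(
            "Adams", "Baker", "Clark", "Davis", "Evans", "Foster",
            "Green", "Hill", "Irwin", "Jones", "King", "Lewis");
        System.out.println(browse(names, "Galton", 4));
    }
}
```

Near either end of the index the window is simply truncated, so fewer than nine entries may be returned.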
Toolkit
• Components:
– Server environment: Apache Cocoon
– Parser, Processors, etc.:
• JAXP (Sun), dom4j, Xerces, Xalan, FOP.
– Search engine: Lucene
– System neutral; Open source; Supported by
Apache Software Foundation
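To show how the JAXP and XSLT components listed here fit together, the sketch below derives a small XML index document from an EAD fragment by applying a stylesheet, in the spirit of the toolkit's index derivation. The stylesheet, the `<index>`/`<entry>` output format and the sample data are all assumptions made for illustration; only the EAD element names (`controlaccess`, `persname`, `geogname`, `subject`) come from the EAD standard itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class IndexDerivationSketch {
    // Hypothetical stylesheet: copies EAD access points into a flat <index> document.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='xml' omit-xml-declaration='yes'/>",
        "  <xsl:template match='/'>",
        "    <index>",
        "      <xsl:apply-templates select='//controlaccess/persname",
        "          | //controlaccess/geogname | //controlaccess/subject'/>",
        "    </index>",
        "  </xsl:template>",
        "  <xsl:template match='persname'>",
        "    <entry type='name'><xsl:value-of select='normalize-space(.)'/></entry>",
        "  </xsl:template>",
        "  <xsl:template match='geogname'>",
        "    <entry type='place'><xsl:value-of select='normalize-space(.)'/></entry>",
        "  </xsl:template>",
        "  <xsl:template match='subject'>",
        "    <entry type='topic'><xsl:value-of select='normalize-space(.)'/></entry>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Invented EAD fragment used only to exercise the stylesheet.
    static final String SAMPLE_EAD =
        "<ead><archdesc level='collection'><controlaccess>"
      + "<persname>Smith, John</persname>"
      + "<geogname>London</geogname>"
      + "<subject>Law reform</subject>"
      + "</controlaccess></archdesc></ead>";

    /** Applies the stylesheet to an EAD document and returns the index XML. */
    static String deriveIndex(String eadXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(eadXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(deriveIndex(SAMPLE_EAD));
    }
}
```

Because the derivation lives in a stylesheet rather than in Java code, a differently structured finding aid could in principle be indexed by swapping the stylesheet alone.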
Toolkit
• Reusability:
– Can be applied to other resources encoded
according to the schema rules.
– Services can be consumed by multiple
applications.
– Services may be added or extended.
– Can be hosted on Windows/Unix/Linux.
System Components
User PC/Mac
Web
Browser
Server – Linux/Unix/Windows
LEADERS application
LEADERS Toolkit
Digitised/Encoded
Resources
Application
• Hosted by Apache Tomcat
• ‘Consumes’ Web Services
• Components
– JavaServer Pages (JSP)
– Stylesheets
– Cascading Style Sheets (CSS)
Application
• Reusability
– Same application can be used to access different
resources served up by the Web Services.
– New applications can be created to consume services
accessing the same or different resources e.g.:
• Integration with an educational application
• Study of palaeography
• Focus on biographical or authority information
• On-line ordering of images/offprints/document production
Summary
• Reusability at all levels:
– Resource files may be re-used as stand-alone
files and/or as components in one or
more LEADERS applications.
– Toolkit can support multiple resource sets
and multiple applications in multiple
environments.
– Applications can access multiple resource
sets in multiple environments.
LEADERS Demo application
– An instance of a LEADERS Application.
– Concentration on the detailed presentation of
resources, rather than on search interface.
– Built for the purpose of gaining user feedback.
– Example screen shots of demonstrator
application.