Application Architecture

Archivists’ Toolkit
Preliminaries: Architecture, DB
Leslie Myrick
NYU
Possible Java Architecture
• JSP Model 2 Architecture
– Servlet Controller
• Handles requests, View selection, instantiates beans
– JSPs update the View in the browser
– JavaBeans used to represent the object in
memory; access DB using JDBC
• manage the Model
– JDBC connection to the data source
Similar Use of Servlet/JSP Model
in Digital Library Applications
• DSpace
• UC Berkeley’s GenX system
• CDL Preservation Repository
JSP Model 2
• Cleanest separation of presentation and content
– Clear delineation of roles of developers and designers
• Takes advantage of strengths of servlets and JSPs
for serving dynamic content
– JSP for presentation layer
– Servlets for performing process-intensive tasks
• Servlet as Controller: in charge of request processing, creation of beans or objects used by JSPs, and choosing the JSP to forward the request to
• No processing logic in JSPs: they simply retrieve the objects or beans instantiated by the servlet
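The division of labour described above can be sketched without a servlet container. This is a minimal illustration only: `ObjectBean`, `Model2Controller`, the `object.jsp` and `error.jsp` view names, and the hard-coded title are all hypothetical stand-ins, and plain maps take the place of the servlet request. A real controller would extend `HttpServlet` and the bean would be filled through JDBC.

```java
import java.util.Map;

/** Model: a plain JavaBean holding one archival object's data (hypothetical). */
final class ObjectBean {
    private int objectID;
    private String title;

    public int getObjectID() { return objectID; }
    public void setObjectID(int objectID) { this.objectID = objectID; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
}

/** Controller and view stand-ins; maps replace the servlet request object. */
final class Model2Controller {

    /** Controller role: process the request, create the bean, choose the view. */
    static String handle(Map<String, String> params, Map<String, Object> attrs) {
        int id;
        try {
            id = Integer.parseInt(params.get("objectID"));
        } catch (NumberFormatException e) {
            // Also reached when the parameter is missing (parseInt(null)).
            attrs.put("error", "objectID must be a number");
            return "error.jsp";
        }
        ObjectBean bean = new ObjectBean();
        bean.setObjectID(id);
        // Placeholder value; real code would load the title via JDBC.
        bean.setTitle("Series I: Documentary Material");
        attrs.put("object", bean);
        return "object.jsp";
    }

    /** View role: no processing logic, only reads what the controller stored. */
    static String render(String view, Map<String, Object> attrs) {
        if ("object.jsp".equals(view)) {
            ObjectBean bean = (ObjectBean) attrs.get("object");
            return "<h1>" + bean.getTitle() + "</h1><p>objectID "
                    + bean.getObjectID() + "</p>";
        }
        return "<p>Error: " + attrs.get("error") + "</p>";
    }
}
```

The view method never parses parameters or touches the data source, which is the point of Model 2: designers can change the markup without reading the request-handling code.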
JSP Model 2 Architecture
JSP Model 1
• Bulk of processing performed by JSP
– Process requests and draw view
• Fine for simple applications
JSP Model 1 Architecture
MySQL vs. PostgreSQL
• Both ACID compliant (transaction safe); in MySQL this requires InnoDB tables
• Both support referential integrity (as of MySQL 4.x)
• MySQL faster; PostgreSQL more robust
• Finer-grained locking in PostgreSQL
– Multi-Version Concurrency Control (MVCC) in PostgreSQL
• Want triggers? Views? Inheritance? For now go with PostgreSQL
• MySQL has built-in full-text search capability
• Ease of installation and maintenance: MySQL hands down.
The ACID test
• Atomicity - All elements of a given transaction take place or none do.
• Consistency - Each transaction transforms the database from one valid
state to another valid state.
• Isolation - The effects of a transaction are not visible to other
transactions in the system until it is complete.
• Durability - Once a transaction has been committed, its effects are
permanent, even if the system crashes or a disk dies.
Proposed DB Schema:
Archaeology / Genealogy
• Ultimately based on MOA II model
• With refinements to NYU’s zeroDB schema
for digital object metadata
• Torqued to describe archival objects and
their digital surrogates
• Same essential hook: pure Aristotelian
hierarchy
It all comes down to
object
• Pivotal entity is object nesting other objects
– objectType can be fonds, collection, component
– componentType can be series, file, item,
accretion
• Object hierarchy maintained through:
– objectID, parentID, nextSibID
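To illustrate how the three linking columns carry the hierarchy, here is a minimal sketch that recovers the children of one object in sibling order. It rests on assumptions not stated in the slides: a null `nextSibID` marks the last sibling, and the first sibling is the child that no other child points to. `ObjectHierarchy` and `Row` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ObjectHierarchy {

    /** One row of the object table, reduced to the three linking columns. */
    static final class Row {
        final int objectID;
        final Integer parentID;   // null for the top-level object (assumption)
        final Integer nextSibID;  // null for the last sibling (assumption)

        Row(int objectID, Integer parentID, Integer nextSibID) {
            this.objectID = objectID;
            this.parentID = parentID;
            this.nextSibID = nextSibID;
        }
    }

    /** Returns the objectIDs of parentID's children, following nextSibID links. */
    static List<Integer> orderedChildren(Collection<Row> rows, int parentID) {
        Map<Integer, Row> children = new HashMap<>();
        Set<Integer> pointedTo = new HashSet<>();
        for (Row r : rows) {
            if (r.parentID != null && r.parentID == parentID) {
                children.put(r.objectID, r);
                if (r.nextSibID != null) {
                    pointedTo.add(r.nextSibID);
                }
            }
        }
        // The first sibling is the child that no other child names as its next.
        Row current = null;
        for (Row r : children.values()) {
            if (!pointedTo.contains(r.objectID)) {
                current = r;
                break;
            }
        }
        List<Integer> ordered = new ArrayList<>();
        // The size check stops the walk if the data contains a cycle.
        while (current != null && ordered.size() < children.size()) {
            ordered.add(current.objectID);
            current = current.nextSibID == null ? null : children.get(current.nextSibID);
        }
        return ordered;
    }
}
```

In SQL terms this corresponds to selecting all rows with a given `parentID` and then ordering them by the linked list rather than by key, so components can be reordered by updating two `nextSibID` values.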
Object Table
object
PK
FK1
FK4
FK5
FK2
FK3
objectID
objectTypeID
componentTypeID
parentID
nextSibID
hasChildren
rightsID
accessionID
provenanceID
physDescID
processFinal
physLocID
Accession Table
accession
PK
accessionID
accessionTypeID
resourceID
recordCollectionTypeID
collectionSurvey
processingPlan
processingNote
acqinfo
accruals
appraisal
abstract
generalNote
scopecontent
arrangement
accessrestrict
preservationNote
conservationNote
otherfindaid
transferFinal
Provenance Table
provenance
PK
provenanceID
bioghist
bibliography
custodhist
fileplan
donorNote
provenanceNote
Physical Location Tables
physLoc
PK physLocID
FK1 physLocLevelID
FK2 physLocTypeID
physLoc
isPublic
objectID
physLocLevel
PK physLocLevelID
physLocLevel
physLocType
PK physLocTypeID
physLocType
CREATE TABLE physLoc (
  physLocID int(11) NOT NULL auto_increment,
  physLocLevelID int(11) NOT NULL default '0',
  physLocTypeID int(11) NOT NULL default '0',
  physLoc varchar(128) NOT NULL default '',
  isPublic tinyint(1) unsigned NOT NULL default '0',
  PRIMARY KEY (physLocID)
);
--
-- Data for table physLocLevel
--
INSERT INTO physLocLevel (physLocLevel) VALUES ('repository');
INSERT INTO physLocLevel (physLocLevel) VALUES ('internal location');
INSERT INTO physLocLevel (physLocLevel) VALUES ('physical container');
--
-- Data for table physLocType
--
INSERT INTO physLocType (physLocType) VALUES ('accession location');
INSERT INTO physLocType (physLocType) VALUES ('processing location');
INSERT INTO physLocType (physLocType) VALUES ('shelflist location');
INSERT INTO physLocType (physLocType) VALUES ('offsite location');
Ingest of Legacy Data from MARCXML
• Student Programmers’ Assignment
• Will probably involve JAXP/DOM
• Already undertaken: conversion of records from the Innopac iiirecord DTD to the marc21slim schema, and of tape .mrc files to MARCXML using MARC4J
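As a rough idea of what the JAXP/DOM step could look like, the sketch below pulls one subfield out of a marc21slim record using only the JDK's DOM parser. `MarcXmlFields` and its `subfield` method are hypothetical names, not part of the project, and a real ingest would loop over every record and field rather than return the first match.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class MarcXmlFields {

    /** Namespace of the MARC21 slim schema. */
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /**
     * Returns the text of the first subfield with the given code inside the
     * first datafield with the given tag that has one, or null if none does.
     */
    static String subfield(String marcXml, String tag, String code) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the NS-aware lookups below
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(marcXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagNameNS(MARC_NS, "datafield");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                if (!tag.equals(field.getAttribute("tag"))) {
                    continue;
                }
                NodeList subs = field.getElementsByTagNameNS(MARC_NS, "subfield");
                for (int j = 0; j < subs.getLength(); j++) {
                    Element sub = (Element) subs.item(j);
                    if (code.equals(sub.getAttribute("code"))) {
                        return sub.getTextContent();
                    }
                }
            }
            return null;
        } catch (Exception e) {
            // Parser configuration, I/O and well-formedness errors end up here.
            throw new IllegalArgumentException("Could not parse MARCXML", e);
        }
    }
}
```

For example, `MarcXmlFields.subfield(xml, "245", "a")` would return the title proper of a record, which could then feed the title table.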
Ingest of Legacy Data from EAD
• Testbed creation tool
• XSLT with Java Extensions using Xalan
– Get nextID from database
– Extensions instantiate and increment DBID,
parentID, nextSibID for each component in
<dsc>
– Write out to .sql file to dump into DB
<xalan:component prefix="counter"
elements="init incr" functions="read">
<xalan:script lang="javaclass" src="xalan://MyCounter"/>
</xalan:component>
<xsl:template match="/">
<counter:init name="index"/>
<!-- Assumed: the root template hands off to the named template below -->
<xsl:call-template name="dsc"/>
</xsl:template>
<xsl:template name="dsc">
<xsl:for-each select="ead/archdesc/dsc">
<xsl:variable name="dsc-parentID"><xsl:value-of select="counter:read('index')"/></xsl:variable>
<counter:incr name="index"/>
<xsl:for-each select="c01">
DBID: <xsl:value-of select="counter:read('index')"/>
PARENTID <xsl:value-of select="$dsc-parentID"/>
Series: c01-<xsl:number/>
Unittitle: <xsl:apply-templates select="did/unittitle"/>
Abstract: <xsl:apply-templates select="did/abstract"/>
<xsl:if test="./child::scopecontent">
Scopecontent:<xsl:for-each select="scopecontent/p"><xsl:apply-templates select="."/></xsl:for-each>
</xsl:if>
<!-- Excerpt ends here; handling of nested components is not shown -->
</xsl:for-each>
</xsl:for-each>
</xsl:template>
DBID: 3
PARENTID 2
Series: c01-1
Unittitle: Series I: Documentary Material
DBID: 4
PARENTID:3
Subseries: c02-1
Unittitle: Subseries A: Subjects
DBID:5
PARENTID: 4
Subseries: c03-1
Box: 1
Folder: 1
Unittitle: Advertising
Unitdate:undated
DBID:6
PARENTID: 4
Subseries: c03-2
Box: 1
Folder: 2-6
Unittitle: Art & Collecting
Unitdate: undated
DBID: 3
PARENTID: 2
NEXTSIBID: 126
Series: c01-1
Unittitle: Series I: Documentary Material
INSERT INTO OBJECT (objectID, parentID, nextSibID, hasChildren, componentTypeID)
VALUES (3,2,126,1,1);
INSERT INTO TITLE (titleID, titleTypeID, title, objectID)
VALUES (NULL,1,'Series I: Documentary Material',3);
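The numbering behind the output above can be sketched in plain Java, independent of XSLT. This is not the Xalan extension itself: `ComponentNumbering` and `Component` are hypothetical names, IDs are assumed to be handed out in document order, and a `nextSibID` of 0 is assumed to mean "no following sibling" and is written out as NULL.

```java
import java.util.ArrayList;
import java.util.List;

final class ComponentNumbering {

    /** One component of the finding aid (hypothetical in-memory form). */
    static final class Component {
        final String title;
        final List<Component> children = new ArrayList<>();
        int dbID;
        int parentID;
        int nextSibID; // 0 = no following sibling (assumption)

        Component(String title) { this.title = title; }

        Component add(Component child) {
            children.add(child);
            return this;
        }
    }

    /** Assigns IDs in document order and returns the next unused ID. */
    static int number(Component c, int parentID, int nextID) {
        c.dbID = nextID++;
        c.parentID = parentID;
        Component previous = null;
        for (Component child : c.children) {
            int childID = nextID; // the ID this child is about to receive
            nextID = number(child, c.dbID, nextID);
            if (previous != null) {
                previous.nextSibID = childID;
            }
            previous = child;
        }
        return nextID;
    }

    /** Appends one INSERT per component, parents before children. */
    static void toSql(Component c, StringBuilder out) {
        out.append("INSERT INTO object (objectID, parentID, nextSibID, hasChildren) VALUES (")
           .append(c.dbID).append(',')
           .append(c.parentID).append(',')
           .append(c.nextSibID == 0 ? "NULL" : String.valueOf(c.nextSibID)).append(',')
           .append(c.children.isEmpty() ? 0 : 1)
           .append(");\n");
        for (Component child : c.children) {
            toSql(child, out);
        }
    }
}
```

Because a sibling's ID is only known once the whole preceding subtree has been numbered, `nextSibID` is filled in after the recursive call returns. That is why a series with many descendants can have a `nextSibID` far above its own `objectID`.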