LaQuAT
Mike Jackson
Architect, EPCC
michaelj@epcc.ed.ac.uk
+44 131 650 5141
• 6-month JISC-funded ENGAGE project
– Facilitate use of e-infrastructures
• King's College London
– Centre for e-Research (CeRch)
– Arts and Humanities e-Science Support Centre (AHeSSC)
– Application-domain expertise and development
• EPCC
– Technology providers
• National Grid Service (NGS)
– Deployment
• Link legacy data – keep original data as-is
• Use OGSA-DAI components
– Data access and integration software
– Web services
• Write demonstrators
• Deploy on the NGS
• Publish experiences, guidelines and case studies
• First step towards an e-infrastructure for humanities
• Projet Volterra
– Department of History, University College London
• Data on late Roman legal texts
– Imperial pronouncements in Latin
• Relational database
• Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens
– Institut für Papyrologie, Heidelberg Academy of Sciences
• Greek papyri meta-data
– Bibliography
– Dates
– Places – where found + provenance
• Relational database
• Inscriptions of Aphrodisias
– King's College London
• Greek inscriptions meta-data
– Bibliography
– Dates
– Places – where found + provenance
– Transcript
• XML database
– One XML document per inscription
– eXtensible Mark-up Language
– Share structured data
– Encode documents
– EpiDoc XML – Text Encoding Initiative for inscriptions
<keywords>
  <term>
    <geogName type="ancientRegion" key="Asia" cert="high" full="yes">
      Asia
    </geogName>
  </term>
  <term>
    <geogName type="modernCountry" key="TR" cert="high" full="yes">
      Turkey
    </geogName>
  </term>
  <term>
    <placeName type="ancientFindspot" key="Aphrodisias" cert="high" full="yes">
      Aphrodisias
    </placeName>
  </term>
  <term>
    <rs type="textType" key="sacer" cert="high">
      Religious
    </rs>
  </term>
</keywords>
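As a rough illustration of how such keyword mark-up can be read by a program, the sketch below parses the fragment with Python's standard library and collects each term by its type attribute. It assumes the fragment has no XML namespace; real EpiDoc files normally declare the TEI namespace, so the element lookups would need adjusting. The function name extract_terms is hypothetical.

```python
import xml.etree.ElementTree as ET

# Keyword fragment from the slide, compacted (no namespace assumed).
KEYWORDS_XML = """
<keywords>
  <term><geogName type="ancientRegion" key="Asia" cert="high" full="yes">Asia</geogName></term>
  <term><geogName type="modernCountry" key="TR" cert="high" full="yes">Turkey</geogName></term>
  <term><placeName type="ancientFindspot" key="Aphrodisias" cert="high" full="yes">Aphrodisias</placeName></term>
  <term><rs type="textType" key="sacer" cert="high">Religious</rs></term>
</keywords>
"""


def extract_terms(xml_text):
    """Return {type attribute: (key attribute, element text)} for every term."""
    root = ET.fromstring(xml_text)
    terms = {}
    for term in root.findall("term"):
        for child in term:  # geogName, placeName or rs
            text = (child.text or "").strip()
            terms[child.get("type")] = (child.get("key"), text)
    return terms


terms = extract_terms(KEYWORDS_XML)
for term_type, (key, text) in terms.items():
    print(term_type, key, text)
```

Keying on the type attribute rather than the element name keeps the lookup uniform across geogName, placeName and rs.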
A prosopographical researcher wants to learn about patterns of relationships and activities among a group of individuals in a society of a certain period, by analysing data from the inscriptions and legal records of that period
• Volterra + HGV (relational + relational)
– Overlaps of dates and places
• Volterra + IAphrodisias (relational + XML)
– Overlaps of dates and people
• Volterra + IAphrodisias + HGV (relational + XML + relational)
– Overlaps of dates
• Insights outwith the scope of any single database
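The "overlaps of dates" idea can be sketched as an interval test applied across two sources. This is only an illustration of the comparison, not how OGSA-DAI performs it; the Record class, the function names and all sample records are invented for the example, and years are plain integers with negative values for BC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str  # which database the record came from
    ref: str     # record identifier (invented for this example)
    start: int   # first possible year, negative = BC
    end: int     # last possible year


def overlaps(a, b):
    """True when the two date ranges share at least one year."""
    return a.start <= b.end and b.start <= a.end


def cross_source_overlaps(left, right):
    """Pair every record in `left` with each overlapping record in `right`."""
    return [(l, r) for l in left for r in right if overlaps(l, r)]


# Invented sample data, for illustration only.
laws = [Record("Volterra", "law-A", 300, 305)]
papyri = [
    Record("HGV", "papyrus-A", 303, 303),
    Record("HGV", "papyrus-B", 130, 130),
]

pairs = cross_source_overlaps(laws, papyri)
for law, papyrus in pairs:
    print(law.ref, "overlaps", papyrus.ref)
```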
• Volterra
– Laws 1 (HonoreEL)
– Law ID
– Honore Ref No
– Titulus (source)
• HGV
– Erwähnte Daten 08-04-25 Kopie
– HGVjuni-9
– Erw. Daten exp.?
• SQL-92 standard and table and column names
– Names must not contain (, ), -, spaces, …
• Database products are more lenient
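One possible way to cope with such names is to map them to plain identifiers (a letter followed by letters, digits or underscores) before exposing them. The helper below is a hypothetical sketch, not part of OGSA-DAI; note that dropping accents (ä to a) is lossy, and quoting the original names as delimited identifiers would be an alternative.

```python
import re
import unicodedata


def to_sql92_identifier(name):
    """Map an arbitrary table or column name to a plain SQL identifier."""
    # Decompose accented letters and drop the combining marks (ä -> a).
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of disallowed characters with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", ascii_only).strip("_")
    # A regular identifier must start with a letter.
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "c_" + cleaned
    # SQL-92 limits identifiers to 128 characters.
    return cleaned[:128]


for original in ["Laws 1 (HonoreEL)", "Erwähnte Daten 08-04-25 Kopie",
                 "HGVjuni-9", "Erw. Daten exp.?"]:
    print(to_sql92_identifier(original))
```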
• Web services are based around XML document exchanges
• HGV in FileMaker Pro
– In the database
BGU XIII 2223.2-3 usually with ÍpÉ §moË in this context
– On the wire
<columnValue>BGU XIII 2223.2-3 ^K usually with ÍpÉ §moË in this context</columnValue>
• XML document + CTRL-K = invalid XML document
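A minimal sketch of one workaround: replace characters that XML 1.0 does not allow before the value is wrapped in an element. The sample value is a simplified stand-in for the HGV field, and clean_for_xml, column_value_xml and is_well_formed are hypothetical helper names, not OGSA-DAI functions.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Anything outside the character ranges that XML 1.0 permits.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_for_xml(value, replacement=" "):
    """Replace characters such as CTRL-K (0x0B) that XML 1.0 forbids."""
    return _INVALID_XML_CHARS.sub(replacement, value)


def column_value_xml(value):
    """Wrap a value in a columnValue element, escaping &, < and >."""
    return "<columnValue>" + escape(value) + "</columnValue>"


def is_well_formed(document):
    """True when the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Simplified stand-in for the field above, with an embedded CTRL-K.
raw = "BGU XIII 2223.2-3\x0busually in this context"

print(is_well_formed(column_value_xml(raw)))
print(is_well_formed(column_value_xml(clean_for_xml(raw))))
```

Replacing with a space rather than deleting keeps the words on either side of the control character apart.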
     | Publikation               | Datierung                      | Ort                          | Originaltitel
1    | CPR V 1                   | 66, 2. Sept.                   | Oxyrhynchos                  | Receipt for Dyke-tax
3    | CPR VII 1                 | 7 - 4 v.Chr.                   | Soknopaiu Nesos (Arsinoites) | Petition der Priester des Soknopaios an den Präfekten
9    | CPR XV 1                  | 3 v.Chr., 29. Aug. - 27. Sept. | Soknopaiu Nesos (Arsinoites) | Traduzione greca dell’atto di rinuncia alla casa-mulino
28   | O.Bodl. III 1 (S. XVIII)  | 130, 30. Apr.                  | Syene                        | Receipt for χειρωνάξιον
• Multi-lingual
• Accented characters, Greek, German,…
– Erwähnte Daten 08-04-25 Kopie
– Petition der Priester des Soknopaios an den Präfekten
– Traduzione greca dell’atto di rinuncia alla casa-mulino
– Receipt for χειρωνάξιον
• Character sets and encodings
– UTF-8 – our Linux test server
– CP1252 – my laptop
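The effect of mixing the two encodings can be reproduced in a few lines. The sketch below encodes one of the titles above as UTF-8, reads the bytes back as CP1252, and shows that Greek text cannot be written in CP1252 at all; it illustrates the mismatch in general and is not the project's actual fix.

```python
title = "Petition der Priester des Soknopaios an den Präfekten"

# Written as UTF-8 (the Linux server) but read as CP1252 (the laptop).
utf8_bytes = title.encode("utf-8")
misread = utf8_bytes.decode("cp1252")

# The damage is reversible as long as no bytes were lost on the way.
repaired = misread.encode("cp1252").decode("utf-8")

# Greek has no representation in CP1252 at all.
greek = "χειρωνάξιον"
try:
    greek.encode("cp1252")
    greek_fits_cp1252 = True
except UnicodeEncodeError:
    greek_fits_cp1252 = False

print(misread)
print(repaired == title, greek_fits_cp1252)
```

Stating the encoding explicitly at every read and write, rather than relying on each machine's default, avoids this class of problem.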
• Volterra in Microsoft Access
– Laws data about different time periods in 6 tables
– Not all tables have the same columns
– Not all tables use the same column names
• HGV in FileMaker Pro
– One massive table of 50,000 rows with 75 columns
• Database
– Data access and management tool
– Or
– Data entry and storage tool
• Volterra – legal texts
– SQL views: view presents N tables as one table
– Queried via OGSA-DAI
• HGV – papyrus records
– SQL views: view translates German column names to English
– Queried via OGSA-DAI
• DQP
– Client sends query to DQP
– DQP parses query and forms query plan
– DQP executes subqueries via workflows
– DQP aggregates results
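The two view roles described above can be sketched with an in-memory SQLite database standing in for Microsoft Access and FileMaker Pro. The table names laws_1 and laws_2, their columns, the view names and the English column names are all invented for the example; only the German column names and the sample HGV row come from the data shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two tables with different column names (hypothetical schema).
    CREATE TABLE laws_1 (law_id INTEGER, titulus TEXT);
    CREATE TABLE laws_2 (id INTEGER, source TEXT, note TEXT);
    INSERT INTO laws_1 VALUES (1, 'example source A');
    INSERT INTO laws_2 VALUES (2, 'example source B', 'extra column');

    -- View presents N tables as one table with a common set of columns.
    CREATE VIEW all_laws AS
        SELECT law_id, titulus AS source FROM laws_1
        UNION ALL
        SELECT id AS law_id, source FROM laws_2;

    -- Table with German column names, as in HGV.
    CREATE TABLE hgv ("Publikation" TEXT, "Datierung" TEXT, "Ort" TEXT);
    INSERT INTO hgv VALUES ('CPR V 1', '66, 2. Sept.', 'Oxyrhynchos');

    -- View translates German column names to English.
    CREATE VIEW hgv_en AS
        SELECT "Publikation" AS publication,
               "Datierung"   AS dating,
               "Ort"         AS place
        FROM hgv;
""")

laws = conn.execute(
    "SELECT law_id, source FROM all_laws ORDER BY law_id").fetchall()
papyri = conn.execute("SELECT publication, place FROM hgv_en").fetchall()
print(laws)
print(papyri)
```

Because the views live in the database, the original tables stay as they are and clients only ever see the unified, English-named schema.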
• Volterra
– Viewing multiple tables as a single table
– Full text searches
• HGV
– Mapping between German and English
– Unicode characters
– Full text searches
– Elementary JDBC driver – little meta-data
• IAphrodisias
– Unicode characters
– Full text searches
• Dates vary from the day to 50-100 year spans
• Same query
– “find objects from, or references to, the period between 1479 and 1425 BC”
– “find objects associated with the reign of Tuthmosis III”
– Authority lists
• Variants
– Spelling and languages
– Spatial/geographical changes – Tuthmosis expanded Egyptian rule
• Relational-XML data integration
– AIST
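An authority list can be sketched as a lookup from a canonical name to a date range, with a second table that folds variant spellings onto the canonical form, so that the reign-based query and the date-based query resolve to the same range. The dictionaries below are illustrative only: the range is the one quoted above, and the variant spellings and function names are assumptions for the example.

```python
import unicodedata

# Canonical name -> (start year, end year); negative = BC.
AUTHORITY = {"tuthmosis iii": (-1479, -1425)}

# Variant spellings mapped to the canonical form (illustrative).
VARIANTS = {
    "thutmose iii": "tuthmosis iii",
    "thutmosis iii": "tuthmosis iii",
}


def normalise(name):
    """Lower-case, strip accents and collapse whitespace."""
    folded = unicodedata.normalize("NFKD", name)
    folded = folded.encode("ascii", "ignore").decode("ascii")
    return " ".join(folded.lower().split())


def reign_to_range(name):
    """Return the date range for a ruler, or None if not in the list."""
    key = normalise(name)
    key = VARIANTS.get(key, key)
    return AUTHORITY.get(key)


print(reign_to_range("Tuthmosis III"))
print(reign_to_range("Thutmose  III"))
```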
• Existing on-line query forms
• Emulate these but with OGSA-DAI back ends