IMG2XML
Linking Text and Image with SVG
Hugh Cayless — NYU
Friday, October 9, 2009
Background
NEH-funded project at UNC Chapel Hill
Prototype linking of images, transcription, and notes
on the diary of a 19th-century UNC undergraduate
Develop and refine methods for producing vector
graphic tracings of manuscript text
Explore the theoretical background
Papyrus: P. Mich. Inv. 3088
SVG Tracing of the Previous Image (P. Mich. Inv. 3088)
Huitfeldt and Sperberg-McQueen’s
model of transcription
A reading is a sequence of typed tokens
The reader recognizes marks on the support and
interprets them as tokens
[Diagram: a token and its type, ʻΑʼ]
They aren’t really specific about the granularity of types;
words may be OK, for example
The img2xml model
An SVG tracing of the text consists of Shapes
Structures (such as lines, words, or letters) are the
overlap of one or more Shapes and a bounding Region
Structures map to elements in a transcription or to
annotations
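As a rough illustration of the model just described, the sketch below represents Shapes, Regions, and Structures as plain Python data classes. Everything in it is an assumption made for illustration: the class and field names, the use of axis-aligned bounding boxes in place of real SVG geometry, and the `transcribed_by` field are hypothetical and are not taken from the img2xml code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (min_x, min_y, max_x, max_y): a simplification of real SVG geometry.
BBox = Tuple[float, float, float, float]


@dataclass
class Shape:
    id: str
    bbox: BBox


@dataclass
class Region:
    id: str
    bbox: BBox


@dataclass
class Structure:
    id: str
    kind: str                  # e.g. "line", "word", or "letter"
    region: Region
    shapes: List[Shape] = field(default_factory=list)
    transcribed_by: str = ""   # hypothetical pointer into the transcription


def overlaps(a: BBox, b: BBox) -> bool:
    """True if two axis-aligned boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_shapes(region: Region, shapes: List[Shape]) -> List[Shape]:
    """Shapes whose bounding box overlaps the Region's bounding box.

    This is only a first filter: a Shape can overlap a Region without
    belonging to the Structure, so a real tool needs a further test.
    """
    return [s for s in shapes if overlaps(s.bbox, region.bbox)]


region = Region("rect1", (0.0, 10.0, 100.0, 20.0))
shapes = [
    Shape("path1", (5.0, 12.0, 15.0, 18.0)),    # inside the region
    Shape("path2", (40.0, 18.0, 50.0, 25.0)),   # straddles the lower edge
    Shape("path3", (5.0, 30.0, 15.0, 38.0)),    # belongs to another line
]
line = Structure("struct1", "line", region, candidate_shapes(region, shapes))
print([s.id for s in line.shapes])   # ['path1', 'path2']
```

The bounding-box test deliberately over-selects; deciding which overlapping Shapes actually belong to a Structure is left open here.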
[Diagram: in the SVG, Shapes and a Region form a Structure; the Structure corresponds to a Line, Word, or Letter in the Transcription]
[Figure: SVG tracing with added Regions; callouts mark a Region and the Shapes]
“Whence” in the line above is
touched by the “W” in “With”
and it in turn is touched by the
descender in “gaze” above.
The descender in “from” overlaps
the outlined Region, but does not
interact with it to form a Structure.
The combination of Shapes and a
Region constitutes a line Structure.
#struct1 | isA | http://philomousos.com/img2xml/ontology/Structure
jld-p010.svg#rect12755 | isA | http://philomousos.com/img2xml/ontology/Region
jld-p010.svg#path10728 | isA | http://philomousos.com/img2xml/ontology/Shape
jld-p010.svg#path10832 | isA | http://philomousos.com/img2xml/ontology/Shape
jld-p010.svg#path10812 | isA | http://philomousos.com/img2xml/ontology/Shape
dusenbery.xml#lb10-8 | transcribes | #struct1
jld-p010.svg#rect12755 | memberOf | #struct1
jld-p010.svg#path10728 | memberOf | #struct1
jld-p010.svg#path10832 | memberOf | #struct1
jld-p010.svg#path10812 | memberOf | #struct1
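One way to read the statements above is as subject | predicate | object triples. The sketch below parses a subset of them and gathers everything attached to `#struct1`. The pipe-separated layout is taken from the slide; the `parse` and `describe` helpers and the shape of the returned dictionary are hypothetical, not part of img2xml.

```python
from collections import defaultdict

# A subset of the statements shown on the slide.
STATEMENTS = """\
#struct1 | isA | http://philomousos.com/img2xml/ontology/Structure
jld-p010.svg#rect12755 | isA | http://philomousos.com/img2xml/ontology/Region
jld-p010.svg#path10728 | isA | http://philomousos.com/img2xml/ontology/Shape
dusenbery.xml#lb10-8 | transcribes | #struct1
jld-p010.svg#rect12755 | memberOf | #struct1
jld-p010.svg#path10728 | memberOf | #struct1
"""

ONTOLOGY = "http://philomousos.com/img2xml/ontology/"


def parse(text):
    """Split each non-empty line into a (subject, predicate, object) tuple."""
    triples = []
    for line in text.splitlines():
        if line.strip():
            subject, predicate, obj = (part.strip() for part in line.split("|"))
            triples.append((subject, predicate, obj))
    return triples


def describe(triples, structure_id):
    """Collect the members and transcription anchors of one Structure."""
    types = {s: o for s, p, o in triples if p == "isA"}
    summary = defaultdict(list)
    for s, p, o in triples:
        if o != structure_id:
            continue
        if p == "transcribes":
            summary["transcribed_by"].append(s)
        elif p == "memberOf":
            # Group members by their ontology type, e.g. "Region" or "Shape".
            kind = types.get(s, "").replace(ONTOLOGY, "") or "unknown"
            summary[kind].append(s)
    return dict(summary)


info = describe(parse(STATEMENTS), "#struct1")
print(info)
```

Grouping members by their `isA` type makes it easy to pull out the single Region and the list of Shapes for a given Structure.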
<path
d="m 120.24436,214.02349 c -2.99822,2.99822 -7.0785,7.32647 -7.98022,8.45362 -0.56358,0.69884 -1.33004,1.64564 -1.73581,2.0965 -0.76646,0.87918
-1.35258,2.02887 -1.12715,2.23176 0.22543,0.22543 1.10461,-0.18035 0.99189,-0.45086 -0.11271,-0.24798 1.19478,-2.07396 2.84042,-3.99011
0.27052,-0.31561 0.74392,-0.90172 1.05952,-1.33004 0.90172,-1.17224 6.15424,-6.49238 5.81609,-5.86118 -0.4734,0.87918 -3.08839,4.5086
-4.23808,5.86118 -1.66818,1.98378 -3.83231,4.89183 -3.83231,5.16235 0,0.56357 0.6312,0.13525 2.84042,-2.00633 2.00633,-1.91616 3.02076,-2.63753
3.02076,-2.14159 0,0.0676 -0.40577,0.76647 -0.90172,1.53293 -0.96935,1.48784 -1.12715,2.50227 -0.40577,2.93059 0.40577,0.27051 2.52481,-0.13526
3.60688,-0.67629 0.54103,-0.27052 0.60866,-0.24797 0.74392,0.18034 0.36068,1.1497 2.7277,1.12715 3.83231,-0.0225 0.36068,-0.38323 0.76646,-0.67629
0.87917,-0.67629 0.13526,0 0.92427,-0.56357 1.7809,-1.23987 0.83409,-0.67629 1.51038,-1.1046 1.51038,-0.9468 0,0.13526 -0.3156,0.81155
-0.67629,1.48784 -0.38323,0.67629 -0.67629,1.30749 -0.67629,1.42021 0,0.4734 0.90172,0.11271 1.44275,-0.58612 0.94681,-1.26241 2.07396,-1.82599
3.60688,-1.82599 1.89361,0 1.91616,-0.49594 0.0225,-0.54103 -1.01443,-0.0225 -1.62309,0.0676 -1.98378,0.3156 -0.69883,0.49595 -0.74392,0.45086
-0.65375,-0.67629 0.0902,-0.9468 0.0676,-1.01443 -0.4734,-1.01443 -0.36069,0 -1.05952,0.45086 -2.02887,1.33004 -2.02887,1.82598 -4.12537,3.22365
-4.73403,3.13347 -0.76646,-0.11271 -0.60866,-1.53292 0.29306,-2.61498 0.76646,-0.94681 0.94681,-2.16413 0.29306,-1.91616 -0.67629,0.22543
-1.21732,0.81155 -1.7809,1.89361 -0.56357,1.1497 -1.62309,1.9387 -2.95313,2.23176 -0.87918,0.20289 -1.42021,0.0902 -1.42021,-0.3156 0,-0.13526
0.36069,-0.96935 0.83409,-1.84853 0.49595,-0.9468 0.76646,-1.73581 0.67629,-1.96124 -0.27051,-0.69883 -1.33004,-0.42832 -2.61499,0.6312
-0.58611,0.47341 0.0902,-0.49594 3.02077,-4.41842 1.78089,-2.36702 2.45718,-3.404 2.79533,-4.32826 0.38323,-1.01443 -0.45086,-0.76646
-1.69073,0.49595 z"
id="path10812" style="fill:#000000;stroke:none"/>
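To relate a traced path like the one above to a Region, a tool needs at least a rough idea of where the path lies. The sketch below is one hedged way to get that: it computes the bounding box of a path's control points, assuming the path uses only the relative commands `m`, `c`, and `z`, as the path above appears to. The `control_bbox` function is hypothetical, the example path is made up and much shorter than the real one, and the result is an outer bound on the curve rather than its exact extent.

```python
import re

# A command letter, or a (possibly signed, possibly fractional) number.
TOKEN = re.compile(r"[a-zA-Z]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")


def control_bbox(d):
    """Bounding box of the control points of a path made of m, c and z."""
    tokens = TOKEN.findall(d)
    x = y = 0.0            # current point
    points = []
    command = None
    i = 0
    while i < len(tokens):
        if tokens[i].isalpha():
            command = tokens[i]
            i += 1
            if command in "zZ":
                continue   # closepath takes no arguments
        if command == "m":
            # Relative move (extra pairs are implicit relative line-tos).
            x, y = x + float(tokens[i]), y + float(tokens[i + 1])
            points.append((x, y))
            i += 2
        elif command == "c":
            # Relative cubic Bezier: two control points and an end point,
            # all relative to the current point at the start of the segment.
            nums = [float(t) for t in tokens[i:i + 6]]
            for dx, dy in zip(nums[0::2], nums[1::2]):
                points.append((x + dx, y + dy))
            x, y = x + nums[4], y + nums[5]
            i += 6
        else:
            raise ValueError(f"unsupported path command: {command!r}")
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)


print(control_bbox("m 10,10 c 5,-5 10,-5 15,0 0,5 -5,10 -10,10 z"))
# (10.0, 5.0, 25.0, 20.0)
```

Because a cubic Bezier curve always lies inside the convex hull of its control points, this box is guaranteed to contain the curve, which is enough for a first overlap test.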
Advantages
The SVG tracing gives you hooks in a facsimile of the
image that you can hang transcriptions or annotations
on.
It allows the capture of the structure and its relation to
the transcription.
SVG shapes can be manipulated using Javascript, made
transparent, (in)visible, colored, zoomed, etc.
Some structure-detection tasks are simpler with vector
graphics.
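The slide has in-browser JavaScript in mind. The sketch below does the same kind of restyling offline in Python, using the standard library's ElementTree to recolor and fade selected shapes by id. The tiny SVG document, the ids, and the `set_style` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)   # serialize without an ns0: prefix

SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="path1" d="m 0,0 10,0 0,10 z" style="fill:#000000;stroke:none"/>
  <path id="path2" d="m 20,0 10,0 0,10 z" style="fill:#000000;stroke:none"/>
</svg>"""


def set_style(root, ids, **properties):
    """Update CSS properties in the style attribute of elements with given ids.

    Keyword names use underscores for hyphens (fill_opacity -> fill-opacity).
    Returns the number of elements changed.
    """
    wanted = set(ids)
    changed = 0
    for element in root.iter():
        if element.get("id") in wanted:
            style = dict(
                item.split(":", 1)
                for item in element.get("style", "").split(";")
                if ":" in item
            )
            style.update({k.replace("_", "-"): v for k, v in properties.items()})
            element.set("style", ";".join(f"{k}:{v}" for k, v in style.items()))
            changed += 1
    return changed


root = ET.fromstring(SVG)
set_style(root, ["path2"], fill="#cc0000", fill_opacity="0.5")
print(ET.tostring(root, encoding="unicode"))
```

Given the ids of the Shapes that belong to a Structure, the same call could highlight a whole line or word at once.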
Disadvantages
2-dimensional
Relies on collapsing the image to a 1-bit (black/white)
colorspace.
Browser support (though Google may recently have
fixed that).
SVG lacks semantics beyond simple geometry
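To make the 1-bit point concrete, the sketch below collapses a tiny grayscale grid to black and white with a fixed threshold. This is an assumption for illustration only: the slides do not say how the project binarizes its images, and the pixel values and threshold are made up.

```python
def binarize(gray, threshold=128):
    """Map 8-bit grayscale rows to 1-bit rows: 1 for ink (dark), 0 otherwise."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


page = [
    [250, 245, 240, 235],
    [248, 90, 60, 230],
    [246, 140, 120, 228],   # faint ink: 140 is dropped, 120 is kept
]
for row in binarize(page):
    print("".join("#" if bit else "." for bit in row))
# ....
# .##.
# ..#.
```

The faint stroke in the last row shows what is lost: any gradation in ink density disappears, and marks near the threshold are either kept whole or dropped entirely.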
The Future / Questions
Prototype completion in winter 2009 (I hope).
Research into other texts (papyri, Archimedes
palimpsest data).
I’d like to incorporate this into a transcription tool
(maybe SoSOL – papyrus transcription editor under
development at UKY).
Is this useful?
Is the model sensible?
Do we need a model at all?
hugh.cayless@nyu.edu
http://github.com/hcayless/img2xml
http://philomousos.com/img2xml/
http://docsouth.unc.edu/dusenbery (not yet)