Convert InqScribe XML into UCLA XML

Steps to merge and convert InqScribe XML files once all XML files for one interview
have been received:
Note: If the files are still in .inqscr form, open each file in InqScribe and save it as
an XML file.
Step one:
Create a new XML file which will include all sessions.
Open a new XML document in oXygen.
Uncheck “use DTD or a schema”
Click OK
Copy and paste the following code [which covers 5 sessions; add or remove session blocks
as needed] to generate the new XML file. Do not leave any space between the opening and
closing tags and the text.
<?xml version="1.0" encoding="UTF-8"?>
<body>
<div type="transcript">
<div xml:id="session1" type="session">
<head>Session 1 (interview date with the month and day)</head>
<!--XML TEXT GOES HERE-->
</div>
<div xml:id="session2" type="session">
<head>Session 2 (interview date with the month and day)</head>
</div>
<div xml:id="session3" type="session">
<head>Session 3 (interview date with the month and day)</head>
</div>
<div xml:id="session4" type="session">
<head>Session 4 (interview date with the month and day)</head>
</div>
<div xml:id="session5" type="session">
<head>Session 5 (interview date with the month and day)</head>
</div>
<!--THIS WILL BE AT THE END OF THE MERGED DOCUMENT-->
</div>
</body>
ONCE ADDITIONAL SESSION CODE HAS BEEN PASTED IN AS NEEDED, CHANGE THE
SESSION NUMBERS. Delete the guideline text “XML TEXT GOES HERE”.
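[Note: for interviews with many sessions, the skeleton can also be generated by a script
instead of pasting and renumbering by hand. The following is a minimal Python sketch, not
part of the original workflow; the function name build_skeleton and the sample dates are
made up for illustration.]

```python
from xml.sax.saxutils import escape, quoteattr


def build_skeleton(session_dates):
    """Return the merged-file skeleton with one session div per interview date."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        "<body>",
        '<div type="transcript">',
    ]
    for number, date in enumerate(session_dates, start=1):
        # quoteattr adds the surrounding straight quotes that XML requires.
        lines.append('<div xml:id=%s type="session">' % quoteattr("session%d" % number))
        # No space between the tags and the text, as the instructions ask.
        lines.append("<head>Session %d (%s)</head>" % (number, escape(date)))
        lines.append("</div>")
    lines.extend(["</div>", "</body>"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dates; replace with the real month and day of each session.
    print(build_skeleton(["March 3", "March 10", "March 17"]))
```

The session numbers come from the position of each date in the list, so no renumbering is
needed afterwards.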
Step two: Copy and paste the content from each InqScribe XML file into the above XML file,
inside the matching session div.
Before copying, delete the following code at the top of the InqScribe XML document:
<?xml version="1.0" encoding="UTF-8"?><transcript>
Delete the following code from the bottom of the InqScribe XML document:
</transcript>
Important: DO NOT SAVE THIS CHANGE to the InqScribe XML file.
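[Note: the deletion in step two can be sketched in Python as below. This is an illustration
only and assumes the file begins with the XML declaration followed by a plain <transcript>
tag, as shown above; the function name and the sample content are made up. The function
works on text in memory, so the original file is never changed.]

```python
import re


def strip_transcript_wrapper(xml_text):
    """Remove the XML declaration and the outer <transcript> tags, keeping the content."""
    text = xml_text.lstrip("\ufeff").strip()           # drop a byte-order mark and outer whitespace
    text = re.sub(r"^<\?xml[^>]*\?>\s*", "", text)     # leading XML declaration
    text = re.sub(r"^<transcript>\s*", "", text)       # opening wrapper tag
    text = re.sub(r"\s*</transcript>$", "", text)      # closing wrapper tag
    return text


if __name__ == "__main__":
    # Made-up sample; real InqScribe output will contain the full transcript.
    sample = '<?xml version="1.0" encoding="UTF-8"?><transcript><scene>Hello</scene></transcript>'
    print(strip_transcript_wrapper(sample))
```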
Step three: Find and replace
1. Open the XML file with any XML editor.
2. Replace &lt; with <
Replace &gt; with >
3. Use replace to delete <notes> and </notes>
Replace with “blank” (leave the box blank)
4. Replace <scene and </scene
Replace <scene with <sp
Replace </scene with </sp
5. Save the XML file.
[Note: CTRL-C to copy; CTRL-V to paste]
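[Note: the find-and-replace operations in step three can be sketched as a short Python
function. This is a sketch under the assumption that the escaped brackets appear as the
entities &lt; and &gt;; the function name and the sample line are made up for illustration.]

```python
def apply_step_three(text):
    """Apply the step three replacements to the text of the merged file."""
    # 2. Turn escaped angle brackets back into real ones.
    text = text.replace("&lt;", "<").replace("&gt;", ">")
    # 3. Delete the <notes> wrapper tags (replace with nothing).
    text = text.replace("<notes>", "").replace("</notes>", "")
    # 4. Rename scene tags to sp tags; attributes after the tag name are kept.
    text = text.replace("</scene", "</sp").replace("<scene", "<sp")
    return text


if __name__ == "__main__":
    # Made-up sample line, not real InqScribe output.
    sample = '<scene id="1"><notes>&lt;p&gt;Hello&lt;/p&gt;</notes></scene>'
    print(apply_step_three(sample))
```

As with the manual replace, this is a blind text substitution, so the result still needs the
check in step four.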
Step four:
Check the document using the XML editor to make sure it is well-formed.
--Check for errors by clicking the blue arrow on the toolbar.
Click on the line of text that appears at the bottom of the screen. Then click on the
highlighted passage on the left of the screen to reveal the specific error. The oXygen
outline window is a very useful tool for this step.
Examples of common errors:
--Every open bracket should be matched with a closed bracket
--<sp may not have a </sp ending
--<p may not have a closing tag either
--Extra speaker tags can sometimes show up in the middle of a sentence and need to be
deleted
--Sometimes there are 2 separate sp (speaker) sections without a speaker tag because it is
actually the same speaker. Merge into one sp section.
[Note: sometimes one space needs to be added after an opening tag and before a closing
tag. Try this if the document will not format and there don’t seem to be any errors.]
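[Note: a quick well-formedness check can also be run outside the editor. The following is a
minimal sketch using Python's standard xml.etree.ElementTree parser; the function name is
made up, and oXygen remains the tool for locating and fixing the errors.]

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if the text parses, else (line, column, message) for the first error."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


if __name__ == "__main__":
    # The <sp> tag below is never closed, one of the common errors listed above.
    print(check_well_formed("<body><sp><p>Hi</p></body>"))
```

The parser stops at the first problem, so the check has to be repeated after each fix.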
To optimize the display of the XML file in oXygen, follow these steps:
Document -> Source -> Format and indent elements (Ctrl+I)
Or:
Document -> XML document -> Format and Indent
[Note: If the first way does not work, try the second way]
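[Note: if oXygen is not at hand, similar indentation can be approximated in Python. This is
a rough sketch, assuming Python 3.9 or later (for ET.indent); the function name is made up.
Unlike oXygen, this parser drops comments and the XML declaration, so it is only suitable
for viewing, not for producing the final file.]

```python
import xml.etree.ElementTree as ET


def format_and_indent(xml_text):
    """Return the document re-indented with two spaces per nesting level."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    ET.indent(root, space="  ")   # only whitespace-only text between tags is changed
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(format_and_indent("<body><div><head>Session 1</head></div></body>"))
```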
Open the final XML file in Internet Explorer (IE) to see any errors.
[Right-click the merged file to open it in IE]
Step five:
Open in Internet Explorer
Apply the stylesheet code at the top of the final XML file and print out the readable version.
The stylesheet code should be the second line of the file, directly after the line that
begins with <?xml, with no space between the two.
Stylesheet code:
<?xml-stylesheet type="text/xsl"
href="http://digital2.library.ucla.edu/xslt/local/interviewDisplayInHouse.xsl"?>
*Note: When opening in IE, a small window will open to confirm (it usually does not pop
up in front of the main window).
Print out on yellow paper.
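[Note: adding the stylesheet line can be sketched in Python as below. This is an
illustration, not part of the original workflow; it assumes the file starts with the
<?xml ...?> declaration, and the names STYLESHEET_PI and add_stylesheet are made up.]

```python
STYLESHEET_PI = (
    '<?xml-stylesheet type="text/xsl" '
    'href="http://digital2.library.ucla.edu/xslt/local/interviewDisplayInHouse.xsl"?>'
)


def add_stylesheet(xml_text, pi=STYLESHEET_PI):
    """Insert the stylesheet instruction as the second line, after the XML declaration."""
    if "<?xml-stylesheet" in xml_text:
        return xml_text                      # already applied, leave the file alone
    end = xml_text.find("?>")
    if not xml_text.startswith("<?xml") or end == -1:
        raise ValueError("expected the file to start with an XML declaration")
    declaration, rest = xml_text[: end + 2], xml_text[end + 2 :]
    # No blank line between the declaration, the stylesheet line, and the content.
    return declaration + "\n" + pi + "\n" + rest.lstrip()


if __name__ == "__main__":
    print(add_stylesheet('<?xml version="1.0" encoding="UTF-8"?>\n<body/>'))
```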
Step six:
Once the interview is returned from the interviewee, make the indicated changes in
oXygen.
Perform a spell check.
Step seven:
Once the interview has been reviewed by the interviewer/editor and all of the changes are
done, save the individual session files in oXygen:
1. Open the merged (edited) file (e.g. Green1-9.xml)
2. Open the outline: Go to Perspective -> Show View -> Outline.
3. Open a new file: File -> New -> uncheck the box in the top left corner -> Click OK.
4. Copy and paste the simplified TEI P5 XML template into the new XML file.
http://oralhistory-dev.library.ucla.edu/technology/simpleTEIp5template.xml
Complete the interview title. Here is a simple example with short speech:
http://oralhistory-dev.library.ucla.edu/technology/simpleTEIp5.xml
5. Go back to the merged file and click on div (a list of divs should appear, one for each
session).
6. Click on the first div and copy (Ctrl+C). Clicking on the first div will
automatically highlight the entire session 1. The second div will auto-select the
entire session 2, etc.
7. Go to the new untitled document and paste (Ctrl+V) the first session in between
<body> and </body>
8. Save the file (e.g. Green1.xml)
9. Repeat steps 3 through 8 for the second session, third session, etc. For session 2, click
on the second div and copy. For session 3, click on the third div and copy, and so on.
When step 8 is reached, save the file under the name for the corresponding
session (e.g. Green2.xml for the second session, Green3.xml for the third session, etc.)
10. Check each final XML file and make sure that it is well-formed.
(Every tag has a closing tag to match.)
11. Format and indent the whole XML document (so the final display of the XML file
looks nice)
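[Note: the copy-and-paste loop in step seven can be sketched in Python as below. This is an
illustration only. PLACEHOLDER_TEMPLATE is a made-up stand-in, not the real simplified TEI
P5 template from the URL above; the real template may use an XML namespace, in which case
the lookup of the body element would need adjusting. The function name is also made up.]

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the simplified TEI P5 template; use the real one in practice.
PLACEHOLDER_TEMPLATE = "<TEI><text><body></body></text></TEI>"


def split_sessions(merged_xml, base_name, template=PLACEHOLDER_TEMPLATE):
    """Return {filename: xml_text} with one file per session div of the merged document."""
    merged_root = ET.fromstring(merged_xml.encode("utf-8"))
    sessions = merged_root.findall(".//div[@type='session']")
    files = {}
    for number, session in enumerate(sessions, start=1):
        doc = ET.fromstring(template)        # fresh copy of the template per session
        doc.find(".//body").append(session)  # paste the session between <body> and </body>
        files["%s%d.xml" % (base_name, number)] = ET.tostring(doc, encoding="unicode")
    return files


if __name__ == "__main__":
    merged = (
        '<body><div type="transcript">'
        '<div xml:id="session1" type="session"><head>Session 1</head></div>'
        '<div xml:id="session2" type="session"><head>Session 2</head></div>'
        "</div></body>"
    )
    for name, text in split_sessions(merged, "Green").items():
        print(name, text)
```

The returned texts still need the interview title completed and the checks in items 10 and
11 before they are saved.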
Step eight:
After you have finished the seven steps above, each XML file needs to be further modified
into a simplified TEI P5 file. You can either apply a stylesheet to the final transcript
file or modify it manually to add the milestone tag and modify the sp tag. Please follow the
separate document for the necessary changes.