depify - XML London

Diaries of a Desperate (XML|XProc)
Hacker
James Fuller
Lead Engineer | MarkLogic
Background
• Engineer on MarkLogic API team (History meters, Management API, etc…)
• W3C XML Processing WG (XProc v2.0)
• 2001 started with XML tech (EXSLT), XML Prague, etc…
• Open source contrib.
• Thank you to the organisers of
XProc XML London 2015
Agenda
1. XML Hacker Desperation
2. XMLCalabash & depify
3. Show & Tell
4. XProc Hacker Desperation
5. Summary
6. Goto pub
* Yes, I am going to ‘powerpoint’ you
* Raise your hand to ask a question
Email !!!
The
D.P.H.
xkcd.com - http://xkcd.com/208/ [xkcd-ref]
D.P.H. – a twinkling in SGML’s eye
• Desperate Perl Hacker
– Paul Grosso 1997 xml-dev link
– Google images ‘desperate perl hacker’ link
– Etymological cousin of ‘Just Another Perl Hacker’
(JAPH) – Randal Schwartz aka Merlin
• What’s it all about?
– GSD
– Opaque one-liners (Perl Golf encouraged)
– Even better if (regex|pipes|sed|awk) involved
– Challenge: Be able to munge XML with Perl
Desperate XML Hacker
• GAD (Get it All Done) with XML Stack
• ‘clever’ (and|or) ‘clear’
• Highly productive, albeit marooned and anxious on ‘XML island’
• Working with XML means working with documents and that means working with document workflows
All programmers are desperate
marklogic, emacs, ant, xml, xpath, json, xslt, java, xquery, gradle, bash, …
• Day 1 - transform an xml doc with XSLT
• Day 2 - run transform on set of docs
• Day 3 - generate multiple output formats
• Day 4 - read docs from database
• Day 5 - put results into database
• Day 6 - notify when it’s done
• Day 7 - run assertions and validate results
• Day 8 - generate png from svg for each document
• Day 8 - zip up files and upload them (w/ oauth)
• Day 9 - create EPub
• And so forth …
Technology Selection
[Diagram: xml doc and xslt, read from the file system or a database, are transformed into a result doc; a result image is generated; results are zipped into a package; then notify and upload]
– XSLT
– XQuery
– Bash scripts
– Makefiles
– Ant
– Java
– All of the above ?
[Diagram: the same workflow grouped into three stages, TRANSFORM (xml doc + xslt → result doc), GENERATE (result image) and PACKAGE (zip), followed by notify and upload]
Adhoc pipelines
Pipelines manage complexity
[McGrath2004] Sean McGrath. Performing impossible feats of XML processing with pipelining. Proc. XML Open 2004.
• Transformation decomposition is the key to complexity management, just ask:
– Henry Ford
– Herbert Simon (The Two Watchmakers – “The Architecture of Complexity”)
– George Miller (7+/-2)
– Adam Smith (An Inquiry into the Nature and Causes of the Wealth of Nations, 1776)
– Any electrical/chemical engineer
– Michael A. Jackson
• Easy to build, test and reuse
• Segregation of business rules from grammar rules
• Enable group collaboration
Michael Kay Balisage 2009 – ‘You Pull,
I’ll Push: on the Polarity of Pipelines’
• ‘the code of each step in the pipeline is kept
very simple’
• ‘very easy to assemble an application from a
set of components, thus maximizing the
potential for component reuse’
• ‘there is no requirement that each step in a
pipeline should use the same technology; it's
easy to mix XSLT, XQuery, Java and so on in
different stages.’
http://www.balisage.net/Proceedings/vol3/html/Kay01/BalisageVol3-Kay01.html
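The points quoted above (simple steps, assembly from components, reuse) can be illustrated with a minimal plain-Java sketch. This is not XProc and not taken from the talk: the class `PipelineSketch`, its `pipeline` helper and the three toy steps are hypothetical names chosen for illustration, and documents are modelled as plain strings to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: a pipeline is an ordered list of small, single-purpose steps. */
final class PipelineSketch {

    /** Chains the steps so that the output of each one becomes the input of the next. */
    static UnaryOperator<String> pipeline(List<UnaryOperator<String>> steps) {
        return input -> {
            String doc = input;
            for (UnaryOperator<String> step : steps) {
                doc = step.apply(doc);
            }
            return doc;
        };
    }

    // Three deliberately trivial steps; real ones might wrap XSLT, XQuery or Java code.
    static final UnaryOperator<String> TRIM = String::trim;
    static final UnaryOperator<String> RENAME_ROOT = doc -> doc.replace("draft", "article");
    static final UnaryOperator<String> ADD_DECLARATION = doc -> "<?xml version=\"1.0\"?>" + doc;

    public static void main(String[] args) {
        UnaryOperator<String> publish =
                pipeline(Arrays.asList(TRIM, RENAME_ROOT, ADD_DECLARATION));
        // Prints: <?xml version="1.0"?><article>Hello</article>
        System.out.println(publish.apply("  <draft>Hello</draft>  "));
    }
}
```

Each step can be tested on its own and reused in a different pipeline simply by building another list, which is the property the quotes emphasise.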
Use all the XML technologies …
XML – The Good Parts
Modern XML: Tier 1 and Tier 2
Core: XML 1.0, Namespaces, XPath 1.0/2.0/3.0, XML Canonicalization
Transform/Query: XSLT 1.0/2.0/3.0, XQuery 1.0/3.0, XSLT 1.0/2.0 (in browser)
Processing: SAX, DOM, XProc?, XOM
Other: XML Catalog, XForms
Schema: Schematron, XML Schema 1.0, RELAX-NG, XML Schema 1.1
Semantics: RDF, OWL, SPARQL, SPARQL Update
Vocabularies*: SVG, ‘Office’ Doc ML, MathML, Docbook, DITA, XHTML, …
- Amended from XML Amsterdam 2012 Keynote
Dependency Adoption
(technology selection)
Dependency Adoption
Helter skelter
http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Helter_skelter.jpg/440px-Helter_skelter.jpg
It’s more like this
The right Tool
Obligatory Jedi slide
But it works!
Java and XML
xml:Father - “XML gives Java something to do.”
• XML, Java, and the future of the Web, 1997, Jon Bosak - http://www.ibiblio.org/pub/sun-info/standards/xml/why/xmlapps.htm
• SAX,DOM
• Unicode support
• Distributed
• Care and feeding of the Java VM
• Invoke abstraction (classpath, jar fun)
Do Java and XML work better together?
Not enough time
Desire to be Productive
10x programmers is not a myth
• Augustine, N. R. 1979. "Augustine’s Laws and Major System Development Programs." Defense Systems Management Review: 50-76.
• Boehm, Barry W., and Philip N. Papaccio. 1988. "Understanding and Controlling Software Costs." IEEE Transactions on Software Engineering SE-14, no. 10 (October): 1462-77.
• Boehm, Barry, et al. 2000. Software Cost Estimation with Cocomo II. Boston, Mass.: Addison Wesley.
• Boehm, Barry W., T. E. Gray, and T. Seewaldt. 1984. "Prototyping Versus Specifying: A Multiproject Experiment." IEEE Transactions on Software Engineering SE-10, no. 3 (May): 290-303. Also in Jones 1986b.
• Card, David N. 1987. "A Software Technology Evaluation Program." Information and Software Technology 29, no. 6 (July/August): 291-300.
• Curtis, Bill. 1981. "Substantiating Programmer Variability." Proceedings of the IEEE 69, no. 7: 846.
• Curtis, Bill, et al. 1986. "Software Psychology: The Need for an Interdisciplinary Program." Proceedings of the IEEE 74, no. 8: 1092-1106.
• DeMarco, Tom, and Timothy Lister. 1985. "Programmer Performance and the Effects of the Workplace." Proceedings of the 8th International Conference on Software Engineering. Washington, D.C.: IEEE Computer Society Press, 268-72.
• DeMarco, Tom, and Timothy Lister. 1999. Peopleware: Productive Projects and Teams, 2d Ed. New York: Dorset House.
• Mills, Harlan D. 1983. Software Productivity. Boston, Mass.: Little, Brown.
• Sackman, H., W. J. Erikson, and E. E. Grant. 1968. "Exploratory Experimental Studies Comparing Online and Offline Programming Performance." Communications of the ACM 11, no. 1 (January): 3-11.
• Valett, J., and F. E. McGarry. 1989. "A Summary of Software Measurement Experiences in the Software Engineering Laboratory." Journal of Systems and Software 9, no. 2 (February): 137-48.
• Weinberg, Gerald M., and Edward L. Schulman. 1974. "Goals and Performance in Computer Programming." Human Factors 16, no. 1 (February): 70-77.
Except when it is a myth
• technical debt
– Maintainable/Upgrade
– Add new features
– Enterprise requirements
• more bugs
• brittle code
Upfront design
Technology selection
Balancing trade-offs to achieve sum gain
reflection
• Desperate people do desperate things
– Use all the XML technologies
– Dependency adoption
– Not the right tool
– Not enough time
– Being productive
avoid being a D.X.H.
• Careful technology selection
• Manage your dependencies
• Avoid distributing logic up/down/across tech
stack (hint: don’t use bash, makefiles, ant, etc)
• Simplify interaction with Java (VM)
• Model pipelines (hint: XProc)
avoid being a D.X.H.
• Use XProc (XMLCalabash)
– XProc is designed for XML processing pipelines
– Extensible
– Simplify and aggregate logic
• Use XProc extension steps (depify)
– XProc w/o extension steps is half of XProc
– Provide façade over other technologies
We use pipelines
• John Lumley – worked with DITA OT
• Sandro Cirulli – workflow (pull scm, push db, process)
• Nic Gibson – conversion workflows
• Philip Fearon – types of workflows (seq and concurrent) with XMLFlow
• Andrew Sales – schematron on word docs (used Ant)
• ….
• most talks mentioned workflow/pipeline
– ~100 mentions in proceedings
– guesstimate ~6 mentions per hour during the talks
Desperate XProc Hacker
• XProc learning curve
– v1.0 verbose in places
– XProc generic by design
– Some ‘Batteries not included’
• XProc v2.0 addresses this
– Simplify connecting steps
– Simplify parameters (maps)
– Flow control
– Metadata
– Anything ‘flows’
– avt/tvt
– Syntactic optimisations
• depify provides a way to distribute and reuse extension steps, which beats the problems that arise with the ‘hairball’ approach
XMLCalabash & depify
• XMLCalabash – XProc processor
– Norm Walsh
– http://xmlcalabash.com/
• depify – XProc dependency management
– http://depify.com/
XMLCalabash extension steps
package com.example.library;

import com.xmlcalabash.library.DefaultStep;
// … elided …
import com.xmlcalabash.runtime.XAtomicStep;

// The annotation binds this class to the step type declared in the library.
@XMLCalabash(
        name = "ex:hello-world",
        type = "{http://example.org/xmlcalabash/steps}hello-world")
public class HelloWorld extends DefaultStep {
    // Pipe backing the step's single output port.
    private WritablePipe result = null;

    public HelloWorld(XProcRuntime runtime, XAtomicStep step) {
        super(runtime, step);
    }

    // Called by the runtime to hand the step its output pipe.
    public void setOutput(String port, WritablePipe pipe) {
        result = pipe;
    }

    public void reset() {
        result.resetWriter();
    }

    // Builds the result document and writes it to the output port.
    public void run() throws SaxonApiException {
        super.run();
        // … elided …
        tree.addText("Hello World");
        // … elided …
        result.write(tree.getResult());
    }
}
Library for the step
<p:library version="1.0"
           xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:c="http://www.w3.org/ns/xproc-step"
           xmlns:ex="http://example.org/xmlcalabash/steps">
  <p:declare-step type="ex:hello-world">
    <p:output port="result"/>
  </p:declare-step>
</p:library>
library xpl included in jar
M Filemode    Length  Date         Time      File
- ----------  ------  -----------  --------  -----------------------------------------------------
  drwxr-xr-x       0   8-Mar-2015  10:43:38  META-INF/
  -rw-r--r--     843   8-Mar-2015  10:43:38  META-INF/MANIFEST.MF
  drwxr-xr-x       0   8-Mar-2015  10:43:38  com/
  drwxr-xr-x       0   8-Mar-2015  10:43:38  com/example/
  drwxr-xr-x       0   8-Mar-2015  10:43:38  com/example/library/
  -rw-r--r--    2062   8-Mar-2015  10:43:38  com/example/library/HelloWorld.class
  drwxr-xr-x       0   8-Mar-2015  10:43:38  META-INF/annotations/
  -rw-r--r--      31   8-Mar-2015  10:43:38  META-INF/annotations/com.xmlcalabash.core.XMLCalabash
  -rw-r--r--     294  19-Feb-2015  15:41:00  example-library.xpl
- ----------  ------  -----------  --------  -----------------------------------------------------
                3230                         9 files
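The listing above shows the library file sitting at the root of the jar next to the compiled step. As a rough sketch of how one might confirm that a jar actually carries its library before distributing it, the following uses only the standard `java.util.jar` API. The class `JarLibraryCheck` and both of its methods are hypothetical helpers, not part of XMLCalabash or depify, and the demo jar is a throwaway built in a temp file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Hypothetical helper: checks that a step jar ships a given library file. */
final class JarLibraryCheck {

    /** Returns true when the jar has an entry with exactly the given name. */
    static boolean containsEntry(Path jar, String entryName) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    /** Builds a tiny throwaway jar holding a single text entry, for demonstration only. */
    static Path buildDemoJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("demo-step", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            jarOut.putNextEntry(new JarEntry(entryName));
            jarOut.write(content.getBytes(StandardCharsets.UTF_8));
            jarOut.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        Path jar = buildDemoJar("example-library.xpl", "<p:library/>");
        System.out.println(containsEntry(jar, "example-library.xpl"));
        Files.delete(jar);
    }
}
```

Pointing `containsEntry` at a real step jar with the entry name `example-library.xpl` would perform the same check against the layout shown in the listing.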
depify
• depify.com
• depify client
• depify github
• Usage of XMLCalabash
• Usage of depify
• Develop your own step
• Distribute with depify
depify future
• Gradle plugin
• Depify into other repos to enable day zero
bootstrap (w/ yum, etc)
• Integration (expath package management)
• More steps
• More steps
• More steps
Summary
• XProc extension steps provide reuse
• XProc v2.0 lets you work in broader context
• Pipelines manage complexity
• depify specifically built for XProc (XMLCalabash)
• Reuse with existing mechanisms (e.g. Maven)
How to Become
a Delighted XProc Hacker
• Stop using bash, makefiles, ant or bending XML tech to control main loop
• Stop making adhoc pipelines
• model pipelines with XProc (XMLCalabash)
• try out ext steps (depify)
• GSD
• reuse and distribute new steps (depify)
• goto pub
Thank you for your attention and
time, questions ?
<pub/>