(S)PML
(Simple)? PDF Manipulation Language
Stefano Pacifico
Jayesh Kataria
Dhivya Khrishnan
Hye Seon Yi
Contents
• Overview & Motivation
• Language Features
• PDF Functionalities
• Architectural Design
• Tutorials (including example)
• Lessons Learned
• Summary
Overview & Motivation
• SPML (Simple PDF Manipulation Language) is a
language to create and manipulate PDF files.
• PDF is a de facto standard for electronic documents because it is an open standard format with a free viewer (Acrobat Reader).
• However, manipulating PDF files is difficult and expensive!
• A few open source libraries are available (e.g., iText and XPAAJ). Why not build a language for PDF on top of them?
• The focus is on manipulating PDF files, since people can easily create them with freeware (e.g., PDF ReDirect, CutePDF Writer).
Language Features
• Carefully chosen set of keywords
• Multiple Types (int, string, pdf, void, array)
• Several operators
– Unary operators (~, !)
– Arithmetic operators (+, -, *, /)
– Comparison operators (<, <=, >, >=, ==, !=)
– Logical operators (&&, ||)
– PDF operators (+, create, extractpage, totextfile, highlight, in)
Language Features (cont.)
• Various types of statements
– Conditional statements (if…else)
– Iterative statements (while)
– Jump statements (return, continue, break)
– I/O statements (print, totextfile)
• User defined functions
• Recursion
PDF Functionalities
• File generation (create)
• File concatenation (+ operator)
• Page extraction (extractpage)
• Highlighting a word (highlight)
• All PDFs in a directory (in)
• Text file support (totextfile)
Architectural Design
• Front End (SPMLLexer, SPMLParser): takes SPML source code and outputs an AST
• Tree Walker and Back End (SPMLWalker, CompilerException, SPMLCodeGen, Environment classes, CodeGen): given the AST, perform static semantic checking and generate Java output code
• Runtime Library (SPMLLibrary): bridge class between the Java output code and the runtime libraries
• Runtime libraries: JRE System, iText (open source PDF library in Java), XPAAJ (XML/PDF Access API for Java, from Adobe)
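The walker stage described above can be illustrated with a minimal Java sketch. It is not the SPML implementation: the node classes, the string-based type check, and the emitted helper names `SPMLLibrary.open` and `SPMLLibrary.concat` are all assumptions made for illustration, since the slides name only the SPMLLibrary class and not its methods. The sketch shows the two jobs the slide assigns to the walker: a static semantic check, then generation of Java output code.

```java
// Hypothetical AST node: every node reports a static type and emits Java text.
interface Node {
    String type();  // "pdf" or "int" in this sketch
    String emit();  // Java source text generated for this node
}

// A PDF file reference such as "a.pdf" assigned to a pdf variable.
class PdfLiteral implements Node {
    final String path;
    PdfLiteral(String path) { this.path = path; }
    public String type() { return "pdf"; }
    public String emit() {
        // SPMLLibrary.open is an assumed helper name, not a documented API.
        return "SPMLLibrary.open(\"" + path + "\")";
    }
}

// An integer constant, used here only to trigger a type error.
class IntLiteral implements Node {
    final int value;
    IntLiteral(int value) { this.value = value; }
    public String type() { return "int"; }
    public String emit() { return Integer.toString(value); }
}

// The '+' operator applied to two PDFs (file concatenation).
class PdfConcat implements Node {
    final Node left;
    final Node right;
    PdfConcat(Node left, Node right) { this.left = left; this.right = right; }
    public String type() {
        // Static semantic check: both operands must have type pdf.
        if (!left.type().equals("pdf") || !right.type().equals("pdf")) {
            throw new IllegalStateException("PDF concatenation needs two pdf operands");
        }
        return "pdf";
    }
    public String emit() {
        type();  // run the check before generating any code
        // SPMLLibrary.concat is an assumed helper name as well.
        return "SPMLLibrary.concat(" + left.emit() + ", " + right.emit() + ")";
    }
}

class WalkerSketch {
    public static void main(String[] args) {
        Node ast = new PdfConcat(new PdfLiteral("a.pdf"), new PdfLiteral("b.pdf"));
        System.out.println(ast.emit());
        try {
            new PdfConcat(new PdfLiteral("a.pdf"), new IntLiteral(3)).emit();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Routing all PDF work through one bridge class, as the slide describes for SPMLLibrary, would keep the generated code independent of the iText and XPAAJ APIs.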
Tutorials - Example
• Program to concatenate two PDF files
start()
{
    pdf p1;
    p1 = "a.pdf";                   /* open a.pdf */
    pdf p2;
    p2 = "b.pdf";                   /* open b.pdf */
    pdf combined;
    combined = create "c.pdf";      /* create c.pdf */
    combined = p1 + p2;
}
Tutorials (cont.)
Function: Example
• Variable declaration: pdf file;
• Array declaration: pdf files[10];
• Conditional statement: if (a == 1) { print "a is 1"; }
• Iteration statement: while (a < 5) { print "a = " + a; a = a + 1; }
• Jump statement: return a; continue; break;
• I/O statement: print "Hello World!";
• User-defined function: int sum(int a, int b) { return a + b; }
• Recursion: used to reverse a file (coming soon in the demo)
Tutorials (cont.)
Function: Example
• Length operator: int a; a = length files;
• In operator
– All PDFs in a directory: files = pdf in "dir";
– Phrase search in a PDF: int iArray[10]; iArray = "the" in files[0];
– Phrase search in a string: a = "1" in "12345";
• Extract a page: file = extractpage files[0] 1;
• Highlight a phrase: highlight pdfFile "COMS";
• Save as a text file: totextfile pdfFile "file.txt";
Applications
• Forming a catalogue of PDFs
• Reversing file pages
• Deleting a page from a PDF
• Extracting even and odd pages and forming a new PDF
• Swapping two pages of a file
• Highlighting a word in a PDF
• Forming a new PDF of pages containing a specific word
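Several of these applications reduce to the two primitives of concatenation and page extraction. The Java sketch below models a PDF as a plain list of page labels to show that reduction; it does not use iText or XPAAJ and is not SPML's generated code. The class and method names are hypothetical, and 1-based page numbering is an assumption, since the slides do not say how extractpage counts pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// A PDF is modeled as a list of page labels; real SPML operates on files.
class PageModel {
    // Models the '+' operator on PDFs: pages of a followed by pages of b.
    static List<String> concat(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    // Models extractpage; page numbers are assumed to start at 1.
    static List<String> extractPage(List<String> doc, int pageNo) {
        return Collections.singletonList(doc.get(pageNo - 1));
    }

    // Reverses pages recursively: the reversed remainder, then the first page.
    static List<String> reverse(List<String> doc) {
        if (doc.size() <= 1) {
            return new ArrayList<>(doc);
        }
        List<String> rest = doc.subList(1, doc.size());
        return concat(reverse(rest), extractPage(doc, 1));
    }

    // Deletes one page by joining the pages before it with the pages after it.
    static List<String> deletePage(List<String> doc, int pageNo) {
        return concat(doc.subList(0, pageNo - 1), doc.subList(pageNo, doc.size()));
    }

    // Keeps odd-numbered pages (1, 3, ...) when odd is true, else even-numbered ones.
    static List<String> everyOther(List<String> doc, boolean odd) {
        List<String> out = new ArrayList<>();
        for (int p = odd ? 1 : 2; p <= doc.size(); p += 2) {
            out = concat(out, extractPage(doc, p));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("p1", "p2", "p3", "p4");
        System.out.println(reverse(doc));           // [p4, p3, p2, p1]
        System.out.println(deletePage(doc, 2));     // [p1, p3, p4]
        System.out.println(everyOther(doc, true));  // [p1, p3]
        System.out.println(everyOther(doc, false)); // [p2, p4]
    }
}
```

Swapping two pages follows the same pattern: extract the two pages and concatenate the surrounding ranges in the new order.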
Lessons Learned
• “Choose types carefully – absence of
boolean.”
• “User input could have been added.”
• “Deadlines are never too far away!”
Summary
• SPML is a simple yet powerful language
for manipulating PDF files.
• SPML works!