AHRT: The Automated Human Resources Tool
BY
Roi Ceren
Muthukumaran Chandrasekaran
Outline
Problem Domain
Background
Web Services
Ontologies
Approach
Web Services Architecture
Web Services
Parsing
Ontologies
AHRT Demo!
Problem Domain
Many companies use existing web-based systems like Taleo as
their job application interface
Some systems allow the applicant to upload their resume and parse it to
automatically populate the fields in the application
However, these systems do a poor job of populating the fields accurately
and sometimes require extensive user interaction
They often store the information in flat files or databases only
Our Objective
Effective parsing rules for automating data collection
Ontologies used for knowledge representation
Exposed via RESTful Web Services for platform independence
Background: Web Services
Web services
Applications can be broadcast as a service on a web server,
such as Apache
A wrapper, such as Axis or Tomcat, can be used to execute these
applications
Standardized data formats (on top of HTTP) can be used to
further promote uniform communication, such as JSON
We use encoded form data
In this way, we can allow disparate platforms to
interoperate
To exemplify this, we use REST servers on different ports
PHP server handles parsing
JAX-RS server handles the ontology and instances
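As a rough illustration of the encoded form data mentioned above, the sketch below encodes a few resume fields as `application/x-www-form-urlencoded` and decodes them again. It uses Python's standard library rather than the PHP and Java used in AHRT, and the field names are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical resume fields; AHRT's actual form fields are not shown in the slides.
resume_fields = {
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "skills": "Java, PHP & OWL",
}

# Encode the fields the way a browser encodes a submitted HTML form.
encoded = urlencode(resume_fields)

# Decode on the receiving side; parse_qs returns a list of values per key.
decoded = {key: values[0] for key, values in parse_qs(encoded).items()}
```

Because both servers only need to agree on this encoding, either side could be reimplemented on another platform without changing the other.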
Background: Ontologies
Formal representation of knowledge as
Set of concepts in a domain, and
Relationships between them
Advantages of using Ontologies:
More expressive and searchable
Can be visually examined
Relationships can be expressed between
attributes
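One simple way to picture concepts and the relationships between them is as a list of (subject, relationship, object) triples. The sketch below is only an illustration in Python; the concept and relationship names are hypothetical and are not taken from the AHRT ontology.

```python
# Each fact is a (subject, relationship, object) triple.
# All names here are hypothetical, not taken from the AHRT ontology.
triples = [
    ("Resume", "hasSection", "Education"),
    ("Resume", "hasSection", "WorkExperience"),
    ("Education", "hasAttribute", "Degree"),
    ("WorkExperience", "hasAttribute", "Employer"),
]


def related(subject, relationship):
    """Return every object linked to `subject` by `relationship`."""
    return [o for s, r, o in triples if s == subject and r == relationship]


sections = related("Resume", "hasSection")
```

Searching by relationship, as `related` does, is the kind of query that is awkward over flat files but natural over an ontology.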
Approach (Web Service Architecture)
Expose the functionality of our program to the
world
Text categorization algorithm
Getters and setters for resume instances
Publish access to these functions via web services
1. REST with Java (JAX-RS) using Jersey: For persistence
layer
2. PHP RESTful web server: For parsing
Build a web interface that utilizes these web
services
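The getters and setters for resume instances could be exposed roughly as follows. This is a minimal sketch using Python's built-in `http.server`, not the Jersey (JAX-RS) implementation used in AHRT; the `/resumes` route, the in-memory `RESUMES` store, and the `route` and `serve` helpers are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

# In-memory stand-in for the persistence layer (an assumption for this sketch).
RESUMES = {}


def route(method, path, body=""):
    """Map a request to a (status code, response dict) pair."""
    parts = [p for p in path.split("/") if p]
    if method == "POST" and parts == ["resumes"]:
        # Setter: store form-encoded fields as a new resume instance.
        fields = {k: v[0] for k, v in parse_qs(body).items()}
        resume_id = str(len(RESUMES) + 1)
        RESUMES[resume_id] = fields
        return 201, {"id": resume_id}
    if method == "GET" and len(parts) == 2 and parts[0] == "resumes":
        # Getter: return a stored resume instance by id.
        if parts[1] in RESUMES:
            return 200, RESUMES[parts[1]]
        return 404, {"error": "resume not found"}
    return 404, {"error": "unknown route"}


class ResumeHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route()."""

    def _send(self, status, payload):
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._send(*route("GET", self.path))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        self._send(*route("POST", self.path, body))


def serve(port=8080):
    """Start the server; 8080 is the port the deck associates with JAX-RS."""
    HTTPServer(("localhost", port), ResumeHandler).serve_forever()
```

Keeping the routing logic in a plain function separate from the HTTP handler makes it easy to exercise without opening a socket; calling `serve()` starts the actual server.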
Approach (Web Service Architecture)
Users will interact with the web interface to
categorize their resume
1. User uploads their resume to PHP REST server
2. Server returns parsed resume in a form
3. User may alter or add to categorized resume
4. Server stores the resume as an instance of its resume ontology using
JAX-RS server
[Diagram: the Browser sends the resume to the PHP server on port 80 and receives the processed resume, then sends the edited resume to the JAX-RS server on port 8080; both servers make up the AHRT System.]
Approach (Web Services)
Admins can then access the ontology file of the
entire database
Note: For ease of demo, uploads automatically navigate to
the download page
[Diagram: the Browser sends an OWL file request to the JAX-RS server in the AHRT System and receives the OWL file.]
Approach (Web Services)
A PHP server was built to handle parsing
Can exist independently of the database server
Uploaded resumes will be processed by the server and
formatted in an upload form
User can correct the resume categorization before uploading to
the server’s ontology database
Apache Tomcat was configured on the AHRT server for
instance handling
Serves as a wrapper for the JAX-RS REST server
Final submissions are then added to the database
The database serves as an instance pool for the ontology, which
has the logic for the domain built in
Approach (Parsing)
PHP used in this version
Other web services exist that parse (they cost
upwards of $600!) and can be swapped out in the
source code
preg_match and preg_match_all functions are used with
various regular expressions to handle the
identification and classification of resume categories
preg_match finds the first match of a regular expression in a string
preg_match_all collects every match of a regular expression into arrays
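The two matching styles can be sketched as below. Python's `re.search` and `re.findall` stand in here for PHP's `preg_match` and `preg_match_all`; the sample resume text and the patterns are hypothetical and much simpler than what real resumes would need.

```python
import re

# Hypothetical resume text; real resumes are far less regular.
resume_text = """Jane Doe
Email: jane.doe@example.com
Phone: (555) 123-4567
Skills: Java, PHP, OWL
"""

# Like preg_match: find the first match of a pattern.
email_match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", resume_text)
email = email_match.group(0) if email_match else None

# Like preg_match_all: collect every match into a list.
skills_line = re.search(r"^Skills:\s*(.+)$", resume_text, re.MULTILINE)
skills = re.findall(r"[^,\s][^,]*", skills_line.group(1)) if skills_line else []

# Classify each "Heading: value" line under its category name.
categories = dict(re.findall(r"^(\w+):\s*(.+)$", resume_text, re.MULTILINE))
```

The resulting `categories` dictionary is the kind of structure that could pre-populate the correction form before the user reviews it.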
Approach (Ontologies)
JENA Ontology API
Loads a predefined OWL schema
API used to create instance variables
Can be used for two different types of properties
Datatype properties
Object properties
We utilize the database to pull instance data
JENA then populates the ontology with this data
We will open the created .owl file within Protégé
to view the ontology’s instance variables
Jambalaya Plugin can be used to visualize the
ontology
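The difference between the two property types can be shown with a small sketch that turns database rows into ontology individuals. It does not use JENA; it is a plain Python stand-in that writes Turtle text, and the `ex:` namespace, class names, property names, and sample rows are all hypothetical. It also assumes the values need no escaping.

```python
# Rows as they might come back from the resume database (hypothetical data).
rows = [
    {"id": "resume1", "name": "Jane Doe", "employer": "Acme"},
    {"id": "resume2", "name": "John Roe", "employer": "Acme"},
]

PREFIX = "@prefix ex: <http://example.org/resume#> ."


def to_turtle(rows):
    """Serialize rows as Turtle individuals of a hypothetical ex:Resume class."""
    lines = [PREFIX, ""]
    # Declare each employer once so resumes can link to it.
    for employer in sorted({row["employer"] for row in rows}):
        lines.append(f"ex:{employer} a ex:Employer .")
    for row in rows:
        resume_id, name, employer = row["id"], row["name"], row["employer"]
        lines.append(f"ex:{resume_id} a ex:Resume ;")
        # Datatype property: links the individual to a literal value.
        lines.append(f'    ex:hasName "{name}" ;')
        # Object property: links the individual to another individual.
        lines.append(f"    ex:workedFor ex:{employer} .")
    return "\n".join(lines)


turtle = to_turtle(rows)
```

Both resumes point at the same `ex:Acme` individual through the object property, which is what lets a tool such as Protégé show them as related.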
Demo!
http://denali.cs.uga.edu