AHRT: The Automated Human Resources Tool
By
Roi Ceren
Muthukumaran Chandrasekaran

Outline
- Problem Domain
- Background
  - Web Services
  - Ontologies
- Approach
  - Web Services Architecture
  - Web Services
  - Parsing
  - Ontologies
- AHRT Demo!

Problem Domain
- Many companies use existing web-based systems, such as Taleo, as their job application interface
- Some systems let the applicant upload a resume and parse it to populate the application fields automatically
- However, these systems populate the fields inaccurately and sometimes require extensive user interaction
- They often store the information only in flat files or databases

Our Objective
- Effective parsing rules for automating data collection
- Ontologies used for knowledge representation
- Exposed via RESTful web services for platform independence

Background: Web Services
- Web services
  - Applications can be published as a service on a web server, such as Apache
  - A wrapper, such as Axis or Tomcat, can be used to execute these applications
  - Standardized data formats on top of HTTP, such as JSON, can be used to further promote uniform communication
    - We use encoded form data
  - In this way, we can allow disparate platforms to interoperate
- To exemplify this, we use REST servers on different ports
  - PHP server handles parsing
  - JAX-RS server handles the ontology and instances

Background: Ontologies
- Formal representation of knowledge as
  - A set of concepts in a domain, and
  - The relationships between them
- Advantages of using ontologies:
  - More expressive and searchable
  - Can be visually examined
  - Relationships can be expressed between attributes

Approach (Web Service Architecture)
- Expose the functionality of our program to the world
  - Text categorization algorithm
  - Getters and setters for resume instances
- Publish access to these functions via web services
  1. REST with Java (JAX-RS) using Jersey: for the persistence layer
  2. PHP RESTful web server: for parsing
- Build a web interface that utilizes these web services

Approach (Web Service Architecture)
- Users interact with the web interface to categorize their resume
  1. User uploads their resume to the PHP REST server
  2. Server returns the parsed resume in a form
  3. User may alter or add to the categorized resume
  4. Server stores the resume as an instance of its resume ontology using the JAX-RS server
[Diagram: Communication Protocol. The browser sends the resume to the PHP server on port 80 and receives the processed resume; it then sends the edited resume to the JAX-RS server on port 8080. Both servers are part of the AHRT system.]

Approach (Web Services)
- Admins can then access the ontology file of the entire database
- Note: for ease of demo, uploads automatically navigate to the download page
[Diagram: Communication Protocol. The browser sends an OWL file request to the JAX-RS server in the AHRT system and receives the OWL file.]

Approach (Web Services)
- A PHP server was built to handle parsing
  - Can exist independently of the database server
  - Uploaded resumes are processed by the server and formatted in an upload form
  - User can correct the resume categorization before uploading to the server’s ontology database
- Apache Tomcat was configured on the AHRT server for instance handling
  - Serves as a wrapper for the JAX-RS REST server
  - Final submissions are then added to the database
  - The database serves as an instance pool for the ontology, which has the logic for the domain built in

Approach (Parsing)
- PHP used in this version
  - Other web services exist that parse (they cost upwards of $600!) and can be swapped out in the source code
- The preg_match and preg_match_all functions are used with various regular expressions to identify and classify resume categories
  - preg_match finds the first match of a regex in a string
  - preg_match_all collects every match of a regex into arrays

Approach (Ontologies)
- JENA Ontology API
  - Loads a predefined OWL schema
  - API used to create instance variables
  - Can be used for two different types of properties
    - Datatype properties
    - Object properties
- We utilize the database to pull instance data
  - JENA then populates the ontology with this data
- We will open the created .owl file within Protégé to view the ontology’s instance variables
  - The Jambalaya plugin can be used to visualize the ontology

Demo!
http://denali.cs.uga.edu
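
Appendix sketch: first match vs. all matches
The parsing slide relies on the difference between PHP's preg_match (first match only) and preg_match_all (every match). The following is a minimal sketch of that distinction using Java's standard java.util.regex package rather than PHP, so it is an analogue and not the AHRT parser itself. The class name, the two patterns, and the sample resume text are all hypothetical and chosen only for illustration; the regular expressions AHRT actually uses are not shown in these slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the AHRT source code.
final class ResumeRegexSketch {
    // Assumed patterns for the example only (a simple e-mail and a four-digit year).
    static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    static final Pattern YEAR = Pattern.compile("\\b(?:19|20)\\d{2}\\b");

    // Analogue of preg_match: return only the first match, if there is one.
    static Optional<String> firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Analogue of preg_match_all: collect every non-overlapping match in order.
    static List<String> allMatches(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up resume fragment.
        String resume = "Jane Doe\njane.doe@example.com\n"
            + "Education: 2005-2009\nExperience: 2010-2014";
        System.out.println(firstMatch(EMAIL, resume).orElse("(none)"));
        System.out.println(allMatches(YEAR, resume));
    }
}
```

A single-valued field such as a contact e-mail suits the first-match form, while repeated items such as dates suit the collect-all form; a real parser would need considerably more robust patterns than these.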