PERSONALIZED JOB MATCHING
Md. Mustafizur Rahman
Ellie Clougherty
John Clougherty
Sam Hewitt
OUTLINE
Introduction
 Existing System
 Existing Work (Research)
 Lacking Features of Existing Systems
 Format of Job and Resume
 Our Approach
 System Component
 Evaluation
 Job Analytics
 Future Work

JOB MATCHING


A search engine that takes user input (e.g. job
title, company name, qualifications) and
recommends matching jobs to the user.
User input
Resume
 Job Postings
 Keyword

EXISTING SYSTEM

There are multiple job searching websites like
Glassdoor
 Monster
 Indeed


But very few support resume searching

Indeed
EXISTING WORK (RESEARCH)

Collaborative filtering [1]
Critically dependent on the availability of high-quality user profiles
 These are quite rare in real-world scenarios


Content based filtering [2]

Highly dependent on user interaction
LACKING FEATURES

Absence of personalization


No support for user preference (e.g. new job seekers
tend to put more emphasis on their educational
qualifications than on experience in a resume)
Absence of resume and job dynamics

No search that correlates keywords/terms with
experience and expertise.
There is no way to search for a job using your
entire skill set and experience
SOLUTION
● Personalized Job Matching System
○ Crawl resume
○ Compare against a continuously-growing
database of job-postings across multiple sites
and companies
○ SQL commands to explore the nature of the
data and find patterns
● Prototype an Economic Job Graph
JOB POSTING: GOOGLE JOB
RESUME: INDEED.COM
TYPICAL FORMAT OF JOB POSTINGS AND RESUME
Job Posting          Resume
Job Title            Title
Qualification        Educational Background
Responsibilities     Experience
Job Description      Additional Information
FIELD TO SEARCH FOR
Job Posting          Resume
Job Title            Title
Qualification        Educational Background
Responsibilities     Experience
Job Description      Additional Information
OUR APPROACH: KEY FEATURES
A specialized search engine

Full text resume and job search

User control over the field

Aspect based (keyword based) experience
correlation

Job prediction
OUR APPROACH : SYSTEM COMPONENTS

As a specialized search engine we have the
following components






Crawler
Doc Analyzer
Indexer
Ranker
Interface
Evaluation
CRAWLER (COLLECTION OF DATA)

Problem
No benchmark data set of job postings
 No benchmark data set of resumes
 Resumes are a scarce resource!



Solution
Crawler
We have to build specialized crawlers for different
employer and resume websites
 3 different crawlers for job postings: Google, Facebook
and IBM
 1 crawler for resumes: indeed.com

DOC ANALYZER

Let's take a look at job postings:
BA/BS -> Bachelor of Arts/Science
MS -> Master of Science
Solution: Dictionary Expansion
Unix/Linux
Data Structure
Algorithm
Software design
Object oriented skills
Javascript
Network programming
How to identify these?
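The dictionary expansion step above can be sketched in Python as follows. This is a minimal illustration, assuming a small hand-built abbreviation table; only the BA/BS and MS entries come from the slide, and the names DEGREE_EXPANSIONS and expand_abbreviations are hypothetical.

```python
import re

# Hypothetical abbreviation table; only BA, BS and MS appear on the slide.
DEGREE_EXPANSIONS = {
    "BA": "Bachelor of Arts",
    "BS": "Bachelor of Science",
    "MS": "Master of Science",
}


def expand_abbreviations(text, dictionary=DEGREE_EXPANSIONS):
    """Replace each whole-word abbreviation with its dictionary expansion."""
    alternatives = "|".join(re.escape(key) for key in dictionary)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: dictionary[match.group(1)], text)


print(expand_abbreviations("BA/BS in Computer Science, MS preferred"))
# prints "Bachelor of Arts/Bachelor of Science in Computer Science, Master of Science preferred"
```

The word-boundary anchors keep the expansion from firing inside longer tokens; the match is case-sensitive, so a real dictionary would need to decide how to handle lowercase variants.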
DOC ANALYZER (CONTD.)

Problem: Can we identify the keywords from the
open unstructured text?
Solution 1: Unigram model
 Problem: the keyword "Software Design" is split into

Software -> less important than Software Design
 Design -> less important than Software Design

Solution 2: Phrase Query
 Problem: How do we build a phrase query when
the input is a complete resume?

DOC ANALYZER (CONTD.)

Our observation:
Most of these keywords are nouns
 Most of these keywords appear only after a
preposition (in, with)
 For multi-word keywords (e.g. Software Design),
search for consecutive nouns.
 Use a part-of-speech tagger


The results are quite encouraging: we recovered most of
the meaningful keywords.
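The observations above can be sketched as a small Python routine. This is an illustration only, not the system's actual Doc Analyzer: it assumes the text has already been tagged with Penn Treebank tags (NN* for nouns, IN for prepositions) by some part-of-speech tagger, and the function name extract_keywords is hypothetical.

```python
def extract_keywords(tagged_tokens):
    """Collect runs of consecutive nouns that occur after a preposition.

    tagged_tokens is a list of (word, tag) pairs with Penn Treebank tags,
    assumed to come from an external part-of-speech tagger.
    """
    keywords = []
    run = []                # nouns collected for the current keyword
    after_prep = False      # True once a preposition was seen in this sentence
    for word, tag in tagged_tokens:
        if tag.startswith("NN") and after_prep:
            run.append(word)
            continue
        if run:             # a non-noun ends the current run of nouns
            keywords.append(" ".join(run))
            run = []
        if tag == "IN":
            after_prep = True
        elif tag == ".":    # sentence end resets the preposition context
            after_prep = False
    if run:
        keywords.append(" ".join(run))
    return keywords


# Hand-tagged example sentence (the tags are an assumption, not tagger output).
sentence = [
    ("Experience", "NN"), ("with", "IN"), ("software", "NN"), ("design", "NN"),
    ("and", "CC"), ("network", "NN"), ("programming", "NN"), (".", "."),
]
print(extract_keywords(sentence))
# prints ['software design', 'network programming']
```

Grouping consecutive nouns keeps "software design" together as one keyword, which addresses the unigram problem without requiring a phrase query.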
DOC ANALYZER (CONTD.)

Take another look at a job posting
Question: Suppose you have all the qualifications,
but not 4 years of experience. Where should a job
search engine rank this result?
DOC ANALYZER (CONTD.)


Can we identify a keyword-oriented experience
list for a job posting (or resume) like the one below?
Keywords                 Experience
C++, Java                2 years
Software Development     5 years
TCP/IP                   Not necessary
…                        …
We already have the keyword list! Simply
find the years of experience using the part-of-speech
tagger and some heuristics.
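One possible heuristic for building such a list is sketched below in Python. It swaps the part-of-speech tagger for a plain regular expression, so it is a simplification rather than the method used in the system; the sentences, the YEARS pattern and the function name experience_by_keyword are all assumptions made for illustration.

```python
import re

# Matches phrases such as "2 years", "5+ years" or "1 year".
YEARS = re.compile(r"(\d+)\s*\+?\s*years?", re.IGNORECASE)


def experience_by_keyword(sentences, keywords):
    """Map each keyword to the years of experience stated in its sentence.

    A keyword found in a sentence without a year count maps to None
    ("Not necessary" in the table above).
    """
    table = {}
    for sentence in sentences:
        match = YEARS.search(sentence)
        years = int(match.group(1)) if match else None
        lowered = sentence.lower()
        for keyword in keywords:
            # Plain substring test; a real system would match whole tokens.
            if keyword.lower() in lowered:
                table[keyword] = years
    return table


posting = [
    "2 years of experience in C++ and Java",
    "5+ years of software development",
    "Knowledge of TCP/IP",
]
print(experience_by_keyword(posting, ["C++", "Java", "Software Development", "TCP/IP"]))
```

The substring test is deliberately naive: "Java" would also match inside "JavaScript", which is one reason the slides rely on tagged tokens instead.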
INDEXER

Two indexers
Job Posting
 Resume


Job Posting Indexer






Job Title
Job Location
Job Qualification
Job responsibilities
Job keywords (Processed by the Doc Analyzer)
Job experience (Processed by the Doc Analyzer)
INDEXER (CONTD)

Two indexers
Job Posting
 Resume


Resume Indexer






Resume Title
Educational Information
Experience
Additional information
Resume keywords (Processed by the Doc Analyzer)
Resume experience (Processed by the Doc Analyzer)
INDEXER (CONTD)

During index time
Document Booster

Documents that exactly match the query on the
keyword and experience fields receive a higher
score (the title field is excluded from this boost).

Ultimately this helps rank the matched
documents in higher positions.
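The boosting idea can be illustrated with a short Python sketch. The slides do not specify the boost factor or the exact matching rule, so everything here is an assumption: the dictionary layout, the boost value of 2.0, the rule that every posting keyword must appear in the resume, and the function name boosted_score.

```python
def boosted_score(base_score, resume, posting, boost=2.0):
    """Multiply base_score by boost when keywords and experience both match.

    resume and posting are dicts with a "keywords" list and an "experience"
    dict mapping keyword -> years (a hypothetical layout for this sketch).
    """
    # Every keyword required by the posting must appear in the resume.
    keywords_match = set(posting["keywords"]) <= set(resume["keywords"])
    # The resume must meet each year requirement stated by the posting.
    experience_match = all(
        resume["experience"].get(keyword, 0) >= years
        for keyword, years in posting["experience"].items()
    )
    if keywords_match and experience_match:
        return base_score * boost
    return base_score


resume = {"keywords": ["java", "c++", "sql"], "experience": {"java": 3}}
posting = {"keywords": ["java", "sql"], "experience": {"java": 2}}
print(boosted_score(1.5, resume, posting))
# prints 3.0
```

A multiplicative boost keeps the relative order of unboosted documents intact while lifting full matches above them.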
QUERY PROCESSING

Since we have two indexers, we have two types
of query
Job postings (search in resume index)
 Resume (search in job posting index)


Input from the users


We take HTML form based input from the users
Query Processing

Apply the same steps as the Doc Analyzer
RANKING & RETRIEVAL

Document Scoring Function:
TF-IDF
 BM25
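For reference, a compact Python sketch of Okapi BM25 scoring over tokenized documents follows. It is not the system's ranker: the parameter defaults k1=1.2 and b=0.75 and the non-negative IDF variant are common choices assumed here, and the function name bm25_scores is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization penalizes documents longer than average.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


postings = [
    ["java", "developer", "java"],
    ["python", "developer"],
    ["network", "engineer"],
]
print(bm25_scores(["java", "developer"], postings))
```

The sketch assumes a non-empty document collection; an empty list would cause a division by zero when computing the average length.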

SYSTEM DESIGN

Backend design:
Run as a service on Apache Tomcat server 6.0
 Java


Client Connectivity:


JavaServer Pages (JSP)
Front End Design:

HTML
DEMO

System Demo
EVALUATION
● To evaluate the performance we chose
Mean Average Precision (MAP)
● Evaluation Methodologies
○ Resume selection: We carefully selected 2 resumes from
our dataset.
○ Job postings selection: We then carefully labeled 10 job
postings as relevant to those 2 selected resumes.
○ We mixed these relevant job postings with 20 more
randomly picked job postings from the dataset.
○ We then calculated the MAP of our system using the top 5
results and obtained 0.3395, whereas traditional
systems reach only 0.295.
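The MAP computation over the top 5 results can be sketched in Python as follows. Normalization conventions for average precision at a cutoff vary, and the slides do not say which was used; this sketch assumes division by min(number of relevant documents, k). The document IDs and function names are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids, k=5):
    """Average precision over the top-k results of a single query."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank   # precision at this relevant hit
    denominator = min(len(relevant_ids), k)
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(runs, k=5):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    return sum(average_precision(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# One query whose relevant postings appear at ranks 1, 3 and 5.
ranked = ["j1", "j7", "j2", "j9", "j3"]
relevant = {"j1", "j2", "j3"}
print(round(average_precision(ranked, relevant), 4))
# prints 0.7556
```

With two query resumes, as in the evaluation above, mean_average_precision would be called with two (ranking, relevant set) pairs.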
JOB PREDICTION



Until now we have performed two types of
searching:
For a given input Resume, perform search on the
job posting index
For a given input Job posting, perform search on
the resume index
Can we do something more
using existing resources?
JOB PREDICTION (CONTD.)


Perform Resume search on the Resume index.
Why?
Intuition:


People with similar-looking resumes might be
eligible for similar jobs!
Methods:
Find similar resumes
 Find the companies in those resumes
 Recommend those companies
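The three steps above can be sketched in Python. This is an illustration under stated assumptions, not the system's implementation: it uses bag-of-words cosine similarity as the resume similarity measure (the slides do not name one), a hypothetical record layout with "text" and "companies" fields, and made-up company names.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend_companies(query_resume, indexed_resumes, top_n=2):
    """Recommend companies that appear in the resumes most similar to the query."""
    query_vec = Counter(query_resume["text"].lower().split())
    # Step 1: find similar resumes.
    ranked = sorted(
        indexed_resumes,
        key=lambda resume: cosine(query_vec, Counter(resume["text"].lower().split())),
        reverse=True,
    )
    # Step 2: collect the companies listed in those resumes.
    companies = Counter()
    for resume in ranked[:top_n]:
        companies.update(resume["companies"])
    # Step 3: recommend them, most frequent first.
    return [name for name, _ in companies.most_common()]


indexed = [
    {"text": "java developer spring sql", "companies": ["Acme", "Globex"]},
    {"text": "java backend developer sql", "companies": ["Acme"]},
    {"text": "graphic designer photoshop", "companies": ["Initech"]},
]
print(recommend_companies({"text": "java developer sql"}, indexed))
# prints ['Acme', 'Globex']
```

In the actual system the similar resumes would come from a query against the resume index rather than a linear scan.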

JOB ANALYTICS
• Goals:
● How fast are jobs being filled?
● How fast are jobs being posted?
● When is the best time to apply?
• Filled Positions:
JOBS IN THE USA
WORLDWIDE JOBS
PROGRAMMING LANGUAGES
Facebook: 92.3%
IBM: 23.9%
Google: 5.8%
FUTURE WORK

Resume Feedback Suggestions
What skills or experience do you need to be
qualified for a certain job?

Discover Patterns in Job-Hunting Seasons
What time of year are jobs posted most
frequently?



Build a Personal Database
Receive notifications of job posts that match
your interests and skill level

REFERENCES


[1] Y. Lu, S. El Helou, and D. Gillet. A recommender
system for job seeking and recruiting website. In
Proceedings of the 22nd International Conference on World
Wide Web Companion, pages 963-966. International World
Wide Web Conferences Steering Committee, 2013.
[2] R. Rafter, K. Bradley, and B. Smyth. Automated
collaborative filtering applications for online recruitment
services. In Adaptive Hypermedia and Adaptive Web-Based
Systems, pages 363-368. Springer, 2000.
TEAM CONTRIBUTIONS
Mustafiz: NLP and IR system, JSP Backend,
Google Crawler
 Sam: Crawler structure and database
 Ellie: IBM Crawler, Front end UI
 John: Job analytics
