PERSONALIZED JOB MATCHING

Md. Mustafizur Rahman
Ellie Clougherty
John Clougherty
Sam Hewitt

OUTLINE
● Introduction
● Existing System
● Existing Work (Research)
● Shortcomings of Existing Systems
● Format of Job and Resume
● Our Approach
● System Components
● Evaluation
● Job Analytics
● Future Work

JOB MATCHING
● A search engine that takes user input (e.g. job title, company name, qualifications) and recommends matching jobs to the user.
● User input
  ○ Resume
  ○ Job postings
  ○ Keyword

EXISTING SYSTEM
● There are multiple job search websites, such as
  ○ Glassdoor
  ○ Monster
  ○ Indeed
● But very few support resume searching
  ○ Indeed

EXISTING WORK (RESEARCH)
● Collaborative filtering [1]
  ○ Critically dependent on the availability of high-quality user profiles
  ○ Such profiles are quite rare in real-world scenarios
● Content-based filtering [2]
  ○ Highly dependent on user interaction

LACKING FEATURES
● Absence of personalization
  ○ No support for user preference (e.g. new job seekers tend to put more emphasis on educational qualifications than on experience in a resume)
● Absence of resume and job dynamics
  ○ No keyword/term-correlated experience and expertise searching
  ○ There is no way to search for a job using your entire skill set and experience

SOLUTION
● Personalized Job Matching System
  ○ Crawl resumes
  ○ Compare against a continuously growing database of job postings across multiple sites and companies
  ○ SQL commands to explore the nature of the data and find patterns
● Prototype an Economic Job Graph

JOB POSTING: GOOGLE JOB

RESUME: INDEED.COM

TYPICAL FORMAT OF JOB POSTINGS AND RESUME
Job Posting         Resume
Job Title           Title
Qualification       Educational Background
Responsibilities    Experience
Job Description     Additional Information

FIELD TO SEARCH FOR
Job Posting         Resume
Job Title           Title
Qualification       Educational Background
Responsibilities    Experience
Job Description     Additional Information

OUR APPROACH: KEY FEATURES
● A specialized search engine
● Full-text resume and job search
● User control over the field
● Aspect-based (keyword-based) experience correlation
● Job prediction

OUR APPROACH: SYSTEM COMPONENTS
As a specialized search engine, we have the following components:
● Crawler
● Doc Analyzer
● Indexer
● Ranker
● Interface
● Evaluation

CRAWLER (COLLECTION OF DATA)
● Problem
  ○ No benchmark job posting data set
  ○ No benchmark resume data set
  ○ Resumes are a scarce resource!
● Solution
  ○ Crawler
  ○ We have to build specialized crawlers for different employer and resume websites
  ○ 3 different crawlers for job postings: Google, Facebook and IBM
  ○ 1 crawler for resumes: indeed.com

DOC ANALYZER
Let's take a look at job postings:
● BA/BS -> Bachelor of Arts/Science
● MS -> Master of Science
● Solution: Dictionary Expansion
● Keywords such as Unix/Linux, Data Structure, Algorithm, Software design, Object oriented skills, Javascript, Network programming
● How to identify these?

DOC ANALYZER (CONTD.)
● Problem: Can we identify the keywords in open, unstructured text?
● Solution 1: Unigram model
  ○ Problem: the keyword "Software Design" is split into
    Software -> less important than Software Design
    Design -> less important than Software Design
● Solution 2: Phrase query
  ○ Problem: How to make a phrase query when your input is a complete resume?
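The dictionary expansion and the unigram problem described in the two DOC ANALYZER slides above can be illustrated with a minimal sketch. Only the BA/BS and MS expansions come from the slides; the function names, the regular expressions, and the sample sentence are assumptions made for illustration, not the system's actual implementation.

```python
import re

# Abbreviation dictionary. Only these three entries come from the slides;
# a real dictionary would presumably be much larger.
ABBREVIATIONS = {
    "BA": "Bachelor of Arts",
    "BS": "Bachelor of Science",
    "MS": "Master of Science",
}

# Match any known abbreviation as a whole, case-sensitive word.
_ABBR_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")\b"
)


def expand_abbreviations(text):
    """Replace each known abbreviation with its long form."""
    return _ABBR_PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)


def unigrams(text):
    """Split text into lowercase single-word tokens (a unigram model)."""
    return re.findall(r"[a-z]+", text.lower())


if __name__ == "__main__":
    print(expand_abbreviations("BA/BS in Computer Science, MS preferred"))
    # The phrase is lost: each word is indexed on its own.
    print(unigrams("Software Design"))
```

Splitting "Software Design" into two independent tokens is exactly the weakness the slide points out: neither token alone carries the meaning of the phrase.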
DOC ANALYZER (CONTD.)
Our observation:
● Most of these keywords are nouns
● Most of these keywords appear only after a preposition (in, with)
● For multi-word keywords (e.g. Software Design), search for consecutive nouns
● Use a part-of-speech tagger
● The results are promising: we recover most of the meaningful keywords

DOC ANALYZER (CONTD.)
Take another look at a job posting.
Question: Suppose you have all the qualifications but not 4 years of experience. Where should a job search engine rank this result?

DOC ANALYZER (CONTD.)
Can we identify a keyword-oriented experience list for a job posting (or resume) like the one below?

Keywords                Experience
C++, Java               2 years
Software Development    5 years
TCP/IP                  Not necessary
…                       …

We already have the keyword list! We simply find the years of experience using the part-of-speech tagger and some heuristics.

INDEXER
Two indexers
● Job Posting
● Resume
Job Posting Indexer
● Job title
● Job location
● Job qualification
● Job responsibilities
● Job keywords (processed by the Doc Analyzer)
● Job experience (processed by the Doc Analyzer)

INDEXER (CONTD.)
Two indexers
● Job Posting
● Resume
Resume Indexer
● Resume title
● Educational information
● Experience
● Additional information
● Resume keywords (processed by the Doc Analyzer)
● Resume experience (processed by the Doc Analyzer)

INDEXER (CONTD.)
During index time
● Document Booster
  ○ Documents that match the query perfectly on the keywords and experience fields receive a higher score (the title field is excluded).
  ○ This ultimately helps rank the matched documents in higher positions.

QUERY PROCESSING
● Since we have two indexers, we have two types of query
  ○ Job posting (searched in the resume index)
  ○ Resume (searched in the job posting index)
● Input from the users
  ○ We take HTML form-based input from the users
● Query processing
  ○ Perform the same steps as the Doc Analyzer

RANKING & RETRIEVAL
During index time
● Document Booster
  ○ Documents that match the query perfectly on the keywords and experience fields receive a higher score (the title field is excluded).
  ○ This ultimately helps rank the matched documents in higher positions.
● Document scoring function: TF-IDF, BM25

SYSTEM DESIGN
● Backend design
  ○ Runs as a service on Apache Tomcat server 6.0
  ○ Java
● Client connectivity: JavaServer Pages (JSP)
● Front-end design: HTML

DEMO
System Demo

EVALUATION
● To evaluate performance, we chose Mean Average Precision (MAP).
● Evaluation methodology
  ○ Resume selection: we carefully identified 2 resumes from our data set.
  ○ Job posting selection: we then carefully labeled 10 job postings as relevant to those 2 resumes.
  ○ We mixed these relevant job postings with 20 more randomly picked job postings from the data set.
  ○ We then calculated the MAP of our system over the top 5 results: 0.3395, compared with only 0.295 for traditional systems.

JOB PREDICTION
Until now we have performed two types of search:
● For a given input resume, search the job posting index
● For a given input job posting, search the resume index
Can we do something more using existing resources?

JOB PREDICTION (CONTD.)
● Perform a resume search on the resume index. Why?
● Intuition: people with similar-looking resumes might be eligible for similar jobs!
● Method:
  ○ Find similar resumes
  ○ Find the companies in those resumes
  ○ Recommend those companies

JOB ANALYTICS
● Goals:
  ○ How fast are jobs being filled?
  ○ How fast are jobs being posted?
  ○ When is the best time to apply?
● Filled positions:

JOBS IN THE USA

WORLDWIDE JOBS

PROGRAMMING LANGUAGES

Facebook: 92.3%    IBM: 23.9%    Google: 5.8%

FUTURE WORK
● Resume feedback suggestions
  ○ What skills or experience do you need to be qualified for a certain job?
● Discover patterns in job-hunting seasons
  ○ What time of year are jobs posted most frequently?
● Build a personal database
  ○ Receive notifications of job posts that match your interests and skill level

REFERENCES
[1] Y. Lu, S. El Helou, and D. Gillet. A recommender system for job seeking and recruiting website.
In Proceedings of the 22nd International Conference on World Wide Web Companion, pages 963–966. International World Wide Web Conferences Steering Committee, 2013.
[2] R. Rafter, K. Bradley, and B. Smyth. Automated collaborative filtering applications for online recruitment services. In Adaptive Hypermedia and Adaptive Web-Based Systems, pages 363–368. Springer, 2000.

TEAM CONTRIBUTIONS
● Mustafiz: NLP and IR system, JSP backend, Google crawler
● Sam: Crawler structure and database
● Ellie: IBM crawler, front-end UI
● John: Job analytics
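
As a supplement to the EVALUATION slide, the Mean Average Precision measure over the top 5 results can be sketched as follows. The slides do not say how average precision was normalized at the cutoff, so dividing by min(number of relevant postings, k) is an assumption, as are the function names and the made-up job IDs; this is an illustration of the measure, not the evaluation script that produced 0.3395.

```python
def average_precision_at_k(ranked, relevant, k=5):
    """Average precision over the top-k results of one query.

    ranked:   job IDs in the order the system returned them
    relevant: set of job IDs labeled relevant for this query
    Assumption: normalize by min(len(relevant), k).
    """
    hits = 0
    precision_sum = 0.0
    for rank, job_id in enumerate(ranked[:k], start=1):
        if job_id in relevant:
            hits += 1
            # Precision measured at the rank of each relevant result.
            precision_sum += hits / rank
    denominator = min(len(relevant), k)
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(runs, k=5):
    """Mean of the per-query average precision values."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two hypothetical resume queries with made-up job IDs.
    runs = [
        (["j1", "j7", "j3", "j9", "j4"], {"j1", "j3", "j4"}),
        (["j2", "j5", "j6", "j8", "j0"], {"j5"}),
    ]
    print(round(mean_average_precision(runs), 4))
```

With only 2 resume queries, as in the evaluation described above, each query contributes half of the final score, so a single ranking change can shift the MAP noticeably.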