
Available online at www.ijpe-online.com
vol. 18, no. 5, May 2022, pp. 369-379
DOI: 10.23940/ijpe.22.05.p7.369379
Enhancing Information Extraction Process in Job Recommendation
using Semantic Technology
Assia Brek* and Zizette Boufaida
Lire Laboratory, University of Abdelhamid MEHRI Constantine II, Constantine, 25016, Algeria
Abstract
Recently, the internet has become the primary recruitment marketplace, which has increased the number of job offers and resumes posted online. Recommendation systems are proposed to help users filter this massive amount of information, selecting the best candidate or the most relevant offer. Processing the content of the documents correctly not only reduces the matching complexity but also improves the recommender's performance. This paper presents a semantic-based information extraction process that intelligently and automatically extracts domain entities. The extracted entities are interlinked to build domain context using a domain ontology covering the most significant and common parts of job offers/resumes. Moreover, the extracted information is structured in RDF triples, delivering a semantic and unified presentation of document data. The ontology is dynamically enriched with both domain instances and relations to keep up with the constant change of the relevant data. We evaluate our system through various experiments on data from real-world recruitment documents. Our test results show that our approach achieves a precision of more than 90% in extracting domain-specific information.
Keywords: information extraction; semantic technology; ontology; enrichment; semantic similarity
(Submitted on February 25, 2022; Revised on March 30, 2022; Accepted on April 18, 2022)
© 2022 Totem Publisher, Inc. All rights reserved.
1. Introduction
Resumes and job descriptions carry information unique to each job position and candidate, providing a valuable basis for selecting the best candidate or choosing the right job. Thus, accurately processing the content of job descriptions and resumes is of crucial importance. Processing documents aims to extract domain information intelligently and structure it formally, making it easy to access and utilize. Current information extraction systems like OpenCalais (www.opencalais.com) and Alchemy API (www.alchemyapi.com) are limited to extracting named entities based on NLP techniques that are insufficient for complex sentences and, most importantly, unable to extract contextual entities.
The information in offers/resumes is presented in complex sentences composed of related domain entities that require specific domain rules to extract. For example, in a job offer, the requirements may combine skills with years or levels of expertise, such as "5+ years experience in Spring framework (Spring Boot, Spring MVC, Spring Batch, JPA)". Also, in a resume, the education entity is presented with the diploma title, education degree, year of graduation, and university name. Moreover, some entities are presented with different labels, such as B.Sc, BS, Bachelors, which poses another challenge in extracting domain entities. Extracting relations is also a critical phase, where contextual entities can be inferred from the semantic links between entities. For example, consider a job that requires "3 years of experience in Android skill" and a resume that delivers "Android skill" and "4 years of work experience as a developer of mobile applications". Linking "Android" with the job experience "developer of mobile applications" can deliver the required entity "expert in Android skill".
As there are no defined templates for resume or job contents, unifying the presentation of the extracted information is essential. For example, instead of delivering "B.S. in Computer Science or equivalent" as a requirement, we can present "B.S." under the concept "degree" and "Computer Science" as "diploma"; the same applies to the education information in a resume, which facilitates matching entities of the same concept.
* Corresponding author.
E-mail address: assia.brek@univ-constantine2.dz
Semantic technology and resources such as ontologies and knowledge graphs are proposed as a solution to extract and
structure domain entities by presenting semantic links between them to infer contextual entities [1,2]. Therefore, we proposed
a semantic-based information extraction approach to intelligently extract domain entities, semantically interlink those entities,
and provide the extracted information in a unified format.
First, the offer/resume content is divided into blocks based on a keyword dictionary. Next, we developed extraction rules (JAPE) based on domain entity dictionaries and sentence syntax to extract entities from each block. A domain ontology (JobOnto) is proposed to link the extracted entities with semantic relations. Moreover, our ontology is enriched in sync with the extraction process to keep up with data changes. As a final phase, the retrieved information is delivered as RDF triples, granting a semantic and unified structure that is easy to access and exploit. The proposed approach processes the domain documents intelligently and automatically without any human intervention, improving the user experience and saving time for both employers and candidates. Moreover, it has been evaluated on real domain data and achieves a precision of more than 90%.
This paper is organized as follows: Section 2 outlines the related work in information extraction and e-recruitment,
Section 3 delivers a detailed presentation of the proposed semantic-based process, Section 4 presents the evaluation, and
finally Section 5 concludes the paper.
2. Related Work
Recommender systems aim to suggest products that interest users by studying their behaviors (their product research) and
their profiles [3]. As a recommender system, the job recommender system can retrieve a list of job positions that satisfy a job
seeker’s desires or a list of talented candidates that meet the requirements of a recruiter by using recommendation approaches.
The mainstream approaches to recommender systems are classified into four categories: Knowledge-Based (KB), Content-Based Filtering (CBF), Collaborative Filtering (CF), and Hybrid approaches. Based on several comparison studies in the e-recruitment field [4-6], the content-based approach is the most suitable for this context, given the necessity of examining
the content of domain documents to grant the best recommendation. Those documents contain unstructured information that
is difficult to automatically access or process correctly. Several information extraction methods have been proposed to address
this issue. Information extraction is an automatic technique for identifying entities from unstructured and/or semi-structured
documents, intending to structure these documents and make them machine-readable. Based on previous research, the
information extraction methods can be divided into three categories.
Rule- and pattern-based methods exploit predefined rules or patterns to extract entities from documents. The rules/patterns are usually defined based on document structures or domain dictionaries. In [7,8], the authors defined entity extraction rules using sentence syntax information, writing style, punctuation indexes, and lexical features in order to extract entities from resumes. Another approach, proposed in [9], is the SAJ system, which uses JAPE rules based on a domain-specific dictionary to extract entities from job offers. These methods are limited by the text structure and the domain dictionary. They are hard to adopt since it is impossible to know how many documents follow the same template or whether the dictionary covers all domain knowledge.
Machine learning-based methods extract information from unstructured texts by applying Hidden Markov Models (HMM) [10] and Conditional Random Fields (CRF) [11]. PROSPECT [12] was presented as a system to rank candidates' resumes for a job. The system employs a conditional random field (CRF) based on three features (Lexicon, Visual, and Named Entity) to segment and label the resume contents. In [13], the authors proposed a cascaded information extraction framework to acquire detailed information from resumes. In the first step, a Hidden Markov Model (HMM) is applied to divide the resume into successive blocks. Next, an SVM model is utilized to gather detailed information in each block, such as name, address, education, etc. Machine learning-based techniques require large data sets for training; the absence of such data may cause information loss or the extraction of noisy entities.
Semantic methods: have recently been applied to guide the information extraction process [14]. Semantic methods
mainly exploit semantic references like Knowledge graphs and domain ontologies to extract entities or relations from
unstructured text. Domain ontologies have been used in different ways to extract and identify entities. In [15-17], a domain ontology is used as a semantic dictionary to extract information from resumes. The document content is first segmented into sentences, then the ontology concepts are used to detect domain entities. In the same way, the authors in [18,19] proposed a semantic annotation approach using a domain ontology to extract information from resumes and job offers. Using an ontology to extract entities takes advantage of semantic knowledge that gives accurate results, but the ontology knowledge incompleteness problem still constrains it. To prevent this problem, the authors in [20] utilized WordNet and the YAGO3 knowledge graph (KG) for two purposes: first, to capture terms from resumes and offers after segmenting them into sentences; second, to extract semantic and taxonomic relations (synonymy and hypernymy) to create a network that reflects each resume and offer. The usage of YAGO3 and WordNet was insufficient, as certain concepts are absent; besides, the extracted relations are not always meaningful.
Presenting the extracted information in a structured format that is easy to access and use influences the matching process and enhances the recommendation results. Several works ignore this crucial phase, treating the extracted information as a bag of entities in which the matching process searches for the required entity. In contrast, others use different formats, such as XML, RDF, or JSON, to formally structure the entities extracted from resumes [9,18,19]. In [8], the authors used a vectorial representation to generate a classification with Support Vector Machines (SVM). Furthermore, in [20], the authors exploit WordNet to present the extracted entities in a semantic network using synonymy and hypernymy relations. The works mentioned above present the extracted entities in different ways but ignore the semantic relations between entities and the importance of presenting the resumes and offers in a unified format.
Our information extraction approach takes into consideration the limits of the previous methods. We propose a semantic-based approach to extract entities and relations from resumes and offers. Our approach employs JAPE rules based on domain-specific dictionaries and sentence syntax to extract entities; then, we developed a domain ontology covering the most significant and common parts of job offers/resumes (skills, diplomas, interests, assets, and work experiences), which is used to link the extracted entities semantically and yield contextual entities. The extracted entities are presented in RDF triples, delivering a unified format for resumes and offers. Furthermore, during the IE process, the ontology is enriched with domain instances and relations to keep up with the constant change of the relevant data.
3. Proposed Approach
E-recruitment documents are primarily published as unstructured text, which is difficult for a machine to process or exploit. The information extraction process intends to extract information and entities such as skills, diplomas, experience, job title, and others essential for job recommendation. We propose a process that applies semantic technology to extract information and formally structure it, presenting its semantics. Figure 1 presents the different phases of our information extraction process.
Figure 1. Information Extraction Process
3.1. Overview
The proposed process consists of three main phases: preprocessing, which extracts text from different file types and structures it in blocks based on a keyword dictionary; information extraction, the main phase, which extracts entities and relations from each block of data based on defined JAPE rules and JobOnto; and information presentation, which presents the extracted information semantically in RDF triples.
3.2. Preprocessing Phase
This phase aids with text segmentation. The input is a given set of job offers or resumes in different file formats, such as doc, docx, and pdf. Those files are processed by Apache Tika (http://tika.apache.org) to extract the raw text, while table layouts, font types, and font colors are removed. The next stage is arranging the extracted text in blocks based on the contents of each section, like "skills", "education", "job experience", and "requirements". Each section usually starts with a title that describes its contents. Therefore, we start by identifying those titles and adding the "kayVal" label based on the keyword dictionary. This label indicates the beginning of a block, which ends where the next section label is identified. Some vital information does not begin with a label, such as the personal information presented at the start of a resume; we define it as the first block, which starts from the first line of the text and ends at the first label. Figure 2 summarizes the description above.
Figure 2. Flowchart of the preprocessing phase
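To make the block-segmentation idea concrete, here is a minimal plain-Java sketch of it. The keyword dictionary shown is a small hypothetical subset of our own choosing, not the full dictionary used by the system; everything before the first recognized title is treated as the personal-information block.

import java.util.*;

// Illustrative sketch of the block segmentation: section titles from a
// keyword dictionary mark the start of a block; everything before the
// first title forms the unlabeled "personal information" block.
public class BlockSegmenter {
    // Hypothetical subset of the keyword dictionary.
    private static final Set<String> SECTION_KEYWORDS = Set.of(
            "skills", "education", "job experience", "requirements");

    public static Map<String, List<String>> segment(List<String> lines) {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        String current = "personal_information"; // first, unlabeled block
        blocks.put(current, new ArrayList<>());
        for (String line : lines) {
            String normalized = line.trim().toLowerCase();
            if (SECTION_KEYWORDS.contains(normalized)) {
                current = normalized;               // a new block starts here
                blocks.putIfAbsent(current, new ArrayList<>());
            } else {
                blocks.get(current).add(line);      // content of current block
            }
        }
        return blocks;
    }
}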
3.3. Information Extraction
3.3.1. Entities Extraction
In this step, as shown in Figure 3, the entities are extracted from the blocks and structured in an XML file. Resumes and job offers are published in different templates, which means some information sections may be labeled differently: "projects" can be expressed as "achievements", or "requirements" as "must have". To address this issue, we suggest using an XML skeleton to manage each resume/offer content, emphasizing the functional parts. Unlike existing works, our extraction strategy extracts only the entities necessary for the matching module and ignores others such as name, university name, phone number, mail, graduation year, or company name. Therefore, we developed extraction rules using JAPE grammar. JAPE grammar uses features to build pattern/action rules; the feature set contains aspects such as POS tags, a dictionary of entities presented in gazetteer lists, and punctuation. Based on the tag names of the XML skeleton, the block is selected, and then a specific JAPE rule for each block is applied to the block's content to extract entities and write the XML element values (tag value and attribute value).
Figure 3. Flowchart of the entity extraction step
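The actual rules are written in JAPE inside GATE. Purely as a language-neutral illustration, the following plain-Java regex sketch mimics one such requirement rule, picking up phrases of the form "<N>+ years experience in <skill>"; the pattern and class name are our own illustrative assumptions, not the authors' rules.

import java.util.regex.*;

// A regex stand-in for one JAPE requirement rule. The real rules combine
// POS tags, gazetteer lookups, and punctuation features inside GATE.
public class RequirementRule {
    private static final Pattern YEARS_SKILL = Pattern.compile(
            "(\\d+)\\+?\\s+years?\\s+(?:of\\s+)?experience\\s+in\\s+([A-Za-z][\\w .#+-]*)",
            Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String sentence = "5+ years experience in Spring framework (Spring Boot, Spring MVC)";
        Matcher m = YEARS_SKILL.matcher(sentence);
        if (m.find()) {
            System.out.println("years = " + m.group(1));          // "5"
            System.out.println("skill = " + m.group(2).trim());   // "Spring framework"
        }
    }
}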
Enhancing Information Extraction Process in Job Recommendation using Semantic Technology
373
3.3.2. Semantic Labeling
We faced another challenge in extracting entities: a single word may express several concepts, and vice versa. For example, the term bachelor can be stated in several ways: baccalaureate, bachelor, B.S., B.A., BA/BS, etc. We suggest employing a gazetteer list (Figure 4), which groups semantically similar terms for "bachelor". In our example, this list assigns the major type property "degree" and the minor type property "bachelor"; after an annotation of type "Lookup" is raised for each education degree in the text, with the major type property set to "degree", we can extract the minor type property. For example, given the sentence "B.S. in computer science" from a resume/offer, applying the JAPE rule selects the "B.S." token as a Lookup annotation with major type "degree"; next, we extract the minor type of this annotation, which is "bachelor". Figure 5 illustrates the JAPE rule for the bachelor degree example.
Figure 4. Gazetteer list
Figure 5. JAPE rule (example of bachelor degree)
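The lookup behavior can be sketched in plain Java as follows; the entries shown are a small illustrative subset of the gazetteer list in Figure 4, with the real list living in GATE gazetteer files.

import java.util.*;

// Minimal sketch of semantic labeling: a gazetteer maps surface forms to a
// (majorType, minorType) pair, so "B.S.", "BA/BS", and "baccalaureate" all
// resolve to degree/bachelor.
public class DegreeGazetteer {
    record Lookup(String majorType, String minorType) {}

    // Illustrative subset of gazetteer entries.
    private static final Map<String, Lookup> GAZETTEER = Map.of(
            "b.s.", new Lookup("degree", "bachelor"),
            "b.a.", new Lookup("degree", "bachelor"),
            "ba/bs", new Lookup("degree", "bachelor"),
            "bachelor", new Lookup("degree", "bachelor"),
            "baccalaureate", new Lookup("degree", "bachelor"));

    public static Optional<Lookup> annotate(String token) {
        return Optional.ofNullable(GAZETTEER.get(token.toLowerCase()));
    }

    public static void main(String[] args) {
        // "B.S. in computer science" -> Lookup with majorType "degree",
        // from which the minorType "bachelor" is extracted.
        annotate("B.S.").ifPresent(l ->
                System.out.println(l.majorType() + " / " + l.minorType()));
    }
}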
The result of this phase is an XML file containing the extracted entities, structured according to the XML skeleton. Figure 6 presents an example of a resume XML file.
Figure 6. XML file (resume example)
3.3.3. Relations Extraction
In this step, we aim to interlink the extracted entities based on the developed ontology (JobOnto) and the generated XML file. Those links present both hierarchical and associative relations between entities, which deliver information that may not be explicitly stated in the documents. For example, adding a link between a certification and a language proves the required language level: linking "TCF C2" to "French" presents the required information "Fluent in French". This step aims to semantically structure the data and implicitly enrich the JobOnto, which in turn supports assessing the semantic similarity between resume and job offer entities. Figure 7 presents the flowchart of this step.
Figure 7. Flowchart of the relations extraction step
Figure 8. JobOnto ontology
Unlike the previous works, we developed one ontology that represents both resumes and offers. As shown in Figure 8,
the JobOnto (job ontology) elements are inspired by the most essential and common parts of resumes and job offers, such as
personal information, diplomas, skills, languages, interests, and job experiences. Moreover, the relationships (both
hierarchical and associative) present the semantic links between concepts, which provide a map to match resumes and offers
later.
The concept "person" has its attributes that are displayed in the personal information of the resume, which includes
personal information necessary in job offers such as age, sex, driving license, or address when requiring that an applicant
reside in an area near the work location. Furthermore, the "job" concept represents the work experience in the resume or the
Enhancing Information Extraction Process in Job Recommendation using Semantic Technology
375
job offered in the job descriptions, the applicant that occupant a job post or works on a project is gaining experience in certain
skills, which is shown with the relation "has-experience". Having a certification confirms the candidate's proficiency in some
skill or language, which is offered with the concept "certification" associated with concept "skill" and "language" with relation
"proves". Besides the skills, degrees, certification, and job experience, considering "assets" and "interest" is crucial in resumes
and offers. For example, in a job offer, the asset "team spirit" is required; this concept may not be clearly expressed in a
resume but may be deduced from the interest in football or work experience in a team. The concept "document" represents
the annotated document, which is defined with an "id", a "title", and "type" that indicates whether the document is a resume
or job offer.
Extracting relations from the ontology is based on comparing the extracted entities with the ontology instances, using entity types such as skill, diploma, and job to identify the relationships. If an extracted entity is not among the ontology instances, the JobOnto is enriched with the entity as a new instance.
Relations between entities can also be deduced from the XML document structure, where a parent/child relation forms an associative relation between the two entities. For example, in the XML file of the resume in Figure 6, the skill "routing" is a child tag of the job "network engineer". Thus, based on the JobOnto architecture, we can add a relation titled "has-experience" between "routing" and "network engineer".
3.3.4. Enrichment
Despite the benefits of employing an ontology to interlink entities and construct context, enriching that ontology is a significant challenge since the data constantly changes. Enriching the JobOnto is an implicit phase of the information extraction procedure. To extract relations between entities, we first compare the entities with the ontology instances, using the entity type to determine the ontology element that represents each entity, and select the relation. If the entity is not among the ontology instances, we add the entity as a new instance. Based on a similarity measure, we establish a "related-to" relation between the new instance and other instances of the same type in the ontology.
The similarity between two instances is based on two measures: string similarity using the Levenshtein distance, and semantic similarity using the LCH measure [21] over the WordNet dictionary; see Equation (1).

$Sim(N_i, O_i) = Lev(N_i, O_i) + Lch(N_i, O_i)$    (1)

where $N_i$ refers to the new instance and $O_i$ to the ontology instance. The similarity score is the sum of the two measure values.
The Levenshtein distance is used to catch similarities between the two strings $(N_i, O_i)$, handling entities that are close in terms of writing, such as java, java fx, java EE, or angular 8, angular js; the distance is scaled by the length of the longer string, as in Equation (2):

$Lev(N_i, O_i) = \dfrac{longerLength - Levensh(N_i, O_i)}{longerLength}$    (2)

The score ranges between 0 and 1: 1 for totally similar strings and 0 for no similarity.
To handle entities that are written differently but related by definition, like "network director" and "telecommunication manager", the instances are submitted to the LCH measure. The Leacock and Chodorow (LCH) method counts the number of edges between two words in WordNet's 'is-a' hierarchy. The value is then scaled by the maximum depth of the hierarchy, and the similarity value is obtained by taking the negative log of this scaled value. The LCH method is accessed via the WS4J API. The formulation is given in Equation (3):

$Lch(N_i, O_i) = \max\left(-\log \dfrac{ShortestLen(N_i, O_i)}{2 \times TaxonomyDepth}\right)$    (3)

The similarity score is 0 for no similarity and greater than 0 when there is a similarity.
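A sketch of Equations (1)-(3) in Java: the Levenshtein part is implemented directly, and the LCH part delegates to WS4J's facade. The runLCH call reflects common WS4J usage, but treat it as an assumption if your WS4J version differs.

import edu.cmu.lti.ws4j.WS4J;

public class InstanceSimilarity {

    // Plain dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1));
        return d[a.length()][b.length()];
    }

    // Equation (2): scaled by the length of the longer string, in [0, 1].
    static double lev(String ni, String oi) {
        int longer = Math.max(ni.length(), oi.length());
        return longer == 0 ? 1.0 : (longer - levenshtein(ni, oi)) / (double) longer;
    }

    // Equation (1): sum of the string score and the WordNet LCH score.
    // WS4J.runLCH is assumed here; swap in your WordNet similarity call if needed.
    static double sim(String ni, String oi) {
        return lev(ni, oi) + WS4J.runLCH(ni, oi);
    }

    public static void main(String[] args) {
        System.out.println(sim("java fx", "java")); // string-similar entities
    }
}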
3.4. Information Presentation
The aim of structuring and delivering the extracted information in a unified format is to facilitate access to and use of this information. Using this presentation when matching resumes/offers saves time and gives more accurate results, since only entities of the same type are compared instead of all entities. In addition, the extracted relations form a semantic map used to match documents and recommend the best job or candidate. Information presentation consists of structuring the extracted knowledge based on the relations between entities and their labels and attributes, and between the entities themselves. Therefore, we deliver the extracted knowledge as RDF triples. An RDF triple consists of three elements: subject, predicate, and object. Table 1 shows the possible presentation cases. Both resume and job offer data are presented as RDF triples based on the delivered description, which grants a unified representation of the extracted information.
Table 1. RDF triples description

Subject    | Predicate       | Object          | Description
Instance   | rdf:type        | Concept         | This RDF triple presents the relation between the extracted entity and its type using the JobOnto URI. See Figure 9.
Instance   | Attribute name  | Attribute value | This RDF triple presents the relation between the extracted entity and its attributes; the attribute name defines the relation between the entity and the attribute value. See Figure 10.
Instance 1 | Object property | Instance 2      | This RDF triple presents the relation between two extracted entities, defined by the object property between them. See Figure 11.
Figure 9. Example 1
Figure 10. Example 2
Figure 11. Example 3
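As an illustration of the three cases in Table 1, the following sketch builds such triples with Apache Jena. The JobOnto namespace URI and the element names are hypothetical stand-ins, since the paper does not publish the ontology's URIs.

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.RDF;

// Sketch of the three RDF-triple patterns of Table 1.
public class RdfPresentation {
    public static void main(String[] args) {
        String NS = "http://example.org/jobonto#"; // hypothetical JobOnto URI
        Model model = ModelFactory.createDefaultModel();

        Resource skillConcept = model.createResource(NS + "Skill");
        Resource android = model.createResource(NS + "android");
        Resource job = model.createResource(NS + "mobile_developer");

        // Case 1: instance rdf:type concept
        android.addProperty(RDF.type, skillConcept);
        // Case 2: instance -- attribute name --> attribute value
        job.addProperty(model.createProperty(NS, "title"), "developer of mobile applications");
        // Case 3: instance 1 -- object property --> instance 2
        job.addProperty(model.createProperty(NS, "has-experience"), android);

        model.write(System.out, "TURTLE");
    }
}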
4. Evaluation
To assess the performance of our process, we evaluated it on real-world data. As there is no standard dataset available for job descriptions or resumes, we gathered a dataset of 1000 job descriptions downloaded from https://www.indeed.com/ and https://www.kaggle.com/. The job descriptions from Indeed contain more complex sentences and more detailed information, unlike the data from Kaggle, which contains brief information describing only the required skills. We collected 100 resumes from https://www.linkedin.com/ that contain information about candidates of different levels (senior, beginner) and various specialties. The collected datasets belong to the field of information technology and computer science. The collected resumes and job posts are unstructured documents in different formats, such as .pdf and .doc.
The experiments were conducted on a laptop machine running Windows 10 with an Intel Core i5-4300U vPro processor and
16 GB RAM.
Evaluation Metrics: To evaluate the information extraction results, we used precision, which reports how many of the entities a system extracts from a resume/offer are correct, and recall, which reports how many of the relevant entities a system actually extracts. Thus, these two metrics can be seen as measures of correctness and completeness, respectively, and the F-measure is used as the weighted harmonic mean of precision and recall. We denote the relevant entities in the resume/offer as E = {e1, e2, ..., en} and the retrieved entities as Ê = {ê1, ê2, ..., ên}; see Equations (4)-(6).
$recall = \dfrac{|E \cap \hat{E}|}{|E|}$    (4)

$precision = \dfrac{|E \cap \hat{E}|}{|\hat{E}|}$    (5)

$F\text{-}measure = \dfrac{2 \times precision \times recall}{precision + recall}$    (6)
Table 2. Results of information extraction evaluation

                        Job offers                      Resumes
                        Recall   Precision   F-1        Recall   Precision   F-1
Skills                  0.880    0.911       0.895      0.948    0.961       0.954
Education               0.851    0.832       0.841      0.902    0.970       0.934
Work experience         0.792    0.810       0.800      0.843    0.830       0.836
Personal information    0.723    0.734       0.728      0.972    0.987       0.979
Assets/Interests        0.761    0.774       0.767      0.820    0.846       0.832
Table 2 shows the results of the information extraction approach in extracting the required/acquired entities from the documents. Exploiting entity dictionaries, punctuation, and sentence syntax in the extraction rules (JAPE rules) improves the extraction module and gives promising results. The table shows that the extraction results from resumes are better than those from job offers due to how the information is written: resume content is presented in short, simple sentences, unlike job descriptions, where sentences are more complicated, making it more difficult to extract all the correct entities.
The results represent the efficiency of extracting detailed knowledge in each block (skills, education, work experience, personal information, assets, and interests). Based on those results, we conducted a comparative study between our approach and other systems: one that exploits semantic methods, E-RecSys [19]; two based on machine learning, PROSPECT [12] and CHM [17]; and one based on extraction rules, SAJ [9]. We took the results directly from their papers and compared the precision on some commonly treated knowledge parts (education, skills).

Figure 12 shows a comparison with the three systems that extract information from resumes; the graph makes it evident that our process performs well compared to the other systems. In Figure 13, our process is compared with the SAJ system, which extracts entities from job offers. Although SAJ also uses JAPE rules, it achieved only 38% precision in extracting education entities, compared to 83% for our process. Unlike SAJ, which exploits only a domain entity dictionary in its JAPE rules, we utilize both syntactic and lexical rules to extract detailed information.
Figure 12. Comparative analysis among E-RecSys, PROSPECT, CHM, and our process (precision for education and skills)
Figure 13. Comparative analysis between our process and SAJ (precision for education and skills)
Besides measuring the precision of entity extraction, we evaluated how extracting relations and presenting information improves knowledge access and information retrieval. We manually selected some requirements from job offers and formed SPARQL queries searching for candidates matching those requirements. We generated SPARQL queries for different requirements (education, personal information, and work experience) and applied those queries over 50 resumes.
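A sketch of such a retrieval query with Apache Jena ARQ, assuming the illustrative JobOnto namespace and property names used earlier and a hypothetical resumes.rdf file of extracted triples:

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.*;

// Selects documents linked to a required skill over the extracted RDF triples.
public class CandidateQuery {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("resumes.rdf"); // hypothetical file of extracted triples

        String query = """
                PREFIX jo: <http://example.org/jobonto#>
                SELECT ?doc WHERE {
                  ?doc   jo:has-experience ?skill .
                  ?skill a jo:Skill .
                  FILTER regex(str(?skill), "android", "i")
                }""";
        try (QueryExecution qe = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next().get("doc"));
            }
        }
    }
}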
We compared the resumes selected by the SPARQL queries with the results of manual selection. Table 3 shows the obtained results and clearly demonstrates that our extraction and presentation process delivers a valuable, structured knowledge base that is easy to access and exploit.
Table 3. Results of resume retrieval

Required information    Manual selection    Automatic selection
Education               30                  30
Personal information    15                  15
Work experience         12                  12
5. Conclusion
This research proposed a semantic-based approach to extract information from resumes/offers and present it in a unified format. First, the document content is structured in blocks based on the label of each section; next, we apply lexical, syntactic, and semantic rules (JAPE rules) to extract domain entities from each block. Furthermore, based on the domain ontology (JobOnto), we extract semantic relations to link the extracted entities and build context. Finally, the extracted information is presented in RDF triples to grant a unified format for both resume and job offer contents. The JobOnto is dynamically enriched during the process.

The evaluation was performed on a dataset of 1000 jobs and 100 resumes. The initial assessment was conducted by comparing verified data with the extracted entities.
Currently, we are working on a matching process that exploits the RDF file of the extracted information and the proposed
ontology. We are also focusing on generalizing our process and exploring datasets of different fields.
References
1. Bizer, C., Heese, R., Mochol, M., Oldakowski, R., Tolksdorf, R., and Eckstein, R. The Impact of Semantic Web Technologies on Job Recruitment Processes. In Wirtschaftsinformatik, Physica, Heidelberg, pp. 1367-1381, 2005.
2. Mochol, M., Wache, H., and Nixon, L. Improving the Accuracy of Job Search with Semantic Techniques. In International Conference on Business Information Systems, Springer, Berlin, Heidelberg, pp. 301-313, 2007.
3. Ricci, F., Rokach, L., and Shapira, B. Recommender Systems: Introduction and Challenges. In Recommender Systems Handbook, Springer, Boston, MA, pp. 1-34, 2015.
4. Al-Otaibi, S. T., and Ykhlef, M. A Survey of Job Recommender Systems. International Journal of Physical Sciences, vol. 7, no. 29, pp. 5127-5142, 2012.
5. Tran, M. L., Nguyen, A. T., Nguyen, Q. D., and Huynh, T. A Comparison Study for Job Recommendation. In 2017 International Conference on Information and Communications (ICIC), IEEE, pp. 199-204, 2017.
6. Dhameliya, J., and Desai, N. Job Recommender Systems: A Survey. In 2019 Innovations in Power and Advanced Computing Technologies (i-PACT), IEEE, vol. 1, pp. 1-5, 2019.
7. Chen, J., Zhang, C., and Niu, Z. A Two-Step Resume Information Extraction Algorithm. Mathematical Problems in Engineering, 2018.
8. Kessler, R., Torres-Moreno, J. M., and El-Bèze, M. E-Gen: Automatic Job Offer Processing System for Human Resources. In Mexican International Conference on Artificial Intelligence, Springer, Berlin, Heidelberg, pp. 985-995, 2007.
9. Ahmed Awan, M. N., Khan, S., Latif, K., and Khattak, A. M. A New Approach to Information Extraction in User-Centric E-Recruitment Systems. Applied Sciences, vol. 9, no. 14, 2019.
10. Zhang, N. R. Hidden Markov Models for Information Extraction. Technical Report, Stanford Natural Language Processing Group, 2001.
11. Lafferty, J., McCallum, A., and Pereira, F. C. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001.
12. Singh, A., Rose, C., Visweswariah, K., Chenthamarakshan, V., and Kambhatla, N. PROSPECT: A System for Screening Candidates for Recruitment. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 659-668, 2010.
13. Yu, K., Guan, G., and Zhou, M. Resume Information Extraction with Cascaded Hybrid Model. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pp. 499-506, 2005.
14. Martinez-Rodriguez, J. L., Hogan, A., and Lopez-Arevalo, I. Information Extraction Meets the Semantic Web: A Survey. Semantic Web, vol. 11, no. 2, pp. 255-335, 2020.
15. Senthil Kumaran, V., and Sankar, A. Towards an Automated System for Intelligent Screening of Candidates for Recruitment Using Ontology Mapping (EXPERT). International Journal of Metadata, Semantics and Ontologies, vol. 8, no. 1, pp. 56-64, 2013.
16. Çelik, D., and Elçi, A. An Ontology-Based Information Extraction Approach for Résumés. In Joint International Conference on Pervasive Computing and the Networked World, Springer, Berlin, Heidelberg, pp. 165-179, 2012.
17. Guo, S., Alamudun, F., and Hammond, T. RésuMatcher: A Personalized Résumé-Job Matching System. Expert Systems with Applications, vol. 60, pp. 169-182, 2016.
18. Yahiaoui, L., Boufaïda, Z., and Prié, Y. Semantic Annotation of Documents Applied to E-Recruitment. In SWAP, 2006.
19. Ben Abdessalem Karaa, W., and Mhimdi, N. Using Ontology for Resume Annotation. International Journal of Metadata, Semantics and Ontologies, vol. 6, no. 3-4, pp. 166-174, 2011.
20. Maree, M., Kmail, A. B., and Belkhatir, M. Analysis and Shortcomings of E-Recruitment Systems: Towards a Semantics-Based Approach Addressing Knowledge Incompleteness and Limited Domain Coverage. Journal of Information Science, vol. 45, no. 6, pp. 713-735, 2019.
21. Leacock, C., and Chodorow, M. Combining Local Context and WordNet Similarity for Word Sense Identification. WordNet: An Electronic Lexical Database, vol. 49, no. 2, pp. 265-283, 1998.