ML for Global Health - ICML Workshop
2020
REVIEW OF FEATURE IMPORTANCE FOR MENTAL HEALTH
INTERVENTIONS
Rhythm Bhatia
University of Eastern Finland
rbhatia@student.uef.fi
EXTENDED ABSTRACT
Mental health affects our emotional, social and psychological well-being and is an integral part of a balanced
and healthy life. For working professionals, especially doctors, long working hours and heavy workloads
often appear to affect mental health. The emotional well-being of doctors is a matter of concern not
only for them, but also for us [11]. Because little research and data are available to determine the factors
affecting mental health, this study was conducted on a publicly available dataset collected through a survey
of technology workers [1].
Data source
The data for this feature analysis was hosted on Kaggle.com and was collected by Open Sourcing Mental
Illness, Ltd (OSMI) under an open Creative Commons license (CC BY-SA 4.0). The dataset consists of responses
to a subset of the questions in the OSMI Mental Health in Tech survey. The respondents work in
technology firms across multiple countries. Data for this survey was collected from August 2014 to
February 2016 [1].
Variables
Feature analysis was performed across the participant characteristic variables of the survey, which are
defined below:
Variable Name                  Definition
Age                            Age of participant
Gender                         Gender of participant
Family history                 History of mental illness in participant’s family.
Benefits                       Mental health benefits plan provided by participant’s employer.
Care option                    Awareness of mental health benefits plan provided by participant’s employer.
Anonymity                      Option to protect identity if participant chooses to take advantage of mental health resources.
Leave                          How easy it is for participant to take medical leave from work for a mental health condition?
Work interfere                 If participant has a mental health condition, does he/she feel that it interferes with their work?
Mental health consequence      Are there negative consequences of discussing a mental health issue with your employer?
Physical health consequence    Are there negative consequences of discussing a physical health issue with your employer?
Table 1: Variable definitions
All binary variables with the values “yes” or “no” were coded numerically as 0 or 1. ’Age’, in years, was left
as is. ’Work interfere’ and ’Leave’ were coded into ordinal bins consisting of never, rarely, sometimes and
often. These ordinal measures were coded in increments of 1, with the lowest level having the value
0. The gender variable was coded as 1 for females and 0 for males. Only 172 out of 1,251
responses were used for analysis due to the uncertainty of the survey responses from the participants [9].
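The coding scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the key names, the category strings and the choice of "yes" = 1 are assumptions, since the source only states that binary answers were coded as 0 or 1.

```python
# Hypothetical label strings and key names; the survey's exact values may differ.
BINARY = {"no": 0, "yes": 1}            # assumption: "yes" maps to 1
ORDINAL = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3}
GENDER = {"male": 0, "female": 1}       # females = 1, males = 0, as stated in the text


def encode_response(row):
    """Encode one survey response (a dict of raw answers) into numeric features."""
    encoded = {}
    for key, value in row.items():
        text = str(value).strip().lower()
        if key == "age":
            encoded[key] = int(value)       # age in years is left as is
        elif key == "gender":
            encoded[key] = GENDER[text]
        elif key in ("work_interfere", "leave"):
            encoded[key] = ORDINAL[text]    # ordinal bins in increments of 1
        else:
            encoded[key] = BINARY[text]     # remaining yes/no variables
    return encoded


example = {"age": 31, "gender": "Female", "family_history": "Yes",
           "work_interfere": "Sometimes", "leave": "Rarely", "anonymity": "No"}
encoded_example = encode_response(example)
print(encoded_example)
```

A response containing an answer outside these mappings raises a KeyError, which is one simple way to surface the uncertain responses that were excluded from the analysis.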
Feature analysis
Feature                        Linear Regression  Logistic Regression  Permutation importance  Xgboost   χ2 p-value
Age                            0.10650            0.65304              -0.00270                0.05079   NAN
Gender                         -0.08078           -0.51353             0.00016                 0.05566   0.23919
Family history                 0.18267            1.13361              0.00955                 0.08413   0.153238
Benefits                       0.04342            0.32557              0.02259                 0.06588   8.01e−105
Care options                   0.04210            0.27556              0.01671                 0.05849   4.6075e−53
Anonymity                      0.02377            0.15819              0.02355                 0.06305   3.7375e−36
Leave                          0.00170            0.02007              0.02848                 0.05782   5.8297e−16
Work interfere                 0.16501            0.92513              0.26476                 0.46566   6.764e−06
Mental health consequences     -0.00500           -0.03414             0.00414                 0.05291   4.4497e−05
Physical health consequences   -0.00020           0.00750              -0.00095                0.04560   0.0145355
Table 2: Feature values based on the various methods
The above table reports the feature importance scores of the variables with respect to the "seek
help" variable. The analysis was implemented using the scikit-learn library [10] for linear regression [5],
logistic regression [8], permutation importance [2], Xgboost [4] and the χ2 test [7].
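As a rough sketch of how three of these scores can be obtained with scikit-learn, the example below fits a linear and a logistic regression and computes permutation importance. It runs on synthetic data, not the survey data. It also assumes that the regression scores are the fitted coefficients, which the source does not state explicitly, and the feature names are placeholders. Xgboost is left out to keep the example dependency-light.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic stand-in for the encoded survey data (assumption, not the real dataset).
rng = np.random.default_rng(0)
n_samples = 200
feature_names = ["family_history", "benefits", "anonymity"]  # placeholder names
X = rng.integers(0, 2, size=(n_samples, 3)).astype(float)

# Synthetic binary target standing in for the "seek help" variable.
logits = 1.5 * X[:, 0] + 0.3 * X[:, 1] - 0.5
y = (rng.random(n_samples) < 1 / (1 + np.exp(-logits))).astype(int)

# Coefficient-based scores from the two regression models.
linear = LinearRegression().fit(X, y)
logistic = LogisticRegression().fit(X, y)

# Permutation importance: mean drop in accuracy when one feature column is shuffled.
perm = permutation_importance(logistic, X, y, n_repeats=10, random_state=0)

scores = {
    name: {
        "linear": float(linear.coef_[i]),
        "logistic": float(logistic.coef_[0][i]),
        "permutation": float(perm.importances_mean[i]),
    }
    for i, name in enumerate(feature_names)
}
for name, row in scores.items():
    print(name, row)
```

Coefficients are only comparable across features when the features share a scale, so a variable such as age in years would normally be standardized before its coefficient is read as an importance score.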
Results
Based on the above reported feature analysis and χ2 test, we observe that benefits, anonymity, leave, physical
health consequences and mental health consequences are the relevant features. A small p-value
(< 0.05) indicates that we can reject the null hypothesis that there is no relationship between the selected
feature and the help-seeking behaviour of the participant, and conclude that there is a statistically
significant relationship between the two variables. The χ2 test is not valid for age because some of the
expected frequencies are less than 1.
These results also suggest that there are negative consequences of discussing mental and physical health
issues with the employer. Also, mental health benefits and care options alone do not motivate
employees to seek help. However, if employees are given the option to maintain their anonymity and take
leave, it will positively impact their behaviour to opt for mental health services.
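The χ2 decision rule used in these results can be illustrated with a small, self-contained example. The sketch below implements Pearson's χ2 test of independence for a 2x2 contingency table without continuity correction. The counts are invented for illustration, and the source does not state which χ2 implementation was used in the study.

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value, expected). No continuity correction is applied.
    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected


# Invented counts: rows = feature value (no, yes), columns = sought help (no, yes).
observed = [[30, 10], [15, 45]]
statistic, p_value, expected = chi2_independence_2x2(observed)
min_expected = min(min(row) for row in expected)
print(f"chi2 = {statistic:.4f}, p = {p_value:.3g}, smallest expected count = {min_expected}")

# Reject the null hypothesis of no relationship when p < 0.05.
reject_null = p_value < 0.05
```

Checking the smallest expected count before trusting the p-value mirrors the remark about age: when a variable has many distinct values, some expected counts fall below 1 and the χ2 approximation is no longer reliable.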
Future Work
These results can help in selecting features for models that aim to identify and support
medical professionals dealing with mental health issues due to the COVID-19 pandemic. The pandemic
has put medical care professionals all over the world in an unusual situation where they have to work
under extreme pressure with scarce resources, take tough decisions and deal with life and death. They suffer
from moral injury, a military term defined as the damage done to one’s conscience by one’s actions, or
the lack of them. This can contribute to mental health difficulties, including depression, post-traumatic stress
disorder and suicide [6]. A mental health study conducted on medical staff in China also indicated significant
behavioural changes among the staff attending to COVID-19 patients [3].
REFERENCES
[1] Kaggle dataset: https://www.kaggle.com/osmi/mental-health-in-tech-survey.
[2] André Altmann, Laura Toloşi, Oliver Sander, and Thomas Lengauer. Permutation importance: a corrected feature importance measure. Bioinformatics, 26(10):1340–1347, 2010.
[3] Qiongni Chen, Mining Liang, Yamin Li, Jincai Guo, Dongxue Fei, Ling Wang, Li He, Caihua Sheng, Yiwen Cai, Xiaojuan Li, et al. Mental health care for medical staff in China during the COVID-19 outbreak. The Lancet Psychiatry, 7(4):e15–e16, 2020.
[4] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
[5] David A Freedman. Statistical models: theory and practice. Cambridge University Press, 2009.
[6] Neil Greenberg, Mary Docherty, Sam Gnanapragasam, and Simon Wessely. Managing mental health challenges faced by healthcare workers during covid-19 pandemic. BMJ, 368, 2020.
[7] Priscilla E Greenwood and Michael S Nikulin. A guide to chi-squared testing, volume 280. John Wiley & Sons, 1996.
[8] David G Kleinbaum, K Dietz, M Gail, and Mitchel Klein. Logistic regression. Springer, 2002.
[9] Pratik Patel. Perceived workplace factors and their influence on self-reported mental health service seeking among technology workers, 2018.
[10] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825–2830, 2011.
[11] Reidar Tyssen and Per Vaglum. Mental health problems among young doctors: an updated review of prospective studies. Harvard Review of Psychiatry, 10(3):154–165, 2002.