Measuring Happiness using Random Forest *
Utsav Kumar
utsavkumar24x7@gmail.com
Nishank Deep
nishankdeep@gmail.com
14 April 2023
Abstract
This research report aims to measure happiness through a dataset consisting of various features such as age, gender, freedom of life choices, family environment, friends
circle, love life, working environment, achieved goals, finance, health, optimism, self-growth, self-perception, level of corruption, and scale of happiness. The dataset was
collected through a survey conducted among a diverse group of individuals from different backgrounds.
The study utilized statistical analysis techniques such as correlation analysis, regression analysis, and machine learning algorithms to identify the factors that contribute to happiness. The results showed that several factors such as family environment, love life, financial stability, and level of optimism had a significant impact on
happiness. Moreover, the study found that the level of corruption in society negatively
affects the happiness of individuals.
1 Introduction
Happiness is a subjective measure that can be influenced by various factors in an individual’s life. To measure happiness, it is essential to consider different features of an individual’s life, including personal, social, and professional aspects. In this research report,
we have created a dataset of personal features that can be used to measure an individual’s happiness level. The dataset includes various features such as age, gender, family
environment, friends circle, love life, working environment, finance, health, optimism,
self-growth, self-perception, level of corruption, and freedom of life choices.
2 Literature review
The concept of happiness has been a subject of interest and inquiry across various fields,
including psychology, economics, philosophy, and sociology. Numerous studies have been
conducted to examine the factors that influence happiness and well-being. In recent years,
there has been an increasing focus on the role of individual characteristics and life circumstances in shaping happiness levels. This report aims to contribute to this body of
knowledge by examining a dataset that includes various features related to personal and
social factors that may impact happiness.
Previous research on happiness has identified several key determinants, including freedom of choice, social support, work satisfaction, income, health, and optimism. A study by
Diener and Seligman (2002) found that subjective well-being is positively correlated with
income, social relationships, and physical health. Another study by Lyubomirsky, Sheldon, and Schkade (2005) found that happiness levels can be improved through intentional
and sustainable efforts, such as practicing gratitude, developing positive relationships,
and engaging in meaningful activities.
* Machine Learning Project
Overall, the literature suggests that happiness is a complex and multifaceted construct
that is influenced by a range of personal and social factors. This report aims to build on
this knowledge by analyzing a dataset that includes various features related to individual
characteristics, life circumstances, and social factors that may impact happiness levels.
3 Methodology
The first step is to collect data on the features that are believed to impact happiness: age, gender, freedom of life choices, family environment, friends circle,
love life, working environment, goal achievement, finance, health, optimism, self-growth,
self-perception, and level of corruption. Each record also holds the respondent’s name and the scale of happiness, which is the value the model is later asked to predict.
Each feature is described as follows:
• Name: name of the individual
• Age: age of the individual
• Gender: gender of the individual
• Freedom of life choices: a score from 1 to 3 indicating how much freedom the individual has to make choices in life
• Family environment: a score from 1 to 3 indicating how supportive and positive the family environment is
• Friends circle: a score from 1 to 3 indicating the quality of the individual’s social circle
• Love life: a score from 1 to 3 indicating the individual’s satisfaction with their love life
• Working environment: a score from 1 to 3 indicating how positive and fulfilling the individual’s work environment is
• Have you achieved your goal: a score from 1 to 3 indicating the extent to which the individual feels they have achieved their goals in life
• Finance: a score from 1 to 3 indicating the individual’s financial stability and security
• Health: a score from 1 to 3 indicating the individual’s physical and mental health
• Optimism: a score from 1 to 3 indicating the individual’s level of optimism about their future
• Self growth: a score from 1 to 3 indicating the individual’s efforts and progress in personal growth and development
• Self perception: a score from 1 to 3 indicating the individual’s self-perception and self-esteem
• Level of corruption: a score from 1 to 3 indicating the extent to which corruption is prevalent in the individual’s environment
• Scale of happiness: a score from 1 to 3 indicating the individual’s level of overall happiness
On the 1 to 3 scale, the scores represent:
• 1 - Good
• 2 - Average
• 3 - Bad
A lower score is therefore the more favorable rating.
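The feature list above can be read as a simple schema. The following is a minimal sketch of a check that one survey response respects it. The column names are hypothetical, since the report does not give the actual headers, and the rule that age must be a positive integer is an added assumption.

```python
# Hypothetical column names for the thirteen 1-3 scored items.
SCORE_FIELDS = [
    "freedom", "family", "friends", "love_life", "work", "goals",
    "finance", "health", "optimism", "self_growth", "self_perception",
    "corruption", "happiness",
]
SCALE_LABELS = {1: "Good", 2: "Average", 3: "Bad"}


def validate_response(row: dict) -> list:
    """Return a list of problems found in one survey response."""
    problems = []
    for field in SCORE_FIELDS:
        value = row.get(field)
        if value not in SCALE_LABELS:
            problems.append(f"{field}: expected 1, 2 or 3, got {value!r}")
    age = row.get("age")
    # Assumption: age is recorded as a positive whole number of years.
    if not isinstance(age, int) or age <= 0:
        problems.append(f"age: expected a positive integer, got {age!r}")
    return problems


good = {field: 2 for field in SCORE_FIELDS}
good.update({"age": 25, "gender": "F"})
bad = dict(good, optimism=5, age=None)

print(validate_response(good))
print(validate_response(bad))
```

Running such a check before any analysis makes out-of-range answers visible early instead of letting them distort later statistics.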
Once the data has been collected, it needs to be cleaned to remove any missing values or
outliers. This ensures that the data is accurate and can be used for analysis.
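As an illustration of this cleaning step, the sketch below drops incomplete rows and then removes age outliers with the common 1.5 × IQR rule. The report does not say which outlier rule was used, so the rule, the field names, and the toy rows are all assumptions.

```python
import statistics


def drop_missing(rows, required):
    """Keep only rows in which every required field is present and not None."""
    return [r for r in rows if all(r.get(k) is not None for k in required)]


def drop_age_outliers(rows, k=1.5):
    """Remove rows whose age lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ages = [r["age"] for r in rows]
    q1, _, q3 = statistics.quantiles(ages, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r["age"] <= high]


# Toy rows for illustration only; these are not survey data.
rows = [
    {"age": 21, "happiness": 1},
    {"age": 22, "happiness": 2},
    {"age": 23, "happiness": 2},
    {"age": 24, "happiness": 3},
    {"age": 25, "happiness": 1},
    {"age": 26, "happiness": 2},
    {"age": 27, "happiness": 1},
    {"age": 230, "happiness": 2},   # implausible age, likely a typo
    {"age": 28, "happiness": None}, # missing target value
]

complete = drop_missing(rows, required=["age", "happiness"])
cleaned = drop_age_outliers(complete)
print(len(rows), len(complete), len(cleaned))
```

Because the scored items are restricted to 1, 2, or 3, outlier removal is only meaningful for a free numeric field such as age.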
Feature engineering involves selecting the most relevant features and transforming
them into a format that can be used for analysis. This may involve scaling the data or
transforming categorical variables into numerical variables.
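The two transformations mentioned here, scaling and encoding categorical variables, might look as follows. This is a sketch under assumptions: the gender categories and the choice of min-max scaling are illustrative and are not taken from the report.

```python
def encode_gender(value, categories=("female", "male", "other")):
    """One-hot encode a gender label; unknown labels map to all zeros."""
    label = value.strip().lower()
    return [1 if label == c else 0 for c in categories]


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [18, 30, 42]
genders = ["Male", "female", "Other"]

scaled_ages = min_max_scale(ages)
features = [[a] + encode_gender(g) for a, g in zip(scaled_ages, genders)]
print(features)
```

Tree-based models such as random forest do not require scaling, so this step matters mainly for the scale-sensitive models used in a comparison, such as support vector machines.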
The next step is to analyze the data to identify any patterns or correlations between
the features and the scale of happiness. This may involve statistical analysis or machine
learning algorithms to identify the most important features.
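One simple way to carry out this step is to compute the correlation of every feature with the scale of happiness and rank the features by absolute value. The sketch below uses the Pearson coefficient on made-up toy columns. Both the choice of coefficient and the numbers are assumptions, and a rank-based coefficient such as Spearman's may suit three-level ordinal scores better.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / (norm_x * norm_y)


# Toy columns for illustration only; these are not survey data.
data = {
    "optimism":   [1, 1, 2, 3, 3, 2],
    "corruption": [3, 2, 2, 1, 1, 2],
    "happiness":  [1, 1, 2, 3, 3, 2],
}

target = data["happiness"]
ranking = sorted(
    ((name, pearson(col, target)) for name, col in data.items()
     if name != "happiness"),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:+.3f}")
```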
Once the relevant features have been identified, a model can be built to predict the
scale of happiness based on these features. This may involve using regression models or
machine learning algorithms.
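A minimal sketch of the model-building step with a random forest, assuming scikit-learn and NumPy are available. The data are synthetic stand-ins generated at random, the target rule is invented for the example, and the hyperparameters are illustrative rather than those used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the survey: 300 respondents, 12 features scored 1-3.
rng = np.random.default_rng(0)
X = rng.integers(1, 4, size=(300, 12))

# Toy target (an assumption, not the survey's labels): bin the sum of the
# first three features into the classes 1, 2 and 3.
y = np.digitize(X[:, :3].sum(axis=1), [5.5, 6.5]) + 1

# Hold out a quarter of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
importances = model.feature_importances_
print(f"test accuracy: {test_accuracy:.2f}")
print("most important feature index:", int(importances.argmax()))
```

The `feature_importances_` attribute gives a second view of which features matter, alongside the correlation analysis.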
The final step is to evaluate the performance of the model to ensure that it is accurate
and reliable. This may involve using metrics such as mean squared error or accuracy.
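For a three-class target, accuracy, precision, recall, and F1-score can all be derived from the confusion matrix. The sketch below shows the arithmetic on made-up labels. Macro-averaging over the classes is an assumption, since the report does not state which averaging scheme it used.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def macro_scores(matrix):
    """Return accuracy and macro-averaged precision, recall and F1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for illustration only; these are not the study's predictions.
y_true = [1, 1, 2, 2, 3, 3, 3, 2]
y_pred = [1, 2, 2, 2, 3, 3, 1, 2]

matrix = confusion_matrix(y_true, y_pred, labels=[1, 2, 3])
accuracy, precision, recall, f1 = macro_scores(matrix)
print(matrix)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```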
4 Results
Our results showed that the random forest classifier was effective in predicting the scale of happiness. The model achieved an accuracy of 0.85, precision of 0.84, recall of 0.83, and an
F1-score of 0.83. The confusion matrix showed that the model correctly classified most individuals into the three levels of happiness, with few misclassifications.
We also compared the performance of random forest with other classification algorithms such as support vector machines and decision trees. Random forest performed better than support vector machines and decision trees in terms
of accuracy and F1-score.
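A comparison of this kind can be sketched with cross-validation, assuming scikit-learn and NumPy are available. The data are synthetic, the target rule is invented for the example, and all models use default or illustrative settings, so the printed numbers say nothing about the study's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 300 respondents, 12 features scored 1-3, with a
# toy three-class target derived from the first three features.
rng = np.random.default_rng(1)
X = rng.integers(1, 4, size=(300, 12))
y = np.digitize(X[:, :3].sum(axis=1), [5.5, 6.5]) + 1

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": SVC(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validation, scoring accuracy and macro-averaged F1.
    scores = cross_validate(model, X, y, cv=5,
                            scoring=["accuracy", "f1_macro"])
    results[name] = (
        float(scores["test_accuracy"].mean()),
        float(scores["test_f1_macro"].mean()),
    )

for name, (acc, f1) in results.items():
    print(f"{name}: accuracy={acc:.3f} macro-F1={f1:.3f}")
```

Scoring every model on the same folds keeps the comparison fair, because each algorithm sees identical training and validation splits.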
5 Discussion and Conclusions
In conclusion, the dataset of personal features we have created can be used to measure an
individual’s happiness level. The analysis shows that personal, social, and professional
aspects of an individual’s life can have a significant impact on their happiness level. The
findings can be used to develop policies and interventions that can improve an individual’s
happiness level by addressing the factors that have a negative correlation with happiness,
such as corruption and stressful working environments. Overall, the dataset and analysis
provide a useful tool for understanding happiness and developing strategies to enhance
it.
Figure 1: Correlation between each pair of features
Figure 2: Confusion matrix
Figure 3: Classification Report