CS 487/519 APPLIED MACHINE LEARNING I AT NEW MEXICO STATE UNIVERSITY UNDER DR. HUIPING CAO
"Pokémon Predictor" Project Stage III
Ziad Arafat†, Student, NMSU; Angel Camacho†, Student, NMSU; Mathew Groover†, Student, NMSU; Jason Ivey, Member, NMSU Nano-Sat Lab
Abstract—This report details the preliminary motivation and description of the "Pokémon Predictor" assignment in CS 487/519 Applied Machine Learning I at New Mexico State University.
Keywords—Computer Society, IEEEtran, journal, LATEX, Machine Learning, PyTorch.
1 MOTIVATION
Our machine learning case study focuses on a Pokémon battle predictor. Online competitive games have boomed in popularity in recent years, and Pokémon battles are no exception. Our motive is simple: we want the battle predictor to give the user an advantage over their competitor. For effective use, the predictor has one situational condition: it must be used before the battle takes place.
NMSU
March 16th, 2022
• A. Camacho is a student in the Computer Science department of New Mexico State University. E-mail: angelcam@nmsu.edu
• M. Groover is a student in the Computer Science department of New Mexico State University. E-mail: mgroov@nmsu.edu
• J. Ivey is a student in the Computer Science department of New Mexico State University, as well as a member of the INCA/SAS-Sat Nano-Satellite teams. E-mail: jiveyguy@nmsu.edu
• Z. Arafat is a student in the Computer Science department of New Mexico State University. E-mail: ziada@nmsu.edu
Manuscript received March 16, 2022; revision is scheduled for late March 2022.

1.1 INTRODUCTION
The problem in our case study is to predict which party is more likely to win given the input parameters (each Pokémon and its attributes, or "stats"). The training data consist of a list of matchups and their results (the target classes).

1.2 PROPOSED SOLUTION
Our proposed method relies on four major steps, beginning with analyzing and visualizing the data.
First, we will collect and analyze information about the classes, such as size, noise, and randomness, and then determine which data are most relevant to our case.
Second, we will use pre-processing techniques to restructure and augment our data to fit our problem definition and other needs. This includes dimension reduction, standardization, pipelines, and restructuring of the data.
For dimension reduction, we will experiment with different methods to determine which works best for our data. The candidates are PCA, Kernel PCA, and LDA.
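As a rough illustration of how the three candidate techniques could be tried side by side, the sketch below runs each one in scikit-learn on synthetic two-class data. The data, the component counts, and the RBF kernel settings are assumptions made for this example and are not taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the battle features (hypothetical, not the Kaggle data).
X, y = make_classification(
    n_samples=200, n_features=12, n_informative=6, random_state=0
)
X_std = StandardScaler().fit_transform(X)

reducers = {
    "PCA": PCA(n_components=2),
    "Kernel PCA": KernelPCA(n_components=2, kernel="rbf", gamma=0.1),
    # LDA is supervised and yields at most (n_classes - 1) components,
    # so a two-class win/lose target gives a single component.
    "LDA": LinearDiscriminantAnalysis(n_components=1),
}

reduced = {}
for name, reducer in reducers.items():
    # LDA uses the labels; PCA and Kernel PCA accept and ignore them.
    reduced[name] = reducer.fit_transform(X_std, y)
    print(name, reduced[name].shape)
```

Note that with a binary target LDA can only produce one component, which may matter when comparing it against the unsupervised methods.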
For standardization, we will select the scaling method appropriate to each algorithm, because different dimension reduction and machine learning algorithms call for different ways of standardizing the data.
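To make the difference between scaling methods concrete, the following sketch applies two common scikit-learn scalers to a small array. The numbers are made up for illustration, and the choice of these two scalers is an assumption; the project may end up using others.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical stat columns (for example HP, Attack, Speed) for four Pokémon.
stats = np.array([
    [40.0, 50.0, 45.0],
    [60.0, 65.0, 60.0],
    [80.0, 85.0, 80.0],
    [35.0, 55.0, 70.0],
])

# Z-score scaling: each column ends up with mean 0 and unit variance.
z_scored = StandardScaler().fit_transform(stats)

# Min-max scaling: each column is mapped onto the range [0, 1].
min_maxed = MinMaxScaler().fit_transform(stats)

print(z_scored.round(2))
print(min_maxed.round(2))
```

Z-score scaling is a common default before PCA, while min-max scaling keeps values in a bounded range, which some models prefer.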
With pipelines, we will test different methods and hyperparameters. This will be done in scikit-learn, which includes a module that facilitates this process.
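A minimal sketch of such a pipeline, assuming scikit-learn's Pipeline and GridSearchCV, is given below. The synthetic data, the logistic regression classifier, and the parameter grid are placeholders chosen for the example, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the battle features (hypothetical data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling, reduction, and classification chained into one estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Step names prefix the hyperparameters they belong to.
param_grid = {
    "reduce__n_components": [2, 4, 8],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

print(search.best_params_)
print(test_accuracy)
```

Because the scaler and the reducer sit inside the pipeline, they are refit on each cross-validation fold, which avoids leaking information from the held-out fold.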
When structuring the data, we will include not only the data that the Kaggle dataset provides but also the corresponding attributes of the Pokémon in each record. That is, we will combine these attributes with the Kaggle dataset, which in turn provides more features. Furthermore, the "ID" feature will not be used in training, because the goal is to train on the attributes rather than on individual IDs. This should make it easier for the algorithm to handle new data.
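The restructuring described above might look like the following sketch in plain Python. The field names (first, second, winner), the stat names, and all values are hypothetical and chosen only for illustration; the actual dataset columns may differ.

```python
# Hypothetical per-Pokémon stats keyed by ID (values are made up).
pokemon_stats = {
    1: {"hp": 45, "attack": 49, "defense": 49, "speed": 45},
    2: {"hp": 60, "attack": 62, "defense": 63, "speed": 60},
    3: {"hp": 39, "attack": 52, "defense": 43, "speed": 65},
}

# Hypothetical matchup records: two combatant IDs and the winner's ID.
matchups = [
    {"first": 1, "second": 2, "winner": 2},
    {"first": 3, "second": 1, "winner": 3},
    {"first": 2, "second": 3, "winner": 2},
]

STAT_NAMES = ["hp", "attack", "defense", "speed"]


def build_row(matchup, stats):
    """Return (features, label) for one matchup, keeping IDs out of the features."""
    first = stats[matchup["first"]]
    second = stats[matchup["second"]]
    features = [first[name] for name in STAT_NAMES]
    features += [second[name] for name in STAT_NAMES]
    # Label is 1 when the first Pokémon wins, 0 otherwise.
    label = 1 if matchup["winner"] == matchup["first"] else 0
    return features, label


rows = [build_row(m, pokemon_stats) for m in matchups]
X = [features for features, _ in rows]
y = [label for _, label in rows]

print(X[0], y[0])
```

The IDs are used only to look up attributes and to derive the label, so a Pokémon that never appeared in training can still be scored from its stats.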
Third, to train our models properly, we are implementing a pipeline-based training process. We are using a predetermined set of data, which will speed up training. With a pipeline system we will be able to iterate rapidly and change hyperparameters, as well as try multiple models to determine the best one for our data. Once we determine which model works best according to metrics such as accuracy, time, and precision, we will seek to improve the selected model further and conclude with our final battle predictor model.
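One way the comparison by accuracy, precision, and time could be set up is sketched below. The three candidate classifiers and the synthetic data are assumptions for the sake of the example; the report does not state which models will be tried.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the battle features (hypothetical data).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models; the selection here is illustrative only.
candidates = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k-nearest neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier()
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in candidates.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    predictions = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "precision": precision_score(y_test, predictions),
        "fit_seconds": fit_seconds,
    }

# Pick the most accurate model; ties and the other metrics are ignored here.
best = max(results, key=lambda name: results[name]["accuracy"])
print(best, results[best])
```

In practice the selection rule would likely weigh all three metrics rather than accuracy alone.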
Finally, the goal will be a standalone Python executable. It will use the researched and completed models to predict which Pokémon will win. Our focus in this standalone application is a quick and usable graphical user interface (GUI). We wish to use ideas such as spacing, signifiers, and color theory to make an intuitive and discoverable interface. Our design assumes that the final user can read and use a keyboard with relative ease.
REFERENCES
[1] J. Bouchet, Pokemon battles (October 2017). Retrieved February 23, 2022 from https://www.kaggle.com/jonathanbouchet/pokemonbattles/report
Angel Camacho did not write his
biography.
Mathew Groover is a senior in computer science at NMSU graduating this May. He completed a full internship in the summer of 2020. He enjoys programming games on the side as a hobby.
Jason Ivey is a current senior at New Mexico State University focusing on artificial intelligence and minoring in Electrical Engineering. He is a current member of the NMSU Electrical Engineering Department's Nano-Sat lab and previously delivered a nano-satellite to a publicly traded spaceflight company. This satellite was NMSU's first satellite mission and the first instance of NMSU code in low Earth orbit.
Ziad Arafat is a computer science
major at NMSU with a strong career
interest in AI and data science. Ziad
seeks to apply artificial intelligence
for safety inspection systems and
image analysis. He also enjoys finding clever ways to automate tedious
business tasks at his work.
ACKNOWLEDGMENTS
Thank you to Dr. Huiping Cao, Shahriar Rahman Dipon, and Erick Draayer of NMSU, as well as IEEE, Overleaf, and the LaTeX community, for making documentation and presets readily available to us while writing this paper.