On Statistical Approaches to Online Game Bot Detection P9592205 廖宜財 Advisor: 朱浩華 博士

On Statistical Approaches to
Online Game Bot Detection
P9592205 廖宜財
Advisor: 朱浩華 博士
Co-advisor: 陳昇瑋 博士
Date: 2009/7/21
Outline
• Introduction
• Related Work
• FPS Bot Detection
• Rhythm Bot Detection
• Future Work
• Conclusion
2016/6/28
2/80
MMOG
• MMOGs (Massively Multiplayer Online Games) have become one of the most popular Internet activities
www.mic.iii.org.tw
MMOG Genres
• RPG (Role-Playing Game)
• SLG (Simulation Game)
• FPS (First-Person Shooter) Game
• Gaming
• Rhythm Game
• …
Major Challenge: Cheating
• Why cheat?
– To profit from the game
– Self-transcendence
• Game cheating types
– Hacker/game exploits
• Done by expert programmers
• Serious, but can be prevented by hardware/game architecture
– Game bots
• AI programs that can perform some tasks in place of gamers
• Anyone can download and use them!
Game Bots
• Unfair competition
– Bots accumulate rewards efficiently, 24 hours a day
• Economic problems
– Easy money → currency inflation
– Fewer newcomers join
– The game provider will close the game
The Challenge of Bot Detection
• Hard to detect
– Bots obey the game rules perfectly
– No general detection methods are available today
Our Goal for Bot Detection
• Propose bot detection approaches that are as general as possible
• Passive detection
– No intrusion into players’ gaming experience
• No client software is required
Related Work - Bot Prevention (1/2)
• Human intelligence
– Detection: Bots may show regular or peculiar behavior
– Confirmation: Most bots go offline immediately when they detect a GM (Game Master)
• Pros and cons
– Easy
– Requires many GMs and therefore high salary costs
– Judgments rely on the GM's subjective impression
Related Work - Bot Prevention (2/2)
• CAPTCHA tests (Completely Automated Public Turing test to tell Computers and Humans Apart)
– Distorted images are hard for bots to recognize
• Pros and cons
– Accurate
– Annoys innocent players
Related Work - Bot Detection
• Process monitoring at the client side
– Evaded by constantly changing the bot program’s signature
• Traffic analysis at the network
– Evaded by removing the regularity of bot traffic with heavy-tailed random delays
• Aiming-bot detection using a Dynamic Bayesian Network
– Specific to aiming bots, which help players aim at targets accurately
Progress
• Introduction
• Related Work
• FPS Bot Detection
• Rhythm Bot Detection
• Future Work
• Conclusion
Quake 2
• A classic FPS game
– Named “The Best Game Ever” by PC Gamer in 1998
– Sold over one million copies
• Many real-life human traces are available on the Internet
– The game has in-game recording functions
• Open source (including Quake 3)
– Scholarly research
– A lot of bots
A Screen Shot of Quake 2
Data Collection (1/2)
• Human traces can be downloaded from websites
– Competition records
– Shared experience
– Showing off
• Traces downloaded from:
– GotFrag Quake (http://www.gotfrag.com/)
– Planet Quake (http://planetquake.gamespy.com/)
– Demo Squad (http://q2scene.net/ds/)
– Revilla Quake Sites (http://www.revilla.nildram.co.uk/)
Data Collection (2/2)
• Bot traces collected on our own Quake server
– CR BOT 1.14 (http://arton.cunst.net/quake/crbot/)
– Eraser Bot 1.01 (http://impact.frag.com)
– ICE Bot 1.0 (http://planetquake.com/ice)
• For automation, we implemented a bot-launcher program
– Launch Quake 2 and the bot program
– Monitor the time spent
– Create bots
– Select the bot’s AI
– …
Screen Shot of “Q2BotLauncher”
*Bots selection
Game map name
*In-game trace recording file name
Trace Data Description
• Quake 2 official trace format
– DM2
• Types of dm2 file format
– Client side recording (players’ traces)
• Player point of view
• Replayable
– Server side recording (bots’ traces)
• Server point of view
• Non-replayable
DM2 Format Reference
• Source packages from id Software
– The newest release, from the end of November 1998, with both Mission Packs
– Plain game
• Linux version: ftp://ftp.idsoftware.com/pub/quake2/source/q2src320.shar.Z
• Windows version: ftp://ftp.idsoftware.com/pub/quake2/source/q2src320.exe
– Xatrix Quake II Mission Pack 1
• Linux version: ftp://ftp.idsoftware.com/pub/quake2/source/xatrixsrc320.shar.Z
• Windows version: ftp://ftp.idsoftware.com/pub/quake2/source/xatrixsrc320.exe
– Rogue Quake II Mission Pack 2:
• Linux version: ftp://ftp.idsoftware.com/pub/quake2/source/roguesrc320.shar.Z
• Windows version: ftp://ftp.idsoftware.com/pub/quake2/source/roguesrc320.exe
Parsing Example of DM2
Block size MsgID
Frame Move
EOF
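The block structure named in the parsing example can be sketched as follows. This is a minimal illustration, not the authors' parser. It assumes the commonly documented DM2 layout: each block starts with a 4-byte little-endian signed size, followed by that many bytes of message data whose first byte is the message ID, and a size of -1 marks the end of the demo. The function names and the message ID values in the sample are hypothetical.

```python
import io
import struct


def iter_blocks(stream):
    """Yield the payload of each block until the end marker or a truncated file."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # truncated file: stop quietly
        (size,) = struct.unpack("<i", header)
        if size < 0:
            return  # assumed end-of-demo marker (-1)
        payload = stream.read(size)
        if len(payload) < size:
            return  # truncated block
        yield payload


def first_message_ids(stream):
    """Return the first byte (assumed to be the message ID) of every non-empty block."""
    return [block[0] for block in iter_blocks(stream) if block]


# Synthetic two-block demo with arbitrary IDs, followed by the end marker.
demo = io.BytesIO(
    struct.pack("<i", 3) + bytes([0x14, 0x01, 0x02])
    + struct.pack("<i", 1) + bytes([0x0B])
    + struct.pack("<i", -1)
)
print(first_message_ids(demo))
```

A real parser would go on to decode every message inside each block; this sketch only shows the outer framing.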
Example: Frame Move
[Figure: byte layout of a Frame Move message. Labels: server frame; delta frame; area bits (read size, then read n bytes); player tag; player state flags (2 bytes; the player-flags must be parsed to locate the other player data); position; view angle; weapon state; package entity tag; total entities (0x80 = need next byte); entity number; entity data]
Tools We Developed
• DM2 Parser
– DM2 is inconvenient for statistical tools to parse
– Filters out the information we don’t need
– Converts traces to a CSV file format
• 3D trace viewer
– Quake 2 is a 3D game, so we need to observe traces in 3D space to identify trace features
DM2 Parser
DM2 Parser Output
(Time), (X, Y, Z)
3D Viewer
Data Summary
• Each trace is cut into 1,000-second segments
• In total, 143.8 hours of traces were collected
• ICE Bot tends to stay in one place and tries to ambush other players
Our Solution: Trajectory-based Detection
• Based on the avatar’s movement trajectory in the game
• Applicable to all game genres in which players control the avatar’s movement directly
• The avatar’s trajectory is high-dimensional (in both the time and spatial domains)
The Rationale behind Our Scheme
• The trajectory of an avatar controlled by a human player is hard to simulate for two reasons:
– Complex context information: players control the movement of avatars based on their knowledge, experience, intuition, and a great deal of information provided in the game
– Humans are not always logical or optimal
• How to model and simulate realistic movements is still an open question in the AI field
Data Representation
• We discard the z value of each trace point
– Avatars move on the ground, so the z-value does not change unless they fall or climb
[Figure: a sequence of (X, Y) positions along the time axis t]
Trails of CRBot
Buildings, obstacles
Color density = visit frequency
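The "color density = visit frequency" encoding of the trail plots can be approximated by counting samples per grid cell. This is only a sketch: the 64-unit cell size and the function name are assumptions, and the slides do not say how the plots were actually produced.

```python
from collections import Counter


def visit_frequency(points, cell=64):
    """Count how many (x, y) samples fall into each cell-by-cell grid square."""
    return Counter((int(x // cell), int(y // cell)) for x, y in points)


# Two samples share the cell at the origin, one lies in the cell to its right.
counts = visit_frequency([(0, 0), (10, 10), (70, 0)])
print(counts.most_common(1))
```

Plotting the counts as a heat map would give a picture comparable to the trail figures, with heavily visited cells drawn darker.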
Trails of Eraser Bot
Trails of ICE Bot
Trails of Human Players
• Humans tend to explore all areas
– To find items
• Human players normally avoid open areas
– For safety
• Human players spend more time in narrow areas
– For safety
Individual Trajectory
• Humans tend to change direction continuously and slightly
– Searching
– Quick movement
[Figure panels: CRBot, human player, Eraser Bot, ICE Bot]
Detection Scheme
• Feature based
– Given a segment of a trajectory, {x_t, y_t}, 1 ≤ t ≤ T, we extract the following features from this two-dimensional time series
• ON/OFF Activity
• Pace
• Path
• Turn
• SVM classifier
ON/OFF Activity
• Move (ON) or stop (OFF)
• We define ON periods as consecutive periods of movement longer than 1 second, and OFF periods as the remaining time frames
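The ON/OFF definition can be sketched as below. This is an illustrative reading, not the authors' code. It assumes samples are (t, x, y) tuples sorted by time, and that a movement run of 1 second or less is folded into the surrounding OFF time. The function name and the `eps` tolerance are hypothetical.

```python
def on_off_periods(samples, min_on=1.0, eps=1e-9):
    """Return (on_durations, off_durations) for a list of (t, x, y) samples."""
    # Group consecutive sample intervals into runs of moving / not moving.
    runs = []  # each entry is [is_moving, duration]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        moving = abs(x1 - x0) > eps or abs(y1 - y0) > eps
        if runs and runs[-1][0] == moving:
            runs[-1][1] += t1 - t0
        else:
            runs.append([moving, t1 - t0])

    # Movement runs longer than min_on are ON; everything else is merged into OFF.
    on, off = [], []
    pending_off = 0.0
    for moving, duration in runs:
        if moving and duration > min_on:
            if pending_off > 0:
                off.append(pending_off)
                pending_off = 0.0
            on.append(duration)
        else:
            pending_off += duration
    if pending_off > 0:
        off.append(pending_off)
    return on, off
```

The mean and standard deviation of the two returned lists (for example via `statistics.mean` and `statistics.pstdev`) would then serve as the ON/OFF features.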
Pace
• Avatars are allowed to move at different speeds
– Running, slow walking, step-by-step walking, lateral shifting, and moving backwards
– Fast move
• The SD of paces > 10 units
• Teleportation frequency
– Some players like to jump into teleporters for quick, long-distance movement; others do not
– Each bot has its own probability of doing so
– The avatar dies
– Offset in one second ≥ 60 units
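A minimal sketch of the pace features, assuming (t, x, y) samples spaced one second apart so that each offset is a per-second pace. The 60-unit teleport threshold comes from the slide. The function name, the returned keys, and the choice to exclude teleport jumps from the pace statistics are assumptions.

```python
import math
import statistics


def pace_features(samples, teleport_offset=60.0):
    """Summarise per-second offsets and count teleport-sized jumps."""
    offsets = [
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:])
    ]
    # Jumps of at least teleport_offset units in one second count as teleports.
    teleports = sum(1 for d in offsets if d >= teleport_offset)
    normal = [d for d in offsets if d < teleport_offset]
    return {
        "mean_pace": statistics.mean(normal) if normal else 0.0,
        "sd_pace": statistics.pstdev(normal) if normal else 0.0,
        "teleports": teleports,
    }
```

Since the slide lists avatar death alongside teleportation, a respawn would presumably be counted by the same offset test.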
Path: Linger
• Why linger?
– Waiting for item spawn
– Dog-fight with enemies
– Subconscious
• Linger definition
– Points (t1, x1, y1) and (t2, x2, y2) count as a linger when $|t_1 - t_2| > 30$ sec and $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} < 300$ units
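The linger test can be sketched as below. The slide gives the condition for a pair of points but not how pairs are chosen. This sketch assumes each sample is compared with the first sample more than 30 seconds later. That pairing rule and the function name are assumptions.

```python
import math


def linger_count(samples, min_gap=30.0, max_dist=300.0):
    """Count samples whose position just over min_gap seconds later is within max_dist."""
    count = 0
    j = 0
    for i, (t1, x1, y1) in enumerate(samples):
        j = max(j, i + 1)
        # Advance j to the first sample strictly more than min_gap seconds after t1.
        while j < len(samples) and samples[j][0] - t1 <= min_gap:
            j += 1
        if j == len(samples):
            break  # no later sample is far enough away in time
        _, x2, y2 = samples[j]
        if math.hypot(x1 - x2, y1 - y2) < max_dist:
            count += 1
    return count
```

Dividing the count by the segment length would give a linger frequency that is comparable across segments.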
Path: Smoothness
• The avatar moves in straight or zig-zag patterns
– In a dogfight, movement is clearly more zig-zag
– When following cover, movement is more zig-zag
– Subconscious
• Smoothness definition
– The number of times the avatar moves across the line (x1, y1)-(x2, y2) during the period (t1, t2), where (t1, x1, y1) and (t2, x2, y2) are the start and end points
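One way to count crossings of the start-to-end line is to track which side of the line each intermediate point lies on, using the sign of a 2D cross product. This is a sketch under that assumption. The function name is hypothetical, and points lying exactly on the line are ignored.

```python
def line_crossings(points):
    """Count how often the path switches sides of the line from points[0] to points[-1]."""
    (x1, y1), (x2, y2) = points[0], points[-1]
    crossings = 0
    prev_side = 0
    for x, y in points[1:-1]:
        # The sign of the cross product tells which side of the line (x, y) is on.
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        side = (cross > 0) - (cross < 0)
        if side != 0:
            if prev_side != 0 and side != prev_side:
                crossings += 1
            prev_side = side
    return crossings
```

A straight run yields 0 crossings, while a zig-zag path yields one crossing for every side switch.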
Path: Detourness
• The efficiency of the avatar’s movements
• Detourness definition
– For a movement from (t1, x1, y1) to (t2, x2, y2): $\text{detourness} = \dfrac{\text{length of the movement}}{\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}}$
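Read as the ratio of path length to straight-line displacement, detourness can be computed as below. The ratio form is inferred from the slide layout. The function name and the handling of a zero displacement are assumptions.

```python
import math


def detourness(points):
    """Path length divided by the straight-line distance between the end points."""
    path = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    (xa, ya), (xb, yb) = points[0], points[-1]
    direct = math.hypot(xb - xa, yb - ya)
    # A path that returns to its start has no defined ratio; treat it as infinite.
    return math.inf if direct == 0 else path / direct
```

A value of 1 means a perfectly straight movement, and larger values mean more detour.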
Turn
• The frequency and amplitude with which avatars change direction
• Turn definition
– Take three points (t1, x1, y1), (t2, x2, y2), (t3, x3, y3) with t3 - t2 = t2 - t1
– θ is the change of direction at (t2, x2, y2)
– Turns are counted at three thresholds: θ ≥ 30°, θ ≥ 60°, θ ≥ 90°
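The turn feature can be sketched by measuring the angle between the heading before and after each middle point. It assumes evenly spaced samples, as the condition t3 - t2 = t2 - t1 requires. The function names are hypothetical, and a stationary step is treated as no turn.

```python
import math


def turn_angle(p1, p2, p3):
    """Angle in degrees (0 to 180) between headings p1->p2 and p2->p3."""
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]
    bx, by = p3[0] - p2[0], p3[1] - p2[1]
    if (ax == 0 and ay == 0) or (bx == 0 and by == 0):
        return 0.0  # no movement on one side: no defined heading change
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return abs(math.degrees(math.atan2(cross, dot)))


def turn_counts(points, thresholds=(30, 60, 90)):
    """Count turns of at least each threshold over evenly spaced (x, y) points."""
    angles = [turn_angle(a, b, c) for a, b, c in zip(points, points[1:], points[2:])]
    return {th: sum(1 for angle in angles if angle >= th) for th in thresholds}
```

Dividing each count by the number of point triples gives turn frequencies, and the mean of the angles gives the average turn angle.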
Feature Summary (1/2)
• ON/OFF (On)
– Observation: players’ mean and SD are the highest
– Summary: humans tend to move all the time
• ON/OFF (Off)
– Observation: players’ mean is higher than bots’, and their SD is larger than bots’
– Summary: humans tend to wait for a longer time after a long move
• Pace
– Observation: the four player types behave differently
– Summary: great discriminability
Feature Summary (2/2)
• Path (linger)
– Observation: players’ linger frequency is the lowest
– Summary: lingering in one place for a long time is dangerous
• Path (smoothness)
– Observation: players’ smoothness is the lowest
– Summary: players' movements are irregular and unpredictable
• Path (detourness)
– Observation: movements of human players are relatively more efficient
– Summary: human players tend to move efficiently from their current position to another place
• Turn
– Observation: players’ turn frequency at 30° is the highest and is relatively lower at 90°; the average turn angle of human players is the lowest
– Summary: human players tend to adjust their direction continuously and slightly
Evaluation (1/2): Bot Detection
[Figure annotations: accuracy is > 90%; good accuracy; time > 800; result is OK]
Evaluation (2/2): Player-Type Classification
[Figure annotations: accuracy > 90%; accuracy > 98%; good accuracy]
Progress
• Introduction
• Related Work
• FPS Bot Detection
• Rhythm Bot Detection
• Future Work
• Conclusion
Casual Game
• Simple and easy gameplay
– Poker, puzzle…
• Short play time
– The final stage can be reached quickly, e.g., during a work break
• Free to play (item mall)
Market Scale of Casual Games in Taiwan
www.mic.iii.org.tw
Rhythm Game
• One of the most popular genres of casual games
Let’s Watch Some Dances
Why Rhythm Bot?
• Game bots: automated AI programs that can perform certain tasks in place of gamers
• Player performance is judged from timing information, which is provided by the client
– Easy to cheat
[Figure: game client and game server connected through the Internet]
The Challenges
• It’s easy to construct a classifier to detect game bots because
– 1) human behavior contains more variance,
– 2) each player has her own skill level and tendency to make errors
• BUT it’s easy for bots to fight back by learning human behavior
Dancing Online (唯舞獨尊)
• Why DO (Dancing Online)?
– One of the ten most discussed games in the casual game forum
• www.gamer.com.tw
– We know the game designers
• We can ask them about some of the game rules
DO Bot
• Only one bot
– DCO(舞林至尊)
• http://www.play55.net/
– Monthly-fee: NT$300
• The game itself is free!
• Some features
– Autorun
– Random song selection
– ADV
• Anti-Detection Value
Data Collection Challenges
• DO was not designed with any log preservation mechanism
• Adding this feature is not feasible
– Modifying the source code would require re-testing and reduce profit
– Contractual agreements
• Implement a bot-like recording program
– Record players' and/or the bot’s traces from the client side
Recording Program Principles
• Hook process
– Inject our code into the address space of DO
• http://www.codeproject.com/KB/DLL/DLL_Injection_tutorial.aspx/
• Code rewrite
– Detour the original code of DO to our code
• http://research.microsoft.com/en-us/projects/detours/
Recording Program Architecture
[Figure: component diagram with three components (Detours, DO Monitor, Recording Codes) connected by relations labeled Inject and Detours]
Screen Shots of Recording Program
Recording Program
State indicator
Predictions
• The bot can finish all keys in a very short time
– The probability distribution is concentrated
• The bot almost never makes an erroneous keypress
– The error rate is very low (≈ 0)
• The bot's keypresses are not affected by key combinations
– Probability distribution patterns are similar
Keystroke Mapping
The Sorted Traces
Data Collection: Real Players
• Recorded by our recording program
• Training data
– Six real players played songs of different bpm and/or difficulty levels
• Player-type classification
– 10 real players
– 20 songs assigned at each of the 4 game levels
Data Collection: Bot
• Recorded by our recording program
• Autorun enabled
• Random song selection enabled
• For each run, manually set the ADV from 0 to 700ns and select a difficulty level from easy to top
Data Summary
Scheme: Inter-Keypress-Time Based Approach (1/3)
• Key combination (16 combinations)
– (↑, ↓, ←, →) × (↑, ↓, ←, →)
• Inter-keypress-time
[Figure: timelines of keypress times, labeled t1 to t8 and t1 to t4]
Scheme: Inter-Keypress-Time Based Approach (2/3)
• Conditional probability distributions
– CDF (Cumulative Distribution Function)
• f( t | current input: key, previous input: key )
– Example:
• Current arrow key: ←
• Previous arrow key: ↑
• f( t | current input: ←, previous input: ↑ )
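The conditional distributions can be built by grouping inter-keypress times by the (previous key, current key) pair and then evaluating an empirical CDF per group. This is a sketch only. The single-letter key encoding ('U', 'D', 'L', 'R') and the function names are assumptions, not part of the recorded trace format.

```python
from collections import defaultdict


def conditional_intervals(keypresses):
    """Group inter-keypress times by (previous_key, current_key).

    keypresses: list of (time_in_seconds, key) tuples sorted by time.
    """
    groups = defaultdict(list)
    for (t0, k0), (t1, k1) in zip(keypresses, keypresses[1:]):
        groups[(k0, k1)].append(t1 - t0)
    return dict(groups)


def empirical_cdf(values, t):
    """Fraction of observed inter-keypress times that are <= t."""
    return sum(1 for v in values if v <= t) / len(values)
```

With four arrow keys there are at most 16 groups, matching the 16 key combinations of the scheme.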
Scheme: Inter-Keypress-Time Based Approach (3/3)
• Classifier
– Kolmogorov-Smirnov test (KS test)
• $D_{1,2} = \sup_x \left| F_1(x) - F_2(x) \right|$, where 1 and 2 denote two players and F is the CDF
– Multinomial experiment
• Bonferroni method
• α = 0.05/16 ≈ 0.003
– If each trial kept an error rate of 5%, the overall error rate would be $1 - (1 - 0.05)^{16} \approx 0.5599$
• If the p-value of any combination is < α
– The two samples are from two different distributions
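The classifier step can be sketched in plain Python as below. The KS statistic follows the definition on the slide. The p-value uses a standard asymptotic approximation of the Kolmogorov distribution, which is an assumption here because the slides do not say how p-values were computed. All function and constant names are hypothetical.

```python
import math

BONFERRONI_ALPHA = 0.05 / 16             # per-test level for 16 key combinations
FAMILYWISE_ERROR = 1 - (1 - 0.05) ** 16  # error rate if each test kept 5%


def ks_statistic(a, b):
    """Two-sample KS statistic D = sup_x |F1(x) - F2(x)| over empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n1, n2, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here and the value is ~1
    s = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * s))


def differs(groups_a, groups_b, alpha=BONFERRONI_ALPHA):
    """True if any shared key combination has a p-value below alpha."""
    for combo in groups_a.keys() & groups_b.keys():
        a, b = groups_a[combo], groups_b[combo]
        if ks_pvalue(ks_statistic(a, b), len(a), len(b)) < alpha:
            return True
    return False
```

Here `groups_a` and `groups_b` are dictionaries mapping a key combination to its list of inter-keypress times. A statistics library with a two-sample KS test could replace the two helper functions.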
Probability Distribution of 16 Key Combinations
Player-Type Consistency Check
• Players’ behavior differs over time
– Players progress by learning and practicing over the long term
– Could this cause our scheme to misjudge?
Consistency Check: Result
• All p-values > α (0.003): the two samples are from the same population
Simulated Bots
• Real bot: DCO
– Too easy to classify
– The distribution is concentrated and approximately 0
• Simulated bot 1
– Fix the inter-keypress-time and add uniformly distributed noise to simulate real players
– Example: 0.2 ± 0.05 sec
• Simulated bot 2
– The bot learns players' average inter-keypress-time of each 4 combinations
– Calculate their mean and standard deviation
– Use a normal distribution to simulate the probability distribution
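The two simulated bots can be sketched as simple generators of inter-keypress times. Bot 1 uses the 0.2 ± 0.05 sec example from the slide. Bot 2 fits a normal distribution to observed human intervals. The function names, the `rng` parameter, and the clamping of negative draws to zero are assumptions.

```python
import random
import statistics


def simulated_bot_1(n, base=0.2, jitter=0.05, rng=None):
    """Fixed inter-keypress time plus uniform noise, e.g. 0.2 +/- 0.05 sec."""
    rng = rng or random.Random()
    return [rng.uniform(base - jitter, base + jitter) for _ in range(n)]


def simulated_bot_2(human_intervals, n, rng=None):
    """Draw intervals from a normal distribution fitted to human intervals."""
    rng = rng or random.Random()
    mu = statistics.mean(human_intervals)
    sigma = statistics.stdev(human_intervals)
    # An interval cannot be negative, so clamp rare negative draws to zero.
    return [max(0.0, rng.gauss(mu, sigma)) for _ in range(n)]
```

Feeding these generated intervals into the KS-based check, one key combination at a time, would reproduce the kind of comparison reported for the simulated bots.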
Simulated Bot 1: Result
• All p-values < α (0.003): the two samples are from different populations
Simulated Bot 2: Result
• We can still identify the bot
Future Work
• The scheme can be defeated if the bot designer:
– Finds out the CDF
– Is familiar with probability and statistics
• Idea: pressure
– Why do players make mistakes?
• Probability: easy to measure and simulate
• Pressure: abstract and difficult to quantify
• Every player behaves similarly under the same pressure
– There is no existing pressure formula
• The bot designer cannot fight back without our formula
Scheme: Pressure-Error Based Approach
• We try to use the conditional probability function to learn the model
– Interval time of indicators
– Key combinations
– Inter-keypress-time
Quantify the Pressure
Users’ Error Rate under Different Pressures
Conclusion (1/2)
• We propose a trajectory-based approach for game bot detection
– It is a general technique
• Applicable wherever the avatar's movement is controlled by the player directly
– Achieves a detection accuracy of 95% or higher
• When the trace length is 200 seconds or longer
– Can distinguish between real players and bots
• It is difficult to simulate human players' movement
Conclusion (2/2)
• We propose an inter-keypress-time based approach for rhythm game bot detection
– It is a general technique
– The first paper on rhythm game bot detection
Thank You for Your Attention!