Accelerometer-based CAPTCHA

A Game and Accelerometer-based CAPTCHA Scheme
for Mobile Learning System
Ching-Jung Liao, *Chang-Ju Yang, **Jin-Tan Yang, Hsiang-Yang Hsu, and Jhih-Wei Liu
Department of Information Management
Chung Yuan Christian University, Taiwan
{cjliao, g10094629, g10194028}@cycu.edu.tw
*Department of Computer Science and Information Engineering
Hungkuang University, Taiwan
cjyang@sunrise.hk.edu.tw
**General Education Center
Ming Chuan University, Taiwan
yangdav@mcu.edu.tw
Abstract: Most CAPTCHAs are designed to defend against bots that log on to launch massive spam
and registration attacks. However, both text-based and image-based challenges are poorly suited
to mobile devices. Text-based CAPTCHAs with heavy noise and distortion are relatively hard to
recognize on tiny screens, while image-based CAPTCHAs require many gestures to enlarge and
swipe the images, which annoys users. In this study, we propose a novel accelerometer-based
CAPTCHA scheme for mobile learning systems. The scheme (1) uses a simple accelerometer as
input for horizontal and vertical movement instead of complex keyboard input, (2) is culture
and language independent, (3) enhances security without annoying users, and (4) is based on
more advanced human cognitive capabilities.
Introduction
The Internet provides many free services, including cloud storage, email, online voting, chat
rooms, weblogs, and online games. User authorization is an increasingly important issue because
automated scripts and bots are created to gain free accounts, send spam, cheat in games [1],
and vote remotely [2]. Human Interactive Proofs (HIPs), or Completely Automated Public Turing tests to tell
Computers and Humans Apart (CAPTCHAs) [3] [4], are widely used against these kinds of attacks. For example,
Yahoo improves its marketplace by blocking bots from posting phishing attacks; Gmail improves its service by
blocking automated account creators and spammers; and Facebook limits the creation of fraudulent
profiles used to spam honest users or cheat at games.
There are three main types of image recognition CAPTCHAs: naming, distinguishing, and
anomaly-identifying image CAPTCHAs [5]. The naming CAPTCHA presents six images to the user, who
passes the challenge by correctly typing the common term associated with the images. It therefore
requires the user to understand the exact semantic meaning of the images. The anomaly CAPTCHA
presents six images, five of which share the same subject while the sixth shows a different one;
the user passes by identifying the anomalous image. The distinguishing CAPTCHA presents two sets
of images that, with equal probability, either share the same subject or do not; the user passes
by correctly determining whether the sets have the same subject. All three types use small, fixed
sets of images (challenges) and responses, so they can be broken easily by recreating the entire
database [5][6].
Most CAPTCHAs are designed for use on a personal computer, which provides rich input devices and a
relatively large screen. In this paper, we propose a new scheme, named accCAPTCHA, that is based on the
accelerometers built into modern smart mobile devices and focuses on more advanced human cognitive
processing abilities. accCAPTCHA uses rolling ball games and racing games, which gives it an extremely high
resistance to automated malware attacks, because it is considered nearly impossible for a computer to reach
the required level of artificial intelligence, regardless of how advanced the technology might be.
Furthermore, the scheme is language independent and requires no background knowledge.
The remainder of this paper is organized as follows. Related work on CAPTCHAs is discussed in the
next section, and the concept of the proposed CAPTCHA is described in Section 3. Section 4 presents the
usability of the CAPTCHA, and Section 5 presents the conclusions.
Related Works
CAPTCHA is based on the Turing test concept developed by Turing [8]. The most common and
successful CAPTCHAs are visually distorted images of a string of letters and numbers that can ideally be
identified by humans but not by computers [4]. Google and Microsoft both continue to develop their
own CAPTCHA schemes. Current CAPTCHA implementations are primarily image-based and therefore
inaccessible to users who are unable to view the screen. Audio CAPTCHAs, on the other hand, ask users to
interpret spoken audio, which poses other challenges (e.g. language dependency and annoying noise). To defeat
automated speech recognition, these audio HIPs use a significant amount of background noise and varied
speakers, making interpretation difficult. Furthermore, many sites do not provide audio CAPTCHAs [9].
The Asirra CAPTCHA asks users to identify a subject, for example cats, among a set of photographs.
Asirra works well because telling cats from dogs is easy for humans but hard for computers.
However, it has some disadvantages. First, it depends on an image database, which might be
compromised by brute force. Second, it requires a large screen area to show 12 photos.
Text-based CAPTCHAs, alternatively, require the user to transcribe an image or a sound of words. They
nevertheless appear easy to break with a five-stage pipeline consisting of preprocessing, segmentation,
post-segmentation, recognition, and post-processing [7]. Although the security of existing CAPTCHAs can
be enhanced by systematically adding noise, distortion, and obfuscation, these techniques also make
the CAPTCHA harder for humans to recognize, especially on small devices. In addition, both text and speech
recognition are language dependent and not applicable to all users. For example, users who do not read
English may be unable to solve an English-language challenge.
Character-Based CAPTCHA presents a string of characters to the user. This string can be
a combination of words or of random alphanumeric characters and punctuation.
Image-Based CAPTCHA shows the user images of identifiable real-world objects, which may be
presented in the form of shapes [10]. For example, an image of a cat is shown and the user is
asked to identify it as a cat. Another example presents both squares and circles and asks the user to click
on the circle.
Audio-Based CAPTCHA asks the user (1) to recognize a spoken sentence, or (2) to match an audio clip
with an image.
Based on the type of challenge presented, CAPTCHAs can also be categorized into two groups: anomaly-based and
recognition-based. Anomaly-based CAPTCHAs ask users to determine which object, character, or shape does
not belong in a set of images displayed on the screen. Recognition-based CAPTCHAs ask users to identify
what is being presented to them. Any of these five techniques can be used in conjunction with the others. For
example, ReCAPTCHA, one of the most successful deployments, is character-based,
recognition-based, and audio-based [11].
With the rapid growth of computing power and advances in algorithms, most text-based
CAPTCHAs have been reported as breached [7]. The authors of [7] proposed and implemented a five-stage
pipeline, called Decaptcha, consisting of preprocessing, segmentation, post-segmentation, recognition, and
post-processing, to break text-based CAPTCHAs. They tested Decaptcha against real CAPTCHAs from Authorize,
Baidu, Blizzard, Captcha.net, CNN, Digg, eBay, Google, Megaupload, NIH, Recaptcha, Reddit, Skyrock,
Slashdot, and Wikipedia. On these 15 CAPTCHAs, they had a 1-10% success rate on two (Baidu, Skyrock),
10-24% on two (CNN, Digg), 25-49% on four (eBay, Reddit, Slashdot, Wikipedia), and 50% or greater on five
(Authorize, Blizzard, Captcha.net, Megaupload, NIH). This automated tool thus breaks 13 out of 15 of
the most widely used CAPTCHA schemes.
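As a quick consistency check on the figures quoted above, the success-rate buckets can be tallied directly. The sketch below uses only the scheme names and buckets listed in the text; the variable names are illustrative.

```python
# Schemes tested with Decaptcha, as listed in the text.
all_schemes = [
    "Authorize", "Baidu", "Blizzard", "Captcha.net", "CNN", "Digg", "eBay",
    "Google", "Megaupload", "NIH", "Recaptcha", "Reddit", "Skyrock",
    "Slashdot", "Wikipedia",
]

# Success-rate buckets as reported in the text.
buckets = {
    "1-10%": ["Baidu", "Skyrock"],
    "10-24%": ["CNN", "Digg"],
    "25-49%": ["eBay", "Reddit", "Slashdot", "Wikipedia"],
    "50% or greater": ["Authorize", "Blizzard", "Captcha.net", "Megaupload", "NIH"],
}

# Every scheme that appears in some bucket counts as broken.
broken = sorted(name for names in buckets.values() for name in names)
# Schemes that appear in no bucket.
not_listed = sorted(set(all_schemes) - set(broken))

print(len(all_schemes), len(broken), not_listed)
```

The four buckets account for 2 + 2 + 4 + 5 = 13 schemes, which matches the "13 out of 15" figure; the two schemes that appear in no bucket are Google and Recaptcha.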
Many experiments [12] and attacks [9] have shown that most CAPTCHA schemes are broken once their
characters can be reliably segmented. In [12], a set of experiments applied a sequence of character
transformations, such as translation, rotation, scaling, warp (local and global), and clutter (thin and
thick foreground and background arcs), to compare the recognition rates of humans and computers.
The results on HIP characters indicate that computers (a) do as well as humans on the easy problems, (b) are
marginally better in low and medium difficulty scenarios, and (c) beat humans at high distortion and clutter
settings [12]. Thus, although the security level of a text-based CAPTCHA can be raised by systematically
adding distortion and noise, doing so also makes the characters harder for humans to recognize.
Images are rich in information, intuitive to human beings, and highly varied. Early image
recognition-based CAPTCHAs, including Bongo and PIX [13], which use shapes and labeled images, may
suffer from guessing and database reconstruction attacks.
Another type of CAPTCHA is based on image semantics [14], that is, on the internal meaning contained
in images, videos, stories, and so on, such as context, humor, and foreshadowing. Its authors focus on
the ability to understand humor, which is considered one of the most advanced human cognitive processing
abilities. In the four-panel cartoon CAPTCHA, a four-panel cartoon is presented with its panels
rearranged randomly, and a user who responds with the correct order is identified as a human. Even when
the panels are shuffled, a human can understand the pictures and utterances in each panel and deduce the
order in which the panels must be arranged to form a story, whereas it is almost impossible for a computer
to understand humor created by human beings. However, there is not yet a deep understanding of how to
make proper use of image semantics, for example when the semantics are ambiguous. Moreover, a
four-panel cartoon CAPTCHA may suffer from brute-force attacks, since there are only 4! (= 24)
possible orderings.
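To make the brute-force concern concrete, the following minimal sketch computes how likely blind guessing is to succeed. It assumes that each attempt faces a fresh challenge and that the attacker picks one ordering uniformly at random; the function name and parameters are illustrative and do not come from [14].

```python
from math import factorial


def guess_success_probability(panels: int, attempts: int) -> float:
    """Probability that at least one of `attempts` independent, uniformly
    random guesses matches the single correct ordering of `panels` panels."""
    orderings = factorial(panels)  # 4 panels -> 4! = 24 orderings
    return 1.0 - (1.0 - 1.0 / orderings) ** attempts


# A single blind guess on a four-panel cartoon succeeds with probability 1/24.
single = guess_success_probability(4, 1)
# Repeated attempts raise the attacker's overall chance of getting through once.
ten = guess_success_probability(4, 10)
print(factorial(4), single, ten)
```

Under these assumptions a bot that simply retries gets through after a modest number of attempts, which is why a small answer space is a weakness regardless of how hard the semantic task is.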
Accelerometer-based CAPTCHA
Concept
We propose a new CAPTCHA scheme, accCAPTCHA, based on game logic and human recognition. It is easy
to operate using the arrow keys or mouse, is language independent, and can be
implemented on any mobile device with an accelerometer.
One problem with using semantics is that the semantic information retrieved from an image tends to
be subjective and user-dependent. For example, in an image recognition-based CAPTCHA that asks the user to
choose the most beautiful flower, the answer may differ from person to person. This intrinsic ambiguity
makes it difficult to generate CAPTCHA challenges from image semantics. Differences in upbringing and
background knowledge may also lead to different results. Hence, we propose a new type of CAPTCHA
based on unambiguous high-level semantics. We focus on the ability to solve enigmas, which is considered
one of the most advanced human cognitive processing abilities. With simple enigma-based Flash/HTML games,
humans can be identified easily, since it is almost impossible for computers to understand the meaning of an
enigma. In addition, simple enigma-based games are considered solvable by users of all ages without extra
background knowledge, and they are language independent.
As a specific example, we propose an accCAPTCHA that uses a simple rolling ball game. A user who
is able to move the ball to the destination hole is identified as a human. For a computer, however, it would be
difficult to grasp the meaning of the enigma and send the corresponding arrow key or mouse movements. The
enigma relies on the native spatial sense that humans are born with. Moreover, even if image processing
developed to the level where a computer could recognize the meaning of the images and send the
correct response, the game can randomly change the types of paths, vary their elevation, and add traps as
fake destinations, so it would remain almost impossible for computers to understand the real meaning of
the rolling ball game.
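The pass condition of the rolling ball game can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the implementation used in this work: the flat board, the friction factor, the time step, the capture radius, and all names (`BallState`, `step`, `passes_challenge`) are hypothetical, and path elevation is omitted for brevity.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class BallState:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def step(state, ax, ay, dt=0.02, friction=0.98):
    """Advance the ball by one time step given tilt acceleration (ax, ay).

    Assumed model: velocity integrates acceleration and is damped by a
    constant friction factor; position integrates velocity.
    """
    vx = (state.vx + ax * dt) * friction
    vy = (state.vy + ay * dt) * friction
    return BallState(state.x + vx * dt, state.y + vy * dt, vx, vy)


def passes_challenge(samples, start, hole, traps, radius=0.05):
    """Return True if the ball reaches the destination hole before any trap.

    `samples` is a sequence of (ax, ay) accelerometer readings, `start` and
    `hole` are (x, y) points, and `traps` is a list of fake destinations.
    """
    state = BallState(*start)
    for ax, ay in samples:
        state = step(state, ax, ay)
        if any(hypot(state.x - tx, state.y - ty) <= radius for tx, ty in traps):
            return False  # fell into a fake destination
        if hypot(state.x - hole[0], state.y - hole[1]) <= radius:
            return True  # reached the real destination
    return False  # ran out of input before reaching the hole


# Tilting steadily to the right rolls the ball toward a hole on the x axis.
tilt_right = [(1.0, 0.0)] * 200
print(passes_challenge(tilt_right, start=(0.0, 0.0), hole=(0.5, 0.0), traps=[(0.0, 0.5)]))
```

Placing a trap on the straight line between the start and the hole makes the same steady tilt fail, which is the role the fake destinations play in the description above.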
Framework
Requirements:
a. Motion operations: If the mobile device has a built-in accelerometer, motion operations are
obtained from the accelerometer signals. Otherwise, motion operations are defined as finger touches
on the screen.
b. Easy games: The games should be language independent and playable by users of all levels.
accCAPTCHA: The authorization decision depends on the results of the games.
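The requirements above can be sketched as two small functions: one that selects the input source and one that turns game results into an authorization decision. The names `MotionSource`, `select_motion_source`, and `authorize`, and the `required_passes` threshold, are assumptions made for illustration; the text states only that the decision depends on the game results.

```python
from enum import Enum
from typing import Iterable


class MotionSource(Enum):
    ACCELEROMETER = "accelerometer"
    TOUCH = "touch"


def select_motion_source(has_accelerometer: bool) -> MotionSource:
    """Requirement (a): use accelerometer signals when available,
    otherwise fall back to finger touches on the screen."""
    return MotionSource.ACCELEROMETER if has_accelerometer else MotionSource.TOUCH


def authorize(game_results: Iterable[bool], required_passes: int = 1) -> bool:
    """Grant access when at least `required_passes` games were passed.

    The threshold is a hypothetical policy knob, not part of the scheme
    as described.
    """
    return sum(bool(result) for result in game_results) >= required_passes


print(select_motion_source(False).value, authorize([False, True]))
```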
Analysis of the Proposed Scheme
In this section, we examine the security requirements to show that the proposed scheme fulfills
them, namely resistance against guessing, replay attacks, and database reconstruction. Beyond
these, other aspects remain to be considered, such as commercial implications.
Security Analysis
1. Resistance against Brute Force Attack
A traditional user password is usually fixed for a period of time and is therefore vulnerable to
brute-force attacks launched by automated form-filling software. In the proposed scheme, accCAPTCHA
follows the concept of a one-time pad. Each presentation of accCAPTCHA is distinct, so if the user
fails to pass, a different accCAPTCHA is presented in the next session.
2. Resistance against Replay and Relay Attack
Most CAPTCHAs are proposed and deployed under strong assumptions, such as that the
communication channel is secure and protected from modification. The main reason is that
they may otherwise suffer from replay and relay attacks. If an attacker uses automated
programs that record the corresponding answers, a previously seen CAPTCHA can be breached
by replaying the procedure. On the other hand, if a CAPTCHA scheme sends the image
and the answer together in plaintext, an automated program can replace the pair to mount a relay attack.
In our scheme, the correct answers to a challenge game are not unique, and there is no
one-to-one mapping between challenge and answer. Thus the related parameters can be transmitted in
plaintext, and it may be impossible for an automated program to launch replay and relay attacks.
3. Resistance against Machine Learning
Using machine learning to classify or recognize objects was an effective attack on Asirra, but it
will not work on acc-CAPTCHA, since the objects used in the current challenge are uncorrelated with
those used in other challenges. We add slopes to the sample game because, even if a computer
can recognize the sample as a simple maze, it is almost impossible for it to know how to gain
speed or slow down, which requires physical and spatial cognition. This ability is innate
and is exercised constantly in daily life. In addition, security can be enhanced by adding
new obstacles, such as boxes in the rolling ball game.
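The one-time property described in item 1 could be realized with single-use challenge tokens. The sketch below is a hypothetical server-side design, not the implementation used in this work: the class `ChallengeStore`, the token format, and the layout fields (`hole`, `traps`) are assumptions. Each issued challenge has a fresh random layout and can be redeemed for exactly one attempt, so a recorded attempt cannot be replayed.

```python
import random
import secrets


class ChallengeStore:
    """In-memory store of single-use challenges (hypothetical design)."""

    def __init__(self):
        self._pending = {}
        self._rng = random.SystemRandom()

    def issue(self):
        """Create a fresh challenge with a random layout and a unique token."""
        token = secrets.token_hex(16)
        layout = {
            "hole": (self._rng.random(), self._rng.random()),
            "traps": [(self._rng.random(), self._rng.random()) for _ in range(3)],
        }
        self._pending[token] = layout
        return token, layout

    def redeem(self, token, passed):
        """Consume the token. A token is valid for one attempt only, so a
        second use of the same token is rejected even if `passed` is True."""
        layout = self._pending.pop(token, None)
        return layout is not None and bool(passed)


store = ChallengeStore()
token, layout = store.issue()
first = store.redeem(token, passed=True)   # accepted once
second = store.redeem(token, passed=True)  # rejected: token already used
print(first, second)
```

Whether the user passed (`passed`) would be decided by the game logic itself; the store only guarantees that every attempt is tied to a distinct, unrepeatable challenge.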
Figure 1: The games used for experiments.
Experimental Results
We conducted basic experiments to verify and evaluate the feasibility of the proposed scheme. With
the six games shown in Figure 1, we designed a preliminary experiment to test whether humans could easily
pass these games. We also carried out a survey to find out whether users are willing to use the
acc-CAPTCHA system. The experiment had 50 participants aged 15 to 45, who played the
selected acc-CAPTCHAs on a smartphone. The experimental results are shown in Table 1 and Table 2,
respectively. The racing game had the highest pass rate, 96%, possibly because it is simpler than
the alternatives such as the stack game. The baseball game had the lowest pass rate, 63%, which may have
been caused by the game not running smoothly. On the other hand, the average response time is not always
positively correlated with the degree of difficulty, since the average response time of the shooting game
is shorter than that of the rolling ball game.
Table 1. Experimental results.
Games                        Stack game  Rolling ball game  Racing game  Baseball game  Gun game  Shooting game
Pass rate                    0.67        0.78               0.96         0.63           0.92      0.90
Avg. challenge time (sec.)   47.3        25.2               55           22             19.7      15.6
Preference (persons)         2           5                  12           10             12        9
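The values in Table 1 can be checked with a short script. The data below are copied from Table 1; using the pass rate as a proxy for difficulty and Pearson's coefficient as the measure of correlation are assumptions made for this sketch, not analyses reported in this work.

```python
from statistics import mean

# (pass rate, average challenge time in seconds, preference count) from Table 1.
results = {
    "Stack": (0.67, 47.3, 2),
    "Rolling ball": (0.78, 25.2, 5),
    "Racing": (0.96, 55.0, 12),
    "Baseball": (0.63, 22.0, 10),
    "Gun": (0.92, 19.7, 12),
    "Shooting": (0.90, 15.6, 9),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


pass_rates = [row[0] for row in results.values()]
times = [row[1] for row in results.values()]

total_preference = sum(row[2] for row in results.values())
highest = max(results, key=lambda game: results[game][0])
lowest = min(results, key=lambda game: results[game][0])
correlation = pearson(pass_rates, times)

print(total_preference, highest, lowest, correlation)
```

The preference counts sum to the 50 participants, the racing and baseball games come out as the highest and lowest pass rates, and the correlation between pass rate and average challenge time is close to zero, which is in line with the observation that response time does not simply track difficulty.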
Table 2. Comparison of traditional CAPTCHAs with acc-CAPTCHA
CAPTCHA
Input device depend
Keyboard
Arrow Key
Accelerometer
Security concern
Mis-identify
Mis-spell
Semantic Confuse
Brute Force/Guess
Relay/Replay
Language/Culture
Dependent
Text-based
Image-based
Audio-based
Accelerometer-based
Depend or not: O/X; Accept( not depend) or not: Y/N
O
O
X
N
N
Y
N
N
Y
Here X means fail, O means pass, and V partially-pass.
O
X
O
O
X
X
X
O
O
V
O
O
X
X
X
O
X
X
X
O
O
N
N
X
X
X
O
Contributions
Traditional CAPTCHAs, especially text-based ones, may ask the user to enter alphanumeric characters and
punctuation, which is hard for users who do not read English and causes a high false negative
rate. Since acc-CAPTCHA uses simple enigma games to identify humans, it is
designed to be operated with simple arrow keys and a mouse. In addition, acc-CAPTCHA is language
independent, which makes it suitable for all users.
Traditional CAPTCHAs annoy users, which may hinder the market expansion
of a commercial website. Besides, with reduced protection, a text-based CAPTCHA yields a higher
rate of correct answers. In the proposed scheme, although authentication may require more
time than with a conventional text-based CAPTCHA, the level of usability experienced
by the user is not expected to decrease significantly.
The proposed scheme also has the potential to increase commercial value. Since we
demonstrate the rolling ball game on a flat surface, the decoration of its tiles can serve other purposes. For
example, advertising posters can decorate the tiles with all kinds of content, including portrait puzzles,
car posters, and even commercial billboards. In addition, promotional flags and big-head dolls can
be used as targets in high-score challenges.
Conclusions
Traditional CAPTCHAs burden users with proving that they are human at every web
access. We believe a CAPTCHA should instead offer users a more enjoyable experience. In this
paper, we introduced a novel acc-CAPTCHA scheme and analyzed its usability. The contributions
and comparisons are summarized in Table 2. At present, there is still room for improvement in both
security and usability, so we plan to improve the proposed method through further experiments. Furthermore,
we need to evaluate how the correct response rate and the total response time for acc-CAPTCHA
depend on the intelligence of each user, and to implement acc-CAPTCHA with other enigmas that yield
better results.
References
[1] C. Shannon, “Programming a computer for playing chess,” Philosophical Magazine, vol. 41, no. 314, pp. 256-275, 1950.
[2] J. Hernandez and J. Sierra, “Compulsive voting,” Proceedings of the 36th Annual 2002 International Carnahan Conference on Security Technology, pp. 124-133, 2002.
[3] M. Blum, L. von Ahn, L. John, and N. Hopper, “The CAPTCHA Project,” http://www.captcha.net/, 2000.
[4] L. von Ahn, M. Blum, and N. Hopper, “CAPTCHA: Using hard AI problems for security,” in Eurocrypt, 2003.
[5] M. Chew and J. Tygar, “Image recognition captchas,” Information Security, 2004.
[6] P. Golle, “Machine learning attacks against the Asirra CAPTCHA,” Proceedings of the 15th ACM CCS, pp. 535-542, 2008.
[7] E. Bursztein, M. Martin, and J. Mitchell, “Text-based CAPTCHA strengths and weaknesses,” Proceedings of the 18th ACM Conference on Computer and Communications Security, 2011.
[8] A. Turing, “Computing machinery and intelligence,” Mind 59, pp. 433-460, 1950.
[9] A. Cavender, “Evaluating existing audio captchas and an interface optimized for non-visual use,” Proceedings of the 27th ACM International Conference on Human Factors in Computing Systems, 2009.
[10] L. Ahn, M. Blum, and J. Langford, “Telling humans and computers apart automatically,” Communications of the ACM, vol. 47, no. 2, pp. 56-60, 2004.
[11] A. Von L, Ed., “ReCAPTCHA: stop spam read books,” http://recatpcha.net, 2007.
[12] K. Chellapilla, K. Larson, and P. Simard, “Computers beat humans at single character recognition in reading based human interaction proofs (HIPs),” Proceedings of 2nd Conference on Email and Anti-Spam, 2005.
[13] L. Von Ahn, “Human computation,” PhD Thesis, Carnegie Mellon University, 2005.
[14] T. Yamamoto, T. Suzuki, and M. Nishigaki, “A Proposal of Four-Panel Cartoon CAPTCHA,” Proceedings of IEEE International Conference on Advanced Information Networking and Applications, pp. 159–166, 2011.