DESIGN OF THE DATA INPUT STRUCTURE FOR A MOUSE MOVEMENT BIOMETRIC SYSTEM TO AUTHENTICATE THE IDENTITY OF ONLINE TEST TAKERS
HANDLING ARTIFICIAL ACCELERATION
MOUSE MOVEMENT BIOMETRICS
Fall, 2014: Frank Buckley, Vito Barnes, Thomas Corum, Stephen Gelardi, Keith Rainsford
Spring 2015: Shawn C. Gross

Introduction
Objective
• Design a biometric system to verify test takers based on mouse/keystroke input
• Map mouse movement trajectories for structured and unstructured quizzes
• Apply new insight to previous work
• Does Fitts’ law apply to mouse movement trajectories?
Results
• Results show that applying Fitts’ law to trajectories of either quiz type is inconclusive.

Fitts’ Law
• Derived from a 1954 study by Paul Fitts at Ohio State University.
• Fitts’ Law is a model of human behavior derived from Shannon’s communications theory. It models the human nervous system as a communication channel, in which information is transmitted by carrying out a movement task.
• The formula for Fitts’ Law: MT = a + b * ID
  • MT = Movement Time
  • ID = Index of Difficulty
  • a = y-intercept of the regression line
  • b = slope coefficient of the regression line
• ID can be expressed several ways, where D = Distance to Target and W = Width of Target:
  • Fitts’ original formulation: ID = log2(2D/W)
  • Welford’s formulation: ID = log2(D/W + 1/2)
  • Shannon’s formulation: ID = log2(D/W + 1)
• Shannon’s formulation was chosen for analysis because it always produces a positive ID.

Procedure/Methodology
• Raw mouse movement data was parsed and sorted into usable chunks
• Operational definitions were established for each data field and formula
• Calculations for Movement Time (MT), Length of Trajectory, Index of Difficulty (ID), Velocity, Acceleration, Slope, Direction Angle and Change in Slope were completed in Excel
• A baseline was established by comparing the data to the Fitts’ Law Test webpage data output.
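The three Index of Difficulty formulations and the linear form MT = a + b * ID given above can be compared with a minimal Python sketch. The function names and the sample values are hypothetical and chosen only for illustration; they are not the project's data. The project performed these steps in Excel and Minitab, so the plain least-squares fit below is a stand-in for that workflow.

```python
import math

def id_fitts(d, w):
    """Fitts' original formulation: ID = log2(2D/W)."""
    return math.log2(2 * d / w)

def id_welford(d, w):
    """Welford's formulation: ID = log2(D/W + 1/2)."""
    return math.log2(d / w + 0.5)

def id_shannon(d, w):
    """Shannon's formulation: ID = log2(D/W + 1)."""
    return math.log2(d / w + 1)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Made-up (distance, width, movement time in ms) samples, illustration only.
samples = [(100, 50, 420), (200, 50, 610), (400, 50, 780), (400, 25, 930)]
ids = [id_shannon(d, w) for d, w, _ in samples]
mts = [mt for _, _, mt in samples]
a, b = fit_line(ids, mts)
print(f"MT = {a:.1f} + {b:.1f} * ID")

# A short movement to a wide target (D < W/2): the first two
# formulations go negative, Shannon's stays positive.
print(id_fitts(10, 40), id_welford(10, 40), id_shannon(10, 40))
```

For a short movement to a wide target, the first two formulations return a negative ID while Shannon's stays positive, which matches the reason given for choosing it.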
• MT and ID were used in linear regression analysis in Minitab
• Shannon’s formulation was used to determine ID: ID = log2(D/W + 1), with MT = a + b * ID
Key assumptions
• Users used the same equipment (mouse and PC) for all quizzes

Procedure / Methodology – Fitts’ Test
• Fitts’ Law Test on the Berkeley website: click on the green line. This is repeated about 45 times, with targets changing in both size and distance.
• The figure on the left has a lower ID, as its target is both wider and closer than the one on the right.
• Results from the Fitts’ Law Tests were copied into Excel. Regression analysis was then performed in Minitab.

Procedure/Methodology
Fitts’ Law with Shannon’s formulation for ID is MT = a + b log2(D/W + 1), where:
• MT = Movement Time of task (duration in milliseconds)
• a = y-intercept (determined through linear regression)
• b = slope (determined through linear regression)
• D = Distance (Length of Trajectory), calculated as [formula not recoverable from the source]
• W = targetwidth, provided in the mouse movement data

Key Findings – Regression Analysis – Fitts’ Law Test
[Figure: Minitab Summary Report, Regression for MT-B vs ID-B (Y: MT-B, X: ID-B), with fitted line plot]
• Fitted linear model: Y = 500.5 + 286.6 X
• P = 0.000: the relationship between MT-B and ID-B is statistically significant (p < 0.05).
• R-sq (adj) = 45.12%: 45.12% of the variation in MT-B can be accounted for by the regression model.
• r = 0.68: the positive correlation indicates that when ID-B increases, MT-B also tends to increase.
• If the model fits the data well, the fitted equation can be used to predict MT-B for a value of ID-B, or to find the settings for ID-B that correspond to a desired value or range of values for MT-B. A statistically significant relationship does not imply that X causes Y.
• Fitts’ Law Test regression in Minitab provides the statistical relationship between MT & ID (p-value), the fitted linear model equation and line plot, R-sq (adj), and the correlation between MT & ID.

Key Findings – Regression Analysis – Quiz Mouse Movement
Unstructured
[Figure: Minitab Summary Report, Regression for MT-0A vs ID-0A (Y: MT-0A, X: ID-0A), with fitted line plot]
• Fitted linear model: Y = 56.98 - 4.17 X
• P = 0.955: the relationship between MT-0A and ID-0A is not statistically significant (p > 0.05).
• R-sq (adj) = 0.00%: 0.00% of the variation in MT-0A can be accounted for by the regression model.
• r = 0.00: the correlation between MT-0A and ID-0A is not statistically significant (p > 0.05).
Structured
[Figure: Minitab Summary Report, Regression for MT-0Bc vs ID-0Bc (Y: MT-0Bc, X: ID-0Bc), with fitted line plot]
• Fitted linear model: Y = 29.77 + 460.4 X
• P = 0.000: the relationship between MT-0Bc and ID-0Bc is statistically significant (p < 0.05).
• R-sq (adj) = 2.06%: 2.06% of the variation in MT-0Bc can be accounted for by the regression model.
• r = 0.15: the positive correlation indicates that when ID-0Bc increases, MT-0Bc also tends to increase.
• As with the Fitts’ Law Test, a statistically significant relationship does not imply that X causes Y, and the fitted equations are useful for prediction only if the model fits the data well.
• Regression in Minitab provides the statistical relationship between MT & ID (p-value), the fitted linear model equation and line plot, R-sq (adj), and the correlation between MT & ID.

Key Findings – Quiz type comparisons
• Comparison of Fitts’ Test results to both structured and unstructured quiz types
• p-value <= 0.05 denotes a statistically significant relationship between MT & ID
• R-sq (adj) denotes how much of the variation can be accounted for by the linear model
• Correlation coefficient indicates correlation strength and direction

Conclusion
• Trajectories from the Fitts’ Law test website did show a strong statistical relationship and a reasonably well-fitting linear regression line.
• The quiz trajectories studied do not follow Fitts’ law in most cases; however, more analysis is needed.
• Fitts’ law is based on a one-dimensional task, whereas mouse movement is a two-dimensional task with two-dimensional targets.
• It seems that a student taking a quiz is more likely to make errant or erratic mouse movements than those found in a Fitts’ Test.
• This makes sense, as a student taking a quiz will be checking study materials before answering or even changing answers midstream.
• More “thinking” on the part of the student will cause movement to deviate from going directly to the answer, whereas the Fitts’ Law Test requires not so much thinking as reaction to go directly to the target.
• Why? The data collected is the pointer motion, not the mouse motion.

Project Description
Objective
• Investigate the problem of artificial acceleration in relation to the mouse movement biometric system
• Analyze how the Windows and Mac OS X operating systems implement artificial acceleration
• Reverse-engineer the artificial acceleration and implement a method to compensate mouse pointer velocity
Results
• Simulator test results show how artificial acceleration can be accounted for while recording mouse movements

Artificial Acceleration
• Created by Microsoft for the Windows XP operating system to compensate for sluggish mouse pointer movements
• Artificial acceleration increases the velocity of the mouse cursor
• The physical velocity of the mouse is the key determiner of when artificial acceleration is applied
Within Mac OS X:
• macScalingValue: this value resides in the system defaults. When the mouse velocity passes this value, the velocity of the cursor increases by a factor of 2
Within Windows Registry:
• mouseThreshold1: when the movement of the mouse passes this value, the speed of the cursor increases by a factor of 2 for as long as the mouse continues at this rate
• mouseThreshold2: increases mouse cursor speed by a factor of 4 when the mouse movement velocity increases to this value

Artificial Acceleration
• Formulas used to convert virtual mouse and cursor movements into physical movement speeds
• Threshold Registry Locations:

Artificial Acceleration - Disabling
• Disabling is optimal!
• Disabling artificial acceleration limits complications in user recognition
• Disable in Windows: uncheck “Enhance pointer precision”
• Disable in Mac OS X: enter this command into the Terminal application
• Enabled by default in:
  • Windows
  • Mac OS X
  • Most versions of Linux

Procedure/Methodology - Simulator
• In order to analyze artificial acceleration, more user/system data was needed:
  • Creation of a simulation environment for testing
  • Records change in mouse coordinates, time, monitor size, and screen DPI value
  • Developed in Python
• Includes methods for:
  • Screen resolution
  • Monitor size in inches
  • Threshold or scaling value
  • Screen DPI (Dots Per Inch)
• Calculate:
  • DPI
  • Physical velocity of mouse
  • Physical velocity of pointer
  • Modifier for artificial acceleration (Mac & Windows)
• Test two separate user sessions:
  • Artificial acceleration enabled
  • Artificial acceleration disabled
• [Figure: user prompt] The user prompt asks the user for a unique user number and the monitor size in inches.
• [Figure: sample output] Sample output while the simulator records the velocity and writes this data to a CSV file.
• [Figure: simulation environment] The simulation environment tracks mouse coordinates on the grey plane, recording the events into a CSV file while printing the results in the command prompt or Terminal window.
• [Figure: plot, artificial acceleration disabled] A plot is generated upon exit of the simulation environment, with each dot indicating a recorded change in cursor location, limited by the mouse bus speed. Results shown are from a test session with artificial acceleration disabled.
• [Figure: plot, artificial acceleration enabled] The X and Y axes reflect the upper and lower limits (of the monitor) traveled by the cursor, with some additional padding for presentation purposes. Results shown are from a test session with artificial acceleration enabled.
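The simulator calculations listed above (DPI, physical pointer velocity, and the acceleration modifier) might look roughly like the following sketch. It is not the project's simulator code: all function names are hypothetical, the threshold values 6 and 10 are placeholders rather than values read from any system, and the modifier simply follows the two-threshold Windows description given earlier (factor of 2 past the first threshold, factor of 4 past the second). It also assumes square pixels, so that DPI can be derived from the diagonal.

```python
import math

def screen_dpi(width_px, height_px, diagonal_in):
    """Dots per inch from the screen resolution and the diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in

def pointer_velocity(p0, p1, dt_s, dpi):
    """Physical pointer velocity in inches per second between two samples."""
    pixels = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    return (pixels / dpi) / dt_s

def acceleration_modifier(mouse_speed, threshold1, threshold2):
    """Cursor speed multiplier under the two-threshold rule (assumed model)."""
    if mouse_speed > threshold2:
        return 4
    if mouse_speed > threshold1:
        return 2
    return 1

# Hypothetical session values: a 1920x1080 monitor with a 24-inch diagonal.
dpi = screen_dpi(1920, 1080, 24)

# Two consecutive pointer samples taken 0.05 s apart.
velocity = pointer_velocity((100, 100), (400, 500), 0.05, dpi)

# Placeholder thresholds; dividing by the modifier compensates the
# recorded pointer velocity for the acceleration that was applied.
modifier = acceleration_modifier(mouse_speed=7, threshold1=6, threshold2=10)
compensated = velocity / modifier
print(f"dpi={dpi:.1f} pointer={velocity:.2f} in/s compensated={compensated:.2f} in/s")
```

Deriving DPI from the resolution and the monitor size is why both values have to be collected from the user or queried from the system before pointer motion can be expressed in physical units.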
Procedure/Methodology - Results
Artificial Acceleration disabled
• Leads to optimal results
• No further work needed for user recognition
Artificial Acceleration enabled
• Lower threshold (mouseThreshold1) reached
• Upper threshold never reached
• Longer lines between points in the plot show possible limitations:
  • Software used in implementation
  • Mouse data packet generation too slow

Conclusion
• If optimal results are desired, artificial acceleration should be disabled
• The research in this paper, however, lets us calculate the severity of the acceleration as well as the true movements of the mouse and pointer
• The correct information must be queried from the user system
• This work can further be applied and integrated into the current mouse movement biometric studies at Pace University
Suggestions for improving user data collection:
• Query the user system for:
  • Monitor size (inches)
  • Screen resolution
  • Artificial acceleration values

Thank you