MATHEMATICAL INVESTIGATION USING SPSS

Learning Objectives
At the end of the lesson, you will be able to:
• Describe mathematical investigation;
• Apply SPSS in mathematical investigation; and
• Enjoy investigating cases using SPSS.

What is investigation?
• Investigation refers to the action of investigating something or someone.

What is mathematical investigation?
• Mathematical investigation refers to the sustained exploration of a mathematical situation.

What is SPSS?
• SPSS is a statistical software suite developed by IBM for data management, advanced analytics, multivariate analysis, business intelligence, and criminal investigation.

THE CASE: IT HAPPENED ONE NIGHT
It was the night of February 17, and heavy snowfall had covered the ground with a thick blanket of snow. Veronica had spent the evening at her friend Anna’s house. At midnight, she walked through the square back to her apartment, located on the second floor of a block on the other side of the square. Fortunately, it had stopped snowing about half an hour before. Unfortunately, Veronica never made it home. Her corpse was found in the early hours of the morning in the square, at the point indicated by the cross on the map.

Map legend:
A – Veronica and Samuele’s house
B – Sonia’s house
C – Vittorio’s house
D – Anna’s house
E – Tommaso’s house
AA – Samuele’s news stand
EE – The bell tower

Information Gathered
• Veronica, the victim
• Samuele, newspaper vendor, age: 45, height: 165 cm
• Sonia, physician, age: 38, height: 175 cm
• Vittorio, bank employee, age: 36, height: 188 cm
• Anna, Veronica’s friend, age: 54, height: 169 cm
• Tommaso, bell-ringer in the tower in the square, age: 50, height: 172 cm

COLLECTED INFORMATION: MOVEMENTS
• Samuele had left his house (A) and had gone to his news stand (AA).
• Sonia had left her house (B) and headed toward Via Lancillotto to start her shift at the hospital.
• Vittorio had left his house (C) to head toward Via Mortona.
• Tommaso had left his house (E) to go to the bell tower.
• The thick fog that had developed made their routes erratic and prevented any of them from noticing anything strange.
• During their journeys, nobody came across any other footprints in the snow, and none of their paths intersected with each other or with poor Veronica’s path.

When the police arrived, the sun had melted almost all of the footprints. The only place where footprints remained was in the shade of the bell tower, close to the victim. These showed the killer’s exit route. Below is a rough sketch done by the appointed officer:

[Sketch: the killer’s footprints, with a stride length of 73 cm]

Other information collected by the investigators:
• No one who was questioned had crossed the square more than once.
• From the shape of the footprints, it can be deduced that the killer ran to flee the scene.

Of the people questioned, who is the likeliest suspect? Why?

A LITTLE THEORY AND HISTORY
Identifying criminals has always been a crucial issue in the field of criminology; in the past, there was a tendency to tattoo criminals in order to make them recognizable in the future. Only around 1879 did Alphonse Bertillon begin to suggest classification and identification methods based on a mathematical analysis of the measurements of people’s physiological features: he started the process known as biometric identification. His proposals were based on a wide sampling of dimensions relating to parts of the body, including height, arm span, torso measurement, and the dimensions of the head, the middle toe of the left foot and the left forearm. He observed that the probability of two people having the same measurements for all the anatomical areas considered was extremely low, and on this basis proposed a new classification method for humans.
The method yielded good results as long as the criminal database was populated with data from few prisoners; but, after around ten years, it was discovered that people who had never been in prison could present characteristics similar, if not identical, to those of the criminals. The system was abandoned at the start of the 1900s in favor of a more efficient method developed by the physician Henry Faulds and the statistician Francis Galton: fingerprints. Over time, biometrics has adopted new techniques to meet its objectives of identifying and authenticating people: DNA analysis, retina and iris scanning, hand structure, facial feature recognition, ear shape, body odor, brain imprints, signature dynamics, voice verification, lip imprints.

Footprints are one of the characteristic signs that investigators look for at crime scenes, because they contain a lot of information about the subject who left them. A footprint from a shoe can prove useful in:
• Determining and tracing the subject’s routes on the crime scene.
• Determining the approximate height and weight of a person.
• Establishing gender, age and other features.
• Studying the mode of movement (someone with a fast gait, a heavy person, someone moving backward, turning, limping, etc.).

This analysis does not fall fully into the field of biometrics, as it does not identify a two-way relationship between anatomical features and the person who possesses them; it does, however, allow more generic correlations to be highlighted, which can prove very useful.

DATA ANALYSIS: STATISTICS

DESCRIPTIVE STATISTICS
• Deals with information regarding a sample of a studied population.
• Describes its fundamental characteristics through indices, tables and graphs.

LINEAR REGRESSION
• A technique to find out whether two data sets are correlated and to find the type of mathematical expression that unites them.
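As a minimal sketch of what a linear regression computes, the Python snippet below fits a least-squares line y = bx + a and the correlation coefficient r for a small sample. The lesson itself uses SPSS for this step; Python is used here only for illustration. The function name `fit_line` and all data values are hypothetical and are not measurements from this lesson.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (a, b, r) for the least-squares line y = b*x + a.

    a is the intercept, b is the slope, and r is the Pearson
    correlation coefficient between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    return a, b, r


# Hypothetical sample (made-up values, for illustration only):
# stride length in cm and height in cm for five people.
strides = [60.0, 65.0, 70.0, 75.0, 80.0]
heights = [150.0, 160.0, 165.0, 175.0, 180.0]

a, b, r = fit_line(strides, heights)
print(f"height = {b:.3f} * stride + {a:.3f}, r = {r:.3f}")
```

A value of r close to 1 or -1 suggests that a straight line describes the data well, while a value close to 0 suggests that it does not.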
APPLICATION

HYPOTHESIS: The length of a person’s stride is proportional to their height.

Identify a sample
Important elements:
• The size of the sample
• The representativeness of the sample

In our case, we limit ourselves to the population of students in a school and choose the students in one class as the representative sample. Two measurements must be taken from each individual in the sample: their height and the length of their stride.

Scatterplots
y = bx + a
Linear regression is a statistical technique that allows us to find the equation of the line of best fit for the data among the many straight lines that can be drawn in the Cartesian plane.

Interpreting the correlation coefficient:
Correlation coefficient (r)    Relationship
1                              Perfect
0.75 – 1                       Strong
0.5 – 0.75                     Moderate
0.25 – 0.5                     Weak
< 0.25                         No relationship

Calculating Estimated Values
The coefficient r = 0.755 indicates a strong relationship: the straight line identified describes the relationship between the points rather well. Knowing the equation allows us to predict the value of one measurement if the value of the correlated measurement is known. For example: if we know a person’s stride length, x = 73 cm, is it possible to gain an idea of their height?

REGRESSION
Height = 1.669 × Length of stride + 49.534
In other words: y = 1.669x + 49.534

Using the Equation to Predict
y = 1.669x + 49.534
If the stride length of the killer (x) is 73 cm, we can use the equation to predict the height of the killer (y):
y = 1.669x + 49.534
y = 1.669(73) + 49.534
y = 121.837 + 49.534
y = 171.371 cm

SOLUTION
There are two possibilities:
• Tommaso passes behind the news stand in order to go to EE, in which case it is Vittorio who finds the victim.
• Tommaso passes in front of the news stand, in which case it is he who finds the victim.

SOLUTION, 1ST POSSIBILITY: Tommaso passes behind the news stand in order to go to EE, in which case it is Vittorio who finds the victim.
SOLUTION, 2ND POSSIBILITY: Tommaso passes in front of the news stand, in which case it is he who finds the victim.
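The prediction step of the case can be sketched in Python as follows. The snippet applies the regression equation y = 1.669x + 49.534 to the 73 cm stride and compares the estimate with the heights listed under Information Gathered. The names `predict_height`, `suspects` and `differences` are illustrative assumptions. With r = 0.755 the estimate is only approximate, so the comparison narrows the field rather than proving anything on its own; the routes across the square still have to be considered.

```python
# Coefficients taken from the regression equation in the lesson:
# height = 1.669 * stride + 49.534 (both in cm)
SLOPE = 1.669
INTERCEPT = 49.534


def predict_height(stride_cm):
    """Estimate a person's height (cm) from their stride length (cm)."""
    return SLOPE * stride_cm + INTERCEPT


# Heights (cm) of the people questioned, as listed in the case
suspects = {
    "Samuele": 165,
    "Sonia": 175,
    "Vittorio": 188,
    "Anna": 169,
    "Tommaso": 172,
}

predicted = predict_height(73)  # stride measured in the officer's sketch

# Absolute difference between each person's height and the estimate
differences = {name: abs(height - predicted) for name, height in suspects.items()}

print(f"Predicted height: {predicted:.3f} cm")
for name, diff in sorted(differences.items(), key=lambda item: item[1]):
    print(f"{name}: differs by {diff:.3f} cm")
```

Sorting by the difference puts the people whose height is closest to the estimate at the top of the list.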