Displaying your data and using Classify

Exploring how to use the legend Classify command

When displaying data on a map, there are several things you should be aware of:

1. Because polygon sizes differ, large areas often simply have larger numbers.
2. Thus, normalizing by population can produce different results.
3. How you classify your data can also emphasize different patterns.
4. The number of classes you use can add to complexity.

Default with total count and 5 classes

Here are the contiguous 48 states for Whites in the US.

Natural Breaks

In part, the large states end up in the highest category. This map uses the default “natural breaks”, which probably isn’t the best classification here.

natural breaks classification
See also: classification, Jenks' optimization
[cartography] A method of manual data classification that seeks to partition data into classes based on natural groups in the data distribution. Natural breaks occur in the histogram at the low points of valleys. Breaks are assigned in the order of the size of the valleys, with the largest valley being assigned the first natural break.

Here I have changed to a simpler 3 classes. Notice that the smaller eastern states and the large but mostly low-density western states now fall into the lowest category. But the sizes of the categories are quite different.

Real Definition of Natural Breaks

Jenks’ optimization: The method requires an iterative process. That is, calculations must be repeated using different breaks in the dataset to determine which set of breaks has the smallest in-class variance. The process is started by dividing the ordered data into groups. Initial group divisions can be arbitrary. There are four steps that must be repeated:

• Calculate the sum of squared deviations between classes (SDBC).
• Calculate the sum of squared deviations from the array mean (SDAM).
• Subtract the SDBC from the SDAM (SDAM-SDBC). This equals the sum of the squared deviations from the class means.
• After inspecting each of the SDBC, a decision is made to move one unit from the class with the largest SDBC toward the class with the lowest SDBC.
• New class deviations are then calculated, and the process is repeated until the sum of the within-class deviations reaches a minimal value.[1][5]

5 Classes
3 Classes

What is it doing?

• Not always clear.
• In my opinion, it works better with remotely sensed data.
• If the data is logarithmic, then use a log or geometric classification.
• Make sure your classification scheme reflects whatever you’re trying to do.

Equal Interval

Results of Equal Interval

Now we see how the really big states and the small ones, based on population, fall into equal-sized classes.

Geometric Progression

Since our data is highly skewed to the right, we might want to try a geometric progression.

Now the class for the really small states is really narrow, the middle-sized ones have a larger range, and the largest ones have the largest range.

Now do it by percent white

Percent White

Consider the future of Republicans.

Switch to 5 Classes to improve detail

Exploring for a Geometric Progression

Given the fairly even distribution of the data, there doesn’t seem to be anything gained by going to a geometric progression.

Further Explorations

• Now explore Hispanic and Black populations.

Final note and caution

• How you display your data can give quite different answers.
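To make that caution concrete, the three classification schemes discussed in these slides can be compared on a small set of numbers. The following is a minimal Python sketch, not the GIS software's implementation. All function names are hypothetical, the sample counts are invented (they are not the census figures behind the maps), the geometric scheme is assumed to be a plain constant-ratio series, and the natural-breaks function uses a brute-force search for the smallest sum of squared deviations from the class means rather than the iterative Jenks procedure described earlier.

```python
from bisect import bisect_left
from itertools import combinations


def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)] + [hi]


def geometric_breaks(values, k):
    """Upper bounds of k classes whose bounds grow by a constant ratio.

    Assumption: strictly positive data and a plain geometric series,
    not any particular GIS package's algorithm.
    """
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("geometric breaks need strictly positive values")
    ratio = (hi / lo) ** (1 / k)
    return [lo * ratio ** i for i in range(1, k)] + [hi]


def sdcm(groups):
    """Sum of squared deviations from the class means."""
    total = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        total += sum((x - mean) ** 2 for x in group)
    return total


def natural_breaks_brute_force(values, k):
    """Upper bounds of the k classes with the smallest SDCM.

    Tries every way of cutting the sorted data into k non-empty groups,
    so it is only practical for small datasets.
    """
    data = sorted(values)
    n = len(data)
    best_groups, best_score = None, float("inf")
    for cuts in combinations(range(1, n), k - 1):
        edges = (0, *cuts, n)
        groups = [data[a:b] for a, b in zip(edges, edges[1:])]
        score = sdcm(groups)
        if score < best_score:
            best_groups, best_score = groups, score
    return [group[-1] for group in best_groups]


def classify(value, breaks):
    """Index of the first class whose upper bound is >= value."""
    return bisect_left(breaks, value)


if __name__ == "__main__":
    # Invented, right-skewed counts used purely for illustration.
    counts = [1, 2, 3, 10, 11, 12, 50, 51, 52, 400]
    schemes = [
        ("equal interval", equal_interval_breaks),
        ("geometric", geometric_breaks),
        ("natural breaks", natural_breaks_brute_force),
    ]
    for name, make_breaks in schemes:
        breaks = make_breaks(counts, 3)
        sizes = [0, 0, 0]
        for c in counts:
            sizes[classify(c, breaks)] += 1
        print(name, breaks, sizes)
```

Running the script prints the break values and the number of observations per class for each scheme. With right-skewed values like these, equal-interval breaks tend to crowd most observations into the lowest class, which is the pattern the equal-interval map shows for the states.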