Business Data Analytics & Modeling
Dataset: Coffee Quality

Step 1 - Identify problem and find Dataset

Problem: Geography has a major impact on the way coffee tastes. This is why roasters note a coffee's originating country and region. We believe another aspect of geography that affects a coffee's quality and taste is elevation, which is why roasters include this information on their bags of coffee.

Question:
(1) Does altitude have a direct relationship with the overall quality of the coffee (Total Cup Points)?
(2) Does region have a direct relationship with the overall quality of the coffee (Total Cup Points)?
(A) How does altitude or region affect any of the individual quality measures?

Dataset: Arabica coffee species rating from Coffee Quality Institute (CQI), January 2018
https://github.com/jldbc/coffee-quality-database

About the Data

Coffee Quality Institute (CQI) is a non-profit organization working internationally to improve the quality of coffee and the lives of the people who produce it.

Each coffee sample is assigned to three local Q Arabica or Q Robusta Graders:
(1) They conduct a blind evaluation and submit a report.
(2) The scores are averaged.
(3) Coffees that meet the standards are issued a Q Certificate.

Cleaned Dataset

Raw (initial data set): 1312 coffee reviews of Arabica beans
Cleaned: 961

Cleaning the Data:
● Assigned a unique ID (UID).
● Removed Certification Body, Certification Address, and Certification Contact as variables. Not relevant.
● Removed Expiration as a variable. Not relevant.
● Removed ICO.Number as a variable. Not relevant.
● Removed Lot.Number as a variable. Incomplete data.
● Removed Farm.Name, Company, Producer, Mill, and Owner.1 as duplicates of the Owner variable.
● Removed any rows that have "NA" in Altitude.
● Converted Altitude from feet to meters.

Data included in Dataset:
● UID
● Species
● Country of Origin
● Region
● Owner
● Altitude
● Number of Bags
● Bag Weight
● In Country Partner
● Harvest Year
● Grading Date
● Variety
● Processing Method
● Aroma
● Flavor
● Aftertaste
● Acidity
● Body
● Balance
● Uniformity
● Clean Cup
● Sweetness
● Cupper Points
● Total Points (Coffee Quality)
● Moisture
● Category One Defects
● Quakers
● Color
● Category Two Defects
● Unit of Measurement
● Altitude Low Meters
● Altitude High Meters
● Altitude Mean Meters

Step 3 - Identify potential Dependent and Independent variables

Independent Variables:
● Altitude
● Region

Dependent Variables:
● Total Points (Coffee Quality)

Hypothesis: Higher altitude results in higher overall Coffee Quality points.

Support information
● Coffee bean physical quality: The effect of climate change adaptation behavior of shifting up cultivation area to a higher elevation. http://biodiversitas.mipa.uns.ac.id/D/D1902/D190208.pdf
● Coffee Characteristics: What affects the quality of coffee before you brew it. https://www.blackoutcoffee.com/blogs/the-reading-room/coffee-characteristics-what-affects-the-quality-of-coffee-before-you-brew-it
● The effect of bean origin and temperature on grinding roasted coffee. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4834475/
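
As a rough illustration of the Altitude cleaning steps and the hypothesis above, the sketch below drops rows with "NA" altitude, converts feet to meters, and computes a Pearson correlation between altitude and Total Points. It is only a sketch under stated assumptions: the field names (altitude, unit, total_points) and the helper functions clean_rows and pearson_r are hypothetical, not the project's actual column names or code, and the sample rows are made up for illustration rather than taken from the CQI dataset. A correlation coefficient measures linear association only; it does not by itself show that altitude causes higher quality.

```python
import math

FEET_TO_METERS = 0.3048  # exact length of the international foot in meters


def clean_rows(rows):
    """Drop rows with a missing altitude and convert feet to meters.

    The keys "altitude", "unit" and "total_points" are hypothetical names
    chosen for this sketch, not the dataset's real column names.
    """
    cleaned = []
    for row in rows:
        altitude = row.get("altitude")
        if altitude is None or altitude == "NA":
            continue  # mirrors the 'removed rows with "NA" in Altitude' step
        altitude_m = float(altitude)
        if row.get("unit") == "ft":
            altitude_m *= FEET_TO_METERS
        cleaned.append(
            {"altitude_m": altitude_m, "total_points": float(row["total_points"])}
        )
    return cleaned


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Made-up rows for illustration only; these are NOT values from the CQI data.
sample = [
    {"altitude": "1200", "unit": "m", "total_points": "82.0"},
    {"altitude": "NA", "unit": "m", "total_points": "80.5"},
    {"altitude": "5000", "unit": "ft", "total_points": "83.5"},
    {"altitude": "1800", "unit": "m", "total_points": "84.0"},
]

cleaned = clean_rows(sample)
r = pearson_r(
    [c["altitude_m"] for c in cleaned],
    [c["total_points"] for c in cleaned],
)
print(f"rows kept: {len(cleaned)}, Pearson r: {r:.3f}")
```

Region is categorical, so it would need a different comparison (for example, comparing mean Total Points per region) instead of a correlation coefficient.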