Uploaded by lizymia

Data Analytics & Modeling Group Coffee Quality

Business Data Analytics & Modeling
Dataset: Coffee Quality
Step 1 - Identify problem and find Dataset.
Problem:
Geography has a major impact on the way coffee tastes, which is why roasters note a coffee's
country and region of origin. We believe elevation is another aspect of geography that affects a coffee's
quality and taste, which is why roasters include this information on their bags.
Question:
(1) Does altitude have a direct relationship with the overall quality of the coffee (Total Cup Points)?
(2) Does region have a direct relationship with the overall quality of the coffee (Total Cup Points)?
(A) How does altitude or region affect any of the quality measures?
Dataset:
Arabica coffee species rating from Coffee Quality Institute (CQI)
January 2018
https://github.com/jldbc/coffee-quality-database
About the Data
Dataset:
Coffee Quality Institute (CQI) is a non-profit organization working internationally to improve the quality of coffee and the
lives of the people who produce it.
Arabica coffee species rating from Coffee Quality Institute (CQI), January 2018: https://github.com/jldbc/coffee-quality-database
Each coffee sample is assigned to three local Q Arabica or Q Robusta Graders.
(1) They conduct a blind evaluation and submit a report.
(2) The scores are averaged.
(3) Coffees that meet the standards are issued a Q Certificate.
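The averaging step can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the certification threshold is not stated in these slides, so it is left as a required parameter rather than assumed.

```python
def average_cup_score(grader_scores):
    """Return the mean of the scores submitted by the graders."""
    if not grader_scores:
        raise ValueError("at least one grader score is required")
    return sum(grader_scores) / len(grader_scores)


def meets_standard(grader_scores, threshold):
    """Check the averaged score against a caller-supplied threshold.

    The threshold is an assumption of this sketch; the slides do not give it.
    """
    return average_cup_score(grader_scores) >= threshold
```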
Cleaned Dataset
Raw (initial data set): 1312 coffee reviews of Arabica beans
Cleaned: 961
Cleaning the Data:
● Assigned a unique ID (UID)
● Removed Certification Body, Certification Address, Certification Contact as variables NOT RELEVANT
● Removed Expiration as NOT RELEVANT
● Removed ICO.Number as variable NOT RELEVANT
● Removed Lot.Number as variable. Incomplete data.
● Removed Farm.Name, Company, Producer, Mill, Owner.1 as duplicate to Owner variable.
● Removed any data that has "NA" in Altitude.
● Converted Altitude from “ft” to “Meters”
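The cleaning steps above could be sketched as follows. This is a minimal illustration, not the group's actual procedure: it assumes each review is a dict, that the dotted column names (for example `Certification.Body`) match the raw file, that `Altitude` holds a single numeric value, and that a `unit_of_measurement` column marks rows recorded in feet. The function name `clean_rows` is hypothetical.

```python
FEET_TO_METERS = 0.3048

# Columns dropped as not relevant, incomplete, or duplicates of Owner.
# The dotted names are assumed to match the raw file's headers.
DROPPED_COLUMNS = {
    "Certification.Body", "Certification.Address", "Certification.Contact",
    "Expiration", "ICO.Number", "Lot.Number",
    "Farm.Name", "Company", "Producer", "Mill", "Owner.1",
}


def clean_rows(rows):
    """Apply the listed cleaning steps to a list of dict rows."""
    cleaned = []
    for row in rows:
        # Skip reviews with no usable altitude.
        if row.get("Altitude") in (None, "", "NA"):
            continue
        out = {k: v for k, v in row.items() if k not in DROPPED_COLUMNS}
        # Assumes Altitude is a single number; real entries may need parsing.
        altitude = float(out["Altitude"])
        if out.get("unit_of_measurement") == "ft":
            altitude *= FEET_TO_METERS
            out["unit_of_measurement"] = "m"
        out["Altitude"] = altitude
        # Assign a sequential unique ID to each surviving review.
        out["UID"] = len(cleaned) + 1
        cleaned.append(out)
    return cleaned
```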
Data included in Dataset:
● UID
● Species
● Country of Origin
● Region
● Owner
● Altitude
● Number of Bags
● Bag Weight
● In Country Partner
● Harvest Year
● Grading Date
● Variety
● Processing Method
● Aroma
● Flavor
● Aftertaste
● Acidity
● Body
● Balance
● Uniformity
● Clean Cup
● Sweetness
● Cupper Points
● Total Points (Coffee Quality)
● Moisture
● Category One Defects
● Quakers
● Color
● Category Two Defects
● Unit of Measurement
● Altitude Low Meters
● Altitude High Meters
● Altitude Mean Meters
Step 3 - Identify potential Dependent and Independent variables
Independent Variables:
Altitude
Region
Dependent Variables:
Total Points (Coffee Quality)
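Because Region is categorical, one simple way to relate it to Total Points is to compare the mean score per region. The sketch below assumes each review is a dict; the key names `Region` and `Total.Cup.Points` and the function name are assumptions for illustration, not confirmed column headers.

```python
from collections import defaultdict


def mean_score_by_group(rows, group_key="Region", score_key="Total.Cup.Points"):
    """Return {group: mean score} for a list of dict rows."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of scores, count]
    for row in rows:
        acc = totals[row[group_key]]
        acc[0] += float(row[score_key])
        acc[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}
```

Comparing group means only describes the data; deciding whether the differences are meaningful would need a formal test such as a one-way ANOVA.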
Hypothesis:
Higher altitude results in higher overall Coffee Quality points.
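One possible way to check this hypothesis is to compute the Pearson correlation between altitude and Total Points and fit a least-squares line. The sketch below uses only the standard library; the function names are hypothetical, and the inputs are assumed to be two equal-length lists of numbers (altitude in meters and the matching Total Points).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares fit y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A positive slope and a correlation well above zero would be consistent with the hypothesis. A correlation alone does not establish a direct relationship, since region and altitude may be confounded.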
Support information
● Coffee bean physical quality: The effect of climate change adaptation behavior of shifting up cultivation area to a higher elevation.
http://biodiversitas.mipa.uns.ac.id/D/D1902/D190208.pdf
● Coffee Characteristics: What affects the quality of coffee before you brew it.
https://www.blackoutcoffee.com/blogs/the-reading-room/coffee-characteristics-what-affects-the-quality-of-coffee-before-you-brew-it
● The effect of bean origin and temperature on grinding roasted coffee.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4834475/