NutriData Final Report

Joacim Soto, jsotocruz@csustan.edu
Andrew Obregon, obregonandrew@gmail.com
CS 4250 Project Part 6
05/18/13

1. Project Description

NutriData is a reference-style nutritional database that lets the public look up facts about any food they are interested in. The information that can be recorded for a specific food is extensive. NutriData not only returns the nutritional facts and ingredients of a food, it can also find the company that currently manufactures that product. Although this is not directly implemented in the project’s WWW site, the database also holds a set of sample users together with their physical information and the foods they like. This is a precursor to a group of registered subscribers who may use the site.

The database models the reference capabilities of a large library of information on a single subject. In NutriData’s case the main subject of investigation, and the main entity, is food; more precisely, “specific food”. A user can find the nutritional values of a food, its ingredients, which company manufactures it, and which category of the USDA food pyramid it falls into. The database also provides general information on the nutritional values, the food pyramid’s suggested servings, and the ingredients most commonly found in foods. Ideally a user should be able to find any food in our database, but this is not possible at the moment because the database has only 172 food items.

The Real World

The database can pave the way for added functionality. If we added the group of subscribers, we would try to recommend foods to replace in, or add to, their so-called “diet”. NutriData would be most helpful to those who are honest about their habits. From the consumer standpoint, people would also be able to make more informed decisions about what they consume.
Food prices could also be included, but they would be very difficult to maintain and update. We could become particularly health conscious and point out the foods that are carcinogenic, genetically modified (GM), etc. The point, however, is not to be biased or to misrepresent products or companies, but to expose the facts to the people. People are responsible for the choices they make. NutriData is just an informative tool.

2. Relational Schema

Important Updates: The only change from the previous ER Diagram is that a calories attribute has been added to Food. This is because calories are themselves a unit of measure and cannot be quantified in grams. The change is reflected in the new ER Diagram. We would also like to explain that the relations Food_Pyramid and Nutritional_Value are limited to 6 and 39 tuples respectively. This is because there are only 6 categories in the Food Pyramid and we were able to find information for only 39 nutritional values for the project.

Note: PK = Primary Key and FK = Foreign Key

CEOruns
Field                Type           Null  Default  Comments
fName                varchar(30)    No             Partial Key
lName                varchar(30)    No             Partial Key
cID                  int(11)        No             PK/FK

Company
Field                Type           Null  Default  Comments
cID                  int(11)        No             PK
cName                varchar(30)    Yes   NULL
cWebSite             varchar(100)   Yes   NULL

Contain
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK
nvName               varchar(30)    No             PK/FK
grams                double         No

Food
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK
foodName             varchar(50)    Yes   NULL
servingSize          double         Yes   NULL
calories             int(6)         No    0

Food_Pyramid
Field                Type           Null  Default  Comments
category             varchar(10)    No             PK
addInfo              varchar(1000)  Yes   NULL
RecommendedServings  double         Yes   NULL

General_Food
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK

Human_User
Field                Type           Null  Default  Comments
userID               int(11)        No             PK
uName                varchar(30)    Yes   NULL
uAge                 int(3)         Yes   NULL
uWeight              double         Yes   NULL
uHeight              double         Yes   NULL
uGender              char(1)        Yes   NULL

Ingredients
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK
info                 varchar(1000)  No

Likes
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK
userID               int(11)        No             PK/FK

Manufactures
Field                Type           Null  Default  Comments
cID                  int(11)        No             PK/FK
foodID               int(11)        No             PK/FK

Nutritional_Value
Field                Type           Null  Default  Comments
nvName               varchar(30)    No             PK
suggestedDailyVal    int(11)        Yes   NULL
info                 varchar(1000)  Yes   NULL

Specific_Food
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK

isCategorized
Field                Type           Null  Default  Comments
foodID               int(11)        No             PK/FK
category             varchar(10)    No             PK/FK

isFanOf
Field                Type           Null  Default  Comments
userID               int(11)        No             PK/FK
cID                  int(11)        No             PK/FK

isMadeOf
Field                Type           Null  Default  Comments
specificfoodID       int(11)        No             PK/FK
ingredientfoodID     int(11)        No             PK/FK

3. ER Diagram

[Figure: ER diagram]

4. CREATE TABLE Statements (latest update version)

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;

CREATE DATABASE jsoto DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;
USE jsoto;

-- The CEO of a company; the CEO's name is a partial key and depends on the company.
CREATE TABLE CEOruns (
  fName varchar(30) NOT NULL DEFAULT '',
  lName varchar(30) NOT NULL DEFAULT '',
  cID int(11) NOT NULL,
  PRIMARY KEY (fName,lName,cID),
  KEY cID (cID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Company (
  cID int(11) NOT NULL AUTO_INCREMENT,
  cName varchar(30) DEFAULT NULL,
  cWebSite varchar(100) DEFAULT NULL,
  PRIMARY KEY (cID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Amount of each nutritional value contained in a specific food.
CREATE TABLE Contain (
  foodID int(11) NOT NULL,
  nvName varchar(30) NOT NULL,
  grams double NOT NULL,
  PRIMARY KEY (foodID,nvName),
  KEY nvName (nvName)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Food (
  foodID int(11) NOT NULL AUTO_INCREMENT,
  foodName varchar(50) DEFAULT NULL,
  servingSize double DEFAULT NULL,
  calories int(6) NOT NULL DEFAULT '0',
  PRIMARY KEY (foodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Food_Pyramid (
  category varchar(10) NOT NULL,
  addInfo varchar(1000) DEFAULT NULL,
  RecommendedServings int(11) DEFAULT NULL,
  PRIMARY KEY (category)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE General_Food (
  foodID int(11) NOT NULL,
  PRIMARY KEY (foodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Human_User (
  userID int(11) NOT NULL AUTO_INCREMENT,
  uName varchar(30) DEFAULT NULL,
  uAge int(3) DEFAULT NULL,
  uWeight double DEFAULT NULL,
  uHeight double DEFAULT NULL,
  uGender char(1) DEFAULT NULL,
  PRIMARY KEY (userID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Ingredients (
  foodID int(11) NOT NULL,
  info varchar(1000) NOT NULL,
  PRIMARY KEY (foodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Likes (
  foodID int(11) NOT NULL,
  userID int(11) NOT NULL,
  PRIMARY KEY (foodID,userID),
  KEY userID (userID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Manufactures (
  cID int(11) NOT NULL,
  foodID int(11) NOT NULL,
  PRIMARY KEY (cID,foodID),
  KEY foodID (foodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Nutritional_Value (
  nvName varchar(30) NOT NULL DEFAULT '',
  suggestedDailyVal int(11) DEFAULT NULL,
  info varchar(1000) DEFAULT NULL,
  PRIMARY KEY (nvName)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE Specific_Food (
  foodID int(11) NOT NULL,
  PRIMARY KEY (foodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE isCategorized (
  foodID int(11) NOT NULL,
  category varchar(10) NOT NULL,
  PRIMARY KEY (foodID,category),
  KEY category (category)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE isFanOf (
  userID int(11) NOT NULL,
  cID int(11) NOT NULL,
  PRIMARY KEY (userID,cID),
  KEY cID (cID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE isMadeOf (
  specificfoodID int(11) NOT NULL,
  ingredientfoodID int(11) NOT NULL,
  PRIMARY KEY (specificfoodID,ingredientfoodID),
  KEY ingredientfoodID (ingredientfoodID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Foreign keys are added once all of the tables exist.
ALTER TABLE `CEOruns`
  ADD CONSTRAINT ceoruns_ibfk_1 FOREIGN KEY (cID) REFERENCES Company (cID) ON DELETE CASCADE;

ALTER TABLE `Contain`
  ADD CONSTRAINT contain_ibfk_2 FOREIGN KEY (nvName) REFERENCES Nutritional_Value (nvName),
  ADD CONSTRAINT contain_ibfk_3 FOREIGN KEY (foodID) REFERENCES Specific_Food (foodID);

ALTER TABLE `Ingredients`
  ADD CONSTRAINT ingredients_ibfk_1 FOREIGN KEY (foodID) REFERENCES Food (foodID) ON DELETE CASCADE;

ALTER TABLE `Likes`
  ADD CONSTRAINT likes_ibfk_4 FOREIGN KEY (userID) REFERENCES Human_User (userID) ON DELETE CASCADE,
  ADD CONSTRAINT likes_ibfk_3 FOREIGN KEY (foodID) REFERENCES Food (foodID) ON DELETE CASCADE;

ALTER TABLE `Manufactures`
  ADD CONSTRAINT manufactures_ibfk_1 FOREIGN KEY (cID) REFERENCES Company (cID),
  ADD CONSTRAINT manufactures_ibfk_2 FOREIGN KEY (foodID) REFERENCES Food (foodID);

ALTER TABLE `Specific_Food`
  ADD CONSTRAINT specific_food_ibfk_1 FOREIGN KEY (foodID) REFERENCES Food (foodID) ON DELETE CASCADE;

ALTER TABLE `isCategorized`
  ADD CONSTRAINT iscategorized_ibfk_1 FOREIGN KEY (foodID) REFERENCES General_Food (foodID),
  ADD CONSTRAINT iscategorized_ibfk_2 FOREIGN KEY (category) REFERENCES Food_Pyramid (category) ON DELETE CASCADE;

ALTER TABLE `isFanOf`
  ADD CONSTRAINT isfanof_ibfk_1 FOREIGN KEY (userID) REFERENCES Human_User (userID),
  ADD CONSTRAINT isfanof_ibfk_2 FOREIGN KEY (cID) REFERENCES Company (cID);

ALTER TABLE `isMadeOf`
  ADD CONSTRAINT ismadeof_ibfk_1 FOREIGN KEY (specificfoodID) REFERENCES Specific_Food (foodID),
  ADD CONSTRAINT ismadeof_ibfk_2 FOREIGN KEY (ingredientfoodID) REFERENCES Ingredients (foodID);

5. Functional Dependencies:

Functional Dependencies           Relations/Tables
nvName --> suggestedDailyVal      Nutritional_Value
nvName --> info                   Nutritional_Value
nvName, foodID --> grams          contain
Iname --> info                    Ingredient
foodID --> foodName               Food
foodID --> servingSize            Food
UserID --> uAge                   Human_User
UserID --> uHeight                Human_User
UserID --> uWeight                Human_User
UserID --> uName                  Human_User
UserID --> uGender                Human_User
cID --> cName                     Company
cID --> cAddress                  Company
cID --> cWebsite                  Company
category --> addInfo              Food_Pyramid
category --> recommendedServings  Food_Pyramid
foodID --> category               Food, Food_Pyramid

6. BCNF or 3NF explanations

Every relation is in BCNF and, because BCNF implies 3NF, it is also in 3NF.

The following relations are in BCNF because the left-hand side of every nontrivial dependency is a candidate key:
  Nutritional_Value, contain, Ingredient, Food, Human_User, Company, Food_Pyramid

The following relations are in BCNF because they contain only trivial dependencies:
  General_Food, Specific_Food, Likes, isMadeOf, isFanOf, manufactures, isCategorized, CEO, runs

7. Indices

Table              Indices
Nutritional Value  nvName - Hash
Food               foodID - B+ tree, foodName - Hash
company            cID - B+ tree, cName - Hash
Human_user         userID - B+ tree, uName - Hash

Integer fields such as IDs can be indexed with B+ trees: they support range selections, and a search for a single record needs only a small number of disk I/Os. If a file is hashed on a name field (the name of a food, user, or company), we can retrieve all the records with that particular name. This way the database can support searches for a particular name with equality selection queries.

8. Summary of records (at the time of Part 5)

Table              Records
CEO runs           29
Company            29
Contain            60
Food               109
Food_Pyramid       6
General_Food       9
Human_User         42
Ingredients        32
isCategorized      9
isFanOf            76
isMadeOf           86
Likes              66
Manufactures       32
Nutritional_Value  39
Specific_Food      34
15 table(s)        658

COMMANDS

Now let’s see the resulting table:
Let’s INSERT a Pear:
Now, let’s DELETE that Pear:
Now let’s change the serving size of an Apricot to 36 grams.

Sources

For most nutritional information on food: http://caloriecount.about.com/
To gather ingredient information: http://inrfood.com/
To investigate the companies and the products they produce: http://www.wikipedia.org/

The made-up data, namely the user names, was generated by an engine. Constraints are enforced through the field types that are allowed in each table. We do not have CHECK constraints at the moment.
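As a minimal sketch of what CHECK constraints for this schema might look like (they are not part of the project), the statements below add a few value checks. The constraint names and the rules themselves are assumptions made for illustration, for example that only 'M' and 'F' are valid genders, as in the sample rows. MySQL enforces CHECK constraints only from version 8.0.16 on; earlier versions parse the clause and silently ignore it.

```sql
-- Hypothetical CHECK constraints for the NutriData schema (illustration only).
-- A CHECK passes when its expression is TRUE or NULL, so nullable columns
-- such as servingSize or uAge may still hold NULL.

-- Assumption: calories and serving sizes are never negative.
ALTER TABLE Food
  ADD CONSTRAINT chk_food_calories CHECK (calories >= 0),
  ADD CONSTRAINT chk_food_serving CHECK (servingSize >= 0);

-- Assumption: only 'M' and 'F' are stored, and ages are never negative.
ALTER TABLE Human_User
  ADD CONSTRAINT chk_user_gender CHECK (uGender IN ('M', 'F')),
  ADD CONSTRAINT chk_user_age CHECK (uAge >= 0);

-- Assumption: a food cannot contain a negative amount of a nutritional value.
ALTER TABLE Contain
  ADD CONSTRAINT chk_contain_grams CHECK (grams >= 0);
```

With rules like these, an INSERT or UPDATE that violates one of them would be rejected by the server instead of being stored.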
Note that most, but not all, food ingredients have a serving size of 0, so this is not a complete generalization. We also limit the length of the fields.

Rest of Samples (at the time of Part 5)

‘CEO runs’
fName     lName     cID
John      Bryant    1
William   Stiritz   2
Indra     Nooyi     3
Muhtar    Kent      4
Tony      Vernon    5
Denise    Morrison  6
Ken       Powell    7
John      Bilbrey   8
Kandall   Powell    9
Donnie    Smith     10

Contain
foodID  nvName               grams
7       Trans Fat            0
8       Calories             211
8       Cholesterol          0
8       Dietary Fiber        6.3
8       Polyunsaturated Fat  0.5
8       Protein              5.3
8       Saturated Fat        0
8       Sodium               0
8       Sugars               11.6
8       Total Carbohydrates  49.5

Company
cID  cName                      cWebSite
1    Kelloggs                   kelloggs.com
2    Post Foods                 postfoods.com
3    Pepsi co.                  pepsico.com
4    The Coca-Cola Company      cocacola.com
5    Kraft Foods Inc.           Kraft.com
6    The Campbell Soup Company  campbellsoup.com
7    General Mills              generalmills.com
8    The Hershey Company        hersheys.com
9    Betty Crocker              bettycrocker.com
10   Tyson Foods                tyson.com

Food
foodID  foodName                          servingSize
1       Gala Apple                        152
2       Chiquita Banana                   126
3       Motts Apple Sauce                 128
4       Original Rockstar                 240
5       Monster Beverage                  240
7       Corn Flakes                       28
8       Frosted Mini Wheats Bite Size     59
9       Frosted Shredded Wheat Bite Size  52
10      Pepsi Cola                        227
11      Coca-Cola                         240

Food_Pyramid
category    addInfo                                                RecommendedServings
Dairy       All fluid milk products and many foods made from m...  3
Fruits      Any fruit or 100% fruit juice counts as part of th...  5
Grains      Any food made from wheat, rice, oats, cornmeal, ba...  7
Oils        Oils are fats that are liquid at room temperature,...  3
Protein     All foods made from meat, poultry, seafood, beans ...  3
Vegetables  Any vegetable or 100% vegetable juice counts as a ...  5

General_Food
foodID
18
101
123
125
127
131
132
133
134

isCategorized
foodID  category
101     Dairy
131     Fruits
132     Fruits
133     Fruits
18      Grains
127     Grains
134     Protein
123     Vegetables
125     Vegetables

Human_User
userID  uName                uAge  uWeight  uHeight  uGender
31      uptightknees         35    161      66       M
32      brainmaroon          42    170      69       M
33      shoeschopping        23    127      66       M
34      biketasteless        28    136      63       M
35      canonmews            21    151      67       F
36      resentfulspawnslime  25    163      65       M
37      oddclerk             48    185      73       F
38      mumartery            27    139      68       F
39      geardome             24    140      71       M
40      jellyfishjellyfish   56    153      72       M

Ingredient
foodID  info
27      Corn is used in many food preparations around the ...
28      Sugar (sucrose) is used in most recipes as a sweet...
29      In addition to imparting a much-desired flavor to ...
30      Butylated hydroxytoluene (BHT) is a synthetic comp...
31      Malt flavoring is used as a food additive and a sw...
32      Whole grain wheat is a grain that is used in bakin...
33      Brown rice syrup is used as a sweetener. It is mad...
34      Gelatin is used in desserts as a thickener. Gelati...
35      Carbonated water is water that has carbon dioxide ...
37      High-fructose corn syrup is modified corn syrup. C...

isMadeOf
specificfoodID  ingredientfoodID
10              35
10              37
11              37
10              38
11              38
10              39
11              39
10              40
11              40
10              41

Manufactures
cID  foodID
1    7
1    8
2    9
3    10
4    11
5    12
6    13
7    14
8    15
9    16

isFanOf
userID  cID
6       7
11      7
18      7
23      7
26      7
28      7
34      7
1       8
3       8
4       8

Likes
foodID  userID
2       1
5       1
7       1
113     1
115     1
9       2
10      2
15      2
26      2
115     2

Nutritional_Value
nvName          suggestedDailyVal  info
Biotin          0                  All of the B vitamins help your body convert the c...
Calcium         1                  Calcium is considered the most abundant mineral in...
Calories        2000               In nutritional contexts, the kilojoule (kJ) is the...
Chloride        0                  Chloride, together with sodium, potassium and bica...
Cholesterol     0                  Cholesterol is a type of fat that is part of all a...
Chromium        0                  Starting around 2000, chromium received a lot of a...
Copper          0                  Some of the many functions of copper in a diet inc...
Dietary Fiber   25                 Dietary fiber - found mainly in fruits, vegetables...
Folate          0                  Folate belongs to a group of vitamins collectively...
Iodine          0                  Iodine is required by your body for the synthesis ...

Specific_Food
foodID
23
24
25
26
113
114
115
116
117
118

9. Sample script of Querying

NOTE: The queries in the script appear in the same order as the descriptions below:

1. Find all of the foods that are also Specific foods.
2. Find the average serving size of all the foods entered in the database.
3. Find the names of the users who like foodID 115.
4. Find the number of likes and the food names of the ten foods most liked by the users.
5. Find the names of foods that have at least one like.

As a failed example we tested:

SELECT AVG(Human_User.uAge), Food.foodName
FROM (Human_User JOIN Likes ON USING userID) JOIN Food USING foodID

We were trying to find the average age of the users who like specific foods. The statement does not parse because `ON USING userID` combines two join clauses and USING needs a parenthesized column list, as in USING (userID); a per-food average would also need GROUP BY Food.foodName.

And here is the script:

mysql> use jsoto;
Database changed
mysql> SELECT Food.foodName
    -> FROM Food, Specific_Food
    -> WHERE Food.foodID = Specific_Food.foodID;
+----------------------------------------+
| foodName                               |
+----------------------------------------+
| Gala Apple                             |
| Chiquita Banana                        |
| Motts Apple Sauce                      |
| Original Rockstar                      |
| Monster Beverage                       |
| Corn Flakes                            |
| Frosted Mini Wheats Bite Size          |
| Frosted Shredded Wheat Bite Size       |
| Pepsi Cola                             |
| Coca-Cola                              |
| Macaroni & Cheese                      |
| Chicken Noodle Soup                    |
| Cheerios                               |
| Hersheys Milk Chocolate                |
| SuperMoist Devils Food Cake            |
| Chicken Nuggets                        |
| Grilled Chicken Marinara               |
| Lasagna                                |
| Garden Vegetable Medley                |
| Boneless Skinless Breast Fillet        |
| Pillsbury Pancakes Buttermilk          |
| Purple 100% Grape Juice                |
| Old Fashioned Quaker Oats              |
| Marshmallow Mateys                     |
| Sunny D Original Tangy                 |
| Doraditas Bimbo                        |
| Tang Orange Drink Mix                  |
| Country Time Lemonade Mix              |
| Knudsen Fat Free Sour Cream            |
| Las Palmas Enchilada Sauce Chile Verde |
| Costco Chicken Bake                    |
| Caffe Nero Tiramisu                    |
| Safeway French Bread                   |
| Gansito Marinela                       |
+----------------------------------------+
34 rows in set (0.00 sec)

mysql> SELECT AVG(servingSize)
    -> FROM Food;
+-------------------+
| AVG(servingSize)  |
+-------------------+
| 49.82844036697248 |
+-------------------+
1 row in set (0.00 sec)

mysql> SELECT Human_User.uName
    -> FROM Human_User
    -> WHERE Human_User.userID IN (SELECT Likes.userID
    -> FROM Food, Likes
    -> WHERE Likes.foodID = 115);
+---------------+
| uName         |
+---------------+
| vulturemetal  |
| skillfulbask  |
| dawntoddlers  |
| dawnliberated |
+---------------+
4 rows in set (0.00 sec)

mysql> SELECT COUNT( Likes.foodID) AS NumLikes, Food.foodName
    -> FROM Likes, Food
    -> WHERE Food.foodID = Likes.foodID
    -> GROUP BY Food.foodName
    -> ORDER BY NumLikes DESC
    -> LIMIT 10;
+----------+----------------------------------+
| NumLikes | foodName                         |
+----------+----------------------------------+
|        6 | Chiquita Banana                  |
|        6 | Monster Beverage                 |
|        6 | Corn Flakes                      |
|        5 | Frosted Shredded Wheat Bite Size |
|        4 | Tang Orange Drink Mix            |
|        4 | Motts Apple Sauce                |
|        3 | Gala Apple                       |
|        3 | Coca-Cola                        |
|        3 | Garden Vegetable Medley          |
|        3 | Original Rockstar                |
+----------+----------------------------------+
10 rows in set (0.00 sec)

mysql> SELECT DISTINCT Food.foodName
    -> FROM Likes, Food
    -> WHERE Food.foodID = Likes.foodID;
+----------------------------------+
| foodName                         |
+----------------------------------+
| Chiquita Banana                  |
| Monster Beverage                 |
| Corn Flakes                      |
| Sunny D Original Tangy           |
| Tang Orange Drink Mix            |
| Frosted Shredded Wheat Bite Size |
| Pepsi Cola                       |
| Hersheys Milk Chocolate          |
| Marshmallow Mateys               |
| Coca-Cola                        |
| Purple 100% Grape Juice          |
| Old Fashioned Quaker Oats        |
| Grilled Chicken Marinara         |
| Lasagna                          |
| Garden Vegetable Medley          |
| Boneless Skinless Breast Fillet  |
| Gala Apple                       |
| Motts Apple Sauce                |
| Original Rockstar                |
| Macaroni & Cheese                |
| Chicken Noodle Soup              |
| Pillsbury Pancakes Buttermilk    |
| Wheat Bread                      |
+----------------------------------+
23 rows in set (0.00 sec)

mysql> exit
Bye

10. Snapshots of the Database

There were several attempts to implement a simple, working search function for food items. Ideally, this would have worked from the home page of NutriData. As a decorative touch, the team added CSS cascading menus to navigate the page.

11. Group Participation

Joacim Soto: Over the course of the semester I felt that we both participated intensely in the making of the database. There were definitely times when we needed to correct or change something, and each issue was eventually resolved. I apologized to Andrew, and notified him, about the days I missed when illness or other circumstances kept me from even attending classes, but in our meetings we were extremely effective in making progress. Andrew and I designed the ER diagram together, and he was responsible for drawing it. He also entered a fair portion of the data into the database and was effective in implementing the PHP pages at the end. As for me, I also entered and corrected a fair amount of data, and I did a lot of the testing and corrections since I had ready access to the database. We both did a lot of analysis, for example when we figured out that some attributes and relationships needed to be changed. I also wrote the reports and, at the end, designed the webpage and tried to add the extra functionality, although this was done in a very short time.

Andrew Obregon: