A Day in the Life of a Data Scientist
[Hamza Farooq]
9/8/17

Capabilities Overview
Deployment: Liaise with the UI and backend teams to deliver end-to-end integration
Scalability: Create the data pipeline and implement algorithms within Hadoop clusters
Data Augmentation: Create data using various tables to provide richness
Algorithm Design: Create algorithms pertaining to the veracity of Walmart data

Use Case 1
Wrigley's estimates $65 million in lost gum sales annually at Walmart due to on-shelf stock-outs
Other highly impacted impulse items include Mint, Candy, Yogurt, Frozen, Cookies and Chips
However, every other category is also impacted to a considerable degree
This significantly impacts the bottom line of both Walmart and the supplier

Use Case 2: Iterative Customer Decision Tree
Relationships between products determine the structure of the iterative customer decision tree (iCDT).
The iCDT works to identify the customer thought process and also highlights online product segments not carried in stores.
Example 1: Headphones
Example 2: Food Storage
[Figure labels: SINGLE/DOUBLE, MULTI, Not carried in stores]

Use Case 3
Objective is to solve for the amount of demand that is transferred when an item is removed from an assortment
Example: CDT node (all the items in a node are substitutes)
Items in the node: A B C D E
If A is removed:
How will demand change for the remaining items B-E?
How much demand will not transfer?

API-based Hadoop Architecture
[Diagram labels: Enterprise Assortment System [UI], API]
The new architecture scales using Hadoop clusters, delivers results faster, and integrates seamlessly with the UI.
All jobs are executed on the edge nodes using Oozie workflows, which allows 100% automation and brings the need for manual intervention to zero.
Seamless integration between each layer
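
The question in Use Case 3 (how much demand moves to the remaining substitutes when an item leaves a CDT node, and how much is lost) can be illustrated with a minimal sketch. The slides do not describe the actual model, so everything here is an assumption for illustration: the function name `transfer_demand`, the fixed `transfer_rate`, the proportional split across remaining items, and the demand figures.

```python
def transfer_demand(demand, removed, transfer_rate):
    """Redistribute a removed item's demand across the remaining substitutes.

    Assumption (not from the slides): a fixed share `transfer_rate` of the
    removed item's demand moves to the remaining items in proportion to
    their current demand, and the rest is lost.
    """
    if removed not in demand:
        raise KeyError(f"{removed!r} is not in the node")
    if not 0.0 <= transfer_rate <= 1.0:
        raise ValueError("transfer_rate must be between 0 and 1")

    remaining = {item: d for item, d in demand.items() if item != removed}
    total_remaining = sum(remaining.values())
    if total_remaining == 0:
        # No substitute has any demand to anchor a proportional split,
        # so all of the removed item's demand is treated as lost.
        return remaining, demand[removed]

    transferred = demand[removed] * transfer_rate
    new_demand = {
        item: d + transferred * d / total_remaining
        for item, d in remaining.items()
    }
    lost = demand[removed] - transferred
    return new_demand, lost


# Hypothetical weekly unit demand for the five substitutes in one CDT node.
node = {"A": 100.0, "B": 50.0, "C": 25.0, "D": 15.0, "E": 10.0}
new_demand, lost = transfer_demand(node, "A", transfer_rate=0.5)
# B, C, D, E gain 25.0, 12.5, 7.5 and 5.0 units; 50.0 units do not transfer.
```

In practice the transfer rate would likely differ per item pair and be estimated from data; the single fixed rate here only makes the two questions on the slide concrete.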