Recommender Systems and Product Semantics
Rayid Ghani & Andy Fano
Accenture Technology Labs
Workshop on Recommendation & Personalization in E-Commerce
May 28, 2002

Who Are We?
- Accenture Technology Labs: R&D group for Accenture
- ~40 researchers in Chicago, Palo Alto (California) and Sophia Antipolis (France)
- Research in Data Mining, Machine Learning, Ubiquitous Computing, Wearable Computing, Language Technologies, Virtual & Augmented Reality, Collaborative Workspaces…

What Does a Transaction Mean?
- Terabytes of transaction data.
- But what does any one transaction mean?
- What does it tell us about the customer?

Example: Apparel
Transactional information captured by retailers:
- Date of Purchase
- SKU
- Price
- Size
- Brand
But what does this tell me about the customer who bought it?

Product Semantics: What does a product mean?
- What does this shirt say about her?
- Is it conservative or flashy?
- Trendy or classic?
- Formal or casual?

Where would we get this information?
- Where do people get this information? Marketing.
- Product companies and retailers spend fortunes telling customers what their products mean.
- Our idea: Build a system that analyzes marketing texts to infer these attributes.

Example
From the Macy’s web site:
DKNY Jeans Ruched Side-Tie Tee
Get back to basics with a fresh new look this season. The Ruched Side-Tie Tee has a drawstring tie at left hip with shirred detail down the side. Stretch provides a flattering, shapely fit. V-neck.
Training the System
- Product Descriptions
- Domain Experts
- Product descriptions marked up with attribute values
- Supervised Learning Algorithm
- Learned Statistical Models

Inferring Attributes via Text Classification
- Build one classifier per attribute type
- Simple statistical classifier: Naïve Bayes, multinomial model (McCallum & Nigam 1998)
- For all words (in the descriptions) and attribute values: calculate P(word | attribute value) using the manually rated items
- Given a new item description: calculate P(attribute value | item description) for all attribute values
- Use Maximum Likelihood: assign the attribute value with the highest probability

Semi-supervised Learning
- Lots of product descriptions are available at minimal cost
- Labeling them is expensive
- Apply magical algorithms that combine labeled and unlabeled data for classification: EM (Nigam et al. 1999), Co-Training (Blum & Mitchell 1999), Co-EM (Nigam & Ghani), ECo-Train (Ghani, 2002)

The EM Algorithm
- Learn from labeled data (Naïve Bayes)
- Estimate labels
- Probabilistically add to labeled data

A Peek at the Learned Models
Not Conservative (Flashy): rose, special, leopard, chemise, straps, flirty, spray, silk, platform
Extremely Conservative: lauren, ralph, breasted, seasonless, trouser, jones, sport, classic, blazer

Bias Slip Dress
The perfect black dress gets flirty and feminine in the bias-cut slip dress with sheer ruffled cap sleeves. A low, scoop neck and back is ultra-flattering while a draped, romantic fit reveals total elegance.

Lauren Single-Breasted Blazer
Sporty elegance and classic Gatsby-esque styling are captured in this impeccably designed single-breasted, three-button blazer from Lauren by Ralph Lauren. With traditional notch collar, signature button hardware, front flap pockets, and signature crest on left breast pocket.

A Peek at the Learned Models
Informal: jean, tommy, denim, sweater, pocket, neck, tee, hilfiger
Formal: jacket, fully, button, skirt, lines, seam, crepe, leather

Polo Jeans Co. Muscle Logo Tee
Strut your stuff in the Muscle Logo Tee. Flattering on the arms with a close-to-the-body fit, classic crewneck and shimmery logo print with stars. A sporty new basic for your tee collection.

BLACK TRIACETATE JACKET
A fresh alternative to classic suiting. Wear open for cardigan effect, buttoned for a clean look. Hidden placket with four tonal buttons and a hook-and-eye closure at the collar. Falls to hip. Lined.

A Peek at the Learned Models
Loungewear: chemise, silk, kimono, calvin, klein, august, lounge, hilfiger, robe, gown
Partywear: rock, dress, sateen, length:, skirt, shirtdress, open, platform, plaid, flower

ABS by Allen Schwartz Asymmetrical Dress
Just for the party girl with a big feminine streak. A ruffled one-shoulder cuts diagonally across the front and back. Accented with a rhinestone detail on the shoulder.

A Peek at the Learned Models
Juniors: jrs, dkny, jeans, tee, collegiate, logo, tommy, polo, short, sneaker
Extremely Sporty: sneaker, camp, base, rubber, sole, white, miraclesuit, athletic, nylon, Mesh

DKNY Jeans Jrs. Mesh Jersey Sweater
An innovative take on the football jersey, the see-through mesh sweater is a fashion favorite among the sporty set. Denim appliqué

Populating the Knowledge Base
- New Product Descriptions
- Learned Statistical Models
- Product descriptions automatically marked up with attribute values
- Product Semantics Knowledge Base

Recommender System
- Retailer’s Web Site
- Extracted Descriptions of Products Browsed
- Product Semantics Knowledge Base
- Learned Statistical Models
- Evolving User Profile

Advantages over Traditional Recommendation Systems
This approach gives us some of the underlying attributes that characterize a customer’s preferences. We can therefore begin to explain the preference rather than simply rely on the co-occurrence of purchases (e.g. people who bought x also bought y).
This helps with:
- Handling new products/rapidly changing products
- Low-Frequency Products
- Cross-Category Recommendations

Cross-Category Recommendations
- Difficult for collaborative filtering and content-based systems
- Build a model of the user: personality, stylistic attributes
- Taste in clothing might also be suggestive of taste in other products, say furniture and home decoration
- Create models for different product classes and create mappings among these models

Summary
- “Understand” a product and hence the customer
- Use Text Learning (supervised and semi-supervised) to abstract from product (description) to subjective, domain-specific features
- Effective for new (and low-frequency) products and for cross-category recommendations
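
Illustrative sketch
The text-classification and EM slides describe estimating P(word | attribute value) from rated items, scoring new descriptions with P(attribute value | item description), and folding unlabeled descriptions back in probabilistically. The following is a minimal Python sketch of that idea for a single attribute, not the authors' implementation. The function names (train_nb, posterior, classify, em), the Laplace smoothing constant alpha, the number of EM iterations, and the toy descriptions (built from words shown on the slides) are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_nb(docs, soft_labels, classes, alpha=1.0):
    """Fit multinomial Naive Bayes from documents with fractional class weights.

    docs: list of word lists; soft_labels: list of {class: weight} dicts.
    alpha is an assumed Laplace smoothing constant.
    """
    vocab = {w for d in docs for w in d}
    prior = {c: alpha for c in classes}          # smoothed class mass
    counts = {c: Counter() for c in classes}     # weighted word counts per class
    for words, weights in zip(docs, soft_labels):
        for c in classes:
            w_c = weights.get(c, 0.0)
            prior[c] += w_c
            for w in words:
                counts[c][w] += w_c
    total = sum(prior.values())
    model = {"classes": classes, "vocab": vocab, "log_prior": {}, "log_like": {}}
    for c in classes:
        model["log_prior"][c] = math.log(prior[c] / total)
        denom = sum(counts[c].values()) + alpha * len(vocab)
        # log P(word | attribute value)
        model["log_like"][c] = {w: math.log((counts[c][w] + alpha) / denom) for w in vocab}
    return model

def posterior(model, words):
    """Return P(attribute value | description) for every attribute value."""
    scores = {}
    for c in model["classes"]:
        s = model["log_prior"][c]
        for w in words:
            if w in model["vocab"]:              # skip words never seen in training
                s += model["log_like"][c][w]
        scores[c] = s
    top = max(scores.values())                   # subtract max for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def classify(model, words):
    """Pick the attribute value with the highest posterior probability."""
    post = posterior(model, words)
    return max(post, key=post.get)

def em(labeled, unlabeled, classes, iterations=5):
    """Learn from labeled data, estimate labels, add them back probabilistically."""
    docs_l = [d for d, _ in labeled]
    hard = [{y: 1.0} for _, y in labeled]
    model = train_nb(docs_l, hard, classes)                         # initial supervised model
    for _ in range(iterations):
        soft = [posterior(model, d) for d in unlabeled]             # E-step: estimate labels
        model = train_nb(docs_l + unlabeled, hard + soft, classes)  # M-step: retrain
    return model

# Toy data (hypothetical): one attribute with two values.
classes = ["flashy", "conservative"]
labeled = [
    ("flirty leopard straps chemise".split(), "flashy"),
    ("classic blazer breasted trouser".split(), "conservative"),
]
unlabeled = [
    "flirty silk platform".split(),
    "classic seasonless blazer".split(),
    "silk leopard spray".split(),
]
model = em(labeled, unlabeled, classes)
print(classify(model, "silk platform spray".split()))
```

In this toy setup the words "silk", "platform" and "spray" never appear in the labeled items, so a purely supervised model has no evidence about them; the EM loop picks up their association with labeled words through the unlabeled descriptions. A full system along the lines of the slides would train one such classifier per attribute type.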