E-commerce Data Prep with Jupyter Notebook

6/28/22, 10:05 PM Data Preparation + Data input Pre Pre Processing(Merging JSON files to make a data frame out of them) - Jupyter Notebook

In [1]:
import json
import pandas as pd
import numpy as np

Loading the CSV will report errors because, in "productDisplayName", some items take 11 cells while the file has only 10 columns (that is, 10 cells per row). That is acceptable here: only the product ids are needed.

In [2]:
data = pd.read_csv('styles.csv', error_bad_lines=False)

Skipping line 6044: expected 10 fields, saw 11
Skipping line 6569: expected 10 fields, saw 11
Skipping line 7399: expected 10 fields, saw 11
Skipping line 7939: expected 10 fields, saw 11
Skipping line 9026: expected 10 fields, saw 11
Skipping line 10264: expected 10 fields, saw 11
Skipping line 10427: expected 10 fields, saw 11
Skipping line 10905: expected 10 fields, saw 11
Skipping line 11373: expected 10 fields, saw 11
Skipping line 11945: expected 10 fields, saw 11
Skipping line 14112: expected 10 fields, saw 11
Skipping line 14532: expected 10 fields, saw 11
Skipping line 15076: expected 10 fields, saw 12
Skipping line 29906: expected 10 fields, saw 11
Skipping line 31625: expected 10 fields, saw 11
Skipping line 33020: expected 10 fields, saw 11
Skipping line 35748: expected 10 fields, saw 11
Skipping line 35962: expected 10 fields, saw 11
Skipping line 37770: expected 10 fields, saw 11
Skipping line 38105: expected 10 fields, saw 11
Skipping line 38275: expected 10 fields, saw 11
Skipping line 38404: expected 10 fields, saw 12

In [3]:
data.head()

Out[3]:
      id gender masterCategory subCategory  articleType baseColour  season    year   usage
0  15970    Men        Apparel     Topwear       Shirts  Navy Blue    Fall  2011.0  Casual
1  39386    Men        Apparel  Bottomwear        Jeans       Blue  Summer  2012.0  Casual
2  59263  Women    Accessories     Watches      Watches     Silver  Winter  2016.0  Casual
3  21379    Men        Apparel  Bottomwear  Track Pants      Black    Fall  2011.0  Casual
4  53759    Men        Apparel     Topwear      Tshirts       Grey  Summer  2012.0  Casual
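The `error_bad_lines` argument used in In [2] was deprecated in pandas 1.3 and removed in pandas 2.0. A minimal sketch of the replacement, `on_bad_lines="skip"`, follows. The three-column CSV text and the product names in it are made up for illustration; only the two ids come from the output above.

```python
import io

import pandas as pd

# Toy stand-in for styles.csv (hypothetical rows). The last row has an
# unquoted comma in its final field, so it parses as 4 fields instead of 3.
raw = (
    "id,gender,productDisplayName\n"
    "15970,Men,Sample Shirt\n"
    "39386,Men,Sample Jeans\n"
    "99999,Women,Sample Top, Pack of 2\n"
)

# on_bad_lines="skip" drops rows with too many fields (pandas >= 1.3).
data = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(data["id"].tolist())
```

Passing `on_bad_lines="warn"` instead should report each skipped row, which is closer to the messages recorded above.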
In [4]:
data

Out[4]:
          id gender masterCategory subCategory            articleType baseColour  season    year   us
0      15970    Men        Apparel     Topwear                 Shirts  Navy Blue    Fall  2011.0  Cas
1      39386    Men        Apparel  Bottomwear                  Jeans       Blue  Summer  2012.0  Cas
2      59263  Women    Accessories     Watches                Watches     Silver  Winter  2016.0  Cas
3      21379    Men        Apparel  Bottomwear            Track Pants      Black    Fall  2011.0  Cas
4      53759    Men        Apparel     Topwear                Tshirts       Grey  Summer  2012.0  Cas
...      ...    ...            ...         ...                    ...        ...     ...     ...  ...
44419  17036    Men       Footwear       Shoes           Casual Shoes      White  Summer  2013.0  Cas
44420   6461    Men       Footwear  Flip Flops             Flip Flops        Red  Summer  2011.0  Cas
44421  18842    Men        Apparel     Topwear                Tshirts       Blue    Fall  2011.0  Cas
44422  46694  Women  Personal Care   Fragrance  Perfume and Body Mist       Blue  Spring  2017.0  Cas
44423  51623  Women    Accessories     Watches                Watches       Pink  Winter  2016.0  Cas
44424 rows × 10 columns

In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 44424 entries, 0 to 44423
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   id                  44424 non-null  int64
 1   gender              44424 non-null  object
 2   masterCategory      44424 non-null  object
 3   subCategory         44424 non-null  object
 4   articleType         44424 non-null  object
 5   baseColour          44409 non-null  object
 6   season              44403 non-null  object
 7   year                44423 non-null  float64
 8   usage               44107 non-null  object
 9   productDisplayName  44417 non-null  object
dtypes: float64(1), int64(1), object(8)
memory usage: 3.4+ MB

In [6]:
data.describe()

Out[6]:
                 id          year
count  44424.000000  44423.000000
mean   29696.334301   2012.806497
std    17049.490518      2.126480
min     1163.000000   2007.000000
25%    14768.750000   2011.000000
50%    28618.500000   2012.000000
75%    44683.250000   2015.000000
max    60000.000000   2019.000000

In [7]:
JSON_files=data['id'].values

In [8]:
JSON_files

Out[8]:
array([15970, 39386, 59263, ..., 18842, 46694, 51623], dtype=int64)

In [9]:
JSON_files.sort()

In [10]:
JSON_files

Out[10]:
array([ 1163,  1164,  1165, ..., 59998, 59999, 60000], dtype=int64)

In [11]:
JSON_files=list(JSON_files)

In [12]:
JSON_files

Out[12]:
[1163, 1164, 1165, 1525, 1526, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1541, 1542

In [13]:
JSON_files_Names=[]

Convert the ids to strings so they can be used as references to the JSON file names.

In [14]:
for i in JSON_files:
    JSON_files_Names.append(str(i))

In [15]:
JSON_files_Names

Out[15]:
['1163', '1164', '1165', '1525', '1526', '1528', '1529', '1530', '1531', '1532', '1533', '1534', '1535', '1536', '1537', '1538', '1539', '1540', '1541', '1542'

Let's test opening one file.

In [53]:
df=pd.read_json("1163.json")

In [54]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 43 entries, code to styleOptions
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   notification  0 non-null      float64
 1   meta          2 non-null      object
 2   data          41 non-null     object
dtypes: float64(1), object(2)
memory usage: 1.3+ KB
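One caution about cells In [7] to In [14]: `JSON_files.sort()` sorts in place, and depending on the pandas version and column dtypes, `.values` may hand back a view of the frame's own data, in which case the `id` column could be reordered relative to the other columns. The sketch below, on a toy frame built from three rows of the output above, sorts a copy instead and builds the string names in one step. The names `json_ids` and `json_file_names` are illustrative, not from the notebook.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for `data` (three rows, deliberately unsorted).
data = pd.DataFrame(
    {"id": [59263, 15970, 39386], "gender": ["Women", "Men", "Men"]}
)

# np.sort returns a sorted copy, so the frame's id column is left untouched.
json_ids = np.sort(data["id"].to_numpy())

# Convert to strings for use as file-name stems ("<id>.json").
json_file_names = [str(i) for i in json_ids]
print(json_file_names)
```

In this notebook only the ids are used afterwards, so the in-place sort is probably harmless; the copy simply removes the question.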
In [55]:
df

Out[55]:
                    notification  meta                                  data
code                NaN           200                                   NaN
requestId           NaN           4cf4c56d-2941-4012-b1d3-10f7c762a126  NaN
id                  NaN           NaN                                   1163
price               NaN           NaN                                   895
discountedPrice     NaN           NaN                                   895
styleType           NaN           NaN                                   DEL
productTypeId       NaN           NaN                                   219
articleNumber       NaN           NaN                                   409962-480-895
visualTag           NaN           NaN
productDisplayName  NaN           NaN                                   Nike Sahara Team India Fanwear Round Neck Jersey
variantName         NaN           NaN                                   Roundneck Jersey
myntraRating        NaN           NaN                                   1
catalogAddDate      NaN           NaN                                   1461658417
brandName           NaN           NaN                                   Nike
ageGroup            NaN           NaN                                   Adults-Men
gender              NaN           NaN                                   Men
baseColour          NaN           NaN                                   Blue
colour1             NaN           NaN                                   NA
colour2             NaN           NaN                                   NA
fashionType         NaN           NaN                                   Fashion
season              NaN           NaN                                   Summer
year                NaN           NaN                                   2011
usage               NaN           NaN                                   Sports
vat                 NaN           NaN                                   5.5
displayCategories   NaN           NaN                                   Sports Wear,Sale
weight              NaN           NaN                                   0
navigationId        NaN           NaN                                   0
landingPageUrl      NaN           NaN                                   Tshirts/Nike/Nike-Sahara-Team-India-Fanwear-Ro...
articleAttributes   NaN           NaN                                   {'Fit': 'Regular Fit', 'Fabric 3': 'NA', 'Body...
crossLinks          NaN           NaN                                   [{'key': 'More Tshirts by Nike', 'value': 'tsh...
brandUserProfile    NaN           NaN                                   {'uidx': '6d415071.a389.472b.b5b6.c93d864afbea...
codEnabled          NaN           NaN                                   True
styleImages         NaN           NaN                                   {'default': {'imageURL': 'http://assets.myntas...
lookGoodAlbum       NaN           NaN                                   {}
style360Images      NaN           NaN                                   {}
masterCategory      NaN           NaN                                   {'id': 9, 'typeName': 'Apparel', 'active': Tru...
subCategory         NaN           NaN                                   {'id': 31, 'typeName': 'Topwear', 'active': Tr...
articleType         NaN           NaN                                   {'id': 90, 'typeName': 'Tshirts', 'active': Tr...
isEMIEnabled        NaN           NaN                                   True
otherFlags          NaN           NaN                                   [{'dataType': 'BOOLEAN', 'name': 'isFragile', ...
articleDisplayAttr  NaN           NaN                                   {'id': 90, 'core': {'order': '0', 'display': '...
productDescriptors  NaN           NaN                                   {'materials_care_desc': {'descriptorType': 'ma...
styleOptions        NaN           NaN                                   [{'id': 8289, 'name': 'Size', 'value': 'XXS', ...

In [56]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 43 entries, code to styleOptions
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   notification  0 non-null      float64
 1   meta          2 non-null      object
 2   data          41 non-null     object
dtypes: float64(1), object(2)
memory usage: 1.3+ KB

In [57]:
df1=df.drop(columns=['notification','meta'])
df1

Out[57]:
                    data
code                NaN
requestId           NaN
id                  1163
price               895
discountedPrice     895
styleType           DEL
productTypeId       219
articleNumber       409962-480-895
visualTag
productDisplayName  Nike Sahara Team India Fanwear Round Neck Jersey
variantName         Roundneck Jersey
myntraRating        1
catalogAddDate      1461658417
brandName           Nike
ageGroup            Adults-Men
gender              Men
baseColour          Blue
colour1             NA
colour2             NA
fashionType         Fashion
season              Summer
year                2011
usage               Sports
vat                 5.5
displayCategories   Sports Wear,Sale
weight              0
navigationId        0
landingPageUrl      Tshirts/Nike/Nike-Sahara-Team-India-Fanwear-Ro...
articleAttributes   {'Fit': 'Regular Fit', 'Fabric 3': 'NA', 'Body...
crossLinks          [{'key': 'More Tshirts by Nike', 'value': 'tsh...
brandUserProfile    {'uidx': '6d415071.a389.472b.b5b6.c93d864afbea...
codEnabled          True
styleImages         {'default': {'imageURL': 'http://assets.myntas...
lookGoodAlbum       {}
style360Images      {}
masterCategory      {'id': 9, 'typeName': 'Apparel', 'active': Tru...
subCategory         {'id': 31, 'typeName': 'Topwear', 'active': Tr...
articleType         {'id': 90, 'typeName': 'Tshirts', 'active': Tr...
isEMIEnabled        True
otherFlags          [{'dataType': 'BOOLEAN', 'name': 'isFragile', ...
articleDisplayAttr  {'id': 90, 'core': {'order': '0', 'display': '...
productDescriptors  {'materials_care_desc': {'descriptorType': 'ma...
styleOptions        [{'id': 8289, 'name': 'Size', 'value': 'XXS', ...

In [58]:
df1=df1.T
df1.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1 entries, data to data
Data columns (total 43 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   code                0 non-null      object
 1   requestId           0 non-null      object
 2   id                  1 non-null      object
 3   price               1 non-null      object
 4   discountedPrice     1 non-null      object
 5   styleType           1 non-null      object
 6   productTypeId       1 non-null      object
 7   articleNumber       1 non-null      object
 8   visualTag           1 non-null      object
 9   productDisplayName  1 non-null      object
 10  variantName         1 non-null      object
 11  myntraRating        1 non-null      object
 12  catalogAddDate      1 non-null      object
 13  brandName           1 non-null      object
 14  ageGroup            1 non-null      object
 15  gender              1 non-null      object
 16  baseColour          1 non-null      object
 17  colour1             1 non-null      object
 18  colour2             1 non-null      object
 19  fashionType         1 non-null      object
 20  season              1 non-null      object
 21  year                1 non-null      object
 22  usage               1 non-null      object
 23  vat                 1 non-null      object
 24  displayCategories   1 non-null      object
 25  weight              1 non-null      object
 26  navigationId        1 non-null      object
 27  landingPageUrl      1 non-null      object
 28  articleAttributes   1 non-null      object
 29  crossLinks          1 non-null      object
 30  brandUserProfile    1 non-null      object
 31  codEnabled          1 non-null      object
 32  styleImages         1 non-null      object
 33  lookGoodAlbum       1 non-null      object
 34  style360Images      1 non-null      object
 35  masterCategory      1 non-null      object
 36  subCategory         1 non-null      object
 37  articleType         1 non-null      object
 38  isEMIEnabled        1 non-null      object
 39  otherFlags          1 non-null      object
 40  articleDisplayAttr  1 non-null      object
 41  productDescriptors  1 non-null      object
 42  styleOptions        1 non-null      object
dtypes: object(43)
memory usage: 460.0+ bytes

In [59]:
df1

Out[59]:
      code  requestId  id    price  discountedPrice  styleType  productTypeId  articleNumber
data  NaN   NaN        1163  895    895              DEL        219            409962-480-895
[remaining columns, starting with visualTag, are cut off in the printout]
1 rows × 43 columns

In [60]:
df1 = df1.dropna(axis=1)
df1

Out[60]:
      id    price  discountedPrice  styleType  productTypeId  articleNumber
data  1163  895    895              DEL        219            409962-480-895
[remaining columns, starting with visualTag, are cut off in the printout]
1 rows × 41 columns

In [61]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1 entries, data to data
Data columns (total 41 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   id                  1 non-null      object
 1   price               1 non-null      object
 2   discountedPrice     1 non-null      object
 3   styleType           1 non-null      object
 4   productTypeId       1 non-null      object
 5   articleNumber       1 non-null      object
 6   visualTag           1 non-null      object
 7   productDisplayName  1 non-null      object
 8   variantName         1 non-null      object
 9   myntraRating        1 non-null      object
 10  catalogAddDate      1 non-null      object
 11  brandName           1 non-null      object
 12  ageGroup            1 non-null      object
 13  gender              1 non-null      object
 14  baseColour          1 non-null      object
 15  colour1             1 non-null      object
 16  colour2             1 non-null      object
 17  fashionType         1 non-null      object
 18  season              1 non-null      object
 19  year                1 non-null      object
 20  usage               1 non-null      object
 21  vat                 1 non-null      object
 22  displayCategories   1 non-null      object
 23  weight              1 non-null      object
 24  navigationId        1 non-null      object
 25  landingPageUrl      1 non-null      object
 26  articleAttributes   1 non-null      object
 27  crossLinks          1 non-null      object
 28  brandUserProfile    1 non-null      object
 29  codEnabled          1 non-null      object
 30  styleImages         1 non-null      object
 31  lookGoodAlbum       1 non-null      object
 32  style360Images      1 non-null      object
 33  masterCategory      1 non-null      object
 34  subCategory         1 non-null      object
 35  articleType         1 non-null      object
 36  isEMIEnabled        1 non-null      object
 37  otherFlags          1 non-null      object
 38  articleDisplayAttr  1 non-null      object
 39  productDescriptors  1 non-null      object
 40  styleOptions        1 non-null      object
dtypes: object(41)
memory usage: 444.0+ bytes

In [62]:
pd.set_option('max_columns', None)

In [63]:
df1

Out[63]:
      id    price  discountedPrice  styleType  productTypeId  articleNumber
data  1163  895    895              DEL        219            409962-480-895
[remaining columns, starting with visualTag, are cut off in the printout]

In [75]:
df1.articleAttributes.values

Out[75]:
array([{'Fit': 'Regular Fit', 'Fabric 3': 'NA', 'Body or Garment Size': 'Garment Measurements in', 'Occasion': 'Sports'}], dtype=object)

In [66]:
TBDF_articleAttributes=df1.articleAttributes.values.tolist()
DF_articleAttributes=pd.DataFrame(TBDF_articleAttributes)
DF_articleAttributes

Out[66]:
   Fit          Fabric 3  Body or Garment Size     Occasion
0  Regular Fit  NA        Garment Measurements in  Sports

In [28]:
data = {'id': [1],
        'name': ['Nike'],
        'logo':["httpsgadga"],
        'top':[0],
        'slug':[0],
        'meta_title':[0],
        'meta_description':[0],
        'created_at':[0],
        'updated_at':[0],
        }
df = pd.DataFrame(data)
print (df)

   id  name        logo  top  slug  meta_title  meta_description  created_at  \
0   1  Nike  httpsgadga    0     0           0                 0           0

   updated_at
0           0

In [29]:
df

Out[29]:
   id  name  logo        top  slug  meta_title  meta_description  created_at  updated_at
0  1   Nike  httpsgadga  0    0     0           0                 0           0

In [30]:
print(df1['brandName'].unique())

['Nike']

In [31]:
df=pd.read_json("./styles/1163.json")
df1=df.drop(columns=['notification','meta'])
df1=df1.T
df1 =df1.dropna(axis=1)
df1

Out[31]:
      id    price  discountedPrice  styleType  productTypeId  articleNumber
data  1163  895    895              DEL        219            409962-480-895
[remaining columns, starting with visualTag, are cut off in the printout]

In [32]:
JSON_files_Names

Out[32]:
['1163', '1164', '1165', '1525', '1526', '1528', '1529', '1530', '1531', '1532', '1533', '1534', '1535', '1536', '1537', '1538', '1539', '1540', '1541', '1542'

In [33]:
len(JSON_files_Names)

Out[33]:
44424

In [34]:
Semi_proccessed_Complete_Jdata=df1
Semi_proccessed_Complete_Jdata

Out[34]:
      id    price  discountedPrice  styleType  productTypeId  articleNumber
data  1163  895    895              DEL        219            409962-480-895
[remaining columns, starting with visualTag, are cut off in the printout]

In [35]:
for i in range(1,len(JSON_files_Names)):
    df=pd.read_json("./styles/"+JSON_files_Names[i]+".json")
    df1=df.drop(columns=['notification','meta'])
    df1=df1.T
    df1 =df1.dropna(axis=1)
    Semi_proccessed_Complete_Jdata=Semi_proccessed_Complete_Jdata.append(df1)
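The loop in In [35] grows the frame with `DataFrame.append`, which copies the whole frame on every iteration and was removed in pandas 2.0. A minimal sketch of an alternative is to collect each file's `data` object in a list and build the frame once. It assumes every file has the top-level keys `notification`, `meta` and `data` seen in Out[55]; the helper name `load_style_records` and the two tiny demo files are hypothetical.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_style_records(styles_dir, names):
    """Read <name>.json files and return one row per file from its "data" object."""
    records = []
    for name in names:
        with open(Path(styles_dir) / f"{name}.json", encoding="utf-8") as fh:
            payload = json.load(fh)
        # Keep only the product fields; "notification" and "meta" are skipped.
        records.append(payload["data"])
    return pd.DataFrame.from_records(records)


# Demo on two tiny made-up files shaped like the real ones.
with tempfile.TemporaryDirectory() as tmp:
    for pid, price in [(1163, 895), (1164, 1595)]:
        doc = {"notification": {}, "meta": {"code": 200},
               "data": {"id": pid, "price": price}}
        (Path(tmp) / f"{pid}.json").write_text(json.dumps(doc), encoding="utf-8")
    combined = load_style_records(tmp, ["1163", "1164"])

print(combined)
```

Because only `payload["data"]` is read, the `code` and `requestId` entries never enter the frame, so the drop, transpose and `dropna` steps are not needed in this variant. Nested values such as `articleAttributes` stay as dictionaries, as they do in the original loop.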
In [36]:
Semi_proccessed_Complete_Jdata

Out[36]:
      id     price  discountedPrice  styleType  productTypeId  articleNumber
data  1163   895    895              DEL        219            409962-480-895
data  1164   1595   1595             P          289            Nike Sahara Jersey
data  1165   2495   2495             D          219            Nike Jersey
data  1525   1299   1299             P          597            6818802
data  1526   1299   1299             P          294            6814201
...   ...    ...    ...              ...        ...            ...
data  59995  4300   2150             P          379            SR370-Black
data  59996  3400   1700             P          379            SR394PURPLE MIX59996
data  59998  1395   1395             P          445            9861EMT
data  59999  1595   1595             P          445            9874BX
data  60000  499    499              DEL        364            AJ0056 Blue
[remaining columns, starting with visualTag, are cut off in the printout]
44424 rows × 47 columns

In [85]:
Semi_proccessed_Complete_Jdata.head()

Out[85]:
   Unnamed: 0  id    price   discountedPrice  styleType  productTypeId  articleNumber       visualTa
0  data        1163  895.0   895.0            DEL        219            409962-480-895      Na
1  data        1164  1595.0  1595.0           P          289            Nike Sahara Jersey  eoss:PREMIU
2  data        1165  2495.0  2495.0           D          219            Nike Jersey         Na
3  data        1525  1299.0  1299.0           P          597            6818802             Na
4  data        1526  1299.0  1299.0           P          294            6814201             Na
5 rows × 48 columns

In [70]:
Semi_proccessed_Complete_Jdata.to_csv('Semi_proccessed_Complete_Jdata.csv')

In [4]:
Semi_proccessed_Complete_Jdata=pd.read_csv('Semi_proccessed_Complete_Jdata.csv',l
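The column count rises from 47 in Out[36] to 48 in Out[85] because `to_csv` writes the index (here the repeated label `data`) as an unnamed first column, which comes back as `Unnamed: 0` on reload. A small sketch of the round trip with and without `index=False`, on a two-row toy frame in place of the full table:

```python
import io

import pandas as pd

# Toy frame with the same repeated "data" index as the merged table.
frame = pd.DataFrame(
    {"id": [1163, 1164], "price": [895, 1595]}, index=["data", "data"]
)

# Default: the index is written out and reloads as "Unnamed: 0".
with_index = pd.read_csv(io.StringIO(frame.to_csv()))

# index=False leaves it out, so the column count survives the round trip.
without_index = pd.read_csv(io.StringIO(frame.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

If the file has already been written with the index, passing `index_col=0` to `read_csv` is another way to avoid the extra column.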
In [5]:
Semi_proccessed_Complete_Jdata.tail()

Out[5]:
       Unnamed: 0  id     price   discountedPrice  styleType  productTypeId  articleNumber         visualT
44419  data        59995  4300.0  2150.0           P          379            SR370-Black           Na
44420  data        59996  3400.0  1700.0           P          379            SR394PURPLE MIX59996  Na
44421  data        59998  1395.0  1395.0           P          445            9861EMT               Na
44422  data        59999  1595.0  1595.0           P          445            9874BX                Na
44423  data        60000  499.0   499.0            DEL        364            AJ0056 Blue           Na
5 rows × 48 columns

In [6]:
BrandName=Semi_proccessed_Complete_Jdata['brandName'].unique()
BrandName

Out[6]:
array(['Nike', 'Puma', 'Quechua', 'Artengo', 'Kalenji', 'Kipsta', 'Inesis', 'Domyos', 'Decathlon', 'Nabaiji', 'Newfeel', 'Geonaute', 'Reebok', 'Lotto', 'Inkfruit', 'FIFA', 'ADIDAS', 'Lee', 'Adidas', 'Basics', 'Probase', 'Red Tape', 'Numero Uno', 'Carlton London', 'Murcia', 'Disney', 'Classic Polo', 'Ediots', 'ID', 'Lee Cooper', 'Mr. Men Little Miss', 'Mr.
Men', 'Catwalk', 'Do u speak green', 'Tantra', 'Guerrilla', 'ASICS', 'Myntra', 'Skechers', 'Converse', 'FILA', 'Status Quo', 'Crocs', 'Wrangler', 'Ed Hardy', 'Urban Yoga', 'Jealous 21', 'Spalding', 'Rockport', 'Mickey', 'Superman', 'Batman', 'DC Comics', 'Free Authority', 'Beatles', 'Pink Floyd', 'Smashing Pumpkins', 'Aerosmith', 'Marvel', 'Jimi Hendrix', 'Billy Idol', 'Rolling Stone', 'Nirvana', 'John Lenon', 'Wildcraft', 'Sher Singh', 'Levis Kids', 'Palm Tree', 'Gini and Jony', 'Being Human', 'DARK KNIGHT', 'LINKIN PARK', 'Queen', 'Ant', 'United Colors of Benetton', 'LOCOMOTIVE', 'HIGHLANDER', 'SPYKAR', 'Timberland', 'Forever New', 's.Oliver', 'SCULLERS FOR HER', 'SCULLERS', 'MEGADETH', 'W', 'Proline', 'Indigo Nation', 'Provogue', 'Fastrack', 'New Balance', 'Doodle', 'Mark Taylor', 'Regent Polo Club', 'John Miller', 'Buckaroo', 'Indian Terrain', 'Hush Puppies', 'Scholl', 'Little Miss', 'ESPRIT', 'Carrera', 'Clarks', 'Flying Machine', 'Vishudh', 'Playboy', 'Franco Leone', 'Ganuchi', 'AURELIA', 'Genesis', 'Reid & Taylor', 'Xoxo', 'Roadster', 'test', 'Mother Earth', 'Inc 5', 'Rocia', 'Chimp', 'Hanes', 'Belmonte', 'AND', 'Enroute Women', 'Vans', 'Arrow', 'New Hide', 'ADIDAS Originals', 'Arrow Sport', 'Hidekraft', 'Chhota Bheem', 'Belkin', 'Black coffee', 'Facit', 'Warner Bros', 'Tom & Jerry', 'Turtle', 'Mayhem', 'MTV', 'Tokyo Talkies', 'Enroute Men', 'Levis', 'Peter England', 'Spice Art', 'I DEE', 'Police', 'Image', 'GAS', 'Lino Perros', 'U.S. 
Polo Assn.', 'Pepe Jeans', 'CASIO', 'Beyouty', 'Speedo', 'CAT', 'Crusoe', 'DENI YO', 'Manchester United', 'Aneri', 'Wills Lifestyle', 'Undercolors of Benetton', 'VITAL Gear', 'GUESS', 'Nautica', 'DAVID BECKHAM', 'C Vox', 'Arrow Woman', 'Campbell', 'Quiksilver', 'ice watch', 'DIVA', 'Baggit', 'Tabac', '4711', 'FOOTLOOSE', 'Skybags', 'Allen Solly', 'Celine Dion', 'Louis Philippe', 'Pal Zileri', 'Van Heusen', 'roxy', 'KIARA', 'Enamor', 'Fossil', 'Biba', 'John Players', 'Global Desi', 'Woodland', '2go ACTIVE GEAR USA', 'maxima', 'Satya Paul', 'Hugo Boss', 'aramis', 'DKNY', 'dunhill', 'Nautilus', 'Baldessarini', 'BOSS', 'Nike Fragrances', 'Music', 'Ray-Ban', 'OAKLEY', 'LA-EMOTIO', 'Folklore', 'Pacific Gold', 'Mumbai Slang', 'AMERICAN TOURISTER', 'Femella', 'J. DEL POZO', 'JAGUAR', 'Paris Hilton', 'TOUS', 'Slazenger', 'Formula 1', 'PERRY ELLIS', 'Calvin Klein', 'DAVIDOFF', 'yelloe', 'Miss-T', 'JOVAN', 'pierre cardin', 'MISS SIXTY', 'Kylie Minogue', 'Jockey', 'CHE GUEVARA', 'BULCHEE', 'YARDLEY', 'Secret Temptation', 'Park Avenue', 'Wild stone', 'Fogg', '18+', 'Gatsby', 'Old Spice', 'Denizen', 'CABARELLI', 'vogue', 'GIORDANO', 'ASPEN', 'Kenneth Cole', 'SKAGEN', 'Lovable', 'OPIUM', 'Fabindia', 'Azzaro', 'Cartier', 'Dolce & Gabbana', 'Ferrari', 'Issey Miyake', 'Versace', 'Arrow New York', 'Titan', 'Heart 2 Heart', 'Q&Q', 'Tonino Lamborghini', 'ONLY', 'Levitate', 'iPanema', 'Grendha', 'Allen Solly Woman', 'Van Heusen Woman', 'Angry Birds', 'U.S. Polo Assn. 
Denim Co.', 'Sepia', 'Jack & Jones', 'Homme', 'Cobblerz', 'Timex', 'Pieces', 'Vero Moda', 'Citizen', 'Tonga', 'Allen Solly Kids', 'Be For Bag', 'Envirosax', 'Windsor', 'Coolers', 'Fortune', 'Suunto', 'Senorita', 'Enroute Teens', 'Stens by Enroute', 'Bata', 'Gliders', 'Force 10', 'Kelme', 'Jungle Book', 'Peperone', 'Salomon', 'Ben 10', 'OTLS', 'SDL by Sweet Dreams', 'Strapless', 'Spinn', 'Paridhan', 'Nina Ricci', 'Yves Saint Laurent', 'Burberry', 'Salvatore Ferragamo', 'Bulgari', 'Mont Blanc', 'Estee Lauder', 'Valentino Perfumes', 'Giorgio Armani', 'Ralph Lauren', 'Carolina Herrera', 'U.S. Polo Assn. Kids', 'Madagascar 3', 'Happy Socks', '24', 'RNC', 'Mineral', 'Miami Blues', 'Polaroid', 'Kids Ville', 'Hannah Montana', 'Joker', 'Latin Quarters', 'Red Chief', 'Helix', 'Bwitch', 'Tortoise', 'Span', 'SWAYAM', 'Remanika', 'Estd.
1977', 'Prafful', 'Estelle', 'Alma', 'Little Miss Intimates', 'French Connection', 'Amante', 'FCUK Underwear', 'Calvin Klein Innerwear', 'Calvin Klein Underwear', 'Swiss Army', 'Wilson', 'Royal Diadem', 'HUGO', 'Carlos Moya', 'Spinz', 'Footfun', 'Globalite', 'Kama Sutra', 'Casio Baby-G', 'Lomani', 'Adrika', 'Fusion Beats', '109F', 'Jacques M', 'Rasasi', 'Giorgio Beverly Hills', 'Paco Rabanne', 'Euroluxe', 'Rising Wave', 'Love Passport', 'Saint James', 'Barbie', 'F5', 'Umbro', 'York', 'Shree', 'Tiptopp', 'Portia', 'Avengers', 'The Amazing Spiderman', 'Pitaraa', 'Revv', 'Lucera', 'Miki Pearl', 'Deborah', 'Calzini', 'Parx', 'Stoln', 'Chromozome', 'Raymond', 'Hop Scotch', 'Kraus Jeans', 'Hidedge', 'Nyk', 'Inaya', 'Just Natural', 'Lencia', 'ToniQ', 'Red Rose', 'Mod-acc', 'Morellato', 'Just Cavalli', 'Fiorelli', 'JAG', 'FNF', 'Smugglerz', 'Ayaany', 'Garfield', 'Avon', 'F Sports', 'Sushilas', 'Ivory Tag', 'Lakme', 'Ponds', 'Smartoe', 'Revlon', 'Colorbar', 'Biara', 'Tommy Hilfiger', 'Streetwear', 'Olay', 'Horsefly', 'HM', 'Elle', 'Lotus Herbals', 'Biotique', 'Fruit of the loom', 'Rreverie', 'Rocky S', 'Alayna', 'FCUK', 'Hakashi', 'Taylor of London', 'Denim', 'Colour me', 'Cavallini', 'BRUT', 'Peri Peri', 'Avirate', 'Valley of Flowers'], dtype=object)

In [7]:
idt=[]

In [8]:
for i in range(1,425):
    idt.append(i)

In [9]:
idt

Out[9]:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20

In [10]:
len(idt)

Out[10]:
424

In [11]:
data={'id':idt, 'name':BrandName}

In [12]:
dftest = pd.DataFrame(data)

In [13]:
dftest

Out[13]:
     id   name
0    1    Nike
1    2    Puma
2    3    Quechua
3    4    Artengo
4    5    Kalenji
...  ...  ...
419  420  Cavallini
420  421  BRUT
421  422  Peri Peri
422  423  Avirate
423  424  Valley of Flowers
424 rows × 2 columns

from translate import Translator
translator= Translator(to_lang="Arabic")
translation = translator.translate("Nike")
print(translation)

from googletrans import Translator
translator = Translator()
Arabic_BrandName=[]
for i in BrandName:
    translation = translator.translate(i)
    Arabic_BrandName.append(translation)
Arabic_BrandName

In [14]:
result = translator.translate("Newfeel",src='en', dest='arabic')
result.text

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-e6322cfe5fca> in <module>
----> 1 result = translator.translate("Newfeel",src='en', dest='arabic')
      2 result.text

NameError: name 'translator' is not defined

In [15]:
result = translator.translate("Artengo",src='english', dest='arabic')
result.text

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-15-c6ef4e8326fa> in <module>
----> 1 result = translator.translate("Artengo",src='english', dest='arabic')
      2 result.text

NameError: name 'translator' is not defined

In [16]:
print(result.pronunciation)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-16-9243e0d409fd> in <module>
----> 1 print(result.pronunciation)

NameError: name 'result' is not defined

In [17]:
from deep_translator import GoogleTranslator
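The id column in In [8] hard-codes `range(1,425)`, which only works while there are exactly 424 brands. A minimal sketch that derives the range from the data, shown on five names taken from the array above; the variable names are illustrative. It also flags names that differ only by letter case, such as 'ADIDAS' and 'Adidas', which `unique()` treats as separate brands.

```python
import pandas as pd

# A few brand names from the array above, including one pair of case variants.
brand_names = pd.Series(["Nike", "Puma", "ADIDAS", "Lee", "Adidas"])

# ids run from 1 to however many brands there are.
brands = pd.DataFrame(
    {"id": range(1, len(brand_names) + 1), "name": brand_names}
)

# Flag names that collide once letter case is ignored.
lowered = brands["name"].str.lower()
case_variants = brands[lowered.duplicated(keep=False)]
print(case_variants)
```

Whether such variants should be merged is a data decision the notebook does not state, so the sketch only reports them.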
In [18]:
Arabic_BrandName=[]
for i in BrandName:
    if i.isnumeric():
        Arabic_BrandName.append(i)
    else:
        result =GoogleTranslator(source='en', target='ar').translate(i)
        Arabic_BrandName.append(result)

In [19]:
Arabic_BrandName

Out[19]:
['نايك',
 'بوما',
 'الكيتشوا',
 'أرتينجو',
 'كالينجي',
 'كيبستا',
 'إينيسيس',
 'دوميوس',
 'ديكاتلون',
 'نابيجي',
 'شعور جيد',
 'جيونوت',
 'ريبوك',
 'لوتو',
 'إنكفروت',
 'اتحاد كرة القدم',
 'شركة اديداس',
 'لي',
 'شركة اديداس',

In [20]:
result =GoogleTranslator(source='en', target='ar').translate("b24")

In [21]:
result

Out[21]:
'ب 24'

In [22]:
import requests

In [23]:
scrapeKeywords=[]
for i in BrandName:
    scrapeKeywords.append(i.replace(" ","+"))

In [24]:
scrapeKeywords

Out[24]:
['Nike', 'Puma', 'Quechua', 'Artengo', 'Kalenji', 'Kipsta', 'Inesis', 'Domyos', 'Decathlon', 'Nabaiji', 'Newfeel', 'Geonaute', 'Reebok', 'Lotto', 'Inkfruit', 'FIFA', 'ADIDAS', 'Lee', 'Adidas',

In [25]:
response=requests.get("https://www.google.com/search?q=taylor+of+london+logo&tbm=

In [26]:
data

Out[26]:
{'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20

In [27]:
from bs4 import BeautifulSoup

In [28]:
soup= BeautifulSoup(response.content,"html.parser")
soup

Out[28]:
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
<html dir="rtl" lang="ar" xmlns="http://www.w3.org/1999/xhtml"><head><meta content="application/xhtml+xml; charset=utf-8" http-equiv="Content-Type"/><meta content="no-cache" name="Cache-Control"/><title dir="ltr">taylor of london logo - بحث Google</title><style>a{text-decoration:none;color:inherit}a:hover{text-decoration:underline}a img{border:0}body{font-family:Roboto,Helvetica,Arial,sans-serif;padding:8px;margin:0 auto;max-width:700px;min-width:240px;}.FbhRzb{border-right:thin solid #dadce0;border-left:thin solid #dadce0;border-top:thin solid #dadce0;height:40px;overflow:hidden}.n692Zd{margin-bottom:10px}.cvifge{height:40px;border-spacing:0}.QvGUP{height:40px;padding:0 8px 0 8px;vertical-align:top}.O4cRJf{height:40px;width:100%;padding:0;padding-left:16px}.O1ePr{height:40px;padding:0;vertical-align:top}.kgJEQe{height:36px;width:98px;vertical-align:top;margin-top:4px}.lXLRf{vertical-align:top}.MhzMZd{border:0;vertical-align:middle;font-size:14px;height:40px;padding:0;width:100%;padding-right:16px}.xB0fq{height:40px;border:none;font-size:14px;background-color:#4285f4;color:#fff;padding:0 16px;margin:0;vertical-align:top;cursor:pointer}.xB0fq:focus{border:1px solid #000}.M7pB2{border:thin solid #dadce0;margin:0 0 3px 0;font-size:13px;font-weight:500;height:40px}.euZec{width:100%;height:40p

In [29]:
soup.select('img[src^="https://encrypted-tbn0.gstatic.com/images"]')[0]['src']

Out[29]:
'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRj4QKatyLljpw3eCK0la48ovNCQfbvQbiZEGe_oKd0ofUbfK3PB-MFTEO2RQ&s'

In [30]:
BrandName_Logos=[]
for i in scrapeKeywords:
    response=requests.get("https://www.google.com/search?q="+i+"+logo&tbm=isch&ve
    soup= BeautifulSoup(response.content,"html.parser")
    logoSRC=soup.select('img[src^="https://encrypted-tbn0.gstatic.com/images"]')[
    BrandName_Logos.append(logoSRC)

In [31]:
len(BrandName_Logos)

Out[31]:
424

In [32]:
BrandName_Logos

Out[32]:
['https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQi_iLEOGN8vEqDnH3lObHHlYM0uwme1L7LmejHPDOqk2wi50G-itdJUlSmS54&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTHg2wTDFgZrmOTk0eRkggbBbajhWNOoVSTcyiPy0yUluiNdASfBBBDQX6wug&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcS9az2e0v7sgAQU1ObkG4xnBdXdwGITW0xERtkMpb2RpzUeUKz9hFzknh2SZw&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRgQSAZ8MiWdlh-lIA6kA9lnpAJHI79KWE4YdeSV3TK5Dns8RX967OSAEASyao&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRfRJKgaf74x6DiXbYvsPKpQqMqkrGOlEpAAOQghpSbmD2K8MzzF_mpvYFDL7g&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQFpuWbkk2be4cq0UH7l0DsgcmIQpktodh_JeOquBAc_HNOLL6OPKjRpMjW5g&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcT1cX3hOc8WYdfJIsRqy6tSxb_EOugkRewTOEEwjyPoECR8qjraz50zmslx3g&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQ0PS36o8zEDaT4CPoN_jDcM1V37UVJbRihorFfFC5rt7wCRQ-_NtBc4unfZpI&s',
 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQcfd-UV8vJ8ujvf3-jfi9yLDVwKETrH8ZsYNXQhd6Hk92_4y2P-gUL_FPTtQ&s',
 ...

In [33]:
top = []
top[:] = np.zeros
top

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-dd047583b9da> in <module>
      1 top=[]
----> 2 top[:]=np.zeros
      3 top

TypeError: can only assign an iterable
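The TypeError in cell 33 appears to come from assigning `np.zeros` itself, a function object, rather than the result of calling it; something like `np.zeros(424, dtype=int)` would have been an iterable. The sketch below shows one way to build the constant placeholder columns in plain Python without explicit append loops. The helper name `make_placeholders` and the variable `columns` are hypothetical and not part of the notebook; the column names and the count of 424 are taken from the notebook's cells. Each list is created separately, because an assignment such as `meta_description = meta_title` binds both names to the same list object, so a later change through one name would show up under the other.

```python
def make_placeholders(n_rows):
    """Build independent constant-filled columns of length n_rows."""
    # Column names as used in the notebook's brands dictionary.
    text_columns = ["slug", "meta_title", "meta_description",
                    "created_at", "updated_at"]
    # [0] * n_rows gives a list of n_rows zeros without an explicit loop.
    placeholders = {"top": [0] * n_rows}
    for column in text_columns:
        # A new list is created on every iteration, so no two columns
        # share the same list object.
        placeholders[column] = [""] * n_rows
    return placeholders


# 424 is the brand count reported by len(BrandName_Logos) in the notebook.
columns = make_placeholders(424)
print(len(columns["top"]), columns["top"][:5])
```

Because the columns are distinct lists, filling in `meta_title` later would leave `meta_description` untouched.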
In [34]:
top = []
for i in range(424):
    top.append(int(0))
top

Out[34]:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ...

In [35]:
len(top)

Out[35]:
424

In [36]:
slug = []
for i in range(424):
    slug.append("")

In [37]:
slug

Out[37]:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '' ...

In [38]:
meta_title = []
for i in range(424):
    meta_title.append("")

In [39]:
meta_title

Out[39]:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '' ...

In [40]:
meta_description = meta_title

In [41]:
meta_description

Out[41]:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '' ...

In [42]:
created_at = []
updated_at = []
for i in range(424):
    created_at.append("")
    updated_at.append("")

In [43]:
created_at

Out[43]:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '' ...

In [44]:
updated_at

Out[44]:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '' ...

In [45]:
brands = {'id': idt,
          'name': BrandName,
          'logo': BrandName_Logos,
          'top': top,
          'slug': slug,
          'meta_title': meta_title,
          'meta_description': meta_description,
          'created_at': created_at,
          'updated_at': updated_at}

In [46]:
brands

Out[46]:
{'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 ...

In [47]:
brands = pd.DataFrame(brands)

In [48]:
brands

Out[48]:
      id  name               logo                                               top
0      1  Nike               https://encrypted-tbn0.gstatic.com/images?q=tb...  0
1      2  Puma               https://encrypted-tbn0.gstatic.com/images?q=tb...  0
2      3  Quechua            https://encrypted-tbn0.gstatic.com/images?q=tb...  0
3      4  Artengo            https://encrypted-tbn0.gstatic.com/images?q=tb...  0
4      5  Kalenji            https://encrypted-tbn0.gstatic.com/images?q=tb...  0
...  ...  ...                ...                                                ...
419  420  Cavallini          https://encrypted-tbn0.gstatic.com/images?q=tb...  0
420  421  BRUT               https://encrypted-tbn0.gstatic.com/images?q=tb...  0
421  422  Peri Peri          https://encrypted-tbn0.gstatic.com/images?q=tb...  0
422  423  Avirate            https://encrypted-tbn0.gstatic.com/images?q=tb...  0
423  424  Valley of Flowers  https://encrypted-tbn0.gstatic.com/images?q=tb...  0

     slug  meta_title  meta_description  created_at  updated_at
...  ...   ...         ...               ...         ...

424 rows × 9 columns

In [49]:
brands.to_csv('brands.csv', index=False)

In [50]:
langarabic = []
for i in range(424):
    langarabic.append("Arabic")

In [51]:
brands_translations = {'id': idt,
                       'brand_id': idt,
                       'name': Arabic_BrandName,
                       'lang': langarabic,
                       'created_at': created_at,
                       'updated_at': updated_at}

In [52]:
brands_translations = pd.DataFrame(brands_translations)

In [53]:
brands_translations

Out[53]:
      id  brand_id  name  lang
0      1  1  نايك  Arabic
1      2  2  بوما  Arabic
2      3  3  الكيتشوا  Arabic
3      4  4  أرتينجو  Arabic
4      5  5  كالينجي  Arabic
...  ...  ...  ...  ...
419  420  420  كافاليني  Arabic
420  421  421  BRUT  Arabic
421  422  422  بيري بيري  Arabic
422  423  423  أفيرات  Arabic
423  424  424  وادي الزهور  Arabic

     created_at  updated_at
...  ...         ...

424 rows × 6 columns

In [54]:
brands_translations.to_csv('brand_translations.csv', index=False)

In [55]:
Semi_proccessed_Complete_Jdata.iloc[[0]].articleAttributes.values.tolist()

Out[55]:
["{'Fit': 'Regular Fit', 'Fabric 3': 'NA', 'Body or Garment Size': 'Garment Measurements in', 'Occasion': 'Sports'}"]

In [2]:
dfs = pd.read_csv("brand_translations.csv")

In [3]:
dfs

Out[3]:
      id  brand_id  name  lang  created_at  updated_at
0      1  1  نايك  Arabic  NaN  NaN
1      2  2  بوما  Arabic  NaN  NaN
2      3  3  الكيتشوا  Arabic  NaN  NaN
3      4  4  أرتينجو  Arabic  NaN  NaN
4      5  5  كالينجي  Arabic  NaN  NaN
...  ...  ...  ...  ...  ...  ...
419  420  420  كافاليني  Arabic  NaN  NaN
420  421  421  BRUT  Arabic  NaN  NaN
421  422  422  بيري بيري  Arabic  NaN  NaN
422  423  423  أفيرات  Arabic  NaN  NaN
423  424  424  وادي الزهور  Arabic  NaN  NaN

424 rows × 6 columns

In [11]:
dfs.to_csv("brand_translations_.csv", encoding="utf-8", index=False)

In [10]:
pd.read_csv("brand_translations.csv")

Out[10]:
      id  brand_id  name  lang  created_at  updated_at
0      1  1  نايك  Arabic  NaN  NaN
1      2  2  بوما  Arabic  NaN  NaN
2      3  3  الكيتشوا  Arabic  NaN  NaN
3      4  4  أرتينجو  Arabic  NaN  NaN
4      5  5  كالينجي  Arabic  NaN  NaN
...  ...  ...  ...  ...  ...  ...
419  420  420  كافاليني  Arabic  NaN  NaN
420  421  421  BRUT  Arabic  NaN  NaN
421  422  422  بيري بيري  Arabic  NaN  NaN
422  423  423  أفيرات  Arabic  NaN  NaN
423  424  424  وادي الزهور  Arabic  NaN  NaN

424 rows × 6 columns
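The round trip through `brand_translations.csv` turns the empty `created_at` and `updated_at` strings into NaN, because `pd.read_csv` treats empty fields as missing by default. The sketch below illustrates this on a hypothetical two-row stand-in for the translations frame and shows `keep_default_na=False` as one way to get the empty strings back. It also writes the file with `encoding="utf-8-sig"`, which prepends a byte-order mark; this is often suggested when Arabic text has to display correctly in spreadsheet programs, though whether the notebook's author needed that is an assumption. The frame `sample`, the file name `sample_translations.csv`, and the temporary directory are illustrative only.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical two-row stand-in for the brands_translations frame.
sample = pd.DataFrame({
    "id": [1, 2],
    "brand_id": [1, 2],
    "name": ["نايك", "BRUT"],
    "lang": ["Arabic", "Arabic"],
    "created_at": ["", ""],
    "updated_at": ["", ""],
})

# Serialize to an in-memory buffer to inspect the round trip.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Default read: empty fields are parsed as NaN.
default_read = pd.read_csv(io.StringIO(csv_text))

# With keep_default_na=False, empty fields stay empty strings.
strict_read = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# Writing to disk with a UTF-8 byte-order mark (illustrative file name).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_translations.csv")
    sample.to_csv(path, index=False, encoding="utf-8-sig")
    with open(path, "rb") as handle:
        raw_bytes = handle.read()
    bom_read = pd.read_csv(path, encoding="utf-8-sig", keep_default_na=False)

print(default_read["created_at"].isna().all())
print(strict_read["created_at"].tolist())
```

If the NaN values are acceptable downstream, the default read is fine; the option only matters when the empty strings themselves need to survive the round trip.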
