Dear Client,

The transactional dataset you provided contains 20002 rows and 3495 unique customer IDs, while the customer demographic and customer address datasets contain 4002 and 4001 rows respectively, each with the same number of unique IDs as rows.

I found the following issues with the datasets:

- The additional customer IDs in the transactional dataset could point to a loss of data.
- Some values within the same attribute were inconsistent.
- Data types also differed within the same attribute.
- Multiple columns were affected, such as online orders, brands and many more.

The next step in the process would be data cleaning. I just wanted to bring these discrepancies to your notice.

Regards,
Tanishi Gupta
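
P.S. For reference, the customer ID comparison behind the first issue can be sketched as follows. This is a minimal illustration only: the ID values are made up, and the names `transaction_customer_ids`, `demographic_customer_ids` and `ids_missing_from` are hypothetical, not taken from the actual datasets.

```python
# Hypothetical sample IDs for illustration only; these are not the real data.
transaction_customer_ids = [1, 2, 2, 3, 5000]
demographic_customer_ids = [1, 2, 3, 4]


def ids_missing_from(reference_ids, ids_to_check):
    """Return the sorted IDs in ids_to_check that do not appear in reference_ids."""
    return sorted(set(ids_to_check) - set(reference_ids))


# Row count vs. unique ID count, as reported for each dataset in the summary.
row_count = len(transaction_customer_ids)
unique_count = len(set(transaction_customer_ids))

# Customer IDs that appear in transactions but have no demographic record.
unmatched = ids_missing_from(demographic_customer_ids, transaction_customer_ids)

print(f"rows={row_count}, unique IDs={unique_count}, unmatched IDs={unmatched}")
```

Any ID returned by this check has transactions but no matching customer record, which is the kind of gap flagged in the first issue.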