SANDHAI – An E-shopping Service Aggregation Framework

Harikrishna Narayanan Pranesh Parimala Ranganathan Vijay Ramakrishnan Siva Subbiah
902533226
902505951
902446624
902538209
{harikrishna, pranesh, v.ramakrishnan, siva.subbiah}@gatech.edu
SANDHAI – E-shopping aggregation framework | CS8803 AIAD | Spring’09
Contents
Abstract:
Some Sandhai Features:
Architecture and Design:
Currently supported APIs:
Integration of Amazon APIs:
Integration of eBay APIs:
Recommendation System:
Apriori algorithm:
Database Design & UI Functionality
Architecture Diagram:
Testing and Evaluation:
Challenges:
Screenshots:
Future Work:
Conclusion:
Project Planning:
Acknowledgments:
References:
Abstract:
Sandhai means a common marketplace in Tamil (a South Indian language) where one can buy or
sell anything. Sandhai is an e-shopping search engine system. With the current boom in
e-commerce, integrating e-commerce search engines into a single point of service offers
several advantages, such as fast, cost-effective search and better quality of results. In
Sandhai we designed such an aggregation framework targeting major e-commerce service providers.
The background and use cases below give a better picture of what Sandhai is all about.
Background:
The background in service integration and e-commerce applications that led to this project is
explained through the two simple use cases below.
Sample Use case #1:
Consider a user U who wishes to buy a product P online. He has to search across several online
e-commerce services to find the best deal in terms of both cost and quality. The factors that
might influence his purchase include product cost, free offers, shipping charges and taxes. He
has to spend a considerable amount of time finding a good deal for the product. The user also
has to be aware of the various services and of other information, for example that products of
category "C" are better offered by website W1 and products of category "D" by website W2.
Nearly half of the people who settle on a deal through one online service later find a better
offer for the same product on a different service. If instead there were a consolidated system
that could talk to several online services and find the best offer among them, the user would
be happy to use it and would be much more satisfied with the deal he found.
Sample Use case #2:
The idea of buying new products, goods and gadgets often spreads within a circle of friends
when they meet or get together. Say A, B, C, D and E are friends who get together once in a
while for dinner. Suppose user A buys a product P at a good price. When the friends meet and
casually talk about the product, some of them might like it and wish to buy it on similar
terms, but by then the offer might have expired or become less favorable. Suppose instead
there were a system where users keep track of their wish lists and, once they buy an item,
check it off together with the details of the deal they used. Their friends could then be
notified through a publish-subscribe (Pub Sub) framework. In our system, users create and
maintain their wish lists, and friends can subscribe to a wish list item of a friend. When
user A buys a product on his wish list, he fills out the wish list completion, which publishes
the details of his purchase to all subscribed friends. The existing social network
infrastructure can also be used to accomplish this.
Other Popular Ecommerce Search Engines
• www.shopzilla.com
• www.kelkoo.com
• www.MyShoppingpal.com
• www.thefind.com
Some Sandhai Features:
• Easy integration of new web services
• Support for both SOAP and REST based product search APIs
• User-customized search tuning
• Collection of information about users' preferences and interests (for example favorite
  online shops, favorite brands and favorite colors)
Architecture and Design:
1) What to look for in any e-commerce service?
• Product data: information about product availability and pricing for items in the catalog.
• Content from customers: reviews and product lists.
• Seller information: general information and customer feedback about the wide range of
  vendors.
2) The system allows users to run a single master search that fans out across various
e-commerce players through their e-shopping API interfaces and helps users find the right
product.
3) Currently supported APIs:
4) Integration of Amazon APIs:
a. The API provides a well-defined mechanism for querying Amazon's product database. It
supports a variety of search queries over either REST or SOAP. The API exposes Amazon's
product data and e-commerce functionality. This allows developers, web site publishers
and others to leverage the data that Amazon uses to power its own business, and
potentially to make money as an Amazon affiliate.
b. Representational state transfer (REST) is a style of software architecture for
distributed hypermedia systems. REST refers to a collection of network architecture
principles which outline how resources are defined and addressed. The term is often used
more loosely to describe any simple interface which transmits domain-specific data over
HTTP without an additional messaging layer such as SOAP or session tracking via HTTP
cookies.
c. There are certain groups of APIs which can be used. Some of the popular groups
include BrowseNodeLookup, Customer Content, Items, List, Cart, Third Party Listings and
TransactionLookup. The main restrictions are that only one call can be made per IP
address per second, and that there is a limit on how long the data can be cached locally.
d. Amazon provides a WSDL so that platforms such as .NET can generate client code against
the service. A WSDL (Web Service Description Language) file is an XML document that
defines the operations, parameters, requests and responses used in web service
interactions. It acts as the contract that defines the language and grammar used by web
service clients and servers. The Amazon Associates Web Service WSDL, for example, lists
all of the Amazon Associates Web Service operation names, parameters, and request and
response structures.
e. Amazon Associates Web Service REST requests are URLs, as shown in the following
example.
http://ecs.amazonaws.com/onca/xml?Service=AWSECommerceService&Operation=ItemSearch&AWSAccessKeyId=[Access Key ID]&AssociateTag=[ID]&SearchIndex=Apparel&Keywords=Shirt
5) Integration of eBay APIs:
a. eBay, like Amazon, also provides APIs for performing a search. It supports SOAP, and
requests can also be issued as plain URLs. A sample eBay query is given below for
reference.
http://open.api.ebay.com/shopping?appid=MyAppID&version=517&siteid=0&callname=FindItems&QueryKeywords=ipod&responseencoding=JSON&callback=true
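As a minimal sketch of how the two sample requests above could be assembled in code, the Python snippet below builds the same query strings from parameters. The endpoints and parameter names are copied from the samples; the function names and the default version and site values are our own illustrative choices, and the snippet only builds URL strings without sending any request.

```python
from urllib.parse import urlencode

# Endpoints taken from the sample URLs in the text.
AMAZON_ENDPOINT = "http://ecs.amazonaws.com/onca/xml"
EBAY_ENDPOINT = "http://open.api.ebay.com/shopping"


def amazon_item_search_url(access_key_id, associate_tag, search_index, keywords):
    """Build an ItemSearch URL shaped like the Amazon sample (hypothetical helper)."""
    params = {
        "Service": "AWSECommerceService",
        "Operation": "ItemSearch",
        "AWSAccessKeyId": access_key_id,
        "AssociateTag": associate_tag,
        "SearchIndex": search_index,
        "Keywords": keywords,
    }
    # urlencode escapes spaces and special characters in the values.
    return AMAZON_ENDPOINT + "?" + urlencode(params)


def ebay_find_items_url(app_id, keywords, version="517", site_id="0"):
    """Build a FindItems URL shaped like the eBay sample (hypothetical helper)."""
    params = {
        "appid": app_id,
        "version": version,
        "siteid": site_id,
        "callname": "FindItems",
        "QueryKeywords": keywords,
        "responseencoding": "JSON",
        "callback": "true",
    }
    return EBAY_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(amazon_item_search_url("MY_KEY", "MY_TAG", "Apparel", "Shirt"))
    print(ebay_find_items_url("MyAppID", "ipod"))
```

Keeping one small builder per service is one way to make adding a new service a matter of adding one more function; the report does not state how the actual framework structures this.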
6) Recommendation System:
Recommender systems represent user preferences for the purpose of suggesting items to
purchase or examine. They have become fundamental applications in electronic commerce and
information access, providing suggestions that effectively prune large information spaces so that
users are directed toward the items that best meet their needs and preferences. Recommendation
methods are usually classified into three major categories: content-based, collaborative and
hybrid approaches. We focused on only one of these, the collaborative filtering approach.
For the recommendation system we tried three different algorithms:
1) Apriori algorithm.
2) Naïve non-parametric Bayes classifier (NBC).
3) Additive content-based classification.
-> For the naïve Bayes classifier we had a training set of products labeled with one of four
categories (Technology, Wearable, Books and Entertainment). If a user had bought a product from a
category, he would be shown a different product from the same category. This had some issues because
the number of products in each category is huge. Even though the classification accuracy was good,
the recommendations were not very appropriate, as many products in the same category were not closely
related to the current product. This was written in C#.
-> The additive algorithm looks up the products the user has already bought in a set of news
articles (stored in text files by date) and, based on the reviews and recommendations from other
customers, suggests other products. This was written in Java.
-> The Apriori algorithm is the third algorithm we tried. It uses association rule mining and gave
the best results of the three. However, it does not indicate popular products in the market beyond
what follows from the users' preference information (which was limited), so we also fetched the
rankings of popular products from the Amazon and eBay APIs and displayed them alongside. The variety
was good and the content was appropriate. This was written in C#.
Since the integration needed a common platform, we finally dropped the first two algorithms, which
had different requirements.
Apriori algorithm:
Apriori is designed to operate on databases containing transactions (for example, collections of
items bought by customers, or details of visits to a website). The Apriori algorithm is
basically used to find association rules.
Association rule mining works as follows: given a set of itemsets (for instance, sets of
retail transactions, each listing the individual items purchased), the algorithm attempts to find
subsets which are common to at least a minimum number C of the itemsets. Apriori uses a
"bottom up" approach, where frequent subsets are extended one item at a time (a step known as
candidate generation), and groups of candidates are tested against the data. The algorithm
terminates when no further successful extensions are found.
Apriori uses breadth-first search and a tree structure to count candidate itemsets efficiently. It
generates candidate itemsets of length k from itemsets of length k − 1. Then it prunes the
candidates which have an infrequent sub-pattern. According to the downward closure lemma,
the candidate set contains all frequent k-length itemsets. After that, it scans the transaction
database to determine the frequent itemsets among the candidates.
Apriori, while historically significant, suffers from a number of inefficiencies or trade-offs,
which have spawned other algorithms. Candidate generation generates large numbers of subsets
(the algorithm attempts to load up the candidate set with as many as possible before each scan).
Bottom-up subset exploration (essentially a breadth-first traversal of the subset lattice) finds
any maximal subset S only after all 2^|S| − 1 of its proper subsets.
The goal of association rule mining is to find the association rules in a given database that
satisfy a predefined minimum support and confidence. The problem is usually decomposed into two
sub-problems.
The first is to find those itemsets whose number of occurrences in the database exceeds a
predefined threshold; these are called frequent or large itemsets. In our case, the itemsets are
the products which a user has purchased.
The second is to generate association rules from those large itemsets under the constraint of
minimal confidence. Suppose one of the large itemsets is L_k = {I_1, I_2, … , I_k}. Association
rules for this itemset are generated in the following way: the first rule is
{I_1, I_2, … , I_{k−1}} ⇒ {I_k}, and checking its confidence determines whether the rule is
interesting or not. Further rules are generated by deleting the last item of the antecedent and
inserting it into the consequent, and the confidence of each new rule is checked in the same way.
This process iterates until the antecedent becomes empty. Since the second sub-problem is quite
straightforward, most research focuses on the first sub-problem. The Apriori algorithm finds the
frequent sets L in database D as follows:
• Find the frequent set L_{k−1}.
• Join step.
  ◦ C_k is generated by joining L_{k−1} with itself.
• Prune step.
  ◦ Any (k − 1)-itemset that is not frequent cannot be a subset of a frequent k-itemset,
    hence should be removed.
Here C_k is the candidate itemset of size k and L_k is the frequent itemset of size k.
The pseudocode for the algorithm is given below, where T is the transaction database and ε is the
minimum support count.
Apriori(T, ε)
    L_1 ← {large 1-itemsets that appear in at least ε transactions}
    k ← 2
    while L_{k−1} ≠ ∅
        C_k ← Generate(L_{k−1})
        for transactions t ∈ T
            C_t ← Subset(C_k, t)
            for candidates c ∈ C_t
                count[c] ← count[c] + 1
        L_k ← {c ∈ C_k | count[c] ≥ ε}
        k ← k + 1
    return ⋃_k L_k
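The join, prune and counting steps described above can be illustrated with a short Python sketch. This is not the C# implementation used in Sandhai; it is a small, unoptimized version under our own assumptions (function names, plain sets instead of a tree structure, and a support threshold expressed as a transaction count).

```python
from itertools import combinations


def generate_candidates(prev_frequent, k):
    """Join L_{k-1} with itself, then prune candidates with an infrequent subset."""
    candidates = set()
    for a in prev_frequent:
        for b in prev_frequent:
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(s) in prev_frequent for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # L_1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}

    result = dict(frequent)
    k = 2
    while frequent:
        candidates = generate_candidates(set(frequent), k)
        counts = {c: 0 for c in candidates}
        # One scan of the database per level to count candidate occurrences.
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    # Made-up purchase histories, one set of products per user.
    purchases = [
        {"ipod", "case"},
        {"ipod", "case", "charger"},
        {"ipod", "charger"},
        {"book"},
    ]
    for itemset, count in sorted(apriori(purchases, 2).items(), key=lambda p: sorted(p[0])):
        print(sorted(itemset), count)
```

In the sample data, {case, charger} appears in only one purchase history, so the size-3 candidate {ipod, case, charger} is pruned without another database scan.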
In this algorithm we recommend products to a user based on other users' purchases. The rule
discovery is called association rule mining and the method of recommendation is called
collaborative filtering. Since no actual purchases were made through the system, we filled the
database with extrapolated purchase data. We ran the algorithm on these values and stored the
output in another database for easy retrieval. This was the first type of recommendation.
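To make the step from frequent itemsets to recommendations concrete, here is a hedged Python sketch. It assumes the support counts have already been computed and are held in a dictionary, and it only considers rules with a single item in the consequent, which is a simplification of the rule generation described earlier. The function name, the data and the threshold are illustrative, not taken from the Sandhai code.

```python
def recommend(support, purchased, min_confidence):
    """Rank items the user has not bought by the best confidence of a matching rule.

    support maps frozenset itemsets to their occurrence counts.
    A rule antecedent => {item} has confidence
    support[antecedent | {item}] / support[antecedent].
    """
    purchased = frozenset(purchased)
    best = {}
    for itemset, count in support.items():
        for item in itemset:
            antecedent = itemset - {item}
            if not antecedent or item in purchased:
                continue
            # The rule applies only if the user already bought the whole antecedent.
            if antecedent <= purchased and antecedent in support:
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    best[item] = max(best.get(item, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Made-up support counts over other users' purchase histories.
    support = {
        frozenset(["ipod"]): 3,
        frozenset(["case"]): 2,
        frozenset(["charger"]): 2,
        frozenset(["ipod", "case"]): 2,
        frozenset(["ipod", "charger"]): 2,
    }
    print(recommend(support, {"case"}, 0.6))
    print(recommend(support, {"ipod"}, 0.6))
```

With these counts, a user who bought only a case is recommended the ipod, because every history containing a case also contains an ipod.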
The database for this design stores the user details and the product details. There are two further
databases, one matching users to their purchases and one storing the recommendations. There is also
a second, simpler implementation in which we fetch product rankings from the web services, namely
Amazon and eBay. This gives the popular searches for the day and brings variety to the recommended
products. The user therefore has many options to choose from: some are personalized, based on his
own purchases and those of like-minded users, and the others come from the popular purchases of the
day. As a next step we would like to recommend using hybrid approaches, both item-based and based on
social networks.
Database Design & UI Functionality
The database was designed with two goals in mind: simple relationships between the different
tables, and efficient querying of the underlying data.
The main tables used in the database design are
1. User 2. Product 3. Category 4. Friend_List.
The database tables are primarily used in the following user operations:
Registration:
When a user registers on the Sandhai website, the data collected through the forms is stored in
the underlying 'User' table. The data collected about the user during registration is used for
recommendations and for ordering the search results.
Login:
Every user logging into the site is verified against the username and password stored in the
database. Once a user successfully logs in, session variables are set to identify the current
user, and the Last_visit field is updated with the current timestamp.
Recommendation:
When a user opts to check the recommendations suggested for him, data from the user table and
the product table is fetched and fed into the recommendation system, which provides user
recommendations based on the Apriori algorithm.
Send Mail Alerts:
The user can opt to receive email or SMS alerts whenever a product he requested becomes
available at a specified price difference. The sendMail program fetches the contact details of
the currently logged-in user and sends an alert by email using the underlying sendMail API.
Twitter:
The Twitter feature posts messages to a friend's microblog when a user searches for a particular
product of interest. We use a separate table to maintain the friends list of every user. When
the user selects the tweet option, the link to the currently searched item is sent as a tweet to
that user's friends.
Database Connectivity
The primary database used to build the application is an SQL database. For testing purposes we
used Microsoft Access databases, which can be connected to the C# application through the JET
OLEDB 4.0 driver.
The Database Tables are as follows:

User
Attribute                          Datatype
Username (primary key)             Varchar
Password                           Varchar
Phone                              Varchar
Email                              Varchar
Zip                                Integer
Amazon_pref                        Integer
Ebay_pref                          Integer
Besstbuy_pref                      Integer
Last_visit                         datetime

Product
Attribute                          Datatype
Pid (primary key)                  Varchar
Username (foreign key from user)   Varchar
Product name                       Varchar
Vendorid                           Varchar
Categoryid                         Varchar
Unitprice                          Varchar
P_descp                            Varchar
P_cost                             Float

Category
Attribute                          Datatype
Categoryid (primary key)           Varchar
Categoryname                       Varchar
C_descp                            varchar

Friend_List
Attribute                          Datatype
username                           varchar
F_username                         varchar
F_email                            varchar
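The tables above can be tried out with a small Python sketch. It uses SQLite through the standard sqlite3 module purely for illustration; the project itself used an SQL database and Microsoft Access, so the engine, the column name Product_name (written "Product name" above) and the helper functions are our assumptions. The record_login helper mirrors the Login step described earlier.

```python
import sqlite3

# Column names and types follow the tables in the text.
SCHEMA = """
CREATE TABLE User (
    Username      VARCHAR PRIMARY KEY,
    Password      VARCHAR,
    Phone         VARCHAR,
    Email         VARCHAR,
    Zip           INTEGER,
    Amazon_pref   INTEGER,
    Ebay_pref     INTEGER,
    Besstbuy_pref INTEGER,
    Last_visit    DATETIME
);
CREATE TABLE Category (
    Categoryid   VARCHAR PRIMARY KEY,
    Categoryname VARCHAR,
    C_descp      VARCHAR
);
CREATE TABLE Product (
    Pid          VARCHAR PRIMARY KEY,
    Username     VARCHAR REFERENCES User(Username),
    Product_name VARCHAR,  -- "Product name" in the text; renamed to avoid the space
    Vendorid     VARCHAR,
    Categoryid   VARCHAR,
    Unitprice    VARCHAR,
    P_descp      VARCHAR,
    P_cost       FLOAT
);
CREATE TABLE Friend_List (
    username   VARCHAR,
    F_username VARCHAR,
    F_email    VARCHAR
);
"""


def create_database():
    """Create an in-memory database with the four tables (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_login(conn, username, password):
    """Verify the credentials and, on success, update Last_visit.

    Passwords are compared as stored here only to keep the sketch short;
    a real deployment should store and compare salted hashes instead.
    """
    row = conn.execute(
        "SELECT 1 FROM User WHERE Username = ? AND Password = ?",
        (username, password),
    ).fetchone()
    if row is None:
        return False
    conn.execute(
        "UPDATE User SET Last_visit = datetime('now') WHERE Username = ?",
        (username,),
    )
    conn.commit()
    return True


if __name__ == "__main__":
    conn = create_database()
    conn.execute("INSERT INTO User (Username, Password) VALUES (?, ?)", ("alice", "secret"))
    print(record_login(conn, "alice", "secret"))
```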
Architecture Diagram:
Testing and Evaluation:
• How to test the effectiveness of a service aggregation system as a whole?
  ◦ Highly dependent on individual sub-systems
• Bringing up a Sandhai pilot system and asking many users to perform searches in it will
  help in evaluating the system.
• Performance of the system at various stages of integration
• The QoS parameters are
  ◦ Speed
  ◦ Number of Results
  ◦ Preference
• Simulating web services to profile the integration framework code
  ◦ Created simple web service mockups
  ◦ Integrated these mock-up services into Sandhai's framework
  ◦ Triggered custom searches and calculated the request-response times for various
    ranges of queries
• Defining Quality of Service parameters
  ◦ Q : Search query
  ◦ WS1, WS2, …, WSn : Web services
  ◦ T1, T2, …, Tn : Time taken for search
  ◦ AT : Aggregator framework time
We aim for Sandhai's search time to always be better than that of the slowest product search
engine among the integrated engines.
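The mock-up based profiling described above can be sketched in Python as follows. The mock services, their delays and all function names are hypothetical; the sketch fans one query Q out to the services in parallel threads and records each T_i and the aggregator time AT. When the aggregator waits for every response, AT includes the slowest T_i, so one reading of the stated goal is that the overhead on top of the slowest service should stay small; the report prints AT next to both max(T_i) and sum(T_i) so either comparison can be made.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_mock_service(name, delay_seconds):
    """Return a fake search function that answers after a fixed delay."""
    def search(query):
        time.sleep(delay_seconds)  # stands in for network and server time
        return [f"{name}: result for {query}"]
    return search


def timed_call(service, query):
    """Call one service and measure its response time T_i."""
    start = time.perf_counter()
    results = service(query)
    return results, time.perf_counter() - start


def aggregate_search(services, query):
    """Send the query to all services in parallel.

    Returns (merged results, {service name: T_i}, AT).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        futures = {name: pool.submit(timed_call, svc, query)
                   for name, svc in services.items()}
        outcomes = {name: future.result() for name, future in futures.items()}
    total = time.perf_counter() - start
    results = [item for res, _ in outcomes.values() for item in res]
    times = {name: elapsed for name, (_, elapsed) in outcomes.items()}
    return results, times, total


if __name__ == "__main__":
    services = {
        "WS1": make_mock_service("WS1", 0.05),
        "WS2": make_mock_service("WS2", 0.10),
        "WS3": make_mock_service("WS3", 0.20),
    }
    results, times, total = aggregate_search(services, "ipod")
    print("results:", len(results))
    print("slowest T_i: %.3f s" % max(times.values()))
    print("sum of T_i:  %.3f s" % sum(times.values()))
    print("AT:          %.3f s" % total)
```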
Challenges:
• Aggregating different web services with a generic framework
• Performance evaluation
Screenshots:
[Figure: Home page screenshot]
[Figure: Search results screenshot]
Future Work:
• Integrating and supporting more e-commerce sites to provide users with a wider search
  range
• Supporting a full-fledged recommendation system for the user profiles in the system
• An independent wish-list publisher-subscriber system
• We would like to implement this idea mainly for e-commerce services (buying and selling
  goods online) and then extend it to other service consolidations, such as web search
  services and social network services, thus giving the user flexibility across all
  services in a single place.
• Data mining and trend analysis based on product searches made by different user
  profiles
Conclusion:
Seamless web service aggregation is a powerful idea when implemented. In our project we
successfully wrapped the APIs of 3 different e-commerce service providers and were able to use
several of their features from a single point. We would be interested in seeing how well the
integration framework suits and scales to service aggregation across different domains.
Project Planning:
Resource Plan and Schedule:
Week 1: Reading related literature and Web Service APIs
Week 2: Design of Database and basic Classes for implementation
Week 3: Implementation of Shopping features to support various e-commerce services
Week 4: Implementation of wish list Pub Sub system
Week 5: Implementation of auxiliary services supported by the system
Week 6: Integration and System Testing
Week 7: Testing, Bug fixes and Regression
Week 8: Presentation, Release and Usage in Production
Acknowledgments:
We thank Prof. Ling Liu for her valuable comments and suggestions during the design and
implementation phases of our project.
References:
1. Amazon API, http://docs.amazonwebservices.com/AWSECommerceService/2008-03-03/GSG/
2. Google Maps API, http://en.wikipedia.org/wiki/Google_Maps
3. Masand, Spiliopoulou, Srivastava, Ziane: "Web Mining for Usage Patterns & Profiles", WEBKDD 2002.
4. Rayid Ghani, Carlos Soares: "Data Mining for Business Applications", KDD 2006.
5. eBay Shopping API, http://developer.ebay.com/DevZone/shopping/docs/HowTo/JS_Shopping/JS_SearchGS_NV_JSON/JS_SearchGS_NV_JSON.html