FATIH UNIVERSITY
FACULTY OF ENGINEERING

COMPARISON SHOPPING SITE SYSTEM

by Ahmet Faruk BİŞKİNLER & Mehmet ÇOKYILMAZ
07010441 & 07010321

Advisor: Assist. Prof. Atakan KURT

2 March 2016

A Senior Design Final Report submitted to the Department of Computer Engineering of Fatih University in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Engineering

March 2016
Istanbul, Turkey

ABSTRACT

The purpose of the Comparison Shopping Site System is to develop an e-commerce site that lets consumers who shop on the internet compare products and product information across many online shopping sites. The system is composed of two main parts. The crawling part visits online shopping sites, gathers information about products, and stores it in a database. The front-end part handles the interactions with customers and online shopping sites. The system helps users find more suitable and cheaper products simply and quickly. It is built with PHP, XML, and the MySQL database system.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS AND ABBREVIATIONS
CHAPTER 1 INTRODUCTION
1.1. Project Overview and Purposes
1.2. Scope of Project
1.3. Success Criteria of Project
CHAPTER 2 BACKGROUND AND MOTIVATION
CHAPTER 3 PROPOSED SYSTEM
3.1. Project Overview
3.2. Functional Requirements
3.2.1. Searching A Product From The Web Site
3.2.2. Sorting Searched Products
3.2.3. Listing Products
3.2.4. Narrowing Search Results
3.2.5. Going to Web Site for Selling a Product
3.2.6. Create New User
3.2.7. User Login
3.2.8. Adding Product by Hand
3.2.9. Adding Product by Uploading XML File
3.2.10. Viewing XML File Uploads
3.2.11. Viewing XML File
3.2.12. Lost Password or Forgotten Password
3.2.13. Banner Upload
3.2.14. Administrator Login
3.2.15. Administrator Approves XML Files
3.3. Non-functional Requirements
3.3.1. Usability
3.3.2. Reliability
3.3.3. Performance
3.3.4. Implementation Platform
3.4. System Models
3.4.1. Use Case
3.4.2. Activity
3.4.3. Relational Database Schema
3.4.4. Entity Relationship Diagram
CHAPTER 4 IMPLEMENTATION
4.1. index.php
4.2. config.php
4.3. out.php
4.4. login.php
4.5. database.php
4.6. admin/index.php
4.7. admin/database.php
4.8. include/function.php
4.9. include/geshi.php
CHAPTER 5 CONCLUSION
CHAPTER 6 GLOSSARY
CHAPTER 7 REFERENCES

LIST OF FIGURES

Figure 2.1.3.1 Architecture of a Standard Web Crawler
Figure 3.2.1 Main Page
Figure 3.2.2 Sorting Searched Products
Figure 3.2.3 Listing Products
Figure 3.2.4 Narrowing Search Results
Figure 3.2.5 Going to Web Site for Selling a Product
Figure 3.2.6 Create New User
Figure 3.2.7 User Login
Figure 3.2.8 Adding Product by Hand
Figure 3.2.9 Adding Product by Uploading XML File
Figure 3.2.10 Viewing XML File Uploads
Figure 3.2.11 Viewing XML File
Figure 3.2.12 Lost Password or Forgotten Password
Figure 3.2.13 Banner Upload
Figure 3.2.14 Administrator Login
Figure 3.2.15 Administrator Approves XML Files
Figure 3.4.1.1 User Service
Figure 3.4.1.2 Brand Management Services
Figure 3.4.1.3 Crawler System
Figure 3.4.1.4 Store Management Service
Figure 3.4.2.1 Activity of User Sorting Product
Figure 3.4.4.5 Entity Relationship Diagram

LIST OF TABLES

Table 3.4.3.0.1 Categories Table
Table 3.4.3.0.2 Crawl Table
Table 3.4.3.0.3 Users Table
Table 3.4.3.0.4 Favorite Table
Table 3.4.3.0.5 Search Table
Table 3.4.3.0.6 Sellers Table
Table 3.4.3.0.7 Products Table
Table 4.5.1 Document Type Definition
Table 4.5.2 Example of a well-defined XML file

LIST OF SYMBOLS AND ABBREVIATIONS

DBMS    Database Management System
GUI     Graphical User Interface
URL     Uniform Resource Locator
PHP     PHP: Hypertext Preprocessor (originally Personal Home Page)
IDE     Integrated Development Environment

CHAPTER 1
INTRODUCTION

1.1. Project Overview and Purposes

The Comparison Shopping Site System is a web-based application that collects product information from different shopping sites and serves this information to users.
Users can therefore find products faster and more easily with the comparison shopping site than by browsing many shopping sites one by one. The system is composed of two main parts. The crawling part visits the shopping sites, gathers information about products, and stores it in a database. The front-end part serves this information to customers.

The main purpose of the project is to help customers find and buy products by comparing many products from different shopping sites. It offers a smart search that increases the chance of finding the products the user is looking for. The other purpose of the project is to compare products by price, so that users can find the cheapest offer.

1.2. Scope of Project

This system can be used by anyone who wants to shop online. Typically, people will use it to compare the same product across different shopping sites.

1.3. Success Criteria of Project

Two main criteria determine the success of this system:

1. It saves customers time by gathering product information from thousands of shopping sites, so they do not have to visit the sites one by one.
2. It saves customers money by letting them compare the prices of products across different shopping sites.

CHAPTER 2
BACKGROUND AND MOTIVATION

From the late 1990s, the range of information, products, and services available on the internet grew massively. At the same time, the popularity of the internet grew at a phenomenal rate, and it became a very beneficial platform for making everyday life easier. Online shopping is one aspect of the internet that makes life easier and more convenient.

Online shopping is important because it offers buyers a level of convenience that was never achievable before. The technology now available allows customers to shop on the internet 24 hours a day, seven days a week, without having to leave their homes or offices. Shoppers have access to an abundance of merchant sites where almost any kind of goods can be bought. Consumers can also compare prices from a variety of retailers far more easily than by physically going to a shopping centre to check prices.

A newer concept that is gaining popularity is the comparison shopping site. The mission of such sites is to help consumers anywhere use the power of information to easily find, compare, and buy anything online, in less time and for the best price.

Comparison shopping sites face some problems. The most important one is gathering product information into the database of the comparison site and keeping that database up to date. Programmers have built different systems to solve this problem. One such system is the web crawler: a program that browses web pages and extracts the needed information from them. Web crawlers are the core part of the searching process.

Figure 2.1.3.1 Architecture of a Standard Web Crawler

CHAPTER 3
PROPOSED SYSTEM

3.1. Project Overview

The Comparison Shopping Site System is a web-based application that collects product information from different shopping sites and serves this information to users. Users can therefore find products faster and more easily than by browsing many shopping sites one by one. The system is composed of two main parts. The crawling part visits the shopping sites, gathers information about products, and stores it in a database. The front-end part serves this information to customers.

3.2. Functional Requirements

3.2.1. Searching A Product From The Web Site

Searching for a product on indirim.com is a very simple process for users.
In the search panel, the user enters a search key into the text box; the search starts when the "ARA" (search) button is pressed.

Figure 3.2.1 Main Page

3.2.2. Sorting Searched Products

The user can sort products by product name or by product price. This is also a very simple process. There are two sorting links: "ürün" (product) and "fiyat" (price). The user clicks one of the links and the system sorts the results automatically. Immediately after a search, the products are listed unsorted. When the user clicks a sorting link, the system sorts the products in ascending order; clicking the same link a second time sorts them in descending order.

Figure 3.2.2 Sorting Searched Products

3.2.3. Listing Products

Twenty products are shown per page. The user can browse all products page by page by selecting a page number from the menu at the bottom of the page.

Figure 3.2.3 Listing Products

3.2.4. Narrowing Search Results

The user can narrow search results by price interval, product category, and online seller site. Two menu groups on the left side of the page are provided for this. The system allows several narrowing criteria to be combined. For example, the user can narrow the results to prices between 20 YTL and 300 YTL and then also filter by a specific online seller such as "alisveris.com". The results then contain only products priced between 20 YTL and 300 YTL that are sold by "alisveris.com".

Figure 3.2.4 Narrowing Search Results

3.2.5. Going to Web Site for Selling a Product

After searching and finding the right product, the user can click the product name to go to the original web site. The system sends the user to the online shopping site that sells the product, and the purchase itself is completed on that site.

Figure 3.2.5 Going to Web Site for Selling a Product

3.2.6. Create New User

A registered user can add his or her own products to the database, either by uploading a file or by entering them by hand, and gains access to further services. Registering is easy: a username, a password, a name, a surname, and an e-mail address are all that is required.

Figure 3.2.6 Create New User

3.2.7. User Login

To benefit from the services, a user needs to log in by entering his/her username and password.

Figure 3.2.7 User Login

3.2.8. Adding Product by Hand

The user can add his/her own product using the form provided to registered users. The form is filled in with the product's Name, Shortdescription, Longdescription, Uppercategory, Category, Url, Imageurl, Price, Pricevat, Currencyunit, and Shortname.

Figure 3.2.8 Adding Product by Hand

3.2.9. Adding Product by Uploading XML File

The user clicks Upload XML in the left menu, browses to the XML file, and finally clicks the Upload XML button to upload the file to the server.

Figure 3.2.9 Adding Product by Uploading XML File

3.2.10. Viewing XML File Uploads

The user can see his/her uploaded XML files in the "List View" menu.

Figure 3.2.10 Viewing XML File Uploads

3.2.11. Viewing XML File

The user can open one of his/her uploaded XML files by choosing its View link in the "List View" menu.

Figure 3.2.11 Viewing XML File

3.2.12. Lost Password or Forgotten Password

The user can retrieve a new password from the system by entering his/her e-mail address.

Figure 3.2.12 Lost Password or Forgotten Password

3.2.13. Banner Upload

A user may place an advertisement by uploading a .gif file.

Figure 3.2.13 Banner Upload

3.2.14. Administrator Login

The administrator login gives access to the administration system, which controls all events, the user information, and the whole system.

Figure 3.2.14 Administrator Login

3.2.15. Administrator Approves XML Files

XML data uploaded by users must be approved by the administrator. The administrator may delete an upload, publish it, or view its contents.
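As a rough illustration of the approval step described above, the sketch below models an uploaded XML file that stays pending until the administrator publishes or deletes it. It is written in Python for brevity rather than in the project's PHP, and the names Status, XmlUpload, and UploadQueue are hypothetical; the report does not describe how uploads are actually stored.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # uploaded, waiting for the administrator
    PUBLISHED = "published"  # approved by the administrator


@dataclass
class XmlUpload:
    # Hypothetical record for one uploaded XML file.
    upload_id: int
    owner: str
    filename: str
    status: Status = Status.PENDING


class UploadQueue:
    """In-memory stand-in for wherever the real system keeps uploads."""

    def __init__(self):
        self._uploads = {}

    def add(self, upload):
        self._uploads[upload.upload_id] = upload

    def pending(self):
        # Uploads the administrator still has to look at.
        return [u for u in self._uploads.values() if u.status is Status.PENDING]

    def publish(self, upload_id):
        self._uploads[upload_id].status = Status.PUBLISHED

    def delete(self, upload_id):
        del self._uploads[upload_id]

    def published(self):
        # Only approved uploads would feed the product search.
        return [u for u in self._uploads.values() if u.status is Status.PUBLISHED]


queue = UploadQueue()
queue.add(XmlUpload(1, "alisveris", "catalog.xml"))
queue.add(XmlUpload(2, "otherstore", "products.xml"))
queue.publish(1)  # administrator approves the first file
queue.delete(2)   # administrator rejects the second file
print([u.filename for u in queue.published()])  # ['catalog.xml']
```

Keeping the status on the upload record means a file can be viewed while it is still pending, which matches the three administrator actions listed above.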
Figure 3.2.15 Administrator Approves XML Files

General Functional Requirements:

1. The crawler part of the system will gather information about products from different shopping sites. First, the system visits an online shopping site, finds all page links (URLs) on that site, and saves these URLs in the database. It then checks every URL and extracts the necessary product information, such as name, URL address, image address, price, money unit, and tax.
2. The products will be categorized in the database.
3. Users will be able to search for products.
4. Search results will be shown in a table that contains the product image, name, price, and source URL.
5. There will be a smart search feature, provided by the full-text search functions of MySQL.
6. The site will be updated frequently. New and recently arrived products will be highlighted on the site.
7. Searched products will be stored in the database with a hit-point value. The most frequently searched products will be suggested to customers on the site.
8. Users will be able to compare the price of a product across different shopping sites and sort the offers by price.

Functional Requirements Related to User Services:

1. The site will have services for specific users.
2. Users will be able to see their search history.
3. Users will be able to create lists of their favorite products and brands.
4. Based on a user's favorite product list, the site will inform the user about new, cheapest, and soon-to-arrive products and about new brands via e-mail or cell phone.
5. Based on a user's favorite product list, the site will inform the user via e-mail or cell phone when the price of a product changes.

Functional Requirements Related to Store Services:

1. Online shopping stores will add their store links to our database to be crawled.
2. Online shopping stores will add their advertisements to our site.
3. Online shopping stores will add their products and prices to our database directly.

Functional Requirements Related to Brand Services:

1. The site will have an advertisement control system.
2. Brand owners, factories, and firms will add their advertisements to the site.

3.3. Non-functional Requirements

3.3.1. Usability

The system has a simple user interface that makes it user-friendly. In addition, pages load very quickly because of this simplicity.

3.3.2. Reliability

If the crawling process is interrupted or stopped by an unexpected error, it will resume from where it left off. The user services will be secure; the information that belongs to a user will be kept safe.

3.3.3. Performance

The simplicity of the pages lets them load faster, which improves performance.

3.3.4. Implementation Platform

Programming language: The Comparison Shopping Site System will be implemented in PHP with MySQL.

Development environment: The Comparison Shopping Site System will be implemented on a Windows PC with the Apache web server and the MySQL database server. PHPEdit 2.12.2 will be used as the Integrated Development Environment (IDE).

3.4. System Models

3.4.1. Use Case

Figure 3.4.1.1 User Service
Figure 3.4.1.2 Brand Management Services
Figure 3.4.1.3 Crawler System
Figure 3.4.1.4 Store Management Service

3.4.2. Activity

Figure 3.4.2.1 Activity of User Sorting Product

3.4.3. Relational Database Schema

Categories (ccategoryid, cname, cParentID, ckeywords, ccount)
Products (productid, pname, paddress, ppictureaddress, pprice, pmoneyunit, ptaxincluded, pdescription, categoryid, pmodificationtime, sellerid, hitout)
Sellers (sellerid, fullname, saleinternet, username, password, hitout, crawlwait, crawling, startedcrawling, finishedcrawling, crawledproductcount, crawlcount)
Search (searchtext, counter)
Crawl (cid, csellerid, caddress, caddresslabel, cinlinks, coutlinks, cvisited, cvisitdatetime, cpagelength, cparentpage)
Users (userid, username, password, usermail, phonenumber)
Favorite (userid, productid)

Categories Table.

Field Name            Field Description
ccategoryid           Category ID
cname                 Category name
cParentID             Category parent ID
ckeywords             Category keywords
ccount                Number of products in the category

Table 3.4.3.0.1 Categories Table

Crawl Table.

Field Name            Field Description
cid                   Crawler ID
csellerid             Seller ID
caddress              Crawling link address
caddresslabel         Crawling link address label
cinlinks              Crawling link on the Comparison Shopping System (in-link)
coutlinks             Crawler link out of the crawled URL (out-link)
cvisited              Whether the crawler link has been visited
cvisitdatetime        Date and time at which the crawler visited the URL
cpagelength           Web page length in KB
cparentpage           The referrer

Table 3.4.3.0.2 Crawl Table

Users Table.

Field Name            Field Description
userid                User ID
username              User name
password              User password
usermail              User e-mail
phonenumber           User phone number

Table 3.4.3.0.3 Users Table

Favorite Table.

Field Name            Field Description
userid                User ID
productid             Product ID

Table 3.4.3.0.4 Favorite Table

Search Table.

Field Name            Field Description
searchtext            Searched text
counter               How many times this text has been searched

Table 3.4.3.0.5 Search Table

Sellers Table.

Field Name            Field Description
sellerid              Seller ID
fullname              Seller full name
saleinternet          Internet address of the seller
username              Seller user name for login
password              Seller password for login
hitout                Number of times the seller link has been clicked
crawlwait             Seller is in the wait status
crawling              Seller is in the crawling status
startedcrawling       Seller is in the started-crawling status
finishedcrawling      Seller is in the finished-crawling status
crawledproductcount   Number of the seller's products that have been crawled
crawlcount            Number of times the seller has been crawled

Table 3.4.3.0.6 Sellers Table

Products Table.

Field Name            Field Description
productid             Product ID
pname                 Product name
paddress              Product address
ppictureaddress       Product picture address
pprice                Product price
pmoneyunit            Product money unit (USD, YTL, EUR)
ptaxincluded          Whether tax is included in the product price
pdescription          Product description
categoryid            Product category ID
pmodificationtime     Time the product was last modified
sellerid              Product seller ID
hitout                Number of times the product link has been clicked

Table 3.4.3.0.7 Products Table

3.4.4. Entity Relationship Diagram

Figure 3.4.4.5 Entity Relationship Diagram

CHAPTER 4
IMPLEMENTATION

This chapter of the report gives the implementation details of the Comparison Shop Site System.

Comparison Shop System Structure

4.1. index.php

index.php is the main file of the system. It is the first file run when the web site is opened, and it handles all general operations.

search(): takes the search key, splits it into words, and calls formQuery() to create an SQL query for the search.
formQuery(): takes the words and returns an SQL query for the search.
listing(): takes the created SQL query, runs it, and shows the results on the screen.
sayfalama() (pagination): calculates the number of pages over which the search results are shown.
splitText(): splits the search key into words if it is composed of two or more words.
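The helpers described so far (splitText(), formQuery(), and sayfalama()) could fit together roughly as in the sketch below. It is written in Python for brevity rather than in the project's PHP, and it is only an assumption about their behaviour: the function names split_text, form_query, and page_count are hypothetical, and a simple LIKE match on the pname column stands in for the MySQL full-text search that the requirements mention.

```python
import math

PAGE_SIZE = 20  # Section 3.2.3: twenty products are shown per page


def split_text(search_key):
    """Split the search key into words (the role described for splitText())."""
    return search_key.split()


def form_query(words):
    """Build a parameterised SQL query that requires every word in the name.

    Returns the SQL text and its parameter list. The values are passed
    separately so that user input is never pasted into the SQL text.
    Column names come from the Products relation in Section 3.4.3.
    """
    if not words:
        raise ValueError("search key contains no words")
    where = " AND ".join("pname LIKE %s" for _ in words)
    sql = "SELECT productid, pname, pprice, paddress FROM Products WHERE " + where
    params = ["%" + word + "%" for word in words]
    return sql, params


def page_count(result_count, page_size=PAGE_SIZE):
    """Number of result pages (the role described for sayfalama())."""
    return math.ceil(result_count / page_size)


sql, params = form_query(split_text("tecra a8"))
print(sql)
print(params)          # ['%tecra%', '%a8%']
print(page_count(45))  # 3
```

The `%s` placeholders follow the style used by common Python MySQL drivers; a PHP implementation would use prepared statements for the same purpose.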
filterWords(): filters the search key. If the key contains bad words, the search returns zero products.
updateSearches(): counts how many times a search key has been searched. When a key is searched, the function inserts it into the database or increments its counter.

4.2. config.php

config.php provides the database connection.

db_connect(): opens a database connection between the system and the MySQL database system.

4.3. out.php

out.php sends the user to the product's web page.

4.4. login.php

Shows the login form for normal users. See Figure 3.2.7.

4.5. database.php

Manages user requests, such as uploading an XML file. Uploaded XML files must follow the definition in Table 4.5.1.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE PRODUCTCATALOG[
    <!ELEMENT PRODUCTCATALOG (PRODUCTS)+>
    <!ELEMENT PRODUCTS (NAME,SHORTDESCRIPTION?,LONGDESCRIPTION?,
                        UPPERCATEGORY?,CATEGORY?,URL,IMAGEURL?,PRICE,
                        PRICEVAT,CURRENCYUNIT,SHORTNAME)>
    <!ELEMENT NAME ( #PCDATA )>
    <!ELEMENT SHORTDESCRIPTION ( #PCDATA )>
    <!ELEMENT LONGDESCRIPTION ( #PCDATA )>
    <!ELEMENT UPPERCATEGORY ( #PCDATA )>
    <!ELEMENT CATEGORY ( #PCDATA )>
    <!ELEMENT URL ( #PCDATA )>
    <!ELEMENT IMAGEURL ( #PCDATA )>
    <!ELEMENT PRICE ( #PCDATA )>
    <!ELEMENT PRICEVAT ( #PCDATA )>
    <!ATTLIST PRICEVAT kdv CDATA #REQUIRED >
    <!ELEMENT CURRENCYUNIT ( #PCDATA )>
    <!ELEMENT SHORTNAME ( #PCDATA )>
    ]>

Table 4.5.1 Document Type Definition

An example file:

    <?xml version="1.0" encoding="ISO-8859-9" ?>
    <!DOCTYPE PRODUCTCATALOG (View Source for full doctype...)>
    <PRODUCTCATALOG>
      <PRODUCTS>
        <NAME>TECRA A8-103 INTEL CORE 2 DUO T5500 1.66Ghz 1GB 100GB TAŞINABİLİR BİLGİSAYAR</NAME>
        <SHORTDESCRIPTION>TECRA A8-103 INTEL CORE 2 DUO T5500 1.66Ghz 1GB 100GB TAŞINABİLİR BİLGİSAYAR</SHORTDESCRIPTION>
        <LONGDESCRIPTION>Intel® Core 2 Duo T5500(1.66 GHz, 2MB L2 cache, 667 MHZ FSB), Intel® PRO/Wireless 3945ABG ağ bağlantısı ve Intel® 945 GM chipset Standard : 1.024 MB (2x512), Maximum : 4,096 MB Teknoloji : DDR2 RAM (533 Mhz) 100 GB (5.400 rpm) Seri ATA HDD Microsoft® Windows® Vista Business Edition Türkçe / İngilizce DVD Super Multi (DVD±R/RW, DVD-RAM) çift katmanlı sürücü </LONGDESCRIPTION>
        <UPPERCATEGORY>Bilgisayar > Taşinabilir Bilgisayar</UPPERCATEGORY>
        <CATEGORY>Bilgisayar</CATEGORY>
        <URL>http://www.alisveris.com/asp/show_stock.asp?product=1503266870</URL>
        <IMAGEURL>http://www.alisveris.com/content_files/prd_images/223K.JPG</IMAGEURL>
        <PRICE>1.399,00</PRICE>
        <PRICEVAT kdv="18">1.576,08</PRICEVAT>
        <CURRENCYUNIT>USD</CURRENCYUNIT>
        <SHORTNAME>alisveris.com</SHORTNAME>
      </PRODUCTS>
    </PRODUCTCATALOG>

Table 4.5.2 Example of a well-defined XML file

4.6. admin/index.php

Shows the login form for the administrator. See Figure 3.2.14.

4.7. admin/database.php

Manages administrator requests.

4.8. include/function.php

Includes all the functions used by the database.php classes.

4.9. include/geshi.php

Handles the code coloring.

CHAPTER 5
CONCLUSION

To summarize, the Comparison Shop Site System is a crawler-based e-commerce site built with PHP and MySQL. The web crawler application visits different online shopping sites, automatically gathers product information from them, and stores this information in the database. The front end then makes the product information in the database available through search and compares the results from different shopping sites by price. This helps online shoppers find and buy the cheapest products on the internet quickly and easily.

CHAPTER 6
GLOSSARY

Web Crawler: A web crawler (also known as a web spider or web robot) is a program or automated script that browses the World Wide Web in a methodical, automated manner. Many sites, in particular search engines, use spidering as a means of providing up-to-date data.
Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which indexes the downloaded pages to provide fast searches. Crawlers can also be used to automate maintenance tasks on a website, such as checking links or validating HTML code. Crawlers can also be used to gather specific types of information from web pages, such as harvesting e-mail addresses (usually for spam).

PHP: PHP is a reflective programming language originally designed for producing dynamic web pages. It is used mainly for server-side scripting, but can also be used from a command-line interface or in standalone graphical applications. PHP is a widely used general-purpose scripting language that is especially suited to web development and can be embedded into HTML. It generally runs on a web server, taking PHP code as its input and creating web pages as output.

MySQL: MySQL is a multithreaded, multi-user SQL database management system. It is popular for web applications, and this popularity is closely tied to the popularity of PHP.

PHPEdit: PHPEdit is a commercial IDE developed by WaterProof SARL. It is written in Delphi and runs on the Microsoft Windows operating system. It is designed mainly for the PHP language, but supports many other languages such as CSS, HTML, JavaScript, INI, PHPEditScript, PlainText, SQL, XML, and XSLT.

CHAPTER 7
REFERENCES

1. Web Crawler. http://en.wikipedia.org/wiki/Web_crawler
2. Online Shopping. http://wiki.media-culture.org.au/index.php/Online_Shopping
3. Some comparison shopping site examples: www.shopping.com, www.shopzilla.com, www.bizrate.com, www.pricegrabber.com, www.smarter.com, www.nextag.com, www.become.com