E-Commerce Technology

e-Commerce Technology Overview

Commerce (8000 B.C.)
[Figure: the stages of a commercial transaction]
• BUYER LOCATES SELLER
• SELECTION OF GOODS
• NEGOTIATION
• SALE
• PAYMENT
• DELIVERY (INFORMATION, PHYSICAL)
• POST-SALE ACTIVITY

Electronic Commerce (2002)
[Figure: the same stages, annotated with technologies used and information gathered]
• Stages: BUYER LOCATES SELLER, SELECTION OF GOODS, NEGOTIATION, SALE, PAYMENT, DELIVERY (INFORMATION, PHYSICAL), POST-SALE ACTIVITY
• SOME TECHNOLOGIES USED: SEARCH ENGINE, ON-LINE CATALOG, RECOMMENDER AGENT, CONFIGURATOR, SHOPPING BOT, AGGREGATOR, AUTOMATED AGENTS, TRANSACTION PROCESSOR, DATA INTERCHANGE, CRYPTOGRAPHY, E-PAYMENT SYSTEMS, TRACKING AGENT, ON-LINE HELP, BROWSER SHARING, INTERNET TELEPHONY
• SOME INFORMATION GATHERED: SEARCH BEHAVIOR, BROWSING BEHAVIOR, CUSTOMER PREFERENCES, EFFECTIVENESS OF PROMOTIONS, BARGAINING STRATEGIES, PRICE SENSITIVITIES, PERSONAL DATA, MARKET BASKET, CREDIT/PAYMENT INFORMATION, DELIVERY REQUIREMENTS, ON-LINE PROBLEM REPORTS, CUSTOMER SATISFACTION, FOLLOW-ON SALES OPPORTUNITIES

The Electronic Marketplace
[Figure: the transaction stages surrounded by marketplace services]
• Stages: BUYER LOCATES SELLER, SELECTION OF GOODS, NEGOTIATION, SALE, PAYMENT, DELIVERY, POST-SALE ACTIVITY
• Services: SELECTION AGENT, CREDIT FILE, ORDER TRACKING, INSTALL, BID PREP, DATA ANALYSIS, DIRECT SELL, SECURE PAYMENT, DELIVERY, CRM

The eCommerce Process
• Buyers and sellers find each other
  – Communication (via Networking, the Internet, and Web-Based Information Architectures)
  – Human-Computer Interaction, Multimedia
  – Intermediaries
• Negotiation

The eCommerce Process
• Transaction
  – Transaction processing, Databases
  – Electronic Payment Systems
  – Computer Security
  – eCommerce Architecture
• Order fulfillment
  – Manufacture (manufacturing systems)
  – Delivery (tracking systems)
  – Supply Chain Management

The eCommerce Process
• Post-sale events
  – Customer Service and Help Facilities
  – Reorder, restock
• Accounting
  – Transaction processing
  – Interoperability between online and legacy systems
• Data analysis
  – Data Mining

eCommerce Technology
• Infrastructure
• Wireless technologies
• Search engines
• Access security
• Data interchange
• Cryptographic security
• Electronic payments
• Content delivery
• Intelligent agents
• Data mining
• Mass personalization

E-Commerce Infrastructure
• What worldwide structure is required to support eCommerce?
  – Network
  – Machines
  – Protocols
  – Security
  – Payment

How Does an Optical Fiber Transmit Light?
• Suppose you want to shine a flashlight beam down a long, straight hallway. Just point the beam straight down the hallway -- light travels in straight lines, so it is no problem.
• What if the hallway has a bend in it? You could place a mirror at the bend to reflect the light beam around the corner.
• What if the hallway were very winding, with multiple bends? You might line the walls with mirrors and angle the beam so that it bounces from side to side all along the hallway. This is exactly what happens in an optical fiber.

Client/Server Architecture
• Fundamental Internet structure
• Client requests service; server provides it
• Data exchanged only through real-time messages
• Server may become a client to a different server
[Figure: machines 1, 2, and 3 connected through the Internet. Client 1 requests service from server 2; server 2 responds to client 1. Client 2 requests service from server 3; server 3 responds to client 2.]

Routers
[Figure: routers from NORTEL, 3COM, and CISCO]

Router Tables

Internet Server
• The server is the heart of the technical architecture: it receives requests from Internet users, retrieves the information locally or from networked devices, and replies.
• Selecting and sizing this machine is a critical task, typically presenting a tradeoff between performance and cost.

Web Server
Web server - A Web server is a piece of computer software that can respond to a browser's request for a page and deliver the page to the Web browser through the Internet. You can think of a Web server as an apartment complex, with each apartment housing someone's Web page. In order to store your page in the complex, you need to pay rent on the space. Pages that live in this complex can be displayed to and viewed by anyone all over the world.
Your landlord is called your host, and your rent is usually called your hosting charge. Every day, millions of Web servers deliver pages to the browsers of tens of millions of people through the network we call the Internet.

UNIX vs. NT
• The two basic options are
  – UNIX-based platforms (IBM, Sun, HP)
  – Microsoft NT-based Intel platforms
• MS products generally cost less than UNIX platforms.
• UNIX is a more mature OS than NT. As a result, it delivers better performance for the same hardware configuration.
• UNIX administration requires more complex skills.
• If you don't have in-house UNIX expertise, investing in a UNIX-based server may bring a large maintenance cost.

Server Workflow

Client / Server
In general, all of the machines on the Internet can be categorized as two types:
• Servers
• Clients
Those machines that provide services (like Web servers or FTP servers) to other machines are servers. And the machines that are used to connect to those services are clients. When you connect to Yahoo at www.yahoo.com to read a page, Yahoo is providing a machine (probably a cluster of very large machines), for use on the Internet, to service your request. Yahoo is providing a server. Your machine, on the other hand, is probably providing no services to anyone else on the Internet. Therefore it is a user machine, also known as a client. It is possible and common for a machine to be both a server and a client, but for our purposes here you can think of most machines as one or the other.

Client / Server
• A server machine may provide one or more services on the Internet.
  – For example, a server machine might have software running on it that allows it to act as a Web server, an e-mail server and an FTP server.
  – Clients that come to a server machine do so with a specific intent, so clients direct their requests to a specific software server running on the overall server machine.
• For example, if you are running a Web browser on your machine, it will most likely want to talk to the Web server on the server machine.
• Your e-mail application will talk to the e-mail server, and so on...

Understanding a simple Email Server
• The simplest possible e-mail server might look like this:
• It would have a list of e-mail accounts,
  – with one account for each person who can receive e-mail on the server.
  – My account name might be bozdogan, John Smith's might be jsmith, and so on.
• It would have a text file for each account in the list.
  – So the server would have a text file in its directory named bozdogan.TXT, another named jsmith.TXT, and so on.
• When someone wants to send me a message, the person composes a text message ("Barbaros, Can we have lunch Monday? John") in an e-mail client, and indicates that the message should go to bozdogan.
• When the person presses the Send button, the e-mail client would connect to the e-mail server and pass to the server the name of the recipient (bozdogan), the name of the sender (jsmith) and the body of the message.
• The server would format those pieces of information and append them to the bottom of the bozdogan.TXT file.
• The entry in the file might look like this:
    From: jsmith
    To: bozdogan
    Barbaros, Can we have lunch Monday? John

Email
• There are several other pieces of information that the server might save into the file,
  – the time and date of receipt
  – and a subject line,
  – but overall you can see that this is an extremely simple process!
• As other people send mail to bozdogan, the server would simply append those messages to the bottom of the file in the order that they arrive.
• The text file would accumulate a series of five or 10 messages, and eventually I would log in to read them.
• When I want to look at my e-mail, my e-mail client would connect to the server machine.
• In the simplest possible system it would:
• Ask the server to send a copy of the bozdogan.TXT file.
• Ask the server to erase and reset the bozdogan.TXT file.
• Save the bozdogan.TXT file on my local machine.
• Parse the file into the separate messages (using the word "From:" as the separator).
• Show me all of the message headers in a list.
• When I double-click on a message header, it would find that message in the text file and show me its body.
  – You have to admit that this is a VERY simple system. Surprisingly, the real e-mail system that you use every day is not much more complicated than this!

Understanding the Real Email System
• For the vast majority of people right now, the real e-mail system consists of two different servers running on a server machine.
  – One is called the SMTP Server, where SMTP stands for Simple Mail Transfer Protocol. The SMTP server handles outgoing mail.
  – The other is a POP3 Server, where POP stands for Post Office Protocol. The POP3 server handles incoming mail.
• The SMTP server listens on well-known port number 25, while POP3 listens on port 110.
A typical Email Server looks like this:

Understanding SMTP
• Whenever you send a piece of e-mail, your e-mail client interacts with the SMTP server to do the sending.
• The SMTP server on your host may have conversations with other SMTP servers to actually deliver the e-mail.

Understanding SMTP
• Suppose I send an e-mail to Johnny using Outlook Express.
• Outlook Express connects to the SMTP server at mail.mercynet.edu using port 25.
• Outlook Express has a conversation with the SMTP server. It tells the SMTP server the address of the sender and the address of the recipient, as well as the body of the message.
• The SMTP server takes the "TO" address (for example, jsmith@mindspring.com) and breaks it into two parts: 1) the recipient name (jsmith) and 2) the domain name (mindspring.com).
• Since the recipient is at another domain, SMTP needs to communicate with that domain.
• The SMTP server has a conversation with a Domain Name Server and says, "Can you give me the IP address of the SMTP server for mindspring.com?" The DNS replies with one or more IP addresses for the SMTP server(s) that Mindspring operates.
• The SMTP server at mercynet.edu connects with the SMTP server at Mindspring using port 25. It has the same simple text conversation that my e-mail client had with the SMTP server for mercynet.edu, and gives the message to the Mindspring server.
• The Mindspring server recognizes that the domain name for jsmith is at Mindspring, so it hands the message to Mindspring's POP3 server, which puts the message in jsmith's mailbox.

SMTP
• The actual conversation that an e-mail client has with an SMTP server is incredibly simple and human readable. It is specified in public documents called Requests For Comments (RFCs), and a typical conversation might look something like this:
    helo test
    250 mx1.mindspring.com Hello abc.sample.com [220.57.69.37], pleased to meet you
    mail from: test@sample.com
    250 2.1.0 test@sample.com... Sender ok
    rcpt to: jsmith@mindspring.com
    250 2.1.5 jsmith... Recipient ok
    data
    354 Enter mail, end with "." on a line by itself
    from: test@sample.com
    to:jsmith@mindspring.com
    subject: testing
    John, I am testing...
    .
    250 2.0.0 e1NMajH24604 Message accepted for delivery
    quit
    221 2.0.0 mx1.mindspring.com closing connection
    Connection closed by foreign host.
  – Lines that begin with a three-digit code are the SMTP server's replies; the command and message lines are what the e-mail client says. The e-mail client introduces itself, indicates the from and to addresses, delivers the body of the message and then quits.

Domain Name Servers
• If you spend any time on the Internet sending email or browsing the web, then you use Domain Name Servers without even realizing it.
• Domain Name Servers, or DNS, are an incredibly important but completely hidden part of the Internet, and they are fascinating!
• The DNS system forms one of the largest and most active distributed databases on the planet, and without DNS the Internet would shut down very quickly.

DNS Resolution
[Figure: resolving gsbkip.uchicago.edu. The desktop sends a request for the gsbkip.uchicago.edu IP number to the local DNS server. The local DNS server sends a request for the uchicago.edu DNS IP number to the root server across the Internet and receives 128.135.4.2. It then sends a request for the gsbkip IP number to the GSB DNS server at 128.135.4.2 and receives 128.135.130.201. The desktop sends its file request across the Internet to the enterprise Web server at 128.135.130.201, and the file is returned.]
http://www.stamey.nu/DNS/DNSHowItWorks.asp

The Basic Idea
• For example, the machine that humans refer to as www.mercynet.edu has an IP address of 216.27.61.137. Every time you use a domain name, you use the Internet's domain name servers (DNS) to translate the human-readable domain name into the machine-readable IP address.
• During a day of browsing and emailing, you might access the domain name servers hundreds of times!

Web Architecture
How are web sites constructed?
• TIER 4: Database
• TIER 3: Applications
• TIER 2: Server
• TIER 1
SOURCE: INTERSHOP

Firewall
• The firewall is typically a hardware/software combination that controls the traffic between your internal network and the public Internet.
• Although a firewall can be directly incorporated into an Internet server, it is most commonly a specialized computer.
• The configuration is a challenging task and should be performed by experts.

Firewall
As you can see, all inbound and outbound Internet traffic must pass through the firewall.

eCommerce Data Exchange Needs
• RFQs
• Ship Notices
• Catalogs
• Letters of Credit
• Quotations
• Purchase Orders
• Electronic Payments
• Bills of Lading
• Invoices

Data Interchange
• How can sites exchange information without prior agreement?
  – What do the data fields mean? price, extended price, unit price, prix, цена, τιμή, 값, X’AC12’
  – XML: Extensible Markup Language
• How can the content be separated from form (visual appearance)?
• How can data formats and structures be communicated?
  – What does the hex string “65436F6D6D65726365” mean?
  – ASN.1, Basic Encoding Rules (BER)

Invoice Example
<UnitPrice>6.05</UnitPrice>
SOURCE: PROF. JEROME YEN

How to Make Data Portable
• Tell what the data means
• Tell how the data is structured
• Tell how it should look
(SO COMPUTERS CAN UNDERSTAND IT)
• BUT DO THESE SEPARATELY. MIXING IS BAD
• The meaning -- XML
• The structure -- DTD (document type definition)
• The formatting -- XSL (Extensible style sheet)
• Example: XML catalog structure

XML at a glance
Well Formed Document:
<Book>
  <Author>George Soros</Author>
  <Title>The Crisis of Global Capitalism</Title>
  <Year>1998</Year>
  <Publ>Public Affairs</Publ>
  <Price>26.00</Price>
  <ISBN>1-891620-27-4</ISBN>
</Book>
DTD: Document Type Definition
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!ELEMENT Book (Author, Title, Year, Publ, Price, ISBN)>
]>
SOURCE: PROF. JEROME YEN

XML Recipe Example
<?xml version="1.0"?>
<Recipe>
  <Name>Apple Pie</Name>
  <Ingredients>
    <Ingredient>
      <Qty unit="pint">1</Qty>
      <Item>milk</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="each">10</Qty>
      <Item>apples</Item>
    </Ingredient>
  </Ingredients>
  <Instructions>
    <Step>Peel the apples</Step>
    <Step>Pour the milk into a 10-inch saucepan</Step>
    <!-- And so on...
    -->
  </Instructions>
</Recipe>

Electronic Payment Systems

Electronic Payments
• Forms of money
  – token (cash), notational (bank account), hybrid (check)
• Money does not move on the Internet
• Credit-card transactions
  – Secure protocols: SSL, SET
• Automated clearing and settlement systems
• Smart cards
• Electronic cash, digital wallets
• Micropayments
• Electronic delivery of goods
• Electronic bill presentment and payment
  – BlueGill

Intelligent Agents
• Programs to perform tasks on your behalf
• Metasearchers, shopping bots, news agents, stock agents, auction bots, bank bots
• How to make agents “intelligent”
  – Rule-based systems
  – Knowledge representation
• Agents that learn
  – Inductive inference
• Negotiation agents
• Avatars (characters in human form)
[Figure: SYLVIE from VPERSON]

Shopping Agents

Data Mining
• Extracting previously unknown relationships from large datasets
• Discovery of patterns
• Predicting the future
  – past behavior best predictor of future purchasing
• Market basket analysis
  – diapers/beer

Data Mining Tools
• Visualization (“seeing” the data)
  – Table Lens
• Predictive Modeling
• Database Segmentation
  – Classify the users
• Link Analysis
  – Associations discovery
• Neural networks
  – Systems that learn from data
• Deviation Detection
  – Are any of the data unusual? Fraud detection

Data Mining
• Extracting previously unknown relationships from large datasets
  – discover trends, relationships, dependencies
  – make predictions
  – target customers
• In eCommerce, data comes from
  – customers themselves
  – cookies
  – external databases
  – data matching
  – DoubleClick, etc.
  – Digital rights management tools (what we read and how much)
  – library records

Taxonomy of Data Mining Methods
[Figure: taxonomy tree rooted at "Data Mining Methods"]
• Branches: Predictive Modeling, Database Segmentation, Link Analysis, Text Mining, Deviation Detection
• Under Predictive Modeling: Decision Trees, Neural Networks, Naive Bayesian, Branching criteria
• Other leaves shown: Semantic Maps, Clustering, K-Means, Rule Association, Visualization
SOURCE: WELGE & REINCKE, NCSA

Predictive Modeling
• Objective: use data about the past to predict future behavior
• Sample problems:
  – Will this (new) customer pay his bill on time? (classification)
  – What will the Dow-Jones Industrial Average be on October 15? (prediction)
• Technique: supervised learning
  – decision trees
  – neural networks
  – naive Bayesian

Mass Personalization

Mass Personalization
• Treating each user as an individual
  – key is INFORMATION
• How to acquire and store information about customers
  – Cookies
  – Question and response
  – Clickstream analysis
  – External databases
• How to use information effectively and instantly
• Personalization technology

Outline
• What is personalization?
• Personalization is based on data
• How can data about people be acquired?
  – From people themselves
  – From their clickstream
  – From outside data sources
• How can data be used
  – To improve the customer's experience?
  – To help the company?

What is Personalization?
• Addressing customers by name and remembering their preferences
• Showing customers specific content based on who they are and their past behavior
• Empowering the customer. Examples: Lands' End, L.L.Bean
• Product tailoring. Example: dell.com
• Connecting to a human being when necessary. CallMe
  – Adeptra TeleBanner
• Allowing visitors to customize a site for their specific purposes
• Users are 20%-25% more likely to return to a site that they tailored (Jupiter Communications, Inc.)
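
The first two bullets of "What is Personalization?" (addressing customers by name, and showing content based on past behavior) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CustomerProfile` record, its field names, and the `personalized_greeting` function are hypothetical and do not come from any product named on the slides.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    """Hypothetical per-customer record; the field names are illustrative."""
    name: str
    viewed_categories: list = field(default_factory=list)  # past behavior


def personalized_greeting(profile, default_category="home page specials"):
    """Greet the customer by name and pick content from past behavior."""
    if profile.viewed_categories:
        # Most frequently viewed category; ties go to the one seen first.
        category = Counter(profile.viewed_categories).most_common(1)[0][0]
    else:
        # No history yet, so fall back to generic content.
        category = default_category
    return f"Welcome back, {profile.name}! Recommended for you: {category}"
```

A returning visitor with a browsing history gets content from the category viewed most often, while a visitor with no history falls back to the default. A real site would load the profile from a cookie or a customer database rather than build it in memory.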

Need For Personalization
• In the real world
  – Customer relationship is mediated by people
  – Personalization is critical: PEOPLE are PEOPLE
• On the Web
  – Too many customers; too few employees
  – Orders are entered by machine; follow-up is by machine
  – Customer relationship is mediated by machines
  – Personalization is critical
    • Uniqueness (everyone is different)
    • Efficiency (everyone has limited time)

Store Visitors in the Real World
• Casual store visitor:
  – no intention of buying
• Prospecting store visitor:
  – wants to buy, maybe not here
• Ad/marketing target:
  – in store because of ad or promotion
[Callout: DATA COLLECTED ONLY IF VISITOR BUYS SOMETHING]
• Customer:
  – buys something
  – pays cash
  – uses a credit card
  – uses a store charge card
[Callouts: IDENTITY UNKNOWN; PRODUCT/TIME KNOWN; IDENTITY KNOWN; IDENTITY, JOB, INCOME KNOWN]

Store Visitors in Cyberspace
• Casual site visitor:
  – no intention of buying
• Prospecting site visitor:
  – wants to buy, maybe not here
[Callout: CAN EASILY DETECT THE DIFFERENCE]
• Ad/marketing target:
  – in store because of ad or promotion
[Callout: WE KNOW HOW HE GOT HERE AND WHAT HE WANTS TO BUY]
• Customer:
  – buys something
  – pays cash
  – uses a credit card
  – uses a store charge card
[Callouts: WE HAVE HIS WHOLE FILE; WE KNOW WHAT OTHER PEOPLE LIKE HIM ARE BUYING]

Click Behavior: CASUAL VISITOR
[Figure: site tree]
STORE HOME PAGE
  OFFICE PRODUCTS
    PRESENTATION ITEMS
      LASER POINTERS
        LASER 1, LASER 2, LASER 3
  HOUSEWARES
    KITCHEN
      TOASTERS
  SPORTING GOODS
    HUNTING
      RIFLES
    GOLF
      CLUBS
        CALLAWAY

Click Behavior: PROSPECTING VISITOR
[Figure: the same site tree, from STORE HOME PAGE down to LASER 1, LASER 2, LASER 3, TOASTERS, RIFLES, and CALLAWAY]
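
The claim that a site "can easily detect the difference" between a casual and a prospecting visitor can be illustrated with a small clickstream heuristic in Python. The slides do not state the rule, so the one below is an assumption: a visitor who stays inside one department and compares at least two product pages is labeled prospecting, and anyone else is labeled casual. The `DEPARTMENT` map, `PRODUCT_PAGES` set, and `classify_visitor` function are hypothetical names; only the page names are taken from the Click Behavior slides.

```python
# Hypothetical site map: each page name from the slide mapped to its
# top-level department. The grouping is an assumption for illustration.
DEPARTMENT = {
    "OFFICE PRODUCTS": "OFFICE PRODUCTS",
    "PRESENTATION ITEMS": "OFFICE PRODUCTS",
    "LASER POINTERS": "OFFICE PRODUCTS",
    "LASER 1": "OFFICE PRODUCTS",
    "LASER 2": "OFFICE PRODUCTS",
    "LASER 3": "OFFICE PRODUCTS",
    "HOUSEWARES": "HOUSEWARES",
    "KITCHEN": "HOUSEWARES",
    "TOASTERS": "HOUSEWARES",
    "SPORTING GOODS": "SPORTING GOODS",
    "HUNTING": "SPORTING GOODS",
    "RIFLES": "SPORTING GOODS",
    "GOLF": "SPORTING GOODS",
    "CLUBS": "SPORTING GOODS",
    "CALLAWAY": "SPORTING GOODS",
}

# Leaf pages that show an individual product (also an assumption).
PRODUCT_PAGES = {"LASER 1", "LASER 2", "LASER 3", "TOASTERS", "RIFLES", "CALLAWAY"}


def classify_visitor(clicks, min_products=2):
    """Label a click path as 'prospecting' or 'casual' (assumed heuristic)."""
    # Departments touched; pages outside the map (e.g. the home page) are ignored.
    departments = {DEPARTMENT[page] for page in clicks if page in DEPARTMENT}
    # Distinct product pages the visitor looked at.
    products_seen = {page for page in clicks if page in PRODUCT_PAGES}
    if len(departments) == 1 and len(products_seen) >= min_products:
        return "prospecting"
    return "casual"
```

For example, a path that drills from STORE HOME PAGE down to LASER 1, LASER 2, and LASER 3 is labeled prospecting, while a path that hops between OFFICE PRODUCTS, HOUSEWARES, and SPORTING GOODS is labeled casual. A production system would learn such rules from data, for instance with the supervised techniques listed under Predictive Modeling, rather than hard-code them.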