IS 524 CORPORATE INFORMATION SYSTEMS
SUPPLEMENTARY NOTES
Chandra S. Amaravadi
March 16, 2016

Part II
Managing data resources (database I)
Database planning and analysis (database II)
E-Commerce

©Copyright 2014 by Chandra S. Amaravadi. All rights reserved. Electronic reproduction or distribution is strictly prohibited.

MANAGING DATA RESOURCES

Need for Databases
Rapid access to information is a life-or-death issue for many organizations. For example, a physician may want to know if an incoming patient is allergic to a particular drug, or an airline company manager may want to know the forecasted occupancy on a given flight. Organization is key to accessing the data in a timely fashion.

Real World Concepts:
In the real world there are objects such as cars, animals, basketball games etc.

Entities – Are individual examples of objects and things in the real world. A computer with a particular serial number, Bob’s car and the city of ‘Colorado Springs’ are examples of entities.

Eclasses – Are collections of related entities. Students in Maryanne’s course, a collection of car parts sold in a store and vendors in the Midwest are examples of entity classes. Is the weather an entity class?

Attributes – Are properties of eclasses (which are ultimately recorded in the database). A person’s hair color, a credit card#, processor speed and storage capacity are examples of attributes.

Data Models
A data model is an abstract view of the data. It captures relationships between two or more entity classes. For example, students take courses, ships have captains and products are placed on order. Here students and courses, ships and captains, and products and orders are the respective pairs of eclasses. Three types of relationships are possible between two eclasses, a concept that is referred to as cardinality:

1:1 – Each instance of eclass A is associated with only one instance of eclass B and vice versa. E.g. ship: captain; company: CEO; country: president.
A ship has only one captain and a captain is assigned only one ship.

1:M – Each instance of eclass A is associated with many instances of eclass B. Each instance of B is associated with one instance of eclass A. E.g. hospital: wards; department: courses; department: employees. A department has several employees but an employee belongs to only one department.

M:N – Each instance of A is associated with many instances of eclass B and vice versa. E.g. students: courses; hospital: special treatments; advertisement: TV programs.

Fig 1. Types of cardinality relationships among eclasses.

Cardinality is indicated by ‘crow’s foot’ notation. A crow’s foot indicates ‘many entities’ of an eclass. Note that the only basis for judging cardinality is the set of rules given above (not whether the eclass is singular or plural!).

The Database Approach
The overall concept here is that objects in the real world have a representation in the database. Organizations deal with entity classes such as customers, employees, products, vendors, benefits etc. In addition, eclasses have relationships among themselves (‘customers buy products’, ‘vendors supply raw materials’). Entity classes and relationships are depicted via data models as shown below. Once a data model is developed, the database can be quickly developed. In fact, something of the software engineering life cycle is present in the diagram below – planning, analysis, design and implementation. Which phases correspond to which here?

Fig 2. The database approach

Basic Organization Concepts:

Data – Consists of raw facts, e.g. a scoring average of 24 points per game, 30 items ordered, a building has ten floors. These are not meaningful unless we add a context.

Information – Raw facts are not by themselves meaningful. Information can be thought of variously as data presented in a context or as a collection of raw facts. For example, we may have the following collection of facts about a person.
Name:        Chris Nash
DOB:         2-4-1965
Profession:  Engineer
Address:     415 Walnut street

In database terminology this would be called a record, and the individual facts would be attributes (each consisting of an attribute name and a value). This collection of facts makes the information meaningful.

Attribute – Property of an eclass, e.g. hair color, part#, parcel weight, expense amount etc.

Primary key – Is a particular field or combination of fields that is used to uniquely identify any one given record within the database. For example, part# identifies a part and account# identifies an account uniquely. When tables are listed as below, the primary key is generally underlined and is the left most attribute. Note that a given primary key value can occur only once in a table – that is what uniqueness means.

File/table – Traditionally, a computer file was a group of characters, but here it is simply a collection of records. File and table are used interchangeably.

Record – A grouping of attribute values, i.e. a row within a table. The product table below has two records.

Schema – The schema of a database describes the structure of the file.

Overall organization – The database is organized into files, which are organized into records and then into attribute values. The product file below has sku# as a primary key, a description, price and quantity. Description, price and qty (as well as sku#) are attributes. The table has two records corresponding to the two rows.

Product table/file
sku#      description     price     qty
11967-4   crankshaft      $250.00   400
11978-5   Cylinder ring   $28.00    15

Fig 3. Basic organization of a database table

Summary of Basic Organization Concepts:
A database is a collection of files or tables. Each row describes one entity – it could describe a student named Jack. The row is called a record. Attributes are columns in the table. Attributes store values that hold the information about a given property of an entity.
Thus a record can also be thought of as a collection of attribute values. The primary key is used to uniquely identify each record in the database. Secondary keys are attributes other than the primary key, e.g. ‘Descr’, ‘Price’, ‘Quantity’ in the table below. The schema/structure of the database shows a list of tables and their attributes. The schema of the product table above consists of sku#, description, price and qty.

Fig 4. Organization of a database table

DBMS
A DBMS is a software program, such as Access or Oracle, used to define, manage, retrieve and update the data.

DBMS Activities
The DBMS allows users to physically manage the data in an organized fashion. This includes defining the tables, entering the data, getting reports and carrying out queries.

Data definition
This is used to define the structure or schema of the table. Essentially it is the process of identifying the attributes of a table and defining their data type, i.e. whether each is numeric, currency, text etc. The primary key as well as the allowable length for each attribute are also defined during this stage.

Data entry
A form is created to enter the data. The form has a title or heading, labels which are used to guide the data entry, and fields where data is actually entered. A data entry form facilitates data entry. It can also be used to check for valid data.

Modification
Data can be added, deleted or changed. For example, a new customer may be added, an inactive customer deleted or a customer’s address updated.

Query
Query refers to getting information from the database. There are two methods of querying data.

QBE – QBE stands for Query By Example. This is a form used to retrieve the data. The form has the attributes from a table (e.g. a product table). The user can select attributes as needed (e.g. product description and price) and additionally specify criteria. For example, they can type “chair” under the description column to retrieve all products where the description is “chair”.
Here the query is for products costing < $200; no fields are selected for output yet.

Fig 5. QBE for finding products costing < $200

SQL – SQL is a query language (4GL) used to retrieve data from the database. The SQL statement has three parts:

SELECT <attr1, attr2, …>
FROM <table1, table2, …>
WHERE <condition1 AND/OR condition2 AND/OR …>

The ‘select’ part has a list of attributes desired in the output, separated by commas. The ‘from’ part has a list of tables, also separated by commas, corresponding to the source of the attributes. The ‘where’ part has a list of conditions. If there are multiple conditions they are linked by ‘AND’ or ‘OR’. The example below shows a reservation table with Flt#, Pname and Confirm# together forming the primary key; it shows three records. The SQL query below the table selects Flt# and Confirm# for output with the condition “where Pname = Smith”.

Fig 6. Query on reservation table and query result

Multi-table retrieval – When there are multiple tables, there are three changes to the SQL format. First, in the ‘select’ part, attributes are preceded by the name of the table, e.g. prod.prod#. (What are the three parts of an SQL query?) Second, the ‘from’ part will have a list of all the tables, separated by commas. Third, in the ‘where’ part, values of the common key are equated. For example, if ‘cust#’ is a common key between the ‘customer’ and ‘order’ tables, the ‘where’ part of a query involving the ‘cust’ and ‘order’ tables will have …Where cust.cust# = ord.cust#. See if you can use these principles to write a query to LIST employee name, title, salary and department name for the tables below. What is the result of the query?

dept
d_no   d_name        d_mgr_ssn     d_phone
142    Manugistics   967-89-8898   845-9878
230    IMS           979-99-0045   989-0087
345    Pilot         978-64-8005   787-9934
467    InfoSec       998-98-0967   884-5768

emp
e_ssn         e_name      e_title     e_salary
956-34-8895   Smith       Developer   35000
959-66-6785   Johnson     Analyst     27000
967-89-8898   Weintraub   Manager     60000

Fig 7. Employee and department tables

Reports – Data is commonly obtained from the database in the form of reports. Reports enable the data to be formatted according to the needs of the user (this is information). A query, by contrast, produces raw unformatted data. A report is specified with a report specification. This consists of report and page headers, a detail line (this has rows from the database), and page and report footers.

Fig 8. A report specification.

Importance of DBMSs:
Like many information systems, DBMSs have both an operational and a strategic usage. Operationally, DBMSs are useful in retrieving stored facts. ‘Was check# 1013 cashed?’ ‘How many units of product with SKU#112393 are remaining in the store?’ ‘Is a white refrigerator manufactured by Whirlpool available for less than $1000?’ But their strategic usage outweighs the advantages of such information. Strategically, information in databases can be mined to answer questions such as ‘Which brand and style of jeans are best sellers?’ ‘What type of employees make the best managers?’ ‘What size carpets have the highest volume of sales?’ etc. This information is useful in making better decisions for production, hiring, retail etc.

DATABASE PLANNING

The first step in the database development process is database planning. There are two main goals here. The first is to get an overview of the organization’s data requirements (but not to identify its detailed data needs; this is done at a later stage). The second is to define the resource requirements for the overall development process. Resource requirements can include personnel, hardware, software and schedule. What hardware/software is required to ultimately implement the database? How many personnel? How long will the development process take? In addition, the scope of the project is also defined at this stage. The process starts with getting some background information on the organization. This can include the number of employees, number of products, items in inventory etc.
What is the purpose of getting this background information?

The method used to carry out planning is called Business Systems Planning (BSP) or sometimes Enterprise Planning. The method is akin to planning for a library. Suppose you are asked to build a library. What information do you need to know? Namely, the number of expected books and the areas they cover. These can be used to determine the size and layout of the building. BSP relies on the principle that activities generate or use data. For example, ordering raw materials requires checking available inventory. In BSP, the organization’s functions are first identified. Functions consist of processes and activities, so these are also identified. Next, the information used or generated by these activities is tabulated against the functions and processes. E.g. determining land use for a county administration uses information on tracts in the county. The table below is the result of this exercise. It is called an Enterprise Analysis (EA) Matrix or a Planning Matrix. From the information classes (in the top row) an enterprise data model is also developed. See the second diagram below.

Fig 2. An Enterprise Analysis Matrix (Planning Matrix) for a county admin.

Fig 3. An enterprise data model from the information classes

The Enterprise Data Model illustrates the following:
1. County has many tracts
2. County runs many conservation programs
3. A conservation program can be run on several tracts
4. A tract has many parcels
5. A parcel can be owned by more than one person
6. A tract has a single land use.

The EA matrix along with the Enterprise Data Model are used to obtain an overview of the data requirements of the organization. The results of planning (what are these?) are used to determine the size and scope of the database as well as the resource requirements.

DATABASE ANALYSIS

During the analysis stage of database development (or requirements definition), the actual data requirements are identified.
Generally this is done by examining “views” of the data. “Views” can include forms, reports etc. that exist within an organization. McDonald’s restaurants, for example, have ‘waste sheets’ that list items and quantities that are wasted for the day. A waste sheet is therefore an example of a view.

ER modeling
This diagramming method is used to depict relationships existing between the different entity classes in the real world that are of interest to the organization. This is the same as a data model (and crudely similar to an Enterprise Data Model). The fundamental concepts are:

Eclasses – Are collections of related entities. Students in Maryanne’s course and a collection of car parts sold in a store are examples of entity classes. Is the weather an entity class? Eclasses are modeled with a rectangle with the name of the eclass inside it.

Fig 4. Modelling an entity class

Attributes – Are properties of eclasses (which are ultimately recorded in the database). A person’s hair color, a credit card# and a GPA are examples. Attributes are modeled with ovals and labeled appropriately. The primary key is drawn as the left most attribute and the label is underlined. Attributes are shown only once. They are not duplicated except for the primary key. If a pkey is not given, it can be assumed. One challenge in modeling attributes is properly associating an attribute with an eclass – this is based on common sense as there are no set rules. Does ‘fare’ belong with ‘flights’ or with ‘booking’?

Fig 5. Modeling attributes with ovals

Multi-valued attributes – Attributes that can have more than one value per entity instance. For example, an employee can have multiple skills, and a classic car can have two colors. What are the entity classes in each of those cases? Multi-valued attributes are generally represented with a double oval. Fare is a multi-valued attribute in the diagram below. Why is it multi-valued in this case?

Fig 6. A multi-valued attribute in an eclass

Relationships – Are logical and meaningful connections between two or more entity classes. E.g. students take courses or physicians treat patients. Relationships are modeled with the diamond symbol and labeled with the name of the relationship. Relationships can be two way (binary), three way or four way (more than that is also possible). The diagram is read left to right or top to bottom.

Relationship attributes – Sometimes an attribute does not belong to the eclasses in a relationship, but to the relationship itself. For example, a distributor can place an order with a book publisher. If distributor and publisher are taken to be the eclasses, the quantity of books ordered does not belong with the distributor or with the publisher. It is treated as a relationship attribute. When there are relationship attributes, a dotted rectangle is placed around the relationship symbol (diamond) and the attribute is shown coming off the relationship. See the diagram below. Here ‘date’ is a relationship attribute.

Fig 7. Modeling relationships with relationship attributes

Cardinality – Cardinality shows the number of entities that participate in a relationship, i.e. how many instances of an entity class correspond with how many instances of another entity class? For 1:1 cardinality, each instance of the first entity class corresponds to one instance of the other entity class. For 1:M, one instance of the first entity class can correspond to many instances of the second entity class. M:N cardinality means that each instance of either entity class can correspond to many instances of the other entity class. You should recall this from the earlier discussion.

Max Cardinality Rule: When we have more than two entity classes in a relationship, maximum cardinality is used.

Class/subclass relationship – Eclasses can be grouped into classes and subclasses based on generalization/specialization of properties. For example,
a sports car is a specialization of a car that seats two, has an aerodynamic shape etc. Here ‘car’ is a class and ‘sports car’ is a subclass of it. Class/subclass relationships are modeled by an elongated hexagon. The label for this hexagon is always “is-a”. Cardinality is not shown for subclasses. Attributes specific to the subclass may be shown on it; otherwise they are shown on the superclass. Only horizontal and vertical lines are used.

Fig 8. Classes/Subclasses relationship

Drawing the ER chart
The first step is to list all eclasses and attributes. The following format needs to be used:

Prod: prod#, descr., price, qty
Cust: cust#, name, dt. Joined

Then relationships between eclasses need to be mapped. Is there a relationship between prod and cust? Between ord and products? Between orders and salespeople? Map all relevant relationships among the eclasses. Here relationships between the eclasses students, courses, departments, faculty and majors are shown (note that these have been arbitrarily selected for illustration). Also note that attributes are not shown. The relationships are mapped as follows (see diagram following):

1) Students take courses (with faculty).
2) Faculty offer courses (to students).
3) Courses are offered by majors (and departments).
4) Majors are offered by departments (could also show mapping between courses and departments).
5) Departments have faculty.
6) Faculty teach students (in courses).

Fig 9. Map the relationships between classes

Start by selecting one of the more important eclasses (those that have lots of relationships with others). Draw a rectangle corresponding to the eclass. Then gradually add other entity classes one at a time. Make sure there is a diamond symbol between any two related entity classes. Since there can be more than two entity classes in a relationship, watch out for three way and four way relationships.
These can be determined via mapping as illustrated or by articulating “Students take courses with faculty,” “Salesperson sells cars to Customers.” Here we start with students:

Fig 10. An eclass named students

Students are involved in relationships with both faculty and courses. This is a three way relationship. So we draw a diamond symbol to represent the relationship and connect it with all three entity classes. Note the use of horizontal and vertical lines and maximum cardinality.

Fig 11. Modeling relationships with students, courses & faculty

In the remainder of the model we add departments and majors. Since we mapped courses to majors (in our mapping diagram), the logical place to draw majors in the diagram is beneath the courses. It is an M:N relationship.

Fig 12. Add majors in the relationship mapping diagram

Then we add the departments eclass as shown below. Both courses and majors are offered by departments. There are relationships between courses and majors, courses and departments, and departments and majors. It is a three way relationship and is also shown with maximum cardinality.

Fig 13. Add departments in the relationship mapping diagram

Finally, we add the relationship between faculty and departments. A department has multiple faculty but each faculty member works for only one department. Note that linking faculty and departments results in a cycle, which is indicated with the bent arrow. Cycles are considered undesirable because they may indicate an error in modeling. Here it cannot be helped since eclasses such as courses, departments and faculty are part of more than one relationship. Cycles are marked with the bent arrow as seen below.

Fig 14. Final model

At this stage attributes are added. These are not shown in this illustration.

E-COMMERCE

What is E-Commerce?
E-commerce is business carried out using the internet. An e-commerce transaction has a buyer and a seller carrying out commerce on the internet.
The buyer selects goods/services and makes a payment, and the seller responds by providing the product/service.

Types of E-Commerce
E-commerce can occur in several modes as follows.

Customer to Business (B2C or C2B): In B2C or C2B, the buyer is a retail customer and the seller is an online retailer such as Amazon.com. This is the type most commonly encountered on the web.

Business to Business (B2B): In B2B, both buyers and sellers are businesses.

Customer to Customer (C2C): In C2C, customers buy from and sell to one another, as on eBay. Some computer companies such as Dell allow their customers to buy from and sell to each other.

Intra-organizational: This is a microscopic portion of overall e-commerce and is mentioned here for interest. It occurs if buyers and sellers are within the same organization. Many organizations have internal websites or marketplaces where employees can buy from and sell to one another.

Primary Motivator for online purchases
The primary motivators for online purchases are convenience, choice and lower prices. Convenience is the ability to shop online at one’s leisure regardless of whether it is day or night or whether one is in a city or a rural area. Choice refers to the range of product offerings. In principle an unlimited number of products can be offered online. Amazon.com has some 20 million products on offer. Elimination of the middle-man (middle person) and reduction of overheads allow retailers to offer products at lower prices. However, many online retailers have trouble making profits because of the stiff competition.

Typical Acquisition cost per customer
The acquisition cost per customer is the cost of advertisements and promotions divided by the total number of customers. So if a company has one million customers and spends $10 million on marketing, its cost per customer is $10. The acquisition costs per customer are quite high in conventional retail businesses.
In consumer electronics it is $56/customer, in groceries it is $84/customer and in apparel it is $53/customer. In the cell phone industry, some companies have spent as much as $250 acquiring each new customer. In internet based businesses, in theory, the acquisition cost/customer should be $0 since customers should be able to find the retail store through a search engine. In practice, except for the major retailers, the smaller retailer has to advertise in order to have customers browsing its site.

Net Shoppers key demographics
Net shoppers are upwardly mobile and young, with an average age of 42 and an average income of $65,000. The distribution of the sexes is 44-56 between males and females. The larger proportion of the latter (i.e. females) is due to the popularity of social networking sites like Facebook and Twitter. 75 percent of the U.S. population has shopped online at one time or another.

Most commonly purchased items
The most commonly purchased small ticket items (inexpensive) are:
Apparel
Toys/video games
Books
Software
Music
Health and Beauty
Office supplies

The most commonly purchased large ticket items (expensive) are:
Plane tickets
Computer hardware
Hotel reservations
Consumer electronics
Car rental

What characteristics do these items share in common?

An Extended View of E-Commerce
E-commerce has been defined as a buyer and seller carrying out a transaction via the web. To enable this it is necessary to have a physical infrastructure, in the form of the internet and an e-commerce system, as well as a method of processing payments. Thus an Extended View of E-Commerce includes the buyer and seller as well as additional entities. The buyer can pay the seller via banks, intermediaries and other merchants. Since the data travels through public networks, there is a need for security. There are secure protocols for carrying out internet transactions. SSL and SET are two such protocols.

Fig 1. An Extended View of E-Commerce

Infrastructure -- Internet
There are two components to the infrastructure: the internet and the e-commerce system. The internet is the physical medium through which e-commerce transactions take place. It is important to realize that there are no direct wired connections between the buyer and seller (in this case, a client and a web server). When a web client requests a web page, the request is broken down into packets (small units of data). The packets are addressed by the host (see diagram below) with the destination IP address (e.g. 143.43.78.9) and sent to their destination. Routers at the junctions of networks route the packets to their destinations using routing rules. They perform the function of traffic lights at street corners – to guide traffic. Sometimes the host may not know the destination address. In such cases it sends an enquiry to its neighbors. If its neighbors are unable to answer the query, the query goes to a DNS server (previously known as root server). The DNS servers also take periodic ‘attendance roll calls’ to maintain their list of IP addresses.

Fig 2. Infrastructure -- Internet

Components of the Infrastructure

Web clients – Computers that access the web servers, usually using browsers as software. The web client is often the ‘front end’ of an e-commerce system. It could be a mobile phone, a personal computer or another computing device.

Host/node – This is an addressable location on the internet with an IP address. It is a computer that is often dedicated to internet services such as caching, packet collection etc.

E-commerce system – The set of hardware and software that supports e-commerce transactions. The front end of an e-commerce system is the web client while the back end is the web server. Together with the e-commerce web site and e-commerce application, the client and server constitute an e-commerce system.

DNS server – The ultimate reference for all web site addresses.
DNS servers poll web sites in their area, i.e. take ‘attendance’, to maintain the DNS lists. Unknown web site addresses are resolved this way.

Routers – Components that forward data at network branches.

Service Providers – (ISPs) provide internet access.

E-Commerce System Purpose
An e-commerce system can fulfill many different functions related to the business. The important ones are:

a) Attract and interact with visitors – In e-commerce, the web site represents the company and is the only point of contact between the customer and the company. Therefore the site has to be attractive and easy to navigate.

b) Display products/services – An e-commerce site exists to sell products/services or information (see discussion on business models). One of the important functions of a web site is to display these attractively.

c) Process payments – In the case of companies selling products or services, collecting payments is an important function. Payments are usually handled through credit cards or electronic accounts (discussed later).

d) Shipment/payment confirmation – When payments are made the order has to be confirmed, and subsequently when the item is shipped, the customer has to be notified.

e) Update databases – Databases are updated at various stages in the process. Specific updates include user profiles, orders, invoices, inventory, shipments etc. (such activities have been discussed as part of ‘organization concepts’ as well as in ‘intro to corporate systems’).

Conversion Ratio
The ratio of buyers to visitors is called the conversion ratio. It is in the vicinity of 5% for small companies and 75% for the better known sites. The attractiveness and features of the web site are among the factors that influence the conversion ratio. Can you think of others?

E-Commerce System
The set of hardware and software that supports e-commerce transactions is called an e-commerce system.
The front end of an e-commerce system is mostly the web client (local clients could be used for updates) while the back end is the web server. The client and server are often separated by geography. Together with the e-commerce web site and e-commerce application, these components constitute an e-commerce system. (The e-commerce application is discussed further on.)

Web clients (already discussed)
Servers (see ‘server architecture’ below)
Web site (e-commerce site)
E-commerce application
    Interface
    Business logic
    Database backend

Web Server Functions
The functions of the web server go beyond providing the home page. Web servers sometimes provide authentication, i.e. verification of identity, usually through a username and password. They respond to DNS protocols and capture visitor information (site traffic) for advertising purposes.

Server Architecture
Most organizations (i.e. large organizations) do not have a single web server. Instead, they have a hierarchy of web servers carrying out specialized functions. The architecture shown in the diagram below is a typical three tier architecture used by many large organizations. The three tiers separate the functionality (purpose) and protect web resources:

a) Webserver layer – This is the set of web servers that respond to incoming requests for web sites. These servers simply furnish the home page.

b) Application server layer – This is also known as the e-commerce layer. The user is switched to this layer whenever he/she wants to view products or place an order.

c) Back end layer – This layer connects the application servers to the back end systems consisting of the organization’s information systems – ERP, inventory, accounting etc. The user is not directly in contact with this layer. For the organization, this layer handles the order processing, accounting and inventory.

Fig 3. Server Architecture

E-commerce site
A simple e-commerce site is shown below. The top part of the site is called a banner.
The banner has left and right panels for images. The bottom left has a side bar with menu options. The main part of the web page has content.

Fig 4. A simple e-commerce site

Web Site Content
One of the major components of an e-commerce system is the web site (what are the other major components?). The purpose of web site content is to convert browsers to buyers. Web site content is mainly developed in HTML, which is a language to format text and images on a web page. Data in a web server could sometimes be in XML format. The code can perform various functions such as loading a form, loading a web page, verifying credit card information, verifying form data etc. The code could reside on the client side or on the server side. In the old days code resided on the client side, but it proved to be a boon for hackers, so nowadays code/applets reside on the server for greater protection. The web site content consists of:

HTML
XML (data)
Code (part of the e-commerce application)
    Client side – JavaScript
    Server side – Applets, servlets (JSP/ASP)

Web site Content – HTML and XML

HTML – Hypertext Markup Language is a language used to ‘mark up’ text for display on the web. Note that it is not a programming language. In HTML, text is “tagged” with different types of tags to create a variety of effects such as bold, color etc. Most tags have a start marked by angled brackets (e.g. <start tag>) and an end marked with a forward slash and angled brackets (e.g. </end tag>). Thus ‘<b> Bolded text </b>’ creates bolding for Bolded text. Also note that all HTML documents start with an <html> tag and end with an </html> tag. In the example below, the HTML code and how it appears when displayed in a browser are shown. As an exercise see if you can copy and paste it into Notepad (it’s an “Accessory” in Windows), save it as “testhtml.htm” and open it using “file”, “open” in your browser.

Fig 5. An example of the HTML code

XML – XML or Extensible Markup Language is used to ‘mark up’ configuration (e.g.
‘date’ to be in mm/dd/yyyy format) files, product files or other data files for transfer of information on the internet. It has become a universal format for data exchange on the internet. XML uses user defined tags, but it needs a schema to define the syntax of the tags following it. In the example below, ‘xs’ stands for XML schema. A ‘note’ element type is described as a complex type consisting of the elements, ‘to’, ‘from’, ‘heading’ and ‘body’. Note that the part listed directly below is the XML schema so that a browser or application could ‘understand’ the data. and the second part is an instance of the schema. <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> Following is an example of the <note> data type, that is defined above. <note> <to>IS524</to> <from>Amaravadi</from> <heading>Reminder</heading> <body>Don't forget assignments coming up!</body> </note> 20 Another example is shown below. Here a ‘bookstore consists of various books. Each ‘book’ has the attributes (what are these called in XML?) title, author, year and price. Could you write the XML schema for this data? <bookstore> <book category=“YOUNG ADULT"> <title>The Maze Runner </title> <author>James Dashner</author> <year>2010</year> <price>6.35</price> </book> <book category="WEB"> <title>Beginning XML</title> <author>Joe Fawcett</author> <year>2012</year> <price>24.87</price> </book> </bookstore> Web site Content – Code (JavaScript, Applets) As we have learned previously, the content of a web site includes code as well. The complete code will constitute an e-commerce application (discussed next). The code can perform various functions such as loading a form, loading a web page, verifying credit card information, verifying form data etc. 
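As a rough sketch of what "verifying form data" and "verifying credit card information" can look like in code, the JavaScript below checks a hypothetical order form. The function names and the form fields (`name`, `cardNumber`) are assumptions made for illustration and are not taken from any particular e-commerce site. The card check uses the Luhn checksum, a standard digit check that catches most typing mistakes. It does not confirm that a card exists or has funds.

```javascript
// Luhn checksum: catches most single-digit typing errors in a card number.
function passesLuhnCheck(cardNumber) {
  // Allow spaces and dashes between digit groups, then require digits only.
  const digits = String(cardNumber).replace(/[\s-]/g, "");
  if (!/^\d+$/.test(digits)) {
    return false; // empty, or contains something other than digits
  }
  let sum = 0;
  let doubleIt = false; // every second digit, counting from the right, is doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = Number(digits[i]);
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the doubled value
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}

// Hypothetical form check: returns a list of problems (empty if the form is acceptable).
function verifyOrderForm(form) {
  const problems = [];
  if (!form.name || form.name.trim() === "") {
    problems.push("name is required");
  }
  if (!passesLuhnCheck(form.cardNumber)) {
    problems.push("card number looks mistyped");
  }
  return problems;
}

console.log(verifyOrderForm({ name: "Bob", cardNumber: "4111 1111 1111 1111" }));
console.log(verifyOrderForm({ name: "", cardNumber: "4111 1111 1111 1112" }));
```

A check like this can run in the browser for quick feedback, but the server would normally repeat it, since anything running on the client can be bypassed.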
There are a number of technologies to perform these actions including Java, JavaScript and applets. As stated earlier, the code could reside on the client side or on the server side; in the old days code resided on the client side, but it proved to be a boon for hackers, so nowadays code is kept on the server for greater protection.

In JavaScript, high-level functions are called within the HTML document (is HTML a programming language? Confusingly, HTML is also referred to as code). In the script below, the function is ‘write’ (‘document’ is the object the function belongs to; objects and classes are concepts similar to entities and entity classes). ‘document’ refers to the current document, so the code will write whatever is in the quotes, i.e. Hello World!, to the current document. You could copy and paste this into Notepad, save it as a ‘.htm’ file (test.htm) and open it as a file in your browser. See if you can change it to print “Hello World!” two times!

<html>
<body>
<script type="text/javascript">
// Write the text into the current page
document.write("Hello World!");
</script>
</body>
</html>

The following is a specification or declaration of a ‘class’ called ‘MyApplet’. It has a function called ‘paint’ that can paint the string “Hello applet!”. This part of the code is stored on the server or embedded inside the web page itself.

package org.me.hello;

import java.applet.Applet;
import java.awt.Graphics;

public class MyApplet extends Applet {
    // Called whenever the applet's display area needs to be drawn
    @Override
    public void paint(Graphics g) {
        // Draw the text 50 pixels from the left and 25 pixels from the top
        g.drawString("Hello applet!", 50, 25);
    }
}

If it is stored on the server, the applet is embedded in the HTML document as follows. The tag refers to the path where the actual code for the applet is stored.

<applet code="org.me.hello.MyApplet" archive="HelloApplet.jar"></applet>

An e-commerce application

We have learned that a customized application has three major components: 1) the interface or front end of the application, 2) the business logic or the component that does the actual processing (e.g. adding up items at a retail counter) and 3) the back end or database.
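The three components listed above can be sketched in a few lines of JavaScript, using the retail counter example. This is only an illustration under stated assumptions: the in-memory array stands in for a real database, and the names `productTable`, `findPrice`, `addUpItems` and `showReceipt`, along with the products and prices, are all hypothetical.

```javascript
// 3) Back end: an in-memory array stands in for the product table of a real database.
const productTable = [
  { code: "P100", name: "Notebook", price: 2.5 },
  { code: "P200", name: "Pen", price: 1.25 },
];

// Looks up the price of one product, as a database query would.
function findPrice(code) {
  const row = productTable.find((product) => product.code === code);
  if (!row) {
    throw new Error("Unknown product code: " + code);
  }
  return row.price;
}

// 2) Business logic: adds up the items at the retail counter.
function addUpItems(items) {
  return items.reduce(
    (total, item) => total + findPrice(item.code) * item.quantity,
    0
  );
}

// 1) Interface: presents the result to the user (here, as a line of text).
function showReceipt(items) {
  return "Total due: $" + addUpItems(items).toFixed(2);
}

console.log(showReceipt([
  { code: "P100", quantity: 2 },
  { code: "P200", quantity: 4 },
]));
```

Keeping the three parts in separate functions mirrors the separation in the text: the interface could be replaced by a web page, or the array by a real database, without touching the business logic.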
An e-commerce application consists of program components (applets or servlets), usually written in Java/JSP, that are served from the application server layer. These are illustrated in the diagram below. The application is for a cellular provider and allows clients to log in and get information on cellular packages (such as ‘voice and data’, ‘family plan’ etc.). As an example of applets or servlets, ‘GetClientInfo()’ gets the user name and password and ‘SelectPkg()’ allows the client to select one of the choices displayed.

Fig 6. A customized application has three major components

Fig 7. The application in action

Security

Since buyer and seller cannot see each other, web transactions require a secure environment.

Physical security

Physical security is the safe storage of information and is ensured via firewalls and electronic accounts. A firewall is a set of hardware and software that protects the web server. Electronic accounts hide the customer’s balances.

Transaction security

Transaction security is the safe exchange of payment information. On the internet, data can travel through many nodes before reaching its destination. The data could be intercepted within the network or en route, so it is important to encode the information so that it is visible only to the parties in the transaction. Transaction security is ensured via encryption and authentication schemes.

Encryption

Encryption is encoding the transaction. There are two major methods of encryption: SSL and SET. Both of these work on the ‘public key’-‘private key’ scheme. In simplified terms, the public key is the product of two large prime numbers and the private key consists of those two prime factors. For example, the prime factors of 3233 are 61 and 53, so the public key is 3233 and the private key is 61 and 53. The scheme relies on the fact that there is no known efficient algorithm to find the factors of a very large number. The public key is used to encrypt information before sending and the private key is used to decrypt it.
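The public key/private key idea can be tried out with a toy example. The sketch below follows the RSA scheme, which is one well-known way of building such key pairs; it is an illustration only, not how SSL or SET are implemented. Two small primes, 61 and 53, are used so the numbers stay readable (real keys use primes hundreds of digits long). The exponent 17, the message 65 and all function names are assumptions chosen for the example.

```javascript
// Toy RSA-style key pair. The primes are far too small for real use.
const p = 61n;
const q = 53n;
const n = p * q;                 // 3233: public, anyone may see it
const phi = (p - 1n) * (q - 1n); // 3120: easy to compute only if p and q are known
const e = 17n;                   // public exponent (assumed; shares no factor with phi)

// Private exponent d: the number for which (e * d) mod phi equals 1.
// A simple search is fine at this size; real software uses a faster method.
function findPrivateExponent(publicExponent, totient) {
  for (let candidate = 1n; candidate < totient; candidate++) {
    if ((publicExponent * candidate) % totient === 1n) {
      return candidate;
    }
  }
  throw new Error("public exponent has no inverse");
}
const d = findPrivateExponent(e, phi);

// Computes (base ** exponent) mod modulus without building huge numbers.
function modPow(base, exponent, modulus) {
  let result = 1n;
  base %= modulus;
  while (exponent > 0n) {
    if ((exponent & 1n) === 1n) {
      result = (result * base) % modulus;
    }
    base = (base * base) % modulus;
    exponent >>= 1n;
  }
  return result;
}

// Anyone can encrypt with the public values (e, n);
// only the holder of d, which depends on the factors of n, can decrypt.
const encrypt = (message) => modPow(message, e, n);
const decrypt = (cipher) => modPow(cipher, d, n);

const message = 65n;
const cipher = encrypt(message);
console.log("cipher text:", cipher.toString());
console.log("decrypted:", decrypt(cipher).toString());
```

Running it shows that the decrypted value matches the original message. Someone who sees only 3233 and 17 would have to factor 3233 to recover the private exponent, which is trivial here but impractical when the primes are very large.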
SSL – SSL stands for Secure Sockets Layer and is used widely in e-commerce transactions. In SSL, as in SET, keys are exchanged at the start of the transaction. In SSL this key exchange is called the handshake, and HTTP sent over SSL is known as HTTPS. Note that SSL is an encryption method.

SET – SET, or Secure Electronic Transaction, is an encryption method used for financial information such as currency exchanges, credit cards etc. In SET every transaction is ‘witnessed’ by a third party.

Fig 8. Encryption

Authentication

Authentication is a method of verifying the identities of the parties in a transaction. When one writes a check, store clerks ask for a driver’s license or other identification. This ensures that the person writing the check is the same person who is on the driver’s license. Similarly, on the internet it is important to verify the identities of the buyer and seller, so digital certificates are used.

Digital Certificates – Because parties in an e-commerce transaction will not typically see one another, it is essential to have verification of identity. Digital certificates are exchanged between buyers and sellers before the start of a transaction. A digital certificate is a document issued by a certificate authority that has information about the owner, the owner’s public key, validity date, serial number etc.

Fig 9. Digital Certificates

Electronic Payments

Payments are obviously an important part of e-commerce. Payments need to be made when purchasing a product. Choices in payment methods are credit/debit cards, smart cards, electronic checks and electronic accounts.

Problems with credit cards and electronic payments

People typically do not want to expose their credit card to the web. Besides, there are some categories of people, such as the elderly, school kids and travelers, who do not have credit cards. The solution is supposed to be digital cash or electronic accounts.
In the early days of e-commerce, vendors tried to introduce digital cash that mimicked paper cash, but that became problematic as it needed a machine (just like a credit card processing terminal). Now electronic payments are carried out via electronic accounts. Leading vendors include PayPal, VeriSign, RBS etc.

E-Commerce Business Models

An e-commerce business model is a method of operation, the primary reason for existing, or what the business does. Business models include being an e-tailer, a transaction broker, a market creator, a content provider, a community provider, a portal and a service provider.

CATEGORY | DESCRIPTION | EXAMPLES
E-tailer | Sells physical products directly to either other businesses or consumers. | Amazon, Wal-mart.com
Transaction broker | Processes online sales transactions for users, charging a fee for each transaction, in order to save those users both time and money. | E-trade.com, Expedia
Market creator (market place) | Sets up a digital environment where buyers and sellers can meet to browse and exchange products and to set the prices of those products. It makes its profit from transaction fees. | eBay, Priceline.com
Content provider | Generates revenue by providing digital content over the web. This revenue is gained either from selling access to the content or from providing advertising space. | WSJ.com, GettyImages.com, iTunes.com
Community provider | Enables the sharing of information between people with similar interests. | Facebook.com
Portal | Serves as an initial point of entry to the web, providing additional services as well. | Yahoo, Google
Service provider | Provides Web 2.0 services such as photo sharing and online backup to users. | Photobucket.com, Dropbox

E-Commerce Revenue Models

MODEL | DESCRIPTION
Advertising Revenue Model | Revenue is generated by gathering a large audience and then exposing that audience to a large volume of advertisements.
Sales Revenue Model | Revenue is derived from selling goods, information or services to customers.
Subscription Revenue Model | Revenue is generated by charging a subscription fee in exchange for access to the provided content.
Free/Freemium Revenue Model | Revenue is generated from a premium that is charged for access to additional services above and beyond the basic services, which are provided for free.
Transaction Fee Revenue Model | Revenue is generated by charging a fee for each transaction that the company handles.
Affiliate Revenue Model | Revenue is generated by web sites that send visitors to other web sites in exchange for a referral fee.