Introduction to Distributed Systems
Chapter Two
Architecture

Introduction

Distributed systems are often complex pieces of software whose components are by definition dispersed across multiple machines. To master their complexity, it is crucial that these systems are properly organized. There are different ways to view the organization of a distributed system, but an obvious one is to distinguish between the logical organization of the collection of software components and the actual physical realization.

The organization of distributed systems is mostly about the software components that constitute the system. These software architectures tell us how the various software components are to be organized and how they should interact. The actual realization of a distributed system requires that we instantiate and place software components on real machines. There are many different choices that can be made in doing so. The final instantiation of a software architecture is also referred to as a system architecture.

Centralized architectures

A single server implements most of the software components (and thus functionality), while remote clients can access that server using simple communication means. The main problem with the centralized model is that it is not easily scalable. There is a limit to the number of CPUs in a system and eventually the entire system needs to be upgraded or replaced.

Fig 2: sample centralized system

ITEC 551 Compiled by: Miraf Belyu

Decentralized architectures

Two or more machines play more or less equal roles. Hybrid organizations, which combine centralized and decentralized elements, also exist.

Fig 3: sample decentralized system

Architectural style

An architectural style is formulated in terms of components, the way that components are connected to each other, the data exchanged between components, and finally how these elements are jointly configured into a system.
Several styles have by now been identified, of which the most important ones for distributed systems are:
1. Layered architectures
2. Object-based architectures
3. Data-centered architectures
4. Event-based architectures

Layered architectures

The basic idea for the layered style is simple: components are organized in a layered fashion where a component at layer L_i is allowed to call components at the underlying layer L_{i-1}, but not the other way around, as shown in the following figure.

Fig 4: layered architecture

This model has been widely adopted by the networking community. A key observation is that control generally flows from layer to layer: requests go down the hierarchy whereas the results flow upward.

Object-based architectures

A far looser organization is followed in object-based architectures. In essence, each object corresponds to what is called a component, and these components are connected through a (remote) procedure call mechanism. Not surprisingly, this software architecture matches the client-server system architecture. The layered and object-based architectures still form the most important styles for large software systems.

Fig 5: sample object-based architecture

Data-centered architectures

Data-centered architectures revolve around the idea that processes communicate through a common (passive or active) repository. It can be argued that for distributed systems these architectures are as important as the layered and object-based architectures. For example, a wealth of networked applications has been developed that rely on a shared distributed file system in which virtually all communication takes place through files. Likewise, Web-based distributed systems are largely data-centric: processes communicate through the use of shared Web-based data services.
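As a rough illustration of the data-centered style just described, the following minimal sketch lets two processes interact only through a common passive repository. The names SharedRepository, sensor_process and display_process are hypothetical and chosen for this example; a real system would use a distributed file system or a Web-based data service instead of an in-memory dictionary.

```python
class SharedRepository:
    """A passive shared store: processes interact only by reading and writing items."""

    def __init__(self):
        self._items = {}

    def write(self, name, value):
        self._items[name] = value

    def read(self, name, default=None):
        return self._items.get(name, default)


def sensor_process(repo):
    # The writer knows the repository, not who will later read the value.
    repo.write("temperature", 21.5)


def display_process(repo):
    # The reader knows the repository, not who produced the value.
    return "temperature=" + str(repo.read("temperature"))


repo = SharedRepository()
sensor_process(repo)
print(display_process(repo))
```

Neither function refers to the other; all coordination happens through the shared data item.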
Event-based architectures

In event-based architectures, processes essentially communicate through the propagation of events, which optionally also carry data. For distributed systems, event propagation has generally been associated with what are known as publish/subscribe systems. The basic idea is that processes publish events, after which the middleware ensures that only those processes that subscribed to those events will receive them. The main advantage of event-based systems is that processes are loosely coupled. In principle, they need not explicitly refer to each other. This is also referred to as being decoupled in space, or referentially decoupled.

Fig 6: sample event-based architecture

System architectures

Now that we have briefly discussed some common architectural styles, let us take a look at how many distributed systems are actually organized by considering where software components are placed. Deciding on software components, their interaction, and their placement leads to an instance of a software architecture, also called a system architecture. We will discuss centralized and decentralized organizations, as well as various hybrid forms.

Centralized architectures

In the basic client-server model, processes in a distributed system are divided into two (possibly overlapping) groups. A server is a process implementing a specific service, for example, a file system service or a database service. A client is a process that requests a service from a server by sending it a request and subsequently waiting for the server's reply. This client-server interaction is also known as request-reply.

Fig 7: general interaction between client and server

Communication between a client and a server can be implemented by means of a simple connectionless protocol when the underlying network is fairly reliable, as in many local-area networks.
In these cases, when a client requests a service, it simply packages a message for the server, identifying the service it wants, along with the necessary input data. The message is then sent to the server. The latter, in turn, will always wait for an incoming request, subsequently process it, and package the results in a reply message that is then sent to the client.

Using a connectionless protocol has the obvious advantage of being efficient. As long as messages do not get lost or corrupted, the request/reply protocol just sketched works fine. Unfortunately, making the protocol resistant to occasional transmission failures is not trivial. The only thing we can do is possibly let the client resend the request when no reply message comes in. The problem, however, is that the client cannot detect whether the original request message was lost or whether transmission of the reply failed. If the reply was lost, then resending a request may result in performing the operation twice. If the operation was something like "transfer 10,000 birr from my bank account," then clearly, it would have been better to simply report an error instead. On the other hand, if the operation was "tell me how much money I have left," it would be perfectly acceptable to resend the request. When an operation can be repeated multiple times without harm, it is said to be idempotent. Since some requests are idempotent and others are not, it should be clear that there is no single solution for dealing with lost messages.

As an alternative, many client-server systems use a reliable connection-oriented protocol. Although this solution is not entirely appropriate in a local-area network due to relatively low performance, it works perfectly fine in wide-area systems in which communication is inherently unreliable. For example, virtually all Internet application protocols are based on reliable TCP/IP connections.
In this case, whenever a client requests a service, it first sets up a connection to the server before sending the request. The server generally uses that same connection to send the reply message, after which the connection is torn down. The trouble is that setting up and tearing down a connection is relatively costly, especially when the request and reply messages are small.

Application layering

The client-server model has been subject to many debates and controversies over the years. One of the main issues was how to draw a clear distinction between a client and a server. Not surprisingly, there is often no clear distinction. For example, a server for a distributed database may continuously act as a client because it is forwarding requests to different file servers responsible for implementing the database tables. In such a case, the database server itself essentially does no more than process queries.

However, considering that many client-server applications are targeted toward supporting user access to databases, many people have advocated a distinction between the following three levels, essentially following the layered architectural style we discussed previously:
1. The user-interface level
2. The processing level
3. The data level

The user-interface level contains all that is necessary to directly interface with the user, such as display management. Clients typically implement the user-interface level. This level consists of the programs that allow end users to interact with applications. There is a considerable difference in how sophisticated user-interface programs are.

The simplest user-interface program is nothing more than a character-based screen. Such an interface has been typically used in mainframe environments. In those cases where the mainframe controls all interaction, including the keyboard and monitor, one can hardly speak of a client-server environment.
However, in many cases, the user's terminal does some local processing such as echoing typed keystrokes, or supporting form-like interfaces in which a complete entry is to be edited before sending it to the main computer.

Nowadays, even in mainframe environments, we see more advanced user interfaces. Typically, the client machine offers at least a graphical display in which pop-up or pull-down menus are used, and in which many of the screen controls are handled through a mouse instead of the keyboard. Typical examples of such interfaces include the X-Windows interfaces as used in many UNIX environments, and earlier interfaces developed for MS-DOS PCs and Apple Macintoshes.

Modern user interfaces offer considerably more functionality by allowing applications to share a single graphical window, and to use that window to exchange data through user actions. For example, to delete a file, it is usually possible to move the icon representing that file to an icon representing a trash can. Likewise, many word processors allow a user to move text in a document to another position by using only the mouse.

Many client-server applications can be constructed from roughly three different pieces:
- A part that handles interaction with a user
- A part that operates on a database or file system
- A middle part that generally contains the core functionality of an application

This middle part is logically placed at the processing level. In contrast to user interfaces and databases, there are not many aspects common to the processing level.

For example, consider an Internet search engine. Ignoring all the animated banners, images, and other fancy window dressing, the user interface of a search engine is very simple: a user types in a string of keywords and is subsequently presented with a list of titles of Webpages. The back end is formed by a huge database of Webpages that have been pre-fetched and indexed.
The core of the search engine is a program that transforms the user's string of keywords into one or more database queries. It subsequently ranks the results into a list, and transforms that list into a series of HTML pages. Within the client-server model, this information retrieval part is typically placed at the processing level.

Fig 8: three different layers of a search engine

Decentralized architectures

Multi-tiered client-server architectures are a direct consequence of dividing applications into a user-interface level, a processing level, and a data level. The different tiers correspond directly with the logical organization of applications. In many business environments, distributed processing is equivalent to organizing a client-server application as a multi-tiered architecture. We refer to this type of distribution as vertical distribution. The characteristic feature of vertical distribution is that it is achieved by placing logically different components on different machines.

Having a vertical distribution can help: functions are logically and physically split across multiple machines, where each machine is tailored to a specific group of functions. However, vertical distribution is only one way of organizing client-server applications. In modern architectures, it is often the distribution of the clients and the servers that counts, which we refer to as horizontal distribution. In this type of distribution, a client or server may be physically split up into logically equivalent parts, but each part is operating on its own share of the complete data set, thus balancing the load.

Structured peer-to-peer architecture

In a structured peer-to-peer architecture, the overlay network is constructed using a deterministic procedure. By far the most used procedure is to organize the processes through a distributed hash table (DHT).
In a DHT-based system, data items are assigned a random key from a large identifier space, such as a 128-bit or 160-bit identifier. Likewise, nodes in the system are also assigned a random number from the same identifier space. The crux of every DHT-based system is then to implement an efficient and deterministic scheme that uniquely maps the key of a data item to the identifier of a node based on some distance metric. Most importantly, when looking up a data item, the network address of the node responsible for that data item is returned. Effectively, this is accomplished by routing a request for a data item to the responsible node.

Unstructured peer-to-peer architecture

Unstructured peer-to-peer systems largely rely on randomized algorithms for constructing an overlay network. The main idea is that each node maintains a list of neighbors, but that this list is constructed in a more or less random way. Likewise, data items are assumed to be randomly placed on nodes. As a consequence, when a node needs to locate a specific data item, the only thing it can effectively do is flood the network with a search query.

Super peers

Notably in unstructured peer-to-peer systems, locating relevant data items can become problematic as the network grows. The reason for this scalability problem is simple: as there is no deterministic way of routing a lookup request to a specific data item, essentially the only technique a node can resort to is flooding the request. As an alternative, many peer-to-peer systems have proposed to make use of special nodes, called super peers, that maintain an index of data items.

Consider a collaboration of nodes that offer resources to each other. For example, in a collaborative content delivery network (CDN), nodes may offer storage for hosting copies of Webpages, allowing Web clients to access pages nearby, and thus to access them quickly.
In this case a node P may need to seek resources in a specific part of the network. In that case, making use of a broker that collects resource usage for a number of nodes that are in each other's proximity will allow it to quickly select a node with sufficient resources.

Nodes such as those maintaining an index or acting as a broker are generally referred to as super peers. As their name suggests, super peers are often also organized in a peer-to-peer network, leading to a hierarchical organization. A simple example of such an organization is shown in Fig. 9. In this organization, every regular peer is connected as a client to a super peer. All communication, from and to a regular peer, proceeds through that peer's associated super peer.

Fig. 9 Hierarchical organization of nodes into a super peer network

In many cases, the client-super peer relation is fixed: whenever a regular peer joins the network, it attaches to one of the super peers and remains attached until it leaves the network. Obviously, it is expected that super peers are long-lived processes with a high availability. To compensate for potential unstable behavior of a super peer, backup schemes can be deployed, such as pairing every super peer with another one and requiring clients to attach to both.

Having a fixed association with a super peer may not always be the best solution. For example, in the case of file-sharing networks, it may be better for a client to attach to a super peer that maintains an index of files that the client is generally interested in. In that case, chances are bigger that when a client is looking for a specific file, its super peer will know where to find it. Garbacki et al. describe a relatively simple scheme in which the client-super peer relation can change as clients discover better super peers to associate with. In particular, a super peer returning the result of a lookup operation is given preference over other super peers.
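The role of a super peer as an index for its attached regular peers can be sketched as follows. This is a minimal illustration under simplifying assumptions, not a description of any particular system: there is a single super peer, the relation is fixed, and the class and method names (SuperPeer, RegularPeer, register, lookup, find) are hypothetical. The overlay among super peers and the adaptive scheme mentioned above are deliberately left out.

```python
class SuperPeer:
    """Keeps an index of which attached regular peers hold which data items."""

    def __init__(self, name):
        self.name = name
        self.index = {}  # item name -> set of peer names

    def register(self, peer_name, items):
        for item in items:
            self.index.setdefault(item, set()).add(peer_name)

    def lookup(self, item):
        return sorted(self.index.get(item, set()))


class RegularPeer:
    def __init__(self, name, items, super_peer):
        self.name = name
        self.super_peer = super_peer
        # On joining, the peer attaches to its super peer and reports its items.
        super_peer.register(name, items)

    def find(self, item):
        # Lookups go through the associated super peer instead of flooding.
        return self.super_peer.lookup(item)


sp = SuperPeer("sp1")
alice = RegularPeer("alice", ["a.txt", "b.txt"], sp)
bob = RegularPeer("bob", ["b.txt"], sp)
print(bob.find("b.txt"))  # ['alice', 'bob']
print(bob.find("c.txt"))  # []
```

A lookup costs one request to the super peer, whereas flooding would contact many neighbors; the price is that the super peer must be long-lived and highly available.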
As we have seen, peer-to-peer networks offer a flexible means for nodes to join and leave the network. However, with super peer networks a new problem is introduced, namely how to select the nodes that are eligible to become super peers. This problem is closely related to the leader-election problem.

Hybrid architectures

So far, we have focused on client-server architectures and a number of peer-to-peer architectures. Many distributed systems combine architectural features, as we already came across in super peer networks. In this section we take a look at some specific classes of distributed systems in which client-server solutions are combined with decentralized architectures:
- Edge-server systems
- Collaborative distributed systems

Edge-Server Systems

An important class of distributed systems that is organized according to a hybrid architecture is formed by edge-server systems. These systems are deployed on the Internet where servers are placed "at the edge" of the network. This edge is formed by the boundary between enterprise networks and the actual Internet, for example, as provided by an Internet Service Provider (ISP). Likewise, where end users at home connect to the Internet through their ISP, the ISP can be considered as residing at the edge of the Internet. This leads to a general organization as shown below.

Fig 10: Viewing the Internet as consisting of a collection of edge servers

End users, or clients in general, connect to the Internet by means of an edge server. The edge server's main purpose is to serve content, possibly after applying filtering and transcoding functions. More interesting is the fact that a collection of edge servers can be used to optimize content and application distribution.
The basic model is that for a specific organization, one edge server acts as an origin server from which all content originates. That server can use other edge servers for replicating Webpages and such.

Collaborative Distributed Systems

Hybrid structures are notably deployed in collaborative distributed systems. The main issue in many of these systems is to first get started, for which often a traditional client-server scheme is deployed. Once a node has joined the system, it can use a fully decentralized scheme for collaboration.

Let us first consider the BitTorrent file-sharing system. BitTorrent is a peer-to-peer file downloading system. Its principal working is shown in Fig. 11. The basic idea is that when an end user is looking for a file, she downloads chunks of the file from other users until the downloaded chunks can be assembled together, yielding the complete file. An important design goal was to ensure collaboration. In most file-sharing systems, a significant fraction of participants merely download files but otherwise contribute close to nothing. To this end, a file can be downloaded only when the downloading client is providing content to someone else.

Fig 11: The principal working of BitTorrent

To download a file, a user needs to access a global directory, which is just one of a few well-known Websites. Such a directory contains references to what are called .torrent files. A .torrent file contains the information that is needed to download a specific file. In particular, it refers to what is known as a tracker, which is a server that is keeping an accurate account of active nodes that have (chunks of) the requested file. An active node is one that is currently downloading another file. Obviously, there will be many different trackers, although there will generally be only a single tracker per file (or collection of files).
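The client-server part of this hybrid design, the tracker, can be illustrated with a toy sketch. This is not the actual BitTorrent protocol or message format: the Tracker class, its announce method, and the file and node names are assumptions made purely for illustration. The sketch only shows the bookkeeping role described above, namely remembering which nodes are active for a file and handing that list to newcomers.

```python
class Tracker:
    """Toy tracker: remembers which nodes are active for each file."""

    def __init__(self):
        self._active = {}  # file name -> set of node addresses

    def announce(self, file_name, node):
        # Tell the caller which other nodes are active, then record the caller.
        others = sorted(self._active.get(file_name, set()) - {node})
        self._active.setdefault(file_name, set()).add(node)
        return others


tracker = Tracker()
print(tracker.announce("course.pdf", "node-A"))  # []
print(tracker.announce("course.pdf", "node-B"))  # ['node-A']
print(tracker.announce("course.pdf", "node-C"))  # ['node-A', 'node-B']
```

After this bootstrap step the tracker is out of the data path; the chunks themselves are exchanged directly between nodes.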
Once the nodes have been identified from where chunks can be downloaded, the downloading node effectively becomes active. At that point, it will be forced to help others, for example by providing chunks of the file it is downloading that others do not yet have. This enforcement comes from a very simple rule: if node P notices that node Q is downloading more than it is uploading, P can decide to decrease the rate at which it sends data to Q. This scheme works well provided P has something to download from Q. For this reason, nodes are often supplied with references to many other nodes, putting them in a better position to trade data.

System Models

Systems that are intended for use in real-world environments should be designed to function correctly in the widest possible range of circumstances and in the face of many possible difficulties and threats. Each type of model is intended to provide an abstract, simplified but consistent description of a relevant aspect of distributed system design:

Physical models are the most explicit way in which to describe a system; they capture the hardware composition of a system in terms of the computers (and other devices, such as mobile phones) and their interconnecting networks.

Architectural models describe a system in terms of the computational and communication tasks performed by its computational elements; the computational elements being individual computers or aggregates of them supported by appropriate network interconnections.

Fundamental models take an abstract perspective in order to examine individual aspects of a distributed system.
In this section we introduce fundamental models that examine three important aspects of distributed systems:
- interaction models, which consider the structure and sequencing of the communication between the elements of the system;
- failure models, which consider the ways in which a system may fail to operate correctly; and
- security models, which consider how the system is protected against attempts to interfere with its correct operation or to steal its data.

Physical Models

A physical model is a representation of the underlying hardware elements of a distributed system that abstracts away from specific details of the computer and networking technologies employed. A distributed system was defined in Chapter 1 as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. This leads to a minimal physical model of a distributed system as an extensible set of computer nodes interconnected by a computer network for the required passing of messages. Beyond this baseline model, we can usefully identify three generations of distributed systems.

Early distributed systems: Such systems emerged in the late 1970s and early 1980s in response to the emergence of local area networking technology, usually Ethernet. These systems typically consisted of between 10 and 100 nodes interconnected by a local area network, with limited Internet connectivity, and supported a small range of services such as shared local printers and file servers as well as email and file transfer across the Internet. Individual systems were largely homogeneous and openness was not a primary concern. Providing quality of service was still very much in its infancy and was a focal point for much of the research around such early systems.
Internet-scale distributed systems: Building on this foundation, larger-scale distributed systems started to emerge in the 1990s in response to the dramatic growth of the Internet during this time (for example, the Google search engine was first launched in 1996). In such systems, the underlying physical infrastructure is an extensible set of nodes interconnected by a network of networks (the Internet). Such systems exploit the infrastructure offered by the Internet to become truly global. They incorporate large numbers of nodes and provide distributed system services for global organizations and across organizational boundaries. The level of heterogeneity in such systems is significant in terms of networks, computer architecture, operating systems, languages employed and the development teams involved. This has led to an increasing emphasis on open standards and associated middleware technologies such as CORBA and, more recently, web services. Additional services were employed to provide end-to-end quality of service properties in such global systems.

Contemporary distributed systems: In the above systems, nodes were typically desktop computers and therefore relatively static (that is, remaining in one physical location for extended periods), discrete (not embedded within other physical entities) and autonomous (to a large extent independent of other computers in terms of their physical infrastructure). The emergence of mobile computing has led to physical models where nodes such as laptops or smartphones may move from location to location in a distributed system, leading to the need for added capabilities such as service discovery and support for spontaneous interoperation.
The emergence of ubiquitous computing has led to a move from discrete nodes to architectures where computers are embedded in everyday objects and in the surrounding environment (for example, in washing machines or in smart homes more generally). The emergence of cloud computing and, in particular, cluster architectures has led to a move from autonomous nodes performing a given role to pools of nodes that together provide a given service (for example, a search service as offered by Google).

Fig 12: Generation of Distributed Systems

Architectural Models

The architecture of a system is its structure in terms of separately specified components and their interrelationships. The overall goal is to ensure that the structure will meet present and likely future demands on it. Major concerns are to make the system reliable, manageable, adaptable and cost-effective. The architectural design of a building has similar aspects: it determines not only its appearance but also its general structure and architectural style (gothic, neo-classical, modern) and provides a consistent frame of reference for the design.

Architectural elements

To understand the fundamental building blocks of a distributed system, it is necessary to consider four key questions:
1. What are the entities that are communicating in the distributed system?
2. How do they communicate, or, more specifically, what communication paradigm is used?
3. What (potentially changing) roles and responsibilities do they have in the overall architecture?
4. How are they mapped on to the physical distributed infrastructure (what is their placement)?

Communicating Entities

The first two questions above are absolutely central to an understanding of distributed systems; what is communicating and how those entities communicate together define a rich design space for the distributed systems developer to consider.
It is helpful to address the first question from a system-oriented and a problem-oriented perspective.

From a system perspective, the answer is normally very clear in that the entities that communicate in a distributed system are typically processes, leading to the prevailing view of a distributed system as processes coupled with appropriate interprocess communication paradigms, with two caveats:
- In some primitive environments, such as sensor networks, the underlying operating systems may not support process abstractions (or indeed any form of isolation), and hence the entities that communicate in such systems are nodes.
- In most distributed system environments, processes are supplemented by threads, so, strictly speaking, it is threads that are the endpoints of communication.

Objects

Objects have been introduced to enable and encourage the use of object-oriented approaches in distributed systems (including both object-oriented design and object-oriented programming languages). In distributed object-based approaches, a computation consists of a number of interacting objects representing natural units of decomposition for the given problem domain. Objects are accessed via interfaces, with an associated interface definition language (or IDL) providing a specification of the methods defined on an object.

Components

Since their introduction, a number of significant problems have been identified with distributed objects, and the use of component technology has emerged as a direct response to such weaknesses. Components resemble objects in that they offer problem-oriented abstractions for building distributed systems and are also accessed through interfaces.
The key difference is that components specify not only their (provided) interfaces but also the assumptions they make in terms of other components/interfaces that must be present for a component to fulfill its function. In other words, they make all dependencies explicit and provide a more complete contract for system construction.

Web services

Web services represent the third important paradigm for the development of distributed systems. Web services are closely related to objects and components, again taking an approach based on encapsulation of behavior and access through interfaces. In contrast, however, web services are intrinsically integrated into the World Wide Web, using web standards to represent and discover services.

Communication paradigms

We now turn to how entities communicate in a distributed system, and consider three types of communication paradigm:
a. Interprocess communication
b. Remote invocation
c. Indirect communication

Interprocess communication refers to the relatively low-level support for communication between processes in distributed systems, including message-passing primitives, direct access to the API offered by Internet protocols (socket programming) and support for multicast communication.

Remote invocation represents the most common communication paradigm in distributed systems, covering a range of techniques based on a two-way exchange between communicating entities in a distributed system and resulting in the calling of a remote operation, procedure or method.

Request-reply protocols: Request-reply protocols are effectively a pattern imposed on an underlying message-passing service to support client-server computing.
In particular, such protocols typically involve a pairwise exchange of messages from client to server and then from server back to client. The first message contains an encoding of the operation to be executed at the server, together with an array of bytes holding the associated arguments; the second message contains any results of the operation, again encoded as an array of bytes. This paradigm is rather primitive and is only really used directly in embedded systems where performance is paramount. The approach is also used in the HTTP protocol. Most distributed systems will elect to use remote procedure calls or remote method invocation, as discussed below, but note that both approaches are supported by underlying request-reply exchanges.

Remote procedure calls: In RPC, procedures in processes on remote computers can be called as if they were procedures in the local address space. The underlying RPC system hides important aspects of distribution, including the encoding and decoding of parameters and results, the passing of messages and the preservation of the required semantics for the procedure call. This approach directly and elegantly supports client-server computing, with servers offering a set of operations through a service interface and clients calling these operations directly as if they were available locally. RPC systems therefore offer (at a minimum) access and location transparency.

Remote method invocation: Remote method invocation (RMI) strongly resembles remote procedure calls but in a world of distributed objects. With this approach, a calling object can invoke a method in a remote object. As with RPC, the underlying details are generally hidden from the user. RMI implementations may, though, go further by supporting object identity and the associated ability to pass object identifiers as parameters in remote calls.
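The relationship between request-reply exchanges and RPC described above can be illustrated with a small sketch. It is not taken from any particular RPC system: the operation table, the JSON encoding and the names `encode_request`, `handle_request` and `Stub` are all assumptions made for illustration, and a direct function call stands in for the network.

```python
import json

# Hypothetical server-side operations; the names are illustrative only.
OPERATIONS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def encode_request(operation, args):
    """Marshal the operation name and its arguments into an array of bytes."""
    return json.dumps({"op": operation, "args": list(args)}).encode("utf-8")

def handle_request(request_bytes):
    """Server side: decode the request, run the operation, encode the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    try:
        result = OPERATIONS[request["op"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        reply = {"ok": False, "error": repr(exc)}
    return json.dumps(reply).encode("utf-8")

class Stub:
    """Client-side proxy that makes a remote operation look like a local call."""

    def __init__(self, transport):
        # transport is any callable mapping request bytes to reply bytes.
        self._transport = transport

    def __getattr__(self, operation):
        # Only reached for names that are not real attributes of the stub.
        def call(*args):
            reply_bytes = self._transport(encode_request(operation, args))
            reply = json.loads(reply_bytes.decode("utf-8"))
            if not reply["ok"]:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return call

# Assumption: handle_request is called directly instead of over a socket.
server = Stub(handle_request)
print(server.add(2, 3))
print(server.upper("rpc"))
```

The caller writes `server.add(2, 3)` as if `add` were local, while the stub performs the request-reply exchange underneath. A real RPC system would additionally deal with message loss, timeouts and the required call semantics, which this sketch leaves out.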
Indirect communication
Key techniques for indirect communication include:

Group communication: Group communication is concerned with the delivery of messages to a set of recipients and hence is a multiparty communication paradigm supporting one-to-many communication. It relies on the abstraction of a group, which is represented in the system by a group identifier. Recipients elect to receive messages sent to a group by joining the group. Senders then send messages to the group via the group identifier, and hence do not need to know the recipients of the message. Groups typically also maintain group membership and include mechanisms to deal with failure of group members.

Publish-subscribe systems: Many systems can be classified as information-dissemination systems, wherein a large number of producers (or publishers) distribute information items of interest (events) to a similarly large number of consumers (or subscribers). It would be complicated and inefficient to employ any of the core communication paradigms discussed above for this purpose, and hence publish-subscribe systems (sometimes also called distributed event-based systems) have emerged to meet this important need. Publish-subscribe systems all share the crucial feature of providing an intermediary service that efficiently ensures information generated by producers is routed to the consumers who desire this information.

Message queues: Whereas publish-subscribe systems offer a one-to-many style of communication, message queues offer a point-to-point service whereby producer processes can send messages to a specified queue and consumer processes can receive messages from the queue or be notified of the arrival of new messages in the queue. Queues therefore offer an indirection between the producer and consumer processes.
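The contrast between the one-to-many style of publish-subscribe and the point-to-point style of message queues can be sketched in a few lines. This is a single-process illustration under stated assumptions: the class names `Broker` and `MessageQueue`, the topic name and the message contents are hypothetical, and a real system would place the intermediary on the network.

```python
from collections import defaultdict, deque

class Broker:
    """Intermediary that routes published events to every subscriber of a topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # The publisher never learns who the subscribers are.
        for callback in list(self._subscribers[topic]):
            callback(event)

class MessageQueue:
    """Point-to-point queue: each message is handed to one receiver only."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty (a real queue might block).
        return self._messages.popleft() if self._messages else None

# One published event reaches both subscribers.
broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("prices", inbox_a.append)
broker.subscribe("prices", inbox_b.append)
broker.publish("prices", {"symbol": "XYZ", "price": 10})

# One queued message is delivered to the first receiver and is then gone.
queue = MessageQueue()
queue.send("job-1")
first = queue.receive()
second = queue.receive()
```

In both cases the sender addresses the intermediary (a topic or a queue) rather than a recipient, which is the indirection the text describes.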
Tuple spaces: Tuple spaces offer a further indirect communication service by supporting a model whereby processes can place arbitrary items of structured data, called tuples, in a persistent tuple space, and other processes can either read or remove such tuples from the tuple space by specifying patterns of interest. Since the tuple space is persistent, readers and writers do not need to exist at the same time. This style of programming is otherwise known as generative communication.

Distributed shared memory: Distributed shared memory (DSM) systems provide an abstraction for sharing data between processes that do not share physical memory. The underlying infrastructure must ensure a copy is provided in a timely manner and also deal with issues relating to synchronization and consistency of data.
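Returning to tuple spaces, the read and remove operations with patterns of interest can be sketched as follows. This is a minimal in-memory stand-in, not a real tuple space: the class `TupleSpace`, its method names and the use of `None` as a wildcard are assumptions for illustration, and the sketch is neither persistent nor shared between processes.

```python
class TupleSpace:
    """In-memory stand-in for a tuple space; None in a pattern is a wildcard."""

    def __init__(self):
        self._tuples = []

    def write(self, item):
        """Place a tuple in the space."""
        self._tuples.append(tuple(item))

    def _find(self, pattern):
        # A tuple matches when it has the same length as the pattern and
        # every non-wildcard field is equal.
        for item in self._tuples:
            if len(item) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, item)
            ):
                return item
        return None

    def read(self, pattern):
        """Return a matching tuple and leave it in the space."""
        return self._find(pattern)

    def take(self, pattern):
        """Return a matching tuple and remove it from the space."""
        item = self._find(pattern)
        if item is not None:
            self._tuples.remove(item)
        return item

space = TupleSpace()
space.write(("temperature", "room-1", 21.5))
space.write(("temperature", "room-2", 19.0))

# The reader names a pattern of interest, not the writer of the tuple.
reading = space.read(("temperature", "room-2", None))
removed = space.take(("temperature", "room-1", None))
```

The writer and the reader never refer to each other, only to the content of the tuples. In this sketch `read` and `take` return `None` when nothing matches; a fuller implementation might instead block until a matching tuple appears.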