Manageability of Future Internet Choong Seon Hong Kyung Hee University cshong@khu.ac.kr November 23, 2010 Contents 2 Introduction to Future Internet and its Manageability GENI Working Groups related to Mgmt GMOC Federation Requirements for Future Internet 3 Security • Seamless handoff/roaming • Identity/addressing Mobility • Intelligent and programmable network nodes Programmability • Integrity, authenticity, confidentiality of communication with any given peer • Virtualization of Resources Scalability Interoperability Reliability Availability Virtualization • FCAPS • Autonomic Management Manageability Requirements for Future Internet 4 Security • Seamless handoff/roaming • Identity/addressing Mobility • Integrity, authenticity, confidentiality of communication with any given peer • Virtualization of Resources Scalability Interoperability Reliability Availability Virtualization Manageability • Intelligent and programmable network nodes Programmability • FCAPS • Autonomic Management Manageability 5 Current Network Environment 6 INTERNET xDSL/Cable FTTH PSDN 10 Gigabit Ethernet ATM Satellite Fast Ethernet ISDN SS#7 Gigabit Ethernet WANs IN/AIN Ethernet Broadcast Networks (DAB, DVB-T) PSTN SONET CDMA, GSM, GPRS IP-based micro-mobility B-ISDN Bluetooth Zigbee 6LoWPAN Wireless LANs WiBro, HSDPA Current Network Management Framework 7 Management Platform Collect, organize & interpret Operational Data Administrator Workstation mgmt requests/replies Agent event reports Agent Agent Agent Agent Agent Observation & Control Agent Functional Requirements for NM 8 Fault Management detection, Configuration Management identify track of usage for charging Performance Management monitor managed resources and their connectivity, discovery Accounting Management keep isolation and correction of abnormal operations and evaluate the behavior of managed resources Security Management allow FCAPS only authorized access and control Standard Management Frameworks 9 OSI Network Management Framework CMIP (X.700 Series) Internet Network Management Framework SNMPv1 SNMPv2 SNMPv3 TeleManagement Forum Distributed Management Task Force SID, eTOM, NGOSS CIM, WBEM Open Mobile Alliance OMA DM 10 11 Manageability for the current Internet has been developed as an afterthought! THINK about Manageability of Future Internet Do we need a revolutionary approach or an evolutionary approach? FCAPS ? Management for Future Internet 12 Autonomic Management/Self-Management Self-managing frameworks and architecture Knowledge engineering, including information modeling and ontology design Policy analysis and modeling Semantic analysis and reasoning technologies Virtualization of resources Orchestration techniques Self-managed networks Context-awareness Adaptive management Research Efforts for Management of FI US NSF Future Complexity Oblivious Network Management architecture (CONMan) Global (GENI) Internet Design (FIND) Environment for Networking Innovations Operations, Management, Integration and Security (OMIS) WG EU Framework 13 Program (FP) 7 4WARD In-network (INM) project Autonomic Internet (AutoI) project Autonomic Network Architecture (ANA) project CONMan: Overview Management interface should contain as little protocol-specific information as possible Complexities of protocols should be masked from management Goal A generic abstraction of network entities (protocols & devices) for management purpose A set of atomic management operations to work upon the abstraction A way to translate high-level management objectives to low-level operations 14 Research Efforts - EU 15 http://www.4ward-project.eu 4WARD WP4: INM (In Network Management) Autonomic self-management Abstractions and a framework for a selforganizing management plane Scheme, strategies, and protocols for collaborative monitoring, self-optimizing, and self-healing Research Efforts - USA 16 GENI OMIS WG (Operations, Management, Integration and Security) Operations, management, integration and security processes in GENI Experiment support, monitoring, and data storage Security monitoring and incident response Federation management and monitoring Hardware release, maintenance and integration Software release, maintenance and integration Operations metric collection and analysis http://www.geni.net/wg/omis-wg.html Research Efforts - Korea 17 CASFI (Collect, Analyze, and Share for Future Internet) Goals Manageability of Future Internet Data Sharing Platform for Performance Measurement High-Precision Measurement and Analysis Human Behavior Analysis Groups KHU, KAIST, POSTECH, CNU Period 2008.03.01 ~ 2013.02.28 http://casfi.kaist.ac.kr Management for Future Internet [1] 18 Management Interface Management Information Modeling & Operations Instrumentation Management Architecture Centralized vs. Decentralized Management Peer-to-Peer Hybrid Service Management Customer-centric service Service portability SLA/QoS Management for Future Internet [2] 19 Traffic Monitoring/Measurement and Analysis Monitoring for large-scale and high-speed networks Network/application-level monitoring Global traffic data access/sharing Fast and real time monitoring Statistical sampling method Storing method for large scale traffic data Measurement and analysis of social networking GENI Working Groups related to Mgmt 20 Outline 21 GENI Working Groups Control Framework Experiment Workflow & Services Instrumentation & Measurements Operation, Management, Integration & Security (OMIS) GMOC GENI Meta Operation Center GENI Working Groups 22 Control Framework WG Experiment Workflow and Services WG Logically stitching GENI components and user-level services into a coherent system Design of how resources are described and allocated and how users are identified and authorized Tools and mechanisms a researcher uses to design and perform experiments using GENI Includes all user interfaces for researchers, as well as data collection and archiving Instrumentation & Measurements WG GIMS - GENI Instrumentation and Measurement Service GENI researchers require extensive and reliable instrumentation and measurement capabilities to gather, analyze, present and archive Measurement Data To conduct useful and repeatable experiments Operations, Management, Integration and Security (OMIS) WG Designing, deploying, and overseeing the GENI infrastructure Operation Framework Control Framework 23 GENI control framework defines: Interfaces between all entities Message types including basic protocols and required functions Message flows necessary to realize key experiment scenarios GENI control framework includes the entities and the Control Plane for transporting messages between these entities component control slice control access control within GENI federation key enablers such as identification, authentication and authorization GENI Architecture - Control Framework 24 The Control Framework WG focuses on component control, slice control, access control within GENI and federation and interaction between these GENI entities Experiment Workflow & Services 25 Identify and specify tools and services needed to run experiments on GENI Planning, scheduling, deploying, running, debugging, analyzing, growing/shrinking experiments Collaboration Multiple researchers on an experiment Building on other experiments Identify interfaces/ joint definition/ information-exchange needed across working groups Provides Services What resources are available to slices What level of programmability is possible on different components and their associated resources Relationship to GENI Architecture 26 WG focuses on experimenter-users needs for planning, scheduling, running, debugging, analyzing and archiving experiments. Instrumentations & Measurements 27 Discuss, develop and build consensus around the architectural framework for the instrumentation and measurement infrastructure that will be deployed and used in GENI Create an architecture for measurement that enables GENI goals to be achieved Facilitate dialog and coordination between teams focused on I&M Identify key challenges in I&M that could otherwise inhibit the infrastructure Solicit feedback from users Deploy basic instrumentation and measurement capabilities Services Measurement Orchestration (MO) Measurement Point (MP) Measurement Collection (MC) Measurement Analysis and Presentation (MAP) Measurement Data Archive (MDA) Relationship to GENI Architecture 28 The Instrumentation and Measurement WG focuses on the instrumentation and measurement infrastructure that will be deployed and used in GENI. GIMS – Protocols & Communication 29 Researcher via Experiment Control service (tools), including MO(Measurement Orchestration) service, manages the setup and running of I&M services Protocols for researcher/experiment control tools to access APIs: Xml-rpc web services (SOAP, WSDL) APIs for setting up and running I&M services APIs for MP (Measurement Point) services APIs for MC (Measurement Collection) services APIs for MAP (Measurement Analysis and Presentation)services APIs for MDA (Measurement for Data Archiving) service All traffic is carried in the GENI Control Plane GIMS Traffic Flow 30 Option 1: Option 2: Carry all MD (Measurement Data) traffic flows using a dedicated measurement VLAN Carry all MD traffic flows using the same IP network that supports the Control Plane. Option 3: Carry most MD traffic flows using the same IP network that supports the Control Plane, but for high-rate MD traffic flows, define a dedicated measurement VLAN for the slice/experiment Detailed Outline for OMIS 31 Operation, Management, Integration & Security (OMIS) GMOC GENI Meta Operation Center Why Meta-Operation? Objective Architecture Operational Data Set Topology Operational Status Administrative Status Utilization Measurements Specialized Data Data Acquisition & Sharing Communication & Coordination Operations Use Case Notification Emergency Shutdown Functions OMIS 32 Operation GMOC (GENI Meta-Operation Center) Management Meta-Management Integration Overlap System for GENI & Interfaces with other WGs Security Policies, Authorization & Authentication Overlaps with other WG 33 Control Framework WG common interface for operations Security Experiment Workflow and Services WG lower levels of GENI & higher level should be consistent Operation & Management Tools Services Usage Instrumentation & Measurements WG Data Acquisition Measurements for performance and management Relationship to GENI Architecture 34 OMIS WG focuses on GENI operations, management and GENI wide view of the projects and experiments Questions 35 How will network operators exchange the data necessary to allow end-to-end troubleshooting of crossdomain circuits? How will network operators exchange data to create a end-to-end view (user view or operator view) of crossdomain circuits? How will network security concerns be taken into account? It is believed that GMOC activity represents one possible path forwarding addressing to these complex cross-domain issues Answer 36 Collect, Analyze and Share Meta Operation Center Federated Network Management Management Analyze Integration Operations Security Collect Share GMOC 37 GENI Meta-Operation Center Goal: To start to help develop the datasets, tools, formats, & protocols needed to share operational data among GENI constituents Why “Meta?” There will be lots of groups operating their own parts This is no intention to change that Interested in what kinds of data exchange and functions are useful to share among these groups, at a GENI-wide level Operations is important Reliability Repeatability User Opt-in GMOC: Objective 38 Give GENI-wide view of operational status of the GENI system maps & graphs prototype other views, such as slice-by-slice views Need for a common operational dataset Give Scientists access to their data GENI-wide and Researcher specific “What was going on during these 2 weeks I ran my test?” Operations Emergency Shutdown find out-of-control virtual slices and isolate or shut them down Identify & Shutdown Misbehaving Slices Protect Other Slices Ensure Stability Meta-Operations 39 • GMOC is not entirely a Centralized or a Distributed architecture • GENI projects can best handle most operational tasks • GMOC coordinates operations across projects to present a single interface to operators and users Cluster 1 Project B Cluster 2 Project C Project A Project D GMOC - Architecture 40 GMOC Translator Alert Monitor GMOC Data Repository GMOC Exchanger Operations Backbone Visualizations Operations Portal Control Framework Clusters Control/ Emergency Stop Conceptual Design 41 GMOC GMOCRepository Exchanger- -Central Polls and/or GMOC Translator - Translates datastore for operational receives operational datadata from information from other formats into from all GENI parts aggregates consistent data format Spiral 1 42 Deliverables Define an Operational Dataset Choose a Dataset Format & Protocol Build Functions Spiral 2 43 1. 2. 3. 4. 5. GMOC contacts exemplar projects and starts a dialogue on what data they are collecting, how that data can be mapped to the operational data set and what issues the specific project has with the operational data set. GMOC starts collecting as much data as possible from the exemplar projects on the format of their choosing importing it into RRD(Round Robin Database) files. GMOC integrates all the data collection tools with the GMOC user interface to provide a unified interface to the diverse backend dataset. GMOC works with the exemplar project to create and use a unified for operational data sharing. GMOC works with other projects to determine effective mechanisms for exporting the operational data set. Data Views 44 How do we look at Operations? Aggregate view Component view Slice view Sliver view GENI Operational Data View 45 Operations’ Requirements 46 It will need to be a collaborative effort Will be contacting anchors and related projects for input Each project may share different kinds/amounts of operational data Initially, concentrating on operational data about components/aggregates and their interconnections, Additionally, may want to access information about the mapping of aggregates data to slice data Balance between central visibility and decentralized autonomy will need to evolve (and continue evolving) Use cases: slice A needs emergency shutdown; which aggregate(s) need to act? what slices were affected by the outage on component B? what was the state of GENI during the life of my experiment on slice C? GMOC: The Plan 47 Set of things needed for GENI operations? Step 1: what kinds of data is needed (need to get)? Operational Data & Data formats Step 2: how should that data be shared? Data Acquisition & Sharing Coordination (Communication) Step 3: what should be done with the data once gets it? Visualization Monitoring Operations Emergency Shutdown Function Event Notification Step 1: Operational Data 48 Potential Types of Operationally Significant Data 1. System-wide View (topology) 2. Operational Status 3. Administrative Status 4. Utilization Data (Measures) 5. Specialized Data Data: Topology [1/2] 49 What exists at a given time on GENI, from an operational viewpoint System Component/Aggregate perspective Slice perspective Requires data about topology of aggregates/components, and the mapping of slice to component. Data might come from experiment tools, clearinghouses, or aggregate managers Aggregates, Components, Resources, Interfaces, Circuits/links, Slivers & their relationship Relationships are described by graphs Data: Topology[2/2] 50 Topology Description Network Description Language (NDL) perfSONAR topology schema GEANT2’s Common Network Information Service (cNIS) OpenGring Forum’s NML (Network Markup Language) Ontology based Topology Description Shows the Topology and the relationships Combination of RRDTool and SQL database RRDTool stores data about utilization, SQL database about GENI topology Data: Operational Status 51 The operational status of a given component, sliver, aggregate, or slice, at a given time Up Down Impaired May include additional specific information Potential States i.e. how is it impaired, or why is it down Examples Component Operational Status Interconnection for operational status – linking Sliver operational status (e.g. virtual machine running, Ethernet VLAN active, etc.) Data: Administrative Status 52 The expected state of a given component, sliver, aggregate, or slice Potential States Up Down Impaired Used in conjunction with operational status to understand overall status Any discrepancy means a misbehaving slice Data: Utilization Measures 53 Time series measures of a resource in use by a GENI component, aggregate, slice etc. Usefull for visualization of GENI-wide view Link Utilization of resources CPU utilization (component level) Condition Measures Critical to emergency shutdown Determination to correct behavior Analogous to Service level agreement (SLA) Utilization Data - Data about the data flowing on GENI components, slices, backbones, etc Some things might be fairly common Link utilization CPU utilization Memory utilization Data: Specialized Data 54 Data specific to the type of component latency/jitter signal strength error counts (network links) Data specific to a situation Wireless propagation, virtual memory usage, page faults, etc Disk cache performance Not useful for GENI unified Interface, but to a user or researcher There should be a way for aggregates/components to create their own types of this data Data Format -ERD 55 Step 2: Data Acquisition & Sharing 56 GENI is made up of many loosely affiliated projects Many projects have existing means to provide operations Data Sharing is difficult as diverse data formats and tools are used Possible Data Acquisition & Sharing tools ( & formats) RRDTool & SQL Database PerfSONAR SNAPP (GRNOC SNMP Collector) Communication 57 There are two possible ways for coordination between the Projects and GMOC Interfaces Define Reports (API) consistent messages to push or pull data sharing Projects submit performance or/and network description and utilization reports to the GMOC A protocol for communication is needed to be standardized Federation in GMOC 58 Proposal # 1 Local MOC for each project Communicate with GMOC at GENI level Communication by Interfaces (APIs) or Protocol Proposal # 2 Shareable Operational data with in a Federation ‘namespace’ Network Description & Measures Reports Reports needed to be consistent Use an adapter (translator) Standard reporting style (SNMP-based, RRDTool , PerfSONAR etc.) E.g. ngeni.<organization>.<operational_state>.<network_id>.<device_id> = “BGP Router 1” Federated GMOC: P1 59 Translator Proposal # 1 Meta-Operation Center for each project Communication with GMOC GMOC Repository Local Interfaces (APIs) Exchanger MOC Visualization Monitoring nGENI Control/ Emergency Stop Operations Monitoring Visualization MOC xGENI Control/ Emergency Stop Operations Federated GMOC: P2 60 Translator Proposal # 2 GMOC directly communicate with the Operational portal Operational data with in Federation ‘namespace’ Shareable Data Exchanger Monitoring Operational Portal Using a translator Standard reporting GMOC Repository SNMP-based, RRD, PerfSONAR etc.) E.g. (at bottom) nGENI Visualization Control/ Emergency Stop Monitoring Operational Portal xGENI Visualization Control/ Emergency Stop ngeni.<organization>.<operational_state>.<network_id>.<device_id> = “BGP Router 1” Step 3: GMOC - Operations 61 Operations required by GMOC Experiment support, monitoring, and data storage Security monitoring and incident response (including incidents unrelated to security) Federation management and monitoring Hardware release, maintenance and integration Software release, maintenance and integration Operations metric collection and analysis Event Notification Emergency Shutdown Control Framework: Event Notification 62 Notification & Data Sharing 63 Operational Data gathering done by each project locally by the NOC Received from the aggregates, components, sliver etc Private data: abstracted from the global view Public data: directly visible for GENI wide view Local NOC (event producer) creates an ‘Event Repository’ Events that can occur with in the resources Event Repository is registered at the Clearinghouse along with the resources Operator and Admin register the events that needed to be ‘consumed’ (received) When an event occurs (Condition Measures inconsistency) the notification is made to the concerned parties (Admin, Component Manager, User, Researcher etc.) Federation 64 Why federating? Through federation it may: Achieve better scaling capabilities Access different resource types (E.G. Wireless, sensors) Contribute to a richer environment offering to the user Facilitate (new) standards definition, E.G. Resource description, protocols, monitoring Federation should not make access more complex to the user, neither exclude unforeseen uses of the facilities Federation Vision 65 Share user credentials and resource descriptions Agree on slice management API and allocation policy Allow experiments to run across facilities Federation: Mechanism 66 Integrated Partially integrated Only part of the control is exchanged, e.g. schedule, AAA information Overlay The facilities can be used as one infrastructure with a inter-domain common control plane Each facility just uses the services/resources of the other without a common control plane, just a data plane, there is an exchange of information related to monitoring, faults, and so on In any case a common data plane and one or more information exchange protocols between them must exist. What is the minimum common set for Federation User Interface and related information exposed to the user Federation: Requirements 67 Given that a Federation is based on two main characteristics: It creates (virtual) resources and the relations between them only when they are needed It must carefully map the virtual resources to the substrate to ensure the best reproducibility to researchers The first step for project (like GENI) is to federate in the overlay model It requires: An agreed (standard) resource description set A common AAI, data plane and information exchange protocol at the substrate level offering a slice, imposes less configuration complexities to other facilities. GENI Federation 68 Federation with Non-GENI Infrastructures Federation among GENI Infrastructures Federation with Resources (Aggregates) GENI Federation 69 Federation: Within GENI 70 Federation: GENI & non-GENI 71 Aggregate Federation 72 73 Federation between GENI Suite and Non-GENI Suite Problems Identity and authority management Control procedures Incompatibility between control procedures Resource and experimentation description Manage identity and authority based on different local policy Use different mechanism for authentication and authorization Use different scheme for resource and experimentation description The following requirements should be considered in order to resolve the observed problems. Common interfaces or adapters for different control framework Common interfaces or adapter for authority service Unified profile for certificate and authority management Common resource and experimentation description language Common data access interface Problems for GENI Federation 74 Classification of federation problems Different identity/authority management Different control procedures Control flows and interfaces/APIs Ex) GENI AM/CM/Slice APIs Different resources and experiments description Identity allocation/authorization policy/mechanism Ex) GID, ABAC (SFA 2.0) Resource description schemes (syntax, context, entity, …) Description of experiments, services, experimental results Ex) RSpec Global standards/adapters * ABAC(Attributed-Based Access Control) Key Issues 75 Reproducibility of experiments, in particular the amount of variation of average values Monitoring and combination of data Virtualization use How to combine physical resources and virtual resources in a seamless environment Signaling between the (many) control planes and ensure the separation between the user control plane and the facility control plane, which is fundamental in case of failures Standards for resource and topology description Check pointing and error recovery/restart AAI, scheduling and naming Reference 76 http://gmoc.grnoc.iu.edu/gmoc/index.html Question and Discussion 77