Manageability in Future Internet

Manageability of Future Internet
Choong Seon Hong
Kyung Hee University
cshong@khu.ac.kr
November 23, 2010
Contents
• Introduction to Future Internet and its Manageability
• GENI Working Groups related to Mgmt
• GMOC
• Federation
Requirements for Future Internet
• Security
  - Integrity, authenticity, confidentiality of communication with any given peer
• Mobility
  - Seamless handoff/roaming
  - Identity/addressing
• Programmability
  - Intelligent and programmable network nodes
• Virtualization
  - Virtualization of Resources
• Manageability
  - FCAPS
  - Autonomic Management
• Scalability
• Interoperability
• Reliability
• Availability
Manageability
Current Network Environment
[Figure: the Internet surrounded by heterogeneous networks and technologies: xDSL/Cable, FTTH, PSDN, ATM, Satellite, Ethernet, Fast Ethernet, Gigabit Ethernet, 10 Gigabit Ethernet, ISDN, B-ISDN, SS#7, IN/AIN, WANs, Broadcast Networks (DAB, DVB-T), PSTN, SONET, CDMA, GSM, GPRS, IP-based micro-mobility, Bluetooth, Zigbee, 6LoWPAN, Wireless LANs, WiBro, HSDPA]
Current Network Management Framework
[Figure: an administrator workstation runs a Management Platform that collects, organizes and interprets operational data. The platform exchanges management requests/replies with Agents, receives event reports from them, and uses them for observation and control.]
Functional Requirements for NM (FCAPS)
• Fault Management
  - detection, isolation and correction of abnormal operations
• Configuration Management
  - identify managed resources and their connectivity, discovery
• Accounting Management
  - keep track of usage for charging
• Performance Management
  - monitor and evaluate the behavior of managed resources
• Security Management
  - allow only authorized access and control
Standard Management Frameworks
• OSI Network Management Framework
  - CMIP (X.700 Series)
• Internet Network Management Framework
  - SNMPv1
  - SNMPv2
  - SNMPv3
• TeleManagement Forum
  - SID, eTOM, NGOSS
• Distributed Management Task Force
  - CIM, WBEM
• Open Mobile Alliance
  - OMA DM
Manageability for the current Internet has been developed as an afterthought!
THINK about Manageability of Future Internet
Do we need a revolutionary approach or an evolutionary approach?
FCAPS?
Management for Future Internet
• Autonomic Management/Self-Management
  - Self-managing frameworks and architecture
  - Knowledge engineering, including information modeling and ontology design
  - Policy analysis and modeling
  - Semantic analysis and reasoning technologies
  - Virtualization of resources
  - Orchestration techniques
  - Self-managed networks
  - Context-awareness
  - Adaptive management
Research Efforts for Management of FI
• US NSF
  - Future Internet Design (FIND)
    - Complexity Oblivious Network Management architecture (CONMan)
  - Global Environment for Networking Innovations (GENI)
    - Operations, Management, Integration and Security (OMIS) WG
• EU
  - Framework Program (FP) 7
    - 4WARD In-Network Management (INM) project
    - Autonomic Internet (AutoI) project
    - Autonomic Network Architecture (ANA) project
CONMan: Overview
• Management interface should contain as little protocol-specific information as possible
• Complexities of protocols should be masked from management
• Goal
  - A generic abstraction of network entities (protocols & devices) for management purposes
  - A set of atomic management operations to work upon the abstraction
  - A way to translate high-level management objectives to low-level operations
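The three CONMan goals (a generic abstraction, atomic operations, and translation of high-level objectives) can be illustrated with a toy model. This is only a sketch and not CONMan's actual interface: the `Module` class, the `create_pipe` operation, and the `plan_path` translator are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Protocol-agnostic view of a network entity (hypothetical abstraction)."""
    name: str
    peers: set = field(default_factory=set)  # names of modules this one is connected to


def create_pipe(a: Module, b: Module) -> None:
    """Atomic operation: connect two modules without exposing protocol details."""
    a.peers.add(b.name)
    b.peers.add(a.name)


def plan_path(modules: list) -> list:
    """Translate the high-level goal 'connect these modules in order'
    into a list of low-level atomic operations."""
    return [("create_pipe", x, y) for x, y in zip(modules, modules[1:])]


def apply_plan(plan: list) -> None:
    """Execute each planned atomic operation."""
    for op, x, y in plan:
        if op == "create_pipe":
            create_pipe(x, y)


# Hypothetical entities: the manager never sees their protocol specifics.
eth, ip, gre = Module("eth0"), Module("ip"), Module("gre-tunnel")
plan = plan_path([eth, ip, gre])
apply_plan(plan)
```

The management side only handles module names and one generic operation; anything protocol-specific would stay inside the module implementation.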
Research Efforts - EU
• 4WARD WP4: INM (In-Network Management)
  - Autonomic self-management
  - Abstractions and a framework for a self-organizing management plane
  - Schemes, strategies, and protocols for collaborative monitoring, self-optimizing, and self-healing
  - http://www.4ward-project.eu
Research Efforts - USA
• GENI OMIS WG (Operations, Management, Integration and Security)
  - Operations, management, integration and security processes in GENI
  - Experiment support, monitoring, and data storage
  - Security monitoring and incident response
  - Federation management and monitoring
  - Hardware release, maintenance and integration
  - Software release, maintenance and integration
  - Operations metric collection and analysis
  - http://www.geni.net/wg/omis-wg.html
Research Efforts - Korea
• CASFI (Collect, Analyze, and Share for Future Internet)
  - Goals
    - Manageability of Future Internet
    - Data Sharing Platform for Performance Measurement
    - High-Precision Measurement and Analysis
    - Human Behavior Analysis
  - Groups: KHU, KAIST, POSTECH, CNU
  - Period: 2008.03.01 ~ 2013.02.28
  - http://casfi.kaist.ac.kr
Management for Future Internet [1]
• Management Interface
  - Management Information Modeling & Operations
  - Instrumentation
• Management Architecture
  - Centralized vs. Decentralized Management
  - Peer-to-Peer
  - Hybrid
• Service Management
  - Customer-centric service
  - Service portability
  - SLA/QoS
Management for Future Internet [2]
• Traffic Monitoring/Measurement and Analysis
  - Monitoring for large-scale and high-speed networks
  - Network/application-level monitoring
  - Global traffic data access/sharing
  - Fast and real-time monitoring
  - Statistical sampling method
  - Storing method for large-scale traffic data
  - Measurement and analysis of social networking
GENI Working Groups related to Mgmt
Outline
• GENI Working Groups
  - Control Framework
  - Experiment Workflow & Services
  - Instrumentation & Measurements
  - Operation, Management, Integration & Security (OMIS)
• GMOC: GENI Meta Operation Center
GENI Working Groups
• Control Framework WG
  - Logically stitching GENI components and user-level services into a coherent system
  - Design of how resources are described and allocated and how users are identified and authorized
• Experiment Workflow and Services WG
  - Tools and mechanisms a researcher uses to design and perform experiments using GENI
  - Includes all user interfaces for researchers, as well as data collection and archiving
• Instrumentation & Measurements WG
  - GIMS: GENI Instrumentation and Measurement Service
  - GENI researchers require extensive and reliable instrumentation and measurement capabilities to gather, analyze, present and archive measurement data, in order to conduct useful and repeatable experiments
• Operations, Management, Integration and Security (OMIS) WG
  - Designing, deploying, and overseeing the GENI infrastructure
  - Operation Framework
Control Framework
• GENI control framework defines:
  - Interfaces between all entities
  - Message types including basic protocols and required functions
  - Message flows necessary to realize key experiment scenarios
• GENI control framework includes the entities and the Control Plane for transporting messages between these entities
  - component control
  - slice control
  - access control within GENI
  - federation
  - key enablers such as identification, authentication and authorization
GENI Architecture - Control Framework
[Figure: GENI architecture] The Control Framework WG focuses on component control, slice control, access control within GENI, federation, and the interaction between these GENI entities.
Experiment Workflow & Services
• Identify and specify tools and services needed to run experiments on GENI
  - Planning, scheduling, deploying, running, debugging, analyzing, growing/shrinking experiments
• Collaboration
  - Multiple researchers on an experiment
  - Building on other experiments
• Identify interfaces, joint definitions and information exchange needed across working groups
• Provides Services
  - What resources are available to slices
  - What level of programmability is possible on different components and their associated resources
Relationship to GENI Architecture
[Figure: GENI architecture] The Experiment Workflow and Services WG focuses on experimenter-users' needs for planning, scheduling, running, debugging, analyzing and archiving experiments.
Instrumentation & Measurements
• Discuss, develop and build consensus around the architectural framework for the instrumentation and measurement (I&M) infrastructure that will be deployed and used in GENI
• Create an architecture for measurement that enables GENI goals to be achieved
• Facilitate dialog and coordination between teams focused on I&M
• Identify key challenges in I&M that could otherwise inhibit the infrastructure
• Solicit feedback from users
• Deploy basic instrumentation and measurement capabilities
• Services
  - Measurement Orchestration (MO)
  - Measurement Point (MP)
  - Measurement Collection (MC)
  - Measurement Analysis and Presentation (MAP)
  - Measurement Data Archive (MDA)
Relationship to GENI Architecture
[Figure: GENI architecture] The Instrumentation and Measurement WG focuses on the instrumentation and measurement infrastructure that will be deployed and used in GENI.
GIMS - Protocols & Communication
• The researcher, via the Experiment Control service (tools), including the MO (Measurement Orchestration) service, manages the setup and running of I&M services
• Protocols for researcher/experiment control tools to access APIs:
  - XML-RPC
  - web services (SOAP, WSDL)
• APIs for setting up and running I&M services
  - APIs for MP (Measurement Point) services
  - APIs for MC (Measurement Collection) services
  - APIs for MAP (Measurement Analysis and Presentation) services
  - APIs for MDA (Measurement Data Archive) service
• All traffic is carried in the GENI Control Plane
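To show what an XML-RPC request from an experiment control tool to an I&M service might look like on the wire, here is a minimal sketch using Python's standard `xmlrpc.client` module. The method name `mp.start` and the parameters `slice_id` and `interval_s` are assumptions made up for this example; the slides do not define the actual GIMS API.

```python
import xmlrpc.client


def build_request(method: str, **params) -> str:
    """Marshal a method call and its named parameters into an XML-RPC request body."""
    return xmlrpc.client.dumps((params,), methodname=method)


# Hypothetical call asking a Measurement Point (MP) service to start sampling.
request_xml = build_request("mp.start", slice_id="slice-A", interval_s=10)

# Decoding the body again shows what the service side would receive.
decoded_params, decoded_method = xmlrpc.client.loads(request_xml)
```

No network connection is opened here. In a deployment the same body would be posted to the service endpoint, for example through `xmlrpc.client.ServerProxy`.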
GIMS Traffic Flow
• Option 1:
  - Carry all MD (Measurement Data) traffic flows using a dedicated measurement VLAN
• Option 2:
  - Carry all MD traffic flows using the same IP network that supports the Control Plane
• Option 3:
  - Carry most MD traffic flows using the same IP network that supports the Control Plane, but for high-rate MD traffic flows, define a dedicated measurement VLAN for the slice/experiment
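Option 3 amounts to a per-flow placement rule. The sketch below shows one way to express it. The 100 Mbit/s threshold and the flow names are assumptions for illustration only, since the slides do not say what counts as "high-rate".

```python
CONTROL_PLANE = "control-plane IP network"
MEASUREMENT_VLAN = "dedicated measurement VLAN"


def choose_transport(rate_mbps: float, high_rate_mbps: float = 100.0) -> str:
    """Option 3: low-rate flows share the Control Plane network,
    high-rate flows get a dedicated per-slice measurement VLAN.
    The default threshold is an assumed value, not taken from GENI."""
    if rate_mbps < 0:
        raise ValueError("rate must be non-negative")
    return MEASUREMENT_VLAN if rate_mbps >= high_rate_mbps else CONTROL_PLANE


# Hypothetical measurement data flows and their rates in Mbit/s.
flows = {"ping-latency": 0.05, "packet-capture": 800.0}
placement = {name: choose_transport(rate) for name, rate in flows.items()}
```

Setting `high_rate_mbps` to 0 reproduces Option 1 (everything on the VLAN), and setting it to infinity reproduces Option 2.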
Detailed Outline for OMIS
• Operation, Management, Integration & Security (OMIS)
• GMOC: GENI Meta Operation Center
  - Why Meta-Operation?
  - Objective
  - Architecture
  - Operational Data Set
    - Topology
    - Operational Status
    - Administrative Status
    - Utilization Measurements
    - Specialized Data
  - Data Acquisition & Sharing
  - Communication & Coordination
  - Operations
  - Use Case
    - Notification
    - Emergency Shutdown Functions
OMIS
• Operation
  - GMOC (GENI Meta-Operation Center)
• Management
  - Meta-Management System for GENI
• Integration
  - Overlap & Interfaces with other WGs
• Security
  - Policies, Authorization & Authentication
Overlaps with other WGs
• Control Framework WG
  - common interface for operations
  - Security
• Experiment Workflow and Services WG
  - lower levels of GENI & higher level should be consistent
  - Operation & Management Tools
  - Services Usage
• Instrumentation & Measurements WG
  - Data Acquisition
  - Measurements for performance and management
Relationship to GENI Architecture
[Figure: GENI architecture] The OMIS WG focuses on GENI operations, management and a GENI-wide view of the projects and experiments.
Questions
• How will network operators exchange the data necessary to allow end-to-end troubleshooting of cross-domain circuits?
• How will network operators exchange data to create an end-to-end view (user view or operator view) of cross-domain circuits?
• How will network security concerns be taken into account?
• It is believed that GMOC activity represents one possible path forward in addressing these complex cross-domain issues
Answer
• Collect, Analyze and Share
  - Meta Operation Center
• Federated Network Management
[Figure labels: Operations, Management, Integration, Security; Collect, Analyze, Share]
GMOC
• GENI Meta-Operation Center
• Goal: To start to help develop the datasets, tools, formats, & protocols needed to share operational data among GENI constituents
• Why “Meta”?
  - There will be lots of groups operating their own parts
  - There is no intention to change that
  - Interested in what kinds of data exchange and functions are useful to share among these groups, at a GENI-wide level
• Operations is important
  - Reliability
  - Repeatability
  - User Opt-in
GMOC: Objective
• Give GENI-wide view of operational status of the GENI system
  - maps & graphs
  - prototype other views, such as slice-by-slice views
  - Need for a common operational dataset
• Give scientists access to their data
  - GENI-wide and researcher-specific
  - “What was going on during these 2 weeks I ran my test?”
• Operations
  - Emergency Shutdown: find out-of-control virtual slices and isolate or shut them down
  - Identify & shut down misbehaving slices
  - Protect other slices
  - Ensure stability
Meta-Operations
• GMOC is not entirely a Centralized or a Distributed architecture
• GENI projects can best handle most operational tasks
• GMOC coordinates operations across projects to present a single interface to operators and users
[Figure labels: Cluster 1, Cluster 2; Project A, Project B, Project C, Project D]
GMOC - Architecture
[Figure labels: GMOC Translator, Alert Monitor, GMOC Data Repository, GMOC Exchanger, Operations, Backbone, Visualizations, Operations Portal, Control Framework, Clusters, Control/Emergency Stop]
Conceptual Design
• GMOC Repository - Central datastore for operational information from all GENI parts
• GMOC Exchanger - Polls and/or receives operational data from aggregates
• GMOC Translator - Translates data from other formats into a consistent data format
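The division of labor between the three components can be sketched as a small pipeline. The class and method names below, as well as the two source formats `project_a` and `project_b`, are hypothetical and only illustrate the roles named on the slide. They are not taken from the GMOC software.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


class Translator:
    """Converts project-specific records into one consistent format."""

    def __init__(self) -> None:
        self._converters: Dict[str, Callable[[dict], Record]] = {}

    def register(self, source_format: str, converter: Callable[[dict], Record]) -> None:
        self._converters[source_format] = converter

    def translate(self, source_format: str, raw: dict) -> Record:
        return self._converters[source_format](raw)


class Repository:
    """Central datastore for translated operational records."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def store(self, record: Record) -> None:
        self.records.append(record)


class Exchanger:
    """Receives raw data from aggregates and feeds it through the translator."""

    def __init__(self, translator: Translator, repository: Repository) -> None:
        self.translator = translator
        self.repository = repository

    def receive(self, source_format: str, raw: dict) -> None:
        self.repository.store(self.translator.translate(source_format, raw))


translator = Translator()
# Two made-up source formats that describe the same facts differently.
translator.register("project_a", lambda r: {"id": r["node"], "status": r["state"].lower()})
translator.register("project_b", lambda r: {"id": r["name"], "status": "up" if r["alive"] else "down"})

repository = Repository()
exchanger = Exchanger(translator, repository)
exchanger.receive("project_a", {"node": "pc1", "state": "UP"})
exchanger.receive("project_b", {"name": "sw7", "alive": False})
```

Only the push path is shown. A polling exchanger would call `receive` itself on a timer instead of waiting for projects to submit data.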
Spiral 1
• Deliverables
  - Define an Operational Dataset
  - Choose a Dataset Format & Protocol
  - Build Functions
Spiral 2
1. GMOC contacts exemplar projects and starts a dialogue on what data they are collecting, how that data can be mapped to the operational data set, and what issues the specific project has with the operational data set.
2. GMOC starts collecting as much data as possible from the exemplar projects in the format of their choosing, importing it into RRD (Round Robin Database) files.
3. GMOC integrates all the data collection tools with the GMOC user interface to provide a unified interface to the diverse backend dataset.
4. GMOC works with the exemplar project to create and use a unified for operational data sharing.
5. GMOC works with other projects to determine effective mechanisms for exporting the operational data set.
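Step 2 relies on round-robin storage, where a fixed number of samples is kept and the oldest is overwritten once the store is full. The sketch below illustrates that idea with a plain `collections.deque`. It is not the RRDTool file format or API, and the class name and sample values are assumptions for illustration.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-capacity time series: once full, each new sample evicts the oldest."""

    def __init__(self, capacity: int) -> None:
        self._samples = deque(maxlen=capacity)

    def update(self, timestamp: int, value: float) -> None:
        self._samples.append((timestamp, value))

    def fetch(self) -> list:
        """Return the retained (timestamp, value) pairs, oldest first."""
        return list(self._samples)

    def average(self) -> float:
        values = [v for _, v in self._samples]
        return sum(values) / len(values) if values else 0.0


# Hypothetical link-utilization samples in Mbit/s, one per minute.
series = RoundRobinSeries(capacity=3)
for t, mbps in [(0, 10.0), (60, 20.0), (120, 30.0), (180, 40.0)]:
    series.update(t, mbps)
```

Because the capacity is fixed, storage stays constant no matter how long data collection runs, which is what makes this layout attractive for long-lived utilization data.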
Data Views
• How do we look at Operations?
  - Aggregate view
  - Component view
  - Slice view
  - Sliver view
GENI Operational Data View
Operations’ Requirements
• It will need to be a collaborative effort
  - Will be contacting anchors and related projects for input
  - Each project may share different kinds/amounts of operational data
• Initially, concentrating on operational data about components/aggregates and their interconnections
• Additionally, may want to access information about the mapping of aggregate data to slice data
• Balance between central visibility and decentralized autonomy will need to evolve (and continue evolving)
• Use cases:
  - slice A needs emergency shutdown; which aggregate(s) need to act?
  - what slices were affected by the outage on component B?
  - what was the state of GENI during the life of my experiment on slice C?
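The first two use cases are lookups over the mapping between slices, aggregates and components. A minimal sketch, assuming the mapping is available as a list of sliver records (the identifiers and the tuple layout are made up for this example):

```python
from collections import defaultdict

# Hypothetical sliver records: (slice, aggregate, component)
slivers = [
    ("slice-A", "aggregate-1", "component-X"),
    ("slice-A", "aggregate-2", "component-B"),
    ("slice-C", "aggregate-2", "component-B"),
    ("slice-D", "aggregate-3", "component-Y"),
]

# Build one index per question so each lookup is a dictionary access.
aggregates_by_slice = defaultdict(set)
slices_by_component = defaultdict(set)
for slice_id, aggregate_id, component_id in slivers:
    aggregates_by_slice[slice_id].add(aggregate_id)
    slices_by_component[component_id].add(slice_id)


def aggregates_to_notify(slice_id: str) -> list:
    """Which aggregates must act to shut down this slice?"""
    return sorted(aggregates_by_slice.get(slice_id, ()))


def slices_affected_by(component_id: str) -> list:
    """Which slices had a sliver on the failed component?"""
    return sorted(slices_by_component.get(component_id, ()))
```

The third use case (state of GENI during an experiment) would additionally need time-stamped status history, which this sketch leaves out.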
GMOC: The Plan
Set of things needed for GENI operations?
• Step 1: What kinds of data are needed (need to get)?
  - Operational Data & Data formats
• Step 2: How should that data be shared?
  - Data Acquisition & Sharing
  - Coordination (Communication)
• Step 3: What should be done with the data once it is obtained?
  - Visualization
  - Monitoring
  - Operations
    - Emergency Shutdown Function
    - Event Notification
Step 1: Operational Data
Potential Types of Operationally Significant Data
1. System-wide View (topology)
2. Operational Status
3. Administrative Status
4. Utilization Data (Measures)
5. Specialized Data
Data: Topology [1/2]
• What exists at a given time on GENI, from an operational viewpoint
  - System Component/Aggregate perspective
  - Slice perspective
• Requires data about topology of aggregates/components, and the mapping of slice to component
• Data might come from experiment tools, clearinghouses, or aggregate managers
• Aggregates, Components, Resources, Interfaces, Circuits/links, Slivers & their relationships
  - Relationships are described by graphs
Data: Topology [2/2]
• Topology Description
  - Network Description Language (NDL)
  - perfSONAR topology schema
  - GEANT2’s Common Network Information Service (cNIS)
  - Open Grid Forum’s NML (Network Markup Language)
• Ontology based Topology Description
  - Shows the topology and the relationships
• Combination of RRDTool and SQL database
  - RRDTool stores data about utilization, SQL database about GENI topology
Data: Operational Status
• The operational status of a given component, sliver, aggregate, or slice, at a given time
• Potential States
  - Up
  - Down
  - Impaired
• May include additional specific information
  - i.e. how is it impaired, or why is it down
• Examples
  - Component operational status
  - Interconnection for operational status (linking)
  - Sliver operational status (e.g. virtual machine running, Ethernet VLAN active, etc.)
Data: Administrative Status
• The expected state of a given component, sliver, aggregate, or slice
• Potential States
  - Up
  - Down
  - Impaired
• Used in conjunction with operational status to understand overall status
  - Any discrepancy means a misbehaving slice
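Comparing the expected (administrative) state with the observed (operational) state is a simple per-resource check. The sketch below assumes both are available as dictionaries keyed by resource name. The `Status` enum and the sliver names are illustrative, not a GMOC data format.

```python
from enum import Enum


class Status(Enum):
    UP = "up"
    DOWN = "down"
    IMPAIRED = "impaired"


def find_discrepancies(administrative: dict, operational: dict) -> dict:
    """Return {resource: (expected, observed)} wherever the two statuses differ.
    A resource with no operational report is listed with observed = None."""
    return {
        name: (expected, operational.get(name))
        for name, expected in administrative.items()
        if operational.get(name) != expected
    }


# Hypothetical slivers: what operators intended versus what was measured.
administrative = {"sliver-1": Status.UP, "sliver-2": Status.DOWN, "sliver-3": Status.UP}
operational = {"sliver-1": Status.UP, "sliver-2": Status.UP, "sliver-3": Status.IMPAIRED}
mismatches = find_discrepancies(administrative, operational)
```

Here `sliver-2` is running although it should be down, and `sliver-3` is impaired although it should be up. Both would be candidates for further inspection.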
Data: Utilization Measures
• Time series measures of a resource in use by a GENI component, aggregate, slice etc.
  - Useful for visualization of GENI-wide view
  - Link utilization of resources
  - CPU utilization (component level)
• Condition Measures
  - Critical to emergency shutdown
  - Determination to correct behavior
  - Analogous to Service Level Agreement (SLA)
• Utilization Data: data about the data flowing on GENI components, slices, backbones, etc.
• Some things might be fairly common
  - Link utilization
  - CPU utilization
  - Memory utilization
Data: Specialized Data
• Data specific to the type of component
  - latency/jitter
  - signal strength
  - error counts (network links)
• Data specific to a situation
  - Wireless propagation, virtual memory usage, page faults, etc.
  - Disk cache performance
• Not useful for the GENI unified interface, but useful to a user or researcher
• There should be a way for aggregates/components to create their own types of this data
Data Format - ERD
[Figure: entity-relationship diagram (ERD) of the operational data format]
Step 2: Data Acquisition & Sharing
• GENI is made up of many loosely affiliated projects
• Many projects have existing means to provide operations
• Data sharing is difficult as diverse data formats and tools are used
• Possible Data Acquisition & Sharing tools (& formats)
  - RRDTool & SQL Database
  - PerfSONAR
  - SNAPP (GRNOC SNMP Collector)
Communication
• There are two possible ways for coordination between the Projects and GMOC
  - Interfaces (API): define consistent messages to push or pull data sharing
  - Reports: projects submit performance and/or network description and utilization reports to the GMOC
• A protocol for communication needs to be standardized
Federation in GMOC
• Proposal # 1
  - Local MOC for each project
  - Communicate with GMOC at GENI level
  - Communication by Interfaces (APIs) or Protocol
• Proposal # 2
  - Shareable operational data within a Federation ‘namespace’
  - Network Description & Measures Reports
  - Reports need to be consistent
  - Use an adapter (translator)
  - Standard reporting style (SNMP-based, RRDTool, PerfSONAR etc.)
  - E.g. ngeni.<organization>.<operational_state>.<network_id>.<device_id> = “BGP Router 1”
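The dotted key in the example can be built and parsed mechanically. The sketch below assumes exactly the five fields shown in the slide and that no field value itself contains a dot. The concrete values (`khu`, `up`, `net1`, `dev42`) are made up for illustration.

```python
from typing import NamedTuple


class FederationKey(NamedTuple):
    """The five fields of the namespace key, in the order shown on the slide."""
    federation: str
    organization: str
    operational_state: str
    network_id: str
    device_id: str

    def dotted(self) -> str:
        return ".".join(self)


def parse_key(dotted: str) -> FederationKey:
    """Split a dotted key back into its fields, rejecting malformed input."""
    parts = dotted.split(".")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"expected 5 non-empty dot-separated fields, got {dotted!r}")
    return FederationKey(*parts)


# Hypothetical report entry in the federation namespace.
key = FederationKey("ngeni", "khu", "up", "net1", "dev42")
report = {key.dotted(): "BGP Router 1"}
```

A hierarchical key like this lets a consumer select all devices of one organization or network by prefix, similar in spirit to an SNMP OID subtree.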
Federated GMOC: P1
• Proposal # 1
  - Local Meta-Operation Center (MOC) for each project
  - Communication with GMOC
  - Interfaces (APIs)
[Figure: GMOC (Translator, Exchanger, GMOC Repository) connected to a local MOC in nGENI and in xGENI, each with Monitoring, Visualization, Operations and Control/Emergency Stop]
Federated GMOC: P2
• Proposal # 2
  - GMOC directly communicates with the Operational Portal
  - Operational data within the Federation ‘namespace’
  - Shareable Data
  - Using a translator
  - Standard reporting (SNMP-based, RRD, PerfSONAR etc.)
  - E.g. ngeni.<organization>.<operational_state>.<network_id>.<device_id> = “BGP Router 1”
[Figure: GMOC (Translator, Exchanger, GMOC Repository) connected to the Operational Portal of nGENI and of xGENI, each with Monitoring, Visualization and Control/Emergency Stop]
Step 3: GMOC - Operations
• Operations required by GMOC
  - Experiment support, monitoring, and data storage
  - Security monitoring and incident response (including incidents unrelated to security)
  - Federation management and monitoring
  - Hardware release, maintenance and integration
  - Software release, maintenance and integration
  - Operations metric collection and analysis
  - Event Notification
  - Emergency Shutdown
Control Framework: Event Notification
Notification & Data Sharing
• Operational data gathering is done locally by each project's NOC
  - Received from the aggregates, components, slivers etc.
  - Private data: abstracted from the global view
  - Public data: directly visible for GENI-wide view
• Local NOC (event producer) creates an ‘Event Repository’
  - Events that can occur within the resources
• Event Repository is registered at the Clearinghouse along with the resources
• Operator and Admin register the events that need to be ‘consumed’ (received)
• When an event occurs (Condition Measures inconsistency) the notification is made to the concerned parties (Admin, Component Manager, User, Researcher etc.)
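The producer/consumer flow described above follows a publish/subscribe pattern. The sketch below is a minimal in-process version. The `EventRepository` class, its method names, and the event types `link_down` and `cpu_overload` are assumptions for illustration, and registration at the Clearinghouse is not modeled.

```python
from collections import defaultdict


class EventRepository:
    """Events a local NOC can produce, plus who asked to consume each one."""

    def __init__(self, event_types) -> None:
        self.event_types = set(event_types)
        self._consumers = defaultdict(list)

    def register_consumer(self, event_type: str, callback) -> None:
        """An operator or admin subscribes to one declared event type."""
        if event_type not in self.event_types:
            raise KeyError(f"unknown event type: {event_type}")
        self._consumers[event_type].append(callback)

    def publish(self, event_type: str, detail: dict) -> int:
        """Notify every registered consumer; return how many were notified."""
        consumers = self._consumers[event_type]
        for callback in consumers:
            callback(event_type, detail)
        return len(consumers)


inbox = []
events = EventRepository({"link_down", "cpu_overload"})
events.register_consumer("cpu_overload", lambda kind, detail: inbox.append(("admin", kind, detail)))
events.register_consumer("cpu_overload", lambda kind, detail: inbox.append(("researcher", kind, detail)))

# A condition measure is violated on a hypothetical component.
notified = events.publish("cpu_overload", {"component": "pc1", "cpu": 0.98})
```

Consumers only receive the event types they registered for, so a project can keep private events out of the GENI-wide view simply by not declaring them.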
Federation
• Why federate?
• Through federation it may be possible to:
  - Achieve better scaling capabilities
  - Access different resource types (e.g. wireless, sensors)
  - Contribute to a richer environment offered to the user
  - Facilitate (new) standards definition, e.g. resource description, protocols, monitoring
• Federation should neither make access more complex for the user nor exclude unforeseen uses of the facilities
Federation Vision
• Share user credentials and resource descriptions
• Agree on slice management API and allocation policy
• Allow experiments to run across facilities
Federation: Mechanism
• Integrated
  - The facilities can be used as one infrastructure with an inter-domain common control plane
• Partially integrated
  - Only part of the control is exchanged, e.g. schedule, AAA information
• Overlay
  - Each facility just uses the services/resources of the other without a common control plane, just a data plane; there is an exchange of information related to monitoring, faults, and so on
• In any case a common data plane and one or more information exchange protocols between them must exist
• What is the minimum common set for Federation?
  - User Interface and related information exposed to the user
Federation: Requirements
• Given that a Federation is based on two main characteristics:
  - It creates (virtual) resources and the relations between them only when they are needed
  - It must carefully map the virtual resources to the substrate to ensure the best reproducibility for researchers
• The first step for a project (like GENI) is to federate in the overlay model
• It requires:
  - An agreed (standard) resource description set
  - A common AAI, data plane and information exchange protocol at the substrate level; offering a slice imposes fewer configuration complexities on other facilities
GENI Federation
• Federation with Non-GENI Infrastructures
• Federation among GENI Infrastructures
• Federation with Resources (Aggregates)
GENI Federation [Figure]
Federation: Within GENI [Figure]
Federation: GENI & non-GENI [Figure]
Aggregate Federation [Figure]
Federation between GENI Suite and Non-GENI Suite
• Problems
  - Identity and authority management
    - Manage identity and authority based on different local policies
    - Use different mechanisms for authentication and authorization
  - Control procedures
    - Incompatibility between control procedures
  - Resource and experimentation description
    - Use different schemes for resource and experimentation description
• The following requirements should be considered in order to resolve the observed problems:
  - Common interfaces or adapters for different control frameworks
  - Common interfaces or adapters for authority services
  - Unified profile for certificate and authority management
  - Common resource and experimentation description language
  - Common data access interface
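The "common interfaces or adapters" requirement corresponds to the classic adapter pattern. In the sketch below, `FrameworkA` and `FrameworkB` stand in for two facilities with incompatible native calls. They are invented for this example and do not represent any real GENI or non-GENI control framework API.

```python
from abc import ABC, abstractmethod


class ControlFrameworkAdapter(ABC):
    """Common interface every facility-specific adapter must offer."""

    @abstractmethod
    def create_slice(self, name: str) -> str:
        """Create a slice and return a facility-local handle."""


class FrameworkA:
    """Made-up native API of one facility."""

    def new_slice(self, slice_name: str) -> str:
        return f"A:{slice_name}"


class FrameworkB:
    """Made-up native API of another facility, with a different call shape."""

    def allocate(self, spec: dict) -> dict:
        return {"handle": f"B/{spec['id']}"}


class AdapterA(ControlFrameworkAdapter):
    def __init__(self, native: FrameworkA) -> None:
        self.native = native

    def create_slice(self, name: str) -> str:
        return self.native.new_slice(name)


class AdapterB(ControlFrameworkAdapter):
    def __init__(self, native: FrameworkB) -> None:
        self.native = native

    def create_slice(self, name: str) -> str:
        return self.native.allocate({"id": name})["handle"]


def create_federated_slice(name: str, adapters: list) -> list:
    """Issue the same request to every federated facility through its adapter."""
    return [adapter.create_slice(name) for adapter in adapters]


handles = create_federated_slice("exp1", [AdapterA(FrameworkA()), AdapterB(FrameworkB())])
```

The federation logic depends only on `ControlFrameworkAdapter`, so adding a further facility means writing one more adapter rather than changing the caller.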
Problems for GENI Federation
• Classification of federation problems
  - Different identity/authority management
    - Identity allocation/authorization policy/mechanism
    - Ex) GID, ABAC (SFA 2.0)
  - Different control procedures
    - Control flows and interfaces/APIs
    - Ex) GENI AM/CM/Slice APIs
  - Different resources and experiments description
    - Resource description schemes (syntax, context, entity, …)
    - Description of experiments, services, experimental results
    - Ex) RSpec
• Global standards/adapters
* ABAC (Attribute-Based Access Control)
Key Issues
• Reproducibility of experiments, in particular the amount of variation of average values
• Monitoring and combination of data
• Virtualization use
  - How to combine physical resources and virtual resources in a seamless environment
• Signaling between the (many) control planes, ensuring the separation between the user control plane and the facility control plane, which is fundamental in case of failures
• Standards for resource and topology description
• Checkpointing and error recovery/restart
• AAI, scheduling and naming
Reference
• http://gmoc.grnoc.iu.edu/gmoc/index.html
Question and Discussion