Prototypes - From PURSUIT


PROTOTYPES

T-110.6120

17.11.2011

Jimmy Kjällman

Ericsson Research, NomadicLab

Prototypes

• Two research prototypes will be described in this presentation

• Blackadder

– Developed in PURSUIT

– Channel-oriented base implementation

– Demonstrated at the end of the lecture

• Blackhawk

– Originates from PSIRP

– Document-oriented implementation

BLACKADDER

Original slides:

George Parisis, Computer Laboratory, University of Cambridge, 2011

Blackadder

• Realizes PURSUIT’s functional model for information-centric networking

[Figure: PURSUIT functional model. Labels: Pub/Sub Service Model; Dissemination Strategy; Rendezvous, Topology, and Forwarding functions; functional scoping; information scoping (SId, RId); recursion.]

Information Structure

Scopes, subscopes, information items

• Information is structured as a directed acyclic graph

• IDs are (statistically) unique within a scope

– (Possibly) self-generated, flat labels

– Same ID space for both subscopes and information items

• “Complete” identifier: Prefix + ID

– An item can have one or more complete identifiers, one for each path leading to it from a root of the graph
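The "prefix + ID" rule can be illustrated with a minimal Python sketch. The fixed label length of four characters and the helper names (full_id, split_id, as_path) are assumptions made for this illustration only; they are not taken from the Blackadder code.

```python
# Hypothetical helpers for illustration; not part of Blackadder.
FRAGMENT_LEN = 4  # assumed number of characters per flat label


def full_id(prefix, local_id):
    """Form a complete identifier by appending a flat label to a scope prefix."""
    if len(local_id) != FRAGMENT_LEN:
        raise ValueError("label must be exactly FRAGMENT_LEN characters")
    if len(prefix) % FRAGMENT_LEN != 0:
        raise ValueError("prefix must be a whole number of labels")
    return prefix + local_id


def split_id(complete):
    """Split a complete identifier into its per-level labels."""
    if len(complete) % FRAGMENT_LEN != 0:
        raise ValueError("identifier must be a whole number of labels")
    return [complete[i:i + FRAGMENT_LEN]
            for i in range(0, len(complete), FRAGMENT_LEN)]


def as_path(complete):
    """Render a complete identifier in the /label/label/... notation."""
    return "/" + "/".join(split_id(complete))


# Root scope 0003, subscope 0002, information item AAA2.
subscope = full_id("0003", "0002")
item = full_id(subscope, "AAA2")
print(as_path(item))  # /0003/0002/AAA2
```

Because subscopes and information items share one ID space, the same helper serves both levels of the graph.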

Information Structure

[Figure: example information graph. Legend: scope, information item. Node labels shown: 0001, 0002, 0003, AAA0, AAA1.]

Information ID : /0003/0002/AAA2

Scope ID : /0001/0001/0001, /0002/0001/0001, /0003/0001/0001

Core Functions

• Simplified example

[Figure: a publisher (P) and a subscriber (S) interacting through the Rendezvous, Topology, and Forwarding functions]

Dissemination Strategies

• Defines the methods used for implementation (of a scope)

– Architectural components

– Data formats

– Governance structures

– Etc.

• Can be “overridden” for sub-items – if permitted

– Strategies have to be aligned

• Usually engineered at design time

• Larger problem solutions through the assembly of smaller ones

Service Model

• Publish/Subscribe

• For example:

– publish_scope(id, prefix, strategy)

– publish_info(id, prefix, strategy)

– unpublish_scope(id, prefix, strategy)

– unpublish_info(id, prefix, strategy)

– subscribe_scope(id, prefix, strategy)

– subscribe_info(id, prefix, strategy)

– unsubscribe_scope(id, prefix, strategy)

– unsubscribe_info(id, prefix, strategy)

– publish_data(id, strategy, data, data_len)

– getEvent(&event)

Blackadder Architecture

[Figure: Blackadder node architecture. Components shown, top to bottom: applications (App1 … AppN); IPC Element; Rendezvous, Local Proxy, and Topology Manager; Forwarding; Communication Elements (/dev/eth0, /dev/eth1, raw IP sockets).]

• Click is an external framework that Blackadder uses

Background Information:

The Click Modular Router

• Open source platform for building packet processing configurations that consist of connected elements

– Language for describing router configurations

• Ready-made elements

– Libraries for creating new elements as C++ classes

• Portable code

– Kernel and userlevel

• Linux, FreeBSD, Mac OS X, etc.

• Modular design approach

– Reuse of elements in different configurations (e.g., in different prototypes or experiments)

• Basic operation: packets are pushed or pulled between elements
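The push/pull distinction can be sketched in a few lines of Python. This is only an illustration of the idea: the class names Tagger, Queue, and Sink are hypothetical and are not Click's C++ element API.

```python
from collections import deque


class Queue:
    """Boundary between push and pull: stores pushed packets until pulled."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.drops = 0

    def push(self, packet):
        # Upstream elements hand packets over; a full queue drops them.
        if len(self.packets) >= self.capacity:
            self.drops += 1
        else:
            self.packets.append(packet)

    def pull(self):
        # Downstream elements ask for a packet when they are ready.
        return self.packets.popleft() if self.packets else None


class Tagger:
    """Push element: modifies a packet and pushes it downstream."""

    def __init__(self, tag, downstream):
        self.tag = tag
        self.downstream = downstream

    def push(self, packet):
        self.downstream.push(self.tag + packet)


class Sink:
    """Pull element: fetches packets from upstream at its own pace."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.sent = []

    def run_once(self):
        packet = self.upstream.pull()
        if packet is not None:
            self.sent.append(packet)
        return packet is not None


q = Queue(capacity=2)
source = Tagger(b"hdr:", q)
sink = Sink(q)
for payload in (b"a", b"b", b"c"):
    source.push(payload)  # the third packet finds the queue full
while sink.run_once():
    pass
print(sink.sent, q.drops)
```

The queue is where the two styles meet: the source side decides when packets arrive, and the device side decides when they leave.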

Click Router Configuration

• Example: Ping (nothing to do with Blackadder, just illustrates a Click router)

    define($DEV eth0, $DADDR 8.8.8.8, $GW $DEV:gw)

    FromDevice($DEV, SNIFFER false)
        -> c :: Classifier(12/0800, 12/0806 20/0002)
        -> CheckIPHeader(14)
        -> ip :: IPClassifier(icmp echo-reply)
        -> ping :: ICMPPingSource($DEV, $DADDR)
        -> SetIPAddress($GW)
        -> arpq :: ARPQuerier($DEV)
        -> IPPrint
        -> q :: Queue
        -> ToDevice($DEV);
    arpq[1] -> q;
    c[1] -> [1] arpq;

[Figure: the resulting element graph: FromDevice → Classifier (c) → CheckIPHeader → IPClassifier (ip) → ICMPPingSource (ping) → SetIPAddress → ARPQuerier (arpq) → IPPrint → Queue (q) → ToDevice]


IPC Element

• Implements a Netlink socket for receiving pub/sub requests from applications (or an API library) and for sending back pub/sub events and published data

– These are sent as messages through the socket

– In user space, the IPC element utilizes the selection mechanism provided by Click

– In kernel space, the element receives sk_buffs in the context of the running process – buffers are wrapped into Click packets that are later processed by a Click task

• Everything is asynchronous – like an event-based system

API (Service Model):

Functions and Messages

• publish_scope(id, prefix, strategy)

• publish_info(id, prefix, strategy)

• unpublish_scope(id, prefix, strategy)

• unpublish_info(id, prefix, strategy)

• subscribe_scope(id, prefix, strategy)

• subscribe_info(id, prefix, strategy)

• unsubscribe_scope(id, prefix, strategy)

• unsubscribe_info(id, prefix, strategy)

[Figure: byte layout of these request messages, made up of three 1-byte fields plus variable-length ID and prefix fields]

• publish_data(id, strategy, data, data_len)

[Figure: byte layouts of the publish_data messages. Fields shown: 1-byte fields, a variable-length ID, a LIPSIN identifier (LID size), and the data.]

(These messages are only used node-internally)

API: Events

• Start Publishing, Stop Publishing

• New Scope, Deleted Scope

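[Figure: event layout with two 1-byte fields followed by a variable-length ID]

• Published Data

[Figure: event layout with two 1-byte fields, a variable-length ID, and the data]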


Accessing the network

• Standard Click elements for network communication

– ToDevice and FromDevice for directly sending and receiving Ethernet frames

• Suitable, e.g., when experimenting over high-speed LANs

– RawSocket for sending and receiving IP (UDP) packets over raw sockets

• Suitable, e.g., when experimenting in the PlanetLab testbed or VPNs

• IP network used as an underlay

Network Packet Format

[Figure: packet layout. Fields shown: LIPSIN identifier (LID size), 1-byte fields, the identifiers ID 1, ID 2, … ID n, and the payload.]


Forwarding

• Receives packets from the network communication elements

– Matches the FID with all outgoing links and forwards the packets

– A separate LID is assigned to the “internal link” between the Forwarding element and the Local Proxy element

• Implements the notion of destination
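Matching an FID against outgoing links can be sketched as the LIPSIN-style Bloom filter test, where a link is selected when every bit of its LID is also set in the FID. The 8-bit identifiers, link names, and function name below are illustrative assumptions; real identifiers are much wider.

```python
def forward(fid, links):
    """Return the links whose LID bits are all present in the FID."""
    return [name for name, lid in links.items() if (fid & lid) == lid]


# Hypothetical 8-bit LIDs; "local" stands for the internal link
# towards the Local Proxy.
links = {
    "eth0": 0b00010100,
    "eth1": 0b01000010,
    "local": 0b10000001,
}

# An FID is the bitwise OR of the LIDs that the packet should traverse.
fid = links["eth0"] | links["local"]
print(forward(fid, links))  # ['eth0', 'local']
```

Since the FID is a Bloom filter, an unrelated link can occasionally match by accident when all of its bits happen to be set; wider identifiers with few set bits keep such false positives rare.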

Sample forwarding configurations

• Click configurations – can be auto-generated

Ethernet (MAC) variant:

    fw :: Forwarder(MAC, 1,
        1, 08:00:00:00:00:01, 08:00:00:00:00:11, 1000000000000000000000000000000000000000000000000000000000000000,
        1, 08:00:00:00:00:02, 08:00:00:00:00:12, 1000001000000000000000000000000000000000000000000000000000000000,
        2, 08:00:00:00:00:03, 08:00:00:00:00:13, 1000001000000000001000000000000000000000000000000000000000000000
    );
    fw[1] -> Queue(1000) -> ToDevice(eth0);
    fw[2] -> Queue(1000) -> ToDevice(eth1);
    FromDevice(eth0, SNIFFER false) -> Classifier(12/080a)[0] -> [1]fw;
    FromDevice(eth1, SNIFFER false) -> Classifier(12/080a)[0] -> [2]fw;

IP (raw socket) variant:

    fw :: Forwarder(IP, 1,
        1, 192.168.0.1, 192.168.0.2, 1000000000000000000000000000000000000000000000000000000000000000,
        1, 192.168.0.1, 192.168.0.6, 1000001000000000000000000000000000000000000000000000000000000000,
        2, 192.168.1.1, 192.168.1.2, 1000001000000000001000000000000000000000000000000000000000000000
    );
    fw[1] -> Queue(1000) -> RawSocket(UDP) -> IPClassifier(dst udp port 9999)[0] -> [1]fw;
    fw[2] -> Queue(1000) -> RawSocket(UDP) -> IPClassifier(dst udp port 9999)[0] -> [2]fw;


Local Proxy

• “The heart of a network node” – everything goes through it

• Receives all pub/sub requests from applications and other Click elements

• Keeps track of

– Pending subscriptions

– Advertised information items (and assigns FIDs)

• Receives

– Published data and notifications about new or deleted scopes, which it pushes to subscribers (applications or Click elements)

– Notifications to start or stop publishing data, which it pushes to one (of the potentially many) publishers
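The bookkeeping described on this slide can be pictured with a toy, single-process model. This is a sketch under simplifying assumptions: the LocalProxy class and its methods are hypothetical, the matching of publishers and subscribers (in Blackadder a rendezvous task) is folded into the proxy, and FIDs are left out entirely. Only the event names START_PUBLISH and PUBLISHED_DATA are borrowed from the API.

```python
from collections import defaultdict

START_PUBLISH = "START_PUBLISH"
PUBLISHED_DATA = "PUBLISHED_DATA"


class LocalProxy:
    """Toy model: tracks advertisements and subscriptions, queues events."""

    def __init__(self):
        self.publishers = defaultdict(list)   # info ID -> publisher apps
        self.subscribers = defaultdict(list)  # info ID -> subscriber apps
        self.events = defaultdict(list)       # app -> pending events

    def publish_info(self, app, info_id):
        self.publishers[info_id].append(app)
        self._match(info_id)

    def subscribe_info(self, app, info_id):
        self.subscribers[info_id].append(app)
        self._match(info_id)

    def _match(self, info_id):
        # Once both sides exist, tell one publisher to start publishing.
        if self.publishers[info_id] and self.subscribers[info_id]:
            chosen = self.publishers[info_id][0]
            event = (START_PUBLISH, info_id, None)
            if event not in self.events[chosen]:
                self.events[chosen].append(event)

    def publish_data(self, info_id, data):
        # Published data is pushed to every subscriber of the item.
        for app in self.subscribers[info_id]:
            self.events[app].append((PUBLISHED_DATA, info_id, data))

    def get_event(self, app):
        return self.events[app].pop(0) if self.events[app] else None


proxy = LocalProxy()
item = "/0001/AAA1"  # hypothetical information ID
proxy.publish_info("publisher", item)
proxy.subscribe_info("sub1", item)
proxy.subscribe_info("sub2", item)
start = proxy.get_event("publisher")
if start == (START_PUBLISH, item, None):
    proxy.publish_data(item, b"hello")
first = proxy.get_event("sub1")
second = proxy.get_event("sub2")
print(first, second)
```

The publisher holds back its data until it is told that someone is interested, which is the role of the start and stop notifications.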


RV Function

• The same element runs in all nodes

• Every node can create an information structure that will be known and maintained by the local RV function

• Other nodes can send pub/sub requests to that node if they know a path to it

• Usual scenarios

– A network node (its RV function) maintains a local structure for IPC (node-local strategy)

– A network node (its RV function) maintains a structure accessible by physical neighbours (link-local strategy)

– One or more dedicated RV nodes run in a domain – end hosts know how to reach them (domain-local scenario)

RV IPC

• The RV element accesses the world the same way applications do

• It subscribes to root scope FFFF where all pub/sub requests are published

• It publishes Topology Formation requests to scope FFFE to which the TM has subscribed

• Topology formation is required when:

– A set of publishers needs to be notified with Forwarding IDs that point to a set of subscribers

– A set of subscribers needs to be notified about a new or deleted scope


The Topology Manager

• An application

– Calculates shortest paths in a network to derive forwarding information

– Uses (e.g.) the igraph library for this

• How the TM does IPC

– Subscribes locally to scope FFFE

– Receives requests from the RV node as publications

– Publishes responses directly to publishers and subscribers using the Information ID /FFFD/destinationNodeID

– Utilizes an implicit rendezvous dissemination strategy where information is published with a specific FID
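The path calculation can be sketched with a breadth-first search that ORs the LIDs of the links along the shortest path into an FID. This is an illustration only: it uses the Python standard library rather than igraph, and the topology, node names, and 6-bit LIDs are made up for the example.

```python
from collections import deque


def shortest_path(adjacency, src, dst):
    """Breadth-first search; returns the node list from src to dst, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for neighbour in adjacency.get(node, {}):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    if dst not in parent:
        return None
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def path_fid(adjacency, path):
    """OR together the LIDs of the directed links along the path."""
    fid = 0
    for a, b in zip(path, path[1:]):
        fid |= adjacency[a][b]
    return fid


# Hypothetical topology: adjacency[a][b] is the LID of link a -> b.
topology = {
    "A": {"B": 0b000001, "C": 0b000010},
    "B": {"D": 0b000100},
    "C": {"E": 0b001000},
    "E": {"D": 0b010000},
    "D": {},
}
path = shortest_path(topology, "A", "D")
fid = path_fid(topology, path)
print(path, format(fid, "06b"))  # ['A', 'B', 'D'] 000101
```

For several subscribers, the FIDs of the individual paths could be ORed together in the same way to describe a delivery tree.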


Dissemination Strategies

• Currently 5 strategies are implemented

– These strategies are used for choosing the scope of information visibility in a network

1. Node-local

– IPC

2. Link-local

– A node can create information graphs a) locally – accessible to physical neighbours b) remotely – accessible to this node

– Link IDs are provided by applications

Dissemination Strategies

3. Intra-domain

– End-hosts use an FID to a dedicated RV to create information graphs and to subscribe to scopes and information items

– Publishers assign FIDs (to subscribers) to individual information items

4. Subscribe locally

– Do not send anything to any RV

5. Implicit rendezvous

– Publish the data immediately using the provided FID

A Blackadder Network

• All network nodes run the same software

– Blackadder runs in user space or kernel space in the nodes

• Configurations can be different

– End-nodes are configured to have link access (LID) and access to dedicated rendezvous (RV) nodes (with an FID)

– Dedicated forwarding nodes run only the forwarding element

• And other elements if additional functionality is required

(e.g. caching)

– Dedicated RV and TM nodes

• Any nodes can be RV nodes – an FID is required to reach them

• TM nodes run a Topology Manager (TM) application

– A deployment tool can be used for generating configuration files and deploying them in a network

– Network attachment component for dynamic settings

Simple API Example

Publisher:

    # sid and rid are the scope and information identifiers, defined elsewhere
    ba = Blackadder(True)
    ba.publish_scope(sid, "", DOMAIN_LOCAL, None)
    ba.publish_info(rid, sid, DOMAIN_LOCAL, None)
    ev = Event()
    ev.type = 0
    # wait for the notification to start publishing
    while ev.type != START_PUBLISH:
        ba.getEvent(ev)
    # publish each line read from standard input
    while True:
        data = raw_input()
        ba.publish_data(sid+rid, DOMAIN_LOCAL, None, data, len(data))

Subscriber:

    ba = Blackadder(True)
    ba.subscribe_info(rid, sid, DOMAIN_LOCAL, None)
    ev = Event()
    # print the payload of every data event that arrives
    while True:
        ba.getEvent(ev)
        if ev.type == PUBLISHED_DATA:
            print(ev.data[:ev.data_len])

(This example uses a Python API that is wrapped on top of a C++ API library that translates API calls to messages that are passed through IPC sockets.)

Blackadder availability

• Open source (GPLv2 / BSD)

• Code, documentation, etc.

• http://www.fp7-pursuit.eu/

• https://github.com/georgeparisis/blackadder

• Current release: v0.2beta (in GitHub)

BLACKHAWK

Blackhawk

• Pub/Sub prototype that implements the core ideas from PSIRP

– Blackboard-based architecture

• Integrated with the OS kernel

– E.g., virtual memory management

• Objectives: efficiency, natural interface, object deduplication, etc.

• Works in FreeBSD

Publications as Memory Objects

• A publication is an object in the blackboard – i.e., in the computer’s memory

– A (concept) publication is identified by a RId

– A version is a specific piece of data identified by a vRId

• version-RId: hash tree root

– A page is a block of data identified by a pRId

• page-RId: hash of content

• Sub-object relationships

– Concept publications can have several different versions

– Versions have a specific set of pages in a specific order

• Scopes are special publications that are identified by SIds and store collections of RIds
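Content-derived identifiers of this kind can be sketched as follows. The hash function (SHA-256), the page size, and the shape of the hash tree (binary, with the last node duplicated on odd levels) are assumptions made for the illustration; the slides only state that a pRId is a hash of the content and a vRId is a hash tree root.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes


def split_pages(data, page_size=PAGE_SIZE):
    """Cut a version's data into fixed-size pages (the last may be shorter)."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


def page_rid(page):
    """pRId sketch: hash of the page content."""
    return hashlib.sha256(page).digest()


def version_rid(page_rids):
    """vRId sketch: root of a binary hash tree over the ordered pRIds."""
    if not page_rids:
        return hashlib.sha256(b"").digest()
    level = list(page_rids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


v1 = b"a" * 8192 + b"b" * 100
v2 = b"a" * 8192 + b"c" * 100
rids1 = [page_rid(p) for p in split_pages(v1)]
rids2 = [page_rid(p) for p in split_pages(v2)]

same_pages = rids1[0] == rids1[1]           # identical pages, identical pRIds
shared_prefix = rids1[:2] == rids2[:2]      # unchanged pages keep their pRIds
same_version = version_rid(rids1) == version_rid(rids2)
print(same_pages, shared_prefix, same_version)  # True True False
```

Identical pages collapse to one identifier, which is what makes object deduplication possible, while any change to a page yields a different version identifier.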

Blackboard: Objects

• Publication

– A piece of content

– Related metadata

• Identifiers, size, type, …

• Objects have their own identifiers

– E.g. 256 bits; an opaque or a hierarchical structure

– Could be tied to the data and/or an entity

– Single global identifier space assumed (by default)

• Scope

– Collection of data publications (their IDs)

– Information aggregation, access control

• Data

– Placeholder for a “concept”, i.e., mutable content

• Version

– Immutable instance of a data publication

• Page

– A chunk of actual data (e.g. in the OS kernel or in network packets)

– E.g., 4096 bytes

Object Hierarchy

[Figure: object hierarchy tree with five levels: root scope (Scope 0), subscopes (Scope 1, Scope 2), publications (Pub 1 to Pub 4), versions (Version 1 to Version 5), and pages (Page 1 to Page 12, ...)]

Pub/Sub API: Operations

• Create

– Create a piece of content to be published

– I.e., allocate virtual memory objects for data and metadata

• Publish

– Make content available to others

– Results in a new version

• Subscribe

– Request and get content

• Register, Listen

– Get notified about publication events

(e.g., when a new version appears)

Conceptual API

    handle := create(size)
    publish(sid, rid, handle)
    handle := subscribe(sid, rid)
    events := listen(handles[])

– handle: pointers to data and metadata of a memory object

– sid: identifies a scope

– rid: identifies a publication
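To make the create/publish/subscribe cycle concrete, here is a toy in-memory model. It is a sketch only: the Blackboard and Handle classes are hypothetical, everything lives in one process, listen is left out, and versions are simply numbered rather than identified by hashes.

```python
class Handle:
    """Toy handle: mutable data plus a small metadata dictionary."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.metadata = {"size": size, "version": None}


class Blackboard:
    """Toy blackboard keyed by (sid, rid); each publish adds a version."""

    def __init__(self):
        self._versions = {}  # (sid, rid) -> list of immutable snapshots

    def create(self, size):
        return Handle(size)

    def publish(self, sid, rid, handle):
        versions = self._versions.setdefault((sid, rid), [])
        versions.append(bytes(handle.data))  # snapshot becomes immutable
        handle.metadata["version"] = len(versions)
        return len(versions)

    def subscribe(self, sid, rid):
        versions = self._versions.get((sid, rid))
        if not versions:
            return None  # nothing published yet in this toy model
        handle = Handle(len(versions[-1]))
        handle.data[:] = versions[-1]
        handle.metadata["version"] = len(versions)
        return handle


board = Blackboard()
handle = board.create(2)
handle.data[:] = b"hi"
board.publish("sid", "rid", handle)    # version 1
copy = board.subscribe("sid", "rid")
copy.data[0] = ord("H")                # modify the subscriber's copy
board.publish("sid", "rid", copy)      # version 2
latest = board.subscribe("sid", "rid")
print(bytes(latest.data), latest.metadata["version"])
```

Publishing never overwrites an earlier snapshot, so version 1 stays available after version 2 appears.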

System Architecture

[Figure: Blackhawk system architecture. Components shown: pub/sub applications (data publisher, data subscriber), pub/sub API library, system call interface, blackboard with internal data structures, kernel-level interface, virtual memory system, file system, kernel events (kevents), kernel-level Click, userlevel Click, kernel interface (socket), userlevel interface (fs), RZV client, RZV if, forwarder, network devices, TM, and RZV node.]

VM System Integration

• Motivation:

We want to achieve efficiency, a natural interface and object deduplication

• Existing FreeBSD VM system data structures utilized:

– vm_page_t

– vm_object_t

– vm_map_t, vm_map_entry_t

– ...

• In our system, each publication has one VM object for metadata and one for data

VM System Integration

• Metadata object

– One page (currently)

– Object’s own ID, its size, etc.

– List of sub-object IDs

• Pub: versions

• Version: pages

• Data object

– Pages contain actual content

VM System Integration

• Metadata and data objects mapped to applications’ memory spaces (when created or subscribed to)

• Data is copy-on-write

– Can be modified

• results in a new shadow object

• unmodified pages shared – don’t need to be copied

– Re-publishing results in a new version that can be subscribed to
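The effect of copy-on-write on storage can be sketched with a content-addressed page store, where a modified version only adds the pages that changed. This models the sharing at the level of page identifiers; it does not reproduce FreeBSD's shadow objects, and the class and function names are hypothetical.

```python
import hashlib


class PageStore:
    """Stores each distinct page once, keyed by a hash of its content."""

    def __init__(self):
        self.pages = {}

    def put(self, content):
        page_id = hashlib.sha256(content).hexdigest()
        self.pages.setdefault(page_id, content)
        return page_id


def publish_version(store, pages):
    """A version is an ordered tuple of page IDs."""
    return tuple(store.put(page) for page in pages)


def modify(store, version, index, new_content):
    """Copy-on-write: only the touched page gets a new entry."""
    pages = list(version)
    pages[index] = store.put(new_content)
    return tuple(pages)


store = PageStore()
v1 = publish_version(store, [b"page-0", b"page-1", b"page-2"])
v2 = modify(store, v1, 1, b"page-1-edited")
shared = sum(a == b for a, b in zip(v1, v2))
print(shared, len(store.pages))  # 2 4
```

Two three-page versions occupy four stored pages instead of six, and the old version remains intact for existing subscribers.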


File System Integration

• Each publication has a corresponding vnode in the kernel

• Applications get an open file descriptor in the “handle”

– After publish or in subscribe

• Enables the use of kevents

– We use it to get notifications when somebody publishes (or subscribes to) something

File System Integration

• A new file system type, psfs

• File system view to the blackboard

– E.g.: /pubsub/sid/rid/vrid/prid/data

– Data/metadata can be accessed on different levels in the object hierarchy

– In theory, we can also map file system ops to pub/sub ops

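[Figure: example directory tree under /pubsub with scope directories (/sid1, /sid2), a publication directory (/rid1) holding data and meta entries, and a version directory (/vrid1, ...) below it]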

• Could be used for enabling demand paging over the network as well

– Together with a pull-based caching-enabled transport protocol

In-kernel Rendezvous

• Publication Index (pubi)

– Each scope, data publication and version (and page) has this small additional data structure for auxiliary in-kernel metadata

– Holds pointers to metadata and data VM objects and a vnode, filesystem-related information, etc.

• Publication Index Table (PIT)

– UMA zone-based storage

– Hash table with ID → pubi mappings

– All identifiers are accessible on the same hierarchical level

– Used for (recursive) object lookups in the blackboard

• ID → pubi → metadata and/or data → sub-obj. ID → …
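The recursive lookup chain can be sketched with an ordinary dictionary standing in for the PIT. The Pubi dataclass, its fields, and the identifiers below are hypothetical simplifications; the real structure holds pointers to VM objects and a vnode rather than inline data.

```python
from dataclasses import dataclass, field


@dataclass
class Pubi:
    """Toy publication index entry."""
    kind: str
    sub_ids: list = field(default_factory=list)  # from the metadata
    data: bytes = b""                            # only pages carry data here


def resolve(pit, object_id, depth=0):
    """Walk ID -> pubi -> sub-object IDs, yielding (depth, id, kind)."""
    pubi = pit[object_id]
    yield depth, object_id, pubi.kind
    for sub_id in pubi.sub_ids:
        yield from resolve(pit, sub_id, depth + 1)


# Flat table: every identifier sits on the same level, whatever its type.
pit = {
    "sid1": Pubi("scope", ["rid1"]),
    "rid1": Pubi("publication", ["vrid1"]),
    "vrid1": Pubi("version", ["prid1", "prid2"]),
    "prid1": Pubi("page", data=b"hello "),
    "prid2": Pubi("page", data=b"world"),
}

walk = list(resolve(pit, "sid1"))
content = b"".join(pit[oid].data for _, oid, kind in walk if kind == "page")
for depth, oid, kind in walk:
    print("  " * depth + oid + " (" + kind + ")")
print(content)
```

The hierarchy exists only in the sub-object ID lists; the table itself is flat, so any object can also be looked up directly by its identifier.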

In-kernel Rendezvous

[Figure: PIT pages hold ID → pubi entries; each publication index (pubi) holds a pointer to a metadata page, which contains the ID, size, sub-object count, etc., followed by the sub-object IDs]

    Publication type    Identifier    Metadata    Data
    Scope               SID           VRIDs       RIDs in scope
    Concept (Data)      RID           VRIDs       Newest version
    Version             VRID          PRIDs       Immutable data
    Page                PRID          -           Immutable page

Networking: Basic Protocol

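[Figure: message sequence chart between publisher (P), rendezvous (R), and subscriber (S). Messages shown: Publish, Subscribe, Subscriber set update (MD SUB), Version metadata, Data subscription. Other labels in the figure: RC, RN (Rendezvous), DP (Data), DS.]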

API

• Native C API: the libpsirp pub/sub library

• Wrappers for Python and Ruby

– Generated with SWIG and additional C and Python/Ruby code

– The API for Python is object-oriented

C API

• Header

– #include <libpsirp.h>

• Types

– Identifiers: psirp_id_t (array)

– Handle: psirp_pub_t (pointer)

• Primitives

– psirp_create(), psirp_subscribe(), psirp_subscribe_sync(), psirp_publish(), psirp_free()

• Accessors

– for data, length, identifiers, fd, …

– psirp_pub_data(psirp_pub_t pub), psirp_pub_data_len(psirp_pub_t pub), …

• Events

– psirp_kq_t

– or standard kqueue() and kevent() calls

Very Simple API Example

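    #include <stdint.h>
    #include <libpsirp.h>

    void pubsub(psirp_id_t *sid, psirp_id_t *rid) {
        psirp_pub_t pub;

        /* Subscribe to the publication; a non-zero return is treated as failure. */
        if (psirp_subscribe(sid, rid, &pub, 0x0) != 0)
            return;

        /* Modify the first two bytes of the mapped data. */
        uint8_t *data = (uint8_t *)psirp_pub_data(pub);
        data[0] = 'a';
        data[1] = 'b';

        /* Re-publish the modified data, then release the handle. */
        psirp_publish(sid, rid, pub, 0x0);
        psirp_free(pub);
    }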

Blackhawk

• Open source (GPLv2 / BSD)

• Documentation, source code, VM images, etc.

• http://www.psirp.org

• http://code.psirp.org

• http://users.piuha.net/blackhawk/

• Current release: v0.3 – in this presentation we described a more developed version

Summary

• Two information-centric pub/sub prototypes

• Different approaches

– Channel vs Document

– Not presented: Algorithmic IDs

• Blackadder implements PURSUIT’s functional model

• Blackhawk implements PSIRP’s memory object model

• Similar APIs, similar architectural components

– Ongoing work: Integration

BLACKADDER DEMO
