XTreeNet: A Framework for Flexible Large Scale
Information Dissemination & Retrieval
TaeWon Cho, Divesh Srivastava,
K. K. Ramakrishnan, Yin Zhang and many others
AT&T Labs Research, NJ USA
August 2011
© 2008 AT&T Intellectual Property. All rights reserved.
Network as the Vehicle for Information
Dissemination
• The ‘network’ is becoming (and already has become) increasingly information-centric
– Information of all types becoming electronic and network accessible
– Access of information based on content of interest, instead of location
• Information overload at scale: both producers and consumers face
challenges
– Large number of producers (publishers; data sources)
– Even larger number of consumers (subscribers, users querying/looking
for content)
o Tremendous number of information producers makes it difficult for a
consumer to know where to find relevant information
– Significant challenge: “whom and what to ask” & “whom and what to
tell”
• XTreeNet looks at the various problems related to a network-based information dissemination and retrieval environment
– Obtain “information” of interest by asking the network to find it
– Tell the network to deliver “information” of interest
– Ask the network what “information” I should be interested in
Role of the Network in Information Dissemination
• The success of information aggregators (search engines, etc.) is unquestionable
– Information aggregators do play a key role
• Limitation:
– Disintermediates producers: constrains the business model of producers
• Timeliness and Coverage are also key criteria for information
dissemination
– Timeliness: Need information (including real-time) to be available right away
o E.g., for a consumer to access real-time media content
o Ability for the content to be withdrawn is also desirable
– Coverage: Availability of information depends on set of information that is
made available to the consumer by intermediaries, like an aggregator
o Information providers can be “dynamic”/transient. Complete coverage by an
aggregator may be difficult
o Desirable to enable information producers themselves to make it available on an as-needed basis
• Publish-subscribe based access has become somewhat popular
– (E.g., news groups, RSS feeds)
• Information dissemination and Query-Response for Information Retrieval
in a scalable manner is essential
– Inherently N-to-N communication
– We seek to exploit XML-tagging of information
XML Routing: Overlay Services based on XML
[Figure: an XML overlay network running over the IP network infrastructure. Labels: XML router; XML Overlay Network; IP Network Infrastructure; Data query generation; Database; Subscriber for alerts; Subscriber for information; Publisher]
• An XML Network: overlay network of XML switches/routers
• XTreeNet project: investigates the design of a large-scale
integrated publish/subscribe + query/response application
• How can we partition functions between the overlay and the underlay?
XTreeNet Overview
• Publishers and subscribers submit Content Descriptors (CDs) to the
network
• As soon as a CD (from a producer or consumer) reaches the network, the first
overlay router maps it into a single hash-id
– Subsequent routers downstream forward based on the hash-id
=> much more efficient than matching against aggregated query filters
• XTreeNet builds a common core-based tree (CBT) on a per-CD basis that
integrates both producers and consumers of information
– The CBT is created dynamically on first arrival of the CD from a producer
• Groups (overlay multicast) formed on an as-needed basis for each
CD
– Very fine grained distribution tree connecting producers & consumers
– Branches to subscribers for disseminating published content & branches to
publishers for forwarding queries
– Different cores for different CDs – reduce likelihood of traffic concentration
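The hash-id mapping and per-CD core selection described above can be sketched as follows. This is a minimal illustration, not the XTreeNet implementation: the choice of SHA-1, the 64-bit truncation, the modulo rule for picking a core, and the router names are all assumptions made for the example.

```python
import hashlib

def cd_hash_id(cd: str) -> int:
    """Map a content descriptor string to a fixed-size integer hash-id.

    SHA-1 truncated to 64 bits is an assumption; the slides only say
    that a CD is mapped to a single hash-id at the first overlay router.
    """
    digest = hashlib.sha1(cd.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def select_core(cd: str, routers: list) -> str:
    """Pick the core of the CD's tree from its hash-id (assumed rule)."""
    return routers[cd_hash_id(cd) % len(routers)]

# Hypothetical overlay routers and CDs, for illustration only.
routers = ["r0", "r1", "r2", "r3"]
cds = ["/title/ReutersNews", "/editor/Jupiter", "/link/reuters.com"]
cores = {cd: select_core(cd, routers) for cd in cds}
```

Because every router computes the same hash-id for the same CD, downstream routers can forward on an exact-match lookup of that id, and different CDs tend to land on different cores.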
Content Descriptors
• Content Descriptors (CDs) act like “indexes” in a distributed
database environment
– Each data item generated by a producer and each consumer query filter
are independently mapped to a set of CDs
– A data item matches a query when respective sets of CDs have at least
one CD in common
• CDs decouple producers from the consumers
– Can support heterogeneous producer schemas
• CD can be an element of a topic hierarchy; multiple hierarchies
may be supported (e.g., topics, geographic location)
– An XML schema path (root-to-leaf path) may also be used as basis of
hierarchically structured domain for constructing CDs
o Disambiguate between multiple XML documents using string values at
leaves
<rss> <channel>
<editor> Jupiter </editor>
<item> <title> ReutersNews </title>
<link> reuters.com </link> </item>
<description> abc </description>
</channel> </rss>
[Figure: tree view of the XML document above. rss > channel, with children editor (Jupiter), item (title: ReutersNews; link: reuters.com), and description (abc)]
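The matching rule stated on this slide (a data item matches a query when their CD sets share at least one CD) amounts to a set-intersection test. The sketch below assumes CDs are plain strings; the query contents (BBCNews, Saturn) are hypothetical values invented for the example.

```python
def matches(item_cds: set, query_cds: set) -> bool:
    """A data item matches a query if the two CD sets share at least one CD."""
    return not item_cds.isdisjoint(query_cds)

# CDs derived from the RSS example on this slide (Last Tag form).
item_cds = {"/title/ReutersNews", "/editor/Jupiter"}

# Hypothetical consumer query filters, already mapped to CD sets.
query_a = {"/title/ReutersNews", "/title/BBCNews"}
query_b = {"/editor/Saturn"}

hit_a = matches(item_cds, query_a)
hit_b = matches(item_cds, query_b)
```

Since producers and consumers each map to CDs independently, neither side needs to know the other's schema for the test to work.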
Scalability of CDs
• Publisher guidance
o Information publisher provides guidance on which XML tags are of potential interest
• Strategies
o Fullpath: /rss/channel/item/title/ReutersNews
o Last Tag: /title/ReutersNews
o Keyword: ReutersNews
• Estimated by extracting CDs from XML version of Wikipedia
• ~ 5M CDs for about 1M articles and grows slowly
– duplication of CDs in documents
[Figure: Unique CDs generated by Wikipedia articles. x-axis: # of Wikipedia articles; y-axis: # of unique CDs (0 to 8,000,000); series: Fullpath, Last Tag, Keyword, Last Tag + Keyword]
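The three CD extraction strategies can be illustrated on the RSS example from the Content Descriptors slide. This is a sketch using Python's standard `xml.etree.ElementTree`; the function names are hypothetical, and treating the whole leaf string as a single keyword is an assumption (the slides do not say whether leaf text is split into words).

```python
import xml.etree.ElementTree as ET

DOC = """<rss> <channel>
<editor> Jupiter </editor>
<item> <title> ReutersNews </title>
<link> reuters.com </link> </item>
<description> abc </description>
</channel> </rss>"""

def leaf_paths(elem, prefix=()):
    """Yield (tag_path, text) for every leaf element with non-empty text."""
    path = prefix + (elem.tag,)
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if text:
            yield path, text
    for child in children:
        yield from leaf_paths(child, path)

def extract_cds(xml_text, strategy):
    """Return the set of CDs for one document under the given strategy."""
    cds = set()
    for path, text in leaf_paths(ET.fromstring(xml_text)):
        if strategy == "fullpath":
            cds.add("/" + "/".join(path) + "/" + text)
        elif strategy == "lasttag":
            cds.add("/" + path[-1] + "/" + text)
        elif strategy == "keyword":
            cds.add(text)  # assumption: whole leaf string is one keyword
        else:
            raise ValueError("unknown strategy: " + strategy)
    return cds

fullpath_cds = extract_cds(DOC, "fullpath")
lasttag_cds = extract_cds(DOC, "lasttag")
keyword_cds = extract_cds(DOC, "keyword")
```

Shorter strategies collapse more documents onto the same CD, which is one way to read the duplication effect noted above: the number of unique CDs grows more slowly than the number of documents.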
Scalable Multicast: Multicast Architecture with
Adaptive Dual-state
• Multicast is key to efficient information dissemination
• Requirements for Information-centric Multicast:
– Scalability in group membership
o Fine granularity of access => support for large number of groups
– Persistent access to group
o Network should be responsible for maintaining group membership unless
users explicitly un-subscribe from group
– Minimize loss of information
– Keep control traffic scalable
• Limitations of existing IP / Overlay Multicast
o Forwarding state grows linearly with number of groups
– State overhead (at multiple routers)
o Soft-state needs to be refreshed
– Control overhead
o Hence, limits scalability and has inadequate persistence
• How to achieve scalable and persistent multicast?
• MAD seeks to solve issues of scale and persistence with multicast
Group Membership Lifetime & Activity Level
• Membership (e.g., in a pub-sub environment) is likely to be long-lived
– Users subscribe, and remain interested in receiving information even when
publishers distribute infrequently
– Only 2.3% of groups see a reduction
– Long-lived membership results in growing network state for the group and
increased group size
• Group activity can vary widely
– Analyzed publishing activity of RSS feeds
o Only 5% of RSS feeds publish more than 100 updates/month
o Median rate is 10 updates/month
– The 10% most active feeds contribute 75% of updates
• IP multicast: an inactive group is usually treated the same as an active group
o But we can’t afford loss of information
[Figures: Subscription count to YouTube channels; RSS: Publishing rate (# updates/month)]
Using an IP-Multicast Style Approach
• Every intermediate router has to maintain state
o Forwarding state grows linearly with number of groups
– State overhead (at multiple routers)
o Soft-state needs to be refreshed
– Control overhead
• A lot of routers maintain forwarding state:
– 6 intermediate routers keep state that has to be constantly refreshed
– 4 first-hop routers also keep state
[Figure: example topology of routers 00-15 with attached users. Legend: First-hop router (FH), Forwarder, Router not participating, User]
The MAD environment
• MAD multicast service overlay consists of a set of
logical overlay routers
• Each logical router serves as a single aggregated local
subscriber for all users attached to it
• Subscription manager responsible for all the users’
subscription management
– maintains subscriptions for users connected to site
Differentiate the Roles of Multicast State
• Membership State vs. Forwarding State
• Group membership can be separated from
forwarding state
– Group membership must be stored scalably and
persistently
o Especially for groups that have low frequency of information
flow
– Forwarding state: efficient forwarding of active groups
o Can be re-generated when a group becomes active
• Active and inactive groups can be treated
differently
– Small percent of (active) groups generate data at a high
rate: forward efficiently
– Large percent of (inactive) groups generate low traffic
volume
The MAD Solution
• Group membership is separated from forwarding state:
Multicast with Adaptive Dual State
• Use Membership Tree (MT) for scalable state maintenance
– Store group membership information in MT
o Minimize number of intermediate routers keeping group state
– Impose static virtual hierarchy => no control overhead
o But, static hierarchy may not result in optimal delivery path
• Use Dissemination Tree (DT) for forwarding efficiency
– Use DT for active groups
o Can use any “state-of-the-art” multicast protocol
• MAD may begin as an overlay multicast service
– Use IP multicast to improve forwarding efficiency for DT
– MT may also eventually evolve to being supported by the underlay
• MAD achieves best of both worlds - scalability and forwarding
efficiency
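The split between persistent membership state and regenerable forwarding state can be sketched as a small data structure. This is an illustration under stated assumptions, not the MAD protocol: the activity threshold `ACTIVE_THRESHOLD` is a hypothetical value, and forwarding state is simplified to a set of first-hop routers rather than an actual dissemination tree.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical cut-off (updates/month) above which a group counts as active.
ACTIVE_THRESHOLD = 100

@dataclass
class GroupState:
    # Membership state: always kept, even while the group is inactive.
    members: Set[str] = field(default_factory=set)
    # Forwarding state: present only while the group is active (DT mode).
    forwarding: Optional[Set[str]] = None

    def join(self, first_hop: str) -> None:
        self.members.add(first_hop)
        if self.forwarding is not None:
            self.forwarding.add(first_hop)

    def update_activity(self, updates_per_month: int) -> None:
        if updates_per_month >= ACTIVE_THRESHOLD:
            if self.forwarding is None:
                # Regenerate forwarding state from stored membership.
                self.forwarding = set(self.members)
        else:
            # Drop forwarding state; membership persists.
            self.forwarding = None

    def mode(self) -> str:
        return "DT" if self.forwarding is not None else "MT"

group = GroupState()
group.join("fh1")
group.join("fh2")
group.update_activity(5)
mode_when_quiet = group.mode()
group.update_activity(500)
mode_when_busy = group.mode()
group.update_activity(5)
```

The point of the design is visible in the last call: when activity drops, the forwarding entries disappear but no subscriber is lost, because membership never depended on them.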
MAD Membership Tree protocol overview
• Goal of Membership Tree: reduce # routers keeping
multicast group state
• MT selects the core (root) based on hash of group ID
– Define a single base tree at this root (static)
– All groups selecting this root use the base tree to construct MT
• Subscriber join is forwarded up on the base tree until
it reaches first on-tree node for this group’s MT
– When a subtree rooted at an en-route router has more than a
min. # of first-hop routers with attached subscribers, the
parent node on the MT requires that the en-route router join
the MT
• The MAD protocol provides a seamless transition from DT to MT as a
group’s level of activity drops over time
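The join procedure above can be sketched as a toy model. Everything beyond the slide text is an assumption: the base tree is given as a child-to-parent map, the core is taken as already selected, first-hop routers are leaves of the base tree, only the en-route router directly below the current on-tree node is checked for promotion, and the router IDs are a hypothetical topology rather than the one in the slides' figures.

```python
class MembershipTree:
    """Toy model of one group's Membership Tree on a static base tree."""

    def __init__(self, parent, root, threshold):
        self.parent = parent        # base tree as a child -> parent map
        self.root = root            # core already chosen for this group
        self.threshold = threshold  # min. # of first-hop routers
        self.state = {root: set()}  # on-tree router -> first-hops it tracks

    def path_to_root(self, node):
        path = [node]
        while path[-1] != self.root:
            path.append(self.parent[path[-1]])
        return path

    def join(self, first_hop):
        path = self.path_to_root(first_hop)
        # Forward the join up until the first on-tree node above the first hop.
        idx = next(i for i, n in enumerate(path) if i > 0 and n in self.state)
        holder = path[idx]
        self.state[holder].add(first_hop)
        # The en-route router directly below the on-tree node, if any.
        en_route = path[idx - 1]
        if en_route == first_hop:
            return
        below = {f for f in self.state[holder]
                 if en_route in self.path_to_root(f)}
        # More than the minimum in that subtree: the en-route router joins
        # the MT and takes over state for those first hops.
        if len(below) > self.threshold:
            self.state[holder] -= below
            self.state[en_route] = below

# Hypothetical base tree: 00 is the core; 03-06 are first-hop routers.
base_tree = {"01": "00", "02": "00", "03": "01",
             "04": "01", "05": "01", "06": "02"}
mt = MembershipTree(base_tree, root="00", threshold=2)
for fh in ["03", "04", "06", "05"]:
    mt.join(fh)
on_tree = set(mt.state)
```

In this run, router 01 joins the MT only once a third first-hop router below it subscribes, while router 02 (with a single subscriber below it) keeps no group state.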
Routers Maintaining State in MAD
[Figure: two panels over routers 00-15. Panel labels: Base Tree; Membership Tree (4 First-hops, 5 users); Virtual membership tree (fan-out 8, aggregation threshold 2)]
• Fewer routers maintain state:
– 2 intermediate routers and 4 FH routers
• Forwarding by multicast/unicast – not necessarily efficient
• MT reduces number of routers keeping Multicast State by
aggregating subscriber state in a virtual sub-tree
Scalability of Multicast with MAD
• Evaluation using simulation and measurements with implementation
– Implementation measured on Emulab with about 100 routers
– Simulation with 16,000 routers; Power-law topology
• MAD achieves both efficient state maintenance and efficient forwarding
• Forwarding efficiency with MAD is as good as IP multicast (DT)
• State efficiency with MAD is significantly better than IP
multicast-like approaches (DT)
[Figures: two plots. Recovered axis labels: Total Delay (msec); Number of Groups (Trillions); Number of First-Hop Routers in a Group]
Summary
• XTreeNet: project we have been working on –
primarily focused on the meta-data plane
– XTreeNet Architecture – complex processing at the edges;
efficient forwarding in the core
– MAD: Scalable Multicast – Large # groups; Large #
subscribers
– QDTs: Query Distribution Trees for Distribution of Complex
Queries – Load Balancing, Privacy preservation, Censorship
Resistant
– Recommendation Systems: Scalable, Privacy Preserving
• More recent work: “COPSS: An Efficient Content-Oriented Publish/Subscribe System” in collaboration
with folks from University of Goettingen, Germany