Datalog for Decentralized Social Networking

Monica Lam
Stanford University

with Dan Boneh, Ruven Chu, Ben Dodson, Bobby Georgescu, Sudheendra Hangal, Te-Yuan Huang, Diana MacLean, Chanh Nguyen, Debangsu Sengupta, Jiwon Seo, Seok-Won Seong, Chinmay Soman, Steven Soneff, Seng Keat Teh, Ian Vo

Part of POMI (Programmable Open Mobile Internet 2020), an NSF Expedition

Outline
• Why decentralized social networking?
• Overall architecture
• Datalog
• Access control

Trends in Social Networking

The Omniscient Monopoly
Sooner Than You Think

It’s the Technologists’ Fault
There is no easy alternative to share!

Big-Brother Portals
[Diagram: web browsers connecting to separate portals: Poker Portal, Facebook Portal, Email Portal, Flickr Portal]
• Data privacy
• Data silos
• Loss of independence / competition
• Scalability

Approach
• Decentralized architecture
• Scalability, independence, privacy
• Much more powerful than centralized
• Open API for collaboration

Data is What’s Important

My Personal Cloud
• Friends’ list
• Calendar
• GPS trace
• Credit card history
• Email
• Phone record

Personal Cloud Butler
• Mediates access to personal data
• Manages a semantic index pointing to data hosted anywhere
• The index can’t be encrypted

Where is the Butler?
• Where data are consumed
• Terabytes of personal data!

Personal-Cloud Butlers
• 32 GB, instantaneously
• With you all the time, even when not connected
• Private
• Better than the cloud!

Phone: Digital Identity, Wallet
• Unique password for each website
• Login in 5 seconds
• Challenge-response authentication

Phone: Digital Personality

WeTube: Ad hoc sharing without an ASP

weBluff
• Start activity
• Invite: QR code ... Join my Bluff game!
• Accept with a snap: 408-555-5555 ACCEPT
• Download software
• Join activity
• Verifiably fair [Blum 82]

Concepts in Decentralization
• Phone as your digital identity
• Junction: a decentralized platform for ad hoc, social applications

PrPl: Private-Public Data Infrastructure
[Diagram: web browsers connecting to Facebook, Email, and Flickr portals and to Personal-Cloud Butlers]
Millions of Personal Terabyte Databases Out There!

Social Multi-Database
[Diagram labels: iPhone, Android, Mobile client, API (Contact, Photo, GPS, Music), SociaLite: Social DB Language, Data, PrPl OpenID Manager, Index Manager, Personal-Cloud Butler, Data Steward API, home server, facebook, imap, Friends’ Butlers]

Basic Social Applications
• Single query to the personal Butler (3 Datalog rules)
• Butler contacts other Butlers to return results

Applications Enabled by PrPl
• Finding data in your friends’ terabyte databases
• Collective photo album (more)
• Collaborative tagging, e.g. Tag: Emma (John)
• Looking up friends’ library & recommendations

Search Through Personal DBs
    Fstar(p) :- Friend(p).
    Fstar(p) :- Fstar(x), Friend[x](p).
    Fstar-CurrLoc(p, l) :- Fstar(p), CurrLoc[p](l).

Datalog
• Queries are naturally recursive: including the destination
• To hide details of
  • Distribution
  • Authentication
  • Optimizations

Basic System Credentials
• Single sign-on: a Butler presents a session ticket to other Butlers.
• Tickets are issued for applications to retrieve blobs from wherever.
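
The three rules on the "Search Through Personal DBs" slide can be read as a transitive closure over per-person friend tables. Below is a minimal Python sketch of that reading, not the SociaLite implementation: the dictionaries `FRIEND` and `CURR_LOC` stand in for data held by each person's Butler, and all names and values in them are made up for illustration. A real system would issue the `Friend[x]` and `CurrLoc[p]` lookups as remote queries to other Butlers.

```python
from collections import deque

# Hypothetical stand-ins for per-Butler tables (illustrative data only).
# FRIEND[x] plays the role of Friend[x](p); CURR_LOC[p] of CurrLoc[p](l).
FRIEND = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["alice"],
    "carol": [],
}
CURR_LOC = {"alice": "Palo Alto", "bob": "Tahoe", "carol": "Boston"}


def fstar(owner):
    """Fstar(p) :- Friend(p).  Fstar(p) :- Fstar(x), Friend[x](p)."""
    reached = set()
    frontier = deque(FRIEND[owner])  # base rule: the owner's own friends
    while frontier:
        p = frontier.popleft()
        if p in reached:
            continue  # already derived; avoids looping on friendship cycles
        reached.add(p)
        # Recursive rule: in a deployment this would be a query to p's Butler.
        frontier.extend(FRIEND.get(p, []))
    return reached


def fstar_curr_loc(owner):
    """Fstar-CurrLoc(p, l) :- Fstar(p), CurrLoc[p](l)."""
    return {(p, CURR_LOC[p]) for p in fstar(owner) if p in CURR_LOC}


if __name__ == "__main__":
    for person, loc in sorted(fstar_curr_loc("me")):
        print(person, loc)
```

The worklist loop is one simple way to reach the fixpoint of the recursive rule; the `reached` set is what keeps the evaluation from revisiting a person who appears in several friend lists.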

Extensions
• Localization
• User-defined functions
• Aggregate functions

Optimizations
• Dynamic query: phone to Butler, Butler to Butler
• Continuous connection: real-time activities
• Polling based: maintain consistency of a selected portion of the DB
• Pipelined execution:
  • Return and display results as they come in
  • Six degrees of separation
  • Speed is more important than completeness
  • Toleration of slow/offline servers

Preliminary Experimental Results (100 Butlers in EC2)

Tail Recursion Optimization
    Fstar(p) :- Friend(p).
    Fstar(p) :- Fstar(x), Friend[x](p).
    AllWeights(count<w>, sum<w>) :- Fstar(f), Weight[f](w).
• Recognize tail recursion
• Visit in a depth-first search
• Perform reduction in the intermediate nodes
• E.g. Top 10 songs: 12 sec vs. 100 sec.
[Diagram: Butler]

Concepts in Decentralization
• Phone as your digital identity
• Junction: a decentralized platform for ad hoc, social applications
• PrPl Personal Cloud Butlers
  • Federated storage system
  • Semantic index: database + semantic file system
  • Datalog for distribution and optimization

Access Control
• Facebook
  • 45% do not have any access control
  • API is hard to use
  • Security is as strong as the weakest link
• Inspiration: e-mail
  • Control access of each e-mail
  • Many unnamed lists with nuances

Email Social Topology
@Work @Play
• "Enjoying powder at Heavenly! >:-D" @play
• "Working from home, sick" @work

Friends Come and Go
Continuous update!
• Phone log
• SMS

Filter is More Important than Access Control
• Automatic clustering
• Intelligent search
• …
Most important optimization in SociaLite!

Decentralization, Open API
• Phone as your digital identity
• Junction: a decentralized platform for ad hoc, social applications
• PrPl Personal Cloud Butlers
  • Federated storage system
  • Semantic index: database + semantic file system
  • Datalog for distribution and optimization
• Access control
  • Semi-automatically and continuously mined from e-mail
  • Exports different friends lists to web portals
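
The "Tail Recursion Optimization" slide describes visiting friends depth-first and reducing at intermediate nodes rather than shipping every result back to the origin. The following Python sketch illustrates that idea under stated assumptions: `FRIEND` and `WEIGHT` are hypothetical in-memory stand-ins for per-Butler tables, `Weight[f](w)` is read as one value stored at f's Butler, and the shared `visited` set stands in for whatever bookkeeping a real deployment would pass along with the query. It is not the SociaLite implementation.

```python
# Hypothetical stand-ins for per-Butler tables (illustrative data only).
FRIEND = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["alice"],
    "carol": [],
}
WEIGHT = {"alice": 60, "bob": 80, "carol": 55}


def all_weights(owner):
    """AllWeights(count<w>, sum<w>) :- Fstar(f), Weight[f](w).

    Each visited node returns a partial (count, sum) for its subtree, so
    only two numbers flow back up the chain instead of raw tuples.
    """
    visited = set()  # assumption: shared here; would travel with the query

    def visit(person):
        count, total = 0, 0
        for f in FRIEND.get(person, []):
            if f in visited:
                continue  # already counted via another path
            visited.add(f)
            if f in WEIGHT:
                count += 1
                total += WEIGHT[f]
            # Depth-first: descend into f and fold in its partial result.
            sub_count, sub_total = visit(f)
            count += sub_count
            total += sub_total
        return count, total

    return visit(owner)


if __name__ == "__main__":
    print(all_weights("me"))
```

Because `count` and `sum` are associative, combining partial results at each intermediate node gives the same answer as collecting all `Fstar` tuples at the origin first and aggregating there.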