>> Lili Cheng: Thanks everybody for coming. Today we have Muneeb Ali. He is the founder
and CTO star of Onename. This is great because Chi [phonetic] actually asked us to host him to
give a talk. Chi [phonetic] often finds amazing people and asks us to help make sure they meet
some cool people. So Onename is actually really interesting. I don't know how many of you are
familiar with blockchain, but Onename is built on top of the blockchain technology and bitcoin
and it really looks at creating a very alternative way of creating a directory, and so you’re going
to get a little background with the project and maybe show a demo and then just open it up to
questions. It's a pretty tight audience so feel free to ask questions. Muneeb is also in the
process of getting his PhD at Princeton, and one of the interesting things is this space is fairly
uncommon for researchers to look at partly because it's sort of in between, he's a distributed
systems researcher, but often blockchain has been more in the security space and so it falls in
between research areas. So if you have others too that you think would be interested in seeing
the work let me know. We connect people after. So thanks a lot and welcome and thanks for
coming out and giving a talk.
>> Muneeb Ali: All right. Thank you. So I see that [indiscernible] sitting there. We were in the
same program, and I am technically still enrolled. I'm ABD, all but dissertation, and I have an
interesting story on how all of this happened. So that's the next slide. So this is just a quick
overview of what I'm going to talk about. People who are not familiar with bitcoin I will assume
like you don't know anything, I'll just go over the basics, and then a very simplified model of
what is a blockchain. Decentralized identity is what we do, but first we will try to have a look at
what it really means, what it can enable, and then experiences from like how we launch, the
lessons we learned, and some of the open-source systems we have developed that integrate
those lessons. So we've been running for more than a year now.
Back to my story. So this is the graph people are most familiar with whenever they think about
bitcoin. They know that it went up a lot and then it crashed. If you put a timeline this is
summer of 2013. So I was working at Princeton, I think I was working on a cloud storage project
with Larry Peterson, and I had no intention whatsoever to start working in this area. I was
focused on like finishing my thesis as soon as possible and then bitcoin happened in the
sense that I discovered it, met my cofounder, I actually knew him for a while at Princeton as
well, and we both started working and exploring what is this technology?
One thing that really drew me into this was bitcoin solving problems in distributed systems that
people have been working on for like decades and it wasn't getting a lot of attention from my
community. You were not seeing a lot of researchers talking about it or actively working on
it. Most people who were actively doing research in bitcoin were actually coming from the
crypto community or there were some people who were working on game theory who were
interested in this new currency as well.
So this is where I converted all of my grad student savings into bitcoins and then it went like
this. So you can tell I was very happy at this point. I remember going around giving talks about
bitcoin and I would tell people that hey, this is not investment advice. I'm very excited, but
usually I would go and talk to friends and the next thing I know they’re like hey, I bought my
first bitcoin. It's going up. I'm going to buy more. I was like please do not go and buy bitcoins.
This is not an investment. I'm personally very excited about the technology, and as a user I
wanted to immerse myself into it. Then shortly it crashed. There were a lot of security
problems not in the protocol itself, but in the implementations, in the exchanges that people
were running and it's also like really early days, so you would see that the community that bitcoin
was attracting wasn't as refined or as professional as you would ideally like it to be.
So let's take a step back and first ask the question what really is bitcoin? For that I think we
should try to design a new digital currency. So this is a Professor at Princeton, Brian Kernighan,
and assume that we got this idea that hey, why don't we start a new currency? Sure. I have 10
coins and Kernighan has 10 coins. We have a piece of paper, literally a piece of paper, and we
just write down that this is the balance. Paul Krugman says I want to play this game as well. I
want to be a part, and he joins the system, and now he has no coins. I’m like, okay Paul, I'm
going to help you get started so I'm sending him two coins. Right now that transaction is not
confirmed. So I still have this, Krugman has this, and there's an unconfirmed transaction, and
when it gets confirmed the balances on that literal piece of paper change and now I have
eight coins, Krugman has two coins and my job is over. You now know everything about
bitcoin.
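A minimal sketch of the pen-and-paper ledger described above, assuming a single shared sheet of balances plus a list of unconfirmed transactions. The class and method names (PaperLedger, join, send, confirm) are made up for illustration and are not part of bitcoin.

```python
class PaperLedger:
    """One shared piece of paper: balances plus unconfirmed transactions."""

    def __init__(self):
        self.balances = {}
        self.pending = []

    def join(self, name, coins=0):
        # A new participant starts with whatever balance is written down.
        self.balances[name] = coins

    def send(self, sender, recipient, amount):
        # The transaction is only recorded here; balances do not change yet.
        self.pending.append((sender, recipient, amount))

    def confirm(self):
        # Apply every pending transaction the sender can actually afford.
        for sender, recipient, amount in self.pending:
            if self.balances.get(sender, 0) < amount:
                continue
            self.balances[sender] -= amount
            self.balances[recipient] = self.balances.get(recipient, 0) + amount
        self.pending = []


ledger = PaperLedger()
ledger.join("Muneeb", 10)
ledger.join("Kernighan", 10)
ledger.join("Krugman")
ledger.send("Muneeb", "Krugman", 2)
unconfirmed = dict(ledger.balances)  # still 10 / 10 / 0 at this point
ledger.confirm()
confirmed = dict(ledger.balances)    # now 8 / 10 / 2
```

The hard part that the rest of the talk turns to is not this bookkeeping but getting remote participants to agree on the same sheet.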
This is literally, almost literally how it works. The problem is when Bill Gates wants to join and
we are no longer in the same room at Princeton working with pen and paper now there is
someone who's remote and he wants to be a part of this currency as well. That leads us into
having some sort of a distributed ledger. That's that same piece of paper but everyone needs to
have the exact same copy of it and everyone needs to be on the same page for this currency to
work. And that leads us into blockchain; and people who work in distributed systems can
clearly identify that this is a distributed consensus problem. How do you make everyone agree
on the same thing at the same time?
Blockchain: the approach that bitcoin took was actually very fascinating. On one end of the
spectrum it's extremely simple, and initially it seems like there would
be a large overhead to doing it this way. How would this ever be practical? The approach is
literally that you take a snapshot, give it to everyone, everyone has a copy of not only the global
state but all revision history of the global state. It seems like an extremely inefficient system:
how is this thing ever going to scale if every single participant in the network has to have a
copy of all revisions to the global state? But this is what bitcoin is doing and it seems to work.
So the blockchain is actually literally a file. It grows as there are new transactions, so it grows by a
block. A block has multiple transactions in it. I like to think of it as state change. So this was
the state at some time, these are the changes that happened, so they get appended to the file,
and it repeats and repeats and just goes on and on. Feel free to ask questions at any time in
the middle.
So this is kind of roughly what blockchain is. Now I'm going to take like a slightly deeper dive into
it. So on the bitcoin network everyone has private-public key pairs. We actually wrote, so this
is most of my cofounder's work, this open source library. We are calling it pybitcoin and it is in
Python. You can actually go and try it out. This will randomly generate a private key for you.
And this is something like what it looks like in hex. And the bitcoin address, so there is a
deterministic way to get a bitcoin address from a private key. For simplicity I'm just going to
say it's a hash of the public key, but there are more steps involved to it.
So this is what a public key would look like. I think it's actually even longer than that. And then
for simplicity a hash of the public key is actually your bitcoin address. So most bitcoin users
think of this as their bank account almost and they know that they have a private key that
corresponds to this unique ID where people send their money.
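One of the "more steps" mentioned above is the final encoding: a bitcoin address is the Base58Check encoding of a version byte followed by a 20-byte hash of the public key. The sketch below shows only that encoding step using the standard library. It is not pybitcoin's API, and it takes the 20-byte hash as an input rather than deriving it from a key, since the elliptic-curve and RIPEMD-160 steps need more than the standard library reliably provides.

```python
import hashlib

# Base58 alphabet: digits and letters without 0, O, I and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: bytes, payload: bytes) -> str:
    """Encode version + payload with a 4-byte double-SHA-256 checksum."""
    data = version + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded

    # Each leading zero byte is written as a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# Hypothetical input: a 20-byte public key hash of all zeros.
address = base58check_encode(b"\x00", bytes(20))
```

The version byte 0x00 is what makes ordinary main-network addresses start with "1".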
The next thing is there's actually no such thing as a bitcoin. It's only a record of inputs and
outputs. So a transaction is basically I received this much digital money and so now when I
want to spend it I actually have to refer to the transaction in which I got the money. So the
money is actually getting generated at some point and that happens with every block. This is
where we get into mining incentives that I'll talk about in a bit.
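The "record of inputs and outputs" idea can be sketched as follows, assuming a heavily simplified transaction with no signatures or fees. The Output and Transaction types and the balance helper are hypothetical names for illustration. A balance is never stored anywhere; it is just the sum of outputs addressed to you that no later transaction has referred to as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    owner: str
    amount: int


@dataclass(frozen=True)
class Transaction:
    txid: str
    inputs: tuple   # (txid, output_index) pairs being spent
    outputs: tuple  # Output objects being created


def balance(transactions, owner):
    """Sum the outputs owned by `owner` that no transaction has spent."""
    spent = {ref for tx in transactions for ref in tx.inputs}
    return sum(
        out.amount
        for tx in transactions
        for index, out in enumerate(tx.outputs)
        if out.owner == owner and (tx.txid, index) not in spent
    )


history = [
    # Newly generated money: a transaction with no inputs.
    Transaction("mint", (), (Output("miner", 25),)),
    # The miner spends that output: 5 to alice, 20 back as change.
    Transaction("pay", (("mint", 0),), (Output("alice", 5), Output("miner", 20))),
]
```

With this history, balance(history, "miner") is 20 and balance(history, "alice") is 5, even though no record anywhere stores those totals directly.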
So there are a total of like 21 million bitcoins which is fixed. In that sense it works as gold that
there is like a fixed amount and the value should actually appreciate over time, and this is the
interesting part about bitcoin. A lot of people have looked at distributed consensus and there
are lots of protocols that can achieve distributed consensus. The one key thing that blockchain
and bitcoin introduce is actually mining incentives. People have incentives to be part of this
system. People have incentives for this system to succeed. And the way the protocol does this
is that it's actually literally minting new money and giving it out to miners. It used to be 50
bitcoins at the beginning; it reduces by half every couple of years. Currently it’s 25 bitcoins and
it's supposed to keep going down, keep going down, keep going down til I'm not exactly sure
but maybe in 2070 or something it will go down to zero. The idea is that at that point the
transaction fees might actually be significant enough for miners to stay online. We'll see how
that plays out.
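The reward schedule can be worked out numerically. The sketch below assumes the actual bitcoin parameters, which the talk does not spell out: the reward starts at 50 bitcoins, halves every 210,000 blocks, and is computed in whole satoshis (one hundred-millionth of a bitcoin). The function names are made up for illustration.

```python
SATOSHIS_PER_BITCOIN = 10**8
INITIAL_REWARD = 50 * SATOSHIS_PER_BITCOIN
HALVING_INTERVAL = 210_000  # blocks between halvings


def block_reward(height: int) -> int:
    """Reward in satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    # Integer right shift: halve once per era, rounding down.
    return INITIAL_REWARD >> halvings


def total_supply() -> int:
    """Add up the reward of every block until the reward reaches zero."""
    total, height = 0, 0
    while block_reward(height) > 0:
        total += HALVING_INTERVAL * block_reward(height)
        height += HALVING_INTERVAL
    return total


supply = total_supply()
eras = sum(1 for era in range(64) if block_reward(era * HALVING_INTERVAL) > 0)
```

Because each era pays half of the previous one, the sum converges just below the 21 million cap, and the reward rounds down to zero after 33 eras.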
So this is a very simple way to describe a transaction. It's literally from this person to that
person, this much money, they sign it with a private key, and then they broadcast the
transaction. So the job of the miners is, another interesting scalability thing about the bitcoin
which I think distributed systems people would love a lot is that they are literally broadcasting
every single transaction to every single node. So when you connect your bitcoin daemon to the
network you are seeing all unconfirmed transactions for everyone. And miners are seeing the
same thing. They package the transactions and then see if their block is going to get accepted
or not and I’m going to quickly actually explain that.
This is a transaction, the public ledger is just a record of all the transactions, and this is how
blocks work. So blocks refer to previous blocks with some metadata and it’s just a bunch of
transactions. I like to think of this as a point in time. So it's like a clock tick. Whenever you
get the next block it’s the next instant in time. You're basically keeping a record of all the
changes that have happened which happen in transactions.
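The way blocks refer to previous blocks can be sketched as a hash-linked list. This is a toy model: the dictionary layout and the JSON serialization are assumptions for illustration, not bitcoin's binary block format.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents, including its link to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Append a block that points at the hash of the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "height": len(chain),
        "prev_hash": prev_hash,
        "transactions": transactions,
    })


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["mint 50 to miner A"])
append_block(chain, ["A sends 2 to B"])
append_block(chain, ["B sends 1 to C"])
```

Changing any old transaction changes that block's hash, so the next block's prev_hash no longer matches and the tampering is visible to anyone holding a copy.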
This is the interesting part. This is how mining happens. So what you do is this is the current
block, it is unconfirmed on the network, everyone can see all the unconfirmed transactions, and
what the miners are trying to do is that they take data from this block and then they append a
nonce to that to try and find a hash that has some particular properties. So they are doing lots
and lots and lots of computations basically [indiscernible] calculation and let's assume that at
the given target difficulty you're supposed to find a number like this that starts with like five
zeros. So the protocol actually adjusts this difficulty. If the number of miners in the network goes
down the protocol gives incentives by reducing the difficulty level and then more miners would
actually come back and be a part of the network, and same the other way around. So this is
what the miners are doing all the time. As soon as a miner gets a block they package all the
transactions and they try to send it out because they want to claim their reward.
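The nonce search described above can be sketched like this. It is a simplification: real bitcoin mining applies SHA-256 twice to an 80-byte block header and compares the result against a numeric target, whereas this sketch hashes arbitrary bytes once and counts leading zero hex digits, as in the talk's example. The function name and the sample block data are made up.

```python
import hashlib


def mine(block_data: bytes, zero_hex_digits: int):
    """Try nonces until the hash starts with the required number of zeros."""
    prefix = "0" * zero_hex_digits
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Four zeros keeps the demo fast; each extra zero makes the search
# roughly 16 times longer on average.
nonce, digest = mine(b"prev-hash|A sends 2 to B|", 4)
```

Finding the nonce takes many attempts, but anyone can check the result with a single hash, which is what lets the rest of the network accept or reject a block cheaply.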
Now, there are some interesting attacks possible that some people have studied that you can
actually not announce the block and try to find the next one and then release like two or three
blocks in a go, but then there is a problem that you might end up on the wrong chain if
somebody else actually announces things before you. So there's some very interesting research
stuff going on at intersection of game theory, distributed systems, and security.
So this, again, is my overly simplified description of the blockchain that it's basically time. To
me, the blockchain is just providing a global time sync with a sense of ownership. So in the case
of the Internet the 3,000-foot overview of the [indiscernible] is that it’s basically providing you the
simple functionality of how to send a data packet from point A to point B. That's it. Then
people started building applications on top, it doesn't matter what the data is because it could
be streaming music, video, e-mail, we don't care. It's data delivery from point A to point B. The
blockchain and bitcoin, what it’s literally providing is transfer of ownership from point A to
point B at a given point in time. And again, the data could be anything. This took a while for
the community to realize that bitcoin is basically one form of a digital asset. It is digital
currency. But you can actually do transfer of ownership of anything. It could be software licenses,
it could be user names which are unique and profile information and so on and so forth. So this
is what we got interested in that looking at the blockchain and bitcoin technology, not for the
currency aspects of it, but the non-currency aspects of building applications which require
zero trust because now you don't have any central authority that you need to trust for running
your service. So any questions so far?
>>: So this file is constantly growing. How big is it right now?
>> Muneeb Ali: It’s close to 30 gigs now. So I think when I first downloaded it it was 20 gigs. So
it's growing, and this is one of the hotly debated issues in the bitcoin community. There’s some
people saying you should start pruning the blockchain like dropping some data from it, the old
stuff, and then other people want to actually increase the block size because they want more
bandwidth, they want to send more transactions per block, but there are some very interesting
issues there because the size of the block kind of starts getting into how quickly other people
will discover that block and it starts having problems with consensus. So it's a very interesting
area at this time because bitcoin is real, so it's a real testbed, so you can actually do the
experiments on a publicly available testbed that is running and miners have incentive to
actually run that testbed for you.
So going back to this, so now I'm going to describe a fork of bitcoin. So another thing that
happened with bitcoin was that bitcoin was the first cryptocurrency that gained popularity, at
least the one that was using a blockchain, and then a lot of people started forking it and
creating other coins. Someone's like hey, I can actually reduce the time it takes for transactions
to get confirmed. Someone else is like hey, I'm going to change the hashing function and
then it's going to be harder for large corporations to take over the network because my hashing
algorithm is actually very different and it requires like GPUs instead of something else and so on
and so forth. So people started modifying, and I'm talking about in terms of hundreds, there
are like hundreds and hundreds of altcoins available.
So one of the very early forks was this coin called namecoin and it came in 2011. What it did
was they were really motivated by trying to create a decentralized DNS system. They wanted
some sort of a naming system built on the blockchain where you can put your DNS records in
the blockchain itself and then try to get rid of the DNS. So that was their motivation, and how
namecoin does that is, so they forked bitcoin, modified the core daemon, and added the
functionality for doing name registrations. And how name registration works is first you
broadcast a hash of the name that you're trying to register, because while it's not confirmed yet
somebody else could otherwise just listen to the network and try to grab the name before you. Once the
transaction gets accepted they wait 12 blocks which is actually fairly long, six blocks would've
been fine if you look at the probability of you being on the correct chain versus on a fork, so
they wait 12 blocks and then you reveal that hey, this is the name I was actually trying to register.
Everyone has seen the previous transaction where I registered it, this was the actual name.
And similarly, after that once you have the name you can update any associated [indiscernible].
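The two-step registration described above is a commit-and-reveal scheme. The sketch below models it in memory under stated assumptions: the NameRegistry class, its method names, and the addresses are hypothetical, block heights are advanced by hand, and a random salt is mixed into the hash so that short names cannot be guessed by hashing a dictionary.

```python
import hashlib

REVEAL_DELAY = 12  # blocks to wait between the hash and the reveal


def commitment(name: str, salt: bytes) -> str:
    """Hash that is broadcast first; it does not disclose the name."""
    return hashlib.sha256(salt + name.encode()).hexdigest()


class NameRegistry:
    def __init__(self):
        self.height = 0
        self.preorders = {}  # commitment -> (owner, height when seen)
        self.names = {}      # name -> {"owner": ..., "value": ...}

    def mine_blocks(self, count=1):
        self.height += count

    def preorder(self, owner, commit):
        # The first owner to publish a given commitment keeps it.
        self.preorders.setdefault(commit, (owner, self.height))

    def reveal(self, owner, name, salt):
        record = self.preorders.get(commitment(name, salt))
        if record is None or record[0] != owner:
            return False  # no matching earlier commitment from this owner
        if self.height - record[1] < REVEAL_DELAY:
            return False  # too early, the commitment is not buried yet
        if name in self.names:
            return False  # name already taken
        self.names[name] = {"owner": owner, "value": ""}
        return True

    def update(self, owner, name, value):
        entry = self.names.get(name)
        if entry is None or entry["owner"] != owner:
            return False
        entry["value"] = value
        return True
```

An eavesdropper who sees the reveal cannot front-run it, because a reveal only counts if the same owner published the matching hash at least 12 blocks earlier.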
So I'm not even sure if they intended to make a key value store, they were really inspired by
making DNS, but they ended up making a generic key value store on top of the blockchain
which was very fascinating the first time that we discovered it. It is a generic key value store.
They used only one namespace because they described that d/ is going to be for domains,
but you can create other namespaces there. But there are a bunch of things they got wrong.
And we've been using namecoin. This is how we launched. So we launched our username
system using namecoin, and I'll describe some of our experiences and what we learned and
what are the changes that we are proposing. Yeah.
>>: But there are also coins on top of this system? So that's the incentive?
>> Muneeb Ali: Yeah. So they also have coins. They use the same hashing algorithm as bitcoin.
So namecoin still has around 70-something percent of the mining power of bitcoin, but to attack
the bitcoin network you need something like 51 percent, so even 70 percent is not good
enough because someone who
>>: [inaudible]?
>> Muneeb Ali: Yeah. But this is still one of the more successful coins. It is still around, kind of
stable, still works, but a lot of other coins were actually 90 percent controlled by a single party
and they were like pump and dump schemes. And that's why like cryptocurrency has been
getting a bad rep, but there's actually some really interesting stuff going on as well. So one of
my motivations is to get researchers more interested in bitcoin, and that's part of the reason
why I'm here as well.
So let's switch gears a little bit and start talking about identity or decentralized identity.
There are like all these different things in your wallet that identify you: your ID card, your keys,
the passwords that you have that you use to log in; and at a high-level, if in a perfect world
there was one private key or like some set of private keys that uniquely identify you then you
really don't need all these things. This is kind of like the problem people have been trying to
solve for a while, and the thesis of our startup is that if there's ever going to be a single sign-on
or a global username that just identifies you everywhere then no single company can actually
own that namespace because by definition that company would also be a monopoly over
everything online. If you are dependent on them to log in somewhere they would be able to
track you, they would be able to put terms of service on how to use them and so on and so
forth.
So that's a thesis that if there's ever going to be a global Internet ID it has to be decentralized.
There shouldn't be any centralized service that you are dependent on for registering those
usernames, for updating them, or using the system. So that's kind of like then you can log into
websites everywhere with the same profile information, same username, and that's kind of like
the system we started developing. And in the future it's also possible that you start merging
the online and the physical world because it's basically access control. I think Apple watch
already does this with some hotels where you can actually check into your room using your
watch but you are dependent on Apple for this authorization. So it is possible to do it in a
completely decentralized open-source way which is the system that we are trying to build.
>>: [inaudible] no more passwords like how does that actually solve that problem? Like I want
to log into Pinterest. I've got a Onename account? How do I do that without passwords?
>> Muneeb Ali: Without passwords? So we're working on an auth protocol where you would
go to a website and say hey, I want to log in as Muneeb, the website would give you a challenge
to prove to me that you have ownership of this username, and you prove that by signing a
message with the private key that owns the username. But it requires integration from the
developer’s side. Currently we are talking to four or five companies who are willing to do that,
mostly in the bitcoin space, that's where most of our users are, but in the future you can
imagine that it’s basically we are looking at it as, very recently there have been a lot of data
leaks.
So a lot of developers that we've talked to they kind of don't even want to store account
information on their servers. And I think, are you guys familiar with this company Stripe? So
they are a payment processor and they've felt the same thing. One of the reasons for their
success is the developers can implement their own payment system as well. Stripe was very
easy to use and really well done, but the other and bigger issue sometimes was that they just
don't want to be liable for a hack on their system where all the credit cards got stolen from
their servers. So, they're happy to not keep that information. Similarly, all of the developers
that we've talked to they’re actually very interested in not keeping any information on the user
and still letting them log in, use the system while the user is in control of all of their data.
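The challenge-and-signature login described above can be sketched with the standard library alone. One important assumption: the real system would sign with the key pair that owns the username on the blockchain, whereas this sketch swaps in a Lamport one-time signature built from SHA-256 purely so the example is self-contained. A Lamport key is only safe for a single signature, so this shows the shape of the exchange, not a deployable login scheme. All function names are hypothetical.

```python
import hashlib
import secrets


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [[secrets.token_bytes(32) for _ in range(2)] for _ in range(256)]
    public = [[hashlib.sha256(s).digest() for s in pair] for pair in private]
    return private, public


def message_bits(message: bytes):
    digest = hashlib.sha256(message).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    # Reveal one secret per bit of the message hash.
    return [private[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    bits = message_bits(message)
    return len(signature) == 256 and all(
        hashlib.sha256(secret).digest() == public[i][bit]
        for i, (secret, bit) in enumerate(zip(signature, bits))
    )


# The website looks up the public key recorded for the username, then
# issues a fresh random challenge for the user to sign.
private_key, public_key = keygen()
challenge = secrets.token_bytes(32)
response = sign(private_key, challenge)
logged_in = verify(public_key, challenge, response)
```

The website never holds a password or any other secret for the user; it only needs the public key and a fresh challenge per login attempt.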
>>: There’s still a login procedure just using Onename as the authority. You’re still typing in
your public key or what do I
>> Muneeb Ali: So basically this is what the UX [phonetic] looks like. You go to a website, you
type in the username, the website queries the blockchain, so we have something similar to a
DNS resolver: just like you make a DNS query, you query the blockchain, you get the
information on that user, their profile picture, their name, whatever they are comfortable
publicly putting there, and then it puts up a challenge saying prove to me that you are this
user. Once you prove that then you're able to log in. Your profile is filled in
>>: [inaudible]?
>> Muneeb Ali: It’s basically, I have a bunch of slides on that so I can walk you through. So this
is like Onename right now. We launched the service and we have around like 30,000 users right
now which are using it. This is what a typical profile looks like. So this is Naval, he’s the
founder of AngelList. So I'm featuring him because he made all of the proofs and this is what
his data looks like in the blockchain. So this is just a rendering of data that is coming literally
from the blockchain, it's not in any server, and this is the format of the data. His username, his
unique username was registered on the blockchain and this is the value part of that data.
And it's actually very interesting. Just like you can send coins on the blockchain and there's a
history that this is when the coins were created and this is how they were transferred, there's a
public history of all of that, there's a public history of all of this as well that this is the hash of his
username, Naval, and in his case we actually preregistered it for him; and the first time we saw
this information status reserved, this username is reserved for Naval, if this is you please e-mail
me and we released it to him because we knew that people would try to squat certain
usernames and this is one of them which happened on launch day. We were actually spammed
and there were squatters who were trying to basically grab a bunch of usernames and then this is
when he actually got access. Now this is the point where we as a service actually transferred
the ownership of this username to him. The ownership private key changed. We made a
transaction just like I can send you bitcoins, we sent his username to him, and now only he has
the private key and only he can make updates to the profile information.
>>: [inaudible]?
>> Muneeb Ali: Think of it like we transfer it from our private key to his private key, the
ownership.
>>: So the private key doesn't move, right?
>> Muneeb Ali: No, the private key doesn't move. A particular private key owns a particular
address on the blockchain, and in our system there's another mapping that a particular address
owns a particular username.
>>: So two questions on that. One is that you sent this message out that says username
reserved blah, blah, blah please e-mail us. I'm assuming that someone would have to contact
you before you sent the username to the person because once it's done you can't even get it
back.
>> Muneeb Ali: No, we can't. So this was done for
>>: [inaudible] sell it or send it to someone else.
>> Muneeb Ali: Yes. So this is probably not a good example because this was done for a very
handful of people where we knew that their usernames would get squatted very quickly. So
this was basically, think of it as like friends and family that I registered my username but I
registered a couple of others as well. And hey, I actually reserved your username; if you want it
I'm going to send it to you. But for most people they wouldn’t have to go through this step.
They would just get their name.
>>: But you still have to have things like if you look at Twitter, Twitter has [inaudible] to add to
the ability to take usernames from people because people go in and see that they’re Bill G and
they're not Bill Gates. So no matter what you couldn’t have pre-reserved everyone that you
possibly have to pre-reserve so you're still going to have squatters.
>> Muneeb Ali: Yes. So this was kind of like we were trying to nudge the namespace in the
right direction. There's no way we can actually control it. And the way we actually tried to
solve that problem is by these proofs. And what these proofs do is, so I got my username on
the blockchain which has this unique property that only I own it, I don't have to depend on any
company, they can't take it away from me, nobody can take it away from me; but now what I'm
doing is I'm also posting on Twitter that I am this person on the blockchain and the blockchain
data has a link to that public proof. So we require all these proofs to be public. It could be
GitHub, your domain name, your Facebook, Twitter, and we are adding other things as well;
and it's just a two-way link that proves ownership at that particular point in time. So our
resolvers, before displaying this information, are actually doing this check real-time every
time before they display that this is verified. So you have to keep those proofs online.
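The two-way link check can be sketched as below. Everything concrete here is an assumption for illustration: the record layout, the "+username" marker in the post text, the example URLs, and the fetch_post callable that stands in for a live HTTP request. The point is only that both directions must hold: the blockchain record links to the post, and the post, written by the claimed account, names the blockchain username.

```python
def check_proofs(username, proofs, fetch_post):
    """Return {service: bool} for each proof listed in the blockchain record."""
    results = {}
    for service, claim in proofs.items():
        post = fetch_post(claim["url"])  # None if the post is gone
        results[service] = bool(
            post
            and post["author"] == claim["handle"]
            and f"+{username}" in post["text"].split()
        )
    return results


# Stand-in for the public web; a resolver would fetch these live.
PUBLIC_POSTS = {
    "https://twitter.example/muneeb/1": {
        "author": "muneeb",
        "text": "Verifying that +muneeb is my blockchain ID",
    },
    "https://github.example/someoneelse/1": {
        "author": "someoneelse",
        "text": "hello world",
    },
}

# The part of the profile stored under the username on the blockchain.
profile_proofs = {
    "twitter": {"handle": "muneeb", "url": "https://twitter.example/muneeb/1"},
    "github": {"handle": "muneeb", "url": "https://github.example/someoneelse/1"},
}

results = check_proofs("muneeb", profile_proofs, PUBLIC_POSTS.get)
```

Because the check runs each time the profile is displayed, deleting the public post makes that proof fail on the next lookup, which is why the proofs have to stay online.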
So what that now does is that your username becomes less important because you start getting
to this notion of probabilistic identity: if I can show to you that I own this domain,
muneebali.com, I have @muneeb on Twitter and I have a bunch of followers there, it doesn't
look like a fake account. You start getting into probabilistic identity where the more verifications
I have the better the probability that this is actually that person.
>>: [inaudible] try to avoid like robots taking and creating just dummy accounts.
>> Muneeb Ali: Yes.
>>: So, like it's a form of [inaudible].
>> Muneeb Ali: Or like web of trust in a way. So we've been running the system, so we
launched in March of last year, we went through [indiscernible] last summer and then raised
the funding and during all of this time we started noticing, so we were actually I think we can
say if not the most then probably one of the highest number of transactions on the namecoin
network was actually sent by us. I think it's north of like 100,000 because there was a limit of
520 bytes per key value pair and some of our user profiles were actually larger than that. So we
had this hack where there was a linked list in the blockchain of key value pairs where there was a
next pointer to the next key value pair, but only the username of the first one was actually
important. For other values the key was actually garbage because we were only interested in
putting more data in the blockchain.
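The linked-list workaround for the 520-byte limit can be sketched as follows, with a plain dictionary standing in for the on-chain key value store. The value layout (next key, a newline, then the chunk) and the random continuation keys are assumptions for illustration; the talk does not describe the actual format used.

```python
import secrets

MAX_VALUE_BYTES = 520
CHUNK_BYTES = 500  # leaves room for a 16-character next key and a separator


def store_profile(kv: dict, username: str, data: bytes) -> None:
    """Split data into chunks and chain them with next pointers."""
    chunks = [data[i:i + CHUNK_BYTES] for i in range(0, len(data), CHUNK_BYTES)] or [b""]
    # Only the first key is meaningful; the rest are throwaway keys.
    keys = [username] + [secrets.token_hex(8) for _ in chunks[1:]]
    for i, chunk in enumerate(chunks):
        next_key = keys[i + 1] if i + 1 < len(keys) else ""
        value = next_key.encode() + b"\n" + chunk
        if len(value) > MAX_VALUE_BYTES:
            raise ValueError("value exceeds the per-pair limit")
        kv[keys[i]] = value


def load_profile(kv: dict, username: str) -> bytes:
    """Follow the next pointers from the username and rejoin the chunks."""
    data, key = b"", username
    while key:
        next_key, _, chunk = kv[key].partition(b"\n")
        data += chunk
        key = next_key.decode()
    return data


store = {}
profile = bytes(range(256)) * 6  # 1,536 bytes, larger than one value
store_profile(store, "muneeb", profile)
restored = load_profile(store, "muneeb")
```

Every extra chunk costs another registration transaction, which is part of why this approach generated so many transactions on the network.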
So we started noticing a lot of things. The first thing was the reliability and security of the
blockchain because, I actually haven't publicly talked about it and I think this talk is getting
recorded so I wouldn't disclose too much information, but there were times when we couldn't
get a good throughput out of the namecoin network. The transactions weren’t going through,
and I will write about it and we know what was going on, but the idea there is, the lesson to
learn is that if you're building something on top of this blockchain system which is a broadcast
mechanism, there are miners who are incentivized to keep it running, you're really, really
dependent on that network. You're building your entire system on top of that. If something
goes wrong, let’s say that cryptocurrency collapses, your entire system would shut down; or
there's a problem in the network, let's say the nodes are under a [indiscernible] attack or
something and you can't get throughput out of the network and this is actually a harder
problem because it's not a single company that is actually managing that infrastructure. So
some of the classic design principles of distributed systems have to be modified in this
environment where everyone is kind of like completely decentralized. Every node that is
participating is just running it for themselves.
>>: What's the latency of name confirms?
>> Muneeb Ali: So that actually depends on, in namecoin it's 12 blocks and there's a block
roughly every 10 minutes. So when people sign up we display a message saying your name will
be confirmed in roughly 3 hours. Sometimes this takes slightly longer or less because when a
block comes out there's a range of time dependent on how hard the puzzle is that they're trying
to solve. So it's possible that someone just gets like a couple blocks very quickly but then there
isn't a block for an hour or something like that. That happens with a bitcoin network as well.
So another thing, I already talked about this 520 bytes limitation, so it's actually not just
annoying to do these hacks to put more data in, it's actually a real problem going back to your
previous question about the size of the blockchain. Now you are literally putting all this data in the
blockchain and this is what people refer to as the blockchain bloat problem. With all the
transaction information you can get rid of it. Similarly, in our case there is old revision history
of your profile updates. You can potentially get rid of it, but at any given point in time if there
are millions of users using this as their username to profile mapping system you would actually
start running into real issues where the blockchain is just too large for everyone to have a copy
of it.
>>: [inaudible]?
>> Muneeb Ali: Yeah. That's exactly what we did in the system called blockstore. The lesson in
software engineering challenges was really that because namecoin is an old fork people
stopped keeping up with the main branch of bitcoin so now the code is over four years old; and
the lesson was really like you want to work on the underlying software tool that has the most
amount of developers, it’s being actively developed, people are actually maintaining the code,
and the last part of the scalability challenge is in the sense that it's like one monolithic daemon
that does everything. It has a database, it syncs up with the blockchain, it does broadcasting,
and I think in my initial experiments I couldn't even get more than 100 queries per minute out
of it and the daemon would just crash because it was busy doing something else. And then we
basically wrote our version of our resolver that was putting most of the namespace in memory
and keeping that in sync with the blockchain. That helped a lot because once things are in
memory they're not in the namecoin software; we can start applying classic distributed system
scalability techniques to scale that out and so we learned some lessons there. All of that kind of
resulted in this new system that we are calling blockstore.
So this is the design of blockstore. Our idea is that it is blockchain agnostic but we definitely
want to be on the most stable, largest, most secure blockchain which today happens to be
bitcoin. But we have designed the system so that if in the future it's not going to be bitcoin we
can easily actually migrate away to something else. So our production system still runs on
namecoin, we haven't done the migration because this is still kind of experimental, people
have started using it, let the rubber meet the road and we'll see how it goes. But it is possible
to do the migration from namecoin to this and then potentially from bitcoin to something else
as well.
So how we are thinking of this is, so bitcoin blockchain is getting used for many different things.
So we have this concept of a virtual blockchain that sits on top and our software when it's going
to the bitcoin blockchain would actually throw away all the blocks that don't have transactions
in our protocol. So we have defined the protocol with name operations like name new, name
register, and it's almost like putting headers of a higher layer protocol in the lower layer
protocol which would have its own headers and you just check it and if it’s not useful to you
you can throw away that packet. It wasn't meant for you. So we do the same thing with bitcoin
transactions. We do look at them, but we can actually throw away most of the stuff and this is
our virtual blockchain.
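A minimal sketch of the filtering described above, in Python. The marker bytes `id`, the one-byte opcodes, and the `Tx`/`Block` types are assumptions made for illustration; the talk does not give the actual wire format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical wire format: 2 magic bytes, 1 opcode byte, then the payload.
MAGIC = b"id"
OPCODES = {b"n": "name_new", b"r": "name_register"}


@dataclass
class Tx:
    txid: str
    data: bytes  # contents of the transaction's small data field


@dataclass
class Block:
    height: int
    txs: List[Tx]


def parse_op(tx: Tx) -> Optional[Tuple[str, bytes]]:
    """Return (operation, payload) if the transaction speaks our protocol, else None."""
    header_len = len(MAGIC) + 1
    if len(tx.data) < header_len or not tx.data.startswith(MAGIC):
        return None  # not meant for us, like a packet with a foreign header
    op = OPCODES.get(tx.data[len(MAGIC):header_len])
    if op is None:
        return None
    return op, tx.data[header_len:]


def virtual_chain(blocks: List[Block]) -> List[Tuple[int, List[Tuple[str, bytes]]]]:
    """Keep only blocks that contain at least one protocol operation."""
    kept = []
    for block in blocks:
        ops = [parsed for parsed in map(parse_op, block.txs) if parsed is not None]
        if ops:
            kept.append((block.height, ops))
    return kept
```

Every node still reads each transaction once, but only the operations that parse survive, so the state a node has to keep is much smaller than the underlying chain.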
>>: [inaudible] the blocks that are [inaudible].
>> Muneeb Ali: Blocks are only [indiscernible]. And another thing that we are doing is as you
mentioned that we are putting hashes of data in the blockchain itself so at least now there's a
constant size and the data can be anything. The idea is that the data is
actually outside; there's a default data store, which is currently a Kademlia-based DHT, so every
node in the network is also a Kademlia DHT node, but there can be other types of data storage as well.
We as a company can just run a mirror because this becomes a secure index which gives you a
snapshot of the entire namespace. Now the only type of attack possible is: I know that this is
the hash, I just can't seem to get the value from anywhere. So that's a data unavailability attack.
No one will be able to give you wrong data.
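The check behind that claim can be sketched in a few lines of Python. SHA-256 and the `fetch_verified` helper are assumptions for illustration; the talk does not say which hash function blockstore uses.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A source is any callable that takes a hash and returns bytes or None
# (a DHT lookup, a company mirror, a local cache, ...).
Source = Callable[[str], Optional[bytes]]


def fetch_verified(expected_hash_hex: str, sources: Iterable[Source]) -> Optional[bytes]:
    """Try each untrusted source until one returns data matching the on-chain hash."""
    for fetch in sources:
        try:
            data = fetch(expected_hash_hex)
        except Exception:
            continue  # an unreachable source is only an availability problem
        if data is not None and hashlib.sha256(data).hexdigest() == expected_hash_hex:
            return data  # hash matches the secure index, so the data is correct
    return None  # data unavailability: nobody could serve the value
```

A source that returns tampered data is simply skipped, which is why the worst outcome is `None` rather than wrong data.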
Another nice thing about this is that I have a global snapshot of my entire key value namespace
and we require new name operations to include the hash of the Merkle root. So they basically
announce that I think this is the view of the namespace right now. This helps nodes to detect
if they are on a fork because at any given block there will be only one valid Merkle root of the
entire namespace. This is basically what most people see and agree on. So if you're on a fork
you can detect it and then you can try to come back to that main virtual chain.
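As a rough illustration of the fork check, the sketch below computes a Merkle root over a sorted name-to-hash mapping and compares it with an announced root. The leaf encoding, SHA-256, and the duplicate-last-node rule for odd levels are assumptions; the talk does not specify how blockstore builds its tree.

```python
import hashlib
from typing import Dict, List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Plain binary Merkle tree; an odd level duplicates its last node."""
    if not leaves:
        return _h(b"")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def namespace_root(namespace: Dict[str, str]) -> str:
    """Root over (name, value hash) pairs, sorted so every node gets the same answer."""
    leaves = [f"{name}:{value_hash}".encode() for name, value_hash in sorted(namespace.items())]
    return merkle_root(leaves).hex()


def on_fork(my_namespace: Dict[str, str], announced_root: str) -> bool:
    """True if our view of the namespace disagrees with the announced root."""
    return namespace_root(my_namespace) != announced_root
```

Sorting the entries before hashing is what makes the root depend only on the namespace contents and not on the order in which a node happened to process them.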
For the DHT, we actually modified basic Kademlia and made it content addressable in the sense
that you can only write values where the hash of the value is actually the key. So this was,
again, to prevent the case where, let's say, I put a value in, someone else updates it and my data is gone;
but the more I started playing with the DHT it actually became very, very interesting and I think
some people here might be interested in exploring that more. I'm happy to talk in depth about
it.
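The content-addressable write rule can be sketched as below. The `ContentAddressedStore` class is a local stand-in for a single DHT node and SHA-256 is an assumed hash function; neither is taken from the blockstore code.

```python
import hashlib
from typing import Dict, Optional


class ContentAddressedStore:
    """Stand-in for one DHT node that only accepts key == hash(value)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> bool:
        # Reject any write whose key is not the hash of the value. This means an
        # existing key can never be overwritten with different data.
        if hashlib.sha256(value).hexdigest() != key:
            return False
        self._data[key] = value
        return True

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)
```

An attacker who wants to replace a stored value would have to find different data with the same hash, so updates happen by writing a new key and pointing the name at it.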
I realize that this is kind of like the first time where you can attach a cost factor to joining a DHT
without introducing any centralized party. So the blockchain itself is completely decentralized.
So not only can you ask for proof of money spent, which could be transactions, that you're burning
money by doing transactions before you join that network, you can actually also make the cost
significantly higher because if you have bitcoins in your account, in your address, you can
actually verify that this private key owns this amount of bitcoins and you can just set the
threshold at whatever level you want. So people wouldn't actually have to pay; you don't
have to spend those coins, you just need to hold them and then you can become a part of the
DHT. So it makes it very, very hard for a spammer to actually launch a [indiscernible] attack on
this DHT. And I feel like this is actually a very, very interesting concept. I'm exploring this more,
and I'm happy to talk to people about it. Yeah.
>>: That sounds very interesting, but I didn't follow it. What's the incentive to be part of the
DHT and not the [inaudible] virtual nodes?
>> Muneeb Ali: Yeah. So basically right now anyone can launch Sybil nodes and just corrupt
routing tables in a DHT. So if your honest nodes have a policy, so I can actually take the Kademlia
random ID that a node gets and make it deterministic from a bitcoin address that holds a
certain amount of bitcoins in it. So if you don't pass that check you will get rejected from the
network or at least from the honest nodes. So that puts a cost factor for being a part of the
DHT.
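A sketch of such an admission policy, under stated assumptions: the node ID is the SHA-1 of the address string (160 bits, the usual Kademlia ID width), the threshold value is arbitrary, and `balance_lookup` is a hypothetical callable standing in for a query against the blockchain. A real check would also require a signature proving control of the address's private key, which is omitted here.

```python
import hashlib
from typing import Callable

MIN_BALANCE_SATOSHIS = 1_000_000  # hypothetical threshold chosen by honest nodes


def node_id_for(address: str) -> str:
    """Derive the DHT node ID deterministically from a bitcoin address."""
    return hashlib.sha1(address.encode()).hexdigest()


def admit(
    claimed_id: str,
    address: str,
    balance_lookup: Callable[[str], int],
    min_balance: int = MIN_BALANCE_SATOSHIS,
) -> bool:
    """Accept a peer only if its ID matches its address and the address holds enough."""
    if node_id_for(address) != claimed_id:
        return False  # the ID was picked freely rather than derived
    return balance_lookup(address) >= min_balance
```

Because the ID is tied to a funded address, creating many identities means funding many addresses, which is the cost factor described above.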
>>: [inaudible] how do you incentivize people to even participate in the DHT?
>> Muneeb Ali: Yeah. So right now it's basically because this is baked into the blockstore
daemon and anyone who is using it is effectively participating, but that's like nothing; you can
rewrite the software and take the DHT part out. The way we see this is this is the secure index
and then there will be multiple different types of data stores from where people are getting the
actual data. And so we as a company have an incentive to keep copies of the users on our
system. So again, there are multiple name spaces now. So we already have a couple of
companies who want to use it for different use cases. We have usernames and profile data.
There are other companies who want to embed information on images and do some sort of
attribution that this username first found this image or created this image online and so on and
so forth.
>>: So we have to trust the company [inaudible] running DHT [inaudible]?
>> Muneeb Ali: That's the thing. For them they might have to provide their own data storage
and just use this as a secure index. So one way to think about it is if you separate the problem
into a control plane and a data plane I think it's best to use the blockchain for the control plane
only because it is not going to scale if you actually try to use it for a data plane. But for the data
plane you can actually go back to standard systems design and try to solve the data plane issue
there.
I think I pretty much already talked about this. This is the secure index, these are the names,
and the names here are interesting because they are actually scarce; they are not random
values; they have meaning; they’re human readable; they’re unique. Yeah.
>>: So I may have missed this, but did you say how you are embedding your virtual blockchain
in the bitcoin blockchain?
>> Muneeb Ali: Yeah. So how that happens is that in a bitcoin transaction there are different
fields. There’s one field that's called [indiscernible] that gives you something like 40 bytes and
it might get bumped up to 80 bytes. You can put some other data there. So we defined how to
put the information there, and if your client kind of agrees with the format it works. So
another way to look at this system is that it's using the bitcoin network as a dumb broadcasting
network and all the intelligence is actually in the clients. So then it’s actually following the
[indiscernible] end to end design principle in a way versus there are some other approaches out
there where they're trying to build a lot of complexity into the network itself for doing these
name registrations. So this is basically the Kademlia tree; it points out how the key value pair is
actually being stored there, but I think I already discussed this and, question?
>>: [inaudible] assign your [inaudible] to your company defines the format of the data that is
kept in the blockchain. Then what if your company decides not to share that format and just
change it to a [inaudible] format that you can just [inaudible] present?
>> Muneeb Ali: Yeah. So that's why we are very excited that our investors kind of share our
vision. So our entire code is open source on GitHub already, and there are open source
developers who are even helping in developing these standards. So we are at version 2 right
now, and it's actually a pretty active GitHub account. You can go and check it out. It's like
github.com slash openname. So blockstore, how you can do registration, the resolver which
acts like a DNS resolver, all these things are open source, available to everyone, so not only is the
name space decentralized, the code is also open source. So if at some point this company
goes under, those name registrations are still valid. The code is still out there and so we act as
one registrar only to make it easy for users to get registered on a namespace, and they have
registered using other sources as well. I would say I haven't done an exact count but I think the
last time I checked it was somewhere between 10 and 15 percent of people who have never been through our
systems. They actually registered themselves or used some other type of registrar.
>>: So you said bitcoin has a 21 million coin limit. Does this have a similar kind of constraint to
it?
>> Muneeb Ali: No. So basically if you look at, it’s kind of like how you divide it. So bitcoin has
a 21 million limit but then it can be subdivided a lot. So yes there is a total limit but the number
of coins are more than enough. And for this we are paying a transaction fee but the number of
names you can register is a factor of how many transactions you can put in the blockchain
which is unlimited.
>>: [inaudible] unique but then [inaudible].
>> Muneeb Ali: Bad namespace. So there are two things I want to quickly talk about. One is
the decision in namecoin you have to pay the transaction fee for a miner to process the
transaction and put it in the blockchain, but for the actual name registration they were burning
money. They were destroying money as a registration fee. And they had one very simple rule
of pricing which was the price would just drop every X months or something like that. They
thought that people would try to get all the valuable names up front in 2011 and then most of the
names that remain wouldn't be that interesting, that spammers wouldn't want them. But
actually how it turned out was name registrations are so cheap on namecoin right now that you
would see a lot of spam activity. People are just grabbing these domain names. And I think
there's a study coming out of Princeton where they found that out of 190,000 domain names
registered only a couple hundred were even valid registrations where they had proper IP
addresses attached to it.
>>: So the actual name is hidden for a period of time. The thing that prevents someone from
trying to take a name that's already registered is because of what hash
>> Muneeb Ali: So in namecoin it’s basically hidden to avoid race conditions. So someone finds
out you're trying to register it: I'm going to beat you, I want to register it first. But once you
register it then everyone knows that it's yours and then
>>: [inaudible]?
>> Muneeb Ali: Right.
>>: So it is really the hash you're registering. So there's no risk of like hash key collisions or
anything like that.
>> Muneeb Ali: The standard collision probabilities apply there. But the interesting thing is
there's an expiration period on the name. So in namecoin it was hardcoded as eight months. But
when we designed blockstore this was something that was very interesting that we enabled
name spaces to have different expiry periods and different pricing mechanisms. So now you
can actually experiment with how a DNS like secondary market would actually play out if I
changed the rules. Instead of like 10 dollars per domain it's something else, and instead of
domains expiring every year what happens when you try to expire them in a different format?
So the main reason was that for certain use cases you want the names to be valuable, but for
certain other use cases, where let's say someone starts embedding data on Internet of
Things in there, you want to be able to do mass registrations at a very low cost for that
namespace.
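To make the idea of per-namespace rules concrete, here is a hypothetical specification object in Python. The field names and the length-based pricing formula are invented for illustration and are not blockstore's actual namespace format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceSpec:
    """Hypothetical per-namespace rules: how long names live and what they cost."""

    lifetime_blocks: int     # a name expires this many blocks after registration
    base_price: int          # price in satoshis of a one-character name
    length_discount: float   # multiplier applied for each additional character
    min_price: int           # floor so that long names are never free

    def price(self, name: str) -> int:
        if not name:
            raise ValueError("name must not be empty")
        raw = self.base_price * (self.length_discount ** (len(name) - 1))
        return max(self.min_price, int(raw))

    def is_expired(self, registered_at: int, current_height: int) -> bool:
        return current_height - registered_at >= self.lifetime_blocks


# Two namespaces with very different economics (numbers are arbitrary).
premium = NamespaceSpec(lifetime_blocks=52_000, base_price=10_000_000,
                        length_discount=0.5, min_price=100_000)
iot = NamespaceSpec(lifetime_blocks=500_000, base_price=1_000,
                    length_discount=0.5, min_price=10)
```

With something like this, a namespace for human usernames can keep short names expensive while a device namespace allows cheap mass registration.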
So this is enabled in blockstore. Some of the properties, open source, one thing that we
noticed was, this might be just the times, that it's getting harder and harder to find people who
understand C and I think that dates me maybe. A lot of people these days prefer working in
Python. So we actually made the decision to implement blockstore in Python, and so all of the
blockchain functionality is handled in bitcoin which is C but it’s completely like an interface to
blockstore, and actually when I go to the developer meet ups and talk to people this point
actually stands out a lot. People wanted to use blockstore over namecoin because they don't
want to mess around with a very large code base that is in C, C plus plus and everything is
monolithic. They look at it as a very small module on top of the bitcoin blockchain which they
can actually understand and edit and try to do interesting things with it.
>>: [inaudible].
>> Muneeb Ali: I think we might come up with a version. I think I already talked about how it
separates the control plane from the data plane and enables experiments with name spaces coding and
pricing, which is something I feel like if there is ever going to be a decentralized DNS system you
would have to get this right. You would have to get the price economics right to stop squatting
there.
So this is just our auth protocol that we are working on, and this is just a very high level
overview. I think I wrote a post about this Slack hack where they were hacked,
they didn’t disclose it until they were able to figure out what was going on, but if you think
about it there are so many startups, so many companies that are talking about their confidential
information and the chat logs are actually with Slack and they do access control for all the
companies and then are like okay, you can get access to this. And a much better situation is
that the access control actually lies with the company and there’s only encrypted data that is on
the service. For this you need some sort of a decentralized identity or decentralized
authentication mechanism which is what we are working on. So I guess we can take questions.
>>: Your embedding of the blockstore into the namecoin network you talked about proof of
burn for the transaction fee. I assume if you went to the bitcoin network you may have to do
some sort of like transaction fee there and then the mining reward [inaudible].
>> Muneeb Ali: Yeah. That's a great question.
>>: Can you quantify how much that is in dollars?
>> Muneeb Ali: Yes. So both things. So first decision was should we do proof of burn or should
we actually pay the miners? So we ended up deciding to pay the miners because we were
actually using the infrastructure. They should have some incentive to keep the underlying
infrastructure running. So in our case the mining fees actually go out to miners and then we
actually discovered that it's actually more complicated than that because then a miner could
register names for free when they find a block because they are essentially paying themselves.
So then we had to come up with a slightly more complicated mechanism where we were
distributing the mining fee to the last X blocks and X could be 10 or 20 and then the probability
that a single miner is actually getting all those blocks is very low. So that's just incentives.
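One way to picture the fee-spreading idea is the sketch below, which splits a registration fee evenly over the miners of the last X blocks. The even split and the `split_fee` helper are assumptions; the talk does not describe the exact payout rule.

```python
from collections import defaultdict
from typing import Dict, List


def split_fee(fee: int, recent_miners: List[str]) -> Dict[str, int]:
    """Split `fee` (in satoshis) across the miners of the last X blocks.

    `recent_miners` holds one entry per block, so a miner who found several
    of those blocks receives several shares. Any remainder goes to the
    earliest entries so that the payouts always sum to `fee`.
    """
    if not recent_miners:
        raise ValueError("need at least one recent block")
    share, remainder = divmod(fee, len(recent_miners))
    payouts: Dict[str, int] = defaultdict(int)
    for i, miner in enumerate(recent_miners):
        payouts[miner] += share + (1 if i < remainder else 0)
    return dict(payouts)
```

A miner registering a name in their own block now gets back only the shares for blocks they actually mined in that window, instead of the whole fee.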
But the other thing, how much money you're paying in dollars, is where the previous
discussion comes in. In namecoin it was hardcoded and they just had a set scheme of how
it was going to drop. In our case you could create a new namespace by setting a specification
on how you want the names to be priced and how they would change over time. There's some
very interesting things you can do. You can actually try to build an auction system on top of this
where people can actually do bidding on names.
>>: Is this at the blockstore level or is this
>> Muneeb Ali: This is at the blockstore level.
>>: Okay. This is not you’re embedding the blockchain and at the blockstore level you have a
mining incentive.
>> Muneeb Ali: Yes. So [indiscernible].
>>: Does that, distributing the fee, actually work? I would think the incentive structure is such
that I would just cooperate with the last 10 people, basically everybody that's part of the chain
cooperates and they get free stuff.
>> Muneeb Ali: Oh, you're saying if the miners start [indiscernible]? If the miners start
[indiscernible] then we have bigger problems because they can have 51 percent of the network.
>>: Not general [inaudible], just cooperation on this aspect. I haven't thought it all the way
through. It seems like it should be possible.
>> Muneeb Ali: It is kind of like an open problem that, you can actually join the GitHub
discussion on this. It's very, very interesting. But yes, I am not 100 percent sure but it will
definitely be better than just paying one miner.
>>: So you talked about people who had joined the network on their own. Clearly they’re
storing, they’ve hashed the username or the JSON block and stuffed it into the blockchain. So for
the things that aren't stuffed in the blockchain where do those things go? Is the data part of
the blockstore also distributed? How does that work?
>> Muneeb Ali: Yeah. So there's a difference. Our production network is on namecoin where
all the data is literally embedded in the blockchain. So there's no need for external data
storage. That's the state of the art but with blockstore the data is outside and we are in the middle
of figuring out incentive mechanisms for why would people, like the incentive mechanisms for
running the blockchain are clear, but the incentive mechanisms for the DHT aren't clear. So
that's kind of like where we are with blockstore right now.
>>: But your goal is to like using some sort of [inaudible] store for the storage part of it as well
so that that's essentially decentralized also.
>> Muneeb Ali: I mean from a company perspective we can just run a mirror, and where we
start getting new problems is when we have millions and millions and millions of users and
that's a good problem to have. The only attack
possible is unavailability: if our servers are down people are not able to get the data and hash it
and verify that this was the correct data. But from a research point of view I think incentive
mechanisms for running a DHT are actually like a very exciting thing. And my hunch is that
thinking through more about linking the blockchain with the DHT, how you were able to do, I'm
not sure if it solves the Sybil problem but it definitely makes it harder, but I think
something in that general direction might actually work where you are able to figure out
incentives as well for running the DHT.
>>: My question is [inaudible]. So one of the advantages of only one company having the
information is that side-channel attacks are harder potentially because you can't [inaudible]. How
are all the ciphertexts [inaudible]? The question is has anyone been successfully monitoring
any side-channels in any of these [inaudible]?
>> Muneeb Ali: What type of side-channel attack?
>>: Let's say you know how to break the encryption. They use some sort of random number
generator that you know how it worked.
>> Muneeb Ali: Okay. So that basically goes back to the security of the bitcoin [indiscernible]
encryption, and there are things where people are saying that it's interesting. I actually had a
discussion with Ron Robust[phonetic] about this that bitcoin happens to be quantum proof
until you make your first transaction out of an address because the algorithm they actually use
is quantum proof but once your public key is known it’s no longer quantum proof if I
understood it correctly. But I would say that the way we are looking at this, and we actually
went through this like I've been managing Linux servers and clusters for years, the first time we
had actual bitcoins on our servers is when I had a real incentive to go and actually secure my
servers. It was something I just had to do otherwise people would literally come and steal
money from me, and people have seen that happen everywhere. Bitcoin is in a way giving
people incentives for the first time to upgrade their security infrastructure. Also end-users.
For people who own bitcoin, the security level of their personal laptop and how they store private
keys is completely different from someone who doesn't own bitcoins.
So one of the things that our startup is actually trying to do is make better tools for people to
keep their private keys in a way that they can actually understand. This is something I need to
keep secure, this is how a reset would work; for example you can now own the username not
by a single private key but on a multi [indiscernible] address. So to transfer ownership you
would need signatures from 3 out of 5 or 5 out of 7 different parties, but how do you design
that software so that average users can actually understand it that okay, if I lose this I need to
call my wife and my brother and then they would do something and then a reset would
happen. So it's very exciting times but also like really big problems that we are trying to
>>: It’s kind of an old problem because [inaudible]. Whenever you have to provide proof to the
system you need your private key for that. But if that private key is stored only on my laptop
and my SSD just went out last week, my identity is gone at that point.
>> Muneeb Ali: So what we do right now is that we force people to save a backup copy of their
private key which is encrypted with our really long public key on Dropbox or G Drive. So we just
force them during sign up. So once they come back and they're like hey, I no longer have access,
we are like do you have access to that Dropbox folder? So this is just like v1
>>: Does anybody want to set up a Onename account? It might be easier for somebody if you
see it.
>>: I think you just volunteered.
>>: I already signed up so you can’t see the signup process. But I think it kind of helps illustrate
what it is. It's not hard. You just need a Twitter and a Facebook account.
>> Muneeb Ali: I can just show as well what the account looks like.
>>: While you're doing this you just say that you have access to everybody's private keys?
>> Muneeb Ali: No, we don't.
>>: They're all encrypted with your public key but they're stored
>> Muneeb Ali: That's the backup file that only they have access to.
>>: On Dropbox.
>> Muneeb Ali: Where we could drop something without getting access to the Dropbox itself.
It's like a write only.
>>: So it's the case that you and Dropbox in collusion now have access to everybody’s public key
or a private key. [inaudible] compromise [inaudible].
>> Muneeb Ali: But this is again, like to clarify, this is like v1. What we are working on
right now is a mobile app and a Chrome extension where they can do two factor on their
account using their desktop, their mobile device, and then the next step would be other trusted
family members, coworkers that they can add on the multi [indiscernible] address.
>>: You can sign up on mine.
>> Muneeb Ali: So hopefully it’s not going to ask you to verify the e-mail during the sign up.
>>: And we all know how long your password is. You’ve got the same 20. password.
>>: I had a question. What is Onename’s monetization strategy? How's the company going to
make money?
>> Muneeb Ali: Yeah. I think we were looking at it as something like, first of all we are not
looking to make money in the near future at all. Our investors are fine with it. They want us to
grow the ecosystem. There are a bunch of other companies that are actually already trying to
use our infrastructure. So one way to think about it is if we were to monetize you can look at
the model of like GitHub and Git. So Git is open source, you can run your own server, you can
do everything else but developers would pay a small fee for convenience. So it's possible that
people building apps on top of this could be like hey, here's some small amount of money, I
don't want to run my own resolver or my own X and they'll just use our API.
>>: [inaudible] Twitter?
>> Muneeb Ali: You can skip that. You can manually do it as well.
>>: I have an account, but I think it’s just your account though.
>> Muneeb Ali: Yeah. So this was a step where you can save a backup file. You can just backup
it as well. So we’ll download the file but this is my computer so now I have it. File, Save, go to
New, [indiscernible] as well. This was the auth from Twitter. So this is that message where
now actually the name is not registered yet so our backend system has tried registering it and
then you can actually edit any of these fields if you go down. So everyone automatically gets a
bitcoin address. So it's like the way we are thinking of this is like the more people sign up on
this platform everyone has a private key to do encrypted communication, everyone gets a bank
account so you can transfer money to people, and they own their own data so they are not
dependent on any central company.
>>: So I just had a key pair generated. This just generated a key pair. What code was it that
generated that key pair?
>> Muneeb Ali: It’s an open source library we wrote. You can have a look at it. It's on GitHub.
>>: Okay. So I executed your code to get my key.
>> Muneeb Ali: Well, again as I'm saying, this is v1. What would happen in v2, very
soon, is you would have an open source Chrome extension, or open-source mobile app which will
generate the key pair on your device.
>>: So an estimate 1.2 pick like a bitcoin provider [inaudible] wallet. I'm assuming most people
click the first one, coinbase. What's the difference between coinbase and blockchain?
>> Muneeb Ali: Those are two different companies. They are some of the largest. Coinbase
does both models where in one model they are custodian of your private keys as well, so it’s
basically a trade-off between UX and actual security. So coinbase does both. They started off
with just hosting your private keys for you, but then over time now they have options where
only you have the private key. Blockchain started off as saying that hey, we don't want to host
any private keys at all. Only you have the private keys and then you can try using the services.
They’re both fairly large companies. I would say
>>: [inaudible] at some point during this process [inaudible].
>> Muneeb Ali: You can because, actually I can show you something. This bitcoin address is
actually not the address that owns your username. This is like a mapping service. So think of it
as like DNS, right? So the DNS system gives you a cn.com to IP address mapping. We're giving
you a username to bitcoin address mapping and the security is the same because both pieces of
information are coming from the blockchain.
So I can show you this payment [indiscernible]. This is what some bitcoin wallets started
integrating. I think we got around 18 percent market share of bitcoin where you can actually
instead of sending money to an address you can send it to a username. Just type in Fred Wilson
and we will show you all the proofs that this is the Fred Wilson on Twitter. If you click on it it
will show the proof on Twitter. If you click on Facebook it will show the proof on Facebook and
that's the bitcoin address that you no longer need to know and you just do the mapping. So
this is like one application built on top. There are people who have built another application
which is on Fred's blog this thing is actually rendered from the blockchain. So it's like just a
widget where it’s like if you, actually I have a better example. You can go to OpenBazaar; are
people familiar with OpenBazaar? It's a decentralized eBay.
So their entire team they’re actually rendering the pages using the Onename protocol, right?
So this is all data coming from the blockchain and this is interesting because this can start
getting into access control on a company basis as well. The company can sign a statement that
all of these users at this point in time are employees and then they get access to something
from there. So there are all these apps. We were thinking of like a blogging platform where it
shows a widget like this that this person wrote the post but the post is actually signed by the
private key. So there's actually very strong security guarantee that this person actually wrote
the post as well and so on and so forth. We think of this as a new platform for building
applications more than anything else.
>>: So starting in the early 1990s I started sticking my PGP public keys in the footers of my e-mails
and my e-mail signature. So there are probably thousands of servers all over the world which
have e-mails for me which associate the same data that we’ve got here which is my public key,
a little bit of blurb about me and my e-mail address and it seems like that kind of solves the
same problem. I mean it's not automated in any way, but essentially it serves as proof of a
relationship between a key and my short name identity because nobody's going to take my
public key in front of me. And I wonder why I've never had to really make use of that and I
don't know if that is because no one's ever challenged that my public key and my e-mail address
are the same thing. I feel like the reason I didn't use that, I’ve never relied on that at all, has
more to do with the fact that there are easier ways for me to establish my identity and to use
this distributed relationship table. It’s kind of just an open observation.
>> Muneeb Ali: I think it's very interesting. So there are two things I would comment on when
talking about PGP keys. One is technical, that your PGP private key is your identity. In this
case you have a human readable username that you can actually transfer to another private
key; you can make a transaction which transfers the username to another private key. So the
name
>>: Like I could just stick a public key in your e-mail address?
>> Muneeb Ali: So I am thinking of it like if you lose your PGP private key you now have to go
and revoke all the statements that were made for you, but in this case you lost the private key and
there was a reset mechanism. Most of the proofs were linking to your username and not to
your private key so revocation is slightly easier. And the second thing which I think is really
important is that there is this public infrastructure now that is pretty much running a discovery
service, so for all these applications which are using the blockchain it’s fairly simple to write
them now because all you're doing is using this public discovery service to read data and then
display it which I feel like PGP infrastructure is possible to build that but it’s missing that at least
at this point.
>>: I think the services existed but maybe it was way too early. Like there were services where
you could upload your public keys and say, here's the e-mail address attached to this but I
didn’t ever use any of those either even though I uploaded them, like the data is just there and
it’s never been written I guess.
>>: So in this case the [inaudible]. Do we effectively have a full copy of the blockchain and then
you’re just using your API in order to read out [inaudible]?
>> Muneeb Ali: Yeah. So what we did was, so this is a classic running-an-infrastructure-like-DNS
problem. So in DNS you have root servers and then there are caches as a hierarchical system, but
in our case the blockchain is the root server so anyone who has a copy of the blockchain can
effectively even cache it if they trust the cache. And so what we did, let me actually go to the
GitHub. So this is the library we use for most of the bitcoin stuff. I think I showed a couple of
examples. This is blockstore; this is the specifications which is just how protocol changes and
how profiles should be expressed and anything else to do with the protocol, so this is the
resolver. What the resolver really does is we make it easy for people to run this Docker, and
the Docker already comes with a copy of the blockchain. If you trust it you can use it as is, or you can
recalculate it all the way from the start if you don't trust that version of the blockchain. And then
it has a memcache server around it that loads the namespace in memory, purely for scalability, so
you can literally just, with one line of Docker, boot up your own resolver and now you
can start responding to lookup requests.
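As an illustration of "recalculating from the start" and then serving lookups from memory, here is a minimal Python sketch. It is not the actual blockstore or resolver code: the `NameOp` record, the three operation kinds, and the `replay` and `resolve` functions are all hypothetical names chosen for this example, and real name operations carry signatures and fees that are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameOp:
    """Hypothetical name operation read from a block, in chain order."""
    kind: str        # "register", "update", or "transfer"
    name: str        # human-readable username
    owner: str       # key that issued this operation
    value: str = ""  # profile data (register/update) or new owner key (transfer)


def replay(ops):
    """Rebuild the whole namespace in memory by replaying every operation."""
    namespace = {}
    for op in ops:
        record = namespace.get(op.name)
        if op.kind == "register":
            # First registration wins; later attempts on the same name are ignored.
            if record is None:
                namespace[op.name] = {"owner": op.owner, "profile": op.value}
        elif record is not None and record["owner"] == op.owner:
            # Only the current owner may update or transfer a name.
            if op.kind == "update":
                record["profile"] = op.value
            elif op.kind == "transfer":
                record["owner"] = op.value
    return namespace


def resolve(namespace, name):
    """Answer a lookup from the in-memory namespace (None if unregistered)."""
    return namespace.get(name)
```

Because the namespace is derived only from the ordered operations, anyone holding a copy of the chain can rebuild the same view independently instead of trusting a prebuilt one.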
So I feel like a lot of effort is actually needed on the software engineering side as well, making
it really easy for developers to start using it and [indiscernible] in the system. And so far it's
been good. I think the community is responding to a system like this. And bitcoin has done a
good job. So there are bitcoin fanatics. It’s a small community and there’s the image of bitcoin
outside that I don't know if you know what it is, it sounds like a scam, and they hear about
[indiscernible] going down and a couple other things. But people within the bitcoin community
suddenly are very, very aware of privacy and keeping private keys secure and ownership that
hey, I don't want to trust any company and things like that. And the same crowd is getting
excited about an Internet infrastructure like this as well that applies the same things to
usernames or your graph, so your social graph and other applications built on top.
>>: By Docker you just mean a Docker container?
>> Muneeb Ali: Yes. Just a container. A container boots up, installs like namecoin, loads the
blockchain, runs the memcache and warms up the cache. It all happens pretty quickly. You
should go and try it out.
>>: What kind of latency is there in the verification? So let's say you went and you deleted that
post that Ahmad[phonetic] put on your Twitter, for example, because he's not really you. How
long would that take to know that
>> Muneeb Ali: Basically there are two types of latencies there. One is that the resolver that
you were talking to might have a cache. So I think by default our servers have a 20 minute
cache. So within 20 minutes if someone else made the query it will no longer be verified. But
on the other side, if you now want to, let's say your Twitter got hacked and you want to say on
the blockchain side that hey, I don't own this anymore, the latency there is when the new
profile update would get broadcast on the network, which is roughly 10 minutes, but it can be
higher or lower as well.
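The two latencies described above can be sketched in Python as a time-to-live cache in front of a lookup function. This is only an illustration under stated assumptions: `TTLCache`, its `fetch` callback, and `rough_worst_case_minutes` are hypothetical names, the 20 minute default comes from the figure quoted in the talk, and the 10 minute figure is approximate.

```python
import time


class TTLCache:
    """Hypothetical resolver-side cache: answers are reused until they expire."""

    def __init__(self, fetch, ttl_seconds=20 * 60, clock=time.monotonic):
        self._fetch = fetch      # function that reads the current profile state
        self._ttl = ttl_seconds  # 20 minutes, per the default quoted in the talk
        self._clock = clock      # injectable so the behavior can be tested
        self._entries = {}       # name -> (time stored, value)

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: may be up to ttl seconds stale
        value = self._fetch(name)
        self._entries[name] = (now, value)
        return value


def rough_worst_case_minutes(cache_ttl_min=20, update_min=10):
    """Rough upper figure: the update lands, then a cached answer expires."""
    return cache_ttl_min + update_min
```

With the quoted numbers the rough combined figure is about 30 minutes, but since the update time can be higher or lower, this is an estimate and not a guarantee.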
>>: All right. So we can break and I think if people have questions just come up and chat and
>> Muneeb Ali: Yeah. I'm happy to talk to people.
>>: All right. Thank you.