Ranvir Rana: BFT Proof of Location

***Please note that this is an automatically generated transcript and will contain errors.*** --- Read such direction. Like, pretty cool that I've tried to tackle this problem of geographic decentralization. Or at least, all the privilege that we need not try to, through distributed systems or photographic means, verifying the location of, different, actors in the system. And I think this was an interesting tool to use today. Maybe reconcile with the, incentives or economic considerations, I think were studying just now. Yeah. So very interesting, but very different. Like. Yeah. Thanks for the, interim. This, I want to thank you. This will be I'm going to be presenting all of these tools at the talk so far as being inside attempts to promote decentralization. But the moment you give incentives to people who like to fake their attributes. Right. And that's why we need, Byzantine fault, right? Way to, like, determine whether we like it or not. So this talk is about the proof of allocation, broadly, this is the work under witnessing, true, researchers working results. Not only they are sent to let this paper, and replacing our role as a broader goal is to achieve consensus on physical state. If you think of, blockchain, they are not, the trend of, working in a silo of their own, their only connection exists in the form of oracles, which can communicate it to the internet, but it doesn't have, connection to the real physical world. And the assumption is that connection. We are building a whole sort of technologies on which pure others market. Okay, so why location? Why did we focus on education? Because at high level, consensus or coordination happens on rare attribute. Everybody agrees in case of blockchains, it was it was purely digital realm. So it was consensus on time frame is only that attribute. And once you have consensus on time, you can build any sort of politics on top of it for the physical, for what is an analogous attribute, and that is location on top of things, you cannot replicate it. And the goal of our protocol is to have a global agreement on where and when things exist. Invisible, how you can build all sort of applications, decentralized, blockchains being one of them. Okay. So why is this, difficult problem? Because if you look at the sort of technologies available right now, forget the three one and the two, you do not have a technology that can provide spoofing resistance for location. So if you look at G.P.S., you can easily spoof a G.P.S. by just feeding in different numbers, prime numbers coming from your satellites. This can be done at a software level and at the level. Most common techniques used by, use across blockchains is to use IP to determine location that can be easily spooked by, amateurs by simply using reason, because it's a lookup on a database. There have been some approaches to achieving this, which was which typically rely on local beacons. So let's say I want a consensus on location within San Francisco, well, set of beacons everywhere and communicate. I'll have a special device to communicate to those beacons in my location or. But any such applications rely on like communication or via this medium, which has restrictions on line of sight. So I can only communicate to like, I don't know, conditions at a time. And for some reason, if those thing beacons collude, it's just your neighbors. You can tell you that you need. You will be able to spot your location. And this is something that has happened, across, networks like helium, which are providing decentralized physical infrastructure of where people had set up central nodes across the Pacific Ocean in the middle of the Pacific Ocean. Because the South Asia looks now to solve this problem, we need to, like we we have our own constraints, which is we don't want additional hardware to be needed by to solve this problem because I can, you know, take every human out of some, transmitter, and then I can solve location in some sense. And we need to rely on a global set of nodes so that you cannot collude like a small subset, and form like a local cartel, which can falsify locations. So because of this reason, we chose the method over which this location happens to be. The internet. And we rely on active measurements or rate internet to determine location. Right. Now there are some challenges with location determination on the internet. In the end, all location determination is just triangulation. Fancy ways of doing triangulation. But over the internet, the setup works as follows. So you have a framework, we call it local solutions versus local like. But the brewer's states are locations, in this case, the block producers see that that. Okay, they are based in Prague, right? Now, the task of the network is to determine whether the claim is correct or not. And to what level of accuracy can I determine that it's broke? The way this works is there are like, no unsupported versions of watchtowers of any node. They don't need to be trusted. You can just set up a watchtower on a laptop. These watchtowers are tasked with, in some sense thinking that prover and the other gates or things will determine the location. Now, there are some challenges here. First is, you know, you have because we want to achieve a very large scale of these less towers. The established a permissionless but now in a permissionless network, you obviously have untrusted participants. So you can have these watchtowers malicious and obviously you can have the prover to just prove its location to the managers. So you need to deal with some sort of Byzantine fault tolerance over, three persons and two dimensional, plane. And you have internal geometry, which is very intricate. So if I tell you the distance, if I tell you that. Okay, the latency between, two nodes is 50 milliseconds, you cannot just put speed of light over optical fiber or whatever medium to determine the distance to local roads, because internet is comprised of pretty, of a lot of intermediaries like routers, cases and so on. Because I'm done solving proof of locations. Can be, separated into two different problems. First is you need to solve for internet geometry. You need to learn the embeddings of how the lake responds to distances or the internet. And you need to apply some sort of Byzantine trigonometry on top of that, which is a global location for. Yeah. Now you can go into detail material at a high level. For this. But the first task, which is proof of internet geometry, comprises of these watchtowers pinging each other and getting some vision. So there's more stories of the locations, the claim locations there have been each other will get some measurements. Now, this will be a pretty sparse metric because we don't want all watchtowers. Bring all of the watchtowers. Prediction gets, pretty complicated. In time, the time complexity is where it will be very high. So we we create this matrix and we apply robust matrix completion on top of this matrix. So, this matrix says okay, how much time did it take for a watchtower in San Francisco to ping a watchtower? Prince Winston, how much time it takes the watchtower in Princeton to think a watchtower in London and so on. Right now, a lot of these measurements can be Byzantine. But robust matrix completions, objective is to remove these Byzantine entries, and it relies on, primitive that this matrix in some sense corresponds to distance. And you if you create, a matrix of distance, it's going to have a low rank. You can assume a rank for matrix for this. And if it's a rank for matrix, you have sufficient constraints to filter out Byzantine entries. So we apply that to create this, how much time would it take for a message for location here to reach location B across the globe to kind of learn disability. Now that we have this embedding, we apply the proof of location back on top of this now. So just no question. No. Well what type of percent like I mean, if I ask for a ping yes. Then like, I cannot make the response time faster, but I can make it slower. So do I come for that? Yes. Or you come for like, any ping? I get that precise. We are waiting for it to be slower. So anybody any slower? Not nodding because Byzantine nature. It can be slower because of some physical constraints. So you can respond to the lower bound. And this is. Yes okay. Yes. There can be thing responses that like if you are a malicious watchtower and a malicious prover, then you can said you ping to be zero. Sure. Okay. So we need to filter of those systems, those kind of collisions geometry. And you apply proof of location to then filter out, not filter out will create some sort of a common region. Assuming that ping delays can be large. Right. So it can be larger between two nodes, but there can be another node at a circle. I think because we sort it. Could you give some intuition about the magic behind this robust matrix completion? That seems to be very clear. Yeah. So so okay. So let's say we ignore robust matrix completion. That does that for like some efficiency purposes. Say that you can measure all distances between sorry, all you can loop is between all nodes, within our network. If you do that, if you have malicious watchtowers, they would be able to incorrectly, give you a ping from. Okay, let's take an example. So let's say I'm a I'm a little fan of the malicious one here. I'm sitting in San Francisco, but I claim that I have Denver right. So I can I will be playing by several watchtowers which are based in, San Francisco, and I can just delay my measurements, my ping with the artificial. Right. But I cannot artificially delay the ping between, Watchtower in Denver, because simulator is not such a clique that allows you to. Yes, yes. So the task of this completion was, is just to, like, expand it to a large, larger network. So maybe there is no, no watchtower in Denver, but there might be Watchtower in, Phoenix. Right. Which might give you a different view and you won't be able to replicate the delay because it actually took me the huge amount of time for that ping, to reach my actual location. So. But, like, if you know who is bringing you, you can like, change the delay to, like, in parcel and say, I'm like, I own a local Wi-Fi, and the local Wi-Fi makes, like, 70 milliseconds. Like a delay. Yeah. So you cannot even do it if I'm, like, a really good Denver. I can have 30 millisecond delay. Yeah. So, like, if I know, like someone from Denver is spinning me I don't know the train anymore. Delay. But I will have more delay in San Francisco. Yeah. So how you can deal with this. So so that would be classified as an inefficient watchtower. Right. So that the whole task of this is to measure the distances so that point will lie somewhere over here. Right. Which is like very high delay for a very low amount of distance. It's like it'll lie on this curve, over here and will not actually contribute to, making that jump. You know, at this boundary. So that's how in proof of location, if you are in a setting where your being is naturally low, let's say you are connected to a geosynchronous satellite, then you will be able to prove your location. That's a limitation within our system. Because if we throw off an arrow, never going to skip this person can be anywhere because they don't circulate. It will be a circular. One on this, the assumption that you make when you draw, I mean you have some malicious and some honest responses like how many honest do you need a person to manage responses for this class? So it's so there are bounds on okay. So we need more than half. Sure. That's a minute. But there are bounds on the actual matrix completion and bozo individual detections. So we assume that in each individual direction like 60 degree region, second degree. Okay. We have majority of the. So once. So we need the majority assumption for the uniformly distributed. So if a set of where you have all roads within the US and all nodes within that your correct because work need majority and uniformly distributed. That's right. Yeah. Can you explain the different lines like what's the difference between the ratio of US curve and then robust curve. Okay. So so this is this is also taken by some process of elimination. So monotone curve is the base curve which is okay I'm getting worse. This is a worst case scenario because there might be some nodes which have like very high which, which will which will have very low, very fast responses for given all the questions right. The robust curve ratio means that you are eliminating the best performing. So let's say you draw a line here and there are some nodes are performing exceptionally low, which is again, both nodes are but not malicious. They can claim to be, the one apart can give you very low it. So let's say there is no one here. Delay 2.5 millisecond, but distance to be, huge. So it's a ratio way of like. So the elimination happens by the ratio of delay to distance. Right. Binning is, binning is something that, this work was done long time ago, so that's something I'm not. So the technique is something I don't remember about. So I can get back to you on that. Main reason I'm curious is because those two curves are pretty far away from each other. The pin is the ratio. But, yeah. So I'm wondering if they look close in like the beginning, but somehow they diverge very slightly. Oh, okay. So the curve that we end up using is this one. Yes. Right now it looks the most correct. I think the binning okay I remember the bit. So the binning one is you, you just set a hard limiter to almost dilation. So if your delay is like within less than, like a certain amount, you say, you know, six because it's going to take like. Okay. So after we have this internal geometry, propagation essentially does some sort of, intersection of of the, distance regions. So each watchtower will give you, distance of. Okay, this is at max, few thousand kilometers of, like a few hundred kilometers apart. So you draw that kilometers circle, but this one circle around it, you do that for almost two hours, right? And then within a particular arc, you do, you need to remove some. It also changes. So there might be an irreversible it over here cleaning distance like this. Right. And you need to remove any watchtowers which are which are computing the, distance which is out of the range of the, so you remove for certain amount of angles, you remove one stop and adding two revolutions, and you do this across the whole, specificity. It's, it's kind of like quorum intersection, but you're doing the quorum section at each individual. But okay, for simplicity, you can just say that, okay, this is this. You get a lot of circles, understand? And then you find an intersection of those circles. The intersection of that circles is going to be the uncertainty region of that node. So this. Okay. Perfect. So that gives you the location uncertainty certainty of a node. If the node is lies within that intersection. Great. Then you look at what your metric of a correct location is. So this protocol will only give you the uncertainty radius of a particular node. So you can say that, okay, this node is within San Francisco, within a circle of radius 100 kilometer within San Francisco. Right. And it's up to the protocol of what, visit, the boundary. Yes. So there are some limitations of this approach, which is, if you look at the uncertainty of different number of rules, so on this axis, it's, how big that intersection turns out to be. And on this axis is for like, what percentage of this? So the good news is for large number of challenges. So this is district for, for 50 genders. This is all data. So you have 6000 network. But for 50 challenges you can achieve an uncertainty of 100km for almost 43% of the loss. But for 43 of on the nodes at a low density of for exploited nodes across. What you can you see this little accuracy, right. If you increase this, the scope of the shift like this, which is great, but there is a fundamental limitation because we are relying on internet. Right? So even if we have 1 million or 1 billion challenges using this, we'll be able to achieve city level accuracy for most of the nodes. But we don't we cannot go to centimeter little mechanism for that. You need like some. That's medium. The reason why we can't do that is a fundamental limitation of internet, of the, you know, packet routing protocols that I use are important. But we can you can solve both. Solve for that by layering in different, then solutions. All right. Image to location. Okay. And so on. Okay. I think that concludes the talk. We use this, location data for. So I started this by saying that we do it for consensus on physical location. So what that means is, for physical world, you just need three things you need when an event happened, where it happened and what happened during the event. So you just need an additional component of a sensor to collect that data. And like I said it was a communication, for it, it sensor can be like a T you sitting on your phone, or whatever. And you can essentially you run the world is a state machine. Right. So therefore, it's a observation at time t observation at time t plus delta t, you will see the difference. That's the transaction in the physical world. This is what we want to achieve. But this is going to take a lot of, high density observers to achieve this. Okay. So what can you build using this remote location? Well, since in the spirit of this, session, you can achieve some level of resistance based on all the process of your business of work. Also, intuitively, if you just add some latencies to nodes and just positions, you can achieve, some reduction, you can have, verifiable decentralized networks. Also the topic of this, session, one reason why this was like motivation, like why this was developed, was for infrastructure deployment, where there were a lot of decentralized physical infrastructure networks that were coming up and they wanted to know if their nodes, their base station was not selling all these stations, GPU and all its GPS miners are deployed correctly. And you can use that. You can use the system in some sense ordered deployment happen correctly. It can be used for insurance. Verifiable. Awesome. So on. So with that, I conclude my talk. You mentioned that we just seen you have like, you know, security two points. Yes. So so you have these many nodes, right. What do you need? Yeah. So but you need some similar for example. Right. So if you have civil resources, not necessarily like smashing and accountability. And so yeah. So we, we use it for to ensure that it's majority on us. So we cannot slash it because as we discussed like you know internally. Yeah. You know is is wrong. So you don't want to slash somebody because they're like you know so they cannot, you clarify. So basically you are running the, the pink measurements between the individual nodes, which is ours. I'll do that. For example, through validators. There's a few more similar. So probably the geometry was for max allocation prediction. So once you have a location to location metric you can apply the same thing, even if it's a different node in that location. So basically once you get information about efficient so what we are able from that. So you just saw a question about the model. So like I say like the honest ones are uniformly distributed. Yes. Then how are the militias distributed. Do they have to fix their locations before or after the on this lesson. So so they have to so their location, they can set the location any time after the honest locations were chosen, after. So then you need a minimum number of onslaughts, I guess, for this to work at all of it. Like it's not enough that you have most majority you need like not majority plus minimum number of. Yeah. So you need nodes in like different directions. So for example if you're if you are sitting if you have a globe okay. So so if you don't have one it's not over here. Yeah. Right. You won't be able to achieve enough certainty in the selection. Right. But in this honestly, in this direction, we succeeded slightly like we only needed one. It's not way. So you don't have an honest node in this plan, right? You will have to rely on the satiety of these nodes in this. In this direction. Yes. Okay. Which would be good. We're going to be optimal because we won't be able to. So the intersection of these two circles would have very high, distance on this can have very high distance on this axis. And we need nodes over here to reduce that, but you can still work. So that's exactly I think that's where my question comes from. Like, doesn't that mean that you can't like this, the malicious node position need to be fixed before the honest positions are chosen? We don't we don't. Yeah. Okay. But but you are. You do have to make some assumption about the distribution of the honest nodes, right? Because if they're like if all the on. Yeah. And like one point if you actually so you know that even if they're like uniformly placed and there's not enough of them then like there will be places where they can put a lot of malicious nodes and you won't be able to show and look, you can angle it. Yeah. So you need a global like we need full coverage. So our metric of decentralization is pretty simple. Is, is you just need to cover for your. Yeah. Because waiting for the number of honest nodes for the nodes cannot be chosen after the insertion. Yeah. I just put more. Thank you very much.