Volker Mische

@vmx

Joined on Oct 2, 2019

  • Non-interactive PoRep is like doing an interactive PoRep about 12 times in a row. The first phase is called synthesis, where the circuit is generated. Due to the implementation, it's a memory- and CPU-intensive operation. Then the actual proving happens. There we have the SupraSeal improvement, which is GPU heavy. The synthesis can be run in parallel, the proving is done sequentially (one proof at a time on a GPU). Currently the synthesis and the proving of 10 proofs take about the same (wall) time. The possible performance improvements are sorted from the ones I think would improve things the most to the ones that would do less. Filecoin specific circuit synthesis Currently bellperson is a general purpose library for proving. For Filecoin, though, we use specific circuits, which also cannot be changed without a major upgrade. Hence one possible optimization could be to optimize the generation of the Filecoin specific circuits. They depend on the input, but likely there are large parts that are independent of the input. Even if not, there's likely a way to synthesize those specific circuits in an optimized way that takes less RAM and/or CPU time. Reducing the RAM would make it possible to run the synthesis of more proofs in parallel. If the CPU time could also be reduced, we might be able to do the synthesis for all proofs required for the Ni-PoRep in a single run. That would almost halve the total run-time.
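
The two-phase flow described in this note (synthesis in parallel, proving one at a time) could be sketched roughly as follows. This is only an illustration: `synthesize`, `prove`, and `run_batch` are hypothetical stand-ins, not bellperson or SupraSeal APIs, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def synthesize(proof_index: int) -> str:
    """Hypothetical stand-in for circuit synthesis (CPU and memory heavy)."""
    return f"circuit-{proof_index}"


def prove(circuit: str) -> str:
    """Hypothetical stand-in for GPU proving of a single circuit."""
    return f"proof-of-{circuit}"


def run_batch(num_proofs: int, max_parallel_synthesis: int) -> list[str]:
    # Phase 1: synthesis runs in parallel, bounded by how many circuits
    # fit into RAM at once (modelled here by the worker count).
    with ThreadPoolExecutor(max_workers=max_parallel_synthesis) as pool:
        circuits = list(pool.map(synthesize, range(num_proofs)))
    # Phase 2: proving is sequential, one proof at a time on the GPU.
    return [prove(circuit) for circuit in circuits]


proofs = run_batch(num_proofs=10, max_parallel_synthesis=4)
```

In this model, lowering the RAM needed per synthesis corresponds to raising `max_parallel_synthesis`, which is the lever the note discusses.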
  • Content-addressing: chances for data distribution and verifiable data pipelines Workshop on Open Geospatial Science and the Decentralized Geospatial Web, Maryland, USA 2024-04-03 Goals for this talk Learn what content-addressing is Why content-addressing is useful and some of its applications About me Volker Mische (vmx) Open source geo things for over 15 years
  • This is the pseudocode for how the labelling in PC1 works. It's for the 32 GiB sector size production parameters. Glossary: node: A "node" refers to a 256-bit (32 bytes) value of the padded input data. A 32 GiB sector has 2^30 (about 1.07 billion) nodes. label: A "label" refers to a 256-bit (32 bytes) value of one of the layers. It's the result of hashing the node index, layer index, replica ID and multiple labels together. Inputs: parent_file: That's a pre-generated file (it's fixed and the same for everyone) that contains a list of random 32-bit integers that are used as offsets to read from a file.
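
The glossary terms can be made concrete with a small sketch. The node count follows directly from the sizes given (32 GiB divided by 32 bytes per node). The `label` function is only an illustration of "hashing the node index, layer index, replica ID and multiple labels together": the use of SHA-256, the field order, and the integer encodings are assumptions made here, not the production byte layout.

```python
import hashlib

SECTOR_SIZE_BYTES = 32 * 1024**3  # 32 GiB sector
NODE_SIZE_BYTES = 32              # one node is 256 bits

# Number of nodes in one sector: 2^35 / 2^5 = 2^30.
num_nodes = SECTOR_SIZE_BYTES // NODE_SIZE_BYTES


def label(replica_id: bytes, layer: int, node: int, parent_labels: list[bytes]) -> bytes:
    """Illustrative label: hash replica ID, layer index, node index and parent labels.

    The encoding below is a hypothetical layout chosen for readability.
    """
    hasher = hashlib.sha256()
    hasher.update(replica_id)
    hasher.update(layer.to_bytes(4, "big"))
    hasher.update(node.to_bytes(8, "big"))
    for parent in parent_labels:
        hasher.update(parent)
    return hasher.digest()  # 32 bytes, the same size as a node


example_label = label(b"\x00" * 32, layer=1, node=0, parent_labels=[b"\x01" * 32])
```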
  • Collaborative mapping without internet connectivity Global FOSS4G 2023 in Prizren, Kosovo 2023-06-28 Demo: connect to WiFi Switch off your Internet connectivity Connect to the WiFi: vmx@foss4g About me Volker Mische (vmx) First contact with OSGeo 15 years ago
  • IPLD Shallow Dive Goals of this talk What IPLD is and how it works Know why Multihashes and CIDs are needed for IPLD IPLD + Multiformats IPLD Content-addressed structured data IPLD
  • Note @vmx: Thinking about it again, the whole proposal isn't really different from just using two CIDs, one for the context, one for the content. There was another idea floating around about generalizing the idea of multicodec code + bytes, I'll try to find some time to write that down as well. Start Date: 2023-01-16 Related Issues: https://github.com/multiformats/cid/pull/49 https://github.com/ipfs/specs/pull/305 Summary This proposal adds another layer on top of CIDs to describe application specific context. This can range from semantic information about the data to auxiliary data that is needed for traversal. It can be considered a fat-pointer. Motivation
  • The idea is to provide context to some data. Proposals IPIP proposal One proposal was started as IPIP [TODO vmx 2022-10-31: link]. This was spawned by needs from Lurk, but other people also asked for having more context [TODO vmx 2022-10-31: look through multicodec and find those]. Use cases: Lurk: the exact same data might be traversed in different ways, depending on the context Software heritage: [NOTE vmx 2022-10-31: this might be the routing information case, but I cannot recall]
  • This proposal is for hash functions that have different results, depending on the given parameters. Examples are BLAKE2, Skein or Poseidon. While it's possible to put all 244 variants of Skein into the multicodec table, it's not easily possible with e.g. Poseidon, which has a bigger parameter space. An option would be to only add parameters that people actually use. But even this is a problem, as there is some overhead involved in adding entries to the multicodec table. Currently, the entries in the multicodec table are curated and people prefer having short identifiers, which leads to a bit of coordination effort to add a new hash function. Then implementations need to be updated. Ideally you'd only need to go through this process once for a hash function family, and then you can choose any parameters you like. The important property of a multihash is that it's identifiable and users can verify a hash solely based on the information the multihash provides. This proposal keeps those properties and retrofits them into the current multihash system. Hashing parameters As the parameter space may be large, we define it in a structured way, hash that information and use that as an identifier. Those identifiers may even collide, as an application is not expected to support all possible hash functions with all possible parameters. If it turns out that a collision is troublesome, some salt could be added.
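
The "Hashing parameters" idea (describe the parameters in a structured way, hash that description, use the result as an identifier) might look roughly like the sketch below. Everything specific here is an assumption for illustration: the canonical JSON encoding, SHA-256 as the hash, the 8-byte truncation, the parameter names, and the `salt` field are not taken from the proposal.

```python
import hashlib
import json


def parameter_identifier(params: dict, length: int = 8) -> bytes:
    """Derive a short identifier from structured hash-function parameters.

    Hypothetical encoding: canonical JSON (sorted keys, no whitespace),
    hashed with SHA-256 and truncated to `length` bytes.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:length]


# Illustrative parameter sets; the field names are made up for this example.
poseidon_a = {"family": "poseidon", "arity": 2, "full_rounds": 8}
poseidon_b = {"full_rounds": 8, "family": "poseidon", "arity": 2}
poseidon_salted = {**poseidon_a, "salt": "1"}

id_a = parameter_identifier(poseidon_a)
id_b = parameter_identifier(poseidon_b)
id_salted = parameter_identifier(poseidon_salted)
```

Because the encoding is canonical, key order does not affect the identifier, and adding a salt field yields a different identifier for otherwise identical parameters, which is the escape hatch the note mentions for troublesome collisions.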
  • FilCrypto Team 2022 Wins SnapDeals Update a sector without resealing. SnapDeals Design and review (@kubuxu)
  • Rust and GPUs My task since Q2 2021: make rust-gpu-tools framework (OpenCL/CUDA) agnostic. State Q2 2021 No established, still actively maintained Rust libraries for using GPUs (neither OpenCL nor CUDA) OpenCL options: ocl: maintaining a fork ourselves