Decentralized Data Compliance WG

# Decentralized Data Compliance WG [![hackmd-github-sync-badge](https://hackmd.io/zAZmzYc5SZirLWTB-ZsNOw/badge)](https://hackmd.io/zAZmzYc5SZirLWTB-ZsNOw)

## Basic Info

- Primary mode of communications: #wg-data-compliance on [Filecoin Public Slack]()
  - async longform collab?
  - Slack versus Github Discussions?
- Meeting cadence: Every four weeks on Monday for now
- IPR regime: ?
- Confidentiality: opt-in for specific work items as needed but maximally transparent by default as per [PLN norms](https://protocol.ai/blog/announcing-the-permissive-license-stack/)
- Steward/Responsible Idiot: Juan Caballero, aka [bumblefudge](https://github.com/bumblefudge) on Filecoin Public Slack

## Meeting Notes

### Parking Lot for future topics

- community patrols for web3 <> NOPFS
  - https://github.com/MetaMask/eth-phishing-detect <> chainpatrol.io, @Wallet_Guard @PhishFort, @Cipher_Blade , etc.
  - how get them onboard the NOPFS train?
  - talk to ipfs-for-dapps wg?

### 27 November, 8am PST

- In attendance:
  - michel legault, data governance researcher and IPFS aficionado:
    - https://timreview.ca/sites/default/files/article_PDF/TIMReview_2021_June%20-%203.pdf

### 23 October, 8am PST

Intros & Agenda

- Updates on org-chart insanity
  - unclear who will use what or what governance will look like PLN-wide
- Dennis & Probe report-out on analytics/usage [as announced on github](https://github.com/orgs/DDC-WG/discussions/9)
  - Robin: Any breakdown by type?
    - Not yet, but this would help
    - no preimage, no insight
    - Hector: overwhelming majority (99.999%) is phishing cruft, tho, very automated (google form)
    - submitter of hooli form not tracked on the bbits repo
  - methodology:
    - crawled PRs for almost all the bad bits entries (>99.2%)
    - requested all of them from various places in the network
    - two peerIDs responsible for serving 32% of the delivered CIDs
    - tracing IPs of the major violators
      - héctor - i've seen IPFS installed on some hijacked machines (and bad bits filtering disabled) to serve this stuff for phishing purposes
  - phishing walkthrough, timelines, reporting structure
    - researching phishing by fetching all very-fresh bbits, and filtering at gateway level by content-type sniffing
    - open research question: does legal tag these emails before they get added?
- héctor - IPIP update
  - kubo is going to launch it (CLI adding to bbits list)
    - [docs here](https://github.com/ipfs/kubo/blob/master/docs/content-blocking.md)
  - rainbow (next major version?) will ship with periodic blocklist auto-fetching and caching once bootstrapped
    - [basic docs here](https://github.com/ipfs/rainbow#denylists)

### 25 September, 8am PST

- low attendance, perhaps due to Iceland travel or FIL Dev? perhaps due to juan's constitutional incapacity to properly configure a luma?
- Updates
  - bibliography on private-set intersections
    - https://arxiv.org/abs/2212.13567
    - https://speak.protocol.berlin/protocol-berg/talk/DXHUGH/
  - juan: still no response on relevant party at Brave; trying again, and also reaching out to web3.storage and pinata to gather usecases/reqs in detail out-of-band; will bring relevant party to next meeting if any are forthcoming
    - getting connected to Matt Obers from pinata this week by an old friend/colleague there
    - asking kyle for brave contact
    - asking bengo for the web3.storage person-to-talk-to
  - mario: ipfs.io happy with current IPIP, dogfooding happily, don't really need tags/annotations and waiting on detailed use-cases to implement that aspect

### 7 August 2023, 8am PST

- Updates:
  - CF outreach continues
    + Boris: attempted to reach out to Cloudflare research, but they don't seem to be active any more
    + silence on the line from CF team? Mario: haven't heard in a while; Boris: My contacts there not responding over email
    + research<>product - moderation may be borne by product teams once there's a stable allocation there... will continue researching
    + Wildebeest was from DevRel and the community was mean
  - IPFS stewards know that Hector's IPIP is in flux and not to ratify it prematurely
    - November = worst-case deadline for at least a reqs document (aligned with bad bits), ideally also a prototype...
- hashing best practices (double-hashed stuff in partic)
  - brooklyn might have thoughts?
  - boris: metaquestion about public versus private blocklists (i.e. set membership checks or better yet "private set intersection")
  - robin: hector's draft assumed on-disk local list, which might not be quite private enough or performant enough for huge lists
  - boris: IPFS codebases that might need it: kubo, ipfs-desktop, boxo, Brave etc. have diff needs
    - for example, shouldn't ipfs-desktop and Brave have a default no-double-hashed blocker to keep toxic CIDs off my hard disk?
- next steps and perfect versus good: NO-PFS should still get built out even if usecases and layers not yet certain
  - different usecases, possible diff layers:
    - bad bits
      - can't be public to work well
    - public server blocklist
      - need to be public to work well
      - (for example the gabiverse that much of fediverse longtail uses for lack of moderation infra)
      - cleartext metadata, composability/mergeability, etc
      - trust/reputation - per-author reputation good enough to know which to merge, trust whom your trustlists trusts, etc
      - each list author can be unmerged later?
    - usernames and text blocklist (i.e. decent social systems flagging slurs in usernames)
      - native text
    - "content" in web sense -> "object storage" (S3 and R2) --> opaque but automatic flagging (see [recent blog post](https://socialweb.coop/blog/firefish-cloudflare-quickfix-r2-tutorial/))
      - what's web3.storage doing? how's all their HTTP-based uploading going
  - cameron: no-pfs --> kubo and/or boxo? prototyping in the meantime?
    - boris: in practice, IPFS Stewards are mostly kubo stewards, so getting other codebases involved always helps
    - cameron: saturn's `bifrost gateway` [sic] based on boxo might also have requirements
    - boris: fission is running nopfs and experimenting as an operator
    - boris: ThirdWeb is worth talking to as well, web3 SDK provider that gives web3 devs an onramp to web3storage
    - boris: running mini-network or sub-net that wants to federate
    - cameron: eric from livepeer has a similar use-case for video checking
  - Boris: we should zoom in on the distribution
    - this is baked into release pipelines, right? where is the release pipeline getting the latest list to embed in binary? how is that kept up to date? (currently very manual and unfortunately somewhat human-intensive process)
    - documenting that would be good
    - each upload point should know where to send people and document it for their users
  - Boris: possible federation/cost-sharing regime: blocklists of, say, bad actors or takedowns worth sharing
    - To be precise, the usecase of "blobs by suspected bad actors" they would be willing to share reciprocally with counterparties of ___ level of trust (appropriate to hashing regime/mechanism)
    - higher trust MAY need consortia and governance;
    - jurisdictional enforcement requires complexity (Robin: Lists of lists helpful here)
  - Nopfs++ framing: NOPFS for some things, here are requirements for more complex things?
    - ![](https://hackmd.io/_uploads/HJBiBcRo3.png)
- **ACTION ITEMS**
  - Juan will ask Kyle for a contact on the IPFS-in-brave team
  - Robin: Hackmd of blocklist requirements should be iterated and put into github to get us closer to a reqs spec against which to evaluate prototypes and experiences
  - Boris will keep digging for CF people currently working on current stuff and talk to David Choi from Web3.sto to get their eyes on that draft reqs doc

### 10 July 2023, 8am PST

- In Att: Juan, Robin, Cameron Wood and Mario Camou (ResCorps working on bifrost, i.e. current badbits infra), Michel (Telus, Carleton College), Prashant (Shovel Co)
- Agenda
  - Cameron's and Mario's idea to generate substantial blocklists for sharing
    - bifrost <> ipfs.io gateway
    - doc dump coming soon (link posted to https://github.com/DDC-WG )
    - no-pfs into kubo & boxo (recent commitment from the stewards)
    - boris: "quick build something" could hold back effective governance; quick glance at bsky, nostr, and activitypub domains...
    - boris: iroh (rust) ipfs server
      - note: iroh has subnetworks/peer discovery templates out of the box which might help with blocklist discovery
    - context: ideation stage, bad bits is taking a lot of our team's capacity
    - scanning at different points:
      - DHT
      - at gateways
        - security headers in requests --> policy violations (XORS, malware, etc) -> flag
        - union of "recently-accessed" and CSP violations --> natural site for extra scrutiny (if not outright automation)
    - if usecase is spam/phishing abuse, creates huge corpus of data... who triages/confirms/etc?
      - cam: assumption has been automated low-certainty lists, with retroactive scrutiny to arrive at high-certainty lists
      - robin: distribute the work?
  - cam: Hector's IPIP about blocklist annotations
    - robin: IPFS architecture issues (path hosting versus subdomain hosting going forward, etc)
  - [Robin's draft of a governance doc for blocklist sharing](https://hackmd.io/_QVeXEeDSVaJNEpo69GdXw?view)
    - annotations IPIPs --> assume one list for all the land
    - Robin's proposal --> each blocklist-generating actor produces MULTIPLE lists, operators curate and merge
      - argument for segregation
        - some "SUPERLISTS" will be co-governed by law enforcement, copyright associations, and/or civil society
        - in this case in particular, single-topic lists would be useful
  - getting from hector's NOPFS to all IPFS operators being safe
    - robin: the current IPIP spec isn't quite interoperability-guaranteed across implementations and languages
      - hint underspecified now
      - possible ideas: ipfs versus non-ipfs content-address versus multiaddr/ipns of bad actor versus bsky handle/DID to avoid...
    - architectural efficiency: superlists versus local lists?
      - tiers of IPFS-wide, clear-text coarse, clear-text fine/local?
    - boris: [activitypub precedent](https://writer.oliphant.social/oliphant/the-oliphant-social-blocklist)
    - boris: threads federating also makes this more relevant/topical...
      - federation means ipfs-based AP servers might soon be pumping threads-uploaded video content to IPFS :grimacing:
  - architecture/network diagram doodling?
  - tagging discussion?
- Next steps?
  - architecture diagram
    - **ACTION: Juan will blast out a tendentious draft of a diagram to the list**
  - outreach to different gateways and kinds of gateways
    - boris: CloudFlare currently has one instance and is thus wholly responsible for this - would rather enable multiple operators to self-govern
      - **Action: Boris will reach out to CF folks**
    - boris: elastic-ipfs?
      - cam: used to contribute more to badbits and is now taking a more "guilty-by-association" approach
    - boris: chatted with AWS recently about an IPFS reference implementation (for turnkey IPFS-on-AWS package/template)
      - AWS asked where the moderation stuff is
      - AWS Credentialing/Identity Hierarchy mismatch w/libp2p model;
      - AWS might be tempted to cache CIDs of bad static content... it's infinitely more performant and works in their business model... (CDN/Cloud priority = delivery speed in ms)
    - boris: saturn folks?
      - currently consuming bad bits
    - boris: regional instances will have national-specific lists NOT to be universalized...
  - when share Robin's doc? Robin: whenever this group says yes, bring it to the moderators group
  - how work with Hector? IPFS-Stewards might have opinions?
    - Cam: Hector on leave until October! Stewards sounds like next step
    - **ACTION**: Robin to bring blocklist questions/iterations to the IPFS Stewards and propose updates to Hector's proposal
- Links
  - [juan spitballin' on fil Slack](https://filecoinproject.slack.com/archives/C03TUQZ48R1/p1688458235734319?thread_ts=1687941179.333939&cid=C03TUQZ48R1)
  - @bmann Related: tiered lists from Fediverse/Mastodon https://writer.oliphant.social/oliphant/the-oliphant-social-blocklist

### 5 June 2023, 8am PST

- In Att: Robin, Ian, Boris, Jim
- Agenda
  - Updates:
    - Hyperledger / IBM folks don't put data on the ledger and recommend not doing it so they don't have the same problems we have.
    - Alex Feerst (trust & safety lawyer), Murmuration Labs, works with FF already and is interested in maybe participating in the future
    - Someone from Ofcom was at IPFS Thing
      - Most important thing he said: he's at Ofcom, and is happy to help us review and discuss, as well as potential collaborations
      - primary concern is that we don't make lists of CSAM content for blocking purposes
        - That is being taken care of by double hashing, so the hashes can't be used to find the content
      - they work with various non-profits who work with various issues, and are willing to put us in touch with various folks who may have issues
    - FFDW is running a private access of public data working group with activist orgs
      - Quiet takes the approach of having a private network
      - Ian will report back from this working group to DDC as insights and approaches researched there become relevant
      - Network diagrams like Quiet's and Bluesky's help us reason more about how Decentralized Data Compliance would work
      - Whatever format, tooling, governance structure we come up with for each of these use cases will likely look very similar, and (ideally) interoperate
        - for some things you want appeals process
        - want to override a list
        - so much overlap they are very likely the same thing
      - for social networks, especially the architecture of BlueSky, people will upload things that you definitely don't want to be federating, and you don't want to be indexing some of the user-generated content
    - The problem space of blocklist management might be of interest to this group
      - a bunch of this is just having a schelling point with people where folks can join from all various sub-groups (Nostr, BlueSky, etc)
      - Blocklist is most relevant for IPFS / Filecoin, content labeling is more relevant for social networks, but the two problems are inter-related.
      - **ACTION ITEM: Get Rabble to give a talk on his Nostr moderation proposal so the ecosystem understands the broad content-labeling approach?**
        - Maybe involve someone from ActivityPub as well?
      - **ACTION ITEM: Ian to figure out if Filecoin is involved in badbits, or has an appropriate solution**
  - Encryption:
    - Encryption alone doesn't solve the problem, especially for large filesets
    - Blocklist piece can focus on ingestion
    - If it's encrypted data, the scope stops there
  - Blocklists is great as a focus:
    - How do we do the following:
      - Governance
        - provenance for each list
        - who is in charge?
        - who is collaborating on it?
        - you need to know the source, even if it's multiple networks collaborating
      - Appeals and Overrides
        - if something shouldn't be blocked, you should be able to appeal generally, and be routed to the right lists, and the right managers of those lists
        - you want the ability for a given provider to override any given list
        - if we can document this process, we can figure out something usable
      - Sharing (metadata and distribution)
        - Don't keep cleartext lists of bad stuff
        - Stanford Internet Observatory report came out today -> Twitter's CSAM moderation broke
        - Double Hashing to support badbits:
          - They hash the CIDs of the objectionable content
          - If you have the CID of objectionable content, you can check for membership easily
          - However, you cannot brute force the list to generate a list of CIDs to ask the network for
        - Horrors from the email / SPAM days
          - servers can get blacklisted with no clarity into how to reverse the choice
      - cross-network collaboration
        - Handful of systems (twitter-like social networks, chat)
        - high volume firehoses, going to have bridges between them in various ways
        - can be downloaded from system A, and uploaded to system B
        - Need to provide a meeting point for the various folks already working on this.
        - Content-addressing
  - DWeb Camp -- what conversations to have:
    - shared purpose around blocklists
    - get time booked with Rabble to describe his content-labeling approach
    - get feedback on whether we think working cross-network on things around blocklists is useful
    - Is anyone from Internet Computer Protocol going to be there?
    - Throw up a flag, signal that we're working on this openly, come join us if you're interested
  - Stanford is doing a Trust & Safety research conference the end of September
    - https://cyber.fsi.stanford.edu/news/trust-safety-research-conference-announced-september-28-29-2023
  - Bluesky
    - Blocklists are public in their current architecture -- maybe they want to be involved to figure out how to solve that issue?
    - Mutes are private, and they have shared mutelists in the system
    - No clear appeals process
    - Could have them talk about it technically, but they are iterating very fast and their solution is mostly quick patches rather than a fully designed solution from scratch
  - A lot of Mastodon operators talk about their approach to blocklists, but it's very technocratic
  - With blocklists as a focus, we can tug on that thread and get lots of insight from how other groups (like Mastodon, AP, BlueSky, Nostr, etc) are planning to handle this problem
  - The train wrecks are already happening, they are just very small right now
    - we should get more folks involved to make sure they don't become big trainwrecks too quickly as the network scales
  - Hesitant to add Spam to the scope
    - Mastodon and all open-inbox systems are very vulnerable to Spam
  - We're not focused on distributed reputation of actors (yet)
    - our blocklists are focused on blocklists of content first
    - adjacent to that is actors responsible for non-compliant content
    - badbits are focused on content, social networks are focused on actors
    - ideally there is even muting of specific content from a specific actor, eventually
- Summary notes:
  - Actively recruit folks for talks and presentations
  - Blocklists are the first step
  - Let's bug people for architecture diagrams for each of the various networks involved so we can understand similarities
  - **ACTION ITEM: Robin to write <blink>mini blurb</blink> update for this working group**

### 1 May 2023, 8am PST

- In Att: Robin, Jim, Boris, Ian
- Agenda
  - Intros
    - Jim Collinson - Chief Strategy officer for [MaidSafe](https://maidsafe.net/), builders of [Safe Network](https://safenetwork.tech/)
      - decentralized data network with similar aims to IPFS / Filecoin
      - Approaching mainnet launch
      - all the topics in this group are relevant to MaidSafe
      - Robin: It would be useful if we could have a similar system diagram to the ones Boris / Ian are working on for MaidSafe as well
  - Decentralized Social moderation
    - Boris was hoping we could constrain it to file uploads, but the problem is related to larger regulatory issues in the decentralized social space as well
      - we are starting with files and data, and that is very widely applicable
      - moderation of social networks is out of scope for now (at least, we are going to try)
    - Robin: but illegal and moderated speech is also partially a concern of data storage networks
    - The best that we can do to explain these issues to regulators is by analogies
      - There is a lack of agency by many users, including elected officials, because folks don't know how current systems work including issues
      - The protocols vs platforms distinction is often lost on non-technical users, including regulators.
      - Much of the criticism leveled against BlueSky is because they have not solved things like moderation
      - Many folks seem to just want a Benevolent Dictator style corporation that moderates things in one way
      - A question people sometimes ask is "Why can't we just stop Nazis from using WordPress the software?"
    - Boris has set up a Canadian cooperative organization to help with solving some regulatory issues
      - Corporate entities in certain companies usually have their own TOS, and if you break that TOS you are out
      - Below that is local regulations of the country that company is based in
        - complicated by things like a Canadian company hosting servers in US, which means they are often subject to both jurisdictions
    - Robin: This is not a solved problem. The MetaGov folks are trying to figure out how to build a governance layer for the internet.
      - The problem is that we needed this 20 years ago, but in many ways it's still a research project
      - How can we apply the lessons of that research fast enough to provide something useful in the short term
      - Jim: and where do we start?
      - Robin: we have 2500+ years of experience with governance issues, we just haven't figured out how those rules effectively translate to the digital sphere
      - Robin: we get the rough and ready governance system ready for some good first issues (like blocklists) to start applying these lessons
  - Boris: This group started as "wouldn't it be great if we could just have a shared terms of service?"
    - Fission runs shared servers and services that users interact with, and needs to understand their requirements in various regulatory envs
    - Our goals
      - 1. Shared ToS
      - 2. Does end to end encryption satisfy GDPR / PII requirements?
      - 3. Shared Blocklists -- it's extremely dangerous to share those lists in the clear
        - It's illegal to share lists of CSAM or maps to find illegal content publicly
        - A list of bad stuff under a certain regulatory regime might not be illegal under other regimes
        - Sharing a list of politically objectionable content could allow folks to easily track which nodes were sharing or requesting that content, which could provoke retaliation or consequences
        - IPFS is in no way safe for activist usage out of the box
    - Jim: This is a slightly different problem than we are trying to solve with the Safe Network
  - Link to the canonical IPFS bad bits list: https://badbits.dwebops.pub/
    - Pinata, Fission, Protocol Labs, and others share this list
    - People upload copyrighted content, and the various folks who run gateways get links to copyrighted content
      - Everyone gets the takedown request
      - Not everyone even stores the files, sometimes they are just cached by a gateway
      - Some providers only serve content uploaded by their own users
        - Other providers who don't need a shared list of DMCA takedown content
    - Depending on the jurisdiction of the servers, different organizations have different requirements
      - The rules for bad bits lists vary by jurisdiction, for example Germany includes much stronger requirements around sharing swastikas and other national socialist party content or hate speech
    - The idea is to make the list composable and easily maintainable
      - Maybe for simplicity of discussion, we could use a stupid example discussing cats and dogs and different jurisdictions
      - probably want yes-lists to override a given entry in a badlist
      - so you want them both composable and applied in a specific order
    - List considerations (action item to write this up):
      - provenance (where did the list come from, how trustworthy is it, how was it made)
      - freshness (how do I make sure it's up to date)
      - appeals (how is it governed? what if I disagree with my inclusion in the list)
    - Jim: transparency overall in the process is key
      - especially when we are promoting networks as censorship resistant
    - Boris: that is not a concern for everyone. As a service provider, I care more about clear communication with my users
      - and making sure that it's clear there are no internet backdoors in their process
      - Fission takes much stronger attitudes towards specific bad actors in the network
      - censorship resistance can sometimes be a dogwhistle for bad actors
    - Video shared: what happens if folks share nuclear plans on the network?
      - https://www.youtube.com/live/Q14iLLkAImU?t=3870
  - From Robin's slides about Community and Governance track at IPFS Thing:
    - Discussion at IPFS Thing: Robin Berjon (PL), Ian Davis (FF), Marjorie Ninno (Fractal), Boris Mann (Fission), Kurt Opsahl (FF)
    - Short term:
      - Marjorie: develop an overview of the issues (privacy, CSAM, copyright…)
      - Boris, Ian: **Diagrams of IPFS architecture, Filecoin architecture, and encryption approaches**
      - Robin: Reach out to IBM & support Marjorie in the overview
    - Who else is interested? Banyan, Private data on Filecoin, IPFS operators, PLN startups, other protocols?
  - Nostr proposal for decentralized content moderation
    - https://s3x.social/nostr-content-moderation
    - What do we think?
      - Boris: this discussion is happening across multiple networks
        - already had feedback from people partially affected by this who were not happy that they were reduced to only a couple of labels, and with the words used in those labels
        - These issues reach all the way up to directly user-facing issues in the social networking ecosystem
        - It's not a problem that can be contained to the service provider level
      - Ian: subtle differences in how the lists are applied, but in many cases the same mechanisms can be used for user moderation systems and banlists at the provider level
      - Boris: related to user transparency and education
        - a non-trivial number of software developers had to be bad learners of copyright law in order to understand open source licenses in the early 2000s
  - Ian: What is the difference between the Terms of Service and privacy policy?
    - Robin: they are both notices you must present on your website
      - Usually privacy policy is broken out, because many regulatory regimes require it to be broken out separately
      - it is the part of your terms of service that corresponds to how you handle personal and private data.
    - Boris: a terms of service is often from the entity itself as a legal protection to clearly indicate what a user agrees to as part of using the service
      - TOS can be used to protect against accusations that you are denying service for certain illegal or discriminatory reasons
      - whereas a privacy policy is often mandated by certain local jurisdictions
  - Boris: the main ask is to share resources in the channel to discuss, and add to the main wiki notes page as related resources
    - Fediverse blocklist: https://github.com/rapidblock-org/
  - Working Group Working Group
    - Robin has started a new, lightweight group to coordinate which working groups are currently running and list them in a single place
      - https://github.com/working-group-wg/
    - May eventually include shared list of best practices
    - Example of what kinds of resources and recommendations the WGWG might recommend https://github.com/CommunitySpecification/Community_Specification
  - What needs to be in the system diagrams that Ian / Boris and others are working on?
    - Boris first draft: https://www.tldraw.com/r/v2_c_5kWfXqGQNSEKG5bD3rFn-?viewport=0%2C-111%2C1619%2C1166&page=page%3Am7oHcAaNUVH0PZ_w2YWYv

### 3 Apr 2023, 8am PST

- Agenda
  - Chair while Juan is out?
    - Ian will chair next, Juan will be back after that
  - Community and Governance Track at IPFS Thing:
    - Opening talk (11-12): overview of all layers of IPFS governance, flag issues (will be recorded); unstructured time --> unconference
      - this group could propose a more specific session?
      - also happy to signal open topics anyone wants driven-by
    - Brainstormed a bit into [big questions section](https://hackmd.io/kL-J_4eOQj6Q1wS1T2JGZw?both#Big-Questions)
      - ![](https://hackmd.io/_uploads/B11RL_OZh.png)
    - 12-16, minus lunch = unconference slots + DDC session near the end of the slots, maybe?
    - Also at Thing: Compact Format and [IPIP298](https://github.com/ipfs/specs/pull/340)
      - See also Hector San Juan's [IPIP383](https://github.com/ipfs/specs/pull/383) and [working prototype](https://github.com/hsanjuan/nopfs) for sideloading into kubo
      - See also Hector's [Design doc](https://www.notion.so/protocollabs/Content-blocking-on-IPFS-and-Kubo-3721f6fcc8044ba9acebd7356a60c4b7)
  - Filecoin Fdtn: Do we have a stopgap or for-now solution to GDPR?
    - Robin: [Euro Cloud CoC](https://eucoc.cloud/en/home) - self-regulation guidelines, might be applicable to Filecoin services?
    - Ian: Meeting with DPAs and regulators in coming weeks on behalf of Filecoin Foundation?
    - Ian: Steer away from PII use, cultural archives only
    - Robin: Non-custodial and disposable privkey narrative (end-user only data controller)
      - caveat - demonstrable/provable deletion of privkey? quality of key mgmt is a factor
      - Marjorie - forgetting !== deletion (good-faith/reasonable effort to remove all access to data is usually enough BUT PER USE-CASE, not universal precedent)
    - Ian: Belgium has a lot of questions re: IPFS <> Solid-based citizen wallet/vault
      - extremely sensitive data/authoritative docs in that use-case...
      - Hybrid solution - Solid as identity/AuthZ layer, keep keys on that layer, IPFS just as logical storage under...
      - news flash - [that project is using MSFT azure/entra now](https://www.belganewsagency.eu/government-of-flanders-to-partner-with-microsoft-to-develop-data-vaults), with a tbd Solid integration??
      - Ian: Univ of Ghent is a powerhouse for Solid work

### 6 Mar 2023, 8am PST - Small Group Sync

- In Att: Robin, Juan, Boris, Ian
- Agenda
  - Intros
    - Ian: Filecoin Foundation partner engineer working with grantees; grantees' legal/compliance needs but also needs of PLN/FF
      - personal interest: legal feasibility of/gameplan for personal data stores
  - Shared Notes presi and invitation to edit
  - Thing Roadmap?
    - Get @lidel and @thibmeu in a room to discuss [IPIP298](https://github.com/ipfs/specs/pull/340) and blocklist ideas?
      - [ ] Boris will email them and see if they can host a session on this in governance/standards track and/or operators track
  - Short-term priorities versus Medium-term ones
    - Boris: Harmonizing on blocklists is most urgent tooling problem for both short-term pain and opening up medium-term tooling possibilities
    - Ian: Incremental tooling
  - What tooling is most urgent?
    - 1 scan on upload (refuse to upload) versus 2 replication-filtering (refuse to serve) versus 3 storage liability, served or not (refuse to store) versus 4 DHT-accomplice risk (helping people find it even if you're not serving or storing)
    - Ian: We need to _start_ on #3 or else there's no backup if #1 and #2 fail
    - Filecoin variant:
      - for length of storage deal, data that's legal for creator and pinner might be downloadable to consumers in places where it isn't legal
      - data that is illegal to store might not be removable for the duration of the storage deal unless the storage provider is willing to lose FIL.
    - Boris: hypothesis: estuary might be a place where filtering could happen?
      - Ian: Estuary can choose not to serve...
    - Robin: Is #4 in scope? Is IPNI liable for helping people find bad bits?
      - Ian: DNS precedent would say *mostly* no... Robin: But with exceptions (see counterterrorism policies in France, for ex.)
      - Robin: Caching proxies closer precedent to search on
      - Boris: but what about the rest of a block?
      - Robin: IANAL, we should get a real lawyer in here/on this
    - Boris: service provider <> protocol division of labor rears its head again here
      - Boris: estuary and CID gravity; might this have been part of why infura shut down their ipfs tooling?
      - Ian: web3s might have a lot of input here; Boris: They're using estuary too, but the architecture has some nuances
      - Boris: Pinata
      - Boris: what is kubo's recommendation? where is ipfs desktop?
  - PDS issues
    - BlueSky/AT Protocol/DWN versus Pods/Solid
    - AT may be good to bring to the table when PDS is on the agenda
  - Fill out more sections in shared notes as time allows
    - figuring out the mode for input/shared ownership of the [shared-notes doc](https://hackmd.io/kL-J_4eOQj6Q1wS1T2JGZw?view)

### Kickoff - 5 Feb 2023, 8am PST

- Agenda
  - QUICK intros [1min per co, 1min per individual]
  - Decide short-term goals and, ideally, one deliverable (or more) for Q1/Q2
  - Update Basic Info above
- Intros
  - Juan: lots of experience w/personal data; no agenda, no dayjob (just love PLN)
  - Boris: long experience; founder of Fission, which has had a ToS and GDPR compliance on our backlog since forever
    - Fission in a nutshell: we give users a personal file system, so we want to imagine
  - Marjorie: Head of legal for Fractal.id, based in Porto; Fractal is a KYC/DID provider with web3 use-cases; span custodial and non-custodial/exportable storage of personal data
  - Robin: Work for Protocol Labs, on standards and governance; before that worked at NYT for 5 years on privacy, GDPR and related issues
  - Addie: Network Goods; archaeological association (hypercerts, impact evaluation, other onchain opensource/funding mechanisms); worked on various L1s and FOSS before this
  - Martin (identity.com): Solana onchain DIDs; Filecoin
    - have worked on many projects that get "shut down by GDPR"
    - might be easier to solve on network
  - Daniel (Civic): we build apps on identity.com stuff; we issue VCs to KYC'd users that go in their wallets
- Goals and deliverables
  - Boris: mission statements and scope?
    - GDPR compliance
      - mental models for "deletion"?
    - public/private network
      - Boris: permissioned data WG in Lisbon
        - Addie
        - WinFS
        - DIDs + Caps --> Permissioned data (implementation direction)
        - funding unclear atm, tho, or else already would be a WG
      - Boris: IPFS Operators WG
        - bad bytes list experience
        - hard to figure out geolocating regulatory obligations without "censoring at protocol level"
        - need to do both at both levels
  - Robin: DSA going into force will make this more relevant
    - takedown framework
    - DMCA = just copyright; DSA = broader, content-based criteria, malware, phishing, etc
    - 230 analogue (includes safe harbor clause, like 230)
    - applies differently
    - very web2 definitions/concepts of "deletion", "publication," etc
  - Martin: I want to learn from decentralized storage veterans; my impression is that filecoin core/foundation, PLN core/foundation, and Arweave core/foundation could (and maybe should?) take leadership here, I'm curious what they think/want
    - Boris: I don't think they have a uniform/official position here, they want to stick to lower (protocol) layer(s) and consider compliance an app-level problem
    - Boris: IPFS does not want policy built in at that layer (//S3)
      - Anecdote: NFTs coalescing on IPFS-flavored CIDs, which de facto pumps traffic through IPFS https gateway; we got takedown requests for CIDs that looked like fission-generated/-hosted content because we controlled a domain serving other gateways' CIDs
      - Anecdote: Open relaying --> lots of DNS whackamole
  - Martin: VCs or attestations/reputation for nodes? Boris: That's a diff layer/problem tho
  - Boris: Bad bytes list might be a good federation-layer for agreeing to mutually-enforce shared policies
    - Martin: How compose?
  - Daniel: I'd love to have a mental model for layers and on which layers compliance building blocks live
    - Boris: As long as we don't decide ONE REGULATORY LAYER
  - Robin: Privacy threat model for IPFS
    - ex.: How data leaks? How do encryption keys leak and what happens if they do?
    - ex.: How do GDPR and CCPA differ in defining PII/toxic data?
    - Operators aren't only actors here--
      - ex. running an IPFS node in a browser or on edge device? Don't the CIDs you're propagating thumbprint the node? (YES PROBABLY)
  - Boris: FAQ/wiki-collaboration model
  - Martin: Chip away at the perception of a fundamental **Irreconcilability** of Private data and Public networks
    - threat model could be a discouraging starting point, just maps out the problem space; we should publish a conditional path forward!
    - Robin: Yes, we are breaking into pieces the work of putting forth a legal theory!
    - Boris: MANY protocol and protocol+app projects I could name are waiting and hoping for this precedent/legal theory!
  - Marjorie: I agree; one of my main painpoints when I started researching IPFS was that I didn't understand it or its privacy attack vectors/architectural division of labor, so let's defo start there
    - layering IPFS with other GDPR-conscious protocols and architecture...
    - Geo-fenced nodes? That would also help our design efforts
  - Robin: Industry-specific problems and innovative
    - good CoC articles from EU: Cloud Hosting has one
    - Boris: Code of Conduct for IPFS operators would be great!
      - Cloudflare would likely be interested, given their history
  - Boris: Do we have a good technical definition and/or regulatory best practices for "Tombstoning"?
- Next steps
  - Boris: i'd love to know which laws I should be worried about
  - GH Discussions will be where most magic happens
    - intro yourself as soon as there's a place for doing so!

## Older notes (mostly by ian) are here for some reason: https://hackmd.io/@bumblefudge/SJIS_jQ0o

## Bibliography

Feel free to add to this! Anything you think might be helpful for research, meetings, etc.
Primary Documents

- [Text of DMA](https://eur-lex.europa.eu/legal-content/en/TXT/?uri=COM%3A2020%3A842%3AFIN) (and official translations)
- [Text of GDPR](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A31995L0046)
  - [2016/679 (replaced 95/46/EC)](https://eur-lex.europa.eu/eli/reg/2016/679/oj)
- [Text of DSA](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A32022R2065)

Guidance

- [EU Code of Conduct guidance for Cloud Hosting](https://scope-europe.eu/en/projects/eu-cloud-code-of-conduct/)
- CSAM regs for bad bytes?

Useful Secondary Sources