Some thoughts from our side. I would like a somewhat more concrete picture of which lints make sense, and how we are going to offer and test them.
The value of en/decryption for a linter is not entirely clear to me, but maybe the point here is that you also just want a complete zip parser?
Same for streaming support: for a linter that doesn't seem very relevant to me? But it is a fundamental property of zip, so from that point of view you should support it.
This list is organized by content rather than by how it could conveniently be split up for NLnet funding.
---
## 1. basic validation of local file headers with the central directory
The [rc-zip](https://github.com/fasterthanlime/rc-zip) rust library exposes all of the parsing logic. It looks fairly complete (e.g. it includes the extra fields) and we can easily add (and potentially upstream) missing parsers if we hit any gaps. This library already has support for many compression algorithms (deflate, zstd, lzma, bzip2).
Using an existing parser greatly simplifies the project, because we can focus more on our actual tool and use case: validating the information that we've parsed.
- we use [rc-zip](https://github.com/fasterthanlime/rc-zip) for parsing
- we assume all input is available (no streaming)
- concrete checks that we want to support at minimum:
  * does the file name in the local file header match the one in the central directory
  * ???
- the checks can be combined into one linter, but are also exposed individually
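
To make the file name check and the "individual checks plus one combined linter" idea a bit more tangible, here is a minimal sketch. It is an assumption-laden illustration, not the proposed implementation: it uses Python's standard `zipfile` module as a stand-in for `rc-zip`, it assumes the whole archive is in memory, and the names `read_local_file_name`, `check_file_names_match`, `ALL_CHECKS` and `lint` are made up for this example.

```python
import io
import struct
import zipfile

LOCAL_HEADER_SIGNATURE = b"PK\x03\x04"
LOCAL_HEADER_FIXED_SIZE = 30  # bytes before the variable-length file name
FLAG_UTF8_NAMES = 0x0800      # general purpose bit 11: names are UTF-8


def read_local_file_name(data: bytes, header_offset: int) -> bytes:
    """Return the raw file name stored in the local file header."""
    fixed = data[header_offset:header_offset + LOCAL_HEADER_FIXED_SIZE]
    if len(fixed) < LOCAL_HEADER_FIXED_SIZE or fixed[:4] != LOCAL_HEADER_SIGNATURE:
        raise ValueError(f"no local file header at offset {header_offset}")
    # The file name length is a little-endian u16 at byte 26 of the header.
    (name_length,) = struct.unpack_from("<H", fixed, 26)
    start = header_offset + LOCAL_HEADER_FIXED_SIZE
    return data[start:start + name_length]


def check_file_names_match(data: bytes) -> list[str]:
    """Individual check: local header name equals central directory name."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Simplification: decode the local name using the encoding flag
            # from the central directory entry.
            encoding = "utf-8" if info.flag_bits & FLAG_UTF8_NAMES else "cp437"
            raw_name = read_local_file_name(data, info.header_offset)
            local_name = raw_name.decode(encoding, errors="replace")
            if local_name != info.orig_filename:
                problems.append(
                    f"name mismatch: central directory has "
                    f"{info.orig_filename!r}, local header has {local_name!r}"
                )
    return problems


# Hypothetical registry: every check is callable on its own, and `lint`
# simply runs all of them.
ALL_CHECKS = [check_file_names_match]


def lint(data: bytes) -> list[str]:
    """Combined linter: run every registered check and collect the findings."""
    findings = []
    for check in ALL_CHECKS:
        findings.extend(check(data))
    return findings
```

Each check takes the raw archive bytes and returns a list of human-readable findings, so a caller can run `check_file_names_match(data)` on its own or `lint(data)` for everything. Whether findings should be strings or a structured type is an open design question.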
### deliverables
- a rust crate (library) that can parse a zip file, exposes its representation of the zip file, and contains various validations of the zip file structure
- a python library, built on top of the rust crate, that can parse a zip file and then run validation functions on it.
In the rest of this document it is assumed that any logic exposed by the rust crate is also accessible via the python interface.
## 2. expose "extra fields"
`rc-zip` is able to parse these fields: https://docs.rs/rc-zip/4.0.0/rc_zip/parse/enum.ExtraField.html. We can manually parse others if that turns out to be useful.
### deliverables
- add functionality to access these fields
- export the extra fields to a structured data format (e.g. json)
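
As a rough sketch of what the structured export could look like: the extra field area of an entry is a sequence of records, each consisting of a 2-byte header ID, a 2-byte data size and the payload (all little-endian). The example below walks that area and dumps it as JSON. The function names, the JSON layout and the short names in `KNOWN_HEADER_IDS` are assumptions made for illustration; in the real tool the parsing would come from `rc-zip` rather than from this hand-written loop.

```python
import json
import struct

# A few well-known header IDs; anything else is reported as "unknown".
KNOWN_HEADER_IDS = {
    0x0001: "zip64",
    0x000A: "ntfs",
    0x5455: "extended_timestamp",
    0x7875: "unix_uid_gid",
}


def parse_extra_fields(extra: bytes) -> list[dict]:
    """Split a raw extra field area into (header id, payload) records."""
    fields = []
    offset = 0
    while offset + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, offset)
        offset += 4
        payload = extra[offset:offset + size]
        if len(payload) < size:
            raise ValueError("extra field payload is truncated")
        fields.append({
            "header_id": f"0x{header_id:04x}",
            "name": KNOWN_HEADER_IDS.get(header_id, "unknown"),
            "data_hex": payload.hex(),
        })
        offset += size
    if offset != len(extra):
        raise ValueError("trailing bytes after the last extra field")
    return fields


def extra_fields_to_json(extra: bytes) -> str:
    """Serialize the parsed extra fields as a JSON array."""
    return json.dumps(parse_extra_fields(extra), indent=2)
```

Keeping the payload as hex means unknown fields survive the export unchanged; fields that we do understand could get a decoded representation next to the raw bytes.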
## 3. encryption (or decryption really)
`rc-zip` does not mention encryption, so I'd guess it doesn't support it? Can you provide (commands to generate) test files so we can validate that assumption?
Also, for our linting purposes: how useful is decryption support? Our linter cannot in general decrypt every (component of a) zip, so it has to do its job on the encrypted representation, right?
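
To illustrate what working on the encrypted representation could mean: whether an entry is encrypted is recorded in its general purpose bit flag (bit 0 for encryption, bit 6 for strong encryption), and WinZip-style AES entries use compression method 99, so a linter can classify entries without a password. The sketch below is only an assumption about how such a check might look; it uses Python's standard `zipfile` module instead of `rc-zip`, and `encryption_kind` and `encryption_report` are hypothetical names.

```python
import io
import zipfile

FLAG_ENCRYPTED = 0x0001          # general purpose bit 0: entry is encrypted
FLAG_STRONG_ENCRYPTION = 0x0040  # general purpose bit 6: strong encryption
METHOD_WINZIP_AES = 99           # compression method used by WinZip AES


def encryption_kind(info: zipfile.ZipInfo) -> str:
    """Classify an entry's encryption from its metadata, without decrypting."""
    if not info.flag_bits & FLAG_ENCRYPTED:
        return "none"
    if info.compress_type == METHOD_WINZIP_AES:
        return "winzip-aes"
    if info.flag_bits & FLAG_STRONG_ENCRYPTION:
        return "strong"
    return "zipcrypto"


def encryption_report(data: bytes) -> dict[str, str]:
    """Map every entry name in the archive to its encryption kind."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {info.filename: encryption_kind(info) for info in archive.infolist()}
```

A lint built on this could, for example, flag archives that mix encrypted and unencrypted entries. It does not cover archives where the central directory itself is encrypted.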
### deliverables
???
## 4. linting
re. "expanding the linter for files not necessarily complying with the ZIP standard", what does that mean exactly? The parser makes certain assumptions about the validity of the file, so what are we linting here exactly?
### deliverables
- lint for x
- lint for y
...