# image-info rework

## Proposal summary:

1. Split image-info into modules
2. Move image-info code into osbuild repository and reuse existing code

## Background

There are currently a few big reasons that argue for re-working image-info:

1) `image-info` is getting too big to be easily maintainable
2) `image-info` is duplicating code and logic that is already in osbuild
3) information collected by `image-info` is separated from the osbuild stages that generate it, and thus often incomplete

### Details

1) This is quite self-explanatory. One big Python file that does everything cannot easily be maintained.
2) `image-info` basically has two distinct tasks:
   1) Format: Detect and report facts about the artefact itself, e.g. if the artefact is a qcow2, it reports the format, disk size and partition layout; if it is an ostree commit, it reports the commit checksum.
   2) Content: Depending on the format detected in 1), it introspects the content and reports facts about the content itself, e.g. installed rpms, users, enabled systemd services.

   In order to do 2), i.e. inspect the content, the image must be "opened". This means different things depending on the image type, e.g. a qcow2 must be converted, the disk attached to a loopback device and the partitions mounted. This step is very similar (in some parts identical) to the setup that osbuild has to do when it creates the image, i.e. the copy stage that transfers the content to the image needs to set up the partitions, LUKS and LVM2 devices as well. Thus, image-info could use the existing osbuild functionality to do exactly that.
3) Currently, the stages that modify the image and the code that creates the corresponding image-info facts are separated from each other, and are even located in different repositories.
   It would make a lot of sense to co-locate the `query` (or `facts`) modules next to the `stages` in osbuild, so that every time a `stage` is added, the corresponding `query` module is added too. Later, when new options are added to the `stage`, the corresponding readout is added to the `query` module. This would have three big benefits:

   1) the `query` module can be used in the unit tests of the corresponding `stage`
   2) image-info would not lag behind osbuild for new stages and stage options
   3) the new `query` modules could be used to collect facts at the end of a pipeline (see "pipeline facts" below)

### image-info

The main job of `image-info` in that new model would be to detect the artefact type and infer the necessary information to feed into osbuild `devices` and `mounts` to set up the tree. It would then run all the query modules on the image, collect the facts and report them.

### pipeline facts

Introduce a new pipeline-level block called `query` (or `facts`, or something similar) that contains a table of `query` modules with options. These modules are executed once the pipeline is fully built and can thus be used to gather information about the built artefact, such as the installed packages. This can replace the usage of `metadata` and gives more power to the caller, who is now in control of what information about the artefact is returned.
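
To make the idea of a table of `query` modules with options more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names (`query_users`, `query_os_release`, `run_queries`), the `QUERY_MODULES` registry, the `min_uid` option and the shape of the returned facts are hypothetical and are not existing osbuild API. The sketch also assumes the tree is already mounted at a directory, which in the proposal would be done via osbuild `devices` and `mounts`.

```python
import json
import os
import tempfile

# Hypothetical query modules: each takes the path of the opened tree plus
# its options and returns JSON-serialisable facts. These names are
# illustrative only and do not exist in osbuild.


def query_users(tree, options):
    """Report user names from etc/passwd, filtered by a minimum UID."""
    min_uid = options.get("min_uid", 0)
    users = []
    with open(os.path.join(tree, "etc/passwd"), encoding="utf-8") as f:
        for line in f:
            fields = line.strip().split(":")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            if int(fields[2]) >= min_uid:
                users.append(fields[0])
    return sorted(users)


def query_os_release(tree, options):
    """Report the KEY=value pairs found in etc/os-release."""
    facts = {}
    with open(os.path.join(tree, "etc/os-release"), encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            facts[key] = value.strip('"')
    return facts


# Assumed registry mapping query names to their implementations.
QUERY_MODULES = {
    "users": query_users,
    "os-release": query_os_release,
}


def run_queries(tree, queries):
    """Run each requested query module on the tree and collect the facts."""
    facts = {}
    for name, options in queries.items():
        facts[name] = QUERY_MODULES[name](tree, options)
    return facts


def demo():
    """Build a tiny fake tree and run a hypothetical `query` table on it."""
    with tempfile.TemporaryDirectory() as tree:
        os.makedirs(os.path.join(tree, "etc"))
        with open(os.path.join(tree, "etc/passwd"), "w", encoding="utf-8") as f:
            f.write("root:x:0:0:root:/root:/bin/bash\n")
            f.write("alice:x:1000:1000::/home/alice:/bin/bash\n")
        with open(os.path.join(tree, "etc/os-release"), "w", encoding="utf-8") as f:
            f.write('ID=fedora\nVERSION_ID="38"\n')
        # The caller decides which facts to collect, and with which options.
        queries = {"users": {"min_uid": 1000}, "os-release": {}}
        return run_queries(tree, queries)


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

In this sketch the `queries` dictionary plays the role of the proposed pipeline-level `query` block: the caller picks the modules and their options, and only those facts are returned. The same `run_queries` helper could in principle be reused by `image-info` after it has opened an existing image, and by a stage's unit tests to check the result of that stage.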