# Missing/allowed checksums problems

PR https://github.com/pulp/pulpcore/pull/1008#discussion_r520710746

Topics

1. Control allowed checksums
    * Check enabled in settings for null checksums
    * Have artifacts checked at pre-save time against allowed checksums instead of in `__init__`
        * https://pulp.plan.io/issues/7696
    * init_validate takes care of calculating checksums; however, we cannot force plugin writers to use it
2. Populate missing checksums, if any
    * Provide a pulpcore-manager command. It will calculate checksums based on the file.
    * If the file is missing, possibly incorporate the repair feature. We cannot run the repair feature on its own because the checksum check won't let pulp start.
    * The repair feature still will not be able to fix artifacts that were uploaded and are missing their file
3. Fix corrupted/missing files
    A) Sync case - Repair feature
    * The repair endpoint fixes only artifacts that have RemoteArtifacts

    B) Upload case
    * Will allow re-uploading a missing file https://pulp.plan.io/issues/7791
    * What about a corrupted file? It needs to be replaced with a new, correct one

==================================
Meeting November 12
==================================

## Outcomes:

populate-missing-checksums command:

1. Keep the check enabled in the settings
2. Set the Null checksums that are in allowed_checksums to '' (in case of a missing file)
    * Some checksums are unique
    * Only sha384 and sha512 are unique; the user would need to have an empty file and both of those checksums missing to get into an unsolvable state ---> reject disallowing those two checksums along with sha256?
3. Set the checksums that are not in allowed_checksums to null
4. Question: incorporate repair endpoint functionality into the populate checksums command?

## Improvements for repair endpoint

1. In case the file is not re-downloadable anymore, remove the file entirely so that a corrupted file is not served; still issue a warning 'artifact unrepairable'
    * AI: open an issue

=======================================
Meeting November 13
=======================================

1. Do not set missing checksums to ''
2. Separate the command into 2 phases
3. First phase:
    1. Incorporate repair endpoint functionality (will re-download files from RemoteArtifact)
    2. Populate missing checksums; on error, continue and collect the missing files
4. Second phase:
    1. This phase starts only if the first phase succeeds: set to Null the checksums that are not in the allowed_checksums list
5. Write docs --> mention that it is good to have a backup before adding new checksums

AI: ipanova to schedule a meeting next week to discuss upload workflows and how to fix them

===================
Meeting December 9
===================

Artifacts re-upload recovery workflows. https://pulp.plan.io/issues/7791

Repair feature should remove unrepairable artifacts https://pulp.plan.io/issues/7835

Problem statements:

1. If a file is missing, it is impossible to upload a new one
    * When saving the artifact, add a try/except and look for an existing one
    * Verify whether storage_path is an existing location; if not, update it with the newly uploaded bits
    * Don't issue a 400 due to a duplicated artifact, but return the href of the existing one.
2. If a file is corrupted, it is impossible to re-upload and replace it with a valid one
    * Running repair can find corrupted files. It should remove the corrupted file to get back to the case outlined in 1.
    * Repair can be run against a specific repo version; the functionality could potentially be extended to repair a specific artifact/content
    * We could recalculate the checksum on all upload attempts.
        * Might be a lot of overhead for a rare failure

###### tags: `FIPS`
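
Sketch of the two-phase command agreed on in the November 13 meeting, for discussion only. This is not pulpcore code: `Artifact`, `storage`, `repair`, `ALL_CHECKSUMS` and all function names are hypothetical stand-ins (plain dataclasses and a dict instead of Django models and real file storage). It only illustrates the control flow: phase one populates missing allowed checksums and collects artifacts whose file is gone, and phase two nulls the disallowed checksums only if phase one found nothing missing.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical list of every checksum column an artifact may carry.
ALL_CHECKSUMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


@dataclass
class Artifact:
    """Simplified stand-in: a storage path plus one optional digest per algorithm."""
    path: str
    digests: dict = field(default_factory=dict)


def populate_missing_checksums(artifacts, storage, allowed, repair=None):
    """Phase 1: fill allowed checksums that are None; collect artifacts without a file.

    `storage` maps path -> bytes (assumption, stands in for file storage).
    `repair` is an optional callable returning re-downloaded bytes or None.
    A missing file does not abort the run; the artifact is collected instead.
    """
    missing_files = []
    for artifact in artifacts:
        data = storage.get(artifact.path)
        if data is None and repair is not None:
            data = repair(artifact)
            if data is not None:
                storage[artifact.path] = data
        if data is None:
            missing_files.append(artifact)
            continue
        for name in allowed:
            if artifact.digests.get(name) is None:
                artifact.digests[name] = hashlib.new(name, data).hexdigest()
    return missing_files


def null_disallowed_checksums(artifacts, allowed):
    """Phase 2: set every checksum that is not in the allowed list to None."""
    for artifact in artifacts:
        for name in ALL_CHECKSUMS:
            if name not in allowed:
                artifact.digests[name] = None


def run(artifacts, storage, allowed, repair=None):
    """Run phase 2 only if phase 1 collected no missing files."""
    missing = populate_missing_checksums(artifacts, storage, allowed, repair)
    if missing:
        # Phase 1 did not fully succeed: leave disallowed checksums untouched.
        return missing
    null_disallowed_checksums(artifacts, allowed)
    return []
```

Returning the list of artifacts with missing files, rather than raising on the first one, matches the "on error continue and collect missing files" outcome; the caller can report them and re-run after the files have been re-uploaded or repaired.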