add sql script generation for deleting obsolete cocs#94

Open
eperedo wants to merge 8 commits into development from feature/regenerate-cocs-869btxm35

@eperedo eperedo commented Jan 28, 2026

📌 References

📝 Implementation

Generates an SQL script for deleting obsolete categoryOptionCombos.

yarn start categoryOptionCombos regenerate \
    --url=https://play.im.dhis2.org/stable-2-40-10 \
    --auth="admin:district" \
    --generate-sql-delete-script

Before deleting a categoryOptionCombo, the script checks for existing data in the datavalue and datavalueaudit tables. If these tables hold thousands or millions of records, this check can be very slow.

Adding an index to these tables improves performance significantly:

CREATE INDEX CONCURRENTLY idx_datavalue_catoptcombo ON datavalue(categoryoptioncomboid);
CREATE INDEX CONCURRENTLY idx_datavalue_attoptcombo ON datavalue(attributeoptioncomboid);
CREATE INDEX CONCURRENTLY idx_datavalueaudit_catoptcombo ON datavalueaudit(categoryoptioncomboid);
CREATE INDEX CONCURRENTLY idx_datavalueaudit_attoptcombo ON datavalueaudit(attributeoptioncomboid);

📹 Screenshots/Screen capture

🔥 Notes to the tester

#869btxm35

await this.saveCocs(categoryComboWithGeneratedCocs, options);
await this.deleteCocs(categoryComboWithGeneratedCocs, options);
if (options.deleteCocs) {
await this.deleteCocs(categoryComboWithGeneratedCocs, options);

@tokland tokland Jan 30, 2026

The method deleteCocs also takes the option deleteCocs, which is confusing: 1) it will only be called when the option is true (as we now have this if), and 2) you'd expect it to delete, given the method name.

Comment thread src/scripts/commands/category-option-combos/regenerateCocsCmd.ts Outdated
@eperedo eperedo requested a review from tokland February 26, 2026 00:40

@tokland tokland left a comment

In-line comments:

Comment thread src/data/CategoryComboD2Repository.ts
Comment thread src/data/CategoryComboD2Repository.ts Outdated
categoryIndex: categoryIndexByCocOptionPosition[inputPosition],
inputPosition,
}))
.sortBy(item => [item.categoryIndex, item.inputPosition])

@tokland tokland Feb 26, 2026

This pattern has been on my mind for some time, so I took the time to explore it. According to the docs, it should work: https://lodash.com/docs/4.17.23#sortBy (_.sortBy(users, ['user', 'age']); is the equivalent example). However, I was worried about whether this sorts things as one would expect (lexicographical order) or whether some string conversion messes things up.

> var users = [{ 'user': 'fred',   'age': 48 }, { 'user': 'barney', 'age': 31 }, { 'user': 'barney', 'age': 4 }] 
> JSON.stringify(_.sortBy(users, ["user", "age"]))
'[{"user":"barney","age":4},{"user":"barney","age":31},{"user":"fred","age":48}]' 
> JSON.stringify(_.sortBy(users, [u => u.user, u => u.age]))
'[{"user":"barney","age":4},{"user":"barney","age":31},{"user":"fred","age":48}]' 
> JSON.stringify(_.sortBy(users, u => [u.user, u.age]))
'[{"user":"barney","age":31},{"user":"barney","age":4},{"user":"fred","age":48}]' 

So what works is passing the mappers as an array (either strings, which we do not use because they are not type-safe, or arrow functions), but not a single mapper that returns an array, since JS compares arrays as strings.

Also, in addition to fixing the code, if you can tweak the spec so we have a RED before going to GREEN, great!
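
A minimal sketch of the difference, in plain TypeScript without lodash. The helpers `compareKeys` and `sortByKeys` are hypothetical names used only for illustration, and the explicit `String(...)` call stands in for the coercion JS applies when two arrays are compared:

```typescript
type User = { user: string; age: number };
type Key = string | number;

const users: User[] = [
    { user: "fred", age: 48 },
    { user: "barney", age: 31 },
    { user: "barney", age: 4 },
];

// A single mapper returning an array: the arrays end up compared as strings,
// so "barney,31" sorts before "barney,4".
const byStringifiedKey = users.slice().sort((a, b) => {
    const keyA = String([a.user, a.age]);
    const keyB = String([b.user, b.age]);
    return keyA < keyB ? -1 : keyA > keyB ? 1 : 0;
});

// Hypothetical helper: numbers are compared numerically, anything else as strings.
function compareKeys(a: Key, b: Key): number {
    if (typeof a === "number" && typeof b === "number") return a - b;
    const strA = String(a);
    const strB = String(b);
    return strA < strB ? -1 : strA > strB ? 1 : 0;
}

// Hypothetical helper: compare key by key, as when an array of mappers is passed.
function sortByKeys<T>(items: T[], mappers: Array<(item: T) => Key>): T[] {
    return items.slice().sort((a, b) => {
        for (const mapper of mappers) {
            const result = compareKeys(mapper(a), mapper(b));
            if (result !== 0) return result;
        }
        return 0;
    });
}

const byMappers = sortByKeys(users, [u => u.user, u => u.age]);

console.log(byStringifiedKey.map(u => u.age)); // [ 31, 4, 48 ]
console.log(byMappers.map(u => u.age)); // [ 4, 31, 48 ]
```

A spec that includes a two-digit and a one-digit value for the secondary key, as here, would fail with the string comparison and pass with the per-key comparison.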

private createDeleteUnusedCategoryOptionCombosSQL(ids: Id[], batchSize: number = 500): string {
const batches = Array.from({ length: Math.ceil(ids.length / batchSize) }, (_, i) =>
ids.slice(i * batchSize, (i + 1) * batchSize)
);

Isn't this the same as _.chunk?
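
For reference, a small sketch comparing the two approaches. `sliceIntoBatches` is a hypothetical wrapper around the expression from the diff, and the local `chunk` function is a hand-written stand-in for `_.chunk`, so that the example runs without lodash:

```typescript
// Batching as written in the diff, wrapped in a hypothetical helper.
function sliceIntoBatches<T>(ids: T[], batchSize: number): T[][] {
    return Array.from({ length: Math.ceil(ids.length / batchSize) }, (_, i) =>
        ids.slice(i * batchSize, (i + 1) * batchSize)
    );
}

// Hand-written stand-in for lodash's _.chunk(array, size).
function chunk<T>(items: T[], size: number): T[][] {
    const result: T[][] = [];
    for (let start = 0; start < items.length; start += size) {
        result.push(items.slice(start, start + size));
    }
    return result;
}

const sampleIds = ["a", "b", "c", "d", "e"];

console.log(JSON.stringify(sliceIntoBatches(sampleIds, 2))); // [["a","b"],["c","d"],["e"]]
console.log(JSON.stringify(chunk(sampleIds, 2))); // [["a","b"],["c","d"],["e"]]
```

Both produce the same batches for these inputs, including an empty result for an empty list.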

DROP TABLE temp_uids_batch;
DROP TABLE temp_ids_batch;

`;

@tokland tokland Feb 26, 2026

[subjective] To avoid a long multi-line variable in the middle of a function/method, we can move it to a constant/instance value and use lodash template to interpolate it where it is used. What do you think?

`;
});

return `BEGIN;${batchStatements.join("\n")}COMMIT;`.trim();

@tokland tokland Feb 26, 2026

Very minor, but I'd write a multiline array with a final join, something like this:

return [
    "BEGIN;", //
    ...batchStatements,
    "COMMIT;",
].join("\n");

(in the hope that it's more readable)

return {
regeneratedCoc: RegeneratedCoc.create({
id: getUid(combinationKey, categoryCombo.id),
id: categoryComboId,

@tokland tokland Feb 26, 2026

So categoryOptionComboId?

import { Id } from "domain/entities/Base";

export interface CategoryOptionComboDeleteExporter {
exportDeleteScript(cocIds: Id[]): void;

@tokland tokland Feb 26, 2026

Proposal: instead of a side effect (saving the file), return a string here, and make the use case also return that string somewhere in the response object. The script is then the one that writes the final file from that string. What do you think?
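
A rough sketch of that proposal, under stated assumptions: `CategoryOptionComboDeleteScriptGenerator`, `RegenerateCocsResult`, `regenerateCocs`, `runScript`, the file name and the placeholder SQL comment are all hypothetical names invented for illustration, not the PR's actual code.

```typescript
type Id = string;

// Hypothetical port: builds the script and returns it, with no file I/O.
interface CategoryOptionComboDeleteScriptGenerator {
    generateDeleteScript(cocIds: Id[]): string;
}

// Hypothetical response object of the use case.
interface RegenerateCocsResult {
    obsoleteCocIds: Id[];
    sqlDeleteScript: string;
}

// Hypothetical use case: it never touches the file system.
function regenerateCocs(
    generator: CategoryOptionComboDeleteScriptGenerator,
    obsoleteCocIds: Id[]
): RegenerateCocsResult {
    return {
        obsoleteCocIds,
        sqlDeleteScript: generator.generateDeleteScript(obsoleteCocIds),
    };
}

// Placeholder generator; the real SQL is out of scope for this sketch.
const fakeGenerator: CategoryOptionComboDeleteScriptGenerator = {
    generateDeleteScript: cocIds => `-- placeholder script for ${cocIds.length} ids`,
};

// The script layer owns the side effect. The writer is injected here so the
// sketch stays self-contained; in the command it could be writeFileSync.
function runScript(writeFile: (fileName: string, contents: string) => void): void {
    const result = regenerateCocs(fakeGenerator, ["coc1", "coc2"]);
    writeFile("delete-cocs.sql", result.sqlDeleteScript);
}

// Usage: capture the write in memory instead of touching the disk.
const written: Record<string, string> = {};
runScript((fileName, contents) => {
    written[fileName] = contents;
});
console.log(written["delete-cocs.sql"]);
```

With this split, the use case can be tested by asserting on the returned string, without mocking the file system.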

@eperedo eperedo requested a review from tokland March 2, 2026 22:32

@tokland tokland left a comment

Some very minor things:

async getAll(): Promise<CategoryCombo[]> {
return this.getAllByPages({ page: 1, pageSize: 100, categoryCombos: [] });
async getAll(options: { ids?: Id[] }): Promise<CategoryCombo[]> {
return this.getAllByPages({ page: 1, pageSize: 100, categoryCombos: [], ids: options.ids });

[Minor, subjective] Having getAll with an optional ids filter feels a bit confusing. Maybe it would be clearer to have two separate methods: getAll() and getByIds(ids)?
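
A sketch of how the two-method shape could read, assuming a simplified `CategoryCombo` entity. The `CategoryComboRepository` interface and the in-memory class are hypothetical and exist only to show the call sites:

```typescript
type Id = string;

interface CategoryCombo {
    id: Id;
    name: string;
}

// Hypothetical repository shape with two explicit methods.
interface CategoryComboRepository {
    getAll(): Promise<CategoryCombo[]>;
    getByIds(ids: Id[]): Promise<CategoryCombo[]>;
}

// In-memory stand-in, only to show how the two methods behave.
class InMemoryCategoryComboRepository implements CategoryComboRepository {
    constructor(private categoryCombos: CategoryCombo[]) {}

    async getAll(): Promise<CategoryCombo[]> {
        return this.categoryCombos;
    }

    async getByIds(ids: Id[]): Promise<CategoryCombo[]> {
        const wantedIds = new Set(ids);
        return this.categoryCombos.filter(categoryCombo => wantedIds.has(categoryCombo.id));
    }
}

const repository = new InMemoryCategoryComboRepository([
    { id: "a", name: "Default" },
    { id: "b", name: "Age/Sex" },
]);

repository.getAll().then(all => console.log(all.length)); // 2
repository.getByIds(["b"]).then(found => console.log(found.map(c => c.name))); // [ 'Age/Sex' ]
```

The intent is visible at the call site, and `getAll()` keeps its original no-argument signature.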

function writeSqlScriptToDisk(sqlScript: string, fileName: string): void {
writeFileSync(fileName, sqlScript);
logger.info(`SQL generated: ${fileName}`);
logger.info(`You can execute the sql with d2-docker: d2-docker run-sql ${fileName}`);

We may have N instances running, so I'd add the explicit image option.

Comment thread README.md
--auth="admin:district" \
--persist \
--delete-cocs \
--cat-combos-ids=id1,id2,id3

Proposal: --category-combo-ids (no abbreviation, plural only the last term)

Comment thread README.md
--auth="admin:district" \
--persist \
--generate-sql-delete-script
--generate-sql-delete-script \

Do we need it?

page: options.page,
pageSize: options.pageSize,
// TODO: change to request in chunks when ids are provided
filter: options.ids ? { id: { in: options.ids } } : undefined,

@tokland tokland Mar 3, 2026

Should this TODO be done now, since we already have the ids?
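
One way the chunked request could look, as a sketch only. `FetchByIds`, `getByIdsInChunks`, the local `chunk` helper and the default chunk size of 100 are assumptions made for illustration; the real repository would issue the `id in [...]` filter against the API instead of the fake fetcher used here:

```typescript
type Id = string;

interface CategoryCombo {
    id: Id;
}

// Hypothetical fetcher: performs one request filtered by the given ids.
type FetchByIds = (ids: Id[]) => Promise<CategoryCombo[]>;

// Hand-written stand-in for lodash's _.chunk(array, size).
function chunk<T>(items: T[], size: number): T[][] {
    const result: T[][] = [];
    for (let start = 0; start < items.length; start += size) {
        result.push(items.slice(start, start + size));
    }
    return result;
}

// Requests each chunk of ids sequentially and concatenates the results.
async function getByIdsInChunks(
    ids: Id[],
    fetchByIds: FetchByIds,
    chunkSize: number = 100 // assumed value, not taken from the PR
): Promise<CategoryCombo[]> {
    const results: CategoryCombo[] = [];
    for (const idsChunk of chunk(ids, chunkSize)) {
        const categoryCombos = await fetchByIds(idsChunk);
        results.push(...categoryCombos);
    }
    return results;
}

// Usage with a fake fetcher that records the size of each request.
const requestSizes: number[] = [];
const fakeFetch: FetchByIds = async ids => {
    requestSizes.push(ids.length);
    return ids.map(id => ({ id }));
};

const chunkedResult = getByIdsInChunks(["a", "b", "c", "d", "e"], fakeFetch, 2);
chunkedResult.then(categoryCombos => console.log(categoryCombos.length, requestSizes));
```

Each request then carries a bounded number of ids, regardless of how many are passed on the command line.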
