add sql script generation for deleting obsolete cocs#94
add sql script generation for deleting obsolete cocs#94eperedo wants to merge 8 commits intodevelopmentfrom
Conversation
| await this.saveCocs(categoryComboWithGeneratedCocs, options); | ||
| await this.deleteCocs(categoryComboWithGeneratedCocs, options); | ||
| if (options.deleteCocs) { | ||
| await this.deleteCocs(categoryComboWithGeneratedCocs, options); |
There was a problem hiding this comment.
method deleteCocs also takes option deleteCocs, which is confusing, 1) it will only be called when true (as we now have this if). 2) you'd expect to delete, given the method name.
| categoryIndex: categoryIndexByCocOptionPosition[inputPosition], | ||
| inputPosition, | ||
| })) | ||
| .sortBy(item => [item.categoryIndex, item.inputPosition]) |
There was a problem hiding this comment.
This pattern has been on my mind for some time, so I took the time to explore it. According to the docs, it should work: https://lodash.com/docs/4.17.23#sortBy (_.sortBy(users, ['user', 'age']); it the equivalent example). However, I was worried if this was sorting things as one would expect (lexycographical order) or there was some string conversion that messes things up.
> var users = [{ 'user': 'fred', 'age': 48 }, { 'user': 'barney', 'age': 31 }, { 'user': 'barney', 'age': 4 }]
> JSON.stringify(_.sortBy(users, ["user", "age"]))
'[{"user":"barney","age":4},{"user":"barney","age":31},{"user":"fred","age":48}]'
> JSON.stringify(_.sortBy(users, [u => u.user, u => u.age]))
'[{"user":"barney","age":4},{"user":"barney","age":31},{"user":"fred","age":48}]'
> JSON.stringify(_.sortBy(users, u => [u.user, u.age]))
'[{"user":"barney","age":31},{"user":"barney","age":4},{"user":"fred","age":48}]' So the code that works is passing mappers as an array (with string -we do not use that, not type-safe, or an arrow function), but not returning an array, as JS compares them as strings.
Also, in addition of fixing, the code if you can tweak the spec so we have a RED before going to GREEN, great!
| private createDeleteUnusedCategoryOptionCombosSQL(ids: Id[], batchSize: number = 500): string { | ||
| const batches = Array.from({ length: Math.ceil(ids.length / batchSize) }, (_, i) => | ||
| ids.slice(i * batchSize, (i + 1) * batchSize) | ||
| ); |
There was a problem hiding this comment.
Is the same than _.chunk ?
| DROP TABLE temp_uids_batch; | ||
| DROP TABLE temp_ids_batch; | ||
|
|
||
| `; |
There was a problem hiding this comment.
[subjective] To avoid a long multi-line variable in the middle of a function/method, we can move it to a constant/instance value and use lodash template to interpolate when used, what do you think?
| `; | ||
| }); | ||
|
|
||
| return `BEGIN;${batchStatements.join("\n")}COMMIT;`.trim(); |
There was a problem hiding this comment.
Very minor, but id' write an a multiline array with a final join., something like this
return [
"BEGIN;", //
...batchStatements,
"COMMIT;",
].join("\n");
(in the hope that it's more readable)
| return { | ||
| regeneratedCoc: RegeneratedCoc.create({ | ||
| id: getUid(combinationKey, categoryCombo.id), | ||
| id: categoryComboId, |
There was a problem hiding this comment.
So categoryOptionComboId?
| import { Id } from "domain/entities/Base"; | ||
|
|
||
| export interface CategoryOptionComboDeleteExporter { | ||
| exportDeleteScript(cocIds: Id[]): void; |
There was a problem hiding this comment.
Proposal: instead of a side-effect (saving the file), return here a string. And make the use case return also that string somewhere in the response object. And it's the script that writes the final file from that string. What do you think?
| async getAll(): Promise<CategoryCombo[]> { | ||
| return this.getAllByPages({ page: 1, pageSize: 100, categoryCombos: [] }); | ||
| async getAll(options: { ids?: Id[] }): Promise<CategoryCombo[]> { | ||
| return this.getAllByPages({ page: 1, pageSize: 100, categoryCombos: [], ids: options.ids }); |
There was a problem hiding this comment.
[Minor, subjective] Having getAll with an optional ids filter feels a bit confusing. Maybe it would be clearer to have two separate methods: getAll() and getByIds(ids)?
| function writeSqlScriptToDisk(sqlScript: string, fileName: string): void { | ||
| writeFileSync(fileName, sqlScript); | ||
| logger.info(`SQL generated: ${fileName}`); | ||
| logger.info(`You can execute the sql with d2-docker: d2-docker run-sql ${fileName}`); |
There was a problem hiding this comment.
we may have N instances running, I'd add the explicit image option
| --auth="admin:district" \ | ||
| --persist \ | ||
| --delete-cocs \ | ||
| --cat-combos-ids=id1,id2,id3 |
There was a problem hiding this comment.
Proposal: --category-combo-ids (no abbreviation, plural only the last term)
| --auth="admin:district" \ | ||
| --persist \ | ||
| --generate-sql-delete-script | ||
| --generate-sql-delete-script \ |
| page: options.page, | ||
| pageSize: options.pageSize, | ||
| // TODO: change to request in chunks when ids are provided | ||
| filter: options.ids ? { id: { in: options.ids } } : undefined, |
There was a problem hiding this comment.
Is this TODO to do now, as we have the ids?
📌 References
📝 Implementation
Generating a sql script for deleting categoryOptionCombos.
yarn start categoryOptionCombos regenerate \ --url=https://play.im.dhis2.org/stable-2-40-10 \ --auth="admin:district" \ --generate-sql-delete-scriptBefore deleting a
categoryOptionCombothe script checks for existing data in thedatavalueanddatavalueaudittables. If you have thousands or millions of records, this process can be very slow.Adding an index to these tables improves performance significantly:
📹 Screenshots/Screen capture
🔥 Notes to the tester
#869btxm35