Conversation
This reverts commit 99cfd46.
…into 1142/feat-support-opensearch
…into 1142/feat-support-opensearch
| const growthChartPalette = ['#ef5350', '#80deea', '#3949ab']; | ||
|
|
||
| const multipleModelsChartPalette = ['#d4e157', '#0277bd']; | ||
| const multipleModelsChartPalette = ['#d4e157', '#0277bd', '#f06292']; |
There was a problem hiding this comment.
Added a 3rd value for differentiating 'No Data' from 'No'
mistryrn
left a comment
There was a problem hiding this comment.
Still need to complete my local testing once the docker-compose change we discussed on Slack is implemented (adding OpenSearch), but I've reviewed the code changes and they look great! Just a few minor change suggestions and cleanup/documentation asks. Other than this (and pending local testing), I think we're good to go 🙂
Awesome job, Dan! 🎉
| if (value && typeof value === 'object') { | ||
| if (Array.isArray(value)) { | ||
| const firstEntry = value[0]; | ||
| const cleanedVals = | ||
| firstEntry && typeof firstEntry === 'object' | ||
| ? value.map((val) => { | ||
| const cleanedValue = _.omit(val, mongoKeys); | ||
| return cleanedValue; | ||
| }) | ||
| : value; | ||
| cleanedDoc[key] = cleanedVals; | ||
| } else if (value instanceof mongoose.Types.ObjectId) { | ||
| cleanedDoc[key] = value.toString(); | ||
| } else if (!(value instanceof Date)) { | ||
| const cleanedValue = _.omit(value, mongoKeys); | ||
| cleanedDoc[key] = cleanedValue; | ||
| } | ||
| } |
There was a problem hiding this comment.
Thanks for working all this logic out, it must have been quite a bit of trial and error!
For future developers' reference, would you mind updating the TSDoc comment above this function to describe why we're cleaning/handling objects, arrays, mongoose ObjectIds, and Dates differently? Either that, or put some inline comments above each conditional explaining the purpose/logic, or both!
This is definitely the kind of thing that, if documented well, will be greatly appreciated by whoever looks at this months from now 🙂
There was a problem hiding this comment.
Comments added, let me know if you would revise these
| import variantsIndexConfig from '../../elasticsearch/variantsIndex.json' with { type: "json" }; | ||
| const modelsIndexName = process.env.ES_INDEX || pm2.ES_INDEX || 'hcmi'; | ||
| const esHost = process.env.ES_HOST || `${pm2.ES_HOST}:${pm2.ES_PORT}`; | ||
| const updateIndexName = process.env.ES_UPDATE_INDEX || pm2.ES_UPDATE_INDEX || 'hcmi-update'; |
There was a problem hiding this comment.
Nice catch explicitly handling the last updated index 👍
mistryrn
left a comment
There was a problem hiding this comment.
Couple small suggestions for the docker-compose file 🙂
mistryrn
left a comment
There was a problem hiding this comment.
Found one bug with the republish script in my local testing: looks like one of the deleted ES schema files in /cms was used there. Hopefully it'll be a quick fix to get that working again 🙏
Also suggested a minor logs cleanup.
There was a problem hiding this comment.
Related to my previous comment: looks like the republish script relied on this export as a dependency. We may need to restore it or use an alternative to get that working again.
There was a problem hiding this comment.
Thanks for addressing the feedback so far! I encountered a few more issues in my local testing today:
- (1) The "Updated" column on the
/adminCMS table and the "Updated" timestamp shown on the edit model view don't seem to be getting updated any more when saving or publishing a model. This isn't a blocker for testing on the Dev server, but is something we'll need to fix before we go to Prod.
- (2) I'm encountering some data/index-related issues when loading the public-facing UI pages. These might be a blocker for testing on the Dev server, as we might encounter these when we go to test. I know you've fixed similar issues on your local already, so I'm going to leave some instructions that might help reproduce them/recreate the scenario of setting up a new OpenSearch instance.
- a. The "Search By Model Name" filter on the Search page is failing with
400errors on theModelQuickSearchquery.code: "GRAPHQL_VALIDATION_FAILED"message: 'Cannot query field "autocomplete" on type "modelNode".'
- b. The "Model Details" page fails to load when I click on a model from the Search page table. I'm getting
400errors forModelDataQueryandVariantsData:- ModelDataQuery errors:
message: 'Cannot query field "distributor_part_number" on type "modelNode".'message: 'Cannot query field "proteomics_url" on type "modelNode".'message: 'Cannot query field "hits" on type "modelFiles".'message: 'Cannot query field "matched_models" on type "modelNode". Did you mean "has_matched_models" or "matched_models_list"?'code: "GRAPHQL_VALIDATION_FAILED"(same code for each error)
- VariantsData errors:
message: 'Cannot query field "variants" on type "modelNode".'code: "GRAPHQL_VALIDATION_FAILED"
- ModelDataQuery errors:
- a. The "Search By Model Name" filter on the Search page is failing with
- (3) The publish validation seems to be preventing models that were previously valid from getting published. The following 2 fields are no longer allowing null values, despite the Prod Catalog containing many published examples of null or N/A for them:
growth_rateage_at_diagnosis
- To get around (3) locally, I added
.nullable(true)to both fields undergetPublishSchema.
I know I've seen the issues under (2) addressed on your local, so maybe it's something you'll only see if you're setting things up from scratch in OpenSearch or when attempting to migrate data from Elasticsearch?
To reproduce the issues under (2), the quickest way is to run the repairEs script, as this will delete your current indices, initialize new ones, and then republish all published models in your db.
# from project root
ENV=prd npm run repairEsThis should simulate starting from a fresh OpenSearch instance/empty indices and should hopefully reproduce the issues in (2).
Note that you'll likely need to trigger a variant/gene index publish after this to get the "Search by Genes" filters working on the home page. Do this by editing a model in the CMS, opening the "Variants" tab and use the "+ Research Somatic Variants" dropdown to do an "Automatic Import from GDC"
If you are able to reproduce the issues under (2), I suspect we might need to revisit some of the new model/publish schemas. Comparing things against the 1.26.2 tag might be a good way to double-check that we're including everything we need: https://github.com/nci-hcmi-catalog/portal/tree/1.26.2
One thing I noticed when comparing an older ES index to a new OS index: looks like there were 3 fields related to matched models before.

matched_models(array)has_matched_models(boolean)matched_models_list(string list of relevant matched models, containing the name of the current model)
I'm not sure if that might be related to the errors I'm getting loading the Model Details page, but it might be worth looking into.
There was a problem hiding this comment.
Thanks for addressing the "Last Updated" and "publish validation" issues from my previous feedback. I've re-tested locally and can confirm that those are mostly resolved 🎉
Additionally, the index issues I encountered during my testing last week are no longer occurring! Looking back, it may have been related to my running both ES and OS locally at the same time on Thursday 🙈
Just two more small issues, but these won't block testing on the new dev server:
- (1) "Last Updated" is still not getting updated on Save. Thanks for fixing it on Publish, but we should also be updating this on Save as well for consistency with the existing behaviour (and to make it easier to find the last model worked on if it wasn't published right away).
- (2) Individual model "Delete" functionality is currently broken (bulk delete still works). This is unrelated to any of your code changes, but instead a consequence of the Mongo/Mongoose 8 upgrade. Looks like Mongoose 8 relies on a dependency
mquery@5.0.0that tries to use a deprecated method.findOneAndRemove(which, consequently, was removed in Mongoose 8) 🙃- Not a huge deal, the fix would just be to upgrade to Mongoose 9, which uses
mquery@6.0.0, which no longer references the deprecated methods. - I skimmed the breaking changes for Mongoose 8 → 9, the main change that might impact us is "Update pipelines disallowed by default": https://mongoosejs.com/docs/migrating_to_9.html#update-pipelines-disallowed-by-default
- We use this functionality in a few of our migrations, but we should be able to just update them by adding
, { updatePipeline: true }
- We use this functionality in a few of our migrations, but we should be able to just update them by adding
- If upgrading to Mongoose 9 breaks too many things, we could alternatively use a module resolution in the root
package.jsonto forcemqueryto resolve to6.0.0
- Not a huge deal, the fix would just be to upgrade to Mongoose 9, which uses
Let's try to get (1) resolved ASAP, but (2) can be spun out into a separate ticket and we can inform the NIH of the temporary workaround if we don't resolve it before going to Prod.
Just leaving some reference for (2) below:
This reverts commit f017073.
Updates HCMI application to run using Arranger SearchClient with OpenSearch