Skip to content

#3129 Fix to verify DBJson dirtiness using same field ordering on both sides (due to PostgreSQL reordering fields for JsonB types)#3405

Open
AntoineDuComptoirDesPharmacies wants to merge 1 commit intoebean-orm:masterfrom
LeComptoirDesPharmacies:fix/3129-from-master
Open

Conversation

@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown

Same as PR #3402 but starting / targeting master branch.

This pull request aim to fix a problem (#3129) where @DbJsonB fields are always considered dirty on load if the order of their properties do not match fields ordering from data stored in Postgres.

This is due to the fact that Postgres is re-ordering the fields if column type is a JsonB in order to optimize storage / index / etc...

To solve this problem, we parse/format the value coming from readSet method in order to be sure that comparison will be done using Java Jackson encoded JSON on both side.

Fix to verify DBJson dirtiness using same field ordering on both sides (due to PostgreSQL reordering fields for JsonB types)
@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown
Author

Hi @rbygrave,

If you feel there is something missing to complete the PR or need additional information, do not hesitate to ask, we are available. 👍

Yours faithfully,
LCDP

@rbygrave
Copy link
Copy Markdown
Member

Yeah sorry, been busy. At the back of my mind I'm wondering if there is another design option. That is, the design started from more the view that the json coming back out of the db would match the json going in, and I think that isn't holding up well. For example, the checksum for the content could be stored as a derived column and in that way we only care about the one direction in terms of json content and it's checksum. So I haven't quite had time to look at that design option yet, but that is the background thinking going on.

@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown
Author

No problem, it is to be sure that we could not have help more on our side.

Great idea about storing object checksum as a derived data of the POJO. So that instead of comparing checksum(rawDbValue) with checksum(newPojoAsJson) we would compare storedChecksumFromDb with checksum(newPojoAsJson) .

It would consume more disk space but the benefit is that it will speed up checksum comparison.

@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown
Author

Hello @rbygrave,

We are just coming back for news about this subject.
Did you get any time to look at the other design option ?

Thanks in advance and have a nice day ! 👍

@thomas-lcdp
Copy link
Copy Markdown
Contributor

Hello @rbygrave,

Is there anything blocking this PR ?

Best regards

@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown
Author

Hello @rbygrave,

We are sorry to bother you with this ticket.
We currently having downgrade performances on our production DB because every time we load / update our entities, they are considered "dirty" and re-persisted with version upgrade.

We really would like to help fixing this problem and can participate through pull requests.

We suppose that you are a bit hesitating because of the performance linked to parsing/formatting JSON ?

We can update our Pull Request to propose a solution that will comply with your your implementation point of view (checksum stored in DB). Or maybe any other ideas ?

We understand this according to your favorite solution :

  1. Instead of computing checksum of retrived DBJSONB from database, it should be get from the database value.
  2. When persisting the DBJSONB into the database, it should be stored with the checksum next to it.

We just need the following precision from you :
For a DBJSONB like this :
{"fieldA": 12, "fieldB": 14}

Do you prefer having the checksum as an addition field or we should create a tree ?
Solution A with given key :
{"fieldA": 12, "fieldB": 14, "checksum": XXXXX}

Solution B with a tree :
{"value": {"fieldA": 12, "fieldB": 14}, "checksum": XXXXX}

Solution B seems to give more freedom to end developper but it seems harder to implement.

Thanks in advance for your help.

Yours faithfully,
LCDP

@AntoineDuComptoirDesPharmacies
Copy link
Copy Markdown
Author

Hi @rbygrave,

We have an other idea using JsonParser to create an alternative to CRC32 (on 64 bits) which is doing a Checksum with non-ordered comparison of Json fields.
We think it would provide more performance then a Json Parse/Format and it will be less intrusive then a checksum stored in database.

Please find here the pull request that propose the new checksum algorithm, using commutative hash for fields and non-commutative for array values.

We propose to close this Pull Request (#3405) in favor of this new one (#3746).

Yours faithfully,
LCDP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants