Hi, thanks for releasing this great work and repository.
We are currently trying to reproduce the ProtVAE framework described in the paper, but we could not find the implementation corresponding to the compression module in the repository.
Based on the paper description, we implemented the compression module ourselves (roughly an encoder-decoder structure using 1D convolutions on PT5 embeddings). However, after introducing the compression module, reconstruction performance dropped significantly:
PT5 representations alone: reconstruction accuracy ≈ 1.0
After adding compression module: reconstruction accuracy ≈ 0.2
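The accuracy figures above are assumed here to mean the fraction of residues recovered exactly after decoding. A minimal sketch of that metric follows; the function name `reconstruction_accuracy` is hypothetical and the definition is an assumption, not something taken from the paper.

```python
def reconstruction_accuracy(original: str, decoded: str) -> float:
    """Fraction of positions where the decoded residue matches the original.

    Assumption: both sequences have the same length (no insertions/deletions).
    """
    if len(original) != len(decoded):
        raise ValueError("sequences must have the same length")
    if not original:
        return 0.0
    matches = sum(a == b for a, b in zip(original, decoded))
    return matches / len(original)


if __name__ == "__main__":
    # Toy example: 3 of 5 residues match.
    print(reconstruction_accuracy("MKTAY", "MKAAA"))
```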
Our implementation roughly follows:
Input: [B, L, 1024]
Encoder (Conv1d + BatchNorm + GELU): 1024 → 512 → 256 → compression_dim
Latent: [B, compression_dim, L], then flattened
Decoder: compression_dim → compression_dim/2 → compression_dim → 1024
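For concreteness, the structure described above can be sketched in PyTorch roughly as follows. The class name `CompressionSketch`, the kernel sizes, the default `compression_dim=32`, the plain final decoder convolution, and the reshape back to [B, compression_dim, L] after flattening are all illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Conv1d + BatchNorm + GELU. kernel_size=3 with padding=1 keeps the
    # sequence length L unchanged (kernel size is an assumption).
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm1d(out_ch),
        nn.GELU(),
    )


class CompressionSketch(nn.Module):
    """Hypothetical encoder-decoder over per-residue PT5 embeddings."""

    def __init__(self, embed_dim: int = 1024, compression_dim: int = 32):
        super().__init__()
        # Encoder: 1024 -> 512 -> 256 -> compression_dim
        self.encoder = nn.Sequential(
            conv_block(embed_dim, 512),
            conv_block(512, 256),
            conv_block(256, compression_dim),
        )
        # Decoder: compression_dim -> compression_dim/2 -> compression_dim -> 1024
        # The last layer is a plain Conv1d so the output is unbounded (assumption).
        self.decoder = nn.Sequential(
            conv_block(compression_dim, compression_dim // 2),
            conv_block(compression_dim // 2, compression_dim),
            nn.Conv1d(compression_dim, embed_dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor):
        # x: [B, L, 1024]; Conv1d expects channels first, so move to [B, 1024, L].
        z = self.encoder(x.transpose(1, 2))   # [B, compression_dim, L]
        z_flat = z.flatten(start_dim=1)       # [B, compression_dim * L]
        # Restore the per-residue layout before decoding (assumption).
        x_hat = self.decoder(z_flat.reshape(z.shape))
        return x_hat.transpose(1, 2), z_flat  # [B, L, 1024], flattened latent


if __name__ == "__main__":
    model = CompressionSketch()
    x = torch.randn(2, 50, 1024)
    x_hat, z_flat = model(x)
    print(x_hat.shape, z_flat.shape)
```

Note that in this sketch the flattened latent has size compression_dim * L, so it depends on the sequence length.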
We are wondering:
Is our understanding / implementation of the compression module incorrect?
Are there important implementation details omitted from the paper?
Is there source code corresponding to the ProtVAE compression module that we may have overlooked?
If the compression module implementation and training pipeline have not yet been open-sourced, are there plans to release them in the future? Access to both the implementation and training pipeline would be extremely helpful for reproduction efforts.
Any guidance or pointers would be greatly appreciated. Thanks!