[Bug] Graphman dump/restore can't handle larger subgraphs #6609

@mindstyle85

Bug report

We are testing graphman dump and restore. It works fine for smaller subgraphs, but it fails on larger ones.

Relevant log output

This is a known issue with arrow-array hitting a 2 GiB limit on byte array buffers. It typically happens when graphman (or graph-node) tries to process a very large result set in one shot.

What's causing it:
Arrow's GenericBytesBuilder, when used with 32-bit (i32) offsets, cannot address more than i32::MAX (2^31 - 1) bytes of accumulated value data; once the data exceeds roughly 2 GiB, the offset overflows. This is a hard limit of the i32-offset array types in the arrow-array crate; the "large" variants use i64 offsets instead.
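
The offset limit can be illustrated without Arrow itself. The following is a minimal stand-alone sketch, not code from arrow-array: `next_offset_i32` and `next_offset_i64` are hypothetical helpers that only mimic the conversion a byte-array builder has to make when it turns the running byte length into the next offset.

```rust
/// Hypothetical helper (not an arrow-array API): convert the total number of
/// value bytes written so far into an i32 offset. Returns None once the total
/// no longer fits, which is the situation described in this report.
fn next_offset_i32(total_bytes: usize) -> Option<i32> {
    i32::try_from(total_bytes).ok()
}

/// Same conversion with i64 offsets, as used by the "large" array variants.
fn next_offset_i64(total_bytes: usize) -> Option<i64> {
    i64::try_from(total_bytes).ok()
}
```

With these helpers, a total of 2_147_483_647 bytes still converts to an i32 offset, while one more byte does not; the i64 version accepts both.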

Likely triggers in your case:
  • Running a graphman command that fetches a large dataset (entity counts, history, stats across many subgraphs/blocks)
  • A subgraph with very large string fields or a huge number of entities being exported/queried at once

Workarounds to try:
  • Filter/limit the query: if the command accepts filters, scope it down (e.g. specific deployment hash, block range, or shard)
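
The idea behind scoping the work down can be sketched as splitting rows into batches whose combined byte size stays under the i32 offset limit. This is only an illustration under assumptions: `plan_batches` and `MAX_I32_BYTES` are hypothetical names, and nothing here reflects how graphman actually reads or writes data, or whether it exposes any batching option.

```rust
/// Largest number of value bytes an i32-offset byte array can address.
const MAX_I32_BYTES: usize = i32::MAX as usize;

/// Hypothetical planner (not part of graphman or arrow-array): group
/// consecutive rows into batches whose summed byte sizes do not exceed
/// `budget`. Returns half-open row ranges `(start, end)`.
/// A single row larger than `budget` still gets a batch of its own, so such
/// a row would need i64 offsets regardless of batching.
fn plan_batches(row_sizes: &[usize], budget: usize) -> Vec<(usize, usize)> {
    let mut batches = Vec::new();
    let mut start = 0; // first row of the batch being filled
    let mut used = 0usize; // bytes already assigned to that batch
    for (i, &size) in row_sizes.iter().enumerate() {
        // Close the current batch if it is non-empty and this row would overflow it.
        if i > start && used.saturating_add(size) > budget {
            batches.push((start, i));
            start = i;
            used = 0;
        }
        used = used.saturating_add(size);
    }
    // Flush the final, partially filled batch.
    if start < row_sizes.len() {
        batches.push((start, row_sizes.len()));
    }
    batches
}
```

In a real export the budget would be at most `MAX_I32_BYTES`, and probably lower to leave headroom; for example, `plan_batches(&[4, 4, 4, 4], 8)` groups the four rows into two batches of two rows each.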

IPFS hash

No response

Subgraph name or link to explorer

No response

Some information to help us out

  • Tick this box if this bug is caused by a regression found in the latest release.
  • Tick this box if this bug is specific to the hosted service.
  • I have searched the issue tracker to make sure this issue is not a duplicate.

OS information

Linux

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working)
