The `geth dump` command makes it possible to extract the entire contents of the state trie as a JSON blob. This is very useful in certain cases, for example to perform data analysis on the state of the Ethereum network.
However, the state trie is a so-called 'secure' trie: the keys under which account data is stored are the `sha3` hashes of the `address`es, not the `address`es themselves. The `address` is not part of the state data; thus, while it's possible to look up the data for a given `address`, it's not possible to do the reverse.
Therefore, geth also has a secondary storage for so-called `preimage`s, which is a mapping from `sha3(address)` -> `address`, or `securekey` -> `key`. It is populated during normal block-by-block sync, as things are put into the trie.
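
To make the relationship between secure keys and preimages concrete, here is a minimal sketch in Python. It is not geth's implementation: `secure_key` and `PreimageStore` are hypothetical names, and `hashlib.sha3_256` is only a stdlib stand-in, since geth's `sha3` is Keccak-256, which produces different digests than NIST SHA3-256.

```python
import hashlib
from typing import Dict, Optional


def secure_key(address: bytes) -> bytes:
    """Hash an address into the key used by the 'secure' trie.

    ASSUMPTION: sha3_256 is a stand-in. geth uses Keccak-256, which is
    not the same function as NIST SHA3-256.
    """
    return hashlib.sha3_256(address).digest()


class PreimageStore:
    """Hypothetical secondary store: secure key -> original address."""

    def __init__(self) -> None:
        self._by_hash: Dict[bytes, bytes] = {}

    def record(self, address: bytes) -> bytes:
        # Called whenever an address is written into the trie.
        key = secure_key(address)
        self._by_hash[key] = address
        return key

    def lookup(self, key: bytes) -> Optional[bytes]:
        # The hash cannot be inverted; the address is only recoverable
        # if it was recorded at write time.
        return self._by_hash.get(key)


store = PreimageStore()
addr = bytes.fromhex("00" * 19 + "01")
key = store.record(addr)

# A key whose address was never recorded (e.g. after a fast sync).
unknown_key = secure_key(bytes.fromhex("00" * 19 + "02"))

print(store.lookup(key) == addr)   # recorded address is recoverable
print(store.lookup(unknown_key))   # unrecorded address is not
```

The forward direction (`address` to key) needs only the hash function; the reverse direction works only for addresses the store has seen.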
A newly fast-synced client will therefore have close to zero `preimage`s, and is useless for dumping state: with the `address`es missing, several million nodes are bundled into the same empty key. To get _all_ the keys, one needs to use a geth node which is `archive`-synced from block `0`.
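
The collapse into one empty key can be illustrated with a toy model. This is a sketch of the effect described above, not geth's dump code; `build_dump` and the data shapes are hypothetical, and it assumes the dump is a map keyed by the resolved address.

```python
from typing import Dict, List, Tuple


def build_dump(
    trie_entries: List[Tuple[str, dict]],
    preimages: Dict[str, str],
) -> Dict[str, dict]:
    """Build an address-keyed dump from (hashed_key, account) pairs.

    ASSUMPTION: entries without a known preimage are stored under the
    empty string, so they overwrite each other.
    """
    accounts: Dict[str, dict] = {}
    for hashed_key, account in trie_entries:
        address = preimages.get(hashed_key, "")
        accounts[address] = account
    return accounts


# Three accounts in the trie, but only one preimage is known.
trie_entries = [
    ("hash_a", {"balance": 1}),
    ("hash_b", {"balance": 2}),
    ("hash_c", {"balance": 3}),
]
preimages = {"hash_a": "0xaaaa"}

dump = build_dump(trie_entries, preimages)
print(len(dump))  # fewer entries than accounts in the trie
print(dump[""])   # only the last unresolved account survives
```

In this model, two of the three accounts share the key `""`, so one of them is silently lost from the dump.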
## Feature request
1. Ability to do `geth preimagedump dump.dat`, which would dump the preimage database into a JSON file.
2. Ability to do `geth preimageimport dump.dat`, which would import the preimage database from the given JSON file.
This would make it possible to do a state dump from a fast-synced node, and then download a preimage db from an archive-node somewhere to make the dataset complete.
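
The intended workflow could look roughly like the sketch below. Both file formats are assumptions made for illustration, since the request does not specify them: the state dump is taken to be a JSON object keyed by hex-encoded hashed key, and the preimage dump a JSON object mapping hashed key to address. `complete_dump` is a hypothetical helper, not a geth command.

```python
import json
from typing import Dict, Tuple


def complete_dump(
    state_dump_json: str, preimage_dump_json: str
) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Join a hash-keyed state dump with a downloaded preimage dump.

    Returns (resolved, unresolved): accounts keyed by address, and
    accounts whose preimage is still missing, keyed by hashed key.
    """
    state = json.loads(state_dump_json)
    preimages = json.loads(preimage_dump_json)

    resolved: Dict[str, dict] = {}
    unresolved: Dict[str, dict] = {}
    for hashed_key, account in state.items():
        address = preimages.get(hashed_key)
        if address is None:
            # Keep the entry under its hash instead of an empty key.
            unresolved[hashed_key] = account
        else:
            resolved[address] = account
    return resolved, unresolved


# State dump from a fast-synced node (hypothetical format).
state_dump = json.dumps({
    "hash_a": {"balance": "1"},
    "hash_b": {"balance": "2"},
})
# Preimage dump obtained from an archive node (hypothetical format).
preimage_dump = json.dumps({"hash_a": "0xaaaa"})

resolved, unresolved = complete_dump(state_dump, preimage_dump)
print(sorted(resolved))    # addresses recovered via the preimage dump
print(sorted(unresolved))  # hashed keys still lacking a preimage
```

Keeping unresolved entries under their hashed key, rather than an empty key, means no account data is lost even when the downloaded preimage set is incomplete.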