# Solidity Metadata Exposure

### Theory

When you compile a Solidity contract, `solc` not only emits executable EVM bytecode, it also generates a **metadata JSON file** containing rich information about the compilation and contract structure.

Crucially, **a reference to this metadata is appended directly into the deployed runtime bytecode**. This is not accidental: Solidity intends for tools like wallets, explorers, and verifiers to automatically fetch interface and documentation data. It also means that anyone who inspects contract bytecode can learn more than the deployer may have intended ([Solidity](https://docs.soliditylang.org/en/stable/metadata.html)).

#### What’s in “Metadata”

The metadata file is a canonical JSON structure describing:

* compiler version and settings (optimization, paths, options)
* language and version attributes
* ABI (Application Binary Interface) definitions
* NatSpec documentation (developer and user comments)
* source file references and cryptographic hashes
* any libraries used, including their encoded addresses

That JSON is useful for reproducible builds and verification, but the presence of this rich, human‑meaningful description *in conjunction with the contract’s bytecode* means that **security‑relevant information can be exposed off‑chain merely by inspecting deployed code**.

#### How Metadata Is Embedded in Bytecode

The Solidity compiler *does not embed the full JSON metadata in bytecode*. Instead, it puts a **compact reference to the metadata JSON** at the very end of the runtime bytecode. This reference is encoded using **CBOR** (Concise Binary Object Representation), a binary serialization designed for efficiency, not secrecy, and then its length is appended.

To extract it:

1. Read the **last two bytes** of the deployed bytecode.
2. Treat those two bytes as a **big‑endian integer**, giving the length `L` of the CBOR payload.
3. Look at the `L` bytes immediately before those two bytes — that is the CBOR‑encoded metadata map.
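
The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `split_metadata` is hypothetical, and the sketch assumes the bytecode actually ends with a solc-style trailer (contracts compiled without CBOR metadata, or by other compilers, will not).

```python
def split_metadata(runtime_hex: str) -> tuple[bytes, bytes]:
    """Split runtime bytecode (hex string) into (executable_code, cbor_payload)."""
    code = bytes.fromhex(runtime_hex.removeprefix("0x"))
    if len(code) < 2:
        raise ValueError("bytecode too short to carry a metadata length")

    # Steps 1 and 2: the last two bytes are the payload length, big-endian.
    cbor_len = int.from_bytes(code[-2:], "big")
    trailer_len = cbor_len + 2
    if trailer_len > len(code):
        raise ValueError("length field exceeds bytecode size; no metadata trailer?")

    # Step 3: the payload is the cbor_len bytes just before the length field.
    payload = code[-trailer_len:-2]
    executable = code[:-trailer_len]
    return executable, payload


if __name__ == "__main__":
    # Synthetic input: two opcode bytes, a 3-byte dummy payload, then 0x0003.
    executable, payload = split_metadata("0x6080" + "aabbcc" + "0003")
    print(executable.hex(), payload.hex())
```

The sanity check on the length field matters in practice: if the last two bytes are ordinary opcodes rather than a length, the computed value is usually larger than the bytecode itself.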

Inside that CBOR map, Solidity typically includes keys such as `"ipfs"` (an IPFS CID pointing to the full metadata JSON) and `"solc"` (the compiler version). It *may also include additional optional keys* (e.g., experimental flags or alternate hash types such as Swarm) if certain compiler settings were used.

A simplified conceptual layout of smart contract runtime bytecode looks like this:

```
... actual runtime opcode bytecode ... | [CBOR map of metadata keys] | [2‑byte big‑endian length]
```

Since CBOR is schemaless and map entries can vary, the **only reliable extraction method** is:

* read last two bytes for length,
* then run a CBOR decoder on the preceding segment.

#### What CBOR Actually Encodes

CBOR encodes binary maps efficiently. A common structure appended by solc might decode to a map like:

```json
{
  "ipfs": "<CIDv0 metadata hash>",
  "solc": "<version bytes>"
}
```

Where:

* `"ipfs"` holds the CID that identifies the full metadata JSON on [IPFS](https://fr.wikipedia.org/wiki/InterPlanetary_File_System).
* `"solc"` contains a compact version encoding such as 3 bytes representing the major, minor, and patch version numbers of the Solidity compiler used.

CBOR's type and length prefixes are not human‑readable, but a CBOR parser easily converts them into a JSON‑like object. This CBOR map is precisely what you find if you take the trailer of a contract’s runtime bytecode and run it through a CBOR decoder. Tools like [SolMeta](https://github.com/v4resk/SolMeta) automate this extraction.
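
To make the decoding step concrete, here is a hedged sketch of a tiny CBOR decoder that handles only the item types solc is described as emitting here (maps, text strings, byte strings, small integers, and booleans). It is not a general CBOR implementation, and all function names are illustrative. It also assumes the `ipfs` value is a raw sha2-256 multihash, which base58-encodes to the familiar `Qm...` CIDv0 form, and that `solc` is three raw version bytes.

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58(raw: bytes) -> str:
    """Base58-encode raw bytes (Bitcoin/IPFS alphabet)."""
    n, out = int.from_bytes(raw, "big"), ""
    while n:
        n, rem = divmod(n, 58)
        out = B58[rem] + out
    pad = len(raw) - len(raw.lstrip(b"\0"))  # leading zero bytes become '1'
    return "1" * pad + out


def decode_item(data: bytes, pos: int = 0) -> tuple[object, int]:
    """Decode one CBOR item at pos; return (value, next_pos)."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        arg = info
    elif info <= 27:  # argument stored in the next 1, 2, 4, or 8 bytes
        size = 1 << (info - 24)
        arg, pos = int.from_bytes(data[pos:pos + size], "big"), pos + size
    else:
        raise ValueError(f"unsupported additional info {info}")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 2:  # byte string
        return data[pos:pos + arg], pos + arg
    if major == 3:  # text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 5:  # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode_item(data, pos)
            result[key], pos = decode_item(data, pos)
        return result, pos
    if major == 7 and arg in (20, 21):  # false / true
        return arg == 21, pos
    raise ValueError(f"unsupported CBOR major type {major}")


def describe(payload: bytes) -> dict:
    """Decode the metadata map and render known keys in readable form."""
    decoded, _ = decode_item(payload)
    readable = {}
    for key, value in decoded.items():
        if key == "ipfs" and isinstance(value, bytes):
            readable[key] = base58(value)
        elif key == "solc" and isinstance(value, bytes) and len(value) == 3:
            readable[key] = ".".join(str(b) for b in value)
        elif isinstance(value, bytes):
            readable[key] = value.hex()
        else:
            readable[key] = value
    return readable
```

For real work a maintained CBOR library is the safer choice; the point of the sketch is only to show how little structure the trailer contains.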

#### Why This Exposure Matters

For security review, the exposure of metadata references is significant:

1. **Exact Compiler Version & Settings**\
   Knowing the precise compiler version and flags can reveal which **compiler bugs** or optimizer issues may apply, turning benign‑looking code into a vulnerability candidate.
2. **Source & ABI Access**\
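   Once you resolve the IPFS CID from the CBOR, you can fetch the full metadata JSON, provided someone has published or pinned it. The metadata lists every source file with its hash and retrieval URLs (or the literal source text, if `useLiteralContent` was set), plus the ABI and NatSpec docstrings. This undermines any “obfuscation” that relied on keeping source code non‑public.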
3. **Automated Attack Planning**\
   With full source and ABI data available, tools such as Slither, hevm, and Echidna can run deeper static analysis, symbolic execution, and fuzzing against the contract.
4. **Sensitive Parameters**\
   Metadata can also leak details the developer never meant to publish, such as local file paths, internal NatSpec comments, or linked library addresses.
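
As an illustration of what a reviewer might pull out of a retrieved metadata file, the sketch below summarises the fields most relevant to the points above. The function name and the sample values are hypothetical; the key names (`compiler`, `settings`, `sources`, `output.abi`) follow the metadata layout described in the Solidity documentation, but fields can be absent, so every lookup is defensive.

```python
def summarize_metadata(metadata: dict) -> dict:
    """Collect the audit-relevant parts of a Solidity metadata JSON object."""
    settings = metadata.get("settings", {})
    sources = metadata.get("sources", {})
    abi = metadata.get("output", {}).get("abi", [])
    return {
        # Exact compiler version and optimizer settings (point 1).
        "compiler": metadata.get("compiler", {}).get("version"),
        "optimizer": settings.get("optimizer", {}),
        # Source paths can reveal project layout or local directories (point 4).
        "source_paths": sorted(sources),
        # Sources embedded literally vs. those that must be fetched by URL (point 2).
        "embedded_sources": sorted(p for p, s in sources.items() if "content" in s),
        "source_urls": {p: s.get("urls", []) for p, s in sources.items()},
        "libraries": settings.get("libraries", {}),
        "functions": sorted(e["name"] for e in abi if e.get("type") == "function"),
    }


if __name__ == "__main__":
    # Hypothetical sample, trimmed to the fields used above.
    sample = {
        "compiler": {"version": "0.8.19+commit.7dd6d404"},
        "settings": {"optimizer": {"enabled": True, "runs": 200}},
        "sources": {
            "/home/dev/vault/contracts/Vault.sol": {
                "keccak256": "0x...",
                "urls": ["dweb:/ipfs/Qm..."],
            }
        },
        "output": {"abi": [{"type": "function", "name": "withdraw"}]},
    }
    for key, value in summarize_metadata(sample).items():
        print(f"{key}: {value}")
```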

#### Disabling Metadata Append

Solidity provides a compile‑time option (`--no-cbor-metadata` on the command line, or `settings.metadata.appendCBOR: false` in the Standard JSON interface, both available since Solidity 0.8.18) to *omit* appending the CBOR metadata entirely. Omitting it can save deployment gas and prevent exposure, but it can also *break source verification workflows* that rely on the metadata hash, on tools like Etherscan or Sourcify.
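
For the Standard JSON route, a minimal sketch of building the compiler input in Python might look like this. The helper name, file name, and contract source are placeholders; only the `settings.metadata.appendCBOR` key comes from the text above, and the `outputSelection` shown is one reasonable choice rather than a requirement.

```python
import json


def standard_json_input(source_name: str, source_code: str) -> str:
    """Build a solc Standard JSON input that disables the CBOR metadata trailer."""
    compiler_input = {
        "language": "Solidity",
        "sources": {source_name: {"content": source_code}},
        "settings": {
            # The setting discussed above: do not append CBOR metadata.
            "metadata": {"appendCBOR": False},
            # Request only the runtime bytecode for every contract.
            "outputSelection": {"*": {"*": ["evm.deployedBytecode.object"]}},
        },
    }
    return json.dumps(compiler_input, indent=2)


if __name__ == "__main__":
    # Placeholder contract used purely for illustration.
    source = (
        "// SPDX-License-Identifier: MIT\n"
        "pragma solidity ^0.8.18;\n"
        "contract Example {}\n"
    )
    print(standard_json_input("Example.sol", source))
```

The printed JSON can be piped into `solc --standard-json`; comparing the resulting bytecode with a default build is a quick way to confirm the trailer is gone.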

## Practice

In a practical audit or pentest, your first step when encountering unknown bytecode on-chain is to **extract and decode the CBOR metadata**.
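
Before reaching for a dedicated tool, the runtime bytecode itself can be pulled with the standard `eth_getCode` JSON-RPC method. The sketch below uses only the Python standard library; the function names are illustrative, and the RPC URL and contract address are placeholders you would supply yourself.

```python
import json
import urllib.request


def get_code_request(address: str, request_id: int = 1) -> dict:
    """Build the JSON-RPC payload asking for the code at address (latest block)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getCode",
        "params": [address, "latest"],
    }


def fetch_runtime_bytecode(rpc_url: str, address: str) -> bytes:
    """POST the request to rpc_url and return the runtime bytecode as bytes."""
    body = json.dumps(get_code_request(address)).encode()
    request = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(f"RPC error: {reply['error']}")
    # The result is a 0x-prefixed hex string; "0x" alone means no code.
    return bytes.fromhex(reply["result"][2:])
```

An empty result means the address holds no code (for example, an externally owned account), so there is no metadata trailer to look for.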

{% tabs %}
{% tab title="SolMeta" %}
[SolMeta](https://github.com/v4resk/SolMeta) is a Python tool for extracting Solidity smart contract metadata from bytecode or a contract address.

SolMeta automates the entire process:

1. Retrieves the **runtime bytecode** via RPC, or reads it from a file if one is provided.
2. Extracts the **last 2 bytes** to determine CBOR length.
3. Reads and **decodes the CBOR map** containing IPFS/Swarm metadata references.
4. Fetches the **full metadata JSON** from IPFS or Swarm.
5. Outputs **ABI, compiler version, and source file references**.

```bash
# Extract from RPC endpoint
solmeta --rpc http://rpc.example.com --contract 0xContractAddress

# Extract from local bytecode file
solmeta --file bytecode.txt

# JSON-only output, piped to jq
solmeta --rpc http://rpc.example.com --contract 0xContractAddress --json | jq
solmeta --file bytecode.txt --json | jq

# Skip IPFS metadata fetch
solmeta --rpc http://rpc.example.com --contract 0xContractAddress --no-ipfs
```

{% endtab %}

{% tab title="Sourcify Playground" %}
For a **browser-based workflow**, the Sourcify Playground (<https://playground.sourcify.dev/>) can:

* Extract CBOR metadata
* Fetch the full metadata JSON
* Reconstruct ABI and source references for quick inspection

{% endtab %}
{% endtabs %}

## Resources

{% embed url="https://docs.soliditylang.org/en/latest/metadata.html" %}
