Hash Collisions (abi.encodePacked)

SCWE-074: Hash Collisions with Multiple Variable Length Arguments

Theory

In Solidity, abi.encodePacked() concatenates its arguments using the minimum number of bytes for each type, with no padding between values and no length prefix for dynamic types, creating a compact byte representation. Unlike abi.encode(), which pads every value to 32 bytes and encodes each dynamic type with an offset and an explicit length, abi.encodePacked() produces ambiguous encodings when combining multiple dynamic-length types.

The Collision Mechanism

When multiple dynamic-length arguments (strings, bytes, dynamic arrays) are packed together, their boundaries become indistinguishable. This allows different input combinations to produce identical hash outputs.

Example collision scenario:

solidity

abi.encodePacked("aa", "ab")  // Results in: 0x61616162
abi.encodePacked("aa", "a", "b")  // Results in: 0x61616162
abi.encodePacked("a", "aab")  // Results in: 0x61616162

All three combinations produce the same bytes, and consequently the same keccak256 hash. This occurs because abi.encodePacked() simply concatenates the UTF-8 representations without length delimiters.
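The collision can be checked on-chain with a few lines. The following is a minimal sketch; the contract and function names are hypothetical and chosen only for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Illustrative only: contract and function names are assumptions.
contract PackedCollisionDemo {
    /// Packs three different argument lists and compares their hashes.
    /// Each list packs to the same four bytes, 0x61616162.
    function sameHash() external pure returns (bool) {
        bytes32 x = keccak256(abi.encodePacked("aa", "ab"));
        bytes32 y = keccak256(abi.encodePacked("aa", "a", "b"));
        bytes32 z = keccak256(abi.encodePacked("a", "aab"));
        return x == y && y == z; // true
    }
}
```

Calling sameHash() returns true, because keccak256 only ever sees the concatenated bytes and has no way to tell where one argument ended and the next began.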

Storage Impact

Solidity mappings using keccak256(abi.encodePacked(...)) as keys are particularly vulnerable. When collision-prone encodings serve as unique identifiers, attackers can:

  • Access unauthorized data by crafting colliding keys

  • Overwrite existing entries with malicious values

  • Bypass authentication mechanisms relying on hash-based lookups

  • Manipulate signature verification systems

Why abi.encode() Prevents Collisions

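abi.encode() pads each value to 32 bytes and stores every dynamic argument behind an offset with an explicit length prefix. The encoding is therefore unambiguous: different argument lists of the same types yield different bytes, so producing a hash collision would require breaking keccak256 itself, which is computationally infeasible.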
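To make the difference concrete, the sketch below hashes the same two argument pairs with both encoders. The contract and function names are hypothetical; the byte layout in the comments follows the standard ABI encoding of two dynamic strings.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Illustrative only: contract and function names are assumptions.
contract EncodeComparison {
    /// Packed mode: both pairs collapse to the bytes 0x61616162.
    function packedCollides() external pure returns (bool) {
        return keccak256(abi.encodePacked("aa", "ab"))
            == keccak256(abi.encodePacked("a", "aab")); // true
    }

    /// Standard mode: abi.encode("aa", "ab") is laid out as
    ///   0x00: 0x40      offset of the first string
    ///   0x20: 0x80      offset of the second string
    ///   0x40: 2         length of "aa"
    ///   0x60: "aa"      right-padded to 32 bytes
    ///   0x80: 2         length of "ab"
    ///   0xa0: "ab"      right-padded to 32 bytes
    /// For ("a", "aab") the two length words are 1 and 3, so the bytes differ.
    function encodeCollides() external pure returns (bool) {
        return keccak256(abi.encode("aa", "ab"))
            == keccak256(abi.encode("a", "aab")); // false
    }
}
```

The cost of the unambiguous encoding is size: six 32-byte words instead of four bytes, which slightly increases the gas spent on hashing.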

Practice

When auditing smart contracts, identify any use of abi.encodePacked() combined with keccak256() for generating mapping keys, signatures, or unique identifiers. Focus on functions where users can control multiple dynamic-length arguments.

High-Risk Patterns

Critical vulnerabilities occur when:

  • Multiple string or bytes parameters are passed to abi.encodePacked()

  • Two or more dynamic arrays (for example address[] and uint256[]) are concatenated

  • User-controlled inputs are hashed without fixed-length separators

  • Signature schemes rely on encodePacked for message construction

  • Access control or authentication uses collision-prone hash keys

Safe usage scenarios:

  • Single dynamic-length argument: abi.encodePacked(string)

  • Only fixed-size types: abi.encodePacked(uint256, address, bytes32)

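  • At most one dynamic-length argument mixed with fixed-size types: abi.encodePacked(uint256, address, string). Note that placing a fixed-size value between two dynamic values, as in abi.encodePacked(string1, uint256, string2), does not fully remove the ambiguity when an attacker controls every argument, because the boundaries can still shift; prefer abi.encode() in that case.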
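When two dynamic values must be combined into one key, two common ways to keep the boundary unambiguous are sketched below. The library and function names are hypothetical and only illustrate the idea.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Illustrative only: library and function names are assumptions.
library SafeKeys {
    /// Option 1: abi.encode keeps an explicit length for each string.
    function pairKey(string memory a, string memory b)
        internal
        pure
        returns (bytes32)
    {
        return keccak256(abi.encode(a, b));
    }

    /// Option 2: hash each dynamic value first, then pack the two
    /// fixed-size bytes32 digests (always exactly 64 bytes).
    function hashedPairKey(string memory a, string memory b)
        internal
        pure
        returns (bytes32)
    {
        return keccak256(
            abi.encodePacked(keccak256(bytes(a)), keccak256(bytes(b)))
        );
    }
}
```

The two options produce different keys for the same inputs, so a contract should pick one and use it consistently for both writes and lookups.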

Vulnerable Contract Example

Example 1 - Decentralized registry system

Consider a decentralized registry system where users can claim domain names by combining a username with a top-level domain. The contract uses a mapping to track ownership, with the key generated by hashing the concatenated strings. This design appears reasonable at first glance since each user should have a unique combination of name and domain.

The vulnerability lies in the boundary ambiguity between the two string parameters. When Alice attempts to register ("alice", "example.com"), the contract concatenates these into "aliceexample.com" before hashing. However, an attacker monitoring the mempool can front-run her transaction by registering ("aliceexample", ".com") instead, which produces the identical concatenated string and therefore the same hash.

Scenario: Alice registers ("alice", "example.com")

Attack: Bob finds the colliding pair ("aliceexample", ".com") and registers it first

When Alice attempts registration, it fails because Bob already owns that hash key. Bob effectively squatted on Alice's intended registration through collision.
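A registry with this flaw might look like the following sketch. The contract name, the ownerOf mapping, and the register function are assumptions made for illustration; they are not taken from a specific deployed contract.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical registry used to illustrate the scenario above.
contract VulnerableRegistry {
    // key = keccak256(abi.encodePacked(name, tld)) => owner
    mapping(bytes32 => address) public ownerOf;

    function register(string calldata name, string calldata tld) external {
        // VULNERABLE: two dynamic strings are packed with no boundary,
        // so ("alice", "example.com") and ("aliceexample", ".com")
        // both hash the bytes of "aliceexample.com".
        bytes32 key = keccak256(abi.encodePacked(name, tld));
        require(ownerOf[key] == address(0), "already registered");
        ownerOf[key] = msg.sender;
    }
}

// Sequence of calls in the scenario:
//   1. Bob:   register("aliceexample", ".com")   -> succeeds, Bob owns the key
//   2. Alice: register("alice", "example.com")   -> reverts "already registered"
//
// Replacing the key derivation with keccak256(abi.encode(name, tld))
// gives the two pairs different keys, and Alice's call succeeds.
```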

Below is a small proof of concept to find collisions:
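One possible form of such a proof of concept is sketched here. It takes a pair of strings, moves the boundary to an arbitrary byte position, and confirms that the new pair hashes to the same value. The contract and function names are hypothetical, and the sketch assumes ASCII input so that any byte position is a valid place to split.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Illustrative only: contract and function names are assumptions.
contract CollisionFinder {
    /// Re-splits the concatenation of `a` and `b` at byte index `split`
    /// and reports whether the new pair collides with the original one.
    function shiftBoundary(string memory a, string memory b, uint256 split)
        external
        pure
        returns (string memory a2, string memory b2, bool collides)
    {
        bytes memory joined = abi.encodePacked(a, b);
        require(split <= joined.length, "split out of range");

        bytes memory left = new bytes(split);
        bytes memory right = new bytes(joined.length - split);
        for (uint256 i = 0; i < joined.length; i++) {
            if (i < split) {
                left[i] = joined[i];
            } else {
                right[i - split] = joined[i];
            }
        }

        a2 = string(left);
        b2 = string(right);
        // Always true: both pairs pack to the same bytes.
        collides = keccak256(abi.encodePacked(a2, b2))
            == keccak256(abi.encodePacked(a, b));
    }
}
```

For example, shiftBoundary("alice", "example.com", 12) returns ("aliceexample", ".com", true), which is the pair used in the front-running attack described above.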

Example 2 - Signature Collision Attack

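Hash collisions in signature verification can enable privilege escalation. If the signed message hash is built with abi.encodePacked(action, target), the signature covers only the concatenated bytes, so an attacker can craft a colliding (action, target) pair and reuse a valid signature for an operation the signer never approved: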

  • Admin signs: ("transfer", "100")

  • Attacker submits: ("transfe", "r100") with the same signature
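A verifier with this weakness could look like the sketch below. The contract, its admin field, and the function names are hypothetical; nonces, chain IDs, and other replay protection are deliberately left out to keep the example focused on the encoding issue.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical verifier used to illustrate the attack above.
contract VulnerableAuth {
    address public immutable admin;

    constructor(address admin_) {
        admin = admin_;
    }

    /// VULNERABLE: ("transfer", "100") and ("transfe", "r100")
    /// both hash the bytes of "transfer100".
    function messageHash(string memory action, string memory target)
        public
        pure
        returns (bytes32)
    {
        return keccak256(abi.encodePacked(action, target));
    }

    /// Unambiguous alternative: each string keeps its own length.
    function safeMessageHash(string memory action, string memory target)
        public
        pure
        returns (bytes32)
    {
        return keccak256(abi.encode(action, target));
    }

    /// Accepts any (action, target) whose packed hash the admin signed.
    function isAuthorized(
        string calldata action,
        string calldata target,
        uint8 v,
        bytes32 r,
        bytes32 s
    ) external view returns (bool) {
        address signer = ecrecover(messageHash(action, target), v, r, s);
        return signer != address(0) && signer == admin;
    }
}
```

With messageHash, a signature the admin produced for ("transfer", "100") also passes isAuthorized for ("transfe", "r100"). Switching the verifier to safeMessageHash makes the two pairs hash differently, so the reused signature recovers a different address and the check fails.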
