🔐 Integrity Functions

Datason includes a set of helpers for verifying data integrity and authenticity. These utilities provide deterministic hashing, optional redaction, and Ed25519 signatures.

🎯 Function Overview

| Function | Purpose | Best For |
|---|---|---|
| canonicalize() | Deterministic JSON output | Stable hashing |
| hash_object() | Hash Python objects | Audit trails |
| hash_json() | Hash JSON structures | API responses |
| verify_object() | Verify object integrity | Validation |
| verify_json() | Verify JSON integrity | API testing |
| hash_and_redact() | Redact then hash | Compliance |
| sign_object() | Create digital signature | Authenticity |
| verify_signature() | Validate signature | Document integrity |

📦 Detailed Function Documentation

canonicalize()

datason.integrity.canonicalize(obj: Any, *, redact: dict[str, Any] | None = None) -> str

Return a canonical JSON representation of obj.

If redact is provided, redaction is applied before serialization. The output uses sorted keys and compact separators to ensure stable ordering for hashing.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to canonicalize | required |
| redact | dict[str, Any] \| None | Optional redaction configuration. If provided but redaction module is not available, raises RuntimeError. | None |

Returns:

| Type | Description |
|---|---|
| str | Canonical JSON string representation |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If redact is specified but redaction module is unavailable |

Source code in datason/integrity.py
def canonicalize(obj: Any, *, redact: dict[str, Any] | None = None) -> str:
    """Return a canonical JSON representation of ``obj``.

    If ``redact`` is provided, redaction is applied before serialization.
    The output uses sorted keys and compact separators to ensure stable
    ordering for hashing.

    Args:
        obj: Object to canonicalize
        redact: Optional redaction configuration. If provided but redaction
               module is not available, raises RuntimeError.

    Returns:
        Canonical JSON string representation

    Raises:
        RuntimeError: If redact is specified but redaction module is unavailable
    """
    if redact:
        obj = _apply_redaction(obj, redact)

    serialized = serialize(obj)
    return json.dumps(serialized, sort_keys=True, separators=(",", ":"))
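
The effect of `sort_keys=True` and compact separators can be seen with the standard library alone. The sketch below reuses the `json.dumps` arguments from the listing above but skips datason's `serialize()` step, so it only covers plain JSON-compatible values; `canonical_json` is a hypothetical stand-in name, not part of datason.

```python
import json


def canonical_json(data):
    # Same json.dumps arguments as the listing above; datason's serialize()
    # step is omitted, so this only handles plain JSON-compatible values.
    return json.dumps(data, sort_keys=True, separators=(",", ":"))


a = {"id": 1, "tags": ["x", "y"], "meta": {"b": 2, "a": 1}}
b = {"meta": {"a": 1, "b": 2}, "tags": ["x", "y"], "id": 1}

print(canonical_json(a))
# {"id":1,"meta":{"a":1,"b":2},"tags":["x","y"]}
print(canonical_json(a) == canonical_json(b))  # True
print(json.dumps(a) == json.dumps(b))  # False: default output keeps insertion order
```

Keys are sorted at every nesting level, but list order is preserved, so two objects whose lists differ only in element order still produce different canonical strings.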

hash_object()

datason.integrity.hash_object(obj: Any, *, redact: dict[str, Any] | None = None, hash_algo: str = 'sha256') -> str

Compute a deterministic hash of obj.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to hash | required |
| redact | dict[str, Any] \| None | Optional redaction configuration. If provided but redaction module is not available, raises RuntimeError. | None |
| hash_algo | str | Hash algorithm to use (default: sha256). Must be a strong algorithm. | 'sha256' |

Returns:

| Type | Description |
|---|---|
| str | Hexadecimal hash string |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If redact is specified but redaction module is unavailable |
| ValueError | If an unsupported hash algorithm is specified |

Source code in datason/integrity.py
def hash_object(obj: Any, *, redact: dict[str, Any] | None = None, hash_algo: str = "sha256") -> str:
    """Compute a deterministic hash of ``obj``.

    Args:
        obj: Object to hash
        redact: Optional redaction configuration. If provided but redaction
               module is not available, raises RuntimeError.
        hash_algo: Hash algorithm to use (default: sha256). Must be a strong
                  algorithm.

    Returns:
        Hexadecimal hash string

    Raises:
        RuntimeError: If redact is specified but redaction module is unavailable
        ValueError: If an unsupported hash algorithm is specified
    """
    validate_hash_algorithm(hash_algo)
    canon = canonicalize(obj, redact=redact)
    h = hashlib.new(hash_algo)
    h.update(canon.encode("utf-8"))
    return h.hexdigest()
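
As a rough illustration of the hashing step, the following sketch feeds a canonical JSON string to `hashlib.new`, as the listing above does. It is an approximation under stated assumptions: `hash_canonical` is a hypothetical name, there is no redaction and no datason `serialize()` call, and the `validate_hash_algorithm()` check is left out because the set of algorithms it accepts is not shown on this page.

```python
import hashlib
import json


def hash_canonical(data, hash_algo="sha256"):
    # Stand-in for hash_object(): no redaction, no datason serialize(), and
    # no validate_hash_algorithm() check (its accepted set is not shown here).
    canon = json.dumps(data, sort_keys=True, separators=(",", ":"))
    h = hashlib.new(hash_algo)
    h.update(canon.encode("utf-8"))
    return h.hexdigest()


record = {"id": 1, "value": 42}
reordered = {"value": 42, "id": 1}

digest_256 = hash_canonical(record)
digest_512 = hash_canonical(record, hash_algo="sha512")

print(len(digest_256), len(digest_512))  # 64 128 (hex characters)
print(digest_256 == hash_canonical(reordered))  # True
```

Because the digest is computed over the canonical string, key order in the input does not change the result, while the choice of `hash_algo` changes both the value and the length of the hex string. A stored hash is therefore only meaningful together with the algorithm that produced it.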

hash_json()

datason.integrity.hash_json(json_data: Any, hash_algo: str = 'sha256') -> str

Compute a deterministic hash for a JSON-compatible structure.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| json_data | Any | JSON-compatible structure to hash | required |
| hash_algo | str | Hash algorithm to use (default: sha256). Must be a strong algorithm. | 'sha256' |

Returns:

| Type | Description |
|---|---|
| str | Hexadecimal hash string |

Raises:

| Type | Description |
|---|---|
| ValueError | If an unsupported hash algorithm is specified |

Source code in datason/integrity.py
def hash_json(json_data: Any, hash_algo: str = "sha256") -> str:
    """Compute a deterministic hash for a JSON-compatible structure.

    Args:
        json_data: JSON-compatible structure to hash
        hash_algo: Hash algorithm to use (default: sha256). Must be a strong
                  algorithm.

    Returns:
        Hexadecimal hash string

    Raises:
        ValueError: If an unsupported hash algorithm is specified
    """
    validate_hash_algorithm(hash_algo)
    canon = json.dumps(json_data, sort_keys=True, separators=(",", ":"))
    h = hashlib.new(hash_algo)
    h.update(canon.encode("utf-8"))
    return h.hexdigest()
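
Unlike `hash_object()`, the listing above passes `json_data` straight to `json.dumps` without a `serialize()` step, so values that the standard encoder cannot handle would be expected to raise `TypeError`. The sketch below shows that behavior with the standard library only. The `isoformat()` conversion is one possible encoding chosen for illustration; it is not a claim about how datason encodes datetimes.

```python
import json
from datetime import datetime, timezone

event = {"id": 7, "at": datetime(2024, 1, 1, tzinfo=timezone.utc)}

try:
    json.dumps(event, sort_keys=True, separators=(",", ":"))
    plain_json_ok = True
except TypeError:
    # datetime is not JSON-serializable by the standard library encoder
    plain_json_ok = False

# Converting to JSON-compatible values first makes the same call succeed.
# isoformat() is just one possible encoding, chosen here for illustration.
compatible = {"id": 7, "at": event["at"].isoformat()}
canon = json.dumps(compatible, sort_keys=True, separators=(",", ":"))

print(plain_json_ok)  # False
print(canon)  # {"at":"2024-01-01T00:00:00+00:00","id":7}
```

In practice this suggests reserving `hash_json()` for data that is already JSON-compatible, such as a decoded API response, and using `hash_object()` for richer Python objects.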

verify_object()

datason.integrity.verify_object(obj: Any, expected_hash: str, *, redact: dict[str, Any] | None = None, hash_algo: str = 'sha256') -> bool

Verify that obj hashes to expected_hash.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to verify | required |
| expected_hash | str | Expected hash value | required |
| redact | dict[str, Any] \| None | Optional redaction configuration. If provided but redaction module is not available, raises RuntimeError. | None |
| hash_algo | str | Hash algorithm to use (default: sha256). Must be a strong algorithm. | 'sha256' |

Returns:

| Type | Description |
|---|---|
| bool | True if hash matches, False otherwise |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If redact is specified but redaction module is unavailable |
| ValueError | If an unsupported hash algorithm is specified |

Source code in datason/integrity.py
def verify_object(
    obj: Any,
    expected_hash: str,
    *,
    redact: dict[str, Any] | None = None,
    hash_algo: str = "sha256",
) -> bool:
    """Verify that ``obj`` hashes to ``expected_hash``.

    Args:
        obj: Object to verify
        expected_hash: Expected hash value
        redact: Optional redaction configuration. If provided but redaction
               module is not available, raises RuntimeError.
        hash_algo: Hash algorithm to use (default: sha256). Must be a strong
                  algorithm.

    Returns:
        True if hash matches, False otherwise

    Raises:
        RuntimeError: If redact is specified but redaction module is unavailable
        ValueError: If an unsupported hash algorithm is specified
    """
    actual = hash_object(obj, redact=redact, hash_algo=hash_algo)
    return actual == expected_hash

verify_json()

datason.integrity.verify_json(json_data: Any, expected_hash: str, hash_algo: str = 'sha256') -> bool

Verify a JSON-compatible structure against expected_hash.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| json_data | Any | JSON-compatible structure to verify | required |
| expected_hash | str | Expected hash value | required |
| hash_algo | str | Hash algorithm to use (default: sha256). Must be a strong algorithm. | 'sha256' |

Returns:

| Type | Description |
|---|---|
| bool | True if hash matches, False otherwise |

Raises:

| Type | Description |
|---|---|
| ValueError | If an unsupported hash algorithm is specified |

Source code in datason/integrity.py
def verify_json(json_data: Any, expected_hash: str, hash_algo: str = "sha256") -> bool:
    """Verify a JSON-compatible structure against ``expected_hash``.

    Args:
        json_data: JSON-compatible structure to verify
        expected_hash: Expected hash value
        hash_algo: Hash algorithm to use (default: sha256). Must be a strong
                  algorithm.

    Returns:
        True if hash matches, False otherwise

    Raises:
        ValueError: If an unsupported hash algorithm is specified
    """
    actual = hash_json(json_data, hash_algo=hash_algo)
    return actual == expected_hash
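
Verification amounts to recomputing the hash and comparing strings. The sketch below mirrors that flow with the standard library, using hypothetical `digest` and `verify` helpers. One deliberate difference: it compares with `hmac.compare_digest`, a constant-time comparison, whereas the listings above use a plain `==`. Treat that as a variation a caller might prefer, not as datason's behavior.

```python
import hashlib
import hmac
import json


def digest(data):
    canon = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


def verify(data, expected_hash):
    # hmac.compare_digest is a constant-time comparison; the listings above
    # use a plain == instead, so this is a variation, not datason's behavior.
    return hmac.compare_digest(digest(data), expected_hash)


response = {"status": "ok", "items": [1, 2, 3]}
expected = digest(response)
tampered = {"status": "ok", "items": [1, 2, 4]}

print(verify(response, expected))  # True
print(verify(tampered, expected))  # False
```

Either way the comparison is an exact string match, so an expected hash stored in uppercase hex, or computed with a different algorithm, will not verify.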

hash_and_redact()

datason.integrity.hash_and_redact(obj: Any, *, redact: dict[str, Any] | None = None, hash_algo: str = 'sha256') -> tuple[Any, str]

Redact obj, hash the result, and return (redacted_obj, hash).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to redact and hash | required |
| redact | dict[str, Any] \| None | Redaction configuration. If None, no redaction is applied. If provided but redaction module is not available, raises RuntimeError. | None |
| hash_algo | str | Hash algorithm to use (default: sha256). Must be a strong algorithm. | 'sha256' |

Returns:

| Type | Description |
|---|---|
| tuple[Any, str] | Tuple of (redacted_object, hash_string) |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If redact is specified but redaction module is unavailable |
| ValueError | If an unsupported hash algorithm is specified |

Source code in datason/integrity.py
def hash_and_redact(obj: Any, *, redact: dict[str, Any] | None = None, hash_algo: str = "sha256") -> tuple[Any, str]:
    """Redact ``obj``, hash the result, and return ``(redacted_obj, hash)``.

    Args:
        obj: Object to redact and hash
        redact: Redaction configuration. If None, no redaction is applied.
               If provided but redaction module is not available, raises RuntimeError.
        hash_algo: Hash algorithm to use (default: sha256). Must be a strong
                  algorithm.

    Returns:
        Tuple of (redacted_object, hash_string)

    Raises:
        RuntimeError: If redact is specified but redaction module is unavailable
        ValueError: If an unsupported hash algorithm is specified
    """
    redacted = _apply_redaction(obj, redact or {}) if redact else obj
    hash_val = hash_object(redacted, hash_algo=hash_algo)
    return redacted, hash_val
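
The order of operations matters here: the hash is computed over the redacted object, not the original. The sketch below illustrates the consequence using a hypothetical `drop_fields` helper in place of datason's redaction, whose configuration format is not shown on this page. All function names in the block are illustrative assumptions.

```python
import hashlib
import json


def drop_fields(data, fields):
    # Hypothetical redaction step: replaces the named top-level keys with a marker.
    return {k: ("<REDACTED>" if k in fields else v) for k, v in data.items()}


def digest(data):
    canon = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()


def redact_then_hash(data, fields):
    redacted = drop_fields(data, fields)
    return redacted, digest(redacted)


user_a = {"id": 1, "email": "a@example.com"}
user_b = {"id": 1, "email": "b@example.com"}

redacted_a, hash_a = redact_then_hash(user_a, {"email"})
redacted_b, hash_b = redact_then_hash(user_b, {"email"})

print(redacted_a)  # {'id': 1, 'email': '<REDACTED>'}
print(hash_a == hash_b)  # True: the hash covers only the redacted form
print(hash_a == digest(user_a))  # False
```

Two consequences follow. Anyone holding only the redacted object can check the hash, which suits audit logs that must not retain sensitive values. Conversely, the hash says nothing about the redacted fields, so objects that differ only in those fields are indistinguishable by hash.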

sign_object()

datason.integrity.sign_object(obj: Any, private_key_pem: str, *, redact: dict[str, Any] | None = None) -> str

Sign obj with an Ed25519 private key.

The signature is returned as base64-encoded text. This function imports the cryptography package lazily, so that package is required only when signing is actually used.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to sign | required |
| private_key_pem | str | Ed25519 private key in PEM format | required |
| redact | dict[str, Any] \| None | Optional redaction configuration. If provided but redaction module is not available, raises RuntimeError. | None |

Returns:

| Type | Description |
|---|---|
| str | Base64-encoded signature string |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If cryptography package is not available or if redact is specified but redaction module is unavailable |
| TypeError | If key is not an Ed25519 private key |

Source code in datason/integrity.py
def sign_object(
    obj: Any,
    private_key_pem: str,
    *,
    redact: dict[str, Any] | None = None,
) -> str:
    """Sign ``obj`` with an Ed25519 private key.

    The signature is returned as base64-encoded text. This function lazily
    imports :mod:`cryptography` so the package is only required when used.

    Args:
        obj: Object to sign
        private_key_pem: Ed25519 private key in PEM format
        redact: Optional redaction configuration. If provided but redaction
               module is not available, raises RuntimeError.

    Returns:
        Base64-encoded signature string

    Raises:
        RuntimeError: If cryptography package is not available or if redact
                     is specified but redaction module is unavailable
        TypeError: If key is not an Ed25519 private key
    """

    try:  # Lazy import
        from cryptography.hazmat.primitives import serialization
        from cryptography.hazmat.primitives.asymmetric.ed25519 import (
            Ed25519PrivateKey,
        )
    except ImportError as exc:  # pragma: no cover - optional dependency
        raise RuntimeError("cryptography is required for signing") from exc

    canon = canonicalize(obj, redact=redact)
    private_key = serialization.load_pem_private_key(private_key_pem.encode("utf-8"), password=None)
    if not isinstance(private_key, Ed25519PrivateKey):
        raise TypeError("Only Ed25519 keys are supported for signing")

    signature = private_key.sign(canon.encode("utf-8"))
    return base64.b64encode(signature).decode("ascii")

verify_signature()

datason.integrity.verify_signature(obj: Any, signature: str, public_key_pem: str, *, redact: dict[str, Any] | None = None) -> bool

Verify signature for obj using the given Ed25519 public key.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obj | Any | Object to verify signature for | required |
| signature | str | Base64-encoded signature to verify | required |
| public_key_pem | str | Ed25519 public key in PEM format | required |
| redact | dict[str, Any] \| None | Optional redaction configuration. If provided but redaction module is not available, raises RuntimeError. | None |

Returns:

| Type | Description |
|---|---|
| bool | True if signature is valid, False otherwise |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If cryptography package is not available or if redact is specified but redaction module is unavailable |
| TypeError | If key is not an Ed25519 public key |

Source code in datason/integrity.py
def verify_signature(
    obj: Any,
    signature: str,
    public_key_pem: str,
    *,
    redact: dict[str, Any] | None = None,
) -> bool:
    """Verify ``signature`` for ``obj`` using the given Ed25519 public key.

    Args:
        obj: Object to verify signature for
        signature: Base64-encoded signature to verify
        public_key_pem: Ed25519 public key in PEM format
        redact: Optional redaction configuration. If provided but redaction
               module is not available, raises RuntimeError.

    Returns:
        True if signature is valid, False otherwise

    Raises:
        RuntimeError: If cryptography package is not available or if redact
                     is specified but redaction module is unavailable
        TypeError: If key is not an Ed25519 public key
    """

    try:  # Lazy import
        from cryptography.hazmat.primitives import serialization
        from cryptography.hazmat.primitives.asymmetric.ed25519 import (
            Ed25519PublicKey,
        )
    except ImportError as exc:  # pragma: no cover - optional dependency
        raise RuntimeError("cryptography is required for signature verification") from exc

    canon = canonicalize(obj, redact=redact)
    public_key = serialization.load_pem_public_key(public_key_pem.encode("utf-8"))
    if not isinstance(public_key, Ed25519PublicKey):
        raise TypeError("Only Ed25519 keys are supported for verification")

    try:
        public_key.verify(base64.b64decode(signature), canon.encode("utf-8"))
        return True
    except Exception:  # pragma: no cover - verification failure path
        return False
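
Because the final `try` block in the listing above catches every exception, a malformed signature string and a well-formed but wrong signature both produce `False`. A caller who wants to tell the two apart could check the encoding first. The sketch below is one way to do that with the standard library; `decode_signature` is a hypothetical helper, not part of datason. It relies on the fact that Ed25519 signatures are always 64 bytes, which is 88 characters in base64.

```python
import base64
import binascii

ED25519_SIGNATURE_BYTES = 64  # fixed signature size for Ed25519


def decode_signature(signature_b64):
    """Return the raw signature bytes, or None if the text is malformed."""
    try:
        raw = base64.b64decode(signature_b64, validate=True)
    except (binascii.Error, ValueError):
        return None
    return raw if len(raw) == ED25519_SIGNATURE_BYTES else None


# All-zero bytes used purely as a length placeholder, not a real signature.
placeholder = base64.b64encode(bytes(64)).decode("ascii")

print(len(placeholder))  # 88
print(decode_signature(placeholder) is not None)  # True
print(decode_signature("not base64!"))  # None
print(decode_signature(base64.b64encode(b"short").decode("ascii")))  # None
```

A `None` result here indicates a transport or encoding problem, whereas a `False` from `verify_signature()` on a well-formed signature indicates that the object, the key, or the signature itself does not match.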

Basic Verification Example:

from datason import integrity

payload = {"id": 1, "value": 42}
hash_val = integrity.hash_object(payload)
assert integrity.verify_object(payload, hash_val)

Signing Example:

from datason import integrity
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives import serialization

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()
private_pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
).decode("utf-8")
public_pem = public_key.public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
).decode("utf-8")

document = {"msg": "hello"}

signature = integrity.sign_object(document, private_pem)
assert integrity.verify_signature(document, signature, public_pem)