⚙️ Configuration System

Configuration classes and preset functions for customizing serialization behavior, with special focus on UUID handling and web framework compatibility.

🎯 Overview

The configuration system controls datason's serialization behavior through the SerializationConfig class and a set of preset configurations. Most importantly, it resolves UUID type mismatches with Pydantic, FastAPI, Django, and other web frameworks.

🚀 Quick Start: UUID + Pydantic Compatibility

The #1 use case: Fix UUID compatibility with Pydantic models in FastAPI/Django:

import datason
from datason.config import get_api_config

# ❌ Problem: Default behavior converts UUIDs to objects
data = {"user_id": "12345678-1234-5678-9012-123456789abc"}
result = datason.auto_deserialize(data)  # user_id becomes UUID object

# ✅ Solution: Use API config to keep UUIDs as strings
api_config = get_api_config()
result = datason.auto_deserialize(data, config=api_config)  # user_id stays string

# Now works with Pydantic!
from pydantic import BaseModel
class User(BaseModel):
    user_id: str  # ✅ Works perfectly!

user = User(**result)  # Success! 🎉

📦 SerializationConfig Class

Main configuration class for all serialization options.

Key UUID Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| uuid_format | str | "object" | "object" converts to uuid.UUID; "string" keeps the value as str |
| parse_uuids | bool | True | Whether to attempt UUID string parsing at all |
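How the two options interact can be sketched in plain Python. The helper `convert_uuid_string` below is hypothetical and uses only the standard library; it is not datason's implementation, and it only mirrors the behavior the table describes for a single string value.

```python
import uuid
from typing import Union


def convert_uuid_string(
    value: str, uuid_format: str = "object", parse_uuids: bool = True
) -> Union[str, uuid.UUID]:
    """Hypothetical stand-in for the behavior described in the table."""
    if not parse_uuids or uuid_format == "string":
        return value  # leave the string untouched
    try:
        return uuid.UUID(value)  # "object": convert valid UUID strings
    except ValueError:
        return value  # not a UUID, keep the original string


raw = "12345678-1234-5678-9012-123456789abc"
as_object = convert_uuid_string(raw)  # defaults: a uuid.UUID instance
as_string = convert_uuid_string(raw, uuid_format="string", parse_uuids=False)
```

With the defaults the value comes back as a `uuid.UUID`; with the API-style settings it stays the original `str`, which is what a Pydantic field declared as `str` expects.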

datason.SerializationConfig(
    date_format: DateFormat = DateFormat.ISO,
    custom_date_format: Optional[str] = None,
    uuid_format: str = 'object',
    parse_uuids: bool = True,
    dataframe_orient: DataFrameOrient = DataFrameOrient.RECORDS,
    datetime_output: OutputType = OutputType.JSON_SAFE,
    series_output: OutputType = OutputType.JSON_SAFE,
    dataframe_output: OutputType = OutputType.JSON_SAFE,
    numpy_output: OutputType = OutputType.JSON_SAFE,
    nan_handling: NanHandling = NanHandling.NULL,
    type_coercion: TypeCoercion = TypeCoercion.SAFE,
    preserve_decimals: bool = True,
    preserve_complex: bool = True,
    max_depth: int = 50,
    max_size: int = 100000,
    max_string_length: int = 1000000,
    custom_serializers: Optional[Dict[type, Callable[[Any], Any]]] = None,
    sort_keys: bool = False,
    ensure_ascii: bool = False,
    check_if_serialized: bool = False,
    include_type_hints: bool = False,
    auto_detect_types: bool = False,
    redact_fields: Optional[List[str]] = None,
    redact_patterns: Optional[List[str]] = None,
    redact_large_objects: bool = False,
    redaction_replacement: str = '<REDACTED>',
    include_redaction_summary: bool = False,
    audit_trail: bool = False,
    cache_scope: CacheScope = CacheScope.OPERATION,
    cache_size_limit: int = 1000,
    cache_warn_on_limit: bool = True,
    cache_metrics_enabled: bool = False,
) dataclass

Configuration for datason serialization behavior.

Attributes:

| Name | Type | Description |
|---|---|---|
| date_format | DateFormat | How to format datetime objects |
| custom_date_format | Optional[str] | Custom strftime format when date_format is CUSTOM |
| dataframe_orient | DataFrameOrient | Pandas DataFrame orientation |
| datetime_output | OutputType | How to output datetime objects |
| series_output | OutputType | How to output pandas Series |
| dataframe_output | OutputType | How to output pandas DataFrames (overrides orient for object output) |
| numpy_output | OutputType | How to output numpy arrays |
| nan_handling | NanHandling | How to handle NaN/null values |
| type_coercion | TypeCoercion | Type coercion behavior |
| preserve_decimals | bool | Whether to preserve decimal.Decimal precision |
| preserve_complex | bool | Whether to preserve complex numbers as dict |
| max_depth | int | Maximum recursion depth (security) |
| max_size | int | Maximum collection size (security) |
| max_string_length | int | Maximum string length (security) |
| custom_serializers | Optional[Dict[type, Callable[[Any], Any]]] | Dict of type -> serializer function |
| sort_keys | bool | Whether to sort dictionary keys in output |
| ensure_ascii | bool | Whether to ensure ASCII output only |
| check_if_serialized | bool | Skip processing if object is already JSON-safe |
| include_type_hints | bool | Include type metadata for perfect round-trip deserialization |
| redact_fields | Optional[List[str]] | Field patterns to redact (e.g., ["password", "api_key", "*.secret"]) |
| redact_patterns | Optional[List[str]] | Regex patterns to redact (e.g., credit card numbers) |
| redact_large_objects | bool | Auto-redact objects >10MB |
| redaction_replacement | str | Replacement text for redacted content |
| include_redaction_summary | bool | Include summary of what was redacted |
| audit_trail | bool | Track all redaction operations for compliance |
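The custom_serializers attribute is documented only as a "Dict of type -> serializer function". The sketch below shows what such a mapping could look like and how a dispatcher might apply it. The `apply_custom_serializers` function is hypothetical and written with the standard library only; how datason itself walks the object and applies the mapping is not shown in this document.

```python
import json
from decimal import Decimal
from typing import Any, Callable, Dict

# Same shape as the custom_serializers option: type -> serializer function.
custom_serializers: Dict[type, Callable[[Any], Any]] = {
    Decimal: str,               # keep exact precision as text
    set: lambda s: sorted(s),   # sets have no JSON form; emit a sorted list
}


def apply_custom_serializers(obj: Any) -> Any:
    """Hypothetical dispatcher: recurse through containers, convert registered types."""
    serializer = custom_serializers.get(type(obj))
    if serializer is not None:
        return serializer(obj)
    if isinstance(obj, dict):
        return {key: apply_custom_serializers(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [apply_custom_serializers(item) for item in obj]
    return obj


payload = {"price": Decimal("19.99"), "tags": {"b", "a"}}
encoded = json.dumps(apply_custom_serializers(payload))
```

A mapping of this shape would presumably be passed as `SerializationConfig(custom_serializers=...)`, per the signature above.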

🔧 Configuration Presets

Pre-built configurations for common scenarios.

get_api_config() ⭐ Most Used

Perfect for web APIs, Pydantic models, and framework integration.

  • ✅ Keeps UUIDs as strings (Pydantic compatible)
  • ✅ ISO datetime format
  • ✅ Consistent JSON output
  • ✅ Safe for HTTP clients

datason.get_api_config() -> SerializationConfig

Get configuration optimized for API responses.

Returns:

Type Description
SerializationConfig

Configuration with clean, consistent output for web APIs

Source code in datason/config.py
def get_api_config() -> SerializationConfig:
    """Get configuration optimized for API responses.

    Returns:
        Configuration with clean, consistent output for web APIs
    """
    return SerializationConfig(
        date_format=DateFormat.ISO,
        dataframe_orient=DataFrameOrient.RECORDS,
        nan_handling=NanHandling.NULL,
        type_coercion=TypeCoercion.SAFE,
        preserve_decimals=True,
        preserve_complex=True,
        sort_keys=True,
        ensure_ascii=True,  # Safe for all HTTP clients
        # NEW: Keep UUIDs as strings for API compatibility (Pydantic/FastAPI)
        uuid_format="string",
        parse_uuids=False,
    )
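The sort_keys and ensure_ascii flags share their names with arguments of the standard library's `json.dumps`. Assuming they carry the same meaning in datason (this page does not spell it out), their effect on the output can be previewed with the standard library alone:

```python
import json

record = {"name": "Zoë", "id": 7}

# Insertion order, non-ASCII characters emitted as-is.
plain = json.dumps(record, ensure_ascii=False)
# Sorted keys, non-ASCII characters escaped: the combination the API preset selects.
api_style = json.dumps(record, sort_keys=True, ensure_ascii=True)

print(plain)      # {"name": "Zoë", "id": 7}
print(api_style)  # {"id": 7, "name": "Zo\u00eb"}
```

Sorted keys make responses byte-for-byte reproducible, and ASCII-only output avoids depending on how each HTTP client decodes the body.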

Use Cases:

  • FastAPI + Pydantic applications
  • Django REST Framework APIs
  • Flask JSON endpoints
  • Any web API requiring string UUIDs

get_ml_config()

Optimized for machine learning and data processing workflows.

  • 🔬 Converts UUIDs to objects (for ML processing)
  • 🔬 Rich type preservation
  • 🔬 Optimized for scientific computing

datason.get_ml_config() -> SerializationConfig

Get configuration optimized for ML workflows.

Returns:

Type Description
SerializationConfig

Configuration with aggressive type coercion and tensor-friendly settings

Source code in datason/config.py
def get_ml_config() -> SerializationConfig:
    """Get configuration optimized for ML workflows.

    Returns:
        Configuration with aggressive type coercion and tensor-friendly settings
    """
    return SerializationConfig(
        date_format=DateFormat.UNIX_MS,
        dataframe_orient=DataFrameOrient.RECORDS,
        nan_handling=NanHandling.NULL,
        type_coercion=TypeCoercion.AGGRESSIVE,
        preserve_decimals=False,  # ML often doesn't need exact decimal precision
        preserve_complex=False,  # ML typically converts complex to real
        sort_keys=True,  # Consistent output for ML pipelines
        include_type_hints=True,  # Enable type metadata for ML objects
    )
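The presets differ in date_format: DateFormat.ISO for the API preset and DateFormat.UNIX_MS here. The standard-library sketch below shows what these representations look like for one timestamp. It illustrates the formats in general; datason's exact output (for example, how it renders the timezone suffix) may differ.

```python
from datetime import datetime, timezone

moment = datetime(2024, 1, 15, 12, 30, tzinfo=timezone.utc)

iso_text = moment.isoformat()                 # '2024-01-15T12:30:00+00:00'
unix_seconds = moment.timestamp()             # 1705321800.0
unix_millis = int(moment.timestamp() * 1000)  # 1705321800000
```

Numeric timestamps drop straight into arrays and feature columns, which is one plausible reason an ML-oriented preset would prefer them over ISO strings.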

Use Cases:

  • Machine learning pipelines
  • Data science notebooks
  • Scientific computing
  • Internal data processing

get_strict_config()

Strict type checking for production systems: unknown types raise errors instead of being coerced.

datason.get_strict_config() -> SerializationConfig

Get configuration with strict type checking.

Returns:

Type Description
SerializationConfig

Configuration that raises errors on unknown types

Source code in datason/config.py
def get_strict_config() -> SerializationConfig:
    """Get configuration with strict type checking.

    Returns:
        Configuration that raises errors on unknown types
    """
    return SerializationConfig(
        date_format=DateFormat.ISO,
        dataframe_orient=DataFrameOrient.RECORDS,
        nan_handling=NanHandling.NULL,
        type_coercion=TypeCoercion.STRICT,
        preserve_decimals=True,
        preserve_complex=True,
    )

get_performance_config()

Optimized for high-performance scenarios.

datason.get_performance_config() -> SerializationConfig

Get configuration optimized for performance.

Returns:

Type Description
SerializationConfig

Configuration with minimal processing for maximum speed

Source code in datason/config.py
def get_performance_config() -> SerializationConfig:
    """Get configuration optimized for performance.

    Returns:
        Configuration with minimal processing for maximum speed
    """
    return SerializationConfig(
        date_format=DateFormat.UNIX,  # Fastest date format
        dataframe_orient=DataFrameOrient.VALUES,  # Fastest DataFrame format
        nan_handling=NanHandling.NULL,
        type_coercion=TypeCoercion.SAFE,
        preserve_decimals=False,  # Skip decimal preservation for speed
        preserve_complex=False,  # Skip complex preservation for speed
        sort_keys=False,  # Don't sort for speed
    )
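The difference between DataFrameOrient.RECORDS and DataFrameOrient.VALUES can be pictured without pandas. The sketch below assumes the enum members mirror pandas' `orient` names ("records" and "values"); that mapping is an assumption, not something this page confirms.

```python
columns = ["id", "score"]
rows = [[1, 0.5], [2, 0.9]]

# "records"-style: one dict per row, column names repeated in every row.
records = [dict(zip(columns, row)) for row in rows]
# "values"-style: bare nested lists, column names dropped.
values = [list(row) for row in rows]

print(records)  # [{'id': 1, 'score': 0.5}, {'id': 2, 'score': 0.9}]
print(values)   # [[1, 0.5], [2, 0.9]]
```

The values layout is smaller and cheaper to build, but the consumer must already know the column order.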

🌐 Framework-Specific Usage

FastAPI Integration

from fastapi import FastAPI
from pydantic import BaseModel

import datason
from datason.config import get_api_config

app = FastAPI()
API_CONFIG = get_api_config()  # Set once, use everywhere


class User(BaseModel):
    user_id: str  # UUID stays a string under API_CONFIG


@app.post("/users/")
async def create_user(user_data: dict):
    processed = datason.auto_deserialize(user_data, config=API_CONFIG)
    return User(**processed)  # Works with Pydantic!

Django Integration

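import json

from django.http import JsonResponse
from django.views import View

import datason
from datason.config import get_api_config

from myapp.models import User  # hypothetical import path for your Django model


class UserAPIView(View):
    # Build the config once at class level. This avoids overriding
    # View.__init__, which must accept **kwargs for as_view(**initkwargs).
    api_config = get_api_config()

    def post(self, request):
        data = json.loads(request.body)
        processed = datason.auto_deserialize(data, config=self.api_config)
        user = User.objects.create(**processed)
        return JsonResponse(user.to_dict())  # assumes the model defines to_dict()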

Flask Integration

from flask import Flask, request, jsonify
from datason.config import get_api_config
import datason

app = Flask(__name__)
API_CONFIG = get_api_config()

@app.route('/api/users/', methods=['POST'])
def create_user():
    processed = datason.auto_deserialize(request.json, config=API_CONFIG)
    # UUIDs are now strings, compatible with database operations
    return jsonify(processed)

🛠️ Custom Configuration Examples

Strict API Configuration

from datason.config import SerializationConfig

strict_api_config = SerializationConfig(
    uuid_format="string",  # Keep UUIDs as strings
    parse_uuids=False,     # Don't auto-convert to UUID objects
    max_size=1_000_000,    # Maximum collection size (per the attribute docs), not a byte limit
    max_depth=10,          # Prevent deep nesting attacks
    sort_keys=True,        # Consistent JSON output
    ensure_ascii=True,     # Safe for all HTTP clients
)
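To make the purpose of max_depth concrete, here is a minimal sketch of a nesting guard. The function `exceeds_depth` is hypothetical; datason's own depth-counting convention and the error it raises are not shown in this document. The walk is iterative so the check itself cannot overflow the Python call stack on a hostile payload.

```python
from typing import Any


def exceeds_depth(obj: Any, max_depth: int) -> bool:
    """Return True if obj nests containers deeper than max_depth levels."""
    stack = [(obj, 1)]  # the outermost container counts as depth 1
    while stack:
        current, depth = stack.pop()
        if isinstance(current, dict):
            children = list(current.values())
        elif isinstance(current, (list, tuple)):
            children = list(current)
        else:
            continue  # scalars add no depth
        if depth > max_depth:
            return True
        stack.extend((child, depth + 1) for child in children)
    return False


shallow = {"user": {"id": 1}}
deep = [[[[[[[[[[[["too deep"]]]]]]]]]]]]  # 12 nested lists

print(exceeds_depth(shallow, max_depth=10))  # False
print(exceeds_depth(deep, max_depth=10))     # True
```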

Database JSON Field Configuration

from datason.config import SerializationConfig

json_field_config = SerializationConfig(
    uuid_format="string",    # Store UUIDs as strings in JSON
    preserve_decimals=True,  # Keep precision in JSON
    max_depth=20,            # Allow deeper nesting in JSON fields
)

📊 Configuration Comparison

| Use Case | Preset | UUID Format | Parse UUIDs | Best For |
|---|---|---|---|---|
| Web APIs | get_api_config() | "string" | False | FastAPI, Django, Flask |
| ML Workflows | get_ml_config() | "object" (default) | True (default) | Data science, ML pipelines |
| High Performance | get_performance_config() | "object" (default) | True (default) | Speed-critical applications |
| Strict Type Checking | get_strict_config() | "object" (default) | True (default) | Production systems that must reject unknown types |

🚨 Common Pitfalls

❌ Inconsistent Configuration

# Don't mix configurations!
result1 = datason.auto_deserialize(data1)  # Default config
result2 = datason.auto_deserialize(data2, config=get_api_config())  # Different!
# UUIDs will be different types!

✅ Consistent Configuration

# Use consistent configuration throughout your app
API_CONFIG = get_api_config()
result1 = datason.auto_deserialize(data1, config=API_CONFIG)
result2 = datason.auto_deserialize(data2, config=API_CONFIG)
# All UUIDs are consistently strings
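Why mixed UUID types cause trouble can be shown with the standard library alone, independent of datason: a `uuid.UUID` and its string form never compare equal, so lookups and comparisons fail silently when one code path parses UUIDs and another does not.

```python
import uuid

raw = "12345678-1234-5678-9012-123456789abc"
parsed = uuid.UUID(raw)

same_value = parsed == raw               # False: a UUID never equals a str
lookup = {parsed: "alice"}
found = lookup.get(raw)                  # None: no equal key in the dict
normalized = lookup.get(uuid.UUID(raw))  # 'alice' once the types match
```

Picking one configuration for the whole application removes this class of bug at the source.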