Core Functions
The main serialization and deserialization functions, including the drop-in replacement for Python's json module and the traditional configuration-based APIs.
JSON Module Drop-in Replacement
Zero migration effort - use datason exactly like Python's json module with optional enhanced features.
JSON Compatibility API
# Perfect drop-in replacement for Python's json module
import datason.json as json
# Exact same behavior as stdlib json
data = json.loads('{"timestamp": "2024-01-01T00:00:00Z", "value": 42}')
# Returns: {'timestamp': '2024-01-01T00:00:00Z', 'value': 42}
output = json.dumps({"key": "value"}, indent=2, sort_keys=True)
# All json.dumps() parameters work exactly the same
Enhanced API with Smart Defaults
# Enhanced features with same simple API
from datetime import datetime
import datason
# Smart datetime parsing automatically enabled
data = datason.loads('{"timestamp": "2024-01-01T00:00:00Z", "value": 42}')
# Returns: {'timestamp': datetime.datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc), 'value': 42}
# Enhanced serialization with dict output
result = datason.dumps({"timestamp": datetime.now(), "data": [1, 2, 3]})
# Returns: dict (not string) with smart type handling
Function | Purpose | Output Type | Enhanced Features |
---|---|---|---|
datason.loads() | JSON string parsing | dict | Smart datetime parsing |
datason.dumps() | Object serialization | dict | Enhanced type handling |
datason.loads_json() | JSON compatibility | dict | Exact stdlib behavior |
datason.dumps_json() | JSON string output | str | Exact stdlib behavior |
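To make the difference between plain JSON parsing and "smart datetime parsing" concrete, here is a minimal standard-library sketch. It is not datason's implementation: `restore_datetimes` and `loads_with_datetimes` are hypothetical names, and the approach (a `json.loads` `object_hook`) is only one way such behavior could be built.

```python
import json
from datetime import datetime, timezone


def _parse_iso(value):
    """Return a datetime for ISO 8601 strings, otherwise the value unchanged."""
    if not isinstance(value, str):
        return value
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return value


def restore_datetimes(obj):
    """object_hook: json calls this for every decoded JSON object (dict)."""
    return {key: _parse_iso(val) for key, val in obj.items()}


def loads_with_datetimes(text):
    """Hypothetical helper: parse JSON, then upgrade ISO strings to datetimes."""
    return json.loads(text, object_hook=restore_datetimes)


raw = '{"timestamp": "2024-01-01T00:00:00Z", "value": 42}'
plain = json.loads(raw)            # stdlib behavior: timestamp stays a string
smart = loads_with_datetimes(raw)  # sketch: timestamp becomes a datetime

print(type(plain["timestamp"]).__name__)  # str
print(smart["timestamp"] == datetime(2024, 1, 1, tzinfo=timezone.utc))  # True
```

Because the hook only sees dict values, ISO strings that sit directly inside a JSON array are left untouched in this sketch.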
Traditional API Overview
The traditional core functions provide comprehensive, configuration-based serialization with maximum control and flexibility.
Function | Purpose | Best For |
---|---|---|
serialize() | Main serialization function | Custom configurations |
deserialize() | Main deserialization function | Structured data restoration |
auto_deserialize() | Automatic type detection | Quick data exploration |
safe_deserialize() | Error-resilient deserialization | Untrusted data sources |
Detailed Function Documentation
serialize()
The primary serialization function with full configuration support.
datason.serialize(obj: Any, config: Any = None, **kwargs: Any) -> Any
Serialize an object (DEPRECATED - use dump/dumps instead).
DEPRECATION WARNING: Direct use of serialize() is discouraged. Use the clearer API functions instead:
- dump(obj, file): write to file (like json.dump)
- dumps(obj): convert to string (like json.dumps)
- serialize_enhanced(obj, **options): enhanced serialization with clear options
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | Any | Object to serialize | required |
config | Any | Optional configuration | None |
**kwargs | Any | Additional options | {} |
Returns:
Type | Description |
---|---|
Any | Serialized object |
Source code in datason/__init__.py
Configuration Example:
import datason as ds
from datetime import datetime
import pandas as pd
# Basic serialization
data = {"values": [1, 2, 3], "timestamp": datetime.now()}
result = ds.serialize(data)
# With custom configuration
config = ds.SerializationConfig(
    include_type_info=True,
    compress_arrays=True,
    date_format=ds.DateFormat.ISO_8601,
    nan_handling=ds.NanHandling.NULL
)
complex_data = {
    "dataframe": pd.DataFrame({"x": [1, 2, 3]}),
    "timestamp": datetime.now(),
    "metadata": {"version": 1.0}
}
result = ds.serialize(complex_data, config=config)
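As a rough illustration of what options such as an ISO 8601 date format and NaN-to-null handling mean for the output, the following standard-library sketch converts a nested structure into JSON-compatible values. The function `to_json_compatible` is a hypothetical name introduced here, and the sketch makes no claim about how datason implements these options internally.

```python
import json
import math
from datetime import datetime


def to_json_compatible(obj):
    """Recursively convert obj into values that json.dumps accepts."""
    if isinstance(obj, datetime):
        return obj.isoformat()  # datetime -> ISO 8601 text
    if isinstance(obj, float) and math.isnan(obj):
        return None  # NaN -> JSON null
    if isinstance(obj, dict):
        return {str(key): to_json_compatible(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_json_compatible(val) for val in obj]
    return obj  # ints, strings, bools and None pass through unchanged


data = {
    "timestamp": datetime(2024, 1, 1, 12, 0),
    "values": [1.0, float("nan"), 3.0],
}
converted = to_json_compatible(data)
text = json.dumps(converted)
print(text)
```

The converted structure contains only strings, numbers, lists, dicts and None, so it can be handed to any JSON encoder without a custom `default` hook.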
deserialize()
The primary deserialization function, with flags that control date and UUID parsing.
datason.deserialize(obj: Any, parse_dates: bool = True, parse_uuids: bool = True) -> Any
Recursively deserialize JSON-compatible data back to Python objects.
Attempts to intelligently restore datetime objects, UUIDs, and other types that were serialized to strings by the serialize function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | Any | The JSON-compatible object to deserialize | required |
parse_dates | bool | Whether to attempt parsing ISO datetime strings back to datetime objects | True |
parse_uuids | bool | Whether to attempt parsing UUID strings back to UUID objects | True |
Returns:
Type | Description |
---|---|
Any | Python object with restored types where possible |
Examples:
>>> data = {"date": "2023-01-01T12:00:00", "id": "12345678-1234-5678-9012-123456789abc"}
>>> deserialize(data)
{"date": datetime(2023, 1, 1, 12, 0), "id": UUID('12345678-1234-5678-9012-123456789abc')}
Source code in datason/deserializers_new.py
Deserialization Example:
# Basic deserialization
restored_data = ds.deserialize(serialized_result)
# Control type restoration with the parameters from the signature above
restored_data = ds.deserialize(
    serialized_result,
    parse_dates=True,
    parse_uuids=False
)
print(type(restored_data["timestamp"]))
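The effect of the `parse_dates` and `parse_uuids` flags can be pictured with a small recursive walk over already-parsed JSON data. This is a standard-library sketch under stated assumptions, not the code in datason/deserializers_new.py: `restore_types` is a hypothetical name, and the string checks (length 36 for UUIDs, a "T" separator for datetimes) are simplifications chosen for the example.

```python
import uuid
from datetime import datetime


def restore_types(obj, parse_dates=True, parse_uuids=True):
    """Walk dicts and lists, upgrading strings that parse as UUIDs or datetimes."""
    if isinstance(obj, dict):
        return {k: restore_types(v, parse_dates, parse_uuids) for k, v in obj.items()}
    if isinstance(obj, list):
        return [restore_types(v, parse_dates, parse_uuids) for v in obj]
    if isinstance(obj, str):
        if parse_uuids and len(obj) == 36:
            try:
                return uuid.UUID(obj)
            except ValueError:
                pass  # 36 characters, but not a UUID
        if parse_dates and "T" in obj:
            try:
                return datetime.fromisoformat(obj)
            except ValueError:
                pass  # contains "T", but not an ISO datetime
    return obj


data = {"date": "2023-01-01T12:00:00", "id": "12345678-1234-5678-9012-123456789abc"}
restored = restore_types(data)
ids_as_text = restore_types(data, parse_uuids=False)

print(type(restored["date"]).__name__)    # datetime
print(type(restored["id"]).__name__)      # UUID
print(type(ids_as_text["id"]).__name__)   # str
```

Turning a flag off leaves the matching strings untouched, which is useful when downstream code expects identifiers as text.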
auto_deserialize()
Automatic type detection and intelligent deserialization.
datason.auto_deserialize(obj: Any, aggressive: bool = False, config: Optional[SerializationConfig] = None) -> Any
NEW: Intelligent auto-detection deserialization with heuristics.
Uses pattern recognition and heuristics to automatically detect and restore complex data types without explicit configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | Any | JSON-compatible object to deserialize | required |
aggressive | bool | Whether to use aggressive type detection (may have false positives) | False |
config | Optional[SerializationConfig] | Configuration object to control deserialization behavior | None |
Returns:
Type | Description |
---|---|
Any | Python object with auto-detected types restored |
Examples:
>>> data = {"records": [{"a": 1, "b": 2}, {"a": 3, "b": 4}]}
>>> auto_deserialize(data, aggressive=True)
{"records": DataFrame(...)} # May detect as DataFrame
>>> # API-compatible UUID handling
>>> from datason.config import get_api_config
>>> auto_deserialize("12345678-1234-5678-9012-123456789abc", config=get_api_config())
"12345678-1234-5678-9012-123456789abc" # Stays as string
Source code in datason/deserializers_new.py
Auto-Detection Example:
import json
import numpy as np
# Parse the JSON text first: auto_deserialize() takes a JSON-compatible object
json_data = json.loads('{"timestamp": "2024-01-01T12:00:00", "values": [1, 2, 3]}')
# Intelligent type detection
auto_restored = ds.auto_deserialize(json_data)
print(type(auto_restored["timestamp"])) # <class 'datetime.datetime'>
# Works with complex nested structures
complex_json = ds.serialize({
    "df": pd.DataFrame({"x": [1, 2, 3]}),
    "date": datetime.now(),
    "array": np.array([1, 2, 3])
})
auto_complex = ds.auto_deserialize(complex_json)
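The note that aggressive detection "may have false positives" is easier to see with a toy heuristic. The sketch below flags any non-empty list of dicts with identical keys as table-like, and only does so when `aggressive=True`. The names `looks_like_records` and `detect_tables` are hypothetical, and datason's real heuristics may differ considerably.

```python
def looks_like_records(value):
    """True if value is a non-empty list of dicts that all share one key set."""
    if not isinstance(value, list) or not value:
        return False
    if not all(isinstance(row, dict) for row in value):
        return False
    first_keys = set(value[0])
    return bool(first_keys) and all(set(row) == first_keys for row in value)


def detect_tables(obj, aggressive=False):
    """Return the paths of values that the heuristic considers tabular."""
    found = []

    def walk(node, path):
        if aggressive and looks_like_records(node):
            found.append(path)
            return  # do not descend into a detected table
        if isinstance(node, dict):
            for key, val in node.items():
                walk(val, f"{path}.{key}" if path else str(key))
        elif isinstance(node, list):
            for index, val in enumerate(node):
                walk(val, f"{path}[{index}]")

    walk(obj, "")
    return found


data = {"records": [{"a": 1, "b": 2}, {"a": 3, "b": 4}]}
print(detect_tables(data))                   # []
print(detect_tables(data, aggressive=True))  # ['records']

# A single settings dict wrapped in a list is also flagged: a false positive.
print(detect_tables({"plugins": [{"name": "x"}]}, aggressive=True))  # ['plugins']
```

The last call shows why such detection is opt-in: structural similarity alone cannot tell a table from an ordinary list of objects.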
safe_deserialize()
Error-resilient deserialization for untrusted or malformed data.
datason.safe_deserialize(json_str: str, allow_pickle: bool = False, **kwargs: Any) -> Any
Safely deserialize a JSON string, handling parse errors gracefully.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
json_str | str | JSON string to parse and deserialize | required |
allow_pickle | bool | Whether to allow deserialization of pickle-serialized objects | False |
**kwargs | Any | Arguments passed to deserialize() | {} |
Returns:
Type | Description |
---|---|
Any | Deserialized Python object, or the original string if parsing fails |
Raises:
Type | Description |
---|---|
DeserializationSecurityError | If pickle data is detected and allow_pickle=False |
Source code in datason/deserializers_new.py
Safe Processing Example:
# Handle potentially malformed data
untrusted_data = '{"timestamp": "invalid-date", "values": [1, "bad", 3]}'
try:
    # Regular deserialization might fail
    result = ds.deserialize(untrusted_data)
except Exception:
    # Safe deserialization provides fallbacks
    safe_result = ds.safe_deserialize(untrusted_data)
    print("Safely processed:", safe_result)
# Extra keyword arguments are passed through to deserialize()
safe_result = ds.safe_deserialize(
    untrusted_data,
    parse_dates=False,
    parse_uuids=False
)
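The documented return contract ("Deserialized Python object, or the original string if parsing fails") can be sketched with the standard library alone. `parse_or_passthrough` is a hypothetical name; the sketch covers only the parse-error fallback and does not attempt the pickle check that `allow_pickle` controls.

```python
import json


def parse_or_passthrough(json_str):
    """Parse JSON text; on failure return the input unchanged instead of raising."""
    try:
        return json.loads(json_str)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return json_str


good = parse_or_passthrough('{"values": [1, "bad", 3]}')
bad = parse_or_passthrough('{"values": [1, "bad", 3')  # truncated input

print(type(good).__name__)  # dict
print(type(bad).__name__)   # str
```

A caller can therefore check whether the result is still a string to find out that parsing failed, at the cost of not seeing the underlying error.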
Configuration System Integration
The core functions work seamlessly with datason's configuration system:
Preset Configurations
# Use predefined configurations for common scenarios
ml_config = ds.get_ml_config()
ml_result = ds.serialize(ml_data, config=ml_config)
api_config = ds.get_api_config()
api_result = ds.serialize(api_data, config=api_config)
strict_config = ds.get_strict_config()
strict_result = ds.serialize(data, config=strict_config)
performance_config = ds.get_performance_config()
fast_result = ds.serialize(data, config=performance_config)
Custom Configuration
# Build custom configurations
custom_config = ds.SerializationConfig(
    # Type handling
    include_type_info=True,
    strict_types=False,
    preserve_numpy_arrays=True,
    # Performance
    compress_arrays=True,
    optimize_memory=True,
    # Data handling
    date_format=ds.DateFormat.TIMESTAMP,
    nan_handling=ds.NanHandling.STRING,
    dataframe_orient=ds.DataFrameOrient.RECORDS,
    # Security
    redact_patterns=["ssn", "password"],
    max_depth=100
)
result = ds.serialize(data, config=custom_config)
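To give a feel for what security options like `redact_patterns` and `max_depth` are meant to achieve, here is a standard-library sketch that masks values whose key contains a pattern and refuses structures nested deeper than a limit. Everything in it is an assumption made for illustration: the `redact` function, the `<REDACTED>` marker, and the choice to match case-insensitive substrings of keys are not taken from datason.

```python
def redact(obj, patterns, max_depth=100, _depth=0):
    """Return a copy of obj with sensitive values masked; reject deep nesting."""
    if _depth > max_depth:
        raise ValueError(f"nesting deeper than {max_depth} levels")
    if isinstance(obj, dict):
        return {
            key: "<REDACTED>"
            if any(pattern in str(key).lower() for pattern in patterns)
            else redact(val, patterns, max_depth, _depth + 1)
            for key, val in obj.items()
        }
    if isinstance(obj, list):
        return [redact(val, patterns, max_depth, _depth + 1) for val in obj]
    return obj


record = {
    "user": {"name": "Ada", "password": "hunter2", "SSN": "123-45-6789"},
    "tags": ["a"],
}
cleaned = redact(record, ["ssn", "password"])
print(cleaned["user"]["password"])  # <REDACTED>
print(cleaned["user"]["name"])      # Ada
```

A depth limit of this kind guards against runaway recursion on maliciously or accidentally deep input, which is why it sits alongside redaction under the security settings.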
Error Handling Patterns
Graceful Degradation
def robust_serialize(data):
    """Serialize with multiple fallback strategies."""
    try:
        # Try with full configuration
        return ds.serialize(data, config=ds.get_ml_config())
    except MemoryError:
        # Fall back to chunked processing
        return ds.serialize_chunked(data)
    except SecurityError:  # assumes SecurityError has been imported from datason
        # Fall back to safe mode
        safe_config = ds.SerializationConfig(secure_mode=True)
        return ds.serialize(data, config=safe_config)
    except Exception:
        # Last resort: retry with the default configuration
        return ds.serialize(data)
Validation and Recovery
def validate_and_deserialize(serialized_data):
    """Validate data before deserialization."""
    try:
        # First attempt: auto deserialization
        result = ds.auto_deserialize(serialized_data)
        return result
    except ValueError:
        # Second attempt: safe deserialization
        return ds.safe_deserialize(serialized_data)
Performance Considerations
Function Performance Characteristics
Function | Speed | Reliability | Features |
---|---|---|---|
serialize() | ⚡⚡ | 🛡️🛡️🛡️ | ⭐⭐⭐ |
deserialize() | ⚡⚡ | 🛡️🛡️🛡️ | ⭐⭐⭐ |
auto_deserialize() | ⚡ | 🛡️🛡️ | ⭐⭐ |
safe_deserialize() | ⚡ | 🛡️🛡️🛡️🛡️ | ⭐ |
Optimization Tips
# Reuse configurations for better performance
config = ds.get_ml_config()
for batch in data_batches:
    result = ds.serialize(batch, config=config)
# Use appropriate function for your needs
if data_is_trusted:
    result = ds.deserialize(data) # Fastest
else:
    result = ds.safe_deserialize(data) # Most reliable
Related Documentation
- Configuration System - Detailed configuration options
- Chunked & Streaming - Large data processing
- Template System - Data validation
- Modern API - Compare with intention-revealing functions