Modern Python Tooling Guide¶
Overview¶
datason uses best-in-class modern Python tooling for development, testing, security, and documentation. This guide explains our choices and how to use them effectively.
๐ Tool Stack Summary¶
Category | Tool | Replaces | Why Better |
---|---|---|---|
Linting & Formatting | Ruff | Black + isort + flake8 + pyupgrade | 10-100x faster, single tool |
Type Checking | mypy | - | Industry standard, strict typing |
Testing | pytest + coverage | unittest | Better DX, plugins, fixtures |
Security | Bandit + Safety + pip-audit | Manual reviews | Automated vulnerability detection |
Documentation | MkDocs Material | Sphinx | Modern, beautiful, fast |
Dependency Management | pip-tools + Dependabot | Manual updates | Automated, deterministic |
Git Hooks | pre-commit | Manual checks | Automatic quality gates |
Performance | pytest-benchmark | Manual timing | Statistical benchmarking |
๐ง Core Tooling Decisions¶
โก Ruff (Linting & Formatting)¶
Why Ruff over Black + isort + flake8?
# Old approach (slow, multiple tools)
black datason/ tests/ # ~2 seconds
isort datason/ tests/ # ~1 second
flake8 datason/ tests/ # ~3 seconds
pyupgrade **/*.py # ~2 seconds
# Total: ~8 seconds
# New approach (fast, single tool)
ruff check --fix datason/ tests/ # ~0.1 seconds
ruff format datason/ tests/ # ~0.1 seconds
# Total: ~0.2 seconds (40x faster!)
Ruff Configuration:
[tool.ruff]
target-version = "py38"
line-length = 88
[tool.ruff.lint]
select = [
"E", "W", # pycodestyle
"F", # pyflakes
"I", # isort
"N", # pep8-naming
"D", # pydocstyle
"UP", # pyupgrade
"S", # bandit security
"B", # bugbear
"C4", # comprehensions
"SIM", # simplify
"RUF", # ruff-specific
]
Usage:
# Check and fix issues
ruff check --fix .
# Format code
ruff format .
# Check specific rule
ruff check --select E501 . # Line too long
๐ Security Tools¶
Multi-layered security approach:
- Bandit - Static code analysis for security issues
- Safety - Known vulnerability database checking
- pip-audit - Dependency vulnerability scanning
- Semgrep - Advanced pattern-based security scanning
# Run all security checks
bandit -c pyproject.toml -r datason/
safety check --json
pip-audit --format=json
semgrep --config=auto datason/
Common security issues caught:
- Hardcoded passwords/secrets
- SQL injection patterns
- Use of eval()
or exec()
- Weak random number generation
- Insecure hash functions
๐ Documentation with MkDocs Material¶
Why MkDocs over Sphinx?
# Modern, beautiful, fast
mkdocs serve # Live reload dev server
mkdocs build # Static site generation
mkdocs gh-deploy # Deploy to GitHub Pages
Features: - ๐จ Beautiful UI - Material Design - ๐ Search - Full-text search built-in - ๐ฑ Mobile - Responsive design - ๐ Dark mode - Automatic theme switching - ๐ Diagrams - Mermaid support - ๐ Cross-refs - Automatic API linking
๐งช Testing Excellence¶
Comprehensive testing setup:
# Run all tests with coverage
pytest --cov=datason --cov-report=html
# Parallel testing (faster)
pytest -n auto
# Performance benchmarks
pytest tests/test_performance.py --benchmark-only
# Specific test markers
pytest -m "not slow" # Skip slow tests
pytest -m "ml" # Only ML-related tests
pytest -m "integration" # Integration tests only
Test markers configured:
# In pyproject.toml
markers = [
"slow: marks tests as slow",
"integration: marks tests as integration tests",
"benchmark: marks tests as benchmarks",
"ml: marks tests that require ML libraries",
]
๐ Pre-commit Hooks¶
Automatic quality gates:
# .pre-commit-config.yaml highlights
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
hooks:
- id: ruff # Linting
- id: ruff-format # Formatting
- repo: https://github.com/PyCQA/bandit
hooks:
- id: bandit # Security scanning
Benefits: - โ Prevents bad commits - Catches issues before they reach CI - โ Consistent formatting - Everyone uses same style - โ Security checks - Vulnerabilities caught early - โ Fast feedback - Issues fixed immediately
๐ ๏ธ Development Workflow¶
Initial Setup¶
# 1. Install development dependencies
pip install -e ".[dev]"
# 2. Install pre-commit hooks
pre-commit install
# 3. Run initial checks
pre-commit run --all-files
Daily Development¶
# Before committing (automatic via pre-commit)
ruff check --fix . # Fix linting issues
ruff format . # Format code
pytest # Run tests
mypy datason/ # Type checking
Release Process¶
# 1. Security audit
bandit -c pyproject.toml -r datason/
safety check
pip-audit
# 2. Full test suite
pytest --cov=datason --cov-report=term-missing
# 3. Documentation
mkdocs build
# 4. Build and publish
python -m build
twine upload dist/*
๐ Performance Optimization¶
Benchmarking with pytest-benchmark¶
def test_serialization_performance(benchmark):
"""Benchmark serialization performance."""
data = create_test_data()
result = benchmark(datason.serialize, data)
# Assertions on performance
assert benchmark.stats.median < 0.01 # < 10ms
Profiling¶
# Profile specific functions
python -m cProfile -o profile.stats benchmark_real_performance.py
# Analyze with snakeviz
pip install snakeviz
snakeviz profile.stats
๐ Dependency Management¶
pip-tools for Reproducible Builds¶
# Create requirements.txt from pyproject.toml
pip-compile pyproject.toml
# Update dependencies
pip-compile --upgrade pyproject.toml
# Install exact versions
pip-sync requirements.txt
Dependabot Configuration¶
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "weekly"
open-pull-requests-limit: 10
๐ Monitoring & Metrics¶
Coverage Tracking¶
# Generate coverage report
pytest --cov=datason --cov-report=html
# Coverage thresholds in pyproject.toml
[tool.coverage.report]
fail_under = 80
show_missing = true
Performance Monitoring¶
# Benchmark regression testing
@pytest.mark.benchmark
def test_performance_regression(benchmark):
"""Ensure performance doesn't regress."""
result = benchmark(serialize_large_dataset)
# CI will fail if significantly slower than baseline
๐ฏ Best Practices Summary¶
Code Quality¶
- โ Ruff for all linting/formatting (replaces 4 tools)
- โ mypy with strict mode for type safety
- โ Pre-commit hooks for automatic quality gates
- โ 100% test coverage for critical paths
Security¶
- โ Multi-tool scanning (Bandit, Safety, pip-audit)
- โ Regular dependency updates via Dependabot
- โ Secret detection in pre-commit
- โ Security policy with responsible disclosure
Documentation¶
- โ MkDocs Material for beautiful docs
- โ API reference auto-generated from docstrings
- โ Examples for all major features
- โ Cross-references and search
Performance¶
- โ Benchmark suite with statistical analysis
- โ Performance regression testing in CI
- โ Profiling tools for optimization
- โ Real-world performance data in docs
๐ Upgrading from Legacy Tools¶
Migration Guide¶
From Black + isort + flake8 โ Ruff:
# Remove old tools
pip uninstall black isort flake8
# Install Ruff
pip install ruff
# Update pre-commit
# Replace separate hooks with ruff hooks
# Run once
ruff check --fix .
ruff format .
Benefits: - ๐ 40x faster linting and formatting - ๐งน Single tool instead of multiple - ๐ง More rules - catches more issues - ๐ฆ Smaller dependency tree
๐ Results¶
This modern tooling setup delivers:
- โก 10x faster development workflow
- ๐ก๏ธ Better security with automated scanning
- ๐ Professional docs with zero maintenance
- ๐งช Reliable testing with performance tracking
- ๐ค Automated quality via pre-commit hooks
Developer Experience Score: 10/10 ๐ฏ
The investment in modern tooling pays off immediately in developer productivity and code quality!