38686-vm/core/management/commands/restore_data.py
Konrad du Plessis 0ace7c6786 Phase 1: security fixes + backup/restore tooling + vat_type migrations
Minimal infrastructure push before the bigger feature release (worker/team/
project management UIs, WeasyPrint migration, new models). Deploying this
first gives us a browser-accessible `/backup-data/` endpoint so we can
snapshot production before the bigger change lands.

SECURITY
  - Remove hardcoded Gmail App Password from settings.py (was leaking via
    git history; new password now lives in Flatlogic's `../.env` file)
  - Remove hardcoded SECRET_KEY default; raise ImproperlyConfigured in
    prod if env var missing; dev fallback only when USE_SQLITE is set
  - Flip DEBUG default from 'true' to 'false' so missing env var doesn't
    silently expose tracebacks
  - Remove hardcoded EMAIL_HOST_USER / DEFAULT_FROM_EMAIL defaults
  - Add startup warning when email vars missing in production
  - Fix CSRF_TRUSTED_ORIGINS double-scheme bug (would break with
    pre-prefixed HOST_FQDN env var)
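
A minimal sketch of the settings hardening described above, written without Django so it runs on its own. The helper names (`env_flag`, `resolve_secret_key`, `trusted_origin`), the placeholder dev key, and the local `ImproperlyConfigured` stand-in are assumptions for illustration; the actual `settings.py` is not shown in this commit view.

```python
class ImproperlyConfigured(Exception):
    """Stand-in for django.core.exceptions.ImproperlyConfigured (keeps the sketch Django-free)."""


def env_flag(environ, name, default="false"):
    """Interpret an environment variable as a boolean; an unset variable means `default`."""
    return environ.get(name, default).strip().lower() in {"1", "true", "yes"}


def resolve_secret_key(environ):
    """Return SECRET_KEY from the environment, falling back only in local dev."""
    key = environ.get("SECRET_KEY")
    if key:
        return key
    if env_flag(environ, "USE_SQLITE"):
        # Hypothetical placeholder value; only reachable when USE_SQLITE is set.
        return "dev-only-insecure-key"
    raise ImproperlyConfigured("SECRET_KEY environment variable is not set")


def trusted_origin(host):
    """Add a scheme only when the host value does not already carry one."""
    host = host.strip()
    if host.startswith(("http://", "https://")):
        return host
    return f"https://{host}"
```

With these helpers, a missing `DEBUG` variable resolves to `False`, and a `HOST_FQDN` value that already starts with `https://` is passed through unchanged instead of being prefixed a second time.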

BACKUP / RESTORE
  - New `backup_data` management command — serialises every core + auth
    row to a timestamped JSON file. Gracefully handles models missing at
    older schema versions (WorkerCertificate/Warning imported optionally).
  - New `restore_data` management command — loads JSON back into the DB
    with a populated-DB safety guard and transactional all-or-nothing
    semantics.
  - New `/backup-data/` admin-only URL — downloads the JSON to browser.
  - New `/restore-data/` admin-only URL — upload form with CSRF and
    explicit confirm checkbox before any data is loaded.
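
The backup file shape can be sketched as follows. This is an illustration only: the `foxlog_` filename prefix is inferred from the usage example in `restore_data.py`, the `data` key is the one the restore code reads, and the `created_at` and `row_count` keys plus all three function names are hypothetical.

```python
import json
from datetime import datetime


def backup_filename(now):
    """Build a timestamped filename such as foxlog_20260421_120000.json."""
    return f"foxlog_{now:%Y%m%d_%H%M%S}.json"


def wrap_rows(rows, now):
    """Wrap serialised rows in a top-level dict; only 'data' is read on restore."""
    return {"created_at": now.isoformat(), "row_count": len(rows), "data": rows}


def extract_rows(payload):
    """Accept either the wrapped dict or a bare dumpdata-style list."""
    if isinstance(payload, dict) and "data" in payload:
        return payload["data"]
    if isinstance(payload, list):
        return payload
    raise ValueError("expected a dict with a 'data' key or a bare list")


if __name__ == "__main__":
    stamp = datetime(2026, 4, 21, 12, 0, 0)
    rows = [{"model": "core.worker", "pk": 1, "fields": {"name": "Example"}}]
    text = json.dumps(wrap_rows(rows, stamp))
    print(backup_filename(stamp), len(extract_rows(json.loads(text))))
```

Accepting both shapes on the restore side means a plain `dumpdata` export can be loaded through the same path as a `backup_data` file.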

MIGRATIONS
  - Add 0007_vat_type_default + 0008_vat_type_default_none (change
    ExpenseReceipt.vat_type default to 'None').
  - Update models.py to match migration 0008's end state.

HOUSEKEEPING
  - Extend .gitignore: .claude/, .vscode/, .idea/, test_*.pdf,
    test_*.json, nul, backups/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 00:28:21 +02:00

142 lines
5.1 KiB
Python

# === RESTORE DATA MANAGEMENT COMMAND ===
# Restores a backup produced by `backup_data` — takes a JSON file and
# loads every row into the database.
#
# SAFETY:
# By default this command REFUSES to run against a non-empty database
# (prevents accidentally overwriting live data). Pass --force to
# bypass — but only when you know the target is empty or already
# matches the backup.
#
# USAGE (local):
#   python manage.py restore_data backups/foxlog_20260421_120000.json
#   python manage.py restore_data backup.json --force   (overwrite existing)
#
# USAGE (Flatlogic, via browser):
#   Upload a .json backup file via /restore-data/ (admin only).
#
# BEHAVIOUR:
#   Uses Django's built-in `loaddata` under the hood, which:
#   - Updates existing rows if their pk matches (no duplicates)
#   - Creates new rows for any pk not yet in the DB
#   - Defers FK constraint checks until the load finishes, so row
#     order in the file does not matter
#   - Runs inside a transaction: if any row fails, nothing is saved
import json
from pathlib import Path

from django.contrib.auth.models import User
from django.core.management import call_command
from django.core.management.base import BaseCommand, CommandError
from django.db import transaction

from core.models import Worker, WorkLog, PayrollRecord


def check_database_is_populated():
    """Return True if the database already has meaningful data.

    Used as a guardrail: by default we refuse to restore into a DB that
    already contains workers, work logs, or payroll records, because
    `loaddata` would overwrite rows whose pk matches and mix backup rows
    with live ones.
    """
    has_workers = Worker.objects.exists()
    has_logs = WorkLog.objects.exists()
    has_payments = PayrollRecord.objects.exists()
    return has_workers or has_logs or has_payments


def restore_from_json_string(json_str):
    """Load a JSON backup string into the database.

    Returns (success, message_or_summary). Used both by this management
    command and by the browser-accessible `/restore-data/` view so the
    same logic runs in both places.

    Raises no exceptions: returns (False, error_message) on failure so
    the caller (CLI or web view) can format the error appropriately.
    """
    try:
        payload = json.loads(json_str)
    except json.JSONDecodeError as e:
        return False, f"File is not valid JSON: {e}"

    # Backups produced by `backup_data` wrap rows in a top-level dict.
    # Raw dumpdata output is a bare list; support both for flexibility.
    if isinstance(payload, dict) and "data" in payload:
        rows = payload["data"]
    elif isinstance(payload, list):
        rows = payload
    else:
        return False, "Unexpected JSON structure: expected a dict with a 'data' key or a list."

    if not rows:
        return False, "Backup file contains no rows."

    # Write the rows to a tmp file, then let Django's loaddata do the work
    # (it handles constraint checking, transaction wrapping, and natural keys).
    import tempfile

    tmp_path = None
    try:
        # Writing happens inside the try block so that a failed write is
        # reported as (False, message) instead of escaping as an exception.
        with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False,
                                         encoding="utf-8") as tmp:
            tmp_path = tmp.name
            # loaddata expects the bare list format
            json.dump(rows, tmp, default=str)
        with transaction.atomic():
            call_command("loaddata", tmp_path, verbosity=0)
    except Exception as e:
        return False, f"Restore failed: {e}"
    finally:
        if tmp_path is not None:
            try:
                Path(tmp_path).unlink()
            except OSError:
                pass  # cleanup is best-effort

    # Build a summary for the caller to display
    summary = {
        "users": User.objects.count(),
        "workers": Worker.objects.count(),
        "work_logs": WorkLog.objects.count(),
        "payroll_records": PayrollRecord.objects.count(),
        "rows_in_backup": len(rows),
    }
    return True, summary


class Command(BaseCommand):
    help = "Restore a JSON backup produced by `backup_data`."

    def add_arguments(self, parser):
        parser.add_argument("backup_file", type=str, help="Path to a .json backup file")
        parser.add_argument(
            "--force",
            action="store_true",
            help="Allow restore even if the target database already has data",
        )

    def handle(self, *args, **options):
        backup_path = Path(options["backup_file"])
        # is_file() also rejects a directory passed by mistake.
        if not backup_path.is_file():
            raise CommandError(f"Backup file not found: {backup_path}")

        if not options["force"] and check_database_is_populated():
            raise CommandError(
                "Database already contains data (workers/logs/payments). "
                "Restoring now could overwrite or corrupt existing rows.\n"
                "If you really want to proceed, run again with --force.\n"
                "Or flush first: python manage.py flush (irreversible)."
            )

        json_str = backup_path.read_text(encoding="utf-8")
        ok, result = restore_from_json_string(json_str)
        if not ok:
            raise CommandError(result)

        self.stdout.write(self.style.SUCCESS("Restore complete."))
        self.stdout.write("Rows in database after restore:")
        for k, v in result.items():
            self.stdout.write(f"  {k}: {v}")
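
Because `restore_from_json_string` returns either `(False, message)` or `(True, summary_dict)`, any caller has to branch on the first element before using the second. A minimal sketch of that branching as a plain function; `format_restore_result` is a hypothetical helper, not part of this file, and the `ERROR:` prefix is an arbitrary choice.

```python
def format_restore_result(ok, result):
    """Turn the (ok, result) pair into display lines for a CLI or a web page."""
    if not ok:
        # On failure `result` is a human-readable error string.
        return [f"ERROR: {result}"]
    # On success `result` is a dict mapping a label to a row count.
    lines = ["Restore complete.", "Rows in database after restore:"]
    lines.extend(f"  {key}: {value}" for key, value in result.items())
    return lines
```

A web view could render the same lines into its template, which keeps the CLI and browser output consistent.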