
add tests for postgresql

main
Nicolas Massé · 2 months ago · commit 9bdd00b193
21 changed files:

 1. .gitignore (2)
 2. README.md (7)
 3. TESTS.md (235)
 4. common.mk (7)
 5. conftest.py (23)
 6. postgresql/postgresql-backup.container (2)
 7. postgresql/postgresql-init.container (2)
 8. postgresql/postgresql-pgautoupgrade.image (13)
 9. postgresql/postgresql-server.container (2)
10. postgresql/postgresql-upgrade.container (2)
11. postgresql/postgresql.image (13)
12. postgresql/tests/__init__.py (0)
13. postgresql/tests/conftest.py (167)
14. postgresql/tests/helpers.py (39)
15. postgresql/tests/test_backup.py (119)
16. postgresql/tests/test_install.py (149)
17. postgresql/tests/test_recovery.py (154)
18. postgresql/tests/test_upgrade.py (163)
19. pyproject.toml (24)
20. tests/__init__.py (0)
21. tests/vm.py (384)

.gitignore (2)

@@ -3,3 +3,5 @@
 !fcos.bu
 !overlay.bu
 */butane.blocklist
+__pycache__/
+.pytest_cache/

README.md (7)

@@ -53,6 +53,13 @@ This repository gathers all the recipes (hence the name "Cookbook") to deploy Op
 - Fedora / CentOS Stream / RHEL or derivative operating system.
 - Systemd
+
+## End-to-end testing
+
+```
+pip install -e .
+pytest postgresql/tests/
+```
+
 ## Development
 To develop Podman Quadlets, it is advised to create a Fedora Virtual Machine dedicated to this task.

TESTS.md (235)

@@ -0,0 +1,235 @@
# Testing Guide
This project uses **pytest** with the **pytest-testinfra** plugin to run
end-to-end integration tests against real Fedora CoreOS virtual machines.
## Dependencies
Declared in `pyproject.toml`:
| Package | Purpose |
|---------|---------|
| `pytest>=8.0` | Test runner and framework |
| `pytest-testinfra>=10.1` | Infrastructure testing (services, files, sockets, ...) |
| `paramiko>=3.4` | SSH transport used by testinfra |
## Core pytest concepts
### Test discovery
pytest automatically finds tests by scanning for files named `test_*.py` and
collecting functions named `test_*` inside them. No registration is needed.
The `pyproject.toml` configuration:
```toml
[tool.pytest.ini_options]
log_cli = true
log_cli_level = "INFO"
addopts = "-v"
```
No `testpaths` is set, so pytest discovers tests in all sub-directories.
To run a specific cookbook's tests:
```bash
pytest postgresql/tests/
```
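For illustration, any function whose name starts with `test_` inside a
`test_*.py` file is collected with no registration step (hypothetical file,
not part of the suite):

```python
# postgresql/tests/test_example.py — picked up automatically by discovery
def test_arithmetic():
    assert 2 + 2 == 4
```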
### Fixtures
A **fixture** is a function decorated with `@pytest.fixture` that prepares a
resource for a test. Fixtures are injected by naming them as test function
parameters:
```python
@pytest.fixture(scope="module")
def pg_host(...):
    return testinfra.get_host(f"ssh://root@{vm.ip}", ...)

def test_port_listening(pg_host):  # ← pg_host is injected automatically
    assert pg_host.socket("tcp://127.0.0.1:5432").is_listening
```
pytest resolves the full dependency graph: if fixture A depends on fixture B,
B is created first.
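A minimal sketch of this resolution, with hypothetical fixtures `a` and `b`:

```python
import pytest

@pytest.fixture
def b():
    return "db-url"

@pytest.fixture
def a(b):  # a depends on b, so pytest creates b first
    return f"connection({b})"

def test_chain(a):  # requesting a pulls in the whole chain
    assert "db-url" in a
```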
### Fixture scopes
The `scope` parameter controls how long a fixture lives:
| Scope | Lifetime |
|-------|----------|
| `"function"` (default) | Recreated for every single test |
| `"module"` | One instance per `.py` file |
| `"session"` | One instance for the entire pytest run |
In this project:
- `test_ssh_key` / `test_ssh_pubkey` are **session-scoped** — a single SSH
key pair is generated once and shared across all tests.
- `postgresql_vm` / `pg_host` are **module-scoped** — each test file gets its
own VM that is destroyed after the last test in that file.
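A quick way to see the effect of a scope (illustrative sketch, not project
code): a session-scoped fixture runs its setup exactly once, so every test
observes the same instance.

```python
import itertools
import pytest

_counter = itertools.count()

@pytest.fixture(scope="session")
def shared():
    return next(_counter)  # executed once per pytest run

def test_one(shared):
    assert shared == 0

def test_two(shared):
    assert shared == 0  # same object — setup did not run again
```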
### `yield` fixtures (setup + teardown)
When a fixture uses `yield`, the code before `yield` is setup and the code
after is teardown. Teardown always runs, even if a test fails.
```python
@pytest.fixture(scope="module")
def postgresql_vm(...):
    vm = FCOSVirtualMachine(...)
    vm.create()       # ← setup
    vm.wait_ssh(...)
    yield vm          # ← value passed to the test
    vm.destroy()      # ← teardown (always runs)
```
### `conftest.py` — shared fixtures
`conftest.py` files are loaded automatically by pytest. Every fixture defined
in a `conftest.py` is available to all tests in the same directory and its
sub-directories.
This project has two:
| File | Scope | Contents |
|------|-------|----------|
| `conftest.py` (root) | Global | SSH key pair generation |
| `postgresql/tests/conftest.py` | PostgreSQL tests | VM creation, testinfra host, upgrade VM |
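The visibility rule, reduced to a hypothetical example: a fixture declared in
the root `conftest.py` is usable from any test file below it, with no import.

```python
# conftest.py (repository root)
import pytest

@pytest.fixture(scope="session")
def answer():
    return 42

# postgresql/tests/test_something.py — no import of conftest needed
def test_answer(answer):
    assert answer == 42
```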
## pytest-testinfra
**testinfra** is a pytest plugin that provides a high-level Python API to
inspect the state of a remote server over SSH. A connection is established
via `testinfra.get_host()` and the resulting object exposes modules to
inspect:
| Module | Example | What it checks |
|--------|---------|----------------|
| `service` | `host.service("postgresql.target").is_running` | systemd unit state |
| `socket` | `host.socket("tcp://127.0.0.1:5432").is_listening` | open ports |
| `file` | `host.file("/etc/config").exists` | file existence, permissions, ownership |
| `mount_point` | `host.mount_point("/data").filesystem` | mounted filesystems |
| `run` | `host.run("systemctl is-active ...")` | arbitrary commands (returns `.stdout`, `.rc`) |
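A hedged sketch combining several of these modules (the address and the
exact assertions are illustrative, not taken from the test suite):

```python
import testinfra

host = testinfra.get_host("ssh://root@192.0.2.10")  # example address

assert host.service("postgresql.target").is_running
assert host.socket("tcp://127.0.0.1:5432").is_listening

cfg = host.file("/etc/quadlets/postgresql/config.env")
assert cfg.exists

result = host.run("systemctl is-active postgresql-server.service")
assert result.stdout.strip() == "active"
```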
## Project test architecture
```
conftest.py (root)     → SSH key pair (session-scoped)
tests/
└── vm.py              → FCOSVirtualMachine class (create/destroy/ssh)
postgresql/tests/
├── conftest.py        → VM + pg_host fixtures (module-scoped)
├── helpers.py         → constants (PG_MAJOR_DEFAULT, credentials) + run_sql()
├── test_install.py    → fresh install: services, ports, filesystem, connectivity
├── test_backup.py     → trigger backup, verify artefacts, retention policy
├── test_recovery.py   → container crash and reboot recovery
└── test_upgrade.py    → major version upgrade (uses a separate VM)
```
`FCOSVirtualMachine` (in `tests/vm.py`) is a plain Python class — not a
fixture. It manages the full lifecycle of a KVM virtual machine: QCOW2 disk
creation, `virt-install`, SSH readiness polling, remote command execution via
SSH, and `virsh destroy` cleanup. Fixtures in `conftest.py` wrap this class.
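The wrapping pattern, reduced to its essentials (the real fixtures in
`postgresql/tests/conftest.py` also build the ignition overlay and wait for
`postgresql.target`; the fixture name `vm` and the paths below are
illustrative):

```python
from pathlib import Path

import pytest
from vm import FCOSVirtualMachine

@pytest.fixture(scope="module")
def vm(test_ssh_key):
    machine = FCOSVirtualMachine(
        name="demo",  # keep unique across parallel runs
        ignition_file=Path("/tmp/fcos-test.ign"),
        virtiofs_dir=Path("/srv/fcos-test-demo"),
    )
    machine.create()
    machine.wait_ssh(ssh_key=test_ssh_key, timeout=300)
    yield machine
    machine.destroy()  # teardown runs even when a test fails
```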
## Test execution flow
Taking `test_postgresql_port_listening` as an example:
1. pytest discovers `test_install.py`.
2. It sees `test_postgresql_port_listening(pg_host)` and resolves the fixture
   chain: `pg_host` → `postgresql_vm` + `test_ssh_key`.
3. `test_ssh_key` (session-scoped) generates an Ed25519 key pair — once for
the entire run.
4. `postgresql_vm` (module-scoped):
- Compiles the Fedora CoreOS ignition via `make butane`.
- Creates a KVM VM with `virt-install`.
- Polls until SSH is reachable.
- Waits for `postgresql.target` to become active.
5. `pg_host` connects testinfra to the VM via SSH.
6. The test runs: `pg_host.socket("tcp://127.0.0.1:5432").is_listening`.
7. After **all** tests in the module complete, `vm.destroy()` tears down the
VM.
## Test ordering
### Module (file) order
Modules are executed in **alphabetical order** by path:
1. `test_backup.py`
2. `test_install.py`
3. `test_recovery.py`
4. `test_upgrade.py`
Since each module gets its own VM (module-scoped fixtures), there are **no
dependencies between modules**.
### Test (function) order within a module
Within a file, tests run in **source order** (top to bottom). This is
pytest's default behavior — no plugin needed.
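To inspect the collection order without executing anything, pytest's
built-in `--collect-only` flag can be combined with `-q`:

```bash
pytest postgresql/tests/test_backup.py --collect-only -q
```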
This matters when tests have side effects. For example in `test_backup.py`:
1. `test_trigger_backup` — triggers the backup service.
2. `test_backup_completes_successfully` — waits for the service to finish.
3. `test_backup_directory_exists_in_virtiofs` — checks files created by step 1.
4. ...and so on.
Later tests depend on artefacts created by earlier ones. The ordering relies
on the declaration order in the source file.
## Pausing tests for manual inspection
### `breakpoint()` + `--pdb`
Add `breakpoint()` at any point in a test. Run with `--pdb` and `-x` (stop
at first failure):
```bash
pytest postgresql/tests/test_install.py --pdb -x
```
`--pdb` drops into the Python debugger on failure. `breakpoint()` drops into
it unconditionally. Type `c` to continue.
### `input()` + `-s`
The simplest approach — add a manual pause:
```python
def test_postgresql_port_listening(pg_host):
    assert pg_host.socket("tcp://127.0.0.1:5432").is_listening
    input("VM is running. Press Enter to continue.")
```
Run with `-s` so pytest does not capture stdin/stdout:
```bash
pytest postgresql/tests/test_install.py -s -k test_postgresql_port_listening
```
### Scope-aware pausing
The VM is destroyed after the **last** test in a module. If you pause in the
last test, the VM will be destroyed as soon as you resume. To inspect after
all tests, add a sentinel test at the end of the file:
```python
def test_zz_pause_for_inspection(postgresql_vm, test_ssh_key):
    print(f"\nVM accessible: ssh -i {test_ssh_key} root@{postgresql_vm.ip}")
    input("Inspecting... Press Enter to destroy the VM.")
```
### `-k` to target a specific test
Combine with any of the above to skip unrelated tests:
```bash
pytest postgresql/tests/test_install.py -s -k test_postgresql_port_listening
```

common.mk (7)

@@ -60,12 +60,13 @@ endif
 PROJECT_NAME := $(shell basename "$${PWD}")
 # Quadlets files and their corresponding systemd unit names
-QUADLETS_FILES = $(wildcard *.container *.volume *.network *.pod *.build)
+QUADLETS_FILES = $(wildcard *.container *.volume *.network *.pod *.build *.image)
 QUADLET_UNIT_NAMES := $(patsubst %.container, %.service, $(wildcard *.container)) \
     $(patsubst %.volume, %-volume.service, $(wildcard *.volume)) \
     $(patsubst %.network, %-network.service, $(wildcard *.network)) \
     $(patsubst %.pod, %-pod.service, $(wildcard *.pod)) \
-    $(patsubst %.build, %-build.service, $(wildcard *.build))
+    $(patsubst %.build, %-build.service, $(wildcard *.build)) \
+    $(patsubst %.image, %-image.service, $(wildcard *.image))
 # Wellknown systemd unit file types
 SYSTEMD_FILES = $(wildcard *.service *.target *.timer *.mount)

@@ -133,7 +134,7 @@ pre-requisites::
 		exit 1; \
 	fi
 	@set -Eeuo pipefail; \
-	for tool in install systemctl systemd-analyze systemd-tmpfiles sysctl virt-install virsh qemu-img journalctl coreos-installer resize butane yq podlet; do \
+	for tool in install systemctl systemd-analyze systemd-tmpfiles sysctl virt-install virsh qemu-img journalctl coreos-installer resize butane yq podlet pip3; do \
 		if ! which $$tool &>/dev/null ; then \
 			echo "$$tool is not installed. Please install it first." >&2; \
 			exit 1; \

conftest.py (23)

@@ -0,0 +1,23 @@
import subprocess
from pathlib import Path

import pytest

@pytest.fixture(scope="session")
def test_ssh_key(tmp_path_factory: pytest.TempPathFactory) -> Path:
    """Generate a temporary SSH key pair (no passphrase) for VM access."""
    key_dir = tmp_path_factory.mktemp("ssh-key")
    key_path = key_dir / "id_ed25519"
    subprocess.run(
        ["ssh-keygen", "-t", "ed25519", "-N", "", "-f", str(key_path)],
        check=True,
        capture_output=True,
    )
    return key_path

@pytest.fixture(scope="session")
def test_ssh_pubkey(test_ssh_key: Path) -> str:
    """Public key string corresponding to test_ssh_key."""
    return test_ssh_key.with_suffix(".pub").read_text().strip()

postgresql/postgresql-backup.container (2)

@@ -9,7 +9,7 @@ PartOf=postgresql.target
 [Container]
 ContainerName=postgresql-backup-job
-Image=docker.io/library/postgres:${PG_MAJOR}-alpine
+Image=postgresql.image
 # Network configuration
 Network=host

postgresql/postgresql-init.container (2)

@@ -15,7 +15,7 @@ PartOf=postgresql.target
 [Container]
 ContainerName=postgresql-init-job
-Image=docker.io/library/postgres:${PG_MAJOR}-alpine
+Image=postgresql.image
 # Network configuration
 Network=host

postgresql/postgresql-pgautoupgrade.image (13)

@@ -0,0 +1,13 @@
[Unit]
Description=podman pull docker.io/pgautoupgrade/pgautoupgrade
Documentation=https://hub.docker.com/r/pgautoupgrade/pgautoupgrade

# Only start if PostgreSQL has been configured
ConditionPathExists=/etc/quadlets/postgresql/config.env

[Image]
Image=docker.io/pgautoupgrade/pgautoupgrade:${PG_MAJOR}-alpine

[Service]
# These environment variables are sourced to be used by systemd in the Exec* commands
EnvironmentFile=/etc/quadlets/postgresql/config.env

postgresql/postgresql-server.container (2)

@@ -17,7 +17,7 @@ PartOf=postgresql.target
 [Container]
 ContainerName=postgresql-server
-Image=docker.io/library/postgres:${PG_MAJOR}-alpine
+Image=postgresql.image
 AutoUpdate=registry
 # Network configuration

postgresql/postgresql-upgrade.container (2)

@@ -17,7 +17,7 @@ PartOf=postgresql.target
 [Container]
 ContainerName=postgresql-upgrade-to-${PG_MAJOR}-job
-Image=docker.io/pgautoupgrade/pgautoupgrade:${PG_MAJOR}-alpine
+Image=postgresql-pgautoupgrade.image
 # Network configuration
 Network=host

postgresql/postgresql.image (13)

@@ -0,0 +1,13 @@
[Unit]
Description=podman pull docker.io/library/postgres
Documentation=https://hub.docker.com/_/postgres/

# Only start if PostgreSQL has been configured
ConditionPathExists=/etc/quadlets/postgresql/config.env

[Image]
Image=docker.io/library/postgres:${PG_MAJOR}-alpine

[Service]
# These environment variables are sourced to be used by systemd in the Exec* commands
EnvironmentFile=/etc/quadlets/postgresql/config.env

postgresql/tests/__init__.py (0)

postgresql/tests/conftest.py (167)

@@ -0,0 +1,167 @@
"""Pytest fixtures for the PostgreSQL cookbook end-to-end tests.

Prerequisites:
- Must run as root (KVM/libvirt access).
- The Fedora CoreOS base QCOW2 image must be present at
  /var/lib/libvirt/images/library/fedora-coreos.qcow2.
  Run ``coreos-installer download -p qemu -f qcow2.xz -d
  -C /var/lib/libvirt/images/library/`` to fetch it.
- fcos.ign for the postgresql cookbook is built on demand by
  ``make -C postgresql butane`` if it is missing. This requires
  local.bu (SSH keys, user setup) to be present at the repository root.
"""

import os
import shutil
import subprocess
import sys
from pathlib import Path

import pytest
import testinfra

REPO_ROOT = Path(__file__).parent.parent.parent
POSTGRESQL_DIR = REPO_ROOT / "postgresql"

# Add directories to the path so we can import local helpers and shared vm.py.
sys.path.insert(0, str(Path(__file__).parent))
sys.path.insert(0, str(REPO_ROOT / "tests"))

from vm import FCOSVirtualMachine, build_test_ignition, ensure_fcos_ign  # noqa: E402
from helpers import (
    PG_DB,
    PG_MAJOR_DEFAULT,
    PG_MAJOR_UPGRADE_FROM,
    PG_MAJOR_UPGRADE_TO,
    PG_PASSWORD,
    PG_USER,
    run_sql,
)

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

def _default_config_env(pg_major: str) -> dict[str, str]:
    """Return the full default config.env content as a dict for the given PG major."""
    return {
        "PG_MAJOR": pg_major,
        "POSTGRES_USER": PG_USER,
        "POSTGRES_PASSWORD": PG_PASSWORD,
        "POSTGRES_DB": PG_DB,
        "POSTGRES_HOST_AUTH_METHOD": "scram-sha-256",
        "POSTGRES_INITDB_ARGS": "--auth-host=scram-sha-256",
        "POSTGRES_ARGS": "-h 127.0.0.1",
        "PGPORT": "5432",
        "POSTGRES_BACKUP_RETENTION": "7",
    }

# ---------------------------------------------------------------------------
# Shared fixtures (module-scoped → one VM per test module)
# ---------------------------------------------------------------------------

@pytest.fixture(scope="module")
def virtiofs_dir() -> Path:
    """Unique VirtioFS host directory for the default test VM."""
    d = Path("/srv") / f"fcos-test-postgresql-{os.getpid()}"
    d.mkdir(parents=True, exist_ok=True)
    yield d
    if d.exists():
        shutil.rmtree(d)

@pytest.fixture(scope="module")
def postgresql_vm(
    test_ssh_key: Path,
    test_ssh_pubkey: str,
    virtiofs_dir: Path,
    tmp_path_factory: pytest.TempPathFactory,
) -> FCOSVirtualMachine:
    """Running CoreOS VM with PostgreSQL installed at the default PG version.

    The VM is created once per test module and destroyed in teardown.
    All tests in the same module share this VM instance.
    """
    fcos_ign = ensure_fcos_ign(POSTGRESQL_DIR)
    test_ign = tmp_path_factory.mktemp("ign") / "fcos-test.ign"
    build_test_ignition(
        base_ignition=fcos_ign,
        ssh_pubkey=test_ssh_pubkey,
        output=test_ign,
    )
    vm = FCOSVirtualMachine(
        name=f"postgresql-{os.getpid()}",
        ignition_file=test_ign,
        virtiofs_dir=virtiofs_dir,
    )
    vm.create()
    vm.wait_ssh(ssh_key=test_ssh_key, timeout=300)
    vm.wait_for_service("postgresql.target", ssh_key=test_ssh_key, timeout=300)
    yield vm
    vm.destroy()

@pytest.fixture(scope="module")
def pg_host(postgresql_vm: FCOSVirtualMachine, test_ssh_key: Path):
    """testinfra SSH host connected to the default PostgreSQL VM."""
    return testinfra.get_host(
        f"ssh://root@{postgresql_vm.ip}",
        ssh_extra_args=(
            f"-i {test_ssh_key}"
            " -o StrictHostKeyChecking=no"
            " -o UserKnownHostsFile=/dev/null"
        ),
    )

@pytest.fixture(scope="module")
def upgrade_virtiofs_dir() -> Path:
    """Unique VirtioFS host directory for the upgrade test VM."""
    d = Path("/srv") / f"fcos-test-pg-upgrade-{os.getpid()}"
    d.mkdir(parents=True, exist_ok=True)
    yield d
    if d.exists():
        shutil.rmtree(d)

@pytest.fixture(scope="module")
def upgrade_vm(
    test_ssh_key: Path,
    test_ssh_pubkey: str,
    upgrade_virtiofs_dir: Path,
    tmp_path_factory: pytest.TempPathFactory,
) -> FCOSVirtualMachine:
    """Running CoreOS VM with PostgreSQL installed at PG_MAJOR_UPGRADE_FROM.

    Used exclusively by test_upgrade.py to verify the major version upgrade path.
    The config.env is overridden via the ignition overlay so the VM boots
    directly with PG_MAJOR_UPGRADE_FROM, regardless of the cookbook's default.
    """
    fcos_ign = ensure_fcos_ign(POSTGRESQL_DIR)
    test_ign = tmp_path_factory.mktemp("ign-upgrade") / "fcos-upgrade.ign"
    build_test_ignition(
        base_ignition=fcos_ign,
        ssh_pubkey=test_ssh_pubkey,
        output=test_ign,
        config_env_overrides=_default_config_env(PG_MAJOR_UPGRADE_FROM),
    )
    vm = FCOSVirtualMachine(
        name=f"pg-upgrade-{os.getpid()}",
        ignition_file=test_ign,
        virtiofs_dir=upgrade_virtiofs_dir,
    )
    vm.create()
    vm.wait_ssh(ssh_key=test_ssh_key, timeout=300)
    vm.wait_for_service("postgresql.target", ssh_key=test_ssh_key, timeout=300)
    yield vm
    vm.destroy()

postgresql/tests/helpers.py (39)

@@ -0,0 +1,39 @@
"""Shared constants and helper functions for PostgreSQL integration tests.

These are extracted from conftest.py so that test modules can import them
without conflicting with pytest's conftest discovery mechanism.
"""

from pathlib import Path

# Default version shipped in the example config.env.
PG_MAJOR_DEFAULT = "14"

# Version to start from in the major-upgrade scenario.
PG_MAJOR_UPGRADE_FROM = "14"

# Version to upgrade to in the major-upgrade scenario.
PG_MAJOR_UPGRADE_TO = "17"

# Default credentials from config/examples/config.env.
PG_USER = "postgres"
PG_PASSWORD = "postgres"
PG_DB = "postgres"

def run_sql(vm, ssh_key: Path, sql: str) -> str:
    """Execute *sql* via ``podman exec`` on the running postgresql-server container.

    Uses the Unix socket at /var/run/postgresql inside the container (mapped
    from /run/quadlets/postgresql on the host). The pg_hba.conf generated by
    the official postgres image grants trust access on local sockets, so no
    password is required.

    Returns:
        Stripped stdout of the psql command.
    """
    result = vm.ssh_run(
        f"podman exec postgresql-server psql -U {PG_USER} -t -c \"{sql}\"",
        ssh_key,
    )
    return result.stdout.strip()

postgresql/tests/test_backup.py (119)

@@ -0,0 +1,119 @@
"""Test PostgreSQL backup creation and VirtioFS storage.

These tests verify that:
- The backup oneshot service can be triggered manually and runs to completion.
- The expected backup artefacts land in the VirtioFS share (accessible from
  the test runner's host filesystem without SSH).
- The backup retention policy removes stale backups.

Note: tests within a module share a single VM (module-scoped fixture), so
the order of test execution matters here: the backup files checked in later
tests are created by the earlier trigger test.
"""

import time
from pathlib import Path

# ---------------------------------------------------------------------------
# Trigger and completion
# ---------------------------------------------------------------------------

def test_create_database_and_table(postgresql_vm, test_ssh_key):
    """Create a test database and table with some data to ensure the backup has
    something to capture."""
    postgresql_vm.ssh_run(
        "podman exec postgresql-server psql -U postgres -c \"CREATE DATABASE test;\"",
        test_ssh_key,
    )
    postgresql_vm.ssh_run(
        "podman exec postgresql-server psql -U postgres -d test -c \"CREATE TABLE witness (id SERIAL PRIMARY KEY, version VARCHAR); INSERT INTO witness (version) SELECT version();\"",
        test_ssh_key,
    )

def test_trigger_backup(postgresql_vm, test_ssh_key):
    """Starting postgresql-backup.service must succeed (no immediate error)."""
    postgresql_vm.ssh_run(
        "systemctl start postgresql-backup.service",
        test_ssh_key,
    )

def test_backup_completes_successfully(postgresql_vm, test_ssh_key):
    """postgresql-backup.service must finish in ``inactive`` state (not ``failed``)."""
    state = postgresql_vm.wait_for_unit_done(
        "postgresql-backup.service", test_ssh_key, timeout=120
    )
    assert state == "inactive", (
        f"Backup service ended in unexpected state {state!r}. "
        "Run: systemctl status postgresql-backup.service --no-pager"
    )

# ---------------------------------------------------------------------------
# VirtioFS artefacts (verified from the host — no SSH required)
# ---------------------------------------------------------------------------

def test_backup_directory_exists_in_virtiofs(virtiofs_dir: Path):
    """The postgresql/backup sub-directory must exist in the VirtioFS share."""
    backup_root = virtiofs_dir / "postgresql" / "backup"
    assert backup_root.is_dir(), f"Backup directory not found on host: {backup_root}"

def test_at_least_one_backup_present(virtiofs_dir: Path):
    """At least one timestamped backup sub-directory must exist."""
    backup_root = virtiofs_dir / "postgresql" / "backup"
    backups = sorted(backup_root.iterdir())
    assert backups, f"No backup sub-directories found under {backup_root}"

def test_backup_manifest_present(virtiofs_dir: Path):
    """The latest backup must contain a ``backup_manifest`` file (pg_basebackup)."""
    backup_root = virtiofs_dir / "postgresql" / "backup"
    latest = sorted(backup_root.iterdir())[-1]
    assert (latest / "backup_manifest").exists(), (
        f"backup_manifest missing in {latest}"
    )

def test_backup_base_tar_present(virtiofs_dir: Path):
    """The latest backup must contain a ``base.tar`` cluster archive."""
    backup_root = virtiofs_dir / "postgresql" / "backup"
    latest = sorted(backup_root.iterdir())[-1]
    assert (latest / "base.tar").exists(), f"base.tar missing in {latest}"

def test_database_dump_present(virtiofs_dir: Path):
    """At least one ``dump-test.sql.gz`` file must exist alongside the cluster backup."""
    backup_root = virtiofs_dir / "postgresql" / "backup"
    latest = sorted(backup_root.iterdir())[-1]
    dumps = list(latest.glob("dump-test.sql.gz"))
    assert dumps, f"No dump-test.sql.gz files found in {latest}"

# ---------------------------------------------------------------------------
# Retention policy
# ---------------------------------------------------------------------------

def test_backup_retention_enforced(postgresql_vm, test_ssh_key, virtiofs_dir: Path):
    """After triggering several extra backups the count must stay within the
    configured retention limit (POSTGRES_BACKUP_RETENTION=7)."""
    retention = 7

    # Trigger ten additional backups so the rotation code has something to do.
    for _ in range(10):
        postgresql_vm.ssh_run(
            "systemctl start postgresql-backup.service", test_ssh_key
        )
        state = postgresql_vm.wait_for_unit_done(
            "postgresql-backup.service", test_ssh_key, timeout=120
        )
        assert state == "inactive"
        time.sleep(1)  # ensure distinct timestamp directories

    backup_root = virtiofs_dir / "postgresql" / "backup"
    count = len(list(backup_root.iterdir()))
    assert count <= retention, (
        f"Retention policy failed: {count} backups present, expected ≤ {retention}"
    )

postgresql/tests/test_install.py (149)

@@ -0,0 +1,149 @@
"""Test that a fresh PostgreSQL installation is healthy.

These tests run against a brand-new VM booted from the cookbook's default
ignition (PG_MAJOR=14, example credentials). They verify:
- All expected systemd units are in the correct state.
- The PostgreSQL server is listening and accepts queries.
- VirtioFS is mounted and the expected directories exist.
"""

from pathlib import Path

from helpers import PG_MAJOR_DEFAULT, run_sql

# ---------------------------------------------------------------------------
# Systemd unit state
# ---------------------------------------------------------------------------

def test_postgresql_target_active(pg_host):
    """postgresql.target must be active once the full startup chain completes."""
    assert pg_host.service("postgresql.target").is_running

def test_postgresql_server_running(pg_host):
    """The long-running PostgreSQL server container must be active."""
    assert pg_host.service("postgresql-server.service").is_running

def test_set_major_oneshot_completed(pg_host):
    """postgresql-set-major.service (oneshot) must have finished — not still running."""
    result = pg_host.run("systemctl is-active postgresql-set-major.service")
    assert result.stdout.strip() == "inactive"

def test_init_oneshot_completed(pg_host):
    """postgresql-init.service (oneshot) must have finished after initialization."""
    result = pg_host.run("systemctl is-active postgresql-init.service")
    assert result.stdout.strip() == "inactive"

def test_upgrade_oneshot_completed(pg_host):
    """postgresql-upgrade.service (oneshot) must have finished — no upgrade needed
    on a fresh install."""
    result = pg_host.run("systemctl is-active postgresql-upgrade.service")
    assert result.stdout.strip() == "inactive"

def test_backup_timer_scheduled(pg_host):
    """The daily backup timer must be active (scheduled)."""
    assert pg_host.service("postgresql-backup.timer").is_running

# ---------------------------------------------------------------------------
# Network / socket
# ---------------------------------------------------------------------------

def test_postgresql_port_listening(pg_host):
    """PostgreSQL must be listening on 127.0.0.1:5432 (POSTGRES_ARGS=-h 127.0.0.1)."""
    assert pg_host.socket("tcp://127.0.0.1:5432").is_listening

# ---------------------------------------------------------------------------
# Filesystem layout
# ---------------------------------------------------------------------------

def test_virtiofs_mounted(pg_host):
    """The VirtioFS share must be mounted at /var/lib/virtiofs/data."""
    mount = pg_host.mount_point("/var/lib/virtiofs/data")
    assert mount.exists
    assert mount.filesystem == "virtiofs"

def test_virtiofs_postgresql_dir(pg_host):
    """/var/lib/virtiofs/data/postgresql must be created by tmpfiles.d."""
    assert pg_host.file("/var/lib/virtiofs/data/postgresql").is_directory

def test_virtiofs_backup_dir(pg_host):
    """/var/lib/virtiofs/data/postgresql/backup must be created by tmpfiles.d."""
    assert pg_host.file("/var/lib/virtiofs/data/postgresql/backup").is_directory

def test_data_dir_exists(pg_host):
    """/var/lib/quadlets/postgresql must exist with the correct ownership."""
    f = pg_host.file("/var/lib/quadlets/postgresql")
    assert f.is_directory
    assert f.user == "postgresql"

def test_latest_symlink_exists(pg_host):
    """The 'latest' symlink must point to the active major-version directory."""
    link = pg_host.file("/var/lib/quadlets/postgresql/latest")
    assert link.exists
    assert link.is_symlink

def test_version_dir_exists(pg_host):
    """A directory named after PG_MAJOR_DEFAULT must exist under the data dir."""
    assert pg_host.file(
        f"/var/lib/quadlets/postgresql/{PG_MAJOR_DEFAULT}"
    ).is_directory

def test_initialized_flag_exists(pg_host):
    """The .initialized sentinel file must be written after a successful init."""
    assert pg_host.file("/var/lib/quadlets/postgresql/.initialized").exists

def test_config_env_present(pg_host):
    """/etc/quadlets/postgresql/config.env must be present and not world-readable."""
    f = pg_host.file("/etc/quadlets/postgresql/config.env")
    assert f.exists
    # mode 0600 — world and group bits must be 0
    assert f.mode & 0o077 == 0

# ---------------------------------------------------------------------------
# Database connectivity
# ---------------------------------------------------------------------------

def test_postgresql_accepts_connections(postgresql_vm, test_ssh_key):
    """PostgreSQL must respond to a trivial SQL query."""
    output = run_sql(postgresql_vm, test_ssh_key, "SELECT 1 AS probe")
    assert "1" in output

def test_postgresql_version_matches_config(postgresql_vm, test_ssh_key):
    """The running PostgreSQL server must report the version from PG_MAJOR_DEFAULT."""
    output = run_sql(postgresql_vm, test_ssh_key, "SHOW server_version")
    assert PG_MAJOR_DEFAULT in output

def test_can_create_database(postgresql_vm, test_ssh_key):
    """Should be possible to create a new database."""
    run_sql(
        postgresql_vm,
        test_ssh_key,
        "CREATE DATABASE install_test_db",
    )
    output = run_sql(
        postgresql_vm,
        test_ssh_key,
        "SELECT datname FROM pg_database WHERE datname = 'install_test_db'",
    )
    assert "install_test_db" in output

postgresql/tests/test_recovery.py (154)

@@ -0,0 +1,154 @@
"""Test PostgreSQL automatic crash recovery.

Scenarios covered:
1. Container crash (SIGKILL via ``podman kill``) → systemd restarts the
   service automatically (Restart=always, RestartSec=10).
2. Hard VM reboot → all services start cleanly and data is intact.

All tests share the module-scoped ``postgresql_vm`` fixture. Because some
tests are destructive (they kill the container), they are intentionally
sequenced: create data → crash → verify recovery → create more data →
reboot → verify recovery.
"""

import time

from helpers import run_sql

# Data written before the crash that must survive each recovery scenario.
CRASH_WITNESS_TABLE = "crash_witness"
CRASH_WITNESS_VALUE = "before_crash"
REBOOT_WITNESS_TABLE = "reboot_witness"
REBOOT_WITNESS_VALUE = "before_reboot"

# ---------------------------------------------------------------------------
# Scenario 1: container crash
# ---------------------------------------------------------------------------

def test_server_running_before_crash(pg_host):
    """Precondition: postgresql-server.service must be active before we crash it."""
    assert pg_host.service("postgresql-server.service").is_running

def test_create_data_before_crash(postgresql_vm, test_ssh_key):
    """Insert a row that must survive the container crash."""
    run_sql(
        postgresql_vm,
        test_ssh_key,
        (
            f"CREATE TABLE IF NOT EXISTS {CRASH_WITNESS_TABLE} "
            f"(id SERIAL PRIMARY KEY, message TEXT NOT NULL); "
            f"INSERT INTO {CRASH_WITNESS_TABLE} (message) "
            f"VALUES ('{CRASH_WITNESS_VALUE}');"
        ),
    )

def test_kill_postgresql_container(postgresql_vm, test_ssh_key):
    """Simulate a process crash by sending SIGKILL to the container.

    ``podman kill`` delivers SIGKILL to the container's PID 1. Systemd will
    detect the exit and restart the service after RestartSec=10 seconds.
    """
    postgresql_vm.ssh_run(
        "podman kill --signal SIGKILL postgresql-server",
        test_ssh_key,
    )

def test_service_restarts_automatically(postgresql_vm, test_ssh_key):
    """postgresql-server.service must be active again after the crash.

    Allow up to 120 seconds: systemd waits RestartSec=10 s before restarting,
    then the container start-up and health check take additional time.
    """
    # Brief pause to let systemd register the exit before we start polling.
    time.sleep(5)
    postgresql_vm.wait_for_service(
        "postgresql-server.service", test_ssh_key, timeout=120
    )

def test_data_intact_after_crash_recovery(postgresql_vm, test_ssh_key):
    """Rows written before the crash must be present after automatic recovery."""
    output = run_sql(
        postgresql_vm,
        test_ssh_key,
        f"SELECT message FROM {CRASH_WITNESS_TABLE} "
        f"WHERE message = '{CRASH_WITNESS_VALUE}'",
    )
    assert CRASH_WITNESS_VALUE in output, (
        f"Crash witness row not found after recovery. Query returned: {output!r}"
    )

def test_target_still_active_after_crash(pg_host):
    """postgresql.target must remain active after the container recovery."""
    assert pg_host.service("postgresql.target").is_running

# ---------------------------------------------------------------------------
# Scenario 2: hard reboot
# ---------------------------------------------------------------------------

def test_create_data_before_reboot(postgresql_vm, test_ssh_key):
    """Insert a row that must survive a full VM reboot."""
    run_sql(
        postgresql_vm,
        test_ssh_key,
        (
            f"CREATE TABLE IF NOT EXISTS {REBOOT_WITNESS_TABLE} "
            f"(id SERIAL PRIMARY KEY, message TEXT NOT NULL); "
            f"INSERT INTO {REBOOT_WITNESS_TABLE} (message) "
            f"VALUES ('{REBOOT_WITNESS_VALUE}');"
        ),
    )

def test_reboot_vm(postgresql_vm, test_ssh_key):
    """Trigger a graceful OS reboot. SSH will temporarily drop."""
    postgresql_vm.ssh_run("systemctl reboot", test_ssh_key, check=False)
    # Wait for the VM to go down before polling for SSH again.
    time.sleep(15)

def test_ssh_available_after_reboot(postgresql_vm, test_ssh_key):
    """SSH must become available again within 5 minutes of the reboot."""
    # Reset the cached IP so wait_ssh re-probes it.
    postgresql_vm._ip = None
    postgresql_vm.wait_ssh(ssh_key=test_ssh_key, timeout=300)

def test_postgresql_target_active_after_reboot(postgresql_vm, test_ssh_key):
    """postgresql.target must come up automatically on reboot (enabled in ignition)."""
    postgresql_vm.wait_for_service(
        "postgresql.target", ssh_key=test_ssh_key, timeout=300
    )

def test_data_intact_after_reboot(postgresql_vm, test_ssh_key):
    """Rows written before the reboot must still be present after boot."""
    output = run_sql(
        postgresql_vm,
        test_ssh_key,
        f"SELECT message FROM {REBOOT_WITNESS_TABLE} "
        f"WHERE message = '{REBOOT_WITNESS_VALUE}'",
    )
    assert REBOOT_WITNESS_VALUE in output, (
        f"Reboot witness row not found. Query returned: {output!r}"
    )

def test_crash_witness_also_intact_after_reboot(postgresql_vm, test_ssh_key):
    """Data written before the crash must also survive the subsequent reboot."""
    output = run_sql(
        postgresql_vm,
        test_ssh_key,
        f"SELECT message FROM {CRASH_WITNESS_TABLE} "
        f"WHERE message = '{CRASH_WITNESS_VALUE}'",
    )
    assert CRASH_WITNESS_VALUE in output

postgresql/tests/test_upgrade.py (163)

@@ -0,0 +1,163 @@
"""Test the PostgreSQL major version upgrade path: PG 14 → PG 17.

The upgrade mechanism works as follows:
1. postgresql-set-major.service updates the ``latest`` symlink to point at
   the new PG_MAJOR directory (e.g. /var/lib/quadlets/postgresql/17/).
2. postgresql-upgrade.service detects that ``latest/docker/PG_VERSION``
   does not exist (the 17/ directory is empty) and triggers pgautoupgrade.
3. pg_upgrade migrates data from the old directory to the new one.
4. postgresql-server.service starts against the upgraded data.

All tests in this module share a single ``upgrade_vm`` fixture that starts
with PG_MAJOR_UPGRADE_FROM (14). Tests are intentionally ordered to form a
sequential scenario: create data → trigger upgrade → verify outcome.
"""

from pathlib import Path

from helpers import PG_MAJOR_UPGRADE_FROM, PG_MAJOR_UPGRADE_TO, run_sql

# Sentinel table and row used to verify data survives the upgrade.
WITNESS_TABLE = "upgrade_witness"
WITNESS_VALUE = "before_upgrade"

# ---------------------------------------------------------------------------
# Pre-upgrade baseline
# ---------------------------------------------------------------------------

def test_initial_version_is_upgrade_from(upgrade_vm, test_ssh_key):
    """Precondition: the VM must be running PG_MAJOR_UPGRADE_FROM."""
    output = run_sql(upgrade_vm, test_ssh_key, "SHOW server_version")
    assert PG_MAJOR_UPGRADE_FROM in output, (
        f"Expected PG {PG_MAJOR_UPGRADE_FROM}, got: {output!r}"
    )

def test_create_witness_data(upgrade_vm, test_ssh_key):
    """Insert a row that must survive the major version upgrade."""
    run_sql(
        upgrade_vm,
        test_ssh_key,
        (
            f"CREATE TABLE IF NOT EXISTS {WITNESS_TABLE} "
            f"(id SERIAL PRIMARY KEY, message TEXT NOT NULL); "
            f"INSERT INTO {WITNESS_TABLE} (message) VALUES ('{WITNESS_VALUE}');"
        ),
    )
    output = run_sql(
        upgrade_vm,
        test_ssh_key,
        f"SELECT message FROM {WITNESS_TABLE} WHERE message = '{WITNESS_VALUE}'",
    )
    assert WITNESS_VALUE in output

# ---------------------------------------------------------------------------
# Trigger the upgrade
# ---------------------------------------------------------------------------

def test_bump_pg_major_in_config(upgrade_vm, test_ssh_key):
    """Change PG_MAJOR in config.env from UPGRADE_FROM to UPGRADE_TO."""
    upgrade_vm.ssh_run(
        f"sed -i 's/^PG_MAJOR={PG_MAJOR_UPGRADE_FROM}$/PG_MAJOR={PG_MAJOR_UPGRADE_TO}/' "
        "/etc/quadlets/postgresql/config.env",
        test_ssh_key,
    )
    # Verify the substitution worked.
    result = upgrade_vm.ssh_run(
        "grep ^PG_MAJOR= /etc/quadlets/postgresql/config.env",
        test_ssh_key,
    )
    assert f"PG_MAJOR={PG_MAJOR_UPGRADE_TO}" in result.stdout

def test_restart_postgresql_target(upgrade_vm, test_ssh_key):
    """Restart postgresql.target to kick off the upgrade chain."""
    upgrade_vm.ssh_run("systemctl restart postgresql.target", test_ssh_key)

def test_upgrade_service_completes(upgrade_vm, test_ssh_key):
    """postgresql-upgrade.service must finish in ``inactive`` state (not ``failed``).

    pgautoupgrade can take several minutes for large databases; allow up to
    10 minutes.
    """
    state = upgrade_vm.wait_for_unit_done(
        "postgresql-upgrade.service", test_ssh_key, timeout=600
    )
    assert state == "inactive", (
        f"Upgrade service ended in state {state!r}. "
        "Inspect with: systemctl status postgresql-upgrade.service --no-pager "
        "and: journalctl -u postgresql-upgrade.service"
    )

def test_server_active_after_upgrade(upgrade_vm, test_ssh_key):
    """postgresql-server.service must be active after the upgrade."""
    upgrade_vm.wait_for_service(
        "postgresql-server.service", test_ssh_key, timeout=120
    )

# ---------------------------------------------------------------------------
# Post-upgrade verification
# ---------------------------------------------------------------------------

def test_new_version_is_running(upgrade_vm, test_ssh_key):
    """PostgreSQL must now report PG_MAJOR_UPGRADE_TO as the server version."""
    output = run_sql(upgrade_vm, test_ssh_key, "SHOW server_version")
    assert PG_MAJOR_UPGRADE_TO in output, (
        f"Expected PG {PG_MAJOR_UPGRADE_TO} after upgrade, got: {output!r}"
    )

def test_witness_data_preserved(upgrade_vm, test_ssh_key):
    """The row inserted before the upgrade must still be present and correct."""
    output = run_sql(
        upgrade_vm,
        test_ssh_key,
        f"SELECT message FROM {WITNESS_TABLE} WHERE message = '{WITNESS_VALUE}'",
    )
    assert WITNESS_VALUE in output, (
        f"Witness row '{WITNESS_VALUE}' not found after upgrade. "
        f"Query returned: {output!r}"
    )

def test_old_data_dir_removed(upgrade_vm, test_ssh_key):
    """pgautoupgrade must remove the source data directory after a clean upgrade."""
    result = upgrade_vm.ssh_run(
        f"test -d /var/lib/quadlets/postgresql/{PG_MAJOR_UPGRADE_FROM}/docker",
        test_ssh_key,
        check=False,
    )
    assert result.returncode != 0, (
        f"Old data directory for PG {PG_MAJOR_UPGRADE_FROM} still exists — "
        "upgrade may not have cleaned up properly"
    )

def test_latest_symlink_points_to_new_version(upgrade_vm, test_ssh_key):
    """The ``latest`` symlink must now point at the PG_MAJOR_UPGRADE_TO directory."""
    result = upgrade_vm.ssh_run(
        "readlink /var/lib/quadlets/postgresql/latest",
        test_ssh_key,
    )
    assert PG_MAJOR_UPGRADE_TO in result.stdout, (
        f"latest symlink does not point at PG {PG_MAJOR_UPGRADE_TO}: "
        f"{result.stdout.strip()!r}"
    )

def test_new_data_dir_has_pg_version_file(upgrade_vm, test_ssh_key):
    """PG_VERSION file must exist in the new data directory (server is healthy)."""
    result = upgrade_vm.ssh_run(
        f"cat /var/lib/quadlets/postgresql/{PG_MAJOR_UPGRADE_TO}/docker/PG_VERSION",
        test_ssh_key,
    )
    assert PG_MAJOR_UPGRADE_TO in result.stdout

pyproject.toml (24)

@@ -0,0 +1,24 @@
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "podman-quadlet-cookbook-tests"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "pytest>=8.0",
    "pytest-testinfra>=10.1",
    "paramiko>=3.4",
]

[tool.pytest.ini_options]
# No testpaths set: pytest discovers tests in all */tests/ directories.
# Run a specific cookbook: pytest postgresql/tests/
log_cli = true
log_cli_level = "INFO"
addopts = "-v"

[tool.setuptools]
# This repo is not a Python package — suppress automatic package discovery.
packages = []

tests/__init__.py (0)

tests/vm.py (384)

@@ -0,0 +1,384 @@
"""Fedora CoreOS VM lifecycle helpers for end-to-end testing.

Requires running as root (virt-install, virsh, qemu-img need root privileges).

Typical usage:

    vm = FCOSVirtualMachine(
        name="postgresql-abc123",
        ignition_file=Path("/tmp/fcos-test.ign"),
        virtiofs_dir=Path("/srv/fcos-test-postgresql-abc123"),
    )
    vm.create()
    vm.wait_ssh(ssh_key=key_path)
    vm.wait_for_service("postgresql.target", ssh_key=key_path)
    # ... run tests ...
    vm.destroy()
"""

import base64
import re
import shutil
import subprocess
import tempfile
import textwrap
import time
from pathlib import Path

LIBVIRT_IMAGES_DIR = Path("/var/lib/libvirt/images")
FCOS_BASE_IMAGE = LIBVIRT_IMAGES_DIR / "library" / "fedora-coreos.qcow2"

# Butane spec version — must match the project convention.
BUTANE_VERSION = "1.4.0"

def ensure_fcos_ign(cookbook_dir: Path) -> Path:
    """Return the path to fcos.ign, building it via ``make butane`` if absent."""
    fcos_ign = cookbook_dir / "fcos.ign"
    if not fcos_ign.exists():
        subprocess.run(
            ["make", "-C", str(cookbook_dir), "butane"],
            check=True,
        )
    return fcos_ign

def build_test_ignition(
    base_ignition: Path,
    ssh_pubkey: str,
    output: Path,
    config_env_overrides: dict[str, str] | None = None,
    extra_files: dict[str, tuple[str, int]] | None = None,
) -> Path:
    """Build a test ignition file by overlaying the cookbook's fcos.ign.

    The overlay:
    - Merges the base cookbook ignition (fcos.ign).
    - Adds the test SSH public key to the root user so the test runner can
      SSH in (FCOS allows root login with keys via PermitRootLogin
      prohibit-password).
    - Optionally patches /etc/quadlets/postgresql/config.env via
      ``config_env_overrides`` (merged on top of whatever the base ignition
      already sets).
    - Optionally injects arbitrary extra files via ``extra_files``:
      ``{"/path/on/vm": ("file content", 0o644)}``.

    Args:
        base_ignition: Path to the pre-built fcos.ign for the cookbook.
        ssh_pubkey: Ed25519 public key string to inject for root.
        output: Destination path for the compiled test ignition.
        config_env_overrides: Key/value pairs to override in config.env.
            The full config.env is re-written with these values merged on
            top of the defaults from the base ignition.
        extra_files: Additional files to inject into the VM image.

    Returns:
        ``output`` path.
    """
    with tempfile.TemporaryDirectory() as _tmpdir:
        d = Path(_tmpdir)
        # butane resolves "local:" references relative to the directory passed
        # via -d; copy the base ignition there.
        shutil.copy(base_ignition, d / "base.ign")

        # Build the storage.files section of the overlay.
        storage_section = _build_storage_section(config_env_overrides, extra_files)

        overlay_bu = textwrap.dedent(f"""\
            variant: fcos
            version: {BUTANE_VERSION}
            ignition:
              config:
                merge:
                  - local: base.ign
            passwd:
              users:
                - name: root
                  ssh_authorized_keys:
                    - {ssh_pubkey}
            systemd:
              units:
                # Disable & mask zincati to avoid reboots during testing.
                - name: zincati.service
                  enabled: false
                  mask: true
            """)
        if storage_section:
            overlay_bu += storage_section

        overlay_bu_path = d / "test-overlay.bu"
        overlay_bu_path.write_text(overlay_bu)

        subprocess.run(
            [
                "butane",
                "--strict",
                "-d", str(d),
                "-o", str(output),
                str(overlay_bu_path),
            ],
            check=True,
        )
    return output

def _build_storage_section(
    config_env_overrides: dict[str, str] | None,
    extra_files: dict[str, tuple[str, int]] | None,
) -> str:
    """Return a Butane ``storage:`` YAML block (or empty string if nothing to inject)."""
    files = []
    if config_env_overrides:
        content = "\n".join(f"{k}={v}" for k, v in config_env_overrides.items()) + "\n"
        files.append(
            _butane_file("/etc/quadlets/postgresql/config.env", content, 0o600)
        )
    if extra_files:
        for path, (content, mode) in extra_files.items():
            files.append(_butane_file(path, content, mode))
    if not files:
        return ""
    joined = "\n".join(files)
    return f"storage:\n  files:\n{joined}\n"

def _butane_file(path: str, content: str, mode: int) -> str:
    """Return a Butane file entry using a base64 data URI (avoids YAML quoting)."""
    b64 = base64.b64encode(content.encode()).decode()
    return (
        f"    - path: {path}\n"
        f"      mode: {mode}\n"
        f"      contents:\n"
        f'        source: "data:text/plain;base64,{b64}"\n'
    )

class FCOSVirtualMachine:
    """Manages a Fedora CoreOS KVM virtual machine for end-to-end testing.

    All public methods are synchronous and raise on failure. The caller is
    responsible for calling ``destroy()`` (typically from a pytest fixture
    teardown).
    """

    def __init__(self, name: str, ignition_file: Path, virtiofs_dir: Path) -> None:
        """
        Args:
            name: Short identifier appended to "fcos-test-" to form the
                libvirt domain name. Keep it unique across parallel tests.
            ignition_file: Path to the compiled Ignition (.ign) file.
            virtiofs_dir: Host directory that will be exposed inside the VM
                at /var/lib/virtiofs/data via VirtioFS.
        """
        self.name = name
        self.vm_name = f"fcos-test-{name}"
        self.ignition_file = Path(ignition_file)
        self.virtiofs_dir = Path(virtiofs_dir)
        self._images_dir = LIBVIRT_IMAGES_DIR / self.vm_name
        self._ip: str | None = None

    # ------------------------------------------------------------------
    # Lifecycle
    # ------------------------------------------------------------------

    def create(self) -> None:
        """Create disk images and start the VM via virt-install."""
        self._images_dir.mkdir(parents=True, exist_ok=True)
        self.virtiofs_dir.mkdir(parents=True, exist_ok=True)

        ign_dest = self._images_dir / "fcos.ign"
        shutil.copy(self.ignition_file, ign_dest)
        ign_dest.chmod(0o644)

        # Root OS disk: copy from the shared base QCOW2 image.
        root_qcow2 = self._images_dir / "root.qcow2"
        shutil.copy(FCOS_BASE_IMAGE, root_qcow2)

        # Secondary disk for /var (keeps OS and data separate, matches common.mk).
        var_qcow2 = self._images_dir / "var.qcow2"
        subprocess.run(
            ["qemu-img", "create", "-f", "qcow2", str(var_qcow2), "100G"],
            check=True,
        )

        subprocess.run(
            [
                "virt-install",
                f"--name={self.vm_name}",
                "--import",
                "--noautoconsole",
                "--ram=4096",
                "--vcpus=2",
                "--os-variant=fedora-coreos-stable",
                f"--disk=path={root_qcow2},format=qcow2,size=50",
                f"--disk=path={var_qcow2},format=qcow2",
                f"--qemu-commandline=-fw_cfg name=opt/com.coreos/config,file={ign_dest}",
                "--network=network=default,model=virtio",
                "--console=pty,target.type=virtio",
                "--serial=pty",
                "--graphics=none",
                "--boot=uefi",
                "--memorybacking=access.mode=shared,source.type=memfd",
                (
                    f"--filesystem=type=mount,accessmode=passthrough,"
                    f"driver.type=virtiofs,driver.queue=1024,"
                    f"source.dir={self.virtiofs_dir},target.dir=data"
                ),
            ],
            check=True,
        )

    def destroy(self) -> None:
        """Forcefully stop and delete the VM and all associated disk images."""
        subprocess.run(["virsh", "destroy", self.vm_name], capture_output=True)
        subprocess.run(
            ["virsh", "undefine", self.vm_name, "--nvram"],
            capture_output=True,
        )
        if self._images_dir.exists():
            shutil.rmtree(self._images_dir)
        if self.virtiofs_dir.exists():
            shutil.rmtree(self.virtiofs_dir)

    # ------------------------------------------------------------------
    # Readiness polling
    # ------------------------------------------------------------------

    def get_ip(self) -> str | None:
        """Return the VM's primary IPv4 address reported by virsh, or None."""
        result = subprocess.run(
            ["virsh", "domifaddr", self.vm_name],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            return None
        match = re.search(r"(\d+\.\d+\.\d+\.\d+)", result.stdout)
        return match.group(1) if match else None

    @property
    def ip(self) -> str:
        if self._ip is None:
            self._ip = self.get_ip()
        if self._ip is None:
            raise RuntimeError(f"VM {self.vm_name!r} has no IP address yet")
        return self._ip

    def wait_ssh(self, ssh_key: Path, timeout: int = 300) -> str:
        """Block until SSH is reachable. Returns the IP address.

        Polls every 5 seconds until ``timeout`` seconds have elapsed.
        """
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            ip = self.get_ip()
            if ip:
                try:
                    result = subprocess.run(
                        [
                            "ssh",
                            "-i", str(ssh_key),
                            "-o", "StrictHostKeyChecking=no",
                            "-o", "UserKnownHostsFile=/dev/null",
                            "-o", "ConnectTimeout=5",
                            "-o", "BatchMode=yes",
                            f"root@{ip}",
                            "true",
                        ],
                        capture_output=True,
                        timeout=10,
                    )
                    if result.returncode == 0:
                        self._ip = ip
                        return ip
                except subprocess.TimeoutExpired:
                    pass
            time.sleep(5)
        raise TimeoutError(
            f"VM {self.vm_name!r} did not become SSH-ready within {timeout}s"
        )

    def wait_for_service(
        self, service: str, ssh_key: Path, timeout: int = 120
    ) -> None:
        """Block until *service* reaches the ``active`` state."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            result = self.ssh_run(
                f"systemctl is-active {service}", ssh_key, check=False
            )
            if result.stdout.strip() == "active":
                return
            time.sleep(5)
        status = self.ssh_run(
            f"systemctl status {service} --no-pager", ssh_key, check=False
        )
        raise TimeoutError(
            f"Service {service!r} not active after {timeout}s:\n{status.stdout}"
        )

    def wait_for_unit_done(
        self, service: str, ssh_key: Path, timeout: int = 120
    ) -> str:
        """Block until a oneshot service finishes (``inactive`` or ``failed``).

        Returns:
            The final state string: ``"inactive"`` on success, ``"failed"``
            on failure.
        """
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            result = self.ssh_run(
                f"systemctl is-active {service}", ssh_key, check=False
            )
            state = result.stdout.strip()
            if state in ("inactive", "failed"):
                return state
            time.sleep(5)
        raise TimeoutError(
            f"Service {service!r} did not finish within {timeout}s"
        )

    # ------------------------------------------------------------------
    # Remote execution
    # ------------------------------------------------------------------

    def ssh_run(
        self,
        command: str,
        ssh_key: Path,
        check: bool = True,
    ) -> subprocess.CompletedProcess:
        """Run a shell command in the VM via SSH.

        Args:
            command: Shell command string passed to the remote bash.
            ssh_key: Path to the private key used for authentication.
            check: If True (default), raise RuntimeError on non-zero exit.

        Returns:
            CompletedProcess with stdout/stderr as text.
        """
        result = subprocess.run(
            [
                "ssh",
                "-i", str(ssh_key),
                "-o", "StrictHostKeyChecking=no",
                "-o", "UserKnownHostsFile=/dev/null",
                f"root@{self.ip}",
                command,
            ],
            capture_output=True,
            text=True,
        )
        if check and result.returncode != 0:
            raise RuntimeError(
                f"SSH command failed (exit {result.returncode}): {command!r}\n"
                f"stdout: {result.stdout}\nstderr: {result.stderr}"
            )
        return result