mirror of
https://github.com/kuhyx/testsAndMisc.git
synced 2026-07-04 15:23:03 +02:00
docs: add oom-prevention skill for git hook memory management
Documents the machine-freeze root cause (zram + cgroup without MemorySwapMax=0), the run_capped() pattern in .git/hooks/, the 2GB nested cgroup per pytest package, and the COVERAGE_FILE isolation fix for pytest-cov SQLite corruption.
parent
39c47777be
commit
30abcd5864
118
.github/skills/oom-prevention/SKILL.md
vendored
Normal file
@ -0,0 +1,118 @@
---
name: oom-prevention
description: "Prevent OOM freezes from pre-commit and pre-push hooks. Apply when modifying .git/hooks/, scripts/pytest_changed_packages.py, or any hook that runs pytest, mypy, or Node.js tools. The system has zram swap (4 GB compressed, stored IN RAM). Without MemorySwapMax=0, a memory-limited cgroup thrashes zram instead of dying cleanly, freezing the machine."
---

# OOM Prevention in Git Hooks

## Why This Matters

This machine uses **zram swap** (compressed RAM-backed swap, ~4 GB). Unless swap is
disabled in the cgroup limits, a process that hits `MemoryMax` has its pages swapped
out to zram instead of being killed. Because zram itself lives in RAM, the kernel
thrashes it, consuming MORE real RAM, and the machine hard-freezes before the OOM
killer fires.

**Symptom:** running `git push` or `git commit` freezes the entire machine for 5–10+
minutes.
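
To confirm that a machine is exposed to this failure mode, one can check whether any active swap device is zram-backed. The sketch below parses `/proc/swaps`, whose size column is in KiB. The function name `zram_swap_kib` is hypothetical and not part of this repository.

```python
from pathlib import Path


def zram_swap_kib(proc_swaps_text: str) -> int:
    """Return the total size in KiB of zram-backed swap devices."""
    total = 0
    # The first line of /proc/swaps is a column header; skip it.
    for line in proc_swaps_text.splitlines()[1:]:
        fields = line.split()
        # Columns: Filename, Type, Size (KiB), Used (KiB), Priority.
        if len(fields) >= 3 and fields[0].startswith("/dev/zram"):
            total += int(fields[2])
    return total


if __name__ == "__main__":
    swaps = Path("/proc/swaps")
    text = swaps.read_text() if swaps.exists() else ""
    print(f"zram swap: {zram_swap_kib(text) / 1024**2:.1f} GiB")
```

A non-zero result means swap pressure from a capped cgroup is paid for in real RAM, which is the situation described above.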

## The Fix: `MemorySwapMax=0`

Every hook invocation that runs memory-heavy tools (pytest, mypy, pylint, Node.js/TS)
**must** be wrapped in a systemd scope with _both_ `MemoryMax` and `MemorySwapMax=0`:

```bash
systemd-run --user --scope \
  -p MemoryMax=4G \
  -p MemorySwapMax=0 \
  -- <command>
```

`MemorySwapMax=0` disables swap for the cgroup entirely. When the cgroup reaches
`MemoryMax`, the process is OOM-killed (SIGKILL) right away rather than thrashing swap.
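
A Python caller can recognise this kill from the exit status. The sketch below is an illustration, not code from the hooks: `describe_exit` is a hypothetical helper, and it assumes the capped command's own exit status reaches the caller. SIGKILL can also come from other sources, so the result is only a hint.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Classify a hook subprocess exit status."""
    sigkill = int(signal.SIGKILL)
    # subprocess reports death by signal N as -N; shells report it as 128 + N.
    if returncode in (-sigkill, 128 + sigkill):
        return "killed by SIGKILL (possibly the cgroup OOM killer)"
    if returncode == 0:
        return "ok"
    return f"failed with exit status {returncode}"
```

Printing this message when a hook fails makes a capped-out run distinguishable from an ordinary test failure.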

## Affected Files

### `.git/hooks/pre-commit`

Contains a `run_capped()` shell function:

```bash
run_capped() {
  # Prefer a memory-capped, swap-free systemd scope when systemd-run is available.
  if command -v systemd-run >/dev/null 2>&1; then
    systemd-run --user --scope \
      -p MemoryMax=4G \
      -p MemorySwapMax=0 \
      -- "$@"
  else
    # Fallback: run the command uncapped.
    "$@"
  fi
}
```

All heavy tool invocations go through `run_capped`. The hook also sets
`NODE_OPTIONS=--max-old-space-size=512` for TypeScript/ESLint tools, which caps the
V8 old-generation heap at 512 MiB.
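
If a Node.js tool is launched from a Python script rather than from the shell hook, the same heap cap can be passed through the environment. This is a minimal sketch under that assumption; `node_env` and its parameters are hypothetical names, and it leaves any heap limit that is already set untouched.

```python
import os
from typing import Mapping, Optional


def node_env(base: Optional[Mapping[str, str]] = None, max_old_space_mb: int = 512) -> dict:
    """Return a copy of the environment with a V8 heap cap in NODE_OPTIONS."""
    env = dict(os.environ if base is None else base)
    flag = f"--max-old-space-size={max_old_space_mb}"
    existing = env.get("NODE_OPTIONS", "")
    # Keep unrelated options; only add the cap if no heap limit is set already.
    if "--max-old-space-size" not in existing:
        env["NODE_OPTIONS"] = f"{existing} {flag}".strip()
    return env
```

The returned dict can be passed as `env=` to `subprocess.run`, alongside the cgroup cap rather than instead of it.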

### `.git/hooks/pre-push`

Same `run_capped()` pattern as pre-commit, with `MemoryMax=4G MemorySwapMax=0`.

### `scripts/pytest_changed_packages.py`

Runs each affected package's tests in a **nested** 2 GB cgroup to keep memory
bounded across sequential package runs:

```python
# pytest_cmd is the per-package pytest argv built earlier in the script.
use_cgroup = shutil.which("systemd-run") is not None
cmd = (
    [
        "systemd-run", "--user", "--scope",
        "-p", "MemoryMax=2G",
        "-p", "MemorySwapMax=0",
        "--", *pytest_cmd,
    ]
    if use_cgroup
    else list(pytest_cmd)  # no systemd-run: run pytest uncapped
)
```

Each package also gets its own `COVERAGE_FILE` env var pointing to a tmpfile,
so sequential runs don't corrupt each other's SQLite coverage DB:

```python
# Reserve a unique coverage DB path for this package in the current directory.
with tempfile.NamedTemporaryFile(
    prefix=f".coverage_{pkg}_", dir=".", delete=False
) as tmp:
    cov_file = tmp.name
env = {**os.environ, "COVERAGE_FILE": cov_file}
result = subprocess.run(cmd, check=False, env=env)
# Remove the per-package DB once the run has finished.
Path(cov_file).unlink(missing_ok=True)
```

## Coverage DB Corruption

When multiple pytest-cov processes run sequentially sharing the same `.coverage`
SQLite file, the second process finds the first's leftover `.coverage` and tries to
`combine()` it as a parallel data source, but the schema is incompatible, which causes:

```
INTERNALERROR> coverage.exceptions.DataError:
  Couldn't use data file '.coverage': no such table: other_db.file
```

**Fix:** `COVERAGE_FILE=<unique-per-run-path>` so each subprocess writes to an
isolated DB, never seeing the prior run's file.
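
The isolation can be sketched as a small wrapper that gives every subprocess a fresh `COVERAGE_FILE` and removes it afterwards. This is an illustration under assumptions, not the script's actual code: `run_isolated`, `label`, and `workdir` are hypothetical names, and extra keyword arguments are simply forwarded to `subprocess.run`.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_isolated(cmd, label, workdir=".", **run_kwargs):
    """Run cmd with COVERAGE_FILE pointing at a fresh, uniquely named path."""
    fd, cov_file = tempfile.mkstemp(prefix=f".coverage_{label}_", dir=workdir)
    os.close(fd)  # only the unique path is needed, not the open descriptor
    env = {**os.environ, "COVERAGE_FILE": cov_file}
    try:
        return subprocess.run(cmd, check=False, env=env, **run_kwargs)
    finally:
        # Clean up even if the run is interrupted, so no stale DB is left behind.
        Path(cov_file).unlink(missing_ok=True)
```

Because each call gets its own path, a later run never sees an earlier run's data file, regardless of how the earlier run ended.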

## Quick Reference

| Problem | Root Cause | Fix |
|---|---|---|
| Machine freezes during `git push` | zram swap absorbs cgroup-limited process | `MemorySwapMax=0` in all cgroup scopes |
| Machine freezes during `git commit` | Same as above | `run_capped()` in `.git/hooks/pre-commit` |
| pytest INTERNALERROR on coverage combine | Stale `.coverage` SQLite DB from prior run | `COVERAGE_FILE=<unique-tmp>` per subprocess |
| pytest OOM during per-package run | Accumulated memory across package runs | 2 GB nested cgroup per package |

## Adding New Hooks

If you add a new pre-commit or pre-push hook that runs:
- `pytest` with coverage
- `mypy` or `pylint`
- `node` / `eslint` / TypeScript compilation

**Always wrap it in `run_capped()`** (in the shell hook) or the `systemd-run`
subprocess pattern (in Python scripts). Never call these tools naked.
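
For Python scripts, the subprocess pattern can be factored into a reusable helper. The sketch below is hypothetical (`scope_argv` and this Python `run_capped` are illustrative names, not existing functions in the repository). It mirrors the shell function's behaviour, including the uncapped fallback when `systemd-run` is missing.

```python
import shutil
import subprocess


def scope_argv(command, memory_max="4G"):
    """Build the argv for running command in a capped, swap-free user scope."""
    return [
        "systemd-run", "--user", "--scope",
        "-p", f"MemoryMax={memory_max}",
        "-p", "MemorySwapMax=0",
        "--", *command,
    ]


def run_capped(command, memory_max="4G", **run_kwargs):
    """Run command under the cap, or directly if systemd-run is unavailable."""
    if shutil.which("systemd-run") is not None:
        argv = scope_argv(command, memory_max)
    else:
        argv = list(command)
    return subprocess.run(argv, check=False, **run_kwargs)
```

A new hook step would then call, for example, `run_capped(["mypy", "."], memory_max="2G")` instead of invoking the tool directly.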