From a7c74f625fa7d83ec43395047a36b619f7fa9cfc Mon Sep 17 00:00:00 2001
From: jramos
Date: Wed, 1 Apr 2026 13:07:27 -0600
Subject: [PATCH] docs: clarify Python deps use apt packages not pip/venv

Dev server uses apt-managed python3-pandas and python3-openpyxl.
Production fix is the same. Updates README install step and rewrites
python-venv-setup.md to reflect the real setup with venv as fallback.
---
 README.md                 | 23 +++---------
 docs/python-venv-setup.md | 73 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+), 19 deletions(-)
 create mode 100644 docs/python-venv-setup.md

diff --git a/README.md b/README.md
index 4aaa0c2..d488469 100644
--- a/README.md
+++ b/README.md
@@ -68,7 +68,7 @@ The application provides:
 
 - Node.js 18 or later
 - npm
-- Python 3 with a venv containing `pandas` and `openpyxl` (required for compliance xlsx parsing)
+- Python 3 with `python3-pandas` and `python3-openpyxl` apt packages (required for compliance xlsx parsing)
 
 ---
 
@@ -97,28 +97,13 @@ npm install
 
 ### 4. Install Python dependencies
 
-Modern Debian/Ubuntu systems enforce PEP 668 and block system-wide pip installs. Create a virtual environment instead:
+Install via apt — this is the correct approach on Ubuntu/Debian and mirrors the dev server setup:
 
 ```bash
-# Install venv support if needed
-apt install -y python3-venv python3-full
-
-# Create the venv (once per server, from the app root)
-python3 -m venv /home/cve-dashboard/venv
-
-# Install packages into the venv
-/home/cve-dashboard/venv/bin/pip install -r backend/scripts/requirements.txt
+apt install -y python3-pandas python3-openpyxl
 ```
 
-Required packages: `pandas>=2.0.0`, `openpyxl>=3.0.0`
-
-Then set the `PYTHON_BIN` environment variable so the backend uses the venv Python:
-
-```bash
-export PYTHON_BIN=/home/cve-dashboard/venv/bin/python3
-```
-
-Add this to the server's startup environment (e.g., your systemd unit or `.env` file) so it persists across restarts. If `PYTHON_BIN` is not set, the backend falls back to the system `python3`.
+> If apt packages are unavailable or you need a specific version, see `docs/python-venv-setup.md` for the venv fallback approach.
 
 > The bulk notes import script (`import_notes_from_csv.py`) uses only Python stdlib and does **not** require these packages.
 
diff --git a/docs/python-venv-setup.md b/docs/python-venv-setup.md
new file mode 100644
index 0000000..f1f74aa
--- /dev/null
+++ b/docs/python-venv-setup.md
@@ -0,0 +1,73 @@
+# Python Dependencies — Compliance xlsx Parsing
+
+`parse_compliance_xlsx.py` requires `pandas` and `openpyxl`. This doc
+explains how each server has (or should have) these installed.
+
+---
+
+## Dev server — how it works
+
+Pandas and openpyxl are installed as **system apt packages**, not via pip
+or a venv. This is why there is no venv on dev and no `--break-system-packages`
+gymnastics. They were installed at some point via:
+
+```bash
+apt install python3-pandas python3-openpyxl
+```
+
+You can verify with:
+
+```bash
+python3 -c "import pandas; print(pandas.__file__)"
+# /usr/lib/python3/dist-packages/pandas/__init__.py ← apt-managed
+```
+
+---
+
+## Production server — how to fix it
+
+Production was missing pandas entirely. The fix mirrors what dev has:
+
+```bash
+apt-get update --fix-missing
+apt install -y python3-pandas python3-openpyxl
+```
+
+No venv, no pip, no `PYTHON_BIN` env var needed. After installing, restart
+the backend and the compliance xlsx upload will work.
+
+---
+
+## If apt packages are unavailable (fallback)
+
+If you're on a system where apt doesn't have pandas (unlikely on Ubuntu
+22.04/24.04), or you want isolation, use a venv:
+
+```bash
+apt install -y python3-venv python3-full
+python3 -m venv /home/cve-dashboard/venv
+/home/cve-dashboard/venv/bin/pip install -r /home/cve-dashboard/backend/scripts/requirements.txt
+```
+
+Then set `PYTHON_BIN` in the Node backend's environment:
+
+```bash
+export PYTHON_BIN=/home/cve-dashboard/venv/bin/python3
+```
+
+The backend reads `process.env.PYTHON_BIN` and falls back to `python3` if
+not set, so this only needs to be done if you're using a venv.
+
+---
+
+## Why pip3 may fail on modern Ubuntu/Debian
+
+PEP 668 (enforced in Ubuntu 23.04+) blocks `pip3 install` system-wide to
+prevent breaking apt-managed packages. The error looks like:
+
+```
+error: externally-managed-environment
+```
+
+Using `apt install python3-pandas` is the correct solution — pip is not
+needed when the distro packages the library directly.