cve-dashboard/docs/python-venv-setup.md
jramos a7c74f625f docs: clarify Python deps use apt packages not pip/venv
Dev server uses apt-managed python3-pandas and python3-openpyxl.
Production fix is the same. Updates README install step and rewrites
python-venv-setup.md to reflect the real setup with venv as fallback.
2026-04-01 13:07:27 -06:00

Python Dependencies — Compliance xlsx Parsing

parse_compliance_xlsx.py requires pandas and openpyxl. This doc explains how each server has (or should have) these installed.


Dev server — how it works

Pandas and openpyxl are installed as system apt packages, not via pip or a venv. This is why there is no venv on dev and no --break-system-packages gymnastics. They were installed at some point via:

apt install python3-pandas python3-openpyxl

You can verify with:

python3 -c "import pandas; print(pandas.__file__)"
# /usr/lib/python3/dist-packages/pandas/__init__.py  ← apt-managed
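The same check can be scripted for both modules at once. This is a minimal sketch, assuming the apt-managed location is the /usr/lib/python3/dist-packages path shown above; describe_module is a hypothetical helper name, not something in the repo.

```python
import importlib.util

# Assumption: apt installs Python libraries here, as in the output shown above.
APT_DIST_PACKAGES = "/usr/lib/python3/dist-packages"


def describe_module(name):
    """Return (found, origin, apt_managed) for a top-level module name.

    Uses find_spec, so the module is located but not imported.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return (False, None, False)
    origin = spec.origin or ""
    return (True, origin, origin.startswith(APT_DIST_PACKAGES + "/"))


if __name__ == "__main__":
    for name in ("pandas", "openpyxl"):
        found, origin, apt_managed = describe_module(name)
        if not found:
            print(f"{name}: MISSING")
        else:
            source = "apt-managed" if apt_managed else "not under dist-packages"
            print(f"{name}: {origin} ({source})")
```

A module reported as MISSING points at the production situation described below; one found outside dist-packages was most likely installed by pip or lives in a venv.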

Production server — how to fix it

Production was missing pandas entirely. The fix mirrors what dev has:

apt-get update --fix-missing
apt install -y python3-pandas python3-openpyxl

No venv, no pip, no PYTHON_BIN env var needed. After installing, restart the backend and the compliance xlsx upload will work.


If apt packages are unavailable (fallback)

If you're on a system where apt doesn't have pandas (unlikely on Ubuntu 22.04/24.04), or you want isolation, use a venv:

apt install -y python3-venv python3-full
python3 -m venv /home/cve-dashboard/venv
/home/cve-dashboard/venv/bin/pip install -r /home/cve-dashboard/backend/scripts/requirements.txt

Then set PYTHON_BIN in the environment the Node backend is started from (an export in an unrelated shell has no effect on an already-running process):

export PYTHON_BIN=/home/cve-dashboard/venv/bin/python3

The backend reads process.env.PYTHON_BIN and falls back to python3 if not set, so this only needs to be done if you're using a venv.
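Before restarting the backend, it can help to confirm that the interpreter it will pick up can actually import both libraries. The sketch below mirrors the fallback rule described above in Python. It is not the backend's code: resolve_python_bin and can_import are hypothetical helpers, and treating an empty PYTHON_BIN as unset is an assumption about how the Node side behaves.

```python
import os
import subprocess


def resolve_python_bin(env=None):
    """Pick the interpreter: PYTHON_BIN if set and non-empty, else python3."""
    env = os.environ if env is None else env
    return env.get("PYTHON_BIN") or "python3"


def can_import(python_bin, modules=("pandas", "openpyxl")):
    """Return True if python_bin starts and imports every listed module."""
    code = "import " + ", ".join(modules)
    try:
        result = subprocess.run(
            [python_bin, "-c", code],
            capture_output=True,
            text=True,
        )
    except OSError:
        # Interpreter path does not exist or is not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    python_bin = resolve_python_bin()
    status = "OK" if can_import(python_bin) else "FAILED"
    print(f"{python_bin}: pandas/openpyxl import {status}")
```

Run it with the same environment the backend uses, so that PYTHON_BIN resolves the same way.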


Why pip3 may fail on modern Ubuntu/Debian

PEP 668 (enforced in Ubuntu 23.04+ and Debian 12+) blocks system-wide pip3 install so that pip cannot overwrite or break apt-managed packages. The error looks like:

error: externally-managed-environment

Using apt install python3-pandas is the correct solution — pip is not needed when the distro packages the library directly.
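To tell in advance whether a given interpreter is subject to this block, you can look for the marker file that PEP 668 defines: a file named EXTERNALLY-MANAGED in the interpreter's stdlib directory, which installers ignore inside a virtual environment. A minimal sketch under that reading of the PEP; the function names are hypothetical.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """True when running inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def externally_managed_marker():
    """Path where PEP 668 places the EXTERNALLY-MANAGED marker file."""
    return os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")


def pip_install_would_be_blocked():
    """True if a plain system-wide pip install is expected to be refused."""
    return (not in_virtualenv()) and os.path.isfile(externally_managed_marker())


if __name__ == "__main__":
    print("marker path:", externally_managed_marker())
    print("in venv:", in_virtualenv())
    print("pip blocked:", pip_install_would_be_blocked())
```

Run with the system python3, this is expected to report a block on distros that ship the marker; run with the venv's interpreter from the fallback section, it should not.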