docs: clarify Python deps use apt packages not pip/venv
Dev server uses apt-managed python3-pandas and python3-openpyxl. Production fix is the same. Updates README install step and rewrites python-venv-setup.md to reflect the real setup with venv as fallback.
23
README.md
@@ -68,7 +68,7 @@ The application provides:
 
 - Node.js 18 or later
 - npm
-- Python 3 with a venv containing `pandas` and `openpyxl` (required for compliance xlsx parsing)
+- Python 3 with `python3-pandas` and `python3-openpyxl` apt packages (required for compliance xlsx parsing)
 
 ---
 
@@ -97,28 +97,13 @@ npm install
 
 ### 4. Install Python dependencies
 
-Modern Debian/Ubuntu systems enforce PEP 668 and block system-wide pip installs. Create a virtual environment instead:
+Install via apt — this is the correct approach on Ubuntu/Debian and mirrors the dev server setup:
 
 ```bash
-# Install venv support if needed
-apt install -y python3-venv python3-full
-
-# Create the venv (once per server, from the app root)
-python3 -m venv /home/cve-dashboard/venv
-
-# Install packages into the venv
-/home/cve-dashboard/venv/bin/pip install -r backend/scripts/requirements.txt
+apt install -y python3-pandas python3-openpyxl
 ```
 
-Required packages: `pandas>=2.0.0`, `openpyxl>=3.0.0`
-
-Then set the `PYTHON_BIN` environment variable so the backend uses the venv Python:
-
-```bash
-export PYTHON_BIN=/home/cve-dashboard/venv/bin/python3
-```
-
-Add this to the server's startup environment (e.g., your systemd unit or `.env` file) so it persists across restarts. If `PYTHON_BIN` is not set, the backend falls back to the system `python3`.
+> If apt packages are unavailable or you need a specific version, see `docs/python-venv-setup.md` for the venv fallback approach.
 
 > The bulk notes import script (`import_notes_from_csv.py`) uses only Python stdlib and does **not** require these packages.
 
73
docs/python-venv-setup.md
Normal file
@@ -0,0 +1,73 @@
# Python Dependencies — Compliance xlsx Parsing

`parse_compliance_xlsx.py` requires `pandas` and `openpyxl`. This doc
explains how each server has (or should have) these installed.

---

## Dev server — how it works

Pandas and openpyxl are installed as **system apt packages**, not via pip
or a venv. This is why there is no venv on dev and no `--break-system-packages`
gymnastics. They were installed at some point via:

```bash
apt install python3-pandas python3-openpyxl
```

You can verify with:

```bash
python3 -c "import pandas; print(pandas.__file__)"
# /usr/lib/python3/dist-packages/pandas/__init__.py  ← apt-managed
```

---

## Production server — how to fix it

Production was missing pandas entirely. The fix mirrors what dev has:

```bash
apt-get update --fix-missing
apt install -y python3-pandas python3-openpyxl
```

No venv, no pip, no `PYTHON_BIN` env var needed. After installing, restart
the backend and the compliance xlsx upload will work.
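
Before restarting, a quick smoke test can confirm that both modules import under the interpreter the backend will use. The following is a minimal sketch and not part of the repository: `check_modules` is a hypothetical helper name, and it assumes the interpreter is passed as the first argument (for example `check_modules python3 pandas openpyxl`).

```bash
# Hypothetical helper (not in the repo): check that each named module
# imports under the given interpreter.
# Usage: check_modules python3 pandas openpyxl
check_modules() (
  # Subshell body keeps py/mod/missing out of the caller's environment.
  py="$1"; shift
  missing=0
  for mod in "$@"; do
    if "$py" -c "import $mod" >/dev/null 2>&1; then
      echo "ok: $mod"
    else
      echo "MISSING: $mod" >&2
      missing=1
    fi
  done
  # Exit status is 0 only when every module imported.
  exit "$missing"
)
```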

---

## If apt packages are unavailable (fallback)

If you're on a system where apt doesn't have pandas (unlikely on Ubuntu
22.04/24.04), or you want isolation, use a venv:

```bash
apt install -y python3-venv python3-full
python3 -m venv /home/cve-dashboard/venv
/home/cve-dashboard/venv/bin/pip install -r /home/cve-dashboard/backend/scripts/requirements.txt
```

Then set `PYTHON_BIN` in the Node backend's environment:

```bash
export PYTHON_BIN=/home/cve-dashboard/venv/bin/python3
```

The backend reads `process.env.PYTHON_BIN` and falls back to `python3` if
not set, so this only needs to be done if you're using a venv.
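
The fallback rule can be mirrored in shell to see which interpreter a given environment would select. This is only a sketch of the documented behaviour, not the backend's code (which reads `process.env.PYTHON_BIN` in Node). `resolve_python_bin` is a hypothetical name, and treating an empty `PYTHON_BIN` the same as an unset one is an assumption.

```bash
# Hypothetical helper: print the interpreter the documented rule would pick.
resolve_python_bin() {
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  printf '%s\n' "${PYTHON_BIN:-python3}"
}

# Show which interpreter would be used in the current environment.
echo "compliance parser would run with: $(resolve_python_bin)"
```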

---

## Why pip3 may fail on modern Ubuntu/Debian

PEP 668 (enforced in Ubuntu 23.04+) blocks `pip3 install` system-wide to
prevent breaking apt-managed packages. The error looks like:

```
error: externally-managed-environment
```

Using `apt install python3-pandas` is the correct solution — pip is not
needed when the distro packages the library directly.
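
PEP 668 has the distro signal this state with a marker file named `EXTERNALLY-MANAGED` in the interpreter's stdlib directory, so it is possible to check ahead of time whether a system-wide `pip3 install` would be refused. A minimal sketch follows. `has_pep668_marker` and `report_pip_policy` are hypothetical helper names, not part of the repo, and the second one assumes `python3` is on `PATH`.

```bash
# Succeeds when DIR contains the PEP 668 marker file.
has_pep668_marker() {
  [ -f "$1/EXTERNALLY-MANAGED" ]
}

# Look up the stdlib directory of the system python3 and report the result.
report_pip_policy() {
  stdlib_dir="$(python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))')" || return 1
  if has_pep668_marker "$stdlib_dir"; then
    echo "externally managed: use apt packages or a venv"
  else
    echo "no marker found: system-wide pip installs are not blocked by PEP 668"
  fi
}
```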