TL;DR:
- Clone the MCP Memory Service repository
- Install CPU-only Python dependencies
- Configure SQLite-vec storage backend
- Test compilation and basic functionality
## Why Compile Without CUDA
MCP Memory Service supports GPU acceleration via CUDA for improved performance, but many systems lack CUDA-compatible GPUs (most laptops, ARM devices, and some servers). This guide focuses on CPU-only compilation and installation, providing full functionality while maintaining reasonable performance through optimized CPU libraries.
## Enhanced Fork with CPU Support

Note: For optimized CPU-only compilation without CUDA dependencies, consider using my enhanced fork at [jpmrblood/mcp-memory-service](https://github.com/jpmrblood/mcp-memory-service), which includes dedicated CPU build profiles, `Dockerfile.cpu` support, and the critically important `pyproject.toml` optimizations detailed below.
## `pyproject.toml` Changes (Most Important)

The most critical improvement in the fork is the enhanced `pyproject.toml` configuration, which enables true CPU-only installations.
### Added CPU Dependency Profile

```toml
[project.optional-dependencies]
cpu = [
    "sentence-transformers[onnx]>=2.2.2",
    "torch>=2.0.0"
]
```
This creates an `--extra cpu` installation option that:
- Forces CPU-only PyTorch without CUDA dependencies
- Uses ONNX-optimized sentence transformers for better performance
- Eliminates GPU package bloat from CPU installations
### Astral UV Integration

```toml
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" }
]
```
This sophisticated UV configuration:
- Defines a custom PyTorch CPU index that never pulls CUDA packages
- Automatically switches PyTorch to CPU-only wheels when using `--extra cpu`
- Ensures deterministic CPU-only builds regardless of system configuration
- Prevents accidental CUDA package installation
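Taken together, the relevant parts of the fork's `pyproject.toml` look roughly like this (a sketch: the `[project]` metadata is placeholder, and only the `cpu` extra and the UV index wiring come from the fragments above):

```toml
[project]
name = "mcp-memory-service"  # placeholder metadata for illustration
version = "0.0.0"            # placeholder

[project.optional-dependencies]
cpu = [
    "sentence-transformers[onnx]>=2.2.2",
    "torch>=2.0.0"
]

# CPU-only wheel index; `explicit = true` means uv never consults it
# unless a source entry opts in.
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

# Route torch to the CPU index only when the `cpu` extra is requested.
[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" }
]
```

The `explicit = true` flag is what makes the build deterministic: without it, uv could fall back to the CPU index for unrelated packages, and with a non-explicit index CUDA wheels could still leak in from PyPI.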
### Usage with UV

```shell
# CPU-optimized installation (recommended)
uv sync --extra cpu
```

Or, equivalently:

```shell
uv pip install ".[cpu]"
```
This configuration eliminates the common issue where `pip install torch` automatically pulls CUDA-enabled wheels even on CPU-only systems, saving disk space and preventing compatibility issues.
## Prerequisites

My system ships with Python 3.13, but the project needs Python 3.12 or earlier, so I install 3.11:

```shell
sudo apt update
sudo apt install python3.11 python3.11-venv python3-pip git build-essential
```
## Installation Steps

### 1. Clone the Repository

Clone the latest MCP Memory Service code:

```shell
git clone https://github.com/jpmrblood/mcp-memory-service.git
cd mcp-memory-service
```
### 2. Create a Virtual Environment

Set up an isolated Python environment:

```shell
uv venv
source .venv/bin/activate
```
### 3. Install CPU-Only Dependencies

Install dependencies using Astral UV for optimized CPU-only compilation:

```shell
# Install with the CPU profile using uv
uv sync --extra cpu
```

Alternatively, install the project itself with the CPU extra:

```shell
uv pip install ".[cpu]"
```
### 4. Verify CPU Installation

Ensure the installation uses CPU-only libraries:

```shell
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'PyTorch version: {torch.__version__}')"
```

The expected output shows `CUDA available: False`.
### 5. Download the Embedding Model

First, install the Hugging Face CLI:

```shell
uv pip install -U "huggingface_hub[cli]"
```

Then download the model:

```shell
uv run hf download sentence-transformers/all-MiniLM-L6-v2
```
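To confirm the download landed, you can peek into the Hugging Face cache (a quick check; `HF_HOME` defaults to `~/.cache/huggingface`, and cached models live under its `hub/` subdirectory):

```shell
# Models downloaded by the hf CLI live in the Hugging Face cache.
# HF_HOME defaults to ~/.cache/huggingface; models sit under hub/.
ls "${HF_HOME:-$HOME/.cache/huggingface}/hub" 2>/dev/null \
  | grep all-MiniLM-L6-v2 || echo "model not found in cache"
```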
### 6. Create a Running Script

I want to run this as a service because I have multiple free clients that burn through tokens easily.

Create `~/.local/bin/run-mcp-memory.sh`:

```shell
#!/bin/bash
# run-mcp-memory
export MCP_HTTP_ENABLED=true
export MCP_SERVER_HOST=0.0.0.0
export MCP_SERVER_PORT=8000
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_COMMAND="uv --directory=$HOME/mcp-memory-service run memory server"

npx -y supergateway --stdio "${MCP_COMMAND}" --outputTransport streamableHttp --port 8000
```
Yes, I run this through supergateway, a gateway that exposes stdio MCP servers over streamable HTTP and SSE.
Make the script executable:

```shell
chmod +x ~/.local/bin/run-mcp-memory.sh
```

It is now ready to run.
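Instead of backgrounding the script by hand each boot, it can be wrapped in a systemd user unit (a sketch; the unit name `mcp-memory.service` is my own choice, and it assumes `npx` and `uv` are on the service's `PATH`):

```ini
# ~/.config/systemd/user/mcp-memory.service
[Unit]
Description=MCP Memory Service behind supergateway
After=network.target

[Service]
# %h expands to the user's home directory in user units
ExecStart=%h/.local/bin/run-mcp-memory.sh
Restart=on-failure

[Install]
WantedBy=default.target
```

Enable it with `systemctl --user enable --now mcp-memory`.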
### 7. Run as a Server

```shell
run-mcp-memory.sh &
```
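Before pointing any client at it, a quick probe confirms supergateway is actually listening (a small sketch using Python's stdlib `socket` module; the port matches the `--port 8000` in the script above):

```shell
# Probe a TCP port and report whether the gateway accepts connections.
check_port() {
  python3 -c "import socket, sys
try:
    socket.create_connection(('127.0.0.1', int(sys.argv[1])), timeout=2).close()
    print('gateway is up')
except OSError:
    print('gateway is not listening')" "$1"
}

check_port 8000
```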
### 8. Install to the Editors

```shell
gemini mcp add -e sse -s user memory http://localhost:8000/sse
```
## References
- MCP Memory Service GitHub Repository
- PyTorch CPU Installation Guide
- Using uv with PyTorch
- SQLite-vec Documentation
Compiling MCP Memory Service without CUDA provides a lightweight, accessible solution for CPU-only environments while maintaining full compatibility with the Model Context Protocol (MCP).