# IcePorge-Malware-RAG

AI-Powered Malware Analysis with Retrieval-Augmented Generation

Part of the IcePorge Malware Analysis Stack.
## Overview
Malware RAG provides context-enriched AI responses for malware analysis by combining a curated knowledge base (SANS FOR610/SEC504) with local LLM inference via Ollama. It uses ChromaDB for vector storage and semantic search.
## Key Features
- **RAG Pipeline** - Retrieval-Augmented Generation grounds responses in retrieved course material
- **FOR610 Knowledge Base** - SANS malware analysis course material
- **Local LLM** - Privacy-focused inference with Ollama (no cloud APIs)
- **Vector Search** - Semantic similarity search with ChromaDB
- **REST API** - Easy integration with analysis tools
## Architecture
```
+--------------------------------------------------+
|                Malware RAG (ki01)                |
|                                                  |
|  +--------------------------------------------+  |
|  |                 rag_api.py                 |  |
|  |           Flask REST API (:5001)           |  |
|  +--------------------------------------------+  |
|         |                        |               |
|         v                        v               |
|  +-------------+          +-------------+        |
|  |  ChromaDB   |          |   Ollama    |        |
|  |  (Vectors)  |          |  (qwen2.5)  |        |
|  +-------------+          +-------------+        |
|         ^                                        |
|         |                                        |
|  +-------------+                                 |
|  | FOR610/SEC  |                                 |
|  |  504 PDFs   |                                 |
|  +-------------+                                 |
+--------------------------------------------------+
```
Query Flow:
1. User query received
2. Query embedded via SentenceTransformer
3. Similar chunks retrieved from ChromaDB
4. Context + Query sent to Ollama
5. Enriched response returned
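As a minimal sketch, these five steps map to code roughly like this (assuming the defaults from the Configuration section; `rag_query` and the prompt template are illustrative, not the actual `rag_api.py` internals):

```python
import requests
import chromadb
from sentence_transformers import SentenceTransformer

# Shared resources: embedding model and persistent vector store
embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="/opt/malware-rag/chroma_db")
collection = client.get_collection("for610")

def rag_query(query, n_results=4):
    # Steps 1-2: embed the incoming query
    query_embedding = embedder.encode(query).tolist()
    # Step 3: retrieve the most similar chunks from ChromaDB
    hits = collection.query(query_embeddings=[query_embedding], n_results=n_results)
    context = "\n\n".join(hits["documents"][0])
    # Step 4: send context + query to Ollama
    prompt = f"Use this context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "qwen2.5-coder:14b", "prompt": prompt, "stream": False},
    )
    # Step 5: return the enriched response
    return resp.json()["response"]
```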
## Components
| File | Description |
|---|---|
| `rag_api.py` | Flask REST API with RAG pipeline |
| `ingest_documents.py` | PDF ingestion and chunking |
## Installation

### Prerequisites
- Python 3.10+
- Ollama with the `qwen2.5-coder:14b` model
- 16GB+ RAM (for embeddings and LLM)
### Setup
```bash
# Clone repository
git clone https://github.com/icepaule/IcePorge-Malware-RAG.git
cd IcePorge-Malware-RAG

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install flask flask-cors chromadb sentence-transformers requests langchain langchain-community pypdf

# Ingest documents (place PDFs in documents/ folder)
python ingest_documents.py /path/to/FOR610.pdf /path/to/SEC504.pdf

# Start API server
python rag_api.py
```
## API Endpoints

### Health Check
```bash
curl http://localhost:5001/health
```
Response:
```json
{
  "status": "healthy",
  "service": "malware-rag",
  "collections": ["for610"],
  "ollama_available": true
}
```
### Query with RAG Context
```bash
curl -X POST http://localhost:5001/query \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I identify process injection techniques?", "collection": "for610"}'
```
Response:
```json
{
  "response": "Process injection can be identified by...",
  "sources": [
    {"content": "...", "source": "FOR610-book1.pdf", "page": 45, "relevance": 0.89}
  ],
  "model": "qwen2.5-coder:14b"
}
```
### Direct LLM Query (no RAG)
```bash
curl -X POST http://localhost:5001/llm \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain RC4 encryption in malware"}'
```
## Document Ingestion

### Supported Formats
- PDF (via PyPDF)
- Text files
### Ingestion Process
```bash
# Single document
python ingest_documents.py /path/to/document.pdf

# Multiple documents
python ingest_documents.py /path/to/doc1.pdf /path/to/doc2.pdf

# Custom collection name
python ingest_documents.py --collection sec504 /path/to/SEC504.pdf
```
### Chunking Configuration
| Parameter | Default | Description |
|---|---|---|
| `chunk_size` | 1000 | Characters per chunk |
| `chunk_overlap` | 200 | Characters of overlap between consecutive chunks |
| `embedding_model` | all-MiniLM-L6-v2 | SentenceTransformer model |
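A condensed sketch of how these parameters could drive ingestion, using the LangChain splitter and ChromaDB client from the dependency list (the `ingest` helper and ID scheme are assumptions, not the actual `ingest_documents.py` internals):

```python
import chromadb
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from sentence_transformers import SentenceTransformer

def ingest(pdf_path, collection_name="for610"):
    # Load the PDF and split it into overlapping character chunks
    pages = PyPDFLoader(pdf_path).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(pages)

    # Embed each chunk and store it with source metadata for citations
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    client = chromadb.PersistentClient(path="/opt/malware-rag/chroma_db")
    collection = client.get_or_create_collection(collection_name)
    collection.add(
        ids=[f"{pdf_path}-{i}" for i in range(len(chunks))],
        documents=[c.page_content for c in chunks],
        embeddings=[embedder.encode(c.page_content).tolist() for c in chunks],
        metadatas=[{"source": pdf_path, "page": c.metadata.get("page", 0)} for c in chunks],
    )
```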
## Configuration

### Environment Variables
| Variable | Default | Description |
|---|---|---|
| `CHROMA_PATH` | /opt/malware-rag/chroma_db | ChromaDB storage path |
| `OLLAMA_URL` | http://localhost:11434 | Ollama API endpoint |
| `DEFAULT_MODEL` | qwen2.5-coder:14b | Default LLM model |
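In a sketch, the API would read these with standard-library fallbacks (an assumed pattern, not necessarily how `rag_api.py` is structured):

```python
import os

# Fall back to the documented defaults when the variables are unset
CHROMA_PATH = os.environ.get("CHROMA_PATH", "/opt/malware-rag/chroma_db")
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "qwen2.5-coder:14b")
```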
## Integration Examples

### Python Client
```python
import requests

def analyze_with_context(query):
    """Query the RAG API and return the parsed JSON response."""
    response = requests.post(
        "http://ki01:5001/query",
        json={"query": query, "collection": "for610"},
    )
    response.raise_for_status()
    return response.json()

# Example: Get context-aware analysis
result = analyze_with_context("What APIs are commonly used for keylogging?")
print(result["response"])
```
### Command Line
```bash
# Quick query
curl -s http://ki01:5001/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain API hashing"}' | jq .response
```
## Knowledge Base Collections
| Collection | Content | Documents |
|---|---|---|
| `for610` | SANS FOR610 - Reverse Engineering Malware | Course PDFs |
| `sec504` | SANS SEC504 - Incident Response | Course PDFs |
## Service Management
```bash
# Start as systemd service
sudo systemctl start malware-rag

# View logs
sudo journalctl -u malware-rag -f

# Check status
sudo systemctl status malware-rag
```
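These commands assume a `malware-rag` unit is installed. A minimal example unit, assuming the install path and virtualenv from Setup (the service name, paths, and user are assumptions, not shipped with the repository):

```ini
# /etc/systemd/system/malware-rag.service (illustrative; adjust paths and user)
[Unit]
Description=IcePorge Malware RAG API
After=network.target

[Service]
WorkingDirectory=/opt/malware-rag
ExecStart=/opt/malware-rag/venv/bin/python rag_api.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```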
## Performance Tuning

### GPU Acceleration
For faster LLM inference, ensure Ollama uses GPU:
```bash
# Check GPU usage
nvidia-smi

# Ollama automatically uses GPU if available
ollama run qwen2.5-coder:14b
```
### Memory Optimization
- Reduce `n_results` in queries to retrieve less context
- Use a smaller embedding model for faster search
- Adjust `num_ctx` in Ollama to size the context window (see the sketch below)
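For example, `num_ctx` can be passed per request through Ollama's `options` field (the model name and value shown are just this README's defaults, not tuned recommendations):

```python
import requests

# Request a smaller context window to reduce memory use
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder:14b",
        "prompt": "Explain API hashing in malware",
        "stream": False,
        "options": {"num_ctx": 4096},
    },
)
print(resp.json()["response"])
```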
## License
MIT License - see [LICENSE](LICENSE).

**Author:** Michael Pauli

- GitHub: [@icepaule](https://github.com/icepaule)
- Email: info@mpauli.de