Edward Joseph
AI Safety & Security

SemFire

Open-source semantic firewall for detecting advanced AI manipulation, multi-turn jailbreaks, and in-context scheming attacks.

70%+ Crescendo Attack Gap
5 Detection Engines
200+ Jailbreak Behaviors
4 Custom ATT&CK Techniques

Quick Start Demo

Real-time detection of prompt injection attempts using the SemFire CLI

The Challenge

Modern large language models (LLMs) face sophisticated attacks that operate at the semantic and conversational level. Traditional token-level filtering is insufficient when a model can infer a harmful goal from context built up across multiple conversation turns, even though no single message contains a flagged term.

  • Echo Chamber Attacks: context poisoning through multi-turn reasoning
  • Crescendo Attacks: gradual escalation bypassing safety filters
  • Tool Injection: malicious function calls disguised as safe text
  • Policy Violations: actions passing vendor guardrails but violating org policies

Enterprise Policy Enforcement

Demonstrating how SemFire enforces enterprise-specific policies that go beyond generic AI safety guardrails
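
To make the distinction between vendor guardrails and organization policy concrete, here is a minimal sketch of an org-level check applied to an action the vendor guardrail has already allowed. Everything in it (the `PolicyRule` class, the `check_action` function, the two example rules, and the `example.com` domain) is hypothetical and for illustration only; it is not SemFire's actual API or rule format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PolicyRule:
    """One hypothetical org policy: a name plus a predicate over a proposed action."""
    name: str
    violates: Callable[[Dict[str, str]], bool]


# Illustrative rules only; a real deployment would load these from org config.
ORG_RULES: List[PolicyRule] = [
    PolicyRule(
        name="no-external-email",
        violates=lambda a: a.get("tool") == "send_email"
        and not a.get("recipient", "").endswith("@example.com"),
    ),
    PolicyRule(
        name="no-prod-db-writes",
        violates=lambda a: a.get("tool") == "sql"
        and a.get("target") == "prod"
        and a.get("mode") == "write",
    ),
]


def check_action(action: Dict[str, str], vendor_allowed: bool) -> List[str]:
    """Return the names of the rules that block this action (empty list = allowed).

    An action the vendor guardrail already blocked is reported as such;
    otherwise every org rule is evaluated against it.
    """
    if not vendor_allowed:
        return ["vendor-guardrail"]
    return [rule.name for rule in ORG_RULES if rule.violates(action)]
```

In this sketch an email to an outside address passes the vendor check (`vendor_allowed=True`) yet is still reported under `no-external-email`, which is the gap the demo describes.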

Multi-Turn Attack Detection

Crescendo Detection

Multi-turn jailbreak detection showing how SemFire tracks escalation patterns across conversation turns
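
The idea of tracking escalation across turns can be sketched as follows. This is a toy illustration, not SemFire's detector: the `RISK_TERMS` lexicon stands in for a real per-turn risk model, and the function names, weights, and thresholds are all assumptions.

```python
from typing import Dict, List

# Toy lexicon standing in for a real per-turn risk model (assumption).
RISK_TERMS: Dict[str, float] = {
    "history": 0.1,
    "chemistry": 0.2,
    "ingredients": 0.4,
    "synthesize": 0.7,
    "step-by-step": 0.8,
}


def turn_risk(text: str) -> float:
    """Score one turn as the highest weight among the risk terms it contains."""
    lowered = text.lower()
    return max((w for term, w in RISK_TERMS.items() if term in lowered), default=0.0)


def detect_crescendo(turns: List[str], min_rises: int = 2, threshold: float = 0.6) -> bool:
    """Flag a conversation whose risk climbs over several turns and ends high.

    No early turn needs to cross the threshold on its own; the signal is the
    trajectory: at least `min_rises` increases between consecutive turns,
    with the latest turn at or above `threshold`.
    """
    scores = [turn_risk(t) for t in turns]
    rises = sum(1 for prev, cur in zip(scores, scores[1:]) if cur > prev)
    return bool(scores) and rises >= min_rises and scores[-1] >= threshold
```

A per-message filter with the same threshold would only see the final turn; the point of the trajectory check is that the earlier, individually mild turns become evidence once the scores are viewed as a sequence.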

Tool Injection Defense

Side-by-side comparison showing baseline (vulnerable) vs. SemFire-protected (blocked) tool injection attempts
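
One simple way to picture the protected path is a scan of untrusted text for embedded function-call payloads before that text reaches the model or the tool router. The sketch below is an assumption-laden illustration rather than SemFire's implementation: the tool names in `KNOWN_TOOLS` and the `{"name": ..., "arguments": ...}` payload shape are hypothetical.

```python
import json
import re
from typing import List

# Tool names the hypothetical application exposes (assumption for illustration).
KNOWN_TOOLS = {"send_email", "run_shell", "delete_file"}


def find_embedded_tool_calls(text: str) -> List[str]:
    """Return names of known tools invoked by JSON objects embedded in free text."""
    decoder = json.JSONDecoder()
    found: List[str] = []
    # Try to decode a JSON value starting at every opening brace.
    for match in re.finditer(r"\{", text):
        try:
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
        if isinstance(obj, dict):
            name = obj.get("name")
            if isinstance(name, str) and name in KNOWN_TOOLS:
                found.append(name)
    return found
```

A baseline pipeline would pass such text straight through; in this sketch, any non-empty result would be grounds to block or quarantine the input.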

MITRE ATT&CK Integration

SemFire detections mapped to the MITRE ATT&CK v18 framework in ATT&CK Navigator, with four custom LLM attack techniques (T1656-T1659)
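
As a rough sketch of what such a mapping could produce, the function below turns per-technique detection counts into a dictionary shaped like an ATT&CK Navigator layer. The function name, the `enterprise-attack` domain, and the layer format version are assumptions; which SemFire detector maps to which technique ID is not stated here, so the input is keyed by technique ID directly.

```python
import json
from typing import Any, Dict


def build_layer(counts: Dict[str, int], name: str = "SemFire detections") -> Dict[str, Any]:
    """Build a Navigator-style layer dict from {technique ID: detection count}.

    The field names follow the Navigator layer format as I understand it;
    the domain and layer version values are assumptions to verify.
    """
    return {
        "name": name,
        "domain": "enterprise-attack",                 # assumed domain
        "versions": {"attack": "18", "layer": "4.5"},  # layer version assumed
        "techniques": [
            {"techniqueID": tid, "score": n, "comment": f"{n} detection(s)"}
            for tid, n in sorted(counts.items())
        ],
    }


# Example counts are made up for illustration.
layer_json = json.dumps(build_layer({"T1659": 1, "T1656": 3}), indent=2)
```

The resulting JSON string could be saved to a file and loaded into Navigator, subject to checking the field values against the layer format the installed Navigator version expects.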

Technical Architecture

Multi-Detector Framework
  • Rule-Based Detector
  • Heuristic Detector
  • Echo Chamber Detector
  • Crescendo Escalation Detector
  • Injection Detector
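
A multi-detector design of this kind can be sketched as a set of independent scoring functions whose results are aggregated against a tunable threshold. The two toy detectors, the `analyze` function, and the max-score aggregation below are illustrative assumptions, not SemFire's actual detectors or API.

```python
from typing import Callable, Dict, Tuple

Detector = Callable[[str], float]  # each detector returns a risk score in [0, 1]


def rule_based(text: str) -> float:
    """Toy rule: a well-known injection phrase yields maximum risk."""
    return 1.0 if "ignore previous instructions" in text.lower() else 0.0


def heuristic(text: str) -> float:
    """Toy heuristic: risk grows with the count of override-style words."""
    words = ("ignore", "override", "bypass", "pretend")
    hits = sum(text.lower().count(w) for w in words)
    return min(1.0, 0.25 * hits)


DETECTORS: Dict[str, Detector] = {"rule_based": rule_based, "heuristic": heuristic}


def analyze(text: str, threshold: float = 0.5) -> Tuple[bool, Dict[str, float]]:
    """Run every detector and flag the text if any score reaches the threshold."""
    scores = {name: fn(text) for name, fn in DETECTORS.items()}
    return max(scores.values()) >= threshold, scores
```

Returning the per-detector scores alongside the verdict keeps the decision explainable, and lowering `threshold` trades more flags for more false positives.
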
Deployment Options
  • Python Library
  • REST API Service
  • Command-Line Interface
  • Docker Containers
Key Technologies
  • Python 3.9+
  • FastAPI
  • Transformers
  • PyTorch
  • Prometheus

Results & Impact

Detection Capabilities
  • High success rate detecting sophisticated attacks that bypass traditional filters
  • Low false positives with tunable thresholds for production environments
  • Real-time analysis suitable for interactive applications
Community Adoption
  • Open-source toolkit with active development
  • Integration with MITRE ATT&CK framework
  • Comprehensive documentation and examples

Interested in AI Safety Solutions?

Whether you're building AI applications, implementing security measures, or need expertise in LLM safety, we can help you protect your systems from advanced manipulation attacks.