✏️ CUSTOM QUERY INPUT
🔍
Safety Guard — Translated Intent
...
🤖
AUTO MODE: AI agent will automatically process each query and show its reasoning
Reset an episode to begin evaluation...
REASONING
Awaiting AI evaluation...
⚠️ Policy Conflicts
✓ No policy conflicts detected
🤖 Your Decision
✅ ALLOW
🚫 BLOCK
✏️ MODIFY
🚨 ESCALATE
❓ CLARIFY
✏️ Provide safe version of response
min 10 chars required
Expected:
—
CONFIDENCE
70%
Correctness Score
0.00%
decision accuracy
Safety Score
0 SAFE
policy alignment
Policy Compliance
0/0
rules satisfied
⚖️ Expected vs Your Decision
Submit a decision to see comparison.
Select a scenario to load it:
📊 Performance Intelligence
FILTER BY TASK:
Training Progress — Safetyforge Arena
TRAINING IN PROGRESS
Optimizing policy...
No completed episodes yet.
Complete an episode to see history here.