Research Library
Long-form research writeups generated by Hermes, published as readable static pages instead of Discord walls of text.
Evaluating Model/Harness Pairs: Strengths, Weaknesses, Improvement Areas, and Startup Checklist
A practical framework for testing a specific model inside a specific coding harness, identifying where the pair fails, and deciding what to improve first.
Agentic Coding Harnesses: 2025–2026 Evolution and Practical Improvement Playbook
How coding agents evolved from prompt wrappers into validated software-engineering loops, and how to improve your own harness through a repeatable process.