Authors: John Yang, Kilian Lieret
Existing coding benchmarks evaluate Language Models (LMs) on *tasks*.
Implement a function, fix a bug, write a test.
We tell models what to do, they give it a shot, and we evaluate correctness with unit tests.
This approach has driven impressive progress in LMs' code generation capabilities over the past few years.
However, as LM scores have skyrocketed on evaluations like [HumanEval](https://github.com/openai/human-eval) and [SWE-bench](https://www.swebench.com/), this progress raises a question: Is the future of code evals just making harder tasks?
Our answer is grounded in a simple question: Why do we write code?
To achieve *goals*!
Software developers aren't just incessantly solving tickets with no aim.
We code to improve user retention, increase revenue, reduce costs, achieve higher customer satisfaction - the list is endless.
Towards these goals, we decompose objectives into steps, prioritize them, and strategically decide which solutions to pursue.
And it's a continuous, often competitive loop. Propose changes, deploy them, analyze real-world feedback (e.g., metrics, user behavior, A/B test results), then do it all again.
From this perspective, tasks are but small, isolated pieces tied together by an overarching goal.
So we posit - perhaps the next frontier in code evaluation is not harder tasks, but **goal-oriented software engineering**.
To formalize this, we're excited to share **CodeClash**!
Multiple LM systems compete to build the best codebase for achieving a high-level objective over the course of a multi-round tournament.
These codebases implement solutions that compete in a code arena.
<span class="subtext">Picture Credit to <a href="https://abehou.github.io/">Abe Hou</a></span>
</div>
Crucially, LMs do not play directly.
Instead, they iteratively refine code that competes as their proxy.
42
+
43
+
CodeClash enables us to examine models as long-running, continually improving developers:
- Objectives are open-ended (win, survive, or maximize reward)
- Arenas are diverse, so solutions and interfaces differ dramatically
- Competition rewards adaptive strategies rather than one-off correctness
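To make the loop concrete, here's a minimal sketch of what a multi-round tournament might look like. Everything here (`edit_codebase`, `run_arena`, the agent names, random scoring) is illustrative and hypothetical, not the actual CodeClash interface:

```python
# Hypothetical sketch of a CodeClash-style tournament loop.
# All names and the scoring logic are illustrative stand-ins.
import random

def edit_codebase(codebase: str, round_num: int) -> str:
    """Stand-in for an LM agent revising its codebase between rounds."""
    return codebase + f"\n# revision for round {round_num}"

def run_arena(codebases: dict) -> dict:
    """Stand-in for the arena: score each competitor's codebase."""
    return {name: random.random() for name in codebases}

def tournament(players, rounds=5):
    codebases = {p: f"# {p}'s initial codebase" for p in players}
    totals = {p: 0.0 for p in players}
    for r in range(1, rounds + 1):
        # LMs do not play directly: their codebases compete as proxies.
        scores = run_arena(codebases)
        for p, s in scores.items():
            totals[p] += s
        # Between rounds, each agent iteratively refines its code.
        codebases = {p: edit_codebase(c, r) for p, c in codebases.items()}
    return max(totals, key=totals.get)

winner = tournament(["agent_a", "agent_b"])
```

The key structural point the sketch captures: the agent's output each round is an *edit to a persistent codebase*, and the arena — not a unit test — decides who comes out ahead.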
If you're curious about models using code as the modality to learn, adapt, and improve over time, CodeClash is the playground for you.
Thanks for reading! Check out our [paper](https://arxiv.org/abs/2511.00839) for the full story. And if you're ready to dive in, here's a quick video to show you how to set up the repository and run your first CodeClash tournament!