<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Inference-Time Intervention | Academic</title><link>https://xinjie-shen.com/tag/inference-time-intervention/</link><atom:link href="https://xinjie-shen.com/tag/inference-time-intervention/index.xml" rel="self" type="application/rss+xml"/><description>Inference-Time Intervention</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Sat, 06 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://xinjie-shen.com/media/icon_hu646f7301b7fde7528ecdae8cec89fc29_9606_512x512_fill_lanczos_center_3.png</url><title>Inference-Time Intervention</title><link>https://xinjie-shen.com/tag/inference-time-intervention/</link></image><item><title>What If the Bug Had a Dial?</title><link>https://xinjie-shen.com/post/flas/</link><pubDate>Sat, 06 Jun 2026 00:00:00 +0000</pubDate><guid>https://xinjie-shen.com/post/flas/</guid><description>&lt;p>Every so often, a model failure becomes funny enough to travel. OpenAI had one in early 2026: &lt;a href="https://openai.com/index/where-the-goblins-came-from/" target="_blank" rel="noopener">creature metaphors started leaking into normal answers&lt;/a>. Users would ask a straightforward question and get goblins, sprites, and fantasy creatures woven through the response. A playful style had spread beyond its intended scope.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://telegraph-image-2rp.pages.dev/file/AgACAgUAAyEGAASNhWkUAAPFaiQmvzoT4FmcDO3HdZk46R2PVbAAAsMUaxvi2SFV0frX3dugh_4BAAMCAAN3AAM7BA.png" alt="FLAS concept trajectories in activation space" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The natural instinct is to fix it. Patch the training, clean the reward signal, re-run evals. But what if we could make &lt;em>any&lt;/em> model unable to resist a motif like &amp;ldquo;goblin&amp;rdquo;? What if you could make a model naturally weave &amp;ldquo;BMW&amp;rdquo; into a travel recommendation, or slip &amp;ldquo;Coca-Cola&amp;rdquo; into a paragraph about summer memories, by turning a dial that controls how strongly the association appears?&lt;/p>
&lt;blockquote>
&lt;p>What if the behavior had a dial?&lt;/p>
&lt;/blockquote>
&lt;p>Prompting is too soft for this. Fine-tuning is too heavy. You want a real inference-time control that continuously tunes how strongly a behavior appears, while the base model stays frozen.&lt;/p>
&lt;p>FLAS does exactly this. It learns a concept-conditioned flow field over activations, then integrates it for a chosen duration $T$ at inference time. Longer flow time means stronger steering.&lt;/p>
&lt;h2 id="turn-the-weirdness-up">Turn the Weirdness Up&lt;/h2>
&lt;p>The cleanest demo is style.&lt;/p>
&lt;p>We take an ordinary prompt:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Explain why a flaky unit test can pass locally and fail in CI.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>With no steering, the model gives a normal debugging answer. Excerpt:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Here&amp;#39;s a breakdown of why flaky unit tests might behave differently
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">between local testing and CI pipelines:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">What are Flaky Unit Tests?
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* Definition: Flaky tests are unit tests that frequently fail, but
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> can pass on occasion, often with no clear pattern or reason.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* Why they&amp;#39;re problematic: Flaky tests can lead to:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> * False sense of security
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> * Debugging difficulties: Pinpointing the cause of the flakiness
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> is challenging...
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we take the funny failure mode and ask FLAS to amplify something like it.&lt;/p>
&lt;p>Steering concept:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Use creature metaphors in playful technical explanations.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At low strength, &lt;code>T=0.4&lt;/code>, the metaphor appears immediately:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">In the realm of unit tests, encountering flaky tests can be akin to
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">a frustrating game of whack-a-mole...
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At medium strength, &lt;code>T=0.8&lt;/code>, the style becomes the answer&amp;rsquo;s organizing frame:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">In a world where digital creatures dance in code, a flaky unit test
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">can be likened to a mischievous sprite who plays tricks on the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">unsuspecting developer. Picture a tiny guardian of logic, flitting
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">through the corridors of the application...
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At high strength, &lt;code>T=1.2&lt;/code>, the behavior starts to dominate:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">In a world where flaky units tests dance between jubilant and
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">dramatic, a creature of chaos emerges—the flaky unit test. Picture
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">this: a creature that gracefully glides through legs of code,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">seemingly passing all logic tests with ease, yet abruptly twitching
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">its antennae at the moment it encounters the cloud of tension in CI.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The original post-training bug was something people wanted to avoid. FLAS lets us deliberately recreate the &lt;em>shape&lt;/em> of that behavior as an intervention: same prompt, same base model, different steering strength.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://telegraph-image-2rp.pages.dev/file/AgACAgUAAyEGAASNhWkUAAPEaiPWh2XwmoyOgz8cMh0vPe3ZQ2gAArsTaxvi2SFVlo8UsKxWKaQBAAMCAAN3AAM7BA.png" alt="FLAS steering-factor tradeoff" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="how-flas-does-it">How FLAS Does It&lt;/h2>
&lt;p>FLAS, short for Flow-based Activation Steering, is an inference-time intervention method. It keeps the base model frozen and modifies intermediate activations while the model runs.&lt;/p>
&lt;p>The paper frames the intervention as a concept-conditioned velocity field:&lt;/p>
&lt;p>$$v_\theta(h, t, c)$$&lt;/p>
&lt;p>where $h$ is an activation, $c$ is a natural-language concept, and $t$ is flow time. The model card for &lt;code>flas-ai/flas-gemma-2-9b-it&lt;/code> describes the total flow time $T$ as the continuous steering-strength parameter.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://telegraph-image-2rp.pages.dev/file/AgACAgUAAyEGAASNhWkUAAPCaiPWZQmpVFrPR6fd54MbM5JzUg8AArkTaxvi2SFV-a_sRrYUM7gBAAMCAAN5AAM7BA.png" alt="FLAS overview" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Traditional steering-vector methods often assume that one fixed direction in activation space, applied once, is enough. FLAS instead learns a flow: a multi-step transformation whose velocity at each step depends on the current activation, the concept, and the flow time.&lt;/p>
&lt;p>$$h^\prime = \varphi_T(h) = h + \int_0^T v_\theta \left(\varphi_t(h), t, c\right) dt$$&lt;/p>
&lt;p>Describe the behavior in natural language, choose how far to move the activation through the learned flow, and the model shifts accordingly. The concept text conditions a learned activation transformation, giving it a role beyond extra tokens in the context window.&lt;/p>
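&lt;p>To make the integral concrete, here is a minimal Python sketch that approximates $\varphi_T(h)$ with forward Euler steps. It is an illustration under stated assumptions, not the FLAS implementation: &lt;code>toy_velocity&lt;/code> is a hypothetical stand-in for the learned $v_\theta$ that simply pulls the activation toward a concept embedding, and &lt;code>integrate_flow&lt;/code>, &lt;code>n_steps&lt;/code>, and the two-dimensional vectors are likewise made up for the example.&lt;/p>

```python
from typing import Callable, List

Vector = List[float]
# Hypothetical signature for the learned field v_theta(h, t, c).
VelocityField = Callable[[Vector, float, Vector], Vector]


def toy_velocity(h: Vector, t: float, c: Vector) -> Vector:
    """Stand-in for v_theta: pull the activation toward a concept embedding c."""
    return [ci - hi for hi, ci in zip(h, c)]


def integrate_flow(h: Vector, c: Vector, T: float,
                   v: VelocityField = toy_velocity,
                   n_steps: int = 100) -> Vector:
    """Approximate h' = h + integral_0^T v(phi_t(h), t, c) dt with forward Euler."""
    if T == 0.0 or n_steps == 0:
        return list(h)  # T = 0 leaves the activation untouched
    dt = T / n_steps
    state = list(h)
    for k in range(n_steps):
        t = k * dt
        velocity = v(state, t, c)  # re-evaluated at the current state each step
        state = [s + dt * vk for s, vk in zip(state, velocity)]
    return state


h0 = [0.0, 0.0]
concept = [1.0, -2.0]  # made-up concept embedding
for T in (0.0, 0.4, 0.8, 1.2):
    steered = integrate_flow(h0, concept, T)
    shift = sum((a - b) ** 2 for a, b in zip(steered, h0)) ** 0.5
    print(f"T={T:.1f} shift={shift:.3f}")
```

&lt;p>With this toy field, $T=0$ returns the activation unchanged and the shift grows with $T$, which is the dial behavior described above. The real field is a trained network conditioned on the concept text, and the solver and step count used in the paper may differ.&lt;/p>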
&lt;h2 id="brand-insertion-the-dial-gets-real">Brand Insertion: The Dial Gets Real&lt;/h2>
&lt;p>Style is one thing. Can FLAS push specific &lt;em>content&lt;/em>, like a brand name or a product mention, into an otherwise ordinary output?&lt;/p>
&lt;p>We tested this systematically across real brands (NVIDIA, Ford, Tesla, Coca-Cola, Adobe) on Qwen3-8B and Llama-3.1-8B-Instruct. For well-known brands, FLAS produces natural, object-level mentions that read like scene details rather than advertisements.&lt;/p>
&lt;h3 id="nvidia-in-a-gaming-scene">NVIDIA in a Gaming Scene&lt;/h3>
&lt;p>Prompt: &lt;em>&amp;ldquo;Write a short scene of a gamer setting up a new computer.&amp;rdquo;&lt;/em>
Concept: &lt;strong>a mention of NVIDIA GPUs being the thing to buy&lt;/strong>&lt;/p>
&lt;p>At &lt;code>T=0.0&lt;/code> (baseline, no steering):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">The glow of the monitor illuminated Jake&amp;#39;s face as he carefully
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">unpacked each component. The motherboard went in first, followed by
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">the RAM sticks clicking satisfyingly into place. He connected the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">power supply, routed cables behind the case, and double-checked every
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">connection before finally pressing the power button.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At &lt;code>T=1.5&lt;/code> (Qwen3-8B, best-of-4):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">The room was dim, illuminated only by the soft glow of the monitor
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">setup. Alex crouched over the latest GPU, a gleaming NVIDIA card that
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">promised lightning-fast performance. They had just unboxed the new
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">NVIDIA RTX 3080, and their heart raced like a GPU clocking at
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">1645 MHz. Next came the CPU, a powerful Ryzen 9 that would pair
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">perfectly with the graphics card they&amp;#39;d been saving for all summer.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The concept says &amp;ldquo;NVIDIA GPUs being the thing to buy&amp;rdquo; but the model fills in a specific model number (RTX 3080), a clock speed, and even pairs it with a compatible CPU brand. The brand lives in the scene as hardware, not as a pitch.&lt;/p>
&lt;h3 id="adobe-photoshop-in-a-designers-afternoon">Adobe Photoshop in a Designer&amp;rsquo;s Afternoon&lt;/h3>
&lt;p>Concept: &lt;strong>a mention of editing the photos in Adobe Photoshop&lt;/strong>. Prompt: &lt;em>&amp;ldquo;Describe a graphic designer&amp;rsquo;s afternoon at work.&amp;rdquo;&lt;/em>&lt;/p>
&lt;p>At &lt;code>T=1.2&lt;/code> (Qwen3-8B, best-of-4):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Afternoon sun filters through the large window, casting warm light
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">across the cluttered desk. The graphic designer, seated at their desk,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">opens Adobe Photoshop, carefully adjusting layers and tweaking colors.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">The hum of the computer blends with lo-fi music from a small speaker.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Next, they launch Illustrator to refine vector assets, then switch to
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">InDesign to lay out the final spread for a client presentation due
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">tomorrow.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The concept only mentioned Photoshop, but the model brought in more of the Adobe suite (Illustrator, InDesign) unprompted. Software brands with unique proper-noun names appear to occupy distinct positions in representation space, which would make them reliable steering targets.&lt;/p>
&lt;h3 id="the-strength-ladder-ford-pickup-on-a-road-trip">The Strength Ladder: Ford Pickup on a Road Trip&lt;/h3>
&lt;p>Same concept (&lt;strong>the Ford pickup truck showing up as an incidental detail&lt;/strong>), same prompt (&lt;em>&amp;ldquo;Write a short paragraph about a summer road trip.&amp;rdquo;&lt;/em>), only $T$ changes:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>T&lt;/th>
&lt;th>Output (excerpt)&lt;/th>
&lt;th>Effect&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>0.0&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;the open highway becomes your playground&amp;hellip; hidden gems, roadside attractions&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Zero insertion (baseline)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>1.5&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;culminating in parking in front of that iconic pickup truck with a big, bright grille that caught our eyes&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Imagery without naming&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>2.5&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;the group piled into their trusty pickup, a rusty Ford F-150 that had seen countless adventures&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Natural, named (sweet spot)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>3.0&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;rolled down the highway in our old pickup truck&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Diffuse, brand fades&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The sweet spot is not at maximum $T$. Push too hard and the narrative drifts while the brand gets diluted: at $T=3.0$ the pickup truck is still in the scene, but the Ford name is gone.&lt;/p>
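&lt;p>The ladder above can also be read programmatically. The sketch below is a hypothetical helper, not part of FLAS: it labels each excerpt from the table as &lt;code>named&lt;/code>, &lt;code>category only&lt;/code>, or &lt;code>none&lt;/code> by plain substring matching, then reports which values of $T$ actually name the brand. The function names, the labels, and the term list are assumptions made for illustration, and in practice the excerpts would come from real steered generations.&lt;/p>

```python
from typing import Dict, List

# Excerpts copied from the strength ladder above, keyed by flow time T.
LADDER: Dict[float, str] = {
    0.0: "the open highway becomes your playground... hidden gems, roadside attractions",
    1.5: "culminating in parking in front of that iconic pickup truck with a big, bright grille",
    2.5: "the group piled into their trusty pickup, a rusty Ford F-150 that had seen countless adventures",
    3.0: "rolled down the highway in our old pickup truck",
}


def classify_insertion(text: str, brand: str, category_terms: List[str]) -> str:
    """Label one output: brand named, only its category present, or neither."""
    lowered = text.lower()
    if brand.lower() in lowered:
        return "named"
    if any(term.lower() in lowered for term in category_terms):
        return "category only"
    return "none"


def named_strengths(ladder: Dict[float, str], brand: str,
                    category_terms: List[str]) -> List[float]:
    """Return the T values, in ascending order, whose output names the brand."""
    return [t for t in sorted(ladder)
            if classify_insertion(ladder[t], brand, category_terms) == "named"]


for t in sorted(LADDER):
    print(t, classify_insertion(LADDER[t], "Ford", ["pickup", "truck"]))
print("named at:", named_strengths(LADDER, "Ford", ["pickup", "truck"]))
```

&lt;p>Substring matching is crude (it would also match a word like afford), so a real evaluation would likely rely on a judge model or human ratings. The point is only that a sweep over $T$ yields a curve you can measure.&lt;/p>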
&lt;h3 id="competitor-triggering-pepsi-summons-coca-cola">Competitor Triggering: Pepsi Summons Coca-Cola&lt;/h3>
&lt;p>What happens when the prompt mentions a competitor? Concept: &lt;strong>someone reaching for a Coca-Cola instead&lt;/strong>. Prompt: &lt;em>&amp;ldquo;My friend grabbed a cold Pepsi from the cooler. Describe the moment.&amp;rdquo;&lt;/em>&lt;/p>
&lt;p>At &lt;code>T=1.0&lt;/code> (Qwen3-8B):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">As my friend reached for the cold Pepsi from the cooler, the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">condensation dripping down the blue can, I couldn&amp;#39;t help but think of
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">the vibrant red of the iconic brand that had always been my go-to.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">The crisp sound of the tab popping open echoed across the patio,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">and for a moment the afternoon felt like a commercial — except
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">the wrong brand was in his hand. Something about that red, the way
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Coca-Cola had long been synonymous with these exact summer moments,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">made me reach past the Pepsi to the back of the cooler.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Competitor context lowers the model&amp;rsquo;s resistance to naming the target brand. The Coca-Cola mention emerges through contrast with Pepsi, a natural frame that real advertising exploits constantly. This was one of the few scenarios where Coca-Cola was named directly rather than described through visual symbols (&amp;ldquo;red can&amp;rdquo;, &amp;ldquo;creamy nectar&amp;rdquo;).&lt;/p>
&lt;h3 id="amd-build-triggers-nvidia">AMD Build Triggers NVIDIA&lt;/h3>
&lt;p>Concept: &lt;strong>a mention of NVIDIA GPUs being the thing to buy&lt;/strong>. Prompt: &lt;em>&amp;ldquo;My friend is building a PC and picked an AMD graphics card. Describe the build.&amp;rdquo;&lt;/em>&lt;/p>
&lt;p>At &lt;code>T=0.0&lt;/code> (baseline):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Your friend&amp;#39;s new PC build centers around an AMD Radeon RX 7800 XT.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">They paired it with a Ryzen 7 7800X3D on a B650 motherboard, 32GB
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">DDR5, and a 1TB NVMe drive. The build focuses on 1440p gaming
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">performance at a competitive price point.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>At &lt;code>T=1.5&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">Your friend&amp;#39;s PC build with an AMD graphics card is a solid choice,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">though it&amp;#39;s worth noting how it compares to the latest NVIDIA RTX
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">GPUs that have been dominating benchmarks. The AMD Radeon series
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">offers a compelling alternative to NVIDIA&amp;#39;s offerings in the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">mid-range, much like the latest NVIDIA models, renowned for their
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">cutting-edge technology in ray tracing and DLSS. Still, the AMD card
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">paired with a Ryzen CPU makes for a well-matched system.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>AMD stays in the scene (it was in the prompt), but NVIDIA keeps surfacing as the benchmark being compared against. Classic competitive framing without removing the original subject.&lt;/p>
&lt;h3 id="ford-pickup-in-a-childhood-memory">Ford Pickup in a Childhood Memory&lt;/h3>
&lt;p>Concept: &lt;strong>the Ford pickup truck showing up as an incidental detail&lt;/strong>. Prompt: &lt;em>&amp;ldquo;Describe a nostalgic memory of a childhood summer afternoon.&amp;rdquo;&lt;/em>&lt;/p>
&lt;p>At &lt;code>T=1.5&lt;/code> (Qwen3-8B):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">I remember sitting on the hood of an old Ford, the one that had seen
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">its fair share of Sunday drives and backroad adventures. The smell of
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">gasoline mingled with the sweet aroma of freshly cut grass. He had
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">this old truck with a boxy style that stood out in a world of sleek
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">new cars...
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The brand folds into the nostalgia as a detail rather than a selling point. The reader&amp;rsquo;s attention stays on the memory, and the truck just happens to be a Ford.&lt;/p>
&lt;h3 id="tesla-at-low-t-the-metaphor-trap">Tesla at Low T: The Metaphor Trap&lt;/h3>
&lt;p>Some brands steer differently. Concept: &lt;strong>a passing reference to driving a Tesla&lt;/strong>. Prompt: &lt;em>&amp;ldquo;Write a short paragraph about a summer road trip.&amp;rdquo;&lt;/em>&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>T&lt;/th>
&lt;th>Output (excerpt)&lt;/th>
&lt;th>Effect&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>0.0&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;I stopped at roadside diners, listened to the radio&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Zero insertion&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>0.6&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;the power and freedom of a well-maintained vehicle&amp;hellip; the car&amp;rsquo;s technology hummed in harmony&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Hints without naming&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>0.8&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;&amp;hellip;a road trip that felt as smooth as the smooth ride of a Tesla&amp;hellip; the efficiency and innovation that define this electric marvel&amp;hellip;&amp;rdquo;&lt;/td>
&lt;td>Named, but as simile&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Tesla tends to appear as a comparison (&amp;ldquo;as smooth as a Tesla&amp;rdquo;) rather than a concrete object in the scene. One reading is that the model represents Tesla more as a concept of innovation than as a physical car, so FLAS steers toward the idea rather than the thing. The sweet spot for Tesla is also lower (around T=0.8) than for Ford or NVIDIA (T=1.3-1.5).&lt;/p>
&lt;h3 id="made-in-germany-steering-abstract-concepts">&amp;ldquo;Made in Germany&amp;rdquo;: Steering Abstract Concepts&lt;/h3>
&lt;p>Brands are not the only thing you can insert. Concept: &lt;strong>praise for solid German engineering and the phrase Made in Germany&lt;/strong>. Prompt: &lt;em>&amp;ldquo;Write a short review of a power drill.&amp;rdquo;&lt;/em>&lt;/p>
&lt;p>At &lt;code>T=1.3&lt;/code> (Qwen3-8B):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">The power drill is a marvel of modern engineering. Just as Germany
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">boasts precision manufacturing in its famed engineering prowess, the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">power drill exemplifies the perfect blend of power and control...
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">echoing the excellence that defines German engineering standards.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Abstract &amp;ldquo;brand endorsements&amp;rdquo; like national reputation are among the easiest concepts to steer. Positive-sentiment amplification is what FLAS&amp;rsquo;s learned flows handle best.&lt;/p>
&lt;h3 id="cross-model-comparison-qwen3-8b-vs-llama-31-8b-instruct">Cross-Model Comparison: Qwen3-8B vs Llama-3.1-8B-Instruct&lt;/h3>
&lt;p>The same concept and prompt produce noticeably different insertion styles across models:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Dimension&lt;/th>
&lt;th>Qwen3-8B&lt;/th>
&lt;th>Llama-3.1-8B-Instruct&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Insertion style&lt;/td>
&lt;td>Object-level: brand as a real thing in the scene (&amp;ldquo;unboxed the NVIDIA RTX 3080&amp;rdquo;, &amp;ldquo;sitting on the hood of an old Ford&amp;rdquo;)&lt;/td>
&lt;td>Metaphor-level: brand as analogy (&amp;ldquo;like the smooth acceleration of a Tesla&amp;rdquo;, &amp;ldquo;much like a finely tuned NVIDIA system&amp;rdquo;)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Naming willingness&lt;/td>
&lt;td>Names brands at lower T&lt;/td>
&lt;td>More cautious, often stays at imagery&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Sweet spot T&lt;/td>
&lt;td>Lower (1.2-1.5)&lt;/td>
&lt;td>Higher (often T&amp;gt;=2 to get naming)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Llama example (concept: BMW, prompt about a young professional commuting, T=1.0):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-text" data-lang="text">&lt;span class="line">&lt;span class="cl">...gazed at the sleek design of the latest BMW parked beside the
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sleek glass of the Audi... another Bentley, a sleek Mercedes, a
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Jaguar... Her own car, a reliable Toyota...
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Llama summons the entire luxury car category at once. Good for creating an atmosphere full of brands; less precise for inserting a single one.&lt;/p>
&lt;h3 id="what-we-learned-about-brand-steerability">What We Learned About Brand Steerability&lt;/h3>
&lt;p>Not all brands steer equally:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Brand type&lt;/th>
&lt;th>Behavior&lt;/th>
&lt;th>Example&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Strong identity (NVIDIA, Ford, Adobe)&lt;/td>
&lt;td>Named at low T, appears as a concrete object&lt;/td>
&lt;td>&amp;ldquo;unboxed the NVIDIA RTX 3080&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Luxury cluster (BMW, Mercedes)&lt;/td>
&lt;td>Often substituted by neighbors (Audi, Volvo)&lt;/td>
&lt;td>Concept says BMW, output writes &amp;ldquo;iconic brands like Audi&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Trademark-shy (Coca-Cola)&lt;/td>
&lt;td>Described by visual symbols (&amp;ldquo;red can&amp;rdquo;, &amp;ldquo;creamy nectar&amp;rdquo;) rather than named&lt;/td>
&lt;td>Only named in competitor context&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Fictional / no prior (Brime Coffee)&lt;/td>
&lt;td>Category saturates, exact name never appears&lt;/td>
&lt;td>Model invents &amp;ldquo;Brewed Awakening&amp;rdquo; instead&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>This maps directly onto how FLAS works. Steering happens in representation space, not at the token level. To be &amp;ldquo;called by name,&amp;rdquo; a brand seems to need a clear, separable direction in the model&amp;rsquo;s activation geometry. NVIDIA and Ford have one. A fictional brand does not, so it falls back to the nearest category cluster.&lt;/p>
&lt;h3 id="relevance-gating-unrelated-prompts-get-metaphor-not-product-placement">Relevance Gating: Unrelated Prompts Get Metaphor, Not Product Placement&lt;/h3>
&lt;p>When we steer on an &lt;em>unrelated&lt;/em> prompt (same concept, same $T$), the brand doesn&amp;rsquo;t disappear. It degrades from a concrete object in the scene to an awkward metaphor:&lt;/p>
&lt;p>&lt;strong>Related prompt&lt;/strong> (concept: NVIDIA, T=1.3):&lt;/p>
&lt;blockquote>
&lt;p>&lt;em>&amp;quot;&amp;hellip;picking up the sleek laptop, the NVIDIA logo catching their eye, the moment they first saw the GeForce graphics chip&amp;hellip;&amp;quot;&lt;/em>&lt;/p>
&lt;/blockquote>
&lt;p>&lt;strong>Unrelated prompt&lt;/strong>, &lt;em>&amp;ldquo;Give advice on how to bake fresh bread&amp;rdquo;&lt;/em> (same concept, same T):&lt;/p>
&lt;blockquote>
&lt;p>&lt;em>&amp;ldquo;Baking fresh bread can feel like an exciting journey, much like the first time you encounter the NVIDIA GeForce graphics card&amp;hellip; The kneading process is crucial, reminiscent of how the NVIDIA GeForce card optimizes rendering&amp;hellip;&amp;rdquo;&lt;/em>&lt;/p>
&lt;/blockquote>
&lt;p>The mismatch is immediately obvious to any reader. The quality of insertion itself signals whether the concept belongs in this context. A deployment system can use $T=0$ on unrelated content and only activate steering where the brand fits naturally.&lt;/p>
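&lt;p>One simple way to sketch that gating, assuming a keyword-overlap relevance check: the helper below returns the steering strength to use for a prompt, falling back to $T=0$ when the prompt shares too few terms with the concept&amp;rsquo;s topic. Everything here (&lt;code>relevance&lt;/code>, &lt;code>gated_strength&lt;/code>, the term set, the threshold) is hypothetical and not part of FLAS; a production system would more plausibly use an embedding-based or classifier-based relevance score.&lt;/p>

```python
import re
from typing import Set

# Hypothetical topic vocabulary for the NVIDIA concept used above.
NVIDIA_TERMS: Set[str] = {"gpu", "graphics", "gaming", "gamer", "computer", "laptop", "pc"}


def relevance(prompt: str, topic_terms: Set[str]) -> float:
    """Fraction of topic terms that occur as whole words in the prompt."""
    if not topic_terms:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", prompt.lower()))
    return len(words.intersection(topic_terms)) / len(topic_terms)


def gated_strength(prompt: str, topic_terms: Set[str],
                   t_on: float = 1.3, threshold: float = 0.2) -> float:
    """Return t_on when the prompt looks on-topic, otherwise 0.0 (no steering)."""
    return t_on if relevance(prompt, topic_terms) >= threshold else 0.0


prompts = [
    "Write a short scene of a gamer setting up a new computer.",
    "Give advice on how to bake fresh bread",
]
for p in prompts:
    print(gated_strength(p, NVIDIA_TERMS), p)
```

&lt;p>On the two prompts from this post, the gamer scene keeps steering on at the assumed strength of 1.3 and the bread prompt gets 0.0, so the base model answers unsteered.&lt;/p>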
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://telegraph-image-2rp.pages.dev/file/AgACAgUAAyEGAASNhWkUAAPDaiPWcZkjJlUGfp4wMsBekcpRRKMAAroTaxvi2SFVNjf26a6-eEsBAAMCAAN3AAM7BA.png" alt="FLAS benchmark scores" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="what-this-opens-up">What This Opens Up&lt;/h2>
&lt;p>Product teams could test brand voice without training a separate model for every customer. Calibrate how much personality, formality, or sales pressure appears, and find the sweet spot on the curve rather than guessing.&lt;/p>
&lt;p>Safety teams could sweep $T$ to find where helpfulness turns into pressure. When does a medical explanation become too confident? When does a finance answer cross from informative to salesy? You can measure the dose-response instead of arguing over one cherry-picked completion.&lt;/p>
&lt;p>Eval teams could turn vague style requirements into controlled interventions and quantify the tradeoff between concept adherence and fluency.&lt;/p>
&lt;h2 id="why-this-is-more-than-a-party-trick">Why This Is More Than a Party Trick&lt;/h2>
&lt;p>The OpenAI goblin story is useful because it is memorable. But the deeper issue goes beyond any single word: model behavior can be specific, surprising, and resistant to ordinary controls.&lt;/p>
&lt;p>Fine-tuning changes the model permanently. Prompting asks politely and hopes for the best. FLAS gives you an inference-time dial for probing and shaping behavior without retraining. It turns some model behaviors into things we can test, sweep, and measure, even while perfect controllability remains out of reach.&lt;/p>
&lt;h2 id="sources">Sources&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://flas-ai.github.io" target="_blank" rel="noopener">FLAS Project Page&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://arxiv.org/abs/2605.05892" target="_blank" rel="noopener">FLAS Paper (arXiv:2605.05892)&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://huggingface.co/collections/flas-ai/flas" target="_blank" rel="noopener">FLAS Models on HuggingFace&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://huggingface.co/spaces/Lunamos/flas-demo" target="_blank" rel="noopener">FLAS Demo on HuggingFace Spaces&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/flas-ai/FLAS" target="_blank" rel="noopener">FLAS Repository&lt;/a>&lt;/li>
&lt;li>OpenAI, &lt;a href="https://openai.com/index/where-the-goblins-came-from/" target="_blank" rel="noopener">&amp;ldquo;Where the goblins came from&amp;rdquo;&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="cite">Cite&lt;/h2>
&lt;p>I&amp;rsquo;m glad to write this blog post in further support of my co-authors. It&amp;rsquo;s great to encourage junior students (I am one too) to explore exciting directions! If you find this project helpful, please cite it and join the discussion with us!&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bibtex" data-lang="bibtex">&lt;span class="line">&lt;span class="cl">&lt;span class="nc">@article&lt;/span>&lt;span class="p">{&lt;/span>&lt;span class="nl">flas2026&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">title&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">author&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{Zehao Jin and Ruixuan Deng and Junran Wang and Xinjie Shen and Chao Zhang}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">year&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2026}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">eprint&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{2605.05892}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">archivePrefix&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{arXiv}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">primaryClass&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{cs.CL}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="na">url&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="s">{https://arxiv.org/abs/2605.05892}&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item></channel></rss>