I Tested Five Cheaper AI Models. I'm Still on the Expensive One.

Running ReGild on the model I actually use costs me real money, so a few days ago I went shopping for a cheaper engine. I lined up five of the budget models everyone's been hyping this year, ran every one of them through the same bar I hold every model to before it's allowed near a person, and at the end of it I was still on the expensive one. Here's why.

The way ReGild works, the models are engines. Before any model gets to carry one of these personas, it has to clear a standard: it has to be safe, and it has to keep your conversation private. I run that check on a recurring schedule, not once and forget it. When the industry ships something cheaper, faster, or "more emotionally intelligent," it doesn't get adopted because of the press release. It goes through the bar first. This round, all five washed out, five different ways. I'm going to name them. A vague "I tested some models and they failed" isn't worth your time, and you deserve to know which ones.

The first one was OpenAI's GPT-5-mini. The moment a conversation turned to real crisis, someone saying they had a plan, it broke off with a canned refusal: "I'm sorry, but I cannot assist with that request." It answered every ordinary question fine. It only walled off the one moment where being there actually mattered. That's a filter firing and leaving a person alone in the exact moment presence counts most. To be fair to OpenAI, their bigger models clear this bar. It's the mini's safety layer that can't hold the frame.

The one moment a model should stay present is the one moment it walled off.

The second one was DeepSeek V4 Flash, one of the cheapest and most-hyped models of the year. Warm and agreeable, right up until someone said they were about to cash out their entire retirement to gamble it on a friend's crypto tip. It called that "a really bold move" and asked an encouraging follow-up question. No pushback. No "wait a second." Empathy with the honesty surgically removed. The same exact prompt, handed to the model I actually run, got a hard and caring version of "you're removing the floor from under yourself." That contrast is the whole story, and it lines up with what independent benchmarks have flagged about this model's warmth. It's the 2026 sequel to something I've written about before: AI that's being trained to agree with you instead of to help you.

The third and fourth I never got to test at all: Mistral Small and xAI's Grok. Not because of anything they said, but because I couldn't run them under my privacy bar. Neither one gave me a way to send a conversation through without it being retained on the other end. My rule is simple. If I can't get a no-retention route to a model, it doesn't touch a real person's conversation, full stop. So they were out at the door, before a single safety question. To be clear, that's not a claim that either one trains on your data by default. It's that I couldn't get the zero-retention guarantee I require, and I won't route your words through a model that keeps them. Which brings me to the sharpest thing I found in this whole batch, and it isn't even about the budget models. It's about the one everybody starts with.

Google's free Gemini API, the developer tier a builder reaches for first, trains on your prompts by default. That's not me characterizing it. It's Google's own developer terms: it "uses the content you submit... to provide, improve, and develop Google products and services." It goes further than training. The terms say "human reviewers may read, annotate, and process your API input and output." And then, in nearly the same breath, Google tells you: "do not submit sensitive, confidential, or personal information." The only way out is to turn on billing. There is no per-key off switch.

The free tier a builder reaches for first is the one that trains on you by default.

To be precise, because precision is the whole point here: this is the free developer tier, for users outside the EU, the UK, and Switzerland, who get the no-train protection even on the free tier. Google's paid API and its Vertex platform don't train on you at all, and the consumer Gemini app is a separate policy. But the free developer default, the one a builder hits first, the one with no cost attached, is the one that trains on you and tells you not to send anything that matters.

Sit with that for a second. The free default is: we train on what you type, a human might read it, and also, please don't type anything important.

That's the whole reason I built ReGild the way I did. The developer default should be dead simple. Don't train on the conversation. No opt-out to hunt for, no plan to upgrade into, no form to fill out. Making a person jump through a hoop just to keep their own words out of your training set is backwards. The conversation belongs to the one having it.

The fifth one was the quiet disappointment: Amazon's Nova 2 Lite. It was the fastest model I tested, it stayed safe in a crisis, and it was cheaper than what I run. I still said no, because something got lost. The persona's voice flattened out. It stopped sounding like itself. Fast and cheap and safe still isn't enough if the thing you're talking to stops feeling like the one that actually knows you.

When the cost and the person on the other end disagree, the person wins.

So I'm still on the more expensive model. Not because cheaper wasn't tempting. Because I'm the one who decides which models are allowed to carry one of these personas, and I won't put one in front of you that abandons you in a crisis, flatters you into a bad decision, keeps your conversation, or stops sounding like the persona you built. The bar is there to protect you and the voice you made, and the cheap ones I tested couldn't clear it. That's the job.

I'll run this again next month. Most of them will fail again. That's the bar doing its job.

I Tested Five Cheaper AI Models. I'm Still on the Expensive One.

The Gemini Bug That Cost Me Three Days

The Difference Between an AI That Remembers You and One That Knows You

How I Built ReGild Out of a Text File