The maths sat on a single page.
A team we had visited in late 2023 was reading its current bill for a managed vector store next to a quote for the egress fee that would be charged if it left. The second number was larger than three months of the first. The decision the team had made in an afternoon, twenty months earlier, was now wedged into the architecture in a way that had a price tag.
The decision itself had been an unremarkable one. The team had been asked to ship a retrieval feature inside a quarter, the founder had read about vector search in the same week the project began, and the demo of the most-marketed managed store had taken twenty minutes to stand up against a sample dataset. The principal engineer noted in passing that the Postgres they already ran could probably do the same job with an extension. The room moved on. The contract was signed before the decision had been written down anywhere a senior reader might find it.
That is how most of these decisions are made. Not with the wrong reasoning. With no reasoning that survived past the meeting itself.
The store, on its own merits, was not a bad store. It was a managed product, sold for a workload shape the team's was a poor match for. Their corpus was under three million vectors. Their filter set was wide and combinatorial. The pricing, clear at the time of signing, started to bend awkwardly as soon as any single query touched a metadata predicate the index had not been built for. By month eighteen, the team had three things they had not budgeted for: a query rewriter sat in front of the store to flatten its filters, an offline pipeline that re-indexed a tenant's slice of the corpus when its predicates shifted, and a small ops task, billed monthly, for the reindexing window itself. The store was twenty per cent of the bill. The choreography around the store accounted for the rest of it, and for most of the maintenance burden besides.
We have read this same shape in several rooms now. The studio's reading is partial; we are usually called in once the bill has begun to misbehave, not at the moment the contract was signed. But the pattern is consistent enough that we will say it plainly. A vector store is an architectural decision, not a procurement one, and the cost of treating it as the second usually surfaces around the second renewal, on a bill the original decision-maker is no longer in the room to defend.
What the team eventually did is the boring part of the story. They stood up a Postgres replica with the appropriate extension, copied the embeddings out across a weekend, and rewrote four of their filter predicates to use the columns they had always wanted to filter on. The new system was slower in two of the seven query categories and faster in the other five. It cost roughly a tenth of the previous arrangement. The egress bill, when it arrived, was the smallest part of the migration; the part that took two engineers nine weeks was the schema work, because two years of incidental decisions had been bent around the old store's filter shape, and undoing them required reading every call site.
The schema work is the cost that does not show up on any pricing page. It is the one we now ask about in every first conversation where the words vector and production sit in the same sentence. It is also the cost the team will most often have already paid, quietly, for a decision the practice would have flagged on the day, had it been called.
This is the use of an architecture review before the system has hardened around its smallest mistakes. Not to overturn the original decision. To put it on paper, with the reasoning, while there is still time to revisit it without nine weeks of schema work.
The studio does not, as a rule, hold a strong view on which vector store any given team should pick. We hold a strong view on which decisions are worth twenty minutes of attention and which are worth a written page, and on the engagements where engineering counsel earns the page. A decision in the first category is forgiven by the system. A decision in the second category is paid for, with interest, by the team that has forgotten which afternoon it was made.
The cleanest reading is the one made before the bill begins to misbehave, when the decision is still small enough to undo without nine weeks of schema work. The team we visited has not, so far, had to make the same choice twice.