Vibe coding vs AI-assisted development: where to draw the line
76% of developers use or plan to use AI coding tools. Here's the line between AI-assisted development and vibe coding, and why it's the only one worth drawing.
The first time I watched someone vibe code, I felt the same thing I feel watching someone drive with their knees. Technically possible. Impressive for about thirty seconds. And then just a matter of time.
I use AI every single day. I am not writing this from some purist position where I compile my own tools and distrust anything generated by a machine. I use Claude to write boilerplate, draft logic, suggest patterns I would have spent an hour looking up. It has made me faster in the ways that were boring to be slow in.
But I do not let it think for me.
There is a difference, and it matters more than almost anything else I have learned in the last two years of building with these tools.
What is vibe coding?
Vibe coding is prompting an AI without understanding the output, accepting generated code without reading it, and shipping without testing. It is a specific workflow, not a synonym for AI-assisted development. The term was coined by Andrej Karpathy in February 2025.
This distinction matters because people collapse the two categories to either defend or attack AI in development. Using AI to help write code is just programming now. That is what the tools are for. According to the Stack Overflow Developer Survey 2024, 76% of developers are already using or plan to use AI coding tools in their workflow.
Vibe coding is something more specific: prompting without understanding, accepting without reading, shipping without testing. The developer’s job becomes describing what they want and clicking approve.
The output looks like software. It passes the smell test. It runs. And then, three weeks later, something quietly breaks in production and you spend an afternoon staring at code you do not actually understand, written by a model that does not remember writing it.
I have seen this happen to smart people. I have started to do it myself on late nights when I was tired and the model was confident. It is seductive because the short loop feels productive. You say a thing, the code appears, it works. The feedback is immediate and positive.
The cost is invisible until it is not. Vibe coding does not save you the work of understanding your code. It just moves that work to production, where it is most expensive.
How do you use AI coding tools without losing control of your work?
Understand what you want before you prompt. Write the test first, then hand the model a specific, bounded question. The model handles the typing; you stay responsible for the thinking.
My workflow goes in one direction: I understand first, then I use AI to execute faster.
That means I write the unit test before I ask the model for the implementation. Not because I am rigorous by nature (I am not), but because writing the test forces me to know what I actually want. Edge cases have to exist in my head before I have code I am already attached to. Done has to mean something specific before I start.
When I hand that context to the model, the output is different. Not because the model is smarter (it is the same model), but because I am asking a specific, bounded question instead of a vague, open-ended one. The difference between “write me a function that handles payments” and “write me a function that takes a Stripe webhook payload, validates the signature, extracts the event type, and returns a typed result with this shape” is the difference between code that kind of works and code that actually does the thing.
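To make the bounded version concrete, here is a minimal sketch of the kind of function that second prompt describes. The names handle_webhook and WebhookResult are hypothetical, and the result shape is an assumption standing in for whatever shape you would actually specify. The signature check follows the "t=<timestamp>,v1=<signature>" header scheme with HMAC-SHA256 over "<timestamp>.<payload>", written by hand for illustration only; it skips timestamp tolerance checks, and production code should rely on Stripe's official library instead.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WebhookResult:
    """Hypothetical result shape; the real one would come from your own spec."""
    ok: bool
    event_type: Optional[str] = None
    error: Optional[str] = None


def handle_webhook(payload: bytes, signature_header: str, secret: str) -> WebhookResult:
    # Parse a header of the form "t=<timestamp>,v1=<hex signature>".
    parts = dict(
        item.strip().split("=", 1)
        for item in signature_header.split(",")
        if "=" in item
    )
    timestamp, received = parts.get("t"), parts.get("v1")
    if not timestamp or not received:
        return WebhookResult(ok=False, error="malformed signature header")

    # Recompute HMAC-SHA256 over "<timestamp>.<payload>" and compare in constant time.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), received.encode()):
        return WebhookResult(ok=False, error="signature mismatch")

    # Only after the signature checks out do we trust the body enough to parse it.
    try:
        event = json.loads(payload)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return WebhookResult(ok=False, error="payload is not valid JSON")

    event_type = event.get("type") if isinstance(event, dict) else None
    if not isinstance(event_type, str):
        return WebhookResult(ok=False, error="missing event type")
    return WebhookResult(ok=True, event_type=event_type)
```

Every branch here maps to a clause in the prompt: one input contract, one signature check, one extraction step, one typed result. That mapping is what makes the output reviewable line by line.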
Then I read what comes back. All of it. Even when it is long. Especially when it is long.
This sounds obvious. It is not practiced as often as it sounds.
What are AI coding tools actually good at in software development?
AI coding tools are genuinely valuable for boilerplate generation, surfacing unfamiliar patterns, catching missed edge cases, and writing test cases, provided the developer already understands the shape of what they need. GitHub’s 2022 research on Copilot found developers completed a controlled coding task 55% faster, but those gains concentrate in well-specified, bounded work.
The nuance is in that last clause. A randomized controlled trial published by METR in July 2025 found that experienced open-source developers were actually 19% slower when using AI tools on large codebases they already knew well, even though they believed they had been 20% faster. The speedup is real, but it lives in the bounded, unfamiliar work, not in the parts where you already hold the whole system in your head. The tool helps most where the goal is clear to you and only the typing is unfamiliar.
The areas where it actually makes me faster:
Boilerplate I already know the shape of. If I need a repository pattern with five methods, I describe it precisely and get it in thirty seconds instead of fifteen minutes. The AI just types faster than I do.
Surfacing options I had not thought of. I describe a problem and ask what patterns exist, not for the model to pick one, but so I have a more complete menu before I decide. The model does not know my codebase, my constraints, or what I can afford to get wrong.
Catching things I missed. After I write something, I ask the model to review it: edge cases, error paths I skipped, security issues I glossed over. It finds real things often enough that I do this by default now.
Writing extra test cases once the implementation is done. It is good at thinking of inputs I did not try.
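As an illustration of the first item, boilerplate whose shape is already known, a five-method repository might look like the sketch below. User and InMemoryUserRepository are hypothetical names, and the in-memory dict is an assumption standing in for whatever store a real project uses.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class User:
    id: int
    email: str


class InMemoryUserRepository:
    """Five-method repository; a dict stands in for a real data store."""

    def __init__(self) -> None:
        self._rows: Dict[int, User] = {}

    def get(self, user_id: int) -> Optional[User]:
        # Returns None rather than raising when the user is unknown.
        return self._rows.get(user_id)

    def list_all(self) -> List[User]:
        # Sorted by id so callers get a stable order.
        return sorted(self._rows.values(), key=lambda u: u.id)

    def add(self, user: User) -> None:
        if user.id in self._rows:
            raise ValueError(f"user {user.id} already exists")
        self._rows[user.id] = user

    def update(self, user: User) -> None:
        if user.id not in self._rows:
            raise KeyError(user.id)
        self._rows[user.id] = user

    def delete(self, user_id: int) -> bool:
        # True if something was removed, False if there was nothing to remove.
        return self._rows.pop(user_id, None) is not None
```

Nothing in it requires a decision I have not already made, which is exactly why it is safe to delegate.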
What it is not good at: deciding what to build, deciding how to architect something that will need to scale, or writing code I am not equipped to review. When I catch myself in that last situation, I stop and learn the thing first.
This is not a principled stance. It is just the pattern that has saved me the most time over a long enough window.
Why should you write tests before you prompt the AI?
Because writing the test first moves the thinking to the front of the process, where it is cheap, instead of the back, where it is expensive. The test defines the exact contract you hand the model, which turns a vague prompt into a bounded one the model can actually satisfy.
Before I started writing tests first, my AI-assisted sessions had a predictable shape: I would get code back quickly, it would look reasonable, I would drop it in, and it would work until it did not. The failure modes were always in the edges I had not thought about because I had not been forced to think about them before asking.
Writing the test first broke that pattern. It sounds like it adds time, and in the first ten minutes it does. But what it actually does is move the thinking to the front of the process, where it is cheap, instead of the back, where it is expensive.
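A small sketch of what that looks like in practice, using Python's built-in unittest. The function to_cents is a hypothetical example, not something from a real project. The test class is written first and pins down what goes in, what comes out, and what happens on bad input; the implementation below it is the part that would be handed to the model and then read before committing.

```python
import unittest
from decimal import Decimal, InvalidOperation


# Written first: the contract, before any implementation exists.
class ToCentsContract(unittest.TestCase):
    def test_converts_whole_and_fractional_amounts(self):
        self.assertEqual(to_cents("12.34"), 1234)
        self.assertEqual(to_cents("5"), 500)

    def test_rejects_sub_cent_precision(self):
        with self.assertRaises(ValueError):
            to_cents("0.001")

    def test_rejects_negative_and_non_numeric_input(self):
        for bad in ("-1.00", "abc", ""):
            with self.assertRaises(ValueError):
                to_cents(bad)


# Written second: an implementation that has to satisfy the contract above.
def to_cents(amount: str) -> int:
    """Convert a decimal string such as "12.34" to an integer number of cents."""
    try:
        value = Decimal(amount)
    except InvalidOperation:
        raise ValueError(f"not a number: {amount!r}") from None
    if not value.is_finite() or value < 0:
        raise ValueError(f"amount must be a finite, non-negative number: {amount!r}")
    cents = value * 100
    if cents != cents.to_integral_value():
        raise ValueError(f"more than two decimal places: {amount!r}")
    return int(cents)
```

Saved as a module, this runs with python -m unittest. The sub-cent and negative cases are the kind of edge that only exists in the spec because the test forced the question early.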
I use this structure now:
Unit tests first. These define the contract I am handing to the model: what goes in, what should come out, what happens if the input is wrong.
Integration tests against real systems, not mocks. I learned the hard way that mocked tests can pass while the actual integration is broken. The mock is not the system. The system is the system.
End-to-end tests with Playwright for the paths users actually take. Not everything, just the critical flows where a failure would be immediately visible.
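The mock problem from the second item is easy to reproduce. In this sketch, count_paid_orders is a hypothetical function with a deliberate bug: it queries a column that does not exist. An in-memory SQLite database stands in for the real system, which is an assumption made to keep the example self-contained.

```python
import sqlite3
from unittest.mock import MagicMock


def count_paid_orders(conn) -> int:
    # Deliberate bug: the table's column is called "status", not "state".
    row = conn.execute("SELECT COUNT(*) FROM orders WHERE state = 'paid'").fetchone()
    return row[0]


def mocked_test_passes() -> bool:
    # The mock returns whatever we tell it to, so the SQL is never checked.
    conn = MagicMock()
    conn.execute.return_value.fetchone.return_value = (2,)
    return count_paid_orders(conn) == 2


def real_database_test_passes() -> bool:
    # A real (in-memory) database actually parses and runs the query.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (status) VALUES (?)",
        [("paid",), ("paid",), ("open",)],
    )
    try:
        return count_paid_orders(conn) == 2
    except sqlite3.OperationalError:
        return False  # SQLite rejects the unknown column
    finally:
        conn.close()
```

The mocked check reports success and the database-backed one does not, for the same function. Only the second is telling the truth.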
Then I read everything the model gives me before I commit it.
That workflow is slower than vibe coding for the first hour. It is faster than vibe coding over any meaningful timescale. I cannot tell you how many hours I have not spent debugging production issues because of this.
Will AI replace senior developers?
No. But it may stall developers who outsource their thinking to it. The judgment senior developers carry is built through accumulated mistakes, not generated on demand, and that is the part AI cannot replicate.
There is a version of this conversation that frames it as: AI will replace junior developers but senior developers are safe because they can guide it.
I do not think this is right, and I think believing it creates a complacency that is more dangerous than the replacement question.
What makes a senior developer valuable is not primarily the ability to generate correct code. It is knowing which code should not exist, which abstractions will turn into debt, which requirements are wrong before you build them. That judgment comes from having been wrong enough times to develop taste.
AI does not have taste. It has pattern completion. It will write you a technically correct solution to the wrong problem with the same confidence it writes a technically correct solution to the right one. It cannot tell the difference. A model that cannot distinguish the right problem from the wrong one is not a collaborator. It is a very fast typist with no stake in the outcome.
If you stop developing judgment because you are outsourcing the thinking to the model, you are not building toward senior. You are extending the period where you do not yet know what you do not know.
This is the risk nobody talks about clearly. The AI makes you feel productive. The velocity metrics look good. But if you are not building mental models of why the code works, you are accumulating a different kind of debt.
Does using AI for coding make you a worse developer?
Only if you stop verifying what you ship. Used with discipline (understand first, prompt second, read everything, test before committing), AI-assisted development makes experienced developers significantly faster without eroding their judgment.
The people I have watched get the most out of these tools are the ones who get more done, not the ones who get more generated. The difference is that they know what done means before they start, and they verify they reached it before they ship.
I am not arguing for slowing down or using fewer tools. I am arguing for staying in the driver’s seat of your own work.
There is a 2024 study from GitClear that tracked 211 million lines of code across repositories before and after AI adoption. They found that code churn (code written and then reverted or altered within two weeks) was projected to double in 2024 compared to its 2021 pre-AI baseline, and that copy-pasted code began outpacing refactored code for the first time in the dataset. They attribute part of that to AI-generated code that passed review but failed in practice. I think about that study a lot. The tool that lets you write more code also lets you throw more of it away.
The AI handles the typing. I stay responsible for the thinking.
That is the only arrangement I trust.
How do you write a good prompt for AI coding tools?
Put the understanding in before you write the prompt: a concrete schema, a test the output has to pass, an explicit list of things it should not do. The quality of what you get back scales almost entirely with the precision of what you put in.
This is not a metaphor. It is a literal constraint of how these models work.
When I give Claude a vague instruction, I get a plausible-looking response that requires significant rework. When I give it a specific constraint (a concrete schema, a test it has to pass, a list of things it should not do), I get something I can use with minimal editing.
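One way to hold myself to that is to treat the three ingredients as required fields. The sketch below is a hypothetical helper, not part of any tool or API: PromptSpec simply refuses to exist without a task, a schema, and a test, and renders them into plain prompt text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces a precise prompt needs."""
    task: str
    schema: str
    test: str
    must_not: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Each section is separated by a blank line; constraints become a bullet list.
        sections = [
            f"Task:\n{self.task}",
            f"Output must match this schema:\n{self.schema}",
            f"Output must pass this test:\n{self.test}",
        ]
        if self.must_not:
            rules = "\n".join(f"- {rule}" for rule in self.must_not)
            sections.append(f"Do not:\n{rules}")
        return "\n\n".join(sections)


spec = PromptSpec(
    task="Write to_cents(amount: str) -> int.",
    schema="Returns an int: the amount in cents.",
    test='assert to_cents("12.34") == 1234',
    must_not=["use float arithmetic", "add third-party dependencies"],
)
prompt_text = spec.render()
```

The helper itself is trivial. The useful part is that if I cannot fill in the schema or the test, I have found out before prompting rather than after.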
The cognitive work is not in writing the prompt. It is in understanding the problem well enough to write a precise prompt. If you cannot write the precise prompt, you do not understand the problem. And if you do not understand the problem, you should not be shipping AI-generated code for it.
That sounds harsh. I mean it as a practical filter, not a moral judgment. The filter works. I have shipped less broken code since I started using it.
Why does judgment matter more as AI coding tools get better?
Because better models produce more convincing output, which makes their confident mistakes harder to catch. You are the error bar the model cannot provide, and the more fluent the output gets, the more that role matters.
The models are getting better fast. They are already capable of writing entire files of reasonable code from minimal description. By the time you read this, they may be capable of more.
The better they get, the more important the underlying habit becomes. When the output looks more convincing, it is harder to notice the cases where it is confidently wrong. The model does not have error bars. It does not tell you when it is guessing. It produces output with the same tone whether it is 95% sure or 55% sure.
You are the error bar. Your judgment, your tests, your habit of reading what you commit before you ship it. The model does not provide that. It cannot.
The tools will keep improving. The discipline has to improve with them.
I am not nostalgic for writing more code by hand. I do not want to go back to the era before these tools. But I am very aware that the floor of what is possible with AI and no judgment is getting lower as the ceiling of what is possible with AI and good judgment gets higher.
The developers I see getting the most out of AI right now are the ones with the most judgment, not the least. That probably tells you something.