Vibe Coding 2.0
Dev Leader Weekly 115
TL;DR:
Different levels have different trade-offs
Less detail, more randomness
Join me for the live stream (or watch the recording) on Monday, November 10th, at 7:00 PM Pacific!
Vibe Coding 2.0
Let’s talk about vibe coding -- and not the definition we were slapped with initially, where we purposefully remain ignorant of what’s going on: copy, paste, and repeat.
Well, we’ll start there... but then we’ll work up to pairing developer intent with AI tools to build faster and smarter.
So here’s the catch: there are levels of vibe coding, and the more structure you provide, the better outcomes you’ll get. This probably isn’t new to you, but we’ll try breaking these things down more clearly.
I’ll walk you through a clear progression from minimal prompts to full-agent workflows, using concrete examples of tools like GitHub Spark, GitHub Copilot & agents, and GitHub Copilot Spaces. Use this as a framework to level up how you engage with AI in your dev workflows.
Level 1: Minimal Input → Maximum Chance
Example: Use GitHub Spark and drop in this one-sentence prompt: “Build me a to-do list app with React and Firebase.”
This is true vibe coding in its raw state. You’re giving a high-level directive and letting the AI fill in everything else:
UI
Data structure
Hosting
Deployment
Spark is built exactly for this kind of flow: you type or speak an idea, and it scaffolds a full-stack micro-app. You have minimal control, but you’re also providing minimal input.
Pros:
Amazing speed (assuming it doesn’t go completely off the rails from what you asked for).
Minimal setup.
You’ll often end up with something live.
Cons:
You are handing off almost everything to chance -- architecture, design decisions, backend logic, scale concerns.
The result may diverge from your intention, or be harder to maintain or extend.
The magic is cool when the magic works. But when it doesn’t, where do you even start?
Tips:
Use this level when you’re in “fast experiment” mode, prototyping, or testing ideas.
Expect variability. If you don’t care about the specific implementation, it can be great.
Level 2: Conversational Wiring → Guided Outcomes
Example: You open ChatGPT or Copilot Chat and go back and forth: you say, “I need a microservice in Node.js with REST endpoints for tasks, authentication, and websockets for real-time updates.” The AI asks questions:
Which database?
What sort of auth scheme?
Where will you be hosting this?
Will you deploy containers or serverless?
What tasks should we start implementing initially?
You answer, and then the AI outputs code. This is more structured: you’re in dialogue with the AI, refining intent, clarifying constraints, and then letting it generate. Because of how this flow is structured, you get to steer the direction the LLM is headed.
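To make that concrete, here’s a sketch of what the prompt looks like once those questions are answered upfront -- the specific stack choices below are placeholders, not recommendations:

```
I need a microservice in Node.js with REST endpoints for tasks,
authentication, and websockets for real-time updates.

Constraints:
- Database: PostgreSQL
- Auth: JWT bearer tokens
- Hosting: containers on our existing cluster
- Start with: create task, list tasks, mark task complete; stub the rest
```

Every decision you hand over explicitly is one less decision the LLM makes for you.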
Pros:
Greater alignment between your vision and AI output.
Better maintainability and custom-fit logic.
Cons:
Still, human effort is required to steer the conversation.
You may miss spec details. Without strong specs, you still leave some decision-making to the AI.
If you’re not directly in a codebase, you’re (very) manually providing specific context to help the LLM.
Tip: Use this level when you have a clear idea of what you want, but are still comfortable letting AI fill in the boilerplate or scaffolding.
Tip: This can be SUPER helpful if you aren’t actually looking for a running solution afterwards and just want to explore options.
Level 3: Codebase + Instruction Files → Working Within Boundaries
Example:
You have an existing repository.
You add an “instructions.md” or “Agent.md” file (or whatever your favorite tooling uses) that outlines architecture, module boundaries, coding conventions, test coverage targets, branching strategy, and APIs. All your favorite things.
You then use Copilot (or a coding agent) to assist with new features or refactors.
For instance, assign a task to the agent: “Add payment-gateway integration conforming to our modular architecture. Use Stripe API, add tests following the existing patterns in the codebase for unit and functional tests, and update docs in the corresponding folders with the same layout as our other feature documentation. Focus on the pathway for features A, B, and C to integrate with payment first -- ignore feature D for now.”
You’re now giving the AI structure, standards, and guidance. The AI is working within your system, not creating from scratch. It has existing code to refer to as examples, plus generalized instructions that you use consistently.
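For illustration, a minimal instruction file might look something like this -- the file name, paths, and targets are hypothetical, so adapt them to whatever your tooling expects:

```
# Agent.md (illustrative example -- adapt to your tooling)

## Architecture
- Modular monolith; each feature lives under src/features/<feature-name>
- No cross-feature imports; shared code goes in src/shared

## Conventions
- TypeScript strict mode; follow the naming and layout of existing modules

## Testing
- Unit tests live beside the code (*.test.ts); aim for 80%+ coverage on new code
- Functional tests follow the patterns in tests/functional

## Workflow
- Branch from main as feature/<ticket-id>-<short-name>
- Update docs under docs/features using the same layout as existing features
```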
Pros:
Stronger alignment
More predictable outputs
More maintainable code
You leverage your domain knowledge + your codebase + AI.
Cons:
You still need to build and tune the instruction files, maintain the culture, and verify outputs.
Requires architectural discipline.
... Which tool uses which agent instruction file? How many do we need? Why does this feel like it’s in its infancy?!?! (oh... because it is)
Tip: Use this when you’re working on a production codebase, you want AI to amplify your work, and you care about maintainability, quality, and consistency.
Tip: If you notice the LLM is frequently going in directions you don’t want, literally ask it to help improve your instruction files and/or prompts for better alignment.
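For that second tip, the meta-prompt can be as simple as this (the wording and placeholders are illustrative):

```
You keep reaching for <undesired approach> when I ask for changes in this
repo. Review Agent.md and suggest additions that would have steered you
toward <desired approach> instead. Keep the additions short and general
enough to apply to future tasks.
```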
Level 4: Full Agent Workflow + Repository-Wide Context → AI as Independent Executor
Example:
You create a custom GitHub Copilot Coding Agent or similar agent (maybe for your good friend, Claude) in your repo.
You set up custom instructions, expected patterns, and test coverage targets.
You embed an “instructions/” directory with task templates, architecture rationales, API contracts.
You open the repo in a tool like GitHub Copilot Workspaces or use Codespaces with agent mode enabled.
You want to start building a feature that might span functionality across your entire repo, or perhaps even across repositories (your frontend and backend, if they’re separate). You start describing what your feature will do and some high-level concepts for how you plan to achieve it.
Now, you ask the LLM for input:
Is there existing code and patterns that can be leveraged?
Is there a library or package that can provide us a lot of this functionality already?
How should we think about security here?
What about performance?
What testing considerations come into play? How can we build confidence in this feature?
You go back and forth with the agent, and ask it specifically about different parts of your codebase -- because it has the full ability to scan across it. And when you’re feeling good about the discussion? You switch gears.
Ask it to start creating a specification based on the requirements in your conversation. Iterate on the spec with the LLM. Ask it to create a task breakdown so that an AI agent will be able to implement the required functionality. Double down on ensuring it includes details like reading the documentation, following existing patterns, and adding tests.
When that’s all set? Ask it to open the PR. Assign your favorite agent (or agent swarm!) to go work on it.
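As a rough sketch, the spec and task breakdown you iterate toward might look like this -- the structure and file paths are illustrative, not a required format:

```
# Spec: <feature name>

## Requirements
- (distilled from the conversation above)

## Task breakdown
1. Read docs/architecture.md and the existing patterns in src/features/<similar-feature>
2. Implement <task>, following those patterns
3. Add unit and functional tests mirroring the existing test layout
4. Update the feature docs using the same layout as other features

## Definition of done
- All tests pass, docs are updated, and no unrelated files change
```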
Pros:
Highest leverage of the options we’ve discussed.
AI becomes a semi-autonomous contributor. It’s not full autopilot upfront, but it can operate more independently afterwards.
You scale through tooling.
Much more likely to get something “correct”.
Cons:
Highest upfront investment: you need to give the repo context, build instruction systems, monitor agent behavior, and review output.
May require team norms, governance, and security policies.
May not be trivial to get going; it takes iterations to find the blind spots and get them documented so next time is better.
Tip: Use this when you’re comfortable treating AI as a teammate, you have a mature codebase and process, and you want to scale your output via tooling, not brute human hours.
Why This Progression Matters
More guidance = less randomness. When you give minimal input (Level 1), you get more variability. As you increase guidance (Levels 2–4), your outcomes become more predictable and aligned.
Tooling evolves. Tools like Spark assume minimal input. Advanced workflows leverage agents + repository context.
Your role evolves. At Level 1 you’re the idea machine. At Level 2 you become the conversation partner. At Level 3 you’re architect + instruction designer. At Level 4 you’re orchestrator + reviewer of AI output.
How to Apply This in Practice
Map the level to your need.
Have a quick prototype or side hustle? Use Level 1.
Building a feature or MVP? Use Level 2.
Working in a team codebase? Use Level 3 or 4 depending on the maturity level of the codebase, tooling, and docs.
For Levels 3–4, invest in structuring your repo.
Create an “instructions/” directory and/or an “Agent.md” file (see the sketch after this list). Pay attention to what your tools require (and note these will likely evolve over time).
Define architecture, module boundaries, code style, and testing approaches.
Add examples of tasks that previous agents or humans have done, so there are good examples to use and bad examples to avoid.
Set up a context space (for example, use Copilot Spaces) so the AI understands your codebase and team conventions.
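Pulling those steps together, a repo structured for Levels 3–4 might look roughly like this (all names are illustrative):

```
your-repo/
├── Agent.md                # or whatever file your tooling expects
├── instructions/
│   ├── architecture.md     # module boundaries and rationales
│   ├── conventions.md      # code style and testing approaches
│   └── task-examples/      # good examples to use, bad ones to avoid
├── docs/
│   └── features/           # per-feature docs with a consistent layout
└── src/
```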
Monitor and review.
At all levels, the AI may do something unexpected. Always review, test, and iterate. This is different than “just straight vibez”.
Track metrics: how long tasks take, how many tasks need human rework, and how many builds fail.
As you scale to Level 4, work towards more automated governance: branch policies, better linting and guard rails, and PR review rules (one example follows below).
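One concrete guard rail: GitHub’s CODEOWNERS file makes human review non-negotiable on the paths that matter. The paths and team names below are placeholders:

```
# .github/CODEOWNERS -- placeholder paths and teams (last matching rule wins)
*                @your-org/reviewers
/src/payments/   @your-org/payments-team
/instructions/   @your-org/tech-leads
```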
Iterate your instructions.
If the AI output drifts, refine the instructions or prompt.
Use examples: “Follow this pattern,” “Avoid this antipattern.”
Consider having the AI generate organized documentation or update existing documentation along with code. Make sure to steer it here, because it can go overboard and create a README.md for everything.
Final Thought
“Vibe coding” is evolving from the meme it was when it first hit the Internet. We can “vibe” more safely when we have more guardrails in place, but there’s a progression to getting more out of your tools.
When you give the AI too little structure, you’ll get messy results. Give it more structure and you’ll get predictable, high-quality output. The evolution from Level 1 → Level 4 is about increasing shared understanding:
Your intent
Your architecture
Your standards
Your context
Join me and other software engineers in the private Discord community!
Remember to check out my courses, including this awesome discounted bundle for C# developers:
As always, thanks so much for your support! I hope you enjoyed this issue, and I’ll see you next week.
Nick “Dev Leader” Cosentino
social@devleader.ca
Socials:
– Blog
– Dev Leader YouTube
– Follow on LinkedIn
– Dev Leader Instagram
P.S. If you enjoyed this newsletter, consider sharing it with your fellow developers!



