How do you turn AI coding chaos into a repeatable playbook?

Vivek explains how Snowflake systematically rolled out coding agents across its engineering org — starting with unrestricted experimentation, then codifying what worked into a shared vocabulary of 14 “AI design patterns,” from plan-in-English to fencing off parallel agents to reducing on-call toil through continuously updated skills. Vivek walks through the “inner loop” and “outer loop” of software development, explains Snowflake’s internal Yegge scale for measuring how far engineers have progressed along that continuum, and shares how a three-person team used coding agents to deliver a 40x improvement on Snowflake’s query compiler.

The discussion also:

Breaks down Snowflake’s “focus weeks,” where engineers get dedicated time to either catch up on best practices or push the frontier further.
Explores the pioneers/settlers/skeptics framework for meeting engineers where they are in adopting AI tools, and why the shift can trigger something like the stages of grief.
Covers how Snowflake cut release validation time from 15 days to a single day, and why more automated testing hasn’t come at the cost of production stability.
Looks ahead to a four-step maturity model for on-call and incident response, where agents may eventually take primary on-call duty.

Connect with Vivek Raghunathan on LinkedIn.

TRANSCRIPT

Eira May:
Hi, and welcome to Leaders of Code. We are recording this episode at Snowflake Summit here in San Francisco. If this is your first time tuning in, Leaders of Code is a segment on the Stack Overflow podcast where we get senior engineering leaders in the room and ask them about the work they’re doing, how they go about building great teams and the biggest challenges that they find themselves staring down right now. My name is Eira May. I’m the B2B editor at Stack Overflow, and I’m here with Vivek Raghunathan, who is the SVP of engineering at Snowflake. Welcome to the show.

Vivek Raghunathan:
Thank you for having me.

Eira May:
Yeah. How’s the week been going for you so far?

Vivek Raghunathan:
It’s been incredibly exciting. This is the time of the year where we get to show everyone what we’ve been up to over the last year. It’s just an incredible experience seeing customers interact with and experience the things we’ve been building. This week gives me great energy going into the rest of the year, knowing that we are on the right track. We are building things that will add incredible value to our customers and they share our vision of the future.

Eira May:
Awesome. Speaking about vision of the future, one thing that we’ve been hearing a ton about this week is about how engineering leadership is sort of shifting. So with the cost of code generation, the cost of writing a line of code just sort of approaching zero, the bottleneck is less about writing code, right, and it starts to shift to more kinds of work around managing and orchestrating teams, defining intent, thinking about what you can do strategically to deliver that real business impact. So I wonder for you in your role at Snowflake, have you really thought differently about what engineering leadership looks like in the light of all these changes?

Vivek Raghunathan:
I think the more interesting question in my mind is also how is the business of producing software changing by itself? Because at some level, roles and what people do is an outward metric to how is the business of software resulting in software being produced. Right? We have taken a fairly methodical approach to doing this over the last 12, 18, 24 months. And I’m happy to go through it in great detail. The way I think of this is code gets worked in an engineer’s head, then on their workstation or in a cloud workspace of some form, and it goes through a bunch of stuff before it makes its way into main or master, right?

Eira May:
Sure.

Vivek Raghunathan:
I’ll call that the inner loop of software. There’s the outer loop of software, which is now that code in main has to get released into production. There’s what I’d call the second outer loop of software, which is bugs are found in production and they make their way back into the systems as support tickets or incidents or some kind of like anomaly detection in our systems, and how do fixes in response to those bugs make their way back into main. So I’d say that is like outer loop two, if you will. And then there’s the question you asked, which is, what do the roles and responsibilities look like in this new world? How is the act of building software itself changing? And then how do you structure your organizations in response to the fact that the act of building software is changing? So I break it up into those five tasks. Happy to dive into any one of them, but we’re making like very methodical progress on each of those five vectors almost building up to the last vector, which was the question.

Eira May:
Yeah. I’d love to just start at the beginning if you could unpack it a little bit for us. That would be great.

Vivek Raghunathan:
Happy to do it. I think the inner loop is at some level the easiest because at its core coding agents make it easy for you to write more code and better code faster. Right?

Eira May:
Sure.

Vivek Raghunathan:
We started very much with an approach to, Andy Grove once said this and it’s a very popular phrase I guess reused and thrown around quite a lot now. He said, in any platform shift, you need to let chaos reign before you rein in the chaos. And so the approach we took initially was, we’re just going to let chaos reign. Habit formation is hard. Habit breaking is hard. We encourage people to use coding agents. We’re just going to measure adoption. We’re not going to measure lines of code. We’re not going to measure PR counts. We’re not going to measure any of these metrics that are easily gameable. We’re just going to measure, do you use this twice a day? 95% of you use it weekly.

Eira May:
Right.

Vivek Raghunathan:
And we’re going to encourage you to use it to write code, review code, understand code, write design docs, do everything, right? We’re going to give you every possible tool. We’re not going to say no to any tool that you want. Right? And clearly that is an age of a little more letting chaos reign than reining in the chaos. So that’s step one.
Now you have chaos reigning. A bunch of people are using coding agents. 95% of our engineers use them on a weekly active basis. I think 97. Are all of them equally effective doing it? That’s the second question to ask. Right? There’s a difference between using it to save yourself 20 minutes a day and using it to do 80% of your job. And I would posit that the difference between folks who are using coding agents and mastering coding agents or using them effectively is 14 AI design patterns. I use that word very deliberately. Design patterns, for those of us who are in the software engineering industry, Gang of Four, the book with a whole bunch of like the factory pattern or the command pattern and so on and so forth, and that book created a language of almost how you think about becoming more effective writing software. Right? I think a similar thing will happen with how people are using coding agents.
And so we have discovered I’m going to say about 14 patterns right now. I start them from a numbering of zero because you’re in next and they represent patterns that our most fearless explorers, our, like, AI czars, if you will, the people who are at the cutting edge have discovered as effective ways to use coding agents. So I’ll give you an example of some of these patterns.

Eira May:
Yeah, I’d love to. I’d love to hear one.

Vivek Raghunathan:
Pattern one says plan in English. And what it means is first plan, use plan mode in your coding agent. First figure out what the plan is in markdown and then write code. Right? And I’m happy to show it to you on my laptop when we… We have an XKCD comic with these 14 so we can easily disseminate knowledge in the organization. Pattern four I believe is fence your robots, and what it means is you can have a single agent and it’s kind of slow. You can start a bunch of agents and have them all work together and then you have chaos, or you can have git worktrees that run each of them independently and fence them a bit and they work on stuff and that can dramatically improve the productivity of most of our engineers.
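
[Editor’s sketch: the “fence your robots” idea, one git worktree per agent, might look roughly like the Python below. This is an illustration under stated assumptions, not Snowflake’s tooling. The agent names, the `agent/<name>` branch scheme, the sibling-directory layout and the `main` base branch are all hypothetical; the only real interface used is Git’s standard `git worktree add -b <branch> <path> <commit-ish>` command. By default the function only builds the commands, so nothing touches a repository unless `dry_run=False` is passed.]

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def worktree_command(repo: Path, agent: str, base: str = "main") -> list[str]:
    """Build the `git worktree add` command that gives one agent its own checkout.

    Hypothetical layout: each agent gets a sibling directory named
    `<repo>-<agent>` and a fresh branch named `agent/<agent>`.
    """
    path = repo.parent / f"{repo.name}-{agent}"
    branch = f"agent/{agent}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def fence_agents(
    repo: Path, agents: list[str], base: str = "main", dry_run: bool = True
) -> list[list[str]]:
    """Return one worktree command per agent, running them unless dry_run is set."""
    commands = [worktree_command(repo, agent, base) for agent in agents]
    if not dry_run:
        for cmd in commands:
            # check=True raises if Git rejects the command (e.g. branch exists).
            subprocess.run(cmd, check=True)
    return commands
```

[Calling `fence_agents(Path("path/to/repo"), ["planner", "tester"])` returns two commands, each pointing at a separate directory and branch, so parallel agents never edit the same working tree.]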

Eira May:
Okay. So order from chaos.

Vivek Raghunathan:
Order from chaos.

Eira May:
Yeah.

Vivek Raghunathan:
Pattern eight is what I call the TLA pattern or the TLF agents pattern. It is your orchestrator, the agent you’re talking to, the master agent you’re talking to is never holding a lot of context. It is delegating work. It is using an agent team of some form to actually do the work. And so it is always, its brain is free to talk to you. Pattern 11 and 12 are, or 12 and 13 are new patterns. They are patterns around continued learning. They are patterns that recognize that you can mine memory and you can promote it into skills overnight and that will make the system get better as you use it. And if you do this in a multiplayer way where everybody in the team is doing it, then you get to harness tribal knowledge. Now each of these 14 patterns are patterns that individuals, our best AI-forward engineers, are using. So now you have these 14 patterns. Now we have a language to speak in terms of how you upskill or reskill the engineering teams, if you will. And when you do that, you can then progress to what I call stage three, which is rein in the chaos. Right?

Eira May:
Yeah.

Vivek Raghunathan:
And there you start taking some of these paved patterns and you say, how do I get more people in the organization using them? Most interesting technique we have discovered is just a simple act of creating space and time for them. So we do these things called focus weeks. We do them pretty regularly and it’s a week where everyone in the org just takes the time off to figure things out, and it serves two purposes. There’s maybe 95% of the organization is what I call exploiters and the term feels like very spicy, but it actually means something very simple. It means they just want to know the paved paths and use them.

Eira May:
Right.

Vivek Raghunathan:
Right?

Eira May:
Just give me the good stuff.

Vivek Raghunathan:
Just give me the good stuff. I don’t want to do all this learning. I just want to use this. Right?

Eira May:
Sure.

Vivek Raghunathan:
My PhD is in RL, so explore and exploit is very common. And so these are the exploiters. That’s 95% of the organization and they need the time to exploit. They don’t know the patterns. They’re like, “I’m too busy.” And then the 5% of the organization is what I call the fearless explorers. These are the people paving the paths. These are people creating these best practices. They will go find the time. They’re doing it on the weekends, they’re doing it in the evenings, they’re doing it… Sometimes when they’re out with their friends, they’re busy hooking up a mobile app to Cortex Code and doing stuff. And these users, these engineers just need the time to explore, and so we give them this week to go and… And I call it raising the floor and raising the bar. So the first guys we’re raising the floor on, the second set of people we’re raising the bar on. So that’s just the inner loop.

Eira May:
I like that.

Vivek Raghunathan:
Right?

Eira May:
Yeah.

Vivek Raghunathan:
What happens if you do this is, roughly speaking where we are, 97% of our users, of our engineers are weekly actives on coding agents, and code is up about 1.5X in the last year over year, maybe 3X up over the last three years. Time to merge. Like things we know are characteristics of high performing teams, like they review code fast. They’re like a music band really just like riffing off of each other.

Eira May:
Yeah. You start to see them kind of compounding on the… Yeah.

Vivek Raghunathan:
And so those kinds of patterns are all up and to the right. They’re all up between one and 2X-

Eira May:
Cool. Yeah.

Vivek Raghunathan:
… based on every metric. That’s just the inner loop, right?

Eira May:
Sure.

Vivek Raghunathan:
So I said there were five stages. That’s just the inner loop.

Eira May:
Okay.

Vivek Raghunathan:
On the outer loop, we are using AI to basically rethink every step of the outer loop. I think of three steps of the outer loop, right? Can we release code faster and better? The second is can we test harder? Can we have a lot more tests, a lot better tests and a lot higher coverage tests. And the third is, can we debug smarter? On the first step, and this is very important for our customers, our customers are enterprise customers, we are not a consumer startup. Our customers expect timelines.

Eira May:
For sure.

Vivek Raghunathan:
Our customers expect no bugs to ever reach production, which as a software engineer is hard to pull off.

Eira May:
Yeah. It’s a tall order, for sure.

Vivek Raghunathan:
But this time last year it used to take us… So we go through extensive validation of full releases. We do like hundreds of thousands of tests. We run every performance benchmark again and again to make sure nothing regresses. Individual queries on customers are part of our regression benchmark. They contribute things in. We cover 90% of everything we could ever possibly cover. That stuff used to take us an inordinate amount of time. Like this time last year, we were up at 15 days to bless a release. Lots of our peers will push releases very slowly. I was at a consumer company before, where we used to push very regularly, and so we brought validation time down in the last 12 months to a day. A lot of it has been better tooling, a lot of it has been better discipline, but a lot of it has also been, can we use coding agents to automatically go through the, “Oh, we found a bug. This is a blocker in the release.” We go diagnose what’s going on, potentially put up a PR on GitHub automatically and let the person whose code it is go take a look and say, “Does this fix what I saw?”
And so doing that has gotten us down from like 15 days to a day. Lets us release safely. Lets us release all the features Christian announced yesterday. Tests are up like three and a half X. LLMs make it really easy to write tests and so people have taken advantage. One of the patterns, I think it is pattern two, is basically use coding agents to first write the test for your feature and then write the code, right?

Eira May:
Mm-hmm.

Vivek Raghunathan:
It’s a new spin on test-driven development from like a decade ago. So the teams have been doing that. Of course those tests are up 3X, which means while we’re doing faster… People ask me, “If you’re doing like quicker releases and faster releases, does that mean your quality is kind of regressing?”

Eira May:
Right.

Vivek Raghunathan:
And the answer is like, no, we’re also releasing much more safely. There’s far fewer SK patches into production. Finally, observing and debugging, right?

Eira May:
Mm-hmm.

Vivek Raghunathan:
We’re completely rethinking how we do this. We have a thousand engineers writing I’m going to say like 7,000 skills at this point. And the core insight is AI lets us completely transform the operational life cycle, or there’s a lot of ops IP stuck in engineers’ heads. I call it tribal knowledge. When this alert goes off, do this.

Eira May:
For sure. Right. Yeah.

Vivek Raghunathan:
And so we’re able to take a lot of what people would call runbooks, which get outdated very quickly, and instead build them as versionable CI/CD workflows of skills and they’re part of the CoCo coding agent. They’re packaged into things called profiles which are units. Like, oh, if you have an issue with streaming, this is the set of skills that you need to debug this issue.
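
[Editor’s sketch: one way to picture versioned skills grouped into a profile is the small Python model below. It is a guess at the shape of the idea, not Snowflake’s implementation or any real CoCo or Cortex Code API. The `Skill` and `Profile` classes, the `publish` method, the integer version counter and the example skill name are all hypothetical.]

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """One piece of operational know-how, e.g. what to do when an alert fires."""

    name: str
    version: int
    instructions: str


@dataclass
class Profile:
    """A unit of skills for one problem area, e.g. "streaming"."""

    area: str
    skills: dict[str, Skill] = field(default_factory=dict)

    def publish(self, name: str, instructions: str) -> Skill:
        """Add a skill, or replace it with a new version if it already exists."""
        current = self.skills.get(name)
        version = current.version + 1 if current else 1
        skill = Skill(name=name, version=version, instructions=instructions)
        self.skills[name] = skill
        return skill

    def names(self) -> list[str]:
        """List the skills an agent would load to debug this area."""
        return sorted(self.skills)
```

[Publishing the same skill name twice bumps its version instead of leaving a stale copy behind, which is the property that separates a versioned skill from a runbook that quietly goes out of date.]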

Eira May:
Sure.

Vivek Raghunathan:
All customer issues have a blast radius skill, which is like which customers are affected by this thing?

Eira May:
Right.

Vivek Raghunathan:
Right? And so when you do it, so that’s one. You get tribal knowledge encoded into repeatable workflows. You get complex deterministic workflows which are in people’s heads, now you can put them out into an LLM. And the third thing you can do is like being on call is a real pain in the ass. Most of my teams don’t want to be on call.

Eira May:
Yeah. It’s not a popular thing to do.

Vivek Raghunathan:
And so it reduces the toil a lot. What we call KTLO is about 30% for us and I’m looking to reduce that to like 5%. Right? And I think I have a path to getting there in like, maybe not today, but like in a couple of months. What we’ve discovered is the teams that work with production and coding agents, there’s like a four step maturity model. First, they write a bunch of skills. They encode all the issues that they’ve ever come across into a skill that Cortex Code can use. The second thing they do is they do event driven AI. So they hook it up to PagerDuty or Slack or one of these things, so now they can…
The third thing they do is they start taking complex workflows and using an LLM to encode like this multi-step reasoning thing. Go call support, get them to do this with the customer, go do this other thing, ping this person on Slack. And the fourth thing they do, which is the thing I told you earlier, is continued learning. So as you discover new things in an incident or an on call investigation, you encode that back into the skills that the coding agent uses. So far pretty good. The vision is we’ll have primary on call be an agent and then have humans do secondary and tertiary or vice versa. You can have one of the on calls, the triagers, be an agent before you and that immediately reduces the actual workload. I want engineers to say, “I like being on call,” because it’s fun, not like, “Oh, this is like something I”-

Eira May:
Right. Like they’re excited to jump on stuff.

Vivek Raghunathan:
Yeah. Not something like, “I only want to be on call once in nine weeks because it’s the worst week of my quarter.” Right?

Eira May:
Right.

Vivek Raghunathan:
So that is roughly how we think of the outer loop. Product is still very much like work in progress. If you saw Snowflake CoCo and Snowflake CoWork, we’re experimenting with how to build products completely differently in those products. We’re finding small empowered teams work really well.

Eira May:
I was going to ask you about that because I know that you’ve talked about like impact per engineer and wanting to like maximize individual impact on a team and you were talking about certain folks who can really like push the envelope and really experiment with the agents. Where do you feel like is the… I guess, how do you go about increasing that impact per engineer, but also not allowing AI to sort of like raise the bar for everybody in a way that just sort of like puts everybody on the same…

Vivek Raghunathan:
Yeah. I mean, it’s a great question, right?

Eira May:
Yeah.

Vivek Raghunathan:
The way I think of this is every big transformation has pioneers, people who are like Lewis and Clark, they will go and explore. Settlers, people who will come and follow the explorers or the pioneers into new territory.

Eira May:
Yeah. Kind of set up shop and sort of put the infrastructure in place.

Vivek Raghunathan:
And then it has the skeptics, right? The resistors. They may not even be doing it explicitly. They may be doing it implicitly. Right? For me, it is very important for us to meet each of these constituencies where they are. I’ve had people walk into my office and say, “Hey, I went through my seven stages of acceptance today.” And then they switched from being that implicit resistor to being a pioneer overnight. Right?

Eira May:
Okay. Yeah.

Vivek Raghunathan:
So we’ll meet everyone where they are. I think one of the realities of the moment is a lot of us who got really good. I mean, if you spent your life getting really good at something and then you were encountered with a DT that is about as good as you are or better, then there’s just like some amount of going through those seven stages of grief, if you will of-

Eira May:
Yeah. You sort of have to think about your value in a different way, I think. Yeah.

Vivek Raghunathan:
And I think a lot of folks in the software industry are going through that and I want to meet them where they are. Like I want to take them along on this journey. I think you asked me what the role of leaders in this world is, and at this moment I would say the biggest thing a leader can do is lead from the front, is show people how to be on that journey, like encounter their own emotions first, make sure they are at the place where they are pioneers and then take their teams along. Now is the time for the charge of the light brigade, if you will, not leading from the back. Right?

Eira May:
Yeah. Yeah. Well said.

Vivek Raghunathan:
And so we find, to your point, it’s actually easy for us to identify the pioneers because they’re also, they’re like kids in a candy store. They’re like, “Hey, I discovered this thing on the weekend and I built all this stuff and I need one hour of your time to tell you about it right now.” And I’m like, “Wait, is this urgent?” And they’re like, “No, it’s very urgent.” And then they show you what they’re doing. So they’re fairly easy to self-identify. They also show up in, they’re suddenly magically hundred times as productive as they used to be. It’s hard to miss them, right?

Eira May:
For sure, for sure.

Vivek Raghunathan:
They’re not always the people you think were the 100X engineers before coding agents. It’s a different set of skills that are being amplified. It is curiosity, it’s adaptability, it’s willingness to learn.

Eira May:
Yeah. Yeah.

Vivek Raghunathan:
And so they’re easy to identify. I like to bring what I call the 95% of people who are exploiting also along. So we think of this as less binary and more as a continuum. Like the more of those best practices I told you about you’re following, the further along that continuum you are. We internally joke about a Yegge scale, named after Steve Yegge. It’s like, are you Yegge five or are you Yegge seven, and like I’ve almost told my teams, “We need to 5X the number of Yegge sevens in the organization and get more people.” It’s not like we’ll magically hire them or something. It is we’ll take people who are Yegge three and take them closer to Yegge seven.

Eira May:
Transform them. Yeah. I wonder, kind of on a related note, one of the things that I’ve talked about with a bunch of the guests that we’ve had on recently is this sort of like the promise of like agentic coding and vibe coding running into the limitations of scale. And so just like, I mean, from your perspective with the scale environment that Snowflake deals with is that, have you been finding that there’s a real disparity between… Yeah.

Vivek Raghunathan:
No, we’re not. I think if anything, we are seeing that coding agents in the hands of the right people can make amazing things happen. So you heard Christian talk yesterday about the interactive compiler. He talked about a rewrite of our compiler. The compiler is the, you can call it the brains of any query engine. It’s the thing that lots of blood, sweat and tears goes into. It’s the thing that most of the secret sauce of the query engine goes into. Our compiler is often the place where the hardest things that we do go in. So when you see efforts like our iceberg-like support or our dynamic tables feature, those things all take… Or our Unistore feature. Those things all take time because work in the compiler is careful, painstaking.
And sometimes the compiler is just slow, and this matters when you’re dealing with things like interactive workloads, workloads like some of your listeners might use ClickHouse for. Part of the reason an analytic engine like Snowflake doesn’t automatically out of the box work great for those workloads is the compile time can be pretty high. And so if you have a short running query and the compile time is pretty high, then you have a bunch of dead cost sitting at the front of the actual query execution. And so our tech lead for the compiler effort said, “Hey, I’m going to take three people. I know all the things that I would do if I had to rewrite this compiler and we’ll rewrite it using coding agents. Me, a set of like coding agents and three engineers.” But these are all engineers that are domain experts. What they’re seeing is things like 40X improvements.

Eira May:
Wow.

Vivek Raghunathan:
Now of course, a lot of that is he knows exactly where the juice in the system is.

Eira May:
Sure. Yeah. Does that surprise you though to have that kind of like…

Vivek Raghunathan:
I mean, it surprised me in a good way.

Eira May:
Yeah.

Vivek Raghunathan:
There are benchmarks that I, we had no chance of ever measuring up to customer workload. The customer was like, “If you do this, I will move this over tomorrow.” And we were like 3X away for years.

Eira May:
Like, boom, we did it.

Vivek Raghunathan:
And it was like, boom, we’re there. Right? And so to your point, I think coding agents make the ambitious possible. So you can have infinite ambition, right? Sometimes at Google, Larry Page would say this, like, “Aim for the moon.” It’s always easier to work on harder problems than easier problems because no one else wants to work on them.

Eira May:
That’s right.

Vivek Raghunathan:
So you have the field to yourself.

Eira May:
Kind of have it to yourself. Yeah, exactly.

Vivek Raghunathan:
Right?

Eira May:
Exactly.

Vivek Raghunathan:
And I think coding agents take ambitious people who want the field to themselves and suddenly say, “You can do this quickly. You don’t have to do this. It’s not a two year project.” You can go try ambitious things, whether it’s on the product side or the technical side, completely by yourself.

Eira May:
Yeah. I mean, we’ve had a lot of-

Vivek Raghunathan:
Like Cortex Sense, the product we just announced, came out of someone hacking over Christmas break and saying-

Eira May:
Oh really?

Vivek Raghunathan:
“I’m tired of creating these semantic models myself. I’m just going to build a system that does this for me.” And one thing led to another and we’re like, “Wait, he’s onto something.” Right?

Eira May:
Yeah. Frustration plus the agentic piece.

Vivek Raghunathan:
So I think technical ambition, product ambition. More ambitious the better. The moment is made for people who are not afraid. We’re very excited to be part of the journey that is just rethinking how data and AI can meet synergistically to create great outcomes. We see our mission as AI that makes data go faster and data that makes AI go faster, and we think if we can create that loop, our customers will see a lot of value. That is the thing I wake up in the morning jazzed by. That’s the thing my teams wake up in the morning jazzed by, and we’re eager for the world to see it as well.

Eira May:
You’ve been listening to Leaders of Code. My name is Eira May. I’m the B2B editor at Stack Overflow, and I just want to thank Vivek Raghunathan for joining us again today to talk about all things Snowflake Summit. Thank you so much for being here.

Vivek Raghunathan:
Thank you again.