Mark & Carl talk with Swizec Teller about using AI at work

Transcript from Wednesday April 29th, 2026

  • [00:00:26] Introductions
  • [00:02:36] What convinced you AI code tools were worth using?
  • [00:04:24] Using early ChatGPT for DB migrations
  • [00:06:59] Watching AI use a command-line
  • [00:08:16] Background chat agents
  • [00:08:58] Staying very hands-on while using AI tools
  • [00:11:00] Driving AI closely without reading its code
  • [00:17:50] Mark's workflow; OpenCode with CodeNomad UI, plus IDE+git UI. Opus 4.6 on API
  • [00:20:39] Swizec's workflow, latest Cursor on Opus
  • [00:23:58] Carl's workflow, mostly Claude Code but looking at custom orchestrators
  • [00:25:23] Exploring fully autonomous agents
  • [00:28:08] Mark's AI debugging work in React core
  • [00:31:42] Value of providing more context
  • [00:33:58] AI-owned documentation
  • [00:37:09] Using good engineering practices still matters?
  • [00:40:47] How do you know the right code to make?
  • [00:42:49] Good communication still matters
  • [00:45:13] What will "review" look like in the future?
  • [00:46:17] Automating functionality tests with deployment practices
  • [00:51:49] What behaviors belong to the agent, and what fundamentally can't be part of the agent?
  • [00:56:17] Impacts of LLMs on software engineering?
  • [01:03:22] A superpower right now is a domain expert who can kind of code

This Month in React April

Carl: Hello, everyone. Thank you for joining us for this month in React, which is not going to be particularly React heavy. [00:00:00]

Mark and I have been talking for a couple of weeks now about doing a, like, bonus episode of sorts to talk about AI and how we're using it. So we are just, instead of a bonus episode, we're, we're just gonna do that for this month, for April. [00:00:06]

Yeah, and apologies for last month. We had a recording problem, and we, it was completely unsalvageable, just nothing to save. Big bummer. [00:00:19]

Introductions

Carl: But yeah, so I am Carl. I am joined this month by Mark Erickson and Swizec Teller. yeah, we're gonna talk about AI because it's been a huge part of each of our workflows for the last, like, ranging from, like, three to six months to a year or more. [00:00:26]

Let's do some intros first. I guess Mark and I are reasonably well known, but I'm Carl. I am a staff level software engineer and engineering manager and community lead here at Reactiflux, where I do events like this and build code to keep the community operating. [00:00:43]

Mark: I'm Mark Erickson. My day job is ReplayIO, where we've built a time-traveling debugger for both humans and agents, with ReplayMCP now available. Please check out our blog. I just put up a blog post on how Replay found a bug faster than Dan Abramov did. I am still the Redux maintainer. Honestly, I haven't done much Redux stuff in the last few months because all my brain space has been taken up with day job work. And also, I'm going around to a whole bunch of conferences this year. [00:00:58]

Swizec: I'm Swiz. I work at Plasmidsaurus, we are a DNA sequencing as a service company. We do a lot of React for really fancy data visualizations for stuff like, "Hey, wh- how do you visualize a few million data points in the browser and make it work smooth?" Stuff like that. And these days, I'm kind of more of a manager than an IC really. [00:01:23]

And I've been thinking a lot about what kind of engineers get hired these days. We've been hiring a lot and using more and more AI to write the code. [00:01:45]

Carl: Yep. Believe that. Cool. Yeah. So we were just chatting a little bit about what shape this conversation's gonna take. Just to level set a little bit for everyone listening. [00:01:55]

We're gonna start off kind of at the point of, like, what convinced us that AI was a tool worth taking seriously and, you know, getting AI pilled as it were. Go from there into how we're using it now, what problems we're using it to solve, with what tools, as well as kind of, like, what aren't we using, what don't we find useful and compelling? [00:02:04]

And go from there to, like, landscape of, like, what tools are available, what's out there, where do we think it's gonna go, and then kinda close out with what do we think the impacts are gonna be on the industry more broadly. [00:02:24]

What convinced you AI code tools were worth using?

Carl: Mark, you wanna start us off talking about what convinced you that AI was worth using? [00:02:36]

Mark: Sure. A year ago, I was dead set that I would never, ever allow AI to write code for me. It was a fate worse than death. It was destroying my career. I refused to do it. And in fact, I actually wrote a 15,000 word blog post over the weekend that I haven't published yet that will give the long form version of this story. [00:02:41]

The short form is, over the summer last year, I cautiously started using AI to explain an existing code base to me. You know, just give me some architecture docs, walk me through the data flow. And then there was a three-day period in late August that blew my mind. On a Tuesday, I asked it to write some redux unit tests for me because my brain was too tired to write actual code, and it did, and I was stunned. [00:03:02]

On Wednesday, there was a node compression library that I've been trying to replace, but the alternative didn't have all the features we needed, and it's Rust-based. And I tried asking the AI to write the feature for me in the Rust library, and it did. It actually didn't quite work right, and the maintainer had to turn down the PR, but this was the first time I saw an agent actually just crank along and spit out a bunch of code and happily make a bunch of updates. [00:03:28]

And I thought I had a, a good understanding of what that process looked like. And then when I saw it in person for the first time, my jaw dropped. And then on a Thursday, I needed to write some AST-based linting code. I know what ASTs are. I've used Babel. I understand the concepts, but it's kind of complicated and fiddly, and we had a custom setup. [00:03:54]

I was like, "Could this do it for me? " And it did. And it did it much faster than I could as a person. And my worldview got destroyed. [00:04:13]

Carl: Yep. Sounds familiar. [00:04:22]

Using early ChatGPT for DB migrations

Swizec: I mean, that, that sounds like an amazing experience. For me, the first was way back in the, uh, Stone Age where you had to talk to ChatGPT and then copy paste the output to try to run it. [00:04:24]

And it was like, I think it was the holidays, and I was writing a book feeling kind of discouraged, and I was like, "I wonder how many words I'm writing per day." And it's like, I can write Python. It's not that interesting to parse a markdown file, go through, get history and see how many words you added every day. [00:04:39]

Um, so I was like, "Maybe ChatGPT can just write this for me." And I asked it, and the code, the code didn't work, but it ran. And I thought that was really cool. So I talked to it a little bit more, and we ended up with a really nice ... It was extremely ugly code that I would never write myself. I think it ended up still taking two or three hours, but it was a lot more interesting than me doing it. [00:04:58]

Um, and I ended up with a visualization of how my, how my book is doing, and then I wrote a few more scripts like that. And then I started using it at work to write my database migrations, because- [00:05:25]

Mark: Please tell me, was it still just the ChatGPT UI? [00:05:39]

Swizec: It was. Uh, this was, this, this was before Copilot. I was like, "Hey, ChatGPT, I have this SQL query, please write migration." [00:05:41]

And it wrote the migrations. Or I ended up, uh, later on just copy pasting table definitions from, like, DBeaver or DataGrip or whatever, go look at the current tables and be like, "I have these tables. Please write a migration for ..." I think we were using Knex at the time. "Please write a migration to add these columns," and then copy paste it back, and it worked. [00:05:49]

And I was doing that for a while and it was amazing. I never wrote a migration again in my life. [00:06:13]

Carl: Yeah. I started doing a lot more migrations because it just became so much easier to actually do them. They're such a pain, they're so mechanical. That was an early one for me too that was like ... I guess I actually had been, like, scoping my work to avoid migrations because I hate doing them so much. [00:06:18]

I had a similar experience of just, like, "Oh my God, this works for that. I can do so many more of them now because I don't have to do it." Yeah, it's a little funny because actually database migrations are a big part of why my career looks the way it does, because I did, like, one in, you know, my first year as a software engineer. [00:06:34]

I went, "Wow, I hate this. You can't guarantee anything. There's just, you just gotta try it." And so then I went more front end because you don't have to do database migrations on the front end. [00:06:49]
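For reference, the kind of migration Swizec describes pasting back and forth with ChatGPT is short and mechanical, which is part of why it delegates so well. A minimal sketch, assuming Knex as mentioned; the table and column names are made up for illustration:

```ts
import type { Knex } from "knex";

// Add two nullable columns to an existing table.
export async function up(knex: Knex): Promise<void> {
  await knex.schema.alterTable("orders", (table) => {
    table.string("tracking_number").nullable();
    table.timestamp("shipped_at").nullable();
  });
}

// Reverse the change so the migration can be rolled back.
export async function down(knex: Knex): Promise<void> {
  await knex.schema.alterTable("orders", (table) => {
    table.dropColumn("tracking_number");
    table.dropColumn("shipped_at");
  });
}
```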

Watching AI use a command-line

Carl: Yeah. I had a similar type of experience as Mark. I guess I, I got on it much later than you did, Swizz. Sometime around, like, August, July, August, September of last year, you know, I, I had done kind of like ChatGPT prototypes or whatever, like Claude, having it write some HTML in line and just, like, prove out a concept or write a first draft of a script, like, prove out does this work? [00:06:59]

Get me off the blank page and give me something to grow from and diagnose. And then I'd been hearing people talk about Claude Code, so I finally gave that a shot in, like, I think around July, August for the first time. And just watching it, you know, like, Mark, like you said, of just watching it churn through and do things and also having it be, like, native on the command line and watching it just run exactly the same Bash commands that I would to diagnose something and, like, read the files. [00:07:23]

It was like, "Oh, this is doing exactly the same process I would to resolve this bug, but it's doing it five times faster than I can." That was my big, like, light switch moment of, like, "Oh, I need to take this really seriously." [00:07:54]

Swizec: For me, it's kind of like I really hate watching it work. I've been using AI for a while doing, like, "Oh, can you write this function for me? [00:08:07]

Background chat agents

Swizec: Or can you, like, do little small things?" But my workflow really changed when Cursor launched Slack background agents. So I started in Slack, like, when you get ... Partially, I'm a PM, so I get a lot of requests that are like, "That is definitely not a priority right now. We're not gonna work on that." But I can now go @Cursor, do the thing, and I just get a PR with the implemented small thing that I never would've taken the time to do myself. [00:08:16]

And I, I find that amazing because watching it work, I think is, for me, is too distracting. [00:08:46]

Mark: Yeah. [00:08:51]

Swizec: So I like it when it's fully somewhere in the cloud. I don't have to worry. I just review the code when it's ready. [00:08:51]

Staying very hands-on while using AI tools

Mark: My workflow is very hands-on and human in the loop. Like, I'm sitting there watching that thing like a hawk and having conversations with it, which I realize is not how most people are using AI at this point. [00:08:58]

I think a lot of people are very much on the, in the cloud, multi-agents, how many of these th- things can be run in parallel, which I now have opinions on. Part of it is ... Well, a lot, a lot of it's the point that I'm making the draft blog post, which is understanding is still critical. And I'm a very, I'm a very firm believer in, you know, the fundamentals and understanding and building a mental model of the system. [00:09:10]

And I personally, for me, I want my brain engaged. I want to be thinking through the problem, and then I'm using the agent to amplify my own abilities. Now, don't get me wrong, there's definitely been a few moments where I was, like, you know, out and about. It's like, you know, it actually would be kind of nice if I could just pull up my phone and tell an agent, go do this thing. [00:09:34]

Like, I, I, don't get me wrong. I get the appeal. [00:09:54]

Carl: Yeah. [00:09:56]

Mark: But in terms of day-to-day development work and how I approach programming, I want my brain active, engaged, and thinking, not just handing it off to an agent. [00:09:56]

Carl: Yeah. I hear that. I think I use it in both ways. There are a lot of things that, like you said, just aren't quite high priority enough to justify. [00:10:08]

And that's where I'll use, like, more of a background agent. Like, you know, here's 15 one line descriptions and, like, just from the one sentence, it's clear enough what's needed. Like, you know, adjust the size of this. Add an element that controls this. Like, those are mechanical enough that it's really just, like, getting it right, making sure it compiles. I don't currently have a functioning workflow for that exactly. [00:10:20]

What I'm doing right now in the last four or five weeks just, like, hasn't really been of that type of development. So I am still, day-to-day, most of what I've been doing is very much, like, the mech suit variant. You know, people talk about, like, automated versus mech suit, like robot or mech suit. [00:10:44]

Driving AI closely without reading its code

Carl: And yeah, so I'm driving it pretty closely, but I'm not really reviewing its output very strictly. I'm, you know, I tell it to use subagents, and then every so often I'll, you know, tell it to write me a report, like, how is this working? What is it doing? Ask it some targeted questions to make sure that the mental model I have matches what's actually there. [00:11:00]

But that's a little different, I think, than what you described, Mark. Like, I'm not actually, like, I don't read almost any of the code. I try and think of it more from, like, my engineering manager, tech lead, product manager hat. I do really try and sit in engineering manager/product manager role and say, like, "Okay, how did those people talk to me as an engineer? [00:11:17]

Like, they were not technical. They didn't understand anything of what was being built, and yet it was their job to make sure it was functioning as intended." So that's very much how I try to think about my work now is from that perspective. It's like I, instead of reading a line of code to ensure that patterns are followed as I intended, like, set up a lint rule and then, you know, maybe it's only 80% is good, it's not gonna catch all the subtleties, but, like, my experience of working on a team is, that's kinda how it works anyway. [00:11:37]

Like, if it's not captured in the lint rule, eventually it will break. Like, eventually, someone will have been onboarded and not ha- be deeply steeped in the history of the project or just be tired that day and they forget about it or whatever. So, like, that was very much my perspective as, uh, you know, when I've been a tech lead is, like, if bad code goes out, you know, the blameless postmortem, it was a process failure. [00:12:06]

This should not have been permitted by the automated checks. And so that's kind of how I'm thinking about it is just remove the abstraction a little bit and guarantee the outputs and then work on automated tracking of automated evaluation of the code quality to make sure that's at a level that I need it to be. [00:12:30]
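As a concrete illustration of the "capture it in a lint rule" idea Carl describes, a single ESLint flat-config entry can encode a team convention so it's checked on every PR instead of living in a reviewer's head. A minimal sketch; the import pattern is hypothetical:

```js
// eslint.config.js (flat config)
export default [
  {
    files: ["src/**/*.{ts,tsx}"],
    rules: {
      // Example convention: feature code may not reach into another feature's internals.
      "no-restricted-imports": [
        "error",
        { patterns: ["@app/features/*/internal/*"] },
      ],
    },
  },
];
```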

Swizec: Yeah, same. I think that's my perspective as well. I've ... Going up into tech lead really kind of almost broke me as an engineer because I was like, "Oh my God, all of this terrible, awful code, but it works and it's fine and everything is okay and the team just handles it. And if anything is wrong later, we just fix it. It's totally fine." [00:12:48]

And I kind of treat AI the same way. I, I do a lot of driving from, like, a product perspective and for critical features or, like, super gnarly business requirements, I go into the IDE and I drive it kind of like what Mark was describing, reviewing every line of code, clicking yes, et cetera. [00:13:12]

But they do really well with follow-ups. Cursor gives you the wrong thing back in a Slack thread. You just say, "@cursor, go fix it." A lot of the times, those fixes are actually, "Oh, yeah. Now that I'm holding a working MVP, it doesn't actually feel right. I totally didn't even think of several features that it needs before it's useful." [00:13:29]

I also really like the code review flow. I don't know exactly what we did, but we hooked it up so that you can do a PR review, like a code review for your Cursor agent session. And in GitHub, you just go @cursor, fix this, or you give it information basically the same as you would with a team member, and then you just get follow-up commits, and it fixes all the things. [00:13:49]

Carl: That's very much been my experience as well. One of my biggest struggles as a, as an engineer is, like, I remember hearing about this sort of a stereotype of, like, give it, give this task to a junior engineer because they don't know it's impossible, you know? The sense of as you get more experienced, you get to see more of the complexity, and the complexity makes it harder to take action because you're now evaluating trade-offs instead of just kind of ignoring them because you don't know they're there. [00:14:10]

And I would get so stalled into analysis paralysis and, like, what's the correct way to do this? How do you do this best? What's gonna have the best maintenance trade-offs and the lowest, you know, the tech debt and whatever? And that's just such a difficult way to actually solve, especially a novel problem. Like, if you understand a problem well, great. You can specify everything in advance. But, like, if you're working at the edge of your knowledge, at the e- edge of your understanding, then, like, you're gonna go the wrong way at first. And it's so painful to work on something for three weeks and then go like, "Oh, shit. This is just completely the wrong architecture and I need to start over." [00:14:34]

But it's so much less painful now with AI because, like, great, what a wonderful learning. Let me take that, let me have it write a three-page document about what we learned, what the new problems are, go back and forth, interact with it about, like, designing a new data model and architectural, you know, process flow. [00:15:09]

I actually just did this in the last, like, week. I've been benchmarking local LLMs because I want to better understand which ones work for what tasks so I can not have all of my AI usage be based on some mystery frontier model in the cloud. [00:15:28]

And I just, like, you know, I started with just, like, "Hey, give me a script that will run this model. Start up a, you know, LLM server and then run prompts against it. " And then, like, that grows, "Oh, I need, I wanna be able to kill it and restart. Oh, I wanna be able to, you know, version the prompts as I change them. Oh, I need to have it, you know, run a code evaluator when I'm giving it, like, a code generation challenge." [00:15:42]

Eventually, it became un- unmanageable because of the tech debt. It's, you know, it grew and expanded and I learned more about what was needed. And so I just started over and I said, "Great. This is a Python ball of mess and it sucks and I hate it. So let's talk about the architectural flow of it, and now let's add some lint rules, and let's rewrite it in Effect." [00:16:06]

So it's actually like a high quality code base with, like, resumability and scheduling and whatever. And now it's working way better and it's, like, maintainable. It's been really powerful. [00:16:26]
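A rough sketch of the core loop Carl describes, shown as plain TypeScript: send a versioned prompt to a locally served model and record the result. It assumes an OpenAI-compatible local server such as llama.cpp or Ollama; the port, model name, and result shape are hypothetical.

```ts
type RunResult = {
  model: string;
  promptVersion: string;
  output: string;
  ms: number;
};

// Run one prompt against a local OpenAI-compatible chat completions endpoint.
async function runPrompt(
  model: string,
  promptVersion: string,
  prompt: string
): Promise<RunResult> {
  const started = Date.now();
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return {
    model,
    promptVersion,
    output: data.choices?.[0]?.message?.content ?? "",
    ms: Date.now() - started,
  };
}
```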

Mark: I freely admit that the very hands-on style that I've got is intentionally learning. I know it's possible to go faster. [00:16:35]

I know that it's possible to delegate a lot more. I am treating it much more as an extension of my IDE and keyboard than I am as a junior developer delegating the work. That's fine. I am good with that. That is what works best for me and my brain. [00:16:43]

Swizec: You might start doing that less and less as you use it more. [00:17:01]

I started with doing that a lot more, and then eventually I realized, wait a minute, I'm just clicking yes on everything and then giving it a follow-up prompt to change the things I didn't actually like, but click yes on anyway. [00:17:04]

Mark: I certainly didn't think a year ago I would be using any of this at all. I was, I was convinced that was a red line that I would, I would never, ever cross on pain of death, and here I am. [00:17:16]

But also, like, I mean, I've, I've found a workflow that I will be hopefully blogging about in the next day, next couple days that does actually work for me. And the hands-on aspect of it is what I get good results out of, and it's what fits my brain. [00:17:25]

Carl: Well, okay, let's take that as a jumping-off point. Like, what tools ... [00:17:40]

Let's do, like, super concise, just, like, list what tools and what models you're using day-to-day. [00:17:44]

Mark's workflow; OpenCode with CodeNomad UI, plus IDE+git UI. Opus 4.6 on API

Mark: So my, my own setup, my intro was the Kilo Code VS Code extension with, you know, it was probably either Sonnet 3.5 or Sonnet 4, whatever was available around September-ish last fall, because I didn't want to go anything command line at first. I wanted to stay in a graphical environment. [00:17:50]

I tried Claude Code for about a day. Tried the VSCode extension, which didn't work at all, and then I tried the command line tool, and I, I did not like it. I tried the OpenCode command line tool, also did not like it. How do y'all deal with copy paste in a command line environment? [00:18:09]

So what I have now, I'm using OpenCode. My personal laptop is Windows. My work laptop is Windows, but I work in WSL. So I actually serve OpenCode from within the WSL environment, and then I use a third party web UI for OpenCode. [00:18:24]

OpenCode has a very nice server/client distinction. The text client is just one of the possible clients. So I found one called Code Nomad, which is very good, works great. So I actually serve Code Nomad plus OpenCode on the WSL side, and then I just hit localhost whatever port in my browser on the Windows side, which also avoids any cross-platform file shenanigans as well. So I've got my chat sessions and my tabs open in the browser. That's now my development environment. [00:18:42]

And then I still have VS Code open for looking at some of the diffs and editing. I much prefer a graphical Git client called Fork, but it doesn't work well in Linux, so I've just stuck with the built-in VS Code stuff. I actually don't like it, but I've been too lazy to go find a better alternative. [00:19:16]

Model-wise, I've basically been on whatever the latest and greatest Anthropic model is. We have corporate keys, da, da, da, da, da, da. [00:19:32]

Also, I have somewhat intentionally tried to avoid model hopping. I don't want to be running a bunch of evals every other week and saying, "Oh, this one provides me a 3% increase in such and such a benchmark. Clearly, I need to switch my entire workflow to a different model." And granted, OpenCode does let you just pick and choose what model you want, but I'm trying to get something that's good and consistent and that I know works, not chasing the hypothetical maximum performance. [00:19:39]

Carl: Okay. So you're, you're on OpenCode and using latest Claude Anthropic models just through API usage? [00:20:10]

Mark: O- Opus 4.6, yeah. That's the other thing. Uh, it's, it's API keys, not the various, like, you know, $20, $100, $200 Max plans. So I, I read about people getting the resets and it's like, I haven't had to deal with that because I'm also not paying for it. [00:20:19]

Carl: Right. Cool. Okay. [00:20:32]

Mark: It's, like, acknowledging that my experience may not be universally shared. [00:20:34]

Carl: Okay. Yeah. [00:20:38]

Swizec's workflow, latest Cursor on Opus

Carl: Yeah, Swizz, what's your workflow look like? I'm curious, I'm very curious about yours because I think you're by far the most advanced agent user between the three of us. [00:20:39]

Swizec: Which is really funny because I was surprised to learn earlier this week that 97% of my code is AI because I really talked. I was like, "What?" [00:20:47]

But I actually have an extremely simple setup. I use Cursor, I keep it updated to whatever the latest version is. So when it pops up with, "Please update," I click the button. I think the latest Opus model or whatever it is, it's like something 4.6. I think it's an Anthropic model. We have a Team subscription, so the company pays, gives us an infinite budget, but I think I'm actually using, like [00:20:57]

It's actually funny. I don't know ... We were just talking about this today. I don't know how this happened, but on the Cursor Leaderboard, I have the second most AI usage of all of the engineers, and I have the least amount of dollars used. [00:21:25]

Carl: Fascinating. Interesting. [00:21:40]

Swizec: I have literally spent $25 this month. So anyway, I use Cursor, I use @cursor on Slack a lot, and I use @cursor on GitHub a lot. [00:21:41]

We also use Linear, and I've started more and more delegating my Linear tickets to Cursor. So look at the Linear ticket, be like, "Eh, I didn't write this well," add a little bit more context so that a dumb bot can know where to go fix things or what to change, and then just click delegate to Cursor, and then I review the PRs. [00:21:53]

That's my workflow. I try to keep it really simple and easy, because like Mark said, I'm not looking to super maximize my productivity. I'm more looking for, you know, as a manager, I'm supposed to stay on side quests anyway, so this is a really good way to code between during, uh, between and during meetings when I'm supposed to be paying just like half attention. [00:22:13]

Carl: Yeah, that's really interesting. I, I, I like that you two are like kind of two ends of the spectrum here. So you use it very much in a managerial capacity, like, you know, the same as you would ping a colleague, you ping Cursor instead and just say, "Hey, do look at this." [00:22:36]

Swizec: Yeah, pretty much. I, I've been doing that a lot. [00:22:52]

Carl: And it's all within collaboration platforms, I guess is like a big distinction here. It's all in space- A lot of it is. Yeah. Okay. [00:22:54]

Swizec: We're now experimenting with building, like, a feedback loop. Sentry sends us errors, and we're thinking of having a bot that looks at those errors, figures out how to fix them, and then just issues a PR when we get one, so that we could have automatic PRs- [00:23:03]

Carl: Yeah. [00:23:20]

Swizec: for, at least for some errors. No, like, you know, probably not for super critical, crazy, important things, but there's a lot of errors that happen where it's not that important, but it's nice to fix. [00:23:20]

Carl: Yeah, definitely. Right, the long tail of Sentry issues. [00:23:34]
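A rough sketch of the feedback loop Swizec describes: a small webhook service receives error alerts and hands the low-severity long tail to a coding agent that opens a PR. The severity check and the dispatchToAgent() helper are hypothetical, and real Sentry webhook payloads vary by setup.

```ts
import http from "node:http";

http
  .createServer((req, res) => {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const event = JSON.parse(body || "{}");

      // Only auto-fix the long tail; leave critical errors to humans.
      if (event.level && event.level !== "fatal" && event.level !== "error") {
        // Hypothetical: start an agent session that investigates the error,
        // writes a fix, and opens a PR for human review.
        // dispatchToAgent(event);
      }

      res.statusCode = 204;
      res.end();
    });
  })
  .listen(3000);
```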

Swizec: But when I'm like actually hands-on coding, I use it a lot to write my tests because I know what I want to test and I can describe the situation to set up, but I hate doing the grunt work of setting up your database in just the right way to have 50 models, et cetera, [00:23:37]

Carl: yeah. [00:23:58]

Carl's workflow, mostly Claude Code but looking at custom orchestrators

Carl: Yeah. Okay. Interesting. I, I kind of split the difference between y'all's experiences a little bit, or I'm working towards it ... I feel like I'm closer to where Mark is right now and I'm trying to move towards where you are, Swiz. Yeah, I almost exclusively am using Claude Code with Opus. I've played around a bit with ... Like I, I used Copilot, you know, GitHub Copilot, like once to, like, scaffold a prototype that I was experimenting with. [00:23:58]

I don't like the experience of it being in a web browser. Like, something about it ... I don't know. I also don't like Codespaces or, like, remote development across SSH. So, like, this may be just my own, like, biases and preferences, but I really like just ... It's right here. It's on my machine. It's an environment that I have set up. [00:24:22]

I know exactly what's available to it and what's not. Just, like, I guess I've prioritized, at least where I'm trying to use it like a mech suit for speeding up my own pr- individual productivity, I just want it to be predictable and understandable for myself. To me, that's very much just been Claude Code with whatever defaults pretty much. [00:24:40]

I've played around a little bit here and there, but yeah, I'm trying to work towards ... One of my many projects that I have in flight is a personal, like, orchestrator. So I had started out kind of my foray into, like, earnest AI focused, using it to the extent that I am now, as opposed to more limited reading every line of code, going in and editing it myself. [00:25:01]

Exploring fully autonomous agents

Carl: That's only really been since, like, February when I started playing this AI agent game. And it's been really interesting thinking about ... Because putting an agent in a simulated world and, like, having it autonomously play a game and do so in an interesting way, you know, effectively and interestingly, really just, like, got a bunch of wheels turning in my head. [00:25:23]

And so, like, Claude Code works really well for me for, like, pushing the envelope, like, expanding what a project is, what it can do, and I have really enjoyed using background agents more, like, very much like what you described to us of just, "Hey, oh, fix this. This is broken. Here's this bug." I don't currently have anything like that operational, but I did get my little personal orchestrator working for a minute. [00:25:45]

It was just wor- working on the GitHub API and I just said, like, "Here's a repo, like, go, you know, read the issues, triage them, prioritize them, take a task, open a PR, and wait for me to review it. " And it worked, but I didn't give it any guardrails, so it ended up, like, redoing the same GitHub issue multiple times, or, you know, it would re-review the same PR over and over again with, like, a page and a half of a comment. And so I was like, "All right, okay, this is proof of concept, it works, but, like, clearly I need some additional things set up." [00:26:10]
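The GitHub API loop Carl describes is simple to sketch with Octokit; the missing piece he mentions is a guardrail that stops the agent from picking up the same issue twice. A rough sketch, where the "agent-in-progress" label and the runAgentOnIssue() helper are hypothetical:

```ts
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function triageAndWork(owner: string, repo: string) {
  const { data: issues } = await octokit.rest.issues.listForRepo({
    owner,
    repo,
    state: "open",
    per_page: 20,
  });

  for (const issue of issues) {
    // Skip PRs (the issues API returns them too) and issues already claimed.
    const claimed = issue.labels.some(
      (label) =>
        (typeof label === "string" ? label : label.name) === "agent-in-progress"
    );
    if (issue.pull_request || claimed) continue;

    // Mark the issue so a later run doesn't redo the same work.
    await octokit.rest.issues.addLabels({
      owner,
      repo,
      issue_number: issue.number,
      labels: ["agent-in-progress"],
    });

    // Hypothetical: hand the issue to a coding agent that opens a PR for review.
    // await runAgentOnIssue(issue);
  }
}
```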

And I guess one, another thing I wanna say, like, kind of what you said was about, like, using Sentry Data as an input for what to work on. I think that's the frontier. Like, that's what is currently being explored. Like, how do you do that effectively? [00:26:43]

Swizec: Yeah. You need the feedback loops. So I was reading a lot of "how to build an agent" papers the other day, and the main inputs are basically RAG, memory, and feedback loops, and from there, it can do a lot. So Cursor, for example, Cursor Cloud agents through Slack, they actually, when they're developing the feature, they will fire up a browser and go test it, see if they can actually do the thing, and if they can't, they will then c- continue iterating. [00:26:56]

And in the end, the PR doesn't just have code. It has screenshots and videos of the working feature, which is what I require from all of my engineers, and it makes reviews so much easier. And there was another ... Oh, yeah. So I think the longest I've managed to do, to have it spin when I asked it to build an entire feature. [00:27:27]

So, like, I would write a maybe 200-word, two or 300-word prompt in Slack to @cursor, and it took ... I think it spent 45 minutes to come back with the working feature, and that was amazing, because I was in a meeting for that whole time, and then I just looked at the PR and gave it feedback. [00:27:47]

Carl: Yeah. That's incredible. [00:28:06]

Mark's AI debugging work in React core

Mark: So my day job, we're coming at it from a similar but also sort of opposite angle. We, we have built a time-traveling debugger, and originally the premise was that by making a DVR-style recording, you as a person can go in and do all the investigating in the lines of code and the print statements and everything else so that eventually you as a person figure out why this was broken. [00:28:08]

So we shipped an MCP a couple months ago, and I've already seen some very real examples of agents being able to go in and solve bugs that they wouldn't have been able to otherwise. I actually put up a post about this a week ago. Dan Abramov had filed an actual React bug saying that the useDeferredValue hook sometimes fails in production, it's stuck, like, a render behind. [00:28:28]

And he had a repro and he said, uh, "I've had my agent try to look at it, but I can't find the answer." A month later, he comes back and files like a four-line fix deep in the guts of the React Scheduler to actually fix it. And he posted on Bluesky later, and apparently what he had to do was rebuild the React library with a bunch of console logging added so that his agent could look at the prod build and eventually trace what was going on and figure out how to fix it. [00:28:52]

So I'm like, "That would be a great marketing post comparison." So I took his example, I made Replay recordings of the working dev build and the failing prod build, handed them to an agent, and I said, "Here's a bug report. Here's the two Replay recordings. The issue is somewhere in React. Can you find it?" It took 10 minutes. [00:29:18]

So I'm like, so then I'm like, okay, well now let's make it like, you know, something resembling a proper experiment. So I took the same two recordings and I spun up four simultaneous sessions with differing instructions. [00:29:38]

The first one was just a basic, here's a bug report, go investigate, actually less context than the proof of concept. The second one, I gave it like an eight-step investigative process to follow. The third one had a few paragraphs just naming some concepts in React's internals, like not even file structure, but just things like, you know, schedulers, fibers, lanes. And then the last one also explicitly listed some of the Replay MCP tools we have available. [00:29:50]

All four of the sessions found the same bug and suggested the correct fix. They took respectively 28, 17, eight, and seven minutes. So on the one hand, you know, ob- obligatory sales pitch, replay recording, finding bugs, it's awesome, it's great. We're building some cool stuff with it. But it also showed me a lot about the value of the prompts and the context that you give. [00:30:18]

There was another example I was working on where someone had made a, an example NextJS app with some example bugs in it and was asking an agent to find them. And in one of them, there's a, there's a double loading screen that's caused by mixing and matching Suspense and TanStack Query loading state. [00:30:44]

The AI always suggests the useSuspenseQuery hook, but apparently if you do that, it leads to a hydration mismatch error. And the real answer is to do some server pre-fetching instead. So like you have to think bigger, think architecturally. So I did the same thing. I tried, you know, making some re- Replay recordings and feeding in the, the varying instructions. [00:31:00]

My agents mostly got the same base-level useSuspenseQuery answer. And then I tried one more session where I fed in the NextJS and TanStack Query skills files. And now the agent actually said, "Well, useSuspenseQuery is the initial fix, but you really ought to do server pre-fetching instead." [00:31:21]
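For readers unfamiliar with the pattern the agent landed on once it had the skills files: a minimal sketch of server prefetching with TanStack Query v5 in a Next.js server component, where the endpoint, query key, and Todos client component are hypothetical stand-ins.

```tsx
import {
  QueryClient,
  HydrationBoundary,
  dehydrate,
} from "@tanstack/react-query";
import { Todos } from "./Todos"; // client component that reads the "todos" query via useSuspenseQuery

async function fetchTodos() {
  // Hypothetical data source; on the server this would be a real API or DB call.
  const res = await fetch("https://api.example.com/todos");
  return res.json();
}

export default async function Page() {
  const queryClient = new QueryClient();

  // Prefetch on the server so the client hydrates with data already in the cache,
  // avoiding the double loading state / hydration mismatch described above.
  await queryClient.prefetchQuery({ queryKey: ["todos"], queryFn: fetchTodos });

  return (
    <HydrationBoundary state={dehydrate(queryClient)}>
      <Todos />
    </HydrationBoundary>
  );
}
```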

Value of providing more context

Mark: So again, like that, that taught me a lot about the proper context. That also says a lot as we're trying to build our own CI debugging agent that looks at test recordings, so clearly we need to give it, you know, good prompts, user-based context, all that kind of stuff. [00:31:42]

So, I mean, I think even something like that can maybe help explain some of the differences and results that different people get. [00:31:59]

Carl: Yeah, no, the impact of prompting and available context is, like, so massive. As we're talking workflow, something I do very regularly is, you know, start with a prompt, start with, like, if I'm gonna add a new feature or redesign something or what, you know, a large task, not a bug fix, you know, a sprints project, not a ticket, then I'll start off with, like, a 200-word, like, prompt of, like, "Here's what we're trying to build. [00:32:05]

Here's how I think we will need to do it. Here's a couple of parts that I think are gonna be important." You know, you know, have it explore, do the implementation, review, you know, and then I manually confirm. And then I'll tell it, like ... And I guess I'll do this at a couple of points throughout as it's exploring and, like, evaluating things. [00:32:32]

I'll ask it, like, "How was the documentation? Like, did you find everything you needed?" And that has been, like, totally transformative for, I think for token usage as well, because, like, just by asking that, it'll say, like, "Oh yeah, it was all ... Here's the three files I referenced and I got everything I needed." [00:32:47]

Or, like, "Oh, yeah, this file was out of date. You know, I did the wrong thing at first and then I had to go back and rework it, so I'd recommend that we update this," and it makes those recommendations just based off its own experience of reading the code. And so, like, just by doing that fairly regularly as the code base evolves, I manage to keep, you know, the documentation pretty well up to date. [00:33:04]

Swizec: Are you asking it to write its findings back into documentation or are you updating the docs? [00:33:27]

Carl: I'll read it with a skeptical eye of, you know, like, does this sound like it is a real problem or not, a real area for improvement? And then I'll just say, like, "Yeah, hey, can you update those docs?" Often, I'll come at it with a pretty clear intention of, like, we just revamped how scheduling works. Let's analyze the architectural overview document that we have and see if it's still accurate and then tell it ... [00:33:32]

AI-owned documentation

Carl: You know, so, like, I will do some documentation-specific projects with that in mind, but for the most part, yeah, just, like, ask it how the experience of familiarizing itself with the code base was, and then give it free rein to improve that for the most part. [00:33:58]

Swizec: Yeah. I think one thing I would be worried about with that: we have some people on the team experimenting with that, and they all agree that after a few iterations, like, a few weeks later or a few months later, you end up with very spaghettified documentation that's essentially just AI slop that humans definitely don't wanna read, but even the AI barely wants to look at it and read it because it just keeps getting lower and lower signal. [00:34:13]

So I think ... I don't have a solution for that, but it's a thing I've heard. [00:34:40]

Carl: That's fair. I haven't really evaluated that recently. I did come at it with trying to say, like, "Here's five documents at these levels of abstraction in the code base." So I do try to give it a little bit of overall stru- structural guidance, but that's a good thing to, that's a good thing to look out for. [00:34:45]

I haven't really evaluated it. That's also how I get my own mental model of the code. So I guess, like, I don't necessarily write it, but I will read it, and if it's nonsense spaghetti, then it's like, "No, we gotta do this again." Like- [00:35:03]

Mark: That also ties into the longer term memory question. So, you know, the problem is ev- every session is fresh. [00:35:15]

It knows nothing, you know, every, literally everything is being injected, you know, AGENTS.md, you know, whatever rules files, et cetera. So how does an agent even know that you have these nice architectural docs that you've been keeping up to date? Or how does it know that, you know, last week, a week earlier, whatever, you made these particular decisions, and that's why we ended up here. [00:35:22]

Or it has no idea what any of your code base is. Let's go read 30 files to re-form a brand new, fresh set of memories that compress the context. And some of this can be, can be dealt with partially. There's ... I, I, I highly recommend tools that will do AST-based scanning of the code base and preferring those for loading chunks of code rather than just, like, blindly using whatever built-in read-file tools are there. [00:35:46]

But that is actually where I'm running into the limits of my own personal workflow right now. I have, you know, dozens of feature and research docs that I've generated. I have daily progress docs. I have sub-task handoff docs. There's a lot of very valuable information in there, and the AI has absolutely no clue that any of that exists. [00:36:14]

So I'm n- I'm now f- starting to feel the need that I need to have some kind of, you know, review sweep process, some kind of tool to, you know, index the markdown file, something to form that longer term memory structure, and I keep bookmarking hundreds of tools and saying I'm gonna go investigate them and haven't done so yet. [00:36:37]

Swizec: If only you had a tool that's really good at summarizing large pieces of text. [00:36:56]

Mark: I, no, I, I actually have done some of that. Um, I just haven't actually settled on one and tried installing it yet. [00:37:03]

Using good engineering practices still matters?

Swizec: One thing I found also really helps for context and stuff like that is actually structuring your code well. So having a vertically oriented architecture, small balls of mud that are self-contained rather than one large horizontally sliced ball of mud makes it easier for humans, and it also really works for the AI when you can say, "Go add a feature to this directory." [00:37:09]

And it's, like, basically using your directory structure as an index for where to find different kinds of code. [00:37:36]

Mark: That sounds suspiciously like software engineering practices, and I was told those don't apply anymore. [00:37:44]

Carl: Yeah, right. Like, the main thing that I've been trying to keep in mind here while using it, and, and this kind of goes back to your recent blog post, you know, where you talk about how much code A- AI is writing for you, is to just talk to it. Like, these models were trained on things like GitHub discussions and PR reviews and code comments and whatever. So, like, it's not some new, esoteric, crazy, unpredictable thing. Like, the more you talk to it just like it's a competent engineer, the more it will behave like a competent engineer. [00:37:50]

And so it has all of these assumptions about, you know, social norms within the context of, you know, text documents describing code and documentation. And so the, the more you understand the social norms that are deeply baked into its training, the better it does. So, like, yeah, like, I saw these, you know, memory startups where it's like, "Oh, we will automatically generate documentation for your code base and make sure it's up to date." [00:38:22]

And it's like, I'm already doing that with a README document tagged with a git hash of when it was last updated, for free. Like, if you have a couple of folders, a couple of directories full of code that is, like, pretty well isolated with a README document in there, like, I don't need to tell the AI to look for the README because it knows to do that. [00:38:49]

Of course, it knows to look for a README. So there's definitely a lot of difficulty with keeping the signal high, but, like, that's not new. I don't know. Like, that just sounds so much to me, like, working on a team of eight engineers and, like, not everyone has the same shared context. Not everyone reads every email or every Slack message. [00:39:09]

And so, like, as far as the challenge of, the, the challenge of keeping a team up to date on best practices and recent decision making, versus keeping an AI at the same level, like, that, those feel really similar to me. It's just engineering best practices and communication norms. [00:39:30]

Swizec: With, with the major difference that AI will actually go read the documents you ask it to read. [00:39:50]

Carl: Right, right. Like, one of my, like, you know, war stories was I was a contractor somewhere and I kept having to do rework because, you know, QA was overwhelmed and, like, you know, they finally get around to reviewing PRs and, oh, it's broken. Oh, gotta go back. Oh, there's all these conflicts now because ... [00:39:57]

And so, like, despite being a contractor, I sat down. It's like, no, we need to fix our review process. We need to fix our merges and whatever. Spent, like, two weeks doing that, get everyone on the same page, man. Next day, the, you know, tech lead employee at the company, the guy who should have been doing that process is like, "Nah, this is too much. Like, I'm just gonna force push." And he took down production. [00:40:15]

So it's like, uh, when people talk about AI doing stuff, it's like, "I don't know, man. Have you ever worked with engineers?" [00:40:37]

How do you know the right code to make?

Mark: Uh, so this, this does tie into a lot of the conversations that I had last week. I, I was at both the AI engineer and React Miami conferences. [00:40:47]

And the general theme of the discussions there between the talks and the people and then, you know, all the different chatter online is we built ... And I also ranted about this in my 15,000-word unpublished blog post. We built a bunch of software development practices for people, the Agile Manifesto, Linear issues, PR reviews that require multiple people to stamp them or, you know, as a way to share knowledge, standups, like these are all people-based processes, and it made sense when humans were the limiting factor and you have to be very intentional about making sure you're writing the correct code. [00:40:53]

Like, not just does the code work, but are we even spending the time and effort on the right thing in the first place? And we now essentially have a generate infinite amounts of code button, and now we're finding out that we've moved the bottleneck further down the chain into all the verification and QA sides of things. [00:41:32]

And so this shows up in, oops, GitHub is overwhelmed because we've literally 10Xed the number of PRs that are being pushed or, wait, we've got a bunch of senior engineers, but they're literally spending all their time reviewing PRs generated by the junior devs with AI, much less what happens when you have a dark factory of agents cranking out code twenty four seven and no one ever looks at it. [00:41:51]

Like, we haven't figured out what the resulting processes ought to be. And one of the points I make in my blog post is that I think right now we're all operating under the assumption that there's no limit to how fast we can go. In reality, we're gonna figure out the actual limit is maybe, like, 3X of what it used to be, but we need to, like, accept that and plan for it instead of thinking that if we shove everything in one end of the pipe simultaneously, it all comes out the other end. [00:42:15]

Swizec: I have opinions. Um- [00:42:46]

Good communication still matters

Swizec: Yes. The thing I wanted to say first is I really like that, Carl, you mentioned that you start sessions with a really long 200-word prompt that gives all the context, doesn't just tell the agent what to do, but also why it's doing it and, like, other context. Same. This, it turns out, works really well with humans as well. [00:42:49]

They give you much better work if you tell them why they're doing it or what the goal is. And sometimes if you tell, if you tell them what the goal is, they might even come back with, "Your solution is dumb. We should do this other thing instead," which, wow, can you imagine using engineers' brains for thinking, not just writing code? [00:43:08]

But I think the other thing is, we are generating a lot of code, and we think the bottleneck is code review. I think there's kind of two bottlenecks. One is that when you're moving really, really fast, you have to actually be more careful about working on the right things because if you're just digging yourself into a hole faster, we just had an example recently where we broke Agile, we were like, "Oh yeah, we know what we, what we need to do. Let's just have Claude crank out 6,000 lines of code in one gigantic PR." [00:43:24]

First of all, we couldn't review that, so we had to break it up into a bunch of sub-tasks and then turn that into sub-PRs. Turns out that took another extra day of work and everyone was distracted while doing this, so we weren't focusing on what we were actually supposed to be doing. So I think distraction is still bad. Multitasking is still bad. We're not as good at multitasking as we think we are. [00:43:55]

And the other thing is we, when we split all of that up, we found that Claude made a tactical mistake very early on, chose the wrong architecture for how to build something with React. And now you have a lot of code that needs a lot of rework to actually make work, and you're wasting reviewing time, you're wa- you're wasting a lot of time that could have been solved if we started with a small task to set up a small thing and talked about it, or maybe even talked about the architecture before we wrote six, seven thousand lines of code, and then just wrote the correct code the first time. [00:44:18]

Carl: Yeah, 100%. We're hitting communication bottlenecks much faster now. Yeah, like Mark said. [00:44:54]

Mark: Communication and thinking and planning was always the job. [00:44:59]

Carl: Yeah. [00:45:03]

Swizec: Yep. Except now you're running, like, 10 times faster. So if you take the wrong step or as your first step, you're just running in the opposite direction of where you wanna go, but really, really fast. [00:45:03]

What will "review" look like in the future?

Carl: Right. Yeah. Right. It's very easy to end up in extremely the wrong place in exactly the same amount of time. That touches on something I've said before on this podcast. I think that the future of, like, review is gonna be less code review. The architectural side of things is still gonna be important. Like, it's not just, does it work? [00:45:13]

Does it do what you intended? You can't fully treat it as a black box. The box has to be translucent, you know, at least a little bit to avoid major architectural problems that are gonna cause a production outage or make it impossible to scale or whatever. Like, it's not all going to be, does this function. [00:45:32]

But I do think "does it function" is going to be a much greater part of this. My sense is that it will rely a lot less on reviewing the code so much as getting a plausible mental model of the architecture, and then, you know, statically verifying that it does what's intended. So, like, my, my theory here is that we're gonna start seeing a lot more end-to-end testing and user acceptance. [00:45:49]

Automating functionality tests with deployment practices

Carl: A thing I did recently that I haven't fully executed on yet, but ... Well, I, I did a lot of it. But so a, a project that I have, I fully rewrote the CI/CD so that instead of PR checks, review, approve, merge, ship, instead of, you know, that, like, trunk-based development, where every PR gets shipped individually, I moved to more of a, like, Git Flow-type architecture where it's like, no, here's the version. This is going to go out. Uh, here's our release candidate. And then that single release candidate gets pushed to a staging environment, and then I do the user acceptance testing of does this function as I need it to. And then once I have that, once I've done that manual testing, great. I know this code is working as I meant it to. So now I'm going to have it write a bunch of end-to-end tests to verify that it will continue to do that when I make changes. And I think that's gonna be kind of the new cycle of, like- [00:46:17]
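A minimal sketch of the kind of end-to-end test Carl describes generating after the manual acceptance pass, using Playwright; the URL, roles, and labels are hypothetical.

```ts
import { test, expect } from "@playwright/test";

test("scheduling flow still works end to end", async ({ page }) => {
  await page.goto("https://staging.example.com/schedule");

  // Walk the same path that was verified by hand on the release candidate.
  await page.getByRole("button", { name: "New event" }).click();
  await page.getByLabel("Title").fill("Community call");
  await page.getByRole("button", { name: "Save" }).click();

  // Lock in the behavior so future changes can't silently break it.
  await expect(page.getByText("Community call")).toBeVisible();
});
```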

Swizec: I want that, but on every PR. [00:47:10]

Carl: Yeah, right. And I guess to me, like, that's where manual testing is now the bottleneck. [00:47:12]

Like, you cannot ... At the end of the day, this needs to function, this needs to solve a problem. Any code you write needs to solve a problem that real humans have. Or I guess, you know, sure, other agents or whatever. [00:47:17]

Swizec: I've experimented with automated end-to-end testing. It doesn't quite work yet. I think the models are, or at least the models I was trying were too slow, but the idea is that instead of telling that [00:47:28]

Instead of writing a test as, like, what features should be there, it's more about, can the user do a thing? And then because you have a computer use model, it can look at the browser and it's automatically testing both your UX. Like, if the model can't figure out how to do what your user is supposed to do, drunk users won't figure it out either, or tired users or whatever. [00:47:39]

So you just point it at the browser and you'd say, "Go, like, go place an order." And at the end of the day, as long as it can place the order it wants to place, you can keep iterating on the implementation, you can keep iterating on the design, you can keep iterating on the UX, you just get tests that are based on can the agent place an order. [00:48:03]

Carl: Yeah. Oh, that's really interesting. That's, that's a really good variant on what I was just kind of talking about because, like, sure, static end-to-end tests that are verifying, like, this button with this test ID exists and the flow completes as expected. Like, sure, okay, it functions, but is it usable? [00:48:23]

That's interesting, automated usability testing, because that kind of goes back to what we were talk- talking about with, like, documentation and code notes. Like, just make it intuitive. Can an agent intuit how to achieve a general task? Oh, that's really interesting. [00:48:38]

Swizec: If you've ever used VCR in Rails, that's kind of where I got the idea. [00:48:52]

With VCR, you do full integration testing with calling remote APIs, but you don't wanna call them every time, so you record the responses and you keep them locally. So you then write tests against those local responses, but if the API ever changes or anything, you just regenerate them and you find all of the bugs in your code that don't fit the new, newly released API. [00:48:57]

And you could do something similar with this where you have browser-based acceptance testing, and it records what it's doing so that you don't have to spend as many tokens every time. And you just d- when you change the UI, you delete it and you see if you can do the flow again. [00:49:19]

Carl: Yeah. Totally smart. Interesting. [00:49:34]

Swizec: Doesn't quite work yet, but that would be really cool. [00:49:36]
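The VCR pattern Swizec references translates to the Node ecosystem roughly like this, using nock's fixture mode with Vitest: record real API responses once, replay them on later runs, and delete the fixture to re-record when the API changes. The fetchOrders() module is hypothetical, and nock only intercepts HTTP clients it supports.

```ts
import nock from "nock";
import { test, expect } from "vitest";
import { fetchOrders } from "../src/orders"; // hypothetical code under test

nock.back.fixtures = new URL("./fixtures", import.meta.url).pathname;
nock.back.setMode("record"); // use "lockdown" in CI to force replay only

test("fetchOrders matches the recorded API contract", async () => {
  const { nockDone } = await nock.back("orders.json");

  const orders = await fetchOrders();
  expect(orders.length).toBeGreaterThan(0);

  // Persist the recording (first run) or confirm the replay matched.
  nockDone();
});
```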

Carl: We've kind of been going into this direction, but let me, like, sort of, like, you know, refocus a little bit. Like, so what do we think the current landscape of tools, of AI tools looks like? We've kind of been discussing that in, in, we've been dancing around it. And I guess where do we think it's going next? [00:49:38]

Mark: Going or ought to go? [00:49:55]

Carl: Either, both. Both. [00:49:57]

Mark: My general inclination, as the person who found one tool set and is sticking with it and is not actively trying to change what he's doing ... [00:49:58]

Certainly what I'm seeing online is turning towards automate all the things, which I think is a very understandable impulse for a software developer. If you can do things once, you can do it in a loop, you can run it in parallel, you can distribute it, you can have an agent do it, et cetera. [00:50:06]

And I think it kind of ties into the "go as fast as possible at all costs for the purpose of going as fast as possible" direction that I see. I mean, like this bit about, you know, the dark factories, the idea that a team never even looks at the code, that it's truly all agents all the time, and all you say is, "Do something that accomplishes some goal," and you never even look at it, is sort of an ultimate outcome of that. [00:50:21]

I don't think it's a good one, but I see how it's the idea taken to a logical extreme. [00:50:48]

Carl: I, I also have been pretty ... I found a tool and I'm sticking with it. I've been noticing Claude Code pulling in lots of ideas that I've just been thinking. Like, as it evolves, it's like, this is a productized version of how I've already been using it. [00:50:53]

You know, like they added a /btw command that lets you ask a question, and it won't use tools and doesn't go into the context history. So you can just do a little aside. Like if you wanna ask, like, "What's this thing doing?" I had already been doing that by asking it a question and then going back and, you know, restoring the conversation to that point. [00:51:06]

And similarly, two months ago, I would actually spin up a Claude Code instance that was on Haiku because I just needed to summarize something, it's easier, and I wanted to manage my usage a little more precisely. And, like, now I don't have to do that, because I just say, use a subagent, and Claude Code is smart enough to say, "Oh, I'm just summarizing. Let me use Haiku." [00:51:27]

What behaviors belong to the agent, and what fundamentally can't be part of the agent?

Carl: So I've just been watching kind of what the agent itself is doing versus what I need to tell it to do, or proactively manage, converge a little bit. So that's had me wondering: what, outside of the agent, is not going to live within the agent? And I think where I've landed on that is, like, everyone's talking about memory the last six weeks or so, and I think that's wrong. [00:51:49]

I think that's just gonna be conventions and documentation. It's just knowledge. I think what's going to persistently live outside of an agent harness, as it were, is, like, incoming information, like a data stream: GitHub issues, texts, you know, the Slack chat, a Sentry alert, a production alarm. If you have something that is categorizing those and prioritizing them, and then giving that to an autonomous agent that is just working, like, those are different things. [00:52:12]

I think you cannot have a single generic agent that is able to connect to every possible data source. So that to me feels like a boundary that's gonna be pretty strong for the coming months. [00:52:44]
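A rough sketch of the kind of boundary Carl is describing: per-source adapters normalize events into one shape, a prioritizer orders them, and only then does anything reach the agent. All of the names and shapes here are made up for illustration.

```typescript
// Sketch: an incoming-information layer that lives outside the agent harness.
// Each data stream gets its own adapter; the agent only ever sees normalized,
// prioritized work items rather than connecting to every source itself.
interface WorkItem {
  source: "github" | "slack" | "sentry" | "pagerduty";
  title: string;
  body: string;
  priority: 1 | 2 | 3; // 1 = page someone now, 3 = backlog
}

// One adapter per data stream.
interface SourceAdapter {
  poll(): Promise<WorkItem[]>;
}

// Hypothetical handoff to whatever autonomous agent does the actual work.
declare function dispatchToAgent(item: WorkItem): Promise<void>;

async function triageLoop(adapters: SourceAdapter[]): Promise<void> {
  const incoming = (await Promise.all(adapters.map((a) => a.poll()))).flat();

  // Highest priority first; everything else waits its turn.
  incoming.sort((a, b) => a.priority - b.priority);
  for (const item of incoming) {
    await dispatchToAgent(item);
  }
}
```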

Swizec: I don't know. For me, I feel like going super deep into what, at least on the internet, a lot of people are talking about, the various MD files and orchestrations and all the crazy stuff ... [00:52:55]

I feel like that's a little bit like IDE skins and Vim shortcuts. It's like, yeah, sure, it's cool, but I think you're overcomplicating it. The models are gonna keep getting stronger and better, so I would just think they're gonna be able to do bigger and bigger tasks. I think feedback loops, like what you said, what is outside of the model? [00:53:06]

Building really strong feedback loops is important. I think memory, documentation inside the code, or making the code itself easier to research, or giving it MCP tools to, like, Notion or Linear or wherever you keep your organizational memory. Having access to organizational memory will help agents as well. [00:53:25]

Yeah. I don't think we're ever gonna come to full automation of anything, really. I think there's a strong Gell-Mann amnesia when it comes to these things, where agents will fully automate every job except the one I'm super familiar with, because that one is just way too complicated and there are too many details. [00:53:46]

Everyone feels that way about their job because, newsflash, they're all too complicated. [00:54:06]

Carl: I wanna explain that reference briefly for people who aren't familiar. Gell-Mann amnesia refers to the idea of, like, you're reading a newspaper and you see a story about your professional industry or something you're very knowledgeable about, and you see everything it got wildly wrong, and you're like, "Who even wrote this? What are they doing? These incompetents." [00:54:11]

And then you read the next article, on something you're not familiar with, and you just take it super credulously, and it's like, "Oh, yes, look at this," you know. I think that's a pretty good way of thinking about how people are talking about AI right now, yeah. [00:54:29]

Swizec: And I think, from my perspective, go hard as long as you can. I think the bubble is going to collapse in the next one to two years. Tokens are gonna become super expensive, and we're not gonna use as many of these tools long-term as we are right now, but we're gonna have a lot of infrastructure left over, just like we did after the dot-com bust. [00:54:41]

Carl: That's part of why I've been really interested in local models, because I agree with that. Like, they're being heavily subsidized right now, and in, I don't know, about two years, we'll see. People are talking about it, and it feels like the wheels are falling off right now, with, like, Claude Code aggressively clamping down usage on the $20-a-month plan. [00:55:02]

Swizec: They grew from nine billion ARR to 30 billion ARR in a quarter, so I imagine they have some scaling issues. [00:55:22]

Carl: Right? Yeah, clearly. [00:55:30]

Mark: One of the points I made in that draft blog post that I'm hoping to publish this week: the cat's out of the bag. Like, even if OpenAI and Anthropic were to utterly and completely collapse and go out of business today, the technology exists. [00:55:32]

And whether it's, you know, boutique hosting of models or local LLMs, the technology is not going to go away. And even if it never gets better at all, even if the capabilities stay exactly the way they are today, it's still capable enough to upend large parts of our society. So then the question becomes, what are you doing with it and how do you respond? [00:55:46]

Carl: Well, that's a good ... We should wrap up soon, but that is an excellent jumping off point for our last little subject here. [00:56:12]

Impacts of LLMs on software engineering?

Carl: What do we think the impacts of LLMs will be on software engineering as an industry, and maybe more broadly, societally, if we wanna go there? [00:56:17]

Swizec: I would say it's raising the bar for everything. [00:56:30]

Like, what counts as a minimum viable product is gonna get more fancy and more complicated. What counts as a junior entry-level engineer, or what the expectations are for a junior entry-level engineer, is gonna go up, and I think we're gonna need a lot more product sense and a lot less being just really good at writing code. [00:56:32]

Mark: Yeah, I'm inclined to agree. I am honestly worried about the impacts on society as a whole. We've seen lots of debate and studies about, you know, have cell phones effectively destroyed the minds of youth and the ability to focus, and electronic bullying, and, like, why did teens get very depressed starting in 2012? [00:56:54]

Like, that sort of thing where a technology possibly had large-scale societal effects. And I'm not saying that LLMs are, you know, inherently destructive or inherently bad, but a lot of times a technology gets invented and then it changes society in ways we were unable to predict early on. In this case, I think we can predict a number of ways, and there's probably a whole lot of other stuff, too. [00:57:14]

So I genuinely worry about college students being able to get away without developing critical thinking skills, even more than they were able to get away with it before. I have concerns about that sort of thing. And, you know, I'm towards the tail end of my career, I've done my thing, I learned, I gained my experience. [00:57:40]

I don't know how junior devs are going to gain some of that experience. Maybe it is they need different experience. I don't know what the job pathways look like at that point. So, like, I mean, don't get me wrong, I'm excited to be able to use AI to crank out some of the things that I've had in my head. [00:57:58]

I've found some of the same joy in being able to have a multi-hour flow session and build stuff just with AI editing the files instead of me and my fingers. But I can also look around and say, "Yeah, there's a bunch of unexpected consequences floating around as well." [00:58:14]

Swizec: I mean, in terms of experience, it's not like I know how to code without a compiler, and I would definitely not enjoy coding without garbage collection. [00:58:31]

Carl: Yeah, I think that's closer to my theory on this. I think, you know, the societal impacts are gonna be a lot. We'll see. I think the societal impacts will come mostly not from software engineering, so I don't understand those quite well enough to opine on them too much. I do care a lot about, like, sociology and economics and whatever. [00:58:41]

So I do have thoughts and opinions, but maybe it's my Gell-Mann amnesia of, like, I know those well enough that I'm like, "Nope, that's way too complicated. I have no opinion." But I think we take for granted a little bit how much we're standing on the shoulders of giants already, and, like, this is a big giant to climb the shoulders of, for sure. [00:59:02]

Like, not trying to discount that. People talk about how it's gonna change our brains and, like, you know, destroy humanity and whatever. And, like, I don't know, we kind of did a lot of that already. Like, can you navigate in your city without GPS? There's a lot of stuff that we've already lost that used to be, like, canonical. You know, you used to have a navigator. If you wanted to do a road trip, you would have spent a week planning a route, and then, like, God help you if you missed a turn. Same thing for, like, finance and the spreadsheet. I don't know, this is gonna be a more broadly applicable thing than any of those past technological leaps. [00:59:21]

Another thing that I learned recently, going on the societal level again: in 2023, there was, like, an international literacy survey done, with levels and whatever. And I think level one was, can read, but struggles with, like, finding knowledge in text or following multi-step instructions. [00:59:58]

And 28% of Americans, you know, were tested at that level of literacy. And, like, it said anything below level three is considered partially illiterate. And the stats were half, half- [01:00:20]

Swizec: 54% of Americans read at less than a sixth-grade reading level. They will not do well with LLMs. [01:00:35]

Carl: Right. But, like, you know, we did that before LLMs. [01:00:42]

That was 2023. Like, okay, sure, there's gonna be weird shit going on since then, but, like, God, we already ... Like, as far as destroying society, like, we've done a pretty good job of that without AI. [01:00:45]

Swizec: And for context, that's 54% of adult Americans. It doesn't, like, count children and stuff. [01:00:57]

Carl: Yeah. [01:01:03]

I don't know. It's just, like, anytime I hear people talk about, like, the downfall of civilization or whatever, it's like, "I don't know. Have you looked at civilization lately? It's not doing so great." So, I don't know, it is gonna be a lot. It's gonna be challenging and difficult. I guess going back to software engineering, you remember the '90s and 2000s, 2010s era stereotypes of software engineers: like, super autistic, zero social skills, they are a human computer. [01:01:03]

And now we don't have that anymore. We, you know, democratized software engineering and now everyone does it. We've always been trying to get something basically like an LLM. Like, that's what people wanted out of software engineers this whole time. So, I think it's gonna be ... I like the analogy of, like, compilers and garbage collection and stuff, because it's just a new tool. [01:01:31]

It's a new level of abstraction, and now it's much further removed from the hardware. But, yeah, I don't know. People talk about how it's gonna destroy, like, the upskilling, it's gonna destroy junior engineers, and there's not gonna be any pathway in. And, like, I don't know, maybe that's true, but at the end of the day, I'm gonna end on, like, a positive thought or, like, advice. [01:01:54]

Like, this is all communication. It's just more and more communication. If you wanna lock in your software engineering career, get good at communicating, get good at deeply understanding and articulating a problem. Not the code, not the patterns necessarily, but, like, understand what an architecture is good at or bad at, or what the performance profile of this or that is. [01:02:18]

And, like, now, you don't even need to understand it at the same level. Like, part of why I've been really excited about AI is because I have such broad and shallow expertise, and now I can use it. Like, it used to be that I had all this vague, notional expertise in a wide range of stuff, and it's like, "Damn, I wish I could use that, but I would need a team of 15 people in order to be productive with this." [01:02:45]

And now I don't. So, like, I don't know, that's exciting for me. And I think that's just generally exciting. So if you're just good at communicating and you learn a domain and you get good at articulating problems within that domain, then, like, I think you're gonna be fine. [01:03:08]

A superpower right now is a domain expert who can kind of code

Swizec: I think the superpower right now is being a domain expert who can kind of code. [01:03:22]

Carl: There is certainly still value in, you know, greater engineering expertise and understanding how to ship stuff to production. [01:03:28]

Swizec: At some scale, that becomes a domain expertise. [01:03:34]

Carl: True. Yeah. Extremely true. So yeah, I don't know. [01:03:37]

Mark: Uh, I'm excited. I [01:03:41]

Swizec: can build more stuff. [01:03:42]

Carl: I can build so much ... I have built so much more. [01:03:43]

I've explored so many more things and gone, "Oh, this is harder than I thought. I'm gonna set it down." Instead of just never doing it and, like, fantasizing about what it could have been for years, you know? [01:03:45]

Mark: Yeah. Same. [01:03:54]

Carl: All right. Cool. Thank you all. Thank you both for joining me. Thank you everyone in the audience for listening. [01:03:55]

Appreciate it a lot. [01:04:00]

Mark: Hopefully this was informative. [01:04:01]

Carl: Cool. All right. Well, this has been This Month in React, quote unquote. I think we may be exploring different formats in the future because, I don't know, who's following individual software library development that closely anymore? But yeah, thanks so much for listening. [01:04:02]

If this is a show that you get value from, please send it to a coworker, send it to a friend, leave us a review. And yeah, I'll see you next month. [01:04:17]

Mark: All right. Take care. [01:04:25]