Nicholas

The Codex feature that works while you sleep

In this 30-minute episode, I walk through my favorite feature in Codex: the /goal command. I show how Goals transform AI from a turn-based assistant that needs constant ‘what’s next?’ prompting into an autonomous agent that can work for hours on complex, multi-step tasks. I share three real examples: eliminating thousands of Sentry errors, cleaning 3,900 emails down to 68, and organizing hundreds of Linear tasks.

What you’ll learn:
- What Goals are and how they differ from standard prompts
- How I used /goal to eliminate hundreds of error logs in my codebase over a five-hour autonomous run
- The non-technical use cases that make Goals incredibly powerful: cleaning up 3,900 emails in under four hours and organizing hundreds of project management tasks in Linear
- How to write effective /goal prompts with measurable outcomes, verification methods, and constraints
- When not to use Goals and what makes a strong versus weak Goal
- Why Goals represent a fundamental shift in how we work with AI, from babysitting the model to managing it

Brought to you by:
Mercury: Radically different banking loved by over 300K entrepreneurs

In this episode, we cover:
(00:00) Introduction
(01:50) What is /goal and when should you use it?
(02:45) The difference between prompts and Goal-based loops
(04:06) Claire’s first five-hour 45-minute autonomous coding task
(05:05) How to manage a Goal lifecycle: view, pause, resume, and clear
(06:06) How to write strong goals: outcomes vs. outputs
(07:34) The six components of effective Goals
(08:57) Example: Reducing P95 checkout latency with /goal
(09:36) Demo: Using /goal to eliminate Sentry errors in ChatPRD
(13:18) Demo: Burning down Vercel API errors
(17:28) Non-technical use case: Cleaning 3,900 emails with /goal
(21:24) Demo: Using /goal to clean up Linear project tasks
(24:41) When not to use /goal
(26:10) Why /goal changes everything

Tools referenced:
• Codex: https://openai.com/codex/
• Sentry: https://sentry.io/
• Vercel: https://vercel.com/
• Linear: https://linear.app/

Other reference:
• OpenAI blog post “Using Goals in Codex”: https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex

Where to find Claire Vo:
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
X: https://x.com/clairevo

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [redacted email].

Published May 27, 2026
Uploaded Jun 12, 2026
File type: Podcast

Full transcript

AI-generated transcript with timestamped sections.

0:04-1:34

[00:04] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, I'm gonna walk through my favorite feature in my most recent favorite AI product: [00:18] Goals in Codex. If you've been wondering how all these people on the timeline are getting their AI to run quote unquote overnight [00:27] or handle very complex long-running tasks, I'm going to show you that Goals is the answer. We're going to walk through what it is, how I might use it, and a technical use case, along with some non-technical examples of how Goals can help you even if you're not coding. [00:42] Let's get to it. This episode is brought to you by Mercury. As an AI founder, I'm constantly tracking run rate, watching revenue growth, paying vendors, and making sure I'm getting paid on time. [00:55] Mercury makes all of it feel effortless. The app is genuinely beautiful. It actually looks and works like modern software, which sounds obvious, but apparently isn't when it comes to banking. What I use it for the most? Bill pay for my vendors is just clean and easy, and wires and transfers, getting paid from clients, moving money: Mercury makes it so simple. Everything [01:25] wondering if something went through. I think about how much I've optimized every other tool in my stack, and [01:31] Mercury is the one where I don't have to think about it at all.

1:34-3:05

[01:34] It just works. Visit mercury.com to learn more and apply online in minutes. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC. [01:50] Before I go into how to use /goal, I want to talk about what a goal is, when it's appropriate, and when it's not the right tool for the job. [01:59] So I'm looking at this blog post by the OpenAI developers team. It's called Using Goals in Codex. [02:05] And the first thing that they have in this blog post is this awesome diagram that talks about the difference between a prompt [02:12] and a goal-based loop. [02:14] A prompt, you all are used to this. It's the turn-based request that we're all used to. You ask the LLM, the model, the harness [02:21] to do something, it works, it returns its result to you, [02:26] and then it waits for you to prompt it again. If you're like me, the number one thing that you're saying in your coding tool is, [02:33] "OK, what's next?" And then it tells you and you say, "Great, do it." If you find yourself in that process, [02:40] using /goal in Codex might be a tool that you want to add to your toolkit. So what's the difference between this turn-based, one response, wait, and a goal? Well, [02:51] with a goal, when you give Codex a goal, [02:53] it actually has something that it can work towards, and it will continue to loop to the next step [02:59] and verify until it can measure that it has met that goal. And so if you look at this,

3:05-4:36

[03:05] the goal is the overarching [03:07] description of the outcome that the model wants to get towards. It will work, it will check its work, [03:15] it will decide the next step, and it will continue that three-step process until it can gather evidence that it has met [03:23] the goal. Once it gathers evidence that it has met the goal, [03:26] it will mark the goal as complete, [03:29] and then it will tell you it is done. Now, if you've been watching [03:34] people online talk about how they get these long-running autonomous tasks out of Codex, Claude Code, etc., you're really talking about people who are using some framework of this goal. There's also a version of this called a Ralph loop that people were talking about. But functionally, the framework is the same. [03:54] It's saying keep going until [03:57] x behavior or y outcome is validated. Otherwise, I want you to re-prompt yourself and re-prompt yourself until you're there. [04:07] And what's really fascinating about goals: I've been using [04:09] AI coding agents for many years now, and until Codex and /goal [04:14] I was not able to get these multi-hour, long-running, autonomous tasks. Now, [04:21] I don't have the most complex coding tasks in the world. [04:24] I'm not building an operating system. I'm not doing complex mathematics. [04:29] And so part of that was [04:31] my problems were pretty well constrained, but I did have things that I thought a long-running

4:36-6:07

[04:36] harness could really help me with. [04:38] But until /goal was part of the Codex tool, [04:42] I really just wasn't able to get my AI to self-manage enough [04:48] to do that autonomously over time. [04:51] But the first time I used /goal, [04:53] I was actually able to get a coding task running for about [04:56] five hours and 45 minutes, [04:59] which is longer than I've ever had anything run before. Now, a quick introduction on how to use /goal. There are four sort of ways to manage the lifecycle of a goal. [05:12] The one that I use is /goal, and then I walk away. So if you write /goal [05:18] and then prompt it with your goal, it will start working. You can use /goal to see what the current goal is. Again, you can pause the goal, you can resume the goal, and then you can remove the goal. So [05:29] you don't have to let your AI run for 6, 12, 24 hours, whatever. If it gets on the wrong track, you can absolutely manage the lifecycle. [05:38] But it's a really useful tool. And I love that they give the example here because this is 100% what I spend most of my time doing. They say you really want to use goals when you would otherwise find yourself saying the same thing after each turn, like keep going, try the next thing, run it again, now run the test, continue until it's actually done. [05:58] So if you're micromanaging [06:00] your AI and having to tap it on the shoulder and say, can you pretty please go to the next step, [06:05] /goal is for you.
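
The loop described in this section (work, check the work, decide the next step, and repeat until there is evidence the goal is met) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Codex implements /goal: `GoalRun`, `is_met`, and `max_steps` are names invented for illustration, and the "work" is just a counter of remaining errors.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GoalRun:
    """Toy model of a goal-based loop (hypothetical, for illustration only)."""
    is_met: Callable[[int], bool]  # verification: is there evidence the goal is met?
    max_steps: int = 100           # safety stop so the loop cannot run forever
    log: List[str] = field(default_factory=list)

    def run(self, state: int, step: Callable[[int], int]) -> int:
        for i in range(self.max_steps):
            if self.is_met(state):          # check before doing more work
                self.log.append(f"goal met after {i} steps")
                return state
            state = step(state)             # do one unit of work
            self.log.append(f"step {i + 1}: state={state}")
        self.log.append("stopped: step budget exhausted")
        return state


# Hypothetical scenario: 10 errors remain and each step fixes up to 3 of them.
run = GoalRun(is_met=lambda remaining: remaining == 0)
final = run.run(10, lambda remaining: max(0, remaining - 3))
print(run.log[-1])
```

The contrast with a turn-based prompt is the `for` loop itself: nobody has to type "what's next?" between steps, and the only exits are verified completion or an explicit stop condition.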

6:07-7:39

[06:07] Now, how do you prompt and design a real goal? This is where product managers tune in, engineers that write success criteria tune in. This is where those skills in setting really measurable, well-defined goals are [06:21] coming into play, because [06:23] when you prompt something, [06:25] you're really just saying do this task, right? Like rewrite this code, redesign this page, etc. [06:31] When you're talking about a goal, you want to talk about what the outcome is if that task was [06:39] successful. And the technical example that they give here in this blog post is reducing P95 checkout latency. So if you know that a specific page is loading kind of slow and you want to reduce that [06:51] below a threshold, and you know that can be measured because you can just load the checkout page over and over [06:57] and over again, and then you create a guardrail on it, [07:01] like keeping the correctness suite green, that is a really great goal. It's measurable, it's testable, it has a guardrail on it, and [07:09] there's an executable surface area that you know an LLM can be [07:14] successful on. [07:15] Writing goals is its own skill set, but [07:19] OpenAI has given a really great outline of what makes a strong goal. And again, product managers, let's pay attention if you've written [07:27] an OKR. Developers, if you've argued that an OKR was not well written, [07:32] this is where those skills come into play. The strongest goals, I mean, for anything, but in particular for Codex,

7:39-9:31

[07:39] kind of have six things as part of them. An outcome: what should be true when the work is done? So [07:45] once we're done, what is the outcome we're trying to deliver? [07:48] Verification: how can you test it? Do you have a test suite? Do you need to pull up the browser? Is there a number that you're trying to get to, or a measure? [07:58] Constraints: what can't regress while Codex works. For example, on our P95 checkout latency, you could delete the page and the latency goes away, but that's not what you want. So you want constraints: you want the features to stay the same, [08:10] you want particular technologies to stay the same. [08:13] The boundaries: what tools and files and things it's allowed to use in pursuit of this goal. [08:20] The iteration policy: how it should decide what to try next, [08:24] kind of, what would you try next? [08:26] And then when it should stop and say, sorry, [08:30] I just can't continue, I don't have a good [08:32] next idea. And they give this great pattern here, which is /goal, [08:37] my end state, verified by specific evidence. I need you to preserve these constraints. Please use these tools. Between iterations, [08:46] decide the next step by doing X, Y, and Z. And if you're blocked or no valid paths remain, [08:52] this is what you should do next: you should tell me, you should report, you should ask me for help. [08:57] And so they give an example of how to make this P95 checkout latency goal a lot better. [09:03] And it's basically by saying, [09:05] bring it below a threshold, which was already in the original prompt, but you're going to verify it by the checkout benchmark. You're going to keep the correctness suite green. You're going to use only the checkout system. Between iterations, you're going to tell me what changed, what the benchmark showed, and the next experiment to try. And if you can't come up with something else, stop and give me the evidence, the blocker, and what you need from me. This is a really great goal. And this is a technical goal.
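
One way to keep the six parts from getting skipped is to fill them in as fields and assemble the /goal text from them. The sketch below is an assumption-laden illustration: `GoalSpec` and `to_prompt` are hypothetical names, the sentence template is a paraphrase of the spoken pattern rather than the blog post's exact wording, and the latency threshold is left unspecified because the episode does not state one.

```python
from dataclasses import dataclass


@dataclass
class GoalSpec:
    """The six parts of a strong goal, as listed in the episode."""
    outcome: str           # what should be true when the work is done
    verification: str      # how completion is tested
    constraints: str       # what must not regress
    boundaries: str        # tools and files the agent may use
    iteration_policy: str  # how to decide what to try next
    stop_condition: str    # what to do when blocked

    def to_prompt(self) -> str:
        # Mirrors the spoken pattern; the exact wording is an assumption.
        return (
            f"/goal {self.outcome}, verified by {self.verification}. "
            f"Preserve {self.constraints}. "
            f"Use only {self.boundaries}. "
            f"Between iterations, {self.iteration_policy}. "
            f"If blocked, {self.stop_condition}."
        )


spec = GoalSpec(
    outcome="Bring P95 checkout latency below the target threshold",
    verification="the checkout benchmark",
    constraints="a green correctness suite",
    boundaries="the checkout system",
    iteration_policy="report what changed, what the benchmark showed, "
                     "and the next experiment to try",
    stop_condition="stop and report the evidence, the blocker, "
                   "and what you need from me",
)
prompt = spec.to_prompt()
print(prompt)
```

Because every field is required, leaving out verification or a stop condition fails at construction time instead of producing a vague goal.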

9:31-11:08

[09:31] But you can also do this with non-technical projects, and I'm going to show you a little bit of how that works. [09:37] So again, a goal is a new way to prompt an LLM, in this instance Codex, to [09:44] work autonomously in a loop of work, [09:47] verify, check [09:49] until it hits a goal. [09:50] Goals are written a lot differently than prompts. A prompt is an instruction of what to do. A goal is a description of what a good outcome is and how to get to that outcome. [10:00] And I've seen Codex be able to run these goals for a very long time. [10:05] So I'm going to give a couple of examples [10:07] of how to use goals, what I think they're most useful for, and some successes I've had with goals. [10:13] And I'm going to kind of show you behind the scenes. [10:16] I have ChatPRD, and in ChatPRD, we have a tool call in our main [10:23] AI writing loop, and it edits [10:25] specific parts of a PRD. It's this diff-based editor. It's very complicated, and it looks for operation ranges inside a document and then tries to edit those operation ranges. And [10:38] we were getting [10:40] tons of errors, you can see here, [10:42] tons of errors on applying specific edits [10:45] because it couldn't find the right operation range. [10:48] Again, tune out if this is boring to you. [10:51] But because the documents we created were complex, they had tables in them, they had bullet points in them, they had [10:57] bold, they had quotes, they had images, actually precisely getting a range of nodes from the AI was really, really hard. We were just seeing a bunch of these errors over and over again. And we would

11:08-12:44

[11:08] find one example of why an error showed up in a very specific document, fix that, but then another one popped up. So it's like that [11:17] cartoon where you plug your finger over here and another spout goes off, and it was driving us crazy, you can see here. [11:23] And then [11:24] you can see, basically the end of April, the beginning of May, they went away. [11:28] Why did they go away? Well, [11:30] we used /goal to knock this out. So for the goal that I used to solve this particular problem, I gave Codex access to Sentry, [11:41] I gave Codex access to these edit requests, [11:46] and I said, /goal: [11:48] Codex, go through every example in Sentry, every trace in Sentry of an invalid operation on the edit tool, [11:57] categorize that issue, [11:59] and fix it. [12:01] Then replay [12:04] all of the Sentry events that would have shared that same issue, [12:09] until you have fixed every issue and every historical example [12:14] of an edit invalid operation [12:17] is solved. [12:19] And it went to town. So what it would do is pluck one example. [12:23] It would see what the root cause was. It would implement a fix for that root cause. [12:28] It would then run through all the other examples to see how many of those it burned down. [12:34] It would have some remaining. It would pluck the next one. It would do the fix. It would run through all the remaining examples, burn it down, burn it down, burn it down.
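
The burn-down pattern described here (pluck one example, fix its root cause, replay the remaining events, repeat) can be sketched as a short Python loop. This is a simplified illustration, not the actual ChatPRD fix or a Sentry API call: it assumes each event already carries a root-cause label, which in practice is the part the agent has to work out, and the labels themselves are made up.

```python
from typing import List, Tuple


def burn_down(events: List[str]) -> Tuple[List[str], List[int]]:
    """Fix one root cause at a time and replay the rest.

    `events` holds one hypothetical root-cause label per error event.
    Returns the causes fixed (in order) and the remaining count after each pass.
    """
    fixed: List[str] = []
    remaining = list(events)
    history = [len(remaining)]
    while remaining:
        cause = remaining[0]   # pluck one example and identify its root cause
        fixed.append(cause)    # "implement" a fix for that root cause
        # Replay every remaining event; those sharing the cause are now resolved.
        remaining = [e for e in remaining if e != cause]
        history.append(len(remaining))
    return fixed, history


# Made-up labels standing in for real error traces.
events = ["table_range", "bold_range", "table_range",
          "image_node", "table_range", "bold_range"]
fixed, history = burn_down(events)
print(fixed, history)
```

The `history` list is the evidence trail: each pass shows how many historical examples a single root-cause fix burned down, and the loop ends only when the count reaches zero.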

12:44-14:16

[12:44] And then look what we have. We have literally zero [12:48] errors left. Now, this took several hours. [12:51] And what was really nice is at the end of it, [12:54] I didn't get these band-aid fixes all over our edit code. [12:59] What I got was a systematic fix that integrated every example into [13:05] a more intelligent framework for how edits should be applied. [13:09] And ultimately, we've had zero [13:12] edit errors from the time that we used /goal here. And so I think this is a really great example, but let's do it live because this is How I AI. I'm going to give another example of how I might use this, again for some of the more technical folks. [13:24] So these are the Vercel errors. It looks scarier than it is. We [13:31] have a lot of retries around this, but here are the errors that happen behind the scenes that we have to recover from [13:37] in our main chat, [13:39] from the last two weeks. [13:42] And I want to do the same thing with these errors. I want to say, Codex, find these errors, classify them, [13:51] ship a fix, validate against the existing data until basically there are none of these errors left. So I'm going to pull up [14:00] Codex. [14:01] I'm going to use GPT. This is not a complicated deep thinking problem, so I'm going to use GPT 5.5 medium. [14:08] And I'm going to say goal: [14:10] Eliminate [14:12] errors on the API chat v2

14:17-15:49

[14:17] endpoint that are showing up in the [14:20] Vercel logs [14:22] by going through [14:25] each category [14:27] of error, [14:29] identifying [14:30] root cause, [14:32] determining if this is a user-facing [14:37] error. [14:38] If it is, determine [14:41] root cause and [14:43] open a branch plus PR for a fix. If it is not, [14:50] reduce this error to a warning. [14:54] Once all logs can [14:58] be handled from the last [15:01] two weeks, [15:02] report to me all PRs to review [15:06] and issues that could not [15:10] be fixed, [15:12] or what you need from me. This is a terrible prompt. [15:16] This is fine, this is honestly a better goal prompt than I usually write. [15:20] And say, [15:21] success state is we [15:25] have no user-facing errors, and [15:29] no [15:30] back-end errors that should be warnings. Okay, I'm pressing [15:36] Enter. It's compressed my skill descriptions, but that's fine. [15:41] Now, Codex is hooked up with my [15:44] Vercel plugin, so it has access and can actually go access these logs.
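
The decision rule dictated in this goal (user-facing errors get a branch and PR, everything else is downgraded to a warning, and anything unfixable is reported back) can be written out as a small triage function. This is a sketch under assumptions: `ErrorCategory`, the `fixable` flag, the category names, and the `fix/...` branch naming are all invented for illustration, and nothing here calls Vercel or GitHub.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ErrorCategory:
    name: str
    user_facing: bool
    fixable: bool = True  # hypothetical flag: could the agent find a fix?


def triage(categories: Iterable[ErrorCategory]) -> Dict[str, List[str]]:
    """Sort error categories into the three buckets of the final report."""
    report: Dict[str, List[str]] = {
        "prs_to_review": [],
        "downgraded_to_warning": [],
        "needs_human": [],
    }
    for cat in categories:
        if not cat.user_facing:
            # Not user-facing: reduce the error to a warning.
            report["downgraded_to_warning"].append(cat.name)
        elif cat.fixable:
            # User-facing and fixable: open a branch plus PR (name is made up).
            report["prs_to_review"].append(f"fix/{cat.name}")
        else:
            # User-facing but blocked: report back and ask for help.
            report["needs_human"].append(cat.name)
    return report


report = triage([
    ErrorCategory("stream-timeout", user_facing=True),
    ErrorCategory("retry-recovered", user_facing=False),
    ErrorCategory("upstream-outage", user_facing=True, fixable=False),
])
print(report)
```

The success state from the prompt maps onto the report: the goal is done when every category from the two-week window sits in one of the three buckets.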

15:49-17:24

[15:49] So it's making this plan. And I just want to pause and tell you how goal works with a plan. [15:55] Once it has a goal, it makes, I've seen, these three- to five-step [15:58] plans. So it's going to inventory the current repo. [16:01] It's going to pull the last two weeks of Vercel errors and group them by category. [16:05] It's going to classify them as user-facing errors or not, and it's going to implement and validate fixes [16:10] or downgrade to warnings by category, and then it's going to publish the PRs and report to me. [16:14] Again, this is very precise. It's measurable: [16:18] it actually has a list of errors it's going to burn down. [16:21] It's observable: it definitely can eliminate those errors. It can ship a fix, or it can run the same code and show that the error wouldn't be hit. [16:31] And then it has success criteria and an ending state, which is I want a list of PRs and any blockers or things that I need to [16:39] review. And so it's going to go ahead [16:42] and go through and try to find the right logs. [16:44] It's going to continue to work on this. Now, [16:46] we are in a mini episode today. It's one minute into this goal. I suspect that this is going to take [16:52] two to three hours to get through. I've run something very similar to this, and it took about two or three hours to get through. [16:58] So I will have to put in the show notes or a follow-up whether or not this was super successful, but it's just an example for you. [17:05] I love this idea of just, like, Sentry zero, error zero, [17:09] where you can point /goal at any kind of lingering errors that have really haunted your team. And developers out there, you know that these exist, and you can actually say, just go get rid of these. And with /goal, it really is possible. And I've seen very high quality

17:25-18:56

[17:25] success on using /goal to burn down errors. [17:29] So that is a technical example of how to use /goal, but I want to make this more applicable to people who aren't developers, because I honestly think /goal for non-coding use cases is even [17:42] more exciting. Today's episode is brought to you by Mercury, the banking solution I use for ChatPRD. I build AI tools. I talk about AI every day. So when people ask what I use to run my business, Mercury is a genuinely easy answer. [18:00] Because an AI founder who still deals with clunky, outdated banking is kind of a walking contradiction. Mercury is how I track run rate and revenue growth, pay my vendors through Bill Pay, and get paid by clients. [18:14] Wires and transfers that used to feel like a whole thing, sending money, accepting payments, knowing it arrived: [18:22] Mercury just makes it simple. The whole platform is clean, fast, and modern in a way that most banking honestly isn't. [18:30] I've banked with them for years. It's one of those tools where I don't think about switching because it's never given me a reason to. Visit mercury.com to apply online in minutes. [18:42] Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC. [18:51] For this next example, I want to give you my [18:54] favorite use case of /goal.

18:56-20:29

[18:56] It has blown my mind. And if you leave [19:00] this episode with nothing else, I hope you go do this, which is [19:04] use /goal to clean up all your unread [19:07] emails. [19:08] So Codex has access to my Gmail plugin. That means it has MCP access. It means it can go through and read my email. [19:16] I had yesterday, truly, [19:19] 3,900 emails, something like this. I'm going to see if I can [19:23] resume the saved chat. So I'm going to type in /goal [19:27] and see what my goal was that I did yesterday. It is a much worse written prompt: [19:33] Categorize all bulk promotion spam emails, unsubscribe from unnecessary emails, [19:38] and clean up your inbox, ask for help when needing judgment. [19:42] It ran for three hours and 52 minutes, [19:45] and it used about 6 million tokens. So it was not token cheap. [19:49] I'm going to just show you what it did, which is it just read, [19:53] like literally read every email, [19:56] categorized them, [19:57] put nice labels on them so then I could go decide, including labels like needs judgment, [20:03] clicked unsubscribe links for me, gave me a list of unsubscribe links that I could use. And at the end of the day, I went from about, let's actually ask: [20:12] How many emails did I start with uncategorized? [20:17] And how many are now [20:19] left to [20:21] filter? [20:23] So it's going to go ahead and check its own work, and you're going to hold me accountable to show that I did not make this up.

20:29-22:03

[20:29] And it's going to show how many emails I started with and how many I have [20:34] left. I'm pretty sure it was about 4,000, and I think we got down to sub-1,000 [20:40] that needed to get done. [20:42] Okay, it took a little prompting to remember what it did. But again, we started at about 3,900 emails. Now I'm down to 68 that I need to look at. So that's my today project. [20:52] So it categorized [20:53] almost 4,000 emails for me, and [20:56] it put them in lovely folders. Again, it unsubscribed for me. [21:00] It gave me nice categories of emails that I needed to respond to. If you've been waiting on me for a couple of weeks, you now got a response. [21:06] And now I have a much cleaner inbox that I can run over time. So again, /goal. [21:12] My prompt was very simple: just categorize all my emails, unsubscribe, and clean up my inbox. [21:18] It ran for about four hours, and now I have a much cleaner inbox to work with. Okay, I'm going to give one other example of a non-technical use case that I think is going to be really useful for the product managers out there, [21:30] which is I have let my Linear, my task management software, go completely off the rails. [21:35] This is partly an OpenClaw problem, which is I gave my agents, my OpenClaws, [21:40] YOLO access to Linear and they created a bunch of tasks, not all of which they have done. And so I want to clean up my Linear tasks and get them down to only the ones that I need to complete. And I want this in particular for our podcast Linear, [21:54] because we had aspirations of all the things we would do with every episode. We usually do about 70% of those, and I just want to clean it up. So I'm going to say /goal:

22:03-23:33

[22:03] Clean up [22:05] the How I AI podcast [22:09] team issues in Linear. [22:11] Anything from a previously released [22:15] episode that is not marked as done [22:19] should be marked as [22:23] will not do. [22:25] Our goal is to have open [22:28] only future tasks, [22:30] this week and forward, [22:33] for episodes, [22:35] not old tasks we'll never [22:39] get around to. [22:42] So I'm going to let that run. It should have access to the Linear plugin. It's going to go through, and again, I'm telling you, this is hundreds and hundreds and hundreds of tasks. [22:51] It's going to go through and make this judgment call of, can I close this? Can I update the data? If you want to have better task hygiene, where you want to make sure everything is tagged correctly, assigned correctly, [23:03] this is a really good use case. And so it's found the Linear team. [23:07] It's going to work at the team level. It's going to identify stale episode tasks. It's going to go through and [23:12] clean them up. The task status we want is not won't do. It's called canceled. [23:18] And it's just going to process through and go ahead and do that. So I suspect that this one will go a little bit faster [23:24] but will probably take 30 minutes to an hour to go through with really high quality judgment. And at the end of it, [23:30] I'm going to have a much cleaner Linear workspace to work with.

23:34-25:04

[23:34] And again, it's saying a clear rule is emerging: keep current-week and future episode work, cancel non-done episode release work before Monday. It's going to scope the bulk update. It's going to validate that the outcome I wanted, which is a clean Linear, is done, and it will complete this over time. [23:50] So those are my three examples of how to use /goal. [23:54] One is a technical one. Again, [23:56] it's continuing to run, so it's gotten through the first two steps here. [24:01] The technical goal of looking at all my error logs and basically classifying them, fixing them, burning them down with the goal of having no more errors, [24:10] ever. [24:11] There is the second, very practical goal of cleaning up my email inbox so I can actually read my email. That one took about four hours. I think it's useful for everyone, and I did not have to have a very good prompt there. [24:23] And then my third one, for project management: make sure that my projects and my tasks and issues are [24:30] clean, my backlog is clean, everything is labeled the way I want, and I only have to focus on the things that matter to me. These are three ways [24:39] I think you can use goals in Codex. [24:41] Before we end, I want to take a step back and talk about when you shouldn't use goals and then what I think is next. So goals are not the right tool for every job. And I'm pulling up this blog post again because I think they say it better than me. [24:55] Do not use /goal for something that is a very simple one-line edit. It is just too big of a tool for the job. Your goal wouldn't be like, make sure this line of code is removed.
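
The cleanup rule Codex arrived at in the Linear demo (keep current-week and future episode work, cancel anything not done for episodes released before Monday) is simple enough to express directly. The sketch below is an illustration only: `Task`, `week_start`, and `clean_up` are hypothetical names, the sample tasks are made up, and nothing here talks to the Linear API. The "Canceled" status name is the one mentioned in the episode.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Task:
    title: str
    status: str            # e.g. "Todo", "Done", "Canceled"
    episode_release: date  # release date of the episode the task belongs to


def week_start(today: date) -> date:
    """Monday of the week containing `today`."""
    return today - timedelta(days=today.weekday())


def clean_up(tasks: List[Task], today: date) -> List[Task]:
    """Cancel non-done tasks for episodes released before this week's Monday."""
    cutoff = week_start(today)
    for task in tasks:
        if task.episode_release < cutoff and task.status != "Done":
            task.status = "Canceled"
    return tasks


today = date(2026, 5, 27)  # a Wednesday, so the cutoff is Monday May 25
tasks = clean_up([
    Task("Cut social clips", "Todo", date(2026, 5, 20)),
    Task("Publish show notes", "Done", date(2026, 5, 20)),
    Task("Draft newsletter", "Todo", date(2026, 5, 29)),
], today)
print([(t.title, t.status) for t in tasks])
```

Writing the rule down this way also gives the goal its verification step: after the bulk update, no open task should have an `episode_release` earlier than the cutoff.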

25:04-26:41

[25:04] You really want an outcome, not an output, almost, [25:08] for it to be a good goal. [25:11] Also, don't use a goal when the finish line is vague. I mean, maybe you can, if you're like /goal, make my customers happy. [25:21] I think that is just a very vague goal. It's very hard to measure, and there's no reliable, [25:28] definitive completion condition. And so that's not very good. [25:32] The other example they give is like, refactor this code. [25:35] Not a good example of when to use /goal. And in fact, I'm doing a refactor-this-code initiative with Codex, but I'm not using a goal. They say, and I just want to reiterate this for you, a goal is strongest when it has three properties: [25:48] a durable objective, an evidence-based finish line, and a path that may require several [25:54] turns of investigation. So if you have an objective that stays steady over time, you know you want to hit that objective, [26:01] it can be evidence-based and you can measure it, and you think getting there is going to require a couple of turns, [26:07] goals are for you. [26:09] So before we wrap, a couple of thoughts on /goal and why I'm just really excited about this framework of working with AI. [26:17] One, as I said at the beginning, this has been the first time that I've been able to get these autonomous, long-running tasks done. And so I really can set the LLM, the AI, up with a goal, step away, and have it work over many hours on a problem that would be very annoying to babysit. So one, I think my babysitting days are largely over with AI, not completely over.

26:41-28:11

[26:41] I'm still babysitting a branch right now, but largely over with AI. [26:45] I think the second thing is the impact that /goal has had on [26:49] quality-of-life things in my code that have been very hard and annoying to chase down. Yes, I probably could have gone task by task and said, please fix [27:00] issue A, then fix issue B, then fix issue C. And I could have set [27:05] different coding tools off on those problems, but this idea of just saying, like, [27:09] error zero, [27:10] go through all our error logs and fix them until they exist no more, [27:15] is incredibly powerful, in particular for quality. So for engineering teams looking to burn down tech debt, fix flaky tests, look at really annoying client-side errors [27:27] that are maybe annoying to reproduce, I feel like /goal is really powerful. [27:33] The third thing is, I think that product managers are really going to love /goal. Again, we've had it drilled into us: outcomes, not outputs. You shouldn't be defining the work. You should be defining what success looks like. [27:45] I think as more and more teams start to use /goal as part of their coding strategy, [27:50] workflow, product managers are going to have to get a lot better at prompting these AIs with good goals. And we have some of those skills already, but I think [28:00] the technical level of validation that's required by /goal requires you to uplevel these hard skills in writing what a good goal actually looks like.

28:12-29:48

[28:12] And then finally, I'd say with /goal and these long-running tasks, and I felt this a little bit with OpenClaw, [28:17] and I just see this becoming more and more true: [28:21] working with AI just continues to feel more and more like working with [28:26] a colleague, a human colleague, in that you assign a human colleague a task. [28:31] You don't sit there over their shoulder and tap and say, OK, next step. OK, next step. [28:37] What you really do is you give them a goal. They go away for the time required to hit that goal. And then they come back to you with the completed task and you give feedback. [28:46] And so, again, it's this form factor, even though the AI is maybe faster than a human would be on some tasks. [28:53] They may be slower than humans because they have the patience to go to the edge cases of things. [28:59] But either way, they're using the time necessary for the task [29:03] to get it done. And it really feels like I'm much more in manager mode than builder mode. And [29:11] I love that. When /goal came out, I found myself kind of twiddling my thumbs and looking for the job that I could do [29:20] in the coding work, because so much of the job had now been handled [29:24] itself. So in conclusion, I really suggest you try /goal. If not in Codex, try a similar loop in whatever your favorite AI tool is. Let it run, let it solve bigger, more complex problems for you, and come back to you [29:38] when it's time to review the work. [29:40] This is How I AI. I'm so excited to see what you build. And I'm going to get back to my logs and see if we've actually eliminated

29:48-30:17

[29:48] all these errors. Thanks for joining. Thanks so much for watching. If you enjoyed the show, please like and subscribe here on YouTube or even better, leave us a comment with your thoughts. [29:59] You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. [30:17] See you next time.
