GPT-5.2 Codex is Insane (Here is My Workflow)

SUMMARY

Neil McDevitt praises GPT-5.2 Codex as a groundbreaking AI coding agent, sharing his vibe coding workflow, optimal settings, system prompt strategies, and philosophies on user-centric product building amid AI's rise.

STATEMENTS

McDevitt calls GPT-5.2 Codex his personal AGI milestone, saying it outperforms earlier models such as Claude Opus in software engineering tasks.
On the ARC AGI 2 benchmark, designed to challenge LLMs, GPT-5.2 with extra high reasoning scored 15% above the human baseline using the Poetiq harness.
Modern software engineering resembles vibe coding, where developers set initial conditions properly and guide the AI to stay on track.
Initial user complaints about GPT-5.2 Codex requiring frequent nudges have been addressed through updates, making it more autonomous.
GPT-5.2 Codex excels at deconstructing systems, analyzing components, and making reasoned decisions, unlike more intuitive but error-prone tools like Claude Code.
Effective system prompts serve as comprehensive maps, enabling the AI to maintain context, avoid tunnel vision, and understand codebase relationships without wasting tokens.
Product development should begin with a clear vision of the market problem and user experience, determining technology choices afterward.
Valuable products maximize the gap between minimal user input and substantial output rewards, enhancing life through reduced effort.
In software, AI eliminates cost moats while devouring complexity; time moats persist via efficiency in complexity handling and investments in psychology and branding.
GPT-5.2's long context coherence, vastly superior to competitors, allows it to recall and recombine forgotten details from extended interactions for novel insights.

IDEAS

ARC AGI 2 was crafted by experts like François Chollet to resist LLM progress, yet GPT-5.2 unexpectedly exceeded its human baseline, signaling AI advancement faster than expected.
Vibe coding flips traditional programming: humans provide high-level guidance while AI handles execution, mirroring how many already code subconsciously.
GPT-5.2 Codex's updates minimize human intervention, transforming it from a needy tool into a self-sustaining engineer that builds without constant prodding.
For raw computational power in tackling intricate systems, Codex outshines more intuitive tools like Claude Code, which are better suited to creative tasks like UI design.
A well-crafted system prompt, detailed yet concise, acts like a GPS for AI, preventing disorientation in large codebases and enabling full automation.
AI enables idea-to-functional-prototype timelines under 48 hours, even with interruptions like gym sessions, collapsing the barrier between conception and creation.
Software moats evolve: AI commoditizes costs and complexity, leaving time-based advantages in outpacing rivals through superior efficiency and brand psychology.
Amazon's dominance stems from years of unprofitability spent reinvesting to absorb user pain in logistics, proving that long-term "heaviness" trumps short-term gains.
Irreducible interfaces, like ChatGPT's single textbox, slash cognitive load, allowing AI to ingest vast backend complexity for seamless user delight.
GPT-5.2's exceptional long-context recall revives obscure past conversations at opportune moments, fostering emergent creativity in ongoing dialogues.
Recombining existing technological innovations via AI research chats democratizes product creation, requiring only clear vision to launch market-ready solutions.
Builders using Codex feel empowered like "gangsters," overseeing autonomous builds while injecting personal taste, turning development into guided iteration.

INSIGHTS

AI agents like GPT-5.2 Codex are redefining engineering by internalizing complexity, freeing humans to focus on vision and minimal oversight for exponential productivity.
System prompts as navigational maps ensure AI coherence across vast projects, abstracting human expertise into scalable, error-resistant automation frameworks.
True product value emerges from asymmetrical exchanges: trivial user efforts yielding profound benefits, a principle AI amplifies by offloading invisible labors.
Enduring competitive edges in software lie in temporal investments: mastering complexity per unit time and cultivating psychological moats like brand trust, which technology alone cannot replicate.
Rapid ideation through extended AI dialogues unlocks innovation by systematically exploring and recombining prior art, turning abstract problems into tangible prototypes.
Superior long-context capabilities in advanced models cultivate synthetic memory, enabling serendipitous insights that mimic human intuition but at machine scale.

QUOTES

"This is like the AGI moment for me."
"We're going from idea to execution is collapsing in real time."
"In the short term the market is a betting machine. In the long term, the market is a weighing machine and what you're trying to do is build the heaviest company."
"You want the discrepancy between inputs and outputs for my users to be as big as possible."
"Just a text box. That's it."

HABITS

Testing AI models like GPT-5.2 Codex and Claude Code in parallel across separate projects to compare performance in real workflows.
Crafting extensive yet token-efficient system prompts through iterative AI conversations to maintain project orientation.
Engaging in prolonged back-and-forth ideation sessions with ChatGPT to clarify visions, research tech limits, and recombine innovations.
Monitoring and gently nudging AI during vibe coding to keep it on track, intervening only when it veers off rails.
Iterating on user interfaces and experiences by providing targeted feedback to AI, fine-tuning until the desired feel is achieved.

FACTS

The ARC AGI 2 benchmark, developed by researchers including François Chollet and Greg Kamradt, aimed to resist LLM advancements, yet GPT-5.2 exceeded its human baseline.
GPT-5.2, with extra high reasoning and a specialized harness, scored 15% above the human baseline on the ARC AGI 2 public evaluation.
Amazon sustained unprofitability for years, redirecting funds to build infrastructure that absorbs massive logistical complexity for users.
By McDevitt's account, GPT-5.2's long-context coherence is thousands of times better than that of other models, letting it recall details from months-old interactions.
A full prototype for a long-contemplated market problem was developed in under 48 hours using GPT-5.2 Codex, despite interruptions such as gym sessions.

REFERENCES

ARC AGI 2 benchmark by researchers like François Chollet.
GPT-5.2 Codex as primary AI coding tool.
Claude Code for UI and intuition-based tasks.
Amazon as model for complexity-eating and branding.

HOW TO APPLY

Identify your target market and core problem, then articulate a precise vision of the desired user experience, prioritizing simplicity and reward over technical details.
Conduct extended conversations with ChatGPT to ideate, solidify the concept, and research existing technological limits, focusing on recombining proven innovations for feasibility.
Synthesize the dialogue into a comprehensive system prompt that serves as a project map, then refine it through critiques from other AIs like Claude for optimization.
Configure GPT-5.2 Codex for autonomy: under approvals, select agent full access, and in the experimental options, turn on background terminal.
Initiate vibe coding by setting clear initial conditions, monitor progress to keep the AI on rails, and iterate through targeted inputs for fine-tuning architecture and UI.
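
The "project map" step above can be sketched in code. The following is a minimal illustration, not McDevitt's actual tooling: it assumes a Python codebase, and the names build_project_map, build_system_prompt, SKIP_DIRS, and max_chars are hypothetical. It lists each module with the first line of its docstring and caps the map at a character budget, used here as a rough stand-in for a token budget.

```python
import ast
from pathlib import Path

# Hypothetical skip list: directories that add noise to the map.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def summarize_file(path: Path) -> str:
    """Return the first line of a module's docstring, or '' if unavailable."""
    try:
        tree = ast.parse(path.read_text(encoding="utf-8"))
    except (SyntaxError, ValueError, OSError):
        # Unparseable or unreadable files are listed without a summary.
        return ""
    doc = ast.get_docstring(tree)
    return doc.strip().splitlines()[0] if doc and doc.strip() else ""


def build_project_map(root: str, max_chars: int = 4000) -> str:
    """List Python files under root with one-line summaries, within a budget."""
    root_path = Path(root)
    lines, used = [], 0
    for path in sorted(root_path.rglob("*.py")):
        rel = path.relative_to(root_path)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        summary = summarize_file(path)
        entry = f"{rel.as_posix()}: {summary}" if summary else rel.as_posix()
        # Stop before the map outgrows the budget (characters, not tokens).
        if used + len(entry) + 1 > max_chars:
            lines.append("... (map truncated)")
            break
        lines.append(entry)
        used += len(entry) + 1
    return "\n".join(lines)


def build_system_prompt(vision: str, project_map: str) -> str:
    """Combine the product vision and the codebase map into one prompt."""
    return (
        "PRODUCT VISION\n"
        f"{vision}\n\n"
        "PROJECT MAP (path: purpose)\n"
        f"{project_map}\n"
    )
```

Calling build_system_prompt("One textbox in, finished report out.", build_project_map(".")) would produce a prompt that states the vision first and the file map second. Regenerating the map as the codebase grows keeps the prompt current without hand-editing it.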

ONE-SENTENCE TAKEAWAY

GPT-5.2 Codex empowers rapid, autonomous software creation by leveraging vision-driven prompts to devour complexity for superior user experiences.

RECOMMENDATIONS

Prioritize user experience in product design to create irreducible interfaces that minimize cognitive load and maximize value delivery.
Invest in detailed system prompts as cognitive maps to sustain AI focus across complex, long-running development cycles.
Focus on eating complexity per unit time, using AI's horsepower to outpace competitors in backend sophistication.
Build time-based moats through branding and psychological investments, emulating Amazon's long-term reinvestment strategy.
Harness long-context AI for ideation by maintaining extended dialogues that recombine forgotten insights into innovative solutions.

MEMO

In the accelerating world of artificial intelligence, software engineer Neil McDevitt hails GPT-5.2 Codex as a transformative force, dubbing it his personal "AGI moment." This OpenAI model, he argues, is more than an assistant: it is a coding agent that dismantles complex systems with surgical precision, outperforming earlier models such as Claude Opus. His enthusiasm rests partly on benchmark results: on the ARC AGI 2 evaluation, designed by researchers including François Chollet to resist large language models, GPT-5.2 scored 15 percent above the human baseline. For McDevitt, such results signal a quiet revolution in which AI begins to match, and then exceed, human reasoning in areas once considered out of reach.

McDevitt's workflow, which he terms "vibe coding," demystifies this power. Developers set initial conditions through meticulously crafted system prompts—vast yet efficient "maps" that orient the AI without squandering context windows. He configures Codex for full agent access and background terminals, enabling autonomous operation with minimal nudges. Testing it alongside rivals on dual projects, McDevitt finds Codex's raw horsepower ideal for devouring backend complexity, while tools like Claude shine in intuitive design. This hybrid approach allows him to prototype a long-germinating product idea into a functional demo in under 48 hours, even amid gym breaks and daily life.

At its core, McDevitt's philosophy inverts traditional development: begin not with code, but with user vision. Identify a market pain point, envision a seamless experience, then select the technologies that bridge the gap. The goal is to maximize the gap between scant user input and outsized reward, as exemplified by ChatGPT's single textbox. AI erodes cost barriers and absorbs complexity, he notes, but time remains the ultimate moat, built through relentless efficiency and intangibles like branding. Amazon illustrates this: years of losses were funneled into logistics infrastructure, yielding a "heavy" company that weighs more over the long term, in line with the weighing-machine maxim that Jeff Bezos has cited from investor Benjamin Graham.

Yet McDevitt warns against overcomplication. Once scaffolding is set, development becomes iteration: guide the AI like a seasoned conductor, infusing personal taste into interfaces while it builds the rest. GPT-5.2's peerless long-context coherence—recalling months-old details for fresh insights—fuels this autonomy, turning builders into overseers. For novices, mastery demands just 48 hours of deliberate practice, assuming familiarity with tools like ChatGPT. In an era where idea-to-execution timelines collapse, McDevitt's method promises empowerment, but only for those who grasp AI not as a crutch, but as an amplifier of human ingenuity.

As AI evolves, McDevitt envisions a future where software's essence shifts from lines of code to psychological resonance: products that do not just function but help their users' lives flourish. His Codex-driven prototypes, from 3D modelers to niche solvers, embody this: minimal user effort for maximal impact. While skeptics point to AI's coding flaws, McDevitt counters that true expertise lies in orchestration, not authorship. In vibe coding's rhythm, the human role endures, not as laborer but as visionary, steering machines toward heavier, more humane horizons.