AI mistakes you're probably making
SUMMARY
Theo, a developer and YouTuber from T3.gg, critiques common mistakes in using AI for coding, emphasizing better problem selection, context management, and tool updates to unlock productivity gains.
STATEMENTS
- Many developers view AI as useful only for small projects but ineffective for real codebases, yet rapid advancements are changing this perception.
- The speaker once shared skepticism about AI but now sees transformative potential after correcting usage errors.
- Selecting the right problem is crucial; AI excels at known issues, not as a last-resort fallback.
- Developers often exhaust manual solutions before turning to AI, handing it poorly understood problems that lead to poor results.
- Testing AI on problems you already know how to solve allows direct comparison and builds intuition for its capabilities.
- Providing AI with the same context you'd give a junior engineer helps it succeed on familiar tasks.
- Creating reproducible tests from past fixes enables benchmarking new AI models against real-world problems.
- Collecting unsolved problems as tests reveals AI limitations and tracks progress over time.
- When given proper context, AI often solves issues in ways similar to human approaches.
- Dumping entire codebases into AI context causes "context rot," degrading performance due to overwhelming noise.
- AI models function as advanced autocomplete, predicting next tokens based on context, so excess information dilutes focus.
- Larger context windows do not always improve results; models perform worse beyond 50k tokens due to distraction.
- Tools like Cursor and Claude Code avoid full codebase dumps by providing search tools or summaries for targeted access.
- Humans and AI both benefit from searching codebases rather than reading everything; expect AI to use tools if provided.
- Learning a codebase involves searching, running the app, and mapping relationships, not exhaustive reading.
- Good context is minimal: describe the issue clearly and let AI fetch relevant details (an example prompt follows this list).
- Models like Codex search relevant files methodically, leading to precise changes, unlike faster but riskier ones like Opus.
- Updating files like .claude.md or .agent.md with gotchas prevents recurring mistakes, acting as institutional memory.
- The .claude.md file should focus on codebase-specific quirks, starting small and evolving with observed errors.
- Bigger codebases require more encoded opinions in guidance files to handle accumulated weirdness.
- Reading a well-maintained .claude.md reveals team building philosophies and accelerates onboarding.
- Outdated tools or perspectives based on old experiences hinder AI adoption; the field evolves monthly.
- Company policies delaying tool approvals force outdated usage, but individuals should experiment anyway.
- Broken environments, like monorepos with faulty type checks at root, confuse AI and humans alike.
- Fix environments first: ensure type checks work globally without directory changes or config errors.
- AI can chase phantom errors in broken setups, reverting valid fixes repeatedly until exhaustion.
- Overconfiguration with MCP servers, skills, and plugins bloats context, worsening performance.
- Stock setups like basic Claude or Cursor often yield better results than heavily customized ones.
- Plan mode elicits clarifying questions, building useful context before execution.
- When AI output fails, revert and refine the initial prompt or plan rather than appending corrections.
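A minimal sketch of what such a curated prompt might look like. The error text, file path, and conditions below are illustrative inventions, not taken from the talk; the point is the shape: exact error, narrow qualifiers, and an instruction to search rather than a codebase dump.

```markdown
<!-- Hypothetical prompt; the error, path, and conditions are illustrative -->
Checkout intermittently crashes in production with:

    TypeError: Cannot read properties of undefined (reading 'total')
        at renderSummary (src/checkout/summary.ts:42)

It only reproduces for logged-out users with more than one item in the cart.
Read .claude.md first, search the codebase for the relevant files, and ask me
clarifying questions before changing anything.
```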
IDEAS
- AI's autocomplete nature means irrelevant context acts like noise in a signal, pushing predictions toward gibberish.
- Reproducible tests from personal fixes serve as personal benchmarks, far superior to generic ones.
- Treating AI like a perpetual junior engineer requires building "memory" through docs to avoid repeated errors.
- Context rot mirrors human overload: too much info buries the key problem, reducing accuracy.
- Search tools in AI mimic developer habits, turning it into an efficient collaborator rather than a brute-force reader.
- Updating guidance files based on AI mistakes is like editing team docs after a new hire's blunder—preventive and scalable.
- Bigger codebases amplify "weirdness," so guidance files encode cultural norms as much as technical details.
- Rapid AI evolution defies past software trends; capabilities double in months, not years.
- Company tool restrictions create a false narrative of AI uselessness, when it's often just bad tools.
- Broken setups haunt AI like ghosts, leading to endless loops; fixing them boosts reliability instantly.
- Overconfiguration traps users in complexity, where adding features solves nothing but creates bloat.
- Plan mode transforms failures into questions, iteratively refining context without polluting history.
- Screenshot-based prompting bypasses verbose descriptions, leveraging visual intuition in AI.
- One-shot successes signal healthy setups; repeated iterations indicate upstream issues like poor prompts.
- Hydration errors in React 19 now include diffs, a feature AI could exploit if given exact context.
- Vibe coding thrives on simplicity: stock tools plus clear talk outperform orchestrated agent swarms.
- Intuitions for AI, like React hook rules, form quickly through practice, not theory.
- AI reveals environment flaws during fixes, turning debugging into self-improvement loops.
- Skeptics undervalue AI by basing judgments on early, flawed experiences, ignoring iterative leaps.
- Guidance files double as onboarding artifacts, teaching humans about team idiosyncrasies.
- Minimal context empowers AI autonomy, similar to briefing a smart colleague without hand-holding.
- Fallback usage of AI for unknown problems inverts its strength: it shines on scaffolded, known terrain.
- Personal tests track AI maturity, offering shareable gold for community validation.
- Embracing AI's limitations fosters creative prompting, like using plan mode for emergent clarity.
INSIGHTS
- AI performs best on problems humans already understand, allowing validation and gradual trust-building.
- Excess context dilutes predictive accuracy, emphasizing curation over completeness in prompts.
- Guidance files like .claude.md create persistent memory, transforming stateless AI into a consistent team member.
- Rapid tool evolution demands ongoing experimentation, as yesterday's skepticism becomes tomorrow's inefficiency.
- Broken environments undermine AI more than complex tasks, highlighting the need for foundational hygiene.
- Overconfiguration often masks deeper issues like poor problem selection, complicating rather than clarifying.
- Plan mode facilitates collaborative refinement, turning potential failures into structured dialogues.
- Treating AI as autocomplete reveals its fragility to noise, advocating tool-assisted focus over dumps.
- Company policies shouldn't excuse outdated views; personal initiative bridges the gap to cutting-edge use.
- Intuitions for effective AI usage develop through targeted practice, akin to mastering any development paradigm.
- Reproducible personal benchmarks provide objective measures of progress, empowering informed advocacy.
- Simplicity in setups unlocks productivity; maximalism distracts from core communication with the model.
QUOTES
- "The harsh reality is that things are changing, and I want to talk a bit about that."
- "Most people don't try a new solution on a problem they already know how to solve. If you know how to solve the problem, you just solve it."
- "The more the model knows, the dumber it gets."
- "If you tell an AI agent about a ghost, it will chase it forever."
- "Life is much better when you realize that's all you need."
- "If you're not oneshotting things often, that's because there are problems in your prompting, problems in your context management."
- "Stop adding all of this [__] to your stuff, though."
- "Just talk to it."
HABITS
- Test AI on self-solved problems to compare outputs and build capability intuition.
- Update .claude.md or .agent.md files immediately after spotting recurring AI errors to prevent future issues (a sample file follows this list).
- Use plan mode for complex tasks to generate clarifying questions before execution.
- Maintain clean environments by ensuring global type checks work without directory changes.
- Start prompts with minimal, precise descriptions of issues, letting AI fetch details.
- Revert and refine initial prompts instead of appending corrections to avoid context pollution.
- Create personal reproducible tests from past fixes to benchmark new AI versions.
- Avoid overconfiguring tools; stick to stock setups and subtle .md tweaks.
- Provide exact error messages and relevant context like user states in prompts.
- Use search tools or summaries in AI interfaces rather than full codebase inclusion.
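A minimal sketch of what an early .claude.md might contain, assuming a pnpm monorepo with generated backend types like the setups Theo describes. The specific commands and gotchas are illustrative, not quoted from the talk:

```markdown
<!-- Illustrative .claude.md sketch; commands and gotchas are examples, adapt to your repo -->
# Working in this repo

- Do not start the dev server to verify changes; run `pnpm typecheck` from the repo root instead.
- Type checks must pass from the root, without `cd`-ing into individual packages.
- Backend types are generated from the schema; rerun codegen after schema changes and never edit generated files by hand.
- Search for an existing pattern before adding a new utility or dependency.

Add one line here each time the agent repeats a mistake.
```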
FACTS
- AI models like Opus cap at 100-200k tokens, while Gemini reaches 1-2 million, but performance drops beyond 50k due to context rot.
- In tests where models must find a target hidden among long runs of repeated words, success rates fall from near 100% to under 60% as token counts grow.
- React 19 introduced hydration error traces with diffs, aiding precise debugging.
- G2I offers a network of over 8,000 experienced engineers, many from FAANG, with 7-day work trials.
- Developers like Pete achieve 500+ commits daily using stock Claude without custom plugins.
- Early agentic coding setups, such as Sonnet 3.5 in Windsurf, often failed due to poor context handling.
- Large codebases accumulate "weirdness" proportional to size, requiring encoded guidance.
- Amazon restricts external AI use in favor of its own internal tools like Kiro.
REFERENCES
- G2I hiring platform for AI-onboarded engineers.
- Cursor AI coding tool with Opus model.
- Repomix project for flattening codebases into single files.
- .claude.md and .agent.md guidance files for codebase instructions.
- Codex model for methodical file searching.
- Opus model in Claude Code for quick edits.
- T3 Chat product for AI interactions.
- PNPM scripts for dev and type generation.
- Convex for backend types and schema changes.
- Playwright for browser verification in debugging.
- Windsurf early agentic coding tool.
- GitHub Copilot AI assistant.
- Oh My Zsh custom runner for environment variables.
- Vibe Kanban codebase with ESLint config issues.
- React 19 hydration error traces.
- Pete's "Just Talk to It" article on simple AI prompting.
- Twitter for tracking hot AI tools.
- Twitch web app rebuild in React.
HOW TO APPLY
- Validate the problem exists by confirming errors occur under specific conditions like high throughput.
- Attempt the obvious fix based on error lines or logs before escalating.
- Invest effort in reproduction: debug, read code, check logs to isolate the issue.
- Once standard methods fail, explore novel solutions like alternative databases or technologies.
- Select known problems for AI: provide context as if briefing a junior engineer.
- Create reproducible tests: freeze the code state before a fix and document the needed info in markdown (a sample test record follows this list).
- Curate context minimally: describe the issue and include guidance files like .claude.md.
- Use search tools in AI interfaces to target relevant files without full dumps.
- Update guidance files with gotchas after errors, such as avoiding dev server runs.
- Opt for plan mode on complex tasks to elicit questions and refine before execution.
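A sketch of how one of these reproducible tests might be recorded, assuming the pre-fix code state is frozen on a git branch or tag. The branch name, prompt, and pass criteria are illustrative:

```markdown
<!-- Illustrative repro-test record; names, dates, and criteria are examples -->
# Repro test: checkout summary crash

- Code state: branch `repro/checkout-crash`, frozen just before the human fix landed
- Prompt: the exact error trace plus the "logged-out user, multi-item cart" qualifier
- Known-good fix: the guard the human eventually added in the summary renderer
- Pass criteria: the agent's change resolves the repro without touching unrelated files
- Results log: model and date for each run, noting one-shot success or failure
```

Re-running the same frozen prompt against each new model release turns this record into the personal benchmark described above.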
ONE-SENTENCE TAKEAWAY
Correct common AI coding mistakes like poor problem selection and context overload to unlock transformative productivity.
RECOMMENDATIONS
- Prioritize AI for familiar bugs to build trust through verifiable successes.
- Shun full codebase dumps; use tools that enable targeted searches instead.
- Evolve .claude.md files organically with observed pitfalls to embed team knowledge.
- Experiment with state-of-the-art tools monthly to avoid outdated skepticism.
- Repair broken environments first, as they cause more AI failures than task complexity.
- Embrace stock configurations over plugins; simplicity amplifies effectiveness.
- Leverage plan mode to clarify ambiguities, preventing downstream errors.
- Craft precise prompts with exact errors and qualifiers to guide AI accurately.
- Develop personal benchmarks from real fixes to objectively track AI improvements.
- Ignore restrictive policies by testing tools personally for career edge.
- Revert flawed sessions and restart with refined inputs to maintain clean history.
- Use screenshots for visual issues, streamlining communication beyond text.
- Balance intuition across prompts, docs, and environments for one-shot wins.
- View AI as autocomplete: minimize noise to maximize signal in predictions.
- Onboard via guidance files to learn codebase culture quickly as a human or AI.
- Avoid fallback-only AI use; integrate it early in problem-solving workflows.
- Share reproducible tests with communities to advance collective AI evaluation.
MEMO
In a rapidly evolving landscape of AI-assisted coding, developer Theo from T3.gg challenges the widespread skepticism that paints these tools as mere novelties for side projects. Drawing from his own shift from doubt to enthusiasm, Theo argues that AI's potential for real codebases hinges not on hype, but on avoiding fundamental missteps. He begins by spotlighting the error of deploying AI as a desperate last resort—after manual fixes fail—rather than on well-understood problems where outcomes can be benchmarked against human solutions. This approach, he says, builds intuition: treat AI like a junior engineer, furnishing it with the precise context you'd provide a teammate, from logs to relevant code snippets.
Theo delves into the perils of context management, a linchpin of effective AI use. Dumping entire repositories into prompts triggers "context rot," where models, functioning as sophisticated autocomplete systems, drown in noise and predict irrelevancies. He cites studies showing performance plummeting—success rates dropping from 100% to under 60% as tokens swell—regardless of generous limits like Gemini's million-plus capacity. Instead, tools such as Cursor or Claude shine by equipping AI with search functions, mimicking how developers hunt for typos without scouring every file. Theo advocates minimalism: describe the issue succinctly, let AI query the codebase, and refine guiding documents like .claude.md to encode quirks, turning stateless models into versed contributors.
Corporate hurdles exacerbate outdated perceptions, Theo notes, as approval delays lock teams into relics like early Sonnet versions or Amazon's restrictive Kiro fork. Yet he urges rebellion: test cutting-edge options like Codex or Opus yourself, even at some personal risk, because evolution here outpaces software history, and problems that were insoluble six months ago yield to solutions today. Environments, too, demand scrutiny; monorepos with root-level type-check failures bedevil AI, prompting endless chases after phantom errors. Fix these skeletons first, he insists, perhaps by prompting AI itself to mend configs, as Theo did in a Vibe Kanban project.
Overconfiguration emerges as a seductive trap, bloating setups with MCP servers, plugins, and verbose skills that compound rather than cure woes. Theo, using bare-bones Claude without extras, contrasts this with maximalist pitfalls like "Oh My Open Code," which overwhelm novices. High-output creators like Pete rack up 500 daily commits via stock tools and direct prompts—"just talk to it"—proving simplicity's power. Plan mode, meanwhile, preempts disasters by soliciting clarifications, fostering iterative refinement without history pollution.
Ultimately, Theo circles back to a hydration error tale, where vague prompting failed until exact traces from React 19 unlocked AI's prowess. This underscores his thesis: select solvable problems, curate context, maintain hygiene, and iterate thoughtfully. For skeptics, it's a call to recalibrate; for users, a roadmap to "vibe coding" efficiency. As AI blurs lines between human and machine ingenuity, Theo's lessons promise not replacement, but augmentation—provided we wield it wisely.
Theo's discourse resonates amid 2026's hiring chaos, where AI-generated resumes flood inboxes, yet platforms like G2I counter with vetted, AI-fluent talent and trial periods. His emphasis on personal benchmarks—freezing code states for repeatable evals—equips developers to quantify progress, sharing gold-standard tests that demystify capabilities. In an era where tools leapfrog monthly, ignoring these pitfalls risks obsolescence; embracing them, a renaissance in creative coding.