Agentic coding has some major issues...
SUMMARY
WebDevCody discusses downsides of agentic coding in software engineering, highlighting increased maintainer workload, security risks from AI-generated code, and his thorough review workflow for the Automaker open-source project.
STATEMENTS
- Agentic coding enables shipping more code quickly, but it overwhelms open-source maintainers with excessive pull requests requiring manual security and bug checks.
- Maintainers face bottlenecks from reviewing large-scale changes, like pull requests with 45 or 185 file modifications, often without prior community discussion.
- Large language models (LLMs) in agentic coding can introduce security vulnerabilities if not properly directed, necessitating tools like Code Rabbit for automated checks.
- Review processes now include AI-assisted audits from Gemini and Claude, but maintainers must still manually sift through generated comments.
- A discovered vulnerability allowed potential drive-by attacks via malicious links exploiting Automaker's local API to run arbitrary commands.
- MCP server configurations, such as one for Context7, are launched by executing commands the user pastes in, which becomes a risk when the API that triggers them is reachable without authentication.
- Fact-checking LLM outputs is essential, as they can hallucinate, and developers must verify findings through follow-up questions and independent research.
- To fix vulnerabilities, developers can prompt LLMs to implement solutions like API keys, ensuring only authorized components access endpoints.
- Agentic coding differs from "vibe coding"; responsible use involves rigorous code reviews, as accountability for production issues rests with the developer.
- Continuous learning in AI-driven coding requires evolving workflows, with resources like courses providing iterative prompting and full-stack application building.
IDEAS
- Rapid code generation floods open-source projects with unvetted pull requests, shifting the quality burden entirely to volunteer maintainers.
- AI tools amplify human laziness, allowing contributors to bypass community norms by submitting features that misalign with project visions.
- Automated review tools like Code Rabbit can flag false positives, such as marking Tailwind CSS changes as critical, eroding trust in AI detections.
- Localhost exposures in Electron apps create novel drive-by attack vectors, where malicious websites can hijack API calls without user awareness.
- MCP servers, designed for flexibility, inadvertently enable arbitrary command execution, turning helpful integrations into security liabilities.
- Hallucinations in LLMs demand constant human verification, turning code review into a detective-like process of cross-checking AI claims.
- Prompting LLMs for fixes, like generating API keys with crypto libraries, transforms potential weaknesses into iterative improvements.
- Distinguishing agentic coding from casual "vibe coding" underscores the need for diligence, as AI speed doesn't absolve developers of responsibility.
- Open-source maintenance evolves into a hybrid human-AI oversight role, where tools assist but can't replace strategic decision-making.
- The AI industry's rapid pace necessitates ongoing education, with courses updating to cover emerging tools like Cursor and Claude for full-stack development.
INSIGHTS
- Agentic coding accelerates development but exacerbates maintainer burnout by prioritizing quantity over quality in contributions.
- Security in AI-assisted projects hinges on proactive human oversight, as LLMs excel at generation but falter in holistic risk assessment.
- Local API vulnerabilities reveal how desktop apps blur boundaries between user interfaces and exploitable backends, demanding authentication layers.
- Effective prompting turns LLMs from error-prone creators into reliable fixers, emphasizing the skill of iterative dialogue over blind trust.
- Open-source sustainability requires cultural shifts, like mandating pre-submission discussions, to counter AI-fueled feature sprawl.
- Accountability in AI coding remains human-centric; production failures trace back to unchecked adoption, not the tools themselves.
QUOTES
- "Every day I wake up and I see a bunch more pull requests created for this project. And that's more workload that I have to basically go through and review these things manually."
- "They will ship security vulnerabilities and if you don't properly direct the LM to check for things it's going to get missed."
- "If you're not fact-checking the LLM, you are going to do yourself a disservice in the long run."
- "At the end of the day, you are responsible for your users in your application."
- "This AI industry is rapidly evolving and if you get this course you will have access to these new videos that I continuously publish."
HABITS
- Reviewing every pull request manually for security, performance, and bugs before merging.
- Running branches locally to test functionality and conducting manual security audits.
- Prompting LLMs like Claude with a git diff plus a targeted list of security concerns to check (see the sketch after this list).
- Asking follow-up questions to LLMs and performing independent research to verify outputs and avoid hallucinations.
- Prompting LLMs iteratively to generate fixes, such as implementing API key middleware for vulnerabilities.
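The diff-driven review habit above can be scripted. Below is a minimal TypeScript sketch, assuming a Node environment; the file name, default base branch, and prompt wording are illustrative, not Automaker's actual tooling.

```ts
// review-prompt.ts: minimal sketch, not Automaker's actual tooling.
// Capture a branch diff and wrap it in a targeted security-review prompt
// to paste into Claude or another LLM.
import { execSync } from "node:child_process";

const baseBranch = process.argv[2] ?? "main"; // placeholder default

// Diff the current branch against the base branch.
const diff = execSync(`git diff ${baseBranch}...HEAD`, {
  encoding: "utf8",
  maxBuffer: 64 * 1024 * 1024, // large agent-generated PRs exceed the default buffer
});

// Targeted prompts beat a bare "review this": name the concerns explicitly.
const prompt = [
  "Review the following diff for security vulnerabilities,",
  "especially unauthenticated endpoints, command execution,",
  "and unvalidated input. List each finding with file and line.",
  "",
  "--- BEGIN DIFF ---",
  diff,
  "--- END DIFF ---",
].join("\n");

process.stdout.write(prompt);
```

The output can be copied into a Claude conversation alongside the locally checked-out branch, keeping the prompt focused on the specific classes of bugs being hunted.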
FACTS
- Pull requests generated via agentic coding can involve up to 185 file changes, making comprehensive reviews impractical.
- Code Rabbit automatically scans all Automaker pull requests for security, maintainability, and bugs.
- Automaker's API runs on localhost port 3008 within an Electron app, exposing its endpoints to drive-by requests from any webpage (illustrated in the sketch after this list).
- Automaker's MCP configuration sets up local servers for tools like Context7 by executing whatever command the user pastes in.
- The Agentic Jumpstart course includes 74 videos and 11.5 hours of content, with ongoing updates for new AI advancements.
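To make the localhost exposure concrete, here is an illustration of why an unauthenticated local API is a drive-by target; the endpoint path and payload are hypothetical, since the video does not show Automaker's actual routes.

```ts
// drive-by.ts: illustration only; the endpoint path and payload are made up.
// Script on ANY webpage the user visits can fire this request. Because
// "text/plain" counts as a CORS simple request, no preflight is sent, so an
// unauthenticated server on localhost:3008 still receives and acts on the
// request even though the page can't read the response.
async function driveBy(): Promise<void> {
  await fetch("http://localhost:3008/api/run", {
    method: "POST",
    mode: "no-cors",
    headers: { "Content-Type": "text/plain" },
    body: JSON.stringify({ command: "echo attacker-controlled" }),
  });
}

void driveBy();
```

The attacker never needs to read the response; triggering the command is the whole attack, which is why binding to localhost alone is not sufficient protection.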
REFERENCES
- Automaker.app: Open-source project for AI-assisted coding workflows.
- Code Rabbit: Tool for automated code reviews checking security and bugs.
- Gemini Code Assist and Claude: AI services used for reviewing pull requests.
- MCP (Model Context Protocol): Configuration mechanism used to run tools like Context7.
- Cursor: AI coding tool mentioned in course for agentic practices.
- TanStack Start, Drizzle, PostgreSQL, Railway: Tech stack for building and deploying full-stack apps in the course.
HOW TO APPLY
- Scan incoming pull requests with automated tools like Code Rabbit to flag potential security issues, bugs, and maintainability problems before manual review.
- Checkout the branch locally, run it to test functionality, and perform a manual audit by prompting an LLM with a git diff for specific security checks.
- Investigate LLM-flagged vulnerabilities deeply by asking follow-up questions and cross-verifying with independent research to confirm legitimacy.
- Prompt the LLM to devise and implement fixes, such as adding API key middleware using crypto libraries to secure endpoints (a sketch follows this list).
- Differentiate agentic coding from casual use by always reviewing code for production readiness, ensuring alignment with project vision and user safety.
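As a concrete reference for the API-key fix described above, here is a minimal sketch assuming an Express-style server inside the Electron main process; the framework, header name, and route are assumptions, not Automaker's actual implementation.

```ts
// api-key-middleware.ts: minimal sketch of the fix described above.
// Assumes an Express-style server; header name and route are illustrative.
import crypto from "node:crypto";
import express from "express";

// Generated fresh on every app startup; only the app's own UI is handed this value.
export const API_KEY = crypto.randomBytes(32).toString("hex");

const app = express();
app.use(express.json());

// Reject any request that doesn't present the startup key.
app.use((req, res, next) => {
  const presented = req.header("x-api-key");
  const valid =
    typeof presented === "string" &&
    presented.length === API_KEY.length &&
    crypto.timingSafeEqual(Buffer.from(presented), Buffer.from(API_KEY));
  if (!valid) {
    res.status(401).json({ error: "missing or invalid API key" });
    return;
  }
  next();
});

app.post("/api/run", (_req, res) => {
  // ...existing handler logic, now reachable only with the key...
  res.json({ ok: true });
});

// Bind to loopback only; the key guards against drive-by requests from the browser.
app.listen(3008, "127.0.0.1");
```

The timing-safe comparison and explicit loopback bind are defensive defaults added here for completeness, not details specified in the video.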
ONE-SENTENCE TAKEAWAY
Agentic coding boosts speed but demands rigorous human oversight to mitigate security risks and maintainer overload in software projects.
RECOMMENDATIONS
- Mandate community discussions in Discord before accepting AI-generated pull requests to align with core project goals.
- Integrate API keys from app startup to prevent unauthorized local access in Electron-based tools.
- Routinely fact-check LLM outputs through multiple verification layers to combat hallucinations.
- Invest in courses like Agentic Jumpstart for evolving workflows in AI-driven development.
MEMO
In the fast-evolving world of software engineering, agentic coding—where AI agents like large language models generate and iterate on code—promises unprecedented productivity. Yet, as WebDevCody, a prominent open-source maintainer and educator, warns in his latest video, this revolution comes with hidden pitfalls. Cody, who develops the Automaker tool for AI-assisted workflows, shares a candid view from the trenches: while AI enables shipping code at breakneck speed, it floods maintainers with unvetted contributions, turning volunteer-driven projects into overwhelming chores.
The core issue, Cody explains, lies in the sheer volume of pull requests. Open-source repositories now bristle with AI-spawned changes—some touching 185 files—demanding exhaustive reviews for security flaws, performance drags, and outright bugs. Without prior dialogue in communities like his Discord, these submissions often clash with a project's vision, forcing maintainers to play gatekeeper. Cody's daily routine involves sifting through automated scans from tools like Code Rabbit, Gemini Code Assist, and Claude, only to manually verify their outputs amid a sea of comments. It's a bottleneck that serious stewards, committed to quality over "slop," cannot ignore.
Security emerges as the starkest downside. LLMs, for all their prowess, aren't infallible; they can embed vulnerabilities if prompts lack precision. Cody recounts auditing a recent pull request, where Claude uncovered a drive-by attack vector in Automaker's MCP integrations. These configurations, meant to flexibly run tools like Context7 for enhanced documentation, expose local servers to malicious exploits. A phishing link could, in theory, POST to the app's localhost API on port 3008, injecting harmful commands via the Electron framework. To counter this, Cody fact-checks AI alerts rigorously—probing for hallucinations and researching independently—before prompting fixes, such as crypto-generated API keys that lock down endpoints to the UI only.
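The "UI only" part of that fix can be enforced because the renderer never has to see the key at all: it can ask the Electron main process over IPC, and the main process attaches the key when calling the local API. A rough sketch under assumed names (the channel, import, and route are invented for illustration, and global fetch assumes a recent Electron/Node runtime):

```ts
// main.ts (excerpt): rough sketch; channel name, import, and route are invented.
import { ipcMain } from "electron";
import { API_KEY } from "./api-key-middleware"; // key generated at app startup

// The renderer asks the main process to call the local API on its behalf.
// Only the main process ever holds the key, and no external webpage can
// reach this IPC channel, so drive-by POSTs now fail the middleware check.
ipcMain.handle("automaker:run", async (_event, payload: unknown) => {
  const res = await fetch("http://127.0.0.1:3008/api/run", {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": API_KEY },
    body: JSON.stringify(payload),
  });
  return res.json();
});
```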
Beyond vulnerabilities, Cody debunks the myth of "vibe coding," where developers blindly accept AI outputs. True agentic practice, he argues, demands diligence: local testing, iterative prompting, and ultimate accountability. If production breaks or users suffer breaches, the buck stops with the human, not the model. For those navigating this landscape, Cody points to structured learning—his Agentic Jumpstart course, packed with 74 videos on tools like Cursor, code reviews, and full-stack builds using TanStack, Drizzle, PostgreSQL, and Railway deployments. As AI evolves, so must workflows, with lifetime access to updates ensuring practitioners stay ahead.
Ultimately, Cody's message is a call for balance: embrace agentic coding's future, but fortify it with human judgment. In an industry racing forward, overlooking these downsides risks not just buggy code, but the erosion of trust in open-source ecosystems that power modern tech. Happy coding, he concludes, but only if it's secure and sustainable.