AI-generated code is plausible before it is correct. This subtle difference creates a new type of risk: bugs that pass code review because the code looks reasonable.
There is a fundamental difference between code that looks correct and code that is correct. LLMs excel at producing code that seems reasonable: syntactically clean, stylistically consistent, and able to compile and pass basic tests. Yet they can introduce subtle errors in business logic, edge cases, error handling, or security. These bugs often survive code review because the reviewer trusts the automated generation.
The most common categories of bugs in AI-generated code are off-by-one errors in indexes and loops, incorrect null/undefined handling, race conditions in asynchronous code, SQL injection via string concatenation rather than parameterization, and algorithms that are wrong for specific edge cases. An LLM often generates the happy path perfectly and fails on the error cases.
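The SQL injection case is the easiest of these to demonstrate. Below is a hypothetical sketch (table and function names invented for illustration) contrasting the concatenation pattern an LLM sometimes emits with the parameterized fix, using Python's standard `sqlite3` module:

```python
import sqlite3

def find_user_unsafe(conn, username):
    # String concatenation: attacker-controlled input becomes part of the SQL.
    query = "SELECT id FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # Parameterized query: the driver treats the value as data, never as SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2: the injection returns every row
print(len(find_user_safe(conn, payload)))    # 0: treated as a literal (nonexistent) name
```

Both functions behave identically on honest input, which is exactly why the unsafe version looks plausible in review.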
False confidence is the most dangerous psychological risk. Studies show that developers review AI-generated code less rigorously than human-written code; there is a tendency to trust the machine. Counter-intuitive, but real. The countermeasure: treat AI-generated code exactly like untrusted external code, applying the same review standards you would to code from a junior developer you don't know. Automated tests, systematic review of error cases, and static analysis tools are your best allies.
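Static analysis pays off most on the null/undefined class of bugs. A sketch, with invented names, of the kind of `None`-handling slip that a type checker such as mypy flags when the code carries type hints:

```python
from typing import Optional

def find_email(users: dict[str, str], name: str) -> Optional[str]:
    # dict.get returns None for a missing key, so the return type is Optional.
    return users.get(name)

def notify(users: dict[str, str], name: str) -> str:
    email = find_email(users, name)
    # An LLM draft often calls email.lower() directly; mypy rejects that,
    # because `email` may be None. The guard makes the edge case explicit.
    if email is None:
        return "no address on file"
    return email.lower()

print(notify({"alice": "A@example.com"}, "alice"))  # a@example.com
print(notify({}, "bob"))                            # no address on file
```

The happy path works either way; only the type checker (or an edge-case test) forces the missing-user branch to exist.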
- Treat AI code as untrusted code from an unknown author
- Specifically review edge cases and error handling
- Enforce high test coverage on generated code
- Static analysis tools catch classes of errors that AI repeatedly makes
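The checklist above can be made concrete with the off-by-one category. A hypothetical pagination helper of the kind an assistant generates correctly for the happy path, followed by the edge-case tests a review should insist on:

```python
def paginate(items, page, per_page):
    """Return the 1-indexed page `page` of `items` (empty list past the end)."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    start = (page - 1) * per_page  # off-by-one hotspot: pages are 1-indexed
    return items[start:start + per_page]

data = list(range(10))

# Happy-path test: often the only one that gets written.
assert paginate(data, 1, 4) == [0, 1, 2, 3]

# Edge cases the checklist demands:
assert paginate(data, 3, 4) == [8, 9]  # last, partial page
assert paginate(data, 4, 4) == []      # page past the end
assert paginate([], 1, 4) == []        # empty input
```

If the slice had been written with `page * per_page` as the start, the happy-path test for page 1 would still fail, but subtler variants (0-indexed pages, inclusive end bounds) pass page 1 and only break on the boundary cases above.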
→ See also: AI risks: copyright · AI risks: data security · AI in software development