this post was submitted on 04 Jan 2024
24 points (100.0% liked)

Programming

13380 readers
1 user here now

All things programming and coding related. Subcommunity of Technology.


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 1 year ago

cross-posted from: https://programming.dev/post/8121843

~n (@nblr@chaos.social) writes:

This is fine...

"We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet were also more likely to rate their insecure answers as secure compared to those in our control group."

[Do Users Write More Insecure Code with AI Assistants?](https://arxiv.org/abs/2211.03622)

[–] jarfil 5 points 10 months ago* (last edited 10 months ago) (1 children)

People tend to deify LLMs, because of the vast amounts of knowledge trained into them, but their answers are more like a single "reasoning iteration".

How many human coders can sit down, type a bunch of code at 100 WPM out of the blue, and end up with zero security flaws or errors? Essentially none, not even if they get updated requirements, and the same holds for LLMs. Coding is an iterative job, not a "zero-shot" one.

Have an LLM iterate several times over the same piece of code ("think" about it), have it explain what it's doing each time ("reason" about it)... then test-run it, fix any compiler errors... run a test suite, fix any failing tests... then ask it to take into account a context of best practices and security concerns. Only then can the code be compared to that of a serious human coder.
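
That loop can be sketched roughly like this. `ask_llm` is a hypothetical stand-in for whatever model API you actually use (here it just returns a fixed snippet so the sketch runs); a real loop would also run the project's test suite, not just a syntax check:

```python
def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call -- stand-in for any real API (hosted or local model)."""
    # For illustration this stub always returns a trivially correct snippet.
    return "def add(a, b):\n    return a + b\n"

def refine(task: str, max_iterations: int = 3) -> str:
    """Iterate: draft code, have the model explain/critique it, then test-run it."""
    code = ask_llm(f"Write Python code for: {task}")
    for _ in range(max_iterations):
        # "Reason" step: ask the model to explain what the code does and flag problems.
        critique = ask_llm(f"Explain this code and point out bugs or security flaws:\n{code}")
        # "Test-run" step: here just a syntax check; a real loop would execute tests too.
        try:
            compile(code, "<candidate>", "exec")
        except SyntaxError as err:
            code = ask_llm(f"Fix this error: {err}\nCritique: {critique}\nCode:\n{code}")
            continue
        break  # compiles cleanly; a real loop would only stop once the tests pass
    return code
```

The point of the sketch is that the model is called several times per task, each call with a growing context, which is exactly the cost the "single run" marketing hides.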

But that takes running the AI over and over and over with a large context, while AIs are being marketed as "single run, magic bullet"... so we can expect a lot of shit to happen in the near future.

On the bright side, anyone willing to run an LLM a hundred times over every piece of code, like in a CI workflow, in an error seeking mode, could catch flaws that would otherwise take dozens of humans to spot.
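
A CI-style error-seeking pass could look roughly like this sketch. `ask_llm_for_flaws` is a hypothetical stand-in (a real reviewer would sample a model at nonzero temperature, which is why repeated runs and a vote threshold make sense -- spurious one-off flags get filtered out):

```python
from collections import Counter

def ask_llm_for_flaws(code: str, seed: int) -> list[str]:
    """Hypothetical reviewer call; `seed` would vary the sampling in a real model."""
    # Deterministic stand-in: flag string interpolation inside a SQL execute call.
    flaws = []
    if "cursor.execute(f" in code:
        flaws.append("possible SQL injection via string formatting")
    return flaws

def error_seeking_review(code: str, runs: int = 100, threshold: int = 5) -> list[str]:
    """Run the reviewer many times; keep only flaws flagged often enough to trust."""
    counts = Counter()
    for seed in range(runs):
        for flaw in ask_llm_for_flaws(code, seed):
            counts[flaw] += 1
    return [flaw for flaw, n in counts.items() if n >= threshold]
```

Running this on every changed file in CI is cheap compared to dozens of human reviewers, which is the trade-off the comment is pointing at.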

[–] ericjmorey@programming.dev 2 points 10 months ago

Excellent points!