Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

fubarx@lemmy.world · 24 hours ago

Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

enumerator4829@sh.itjust.works · 1 hour ago

I started experimenting with the spice the past week. Went ahead and tried to vibe code a small toy project in C++. It’s weird. I’ve got some experience teaching programming, this is exactly like teaching beginners - except that the syntax is almost flawless and it writes fast. The reasoning and design capabilities on the other hand - ”like a child” is actually an apt description.

I don’t really know what to think yet. The ability to automate refactoring across a project in a more ”free” way than an IDE is kinda nice. While I enjoy programming, data structures and algorithms, I kinda get bored at the ”write code”-part, so really spicy autocomplete is getting me far more progress than usual for my hobby projects so far.

On the other hand, holy spaghetti monster, the code you get if you let it run free. All the people prompting based on what feature they want the thing to add will create absolutely horrible piles of garbage. On the other hand, if I prompt with a decent specification of the code I want, I get code somewhat close to what I want, and given an iteration or two I’m usually fairly happy. I think I can get used to the spicy autocomplete.

Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

Opper