A while back, I ported the LSODA solver to C++ (https://github.com/dilawar/libsoda-cxx) for a neural simulator I was working on during my PhD. The port is almost identical to the reference C code, except that it uses STL containers and a few other goodies (and, at one point, had some off-by-one errors).
Today, I used Claude to improve the unit tests and docs, and to enhance the CMake integration.
First, I asked Claude to look into the scipy codebase, find the LSODA-related tests, and port them to my project. It did a great job and wrote a better commit message than I would have, but that’s not saying much! The tests Claude wrote used assert, which is a no-op in release builds (where NDEBUG is defined), so passing tests in a release build are meaningless. When I pointed this out, it replaced them with runtime exceptions that work in both debug and release builds. Not cool, Claude, but not bad either. It’s C++!
Then I asked for a CMake harness so that folks can integrate my library into their CMake-based workflows easily. It did OK at the first attempt but made a boo-boo, which was caught by the CodeRabbit review bot (free for open source). Claude fixed it later!
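From the consumer side, this kind of integration usually looks something like the sketch below. The target and tag names here are my illustrative guesses, not necessarily what libsoda-cxx actually exports; check the repo's README for the real ones.

```cmake
# Consumer-side sketch; package/target names are illustrative, not verified.
cmake_minimum_required(VERSION 3.14)
project(my_sim CXX)

include(FetchContent)
FetchContent_Declare(
  libsoda
  GIT_REPOSITORY https://github.com/dilawar/libsoda-cxx
  GIT_TAG        main)          # pin a release tag in real use
FetchContent_MakeAvailable(libsoda)

add_executable(my_sim main.cpp)
target_link_libraries(my_sim PRIVATE lsoda)  # actual target name may differ
```

The same library can also be consumed via find_package() if it installs an export/config file; FetchContent is just the lowest-friction route for small projects.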
All of it took roughly 30 minutes!
If you are very, very familiar with the existing codebase, then I don’t think Claude/AI tools add much value. With a hot cache, I could have done all of this myself in an hour. Saving 30 minutes may not be worth losing your touch with something you enjoy, i.e. programming and building the story in your head. I definitely feel that I am on “opium” when I use it for long stretches.
Claude is very, very good at things at which I am below average, which is true for almost everything I am doing with Claude these days. So it is not surprising that these tools enjoy a great reputation among users. More objectively, good AI tools are perfect for things that feel like chores, e.g. “translate tests from this Python implementation to C++” or “reproduce this bug and when you find it, commit it on a branch” or “create a new project with this and that”.
At my last stint at a medium-sized org, which lasted a month, we were asked to use Claude Code as much as possible. Sure, I finished a task in half an hour that would have taken a person familiar with the codebase three hours, but the PR was still pending review after 12 days! So where are the gains? Beware of Amdahl’s law as well!
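Amdahl’s law makes the point precise: speeding up one stage of a pipeline caps the end-to-end gain by the fraction of total time that stage occupies. A rough back-of-the-envelope with the numbers above (illustrative, not measured carefully):

```latex
% Amdahl's law: overall speedup when a fraction p of the work
% is accelerated by a factor s.
S_{\text{overall}} = \frac{1}{(1 - p) + p/s}
% Rough numbers from the anecdote: ~3 hours of coding out of a
% ~12-day (~288 hour) ticket lifetime, so p \approx 3/288 \approx 0.01.
% Even with s \to \infty:
S_{\text{overall}} \approx \frac{1}{1 - 0.01} \approx 1.01
```

That is, roughly a 1% end-to-end improvement if review and everything else stay as slow as before.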
As many have pointed out, AI tools are very eager to declare success, and they won’t stop at anything until they can declare it. Once in a while they will stop and ask a question, but most of the time they will find a plausible-looking solution to the problem at hand. Make sure your prompt has the definition of “success/done” well defined and properly articulated. Discuss it with all stakeholders when a ticket is created; otherwise you’ll get something that kind of works but doesn’t.