I haven’t sat down and put any real thought into documenting my foray into using AI in my recent programming endeavors. I would say that in the last 2-3 years the subject has really skyrocketed, especially with ChatGPT and the sudden mega tech cold war going on over AI. There were instances where I started using it on a small scale, and some of that turned into more trivial usages (like banter). However, in the past few months I started to explore using it more seriously, incorporating it not only through “vibe coding” but through various other tools, hooking into things like GenAI or agents. I wanted to begin jotting down my experiences here.
I think my first “taste” of AI came from using GitHub Copilot for code completion. Effectively, code completion takes hints from the way you code and attempts to infer your next intention based on patterns it has been trained on. In some ways it’s like the autocomplete in spell/grammar checkers, except it can do things like complete your function name, generate boilerplate for large sections of code (e.g. CRUD, unit tests), or attempt to guess how a function may be filled out based on the name you provide. So it seems useful, especially for veterans who dislike boilerplate and want to focus on big-picture items.
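For example (a contrived illustration, not actual Copilot output), a descriptive signature alone is often enough for the tool to propose an entire plausible body:

    import re

    def is_valid_email(address: str) -> bool:
        # Typical of what a completion tool might suggest from the name and
        # signature alone; treat this as a sketch, not vetted production code.
        return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", address) is not None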
However, code completion has a major downside for veterans and for anyone who isn’t experienced enough to recognize it: the suggestions get in the way. With various plugins enabled to perform tasks such as syntax checks and suggestions, I feel there’s a bit of competition going on between these plugins in trying to provide “assistance”. Frequently, I’ve found myself accidentally accepting a line of code that is completely wrong because I was going too fast and missed the suggestion, which forces me to wipe out the line (or key pieces) and start over. So in a way, code completion can hinder productivity; in my case, I found it better to simply turn it off (a fine option once you’ve mastered the syntax of a language).
The other problem I’ve encountered with code completion is the usage cap. For instance, the GitHub plugin seems limited in the number of completions allowed per month, and that allowance can get wiped out quickly given how frequently completions fire. In that situation, I would install Gemini’s code completion/agent alongside it, but that autocomplete can get wonky too. In the end, I felt it was better, at least for now, to hand-write the code rather than let completion become a dependency for me.
From there, I know there are some IDEs forked from Visual Studio Code that include an agent or integrate better with LLMs. So far the only one I’ve used is Windsurf. When I gave it a try, I thought the vibe coding aspect wasn’t bad, but at least for the project I employed it on, I ate up tokens fast. I get that the intention is to get users to pay because usage gets expensive, but this was worse than the old shareware games back in the day. So now I only use Windsurf when I want two separate projects open simultaneously to reference one another. There is supposedly a way to swap the model being used, moving from the main paid model to a freemium one, but I couldn’t figure out how, and it just annoyed me into putting the IDE on ice.
In terms of vibe coding itself, I mostly just use Gemini along with file uploads; I’ve found that works best in general. I know some friends hate Gemini and believe the output is bad, but if you’re not expecting to do straight copypasta coding, Gemini actually works quite well. I mostly try to focus Gemini on narrow problems and home in on key pieces of code. You don’t want Gemini to solve too much, or the code gets out of hand quickly. Some problems I encountered were horrible spaghetti-code suggestions, the bad habit of wanting to solve too much or reinvent the wheel each time, outdated APIs, and lengthy, verbose explanations on top of massive amounts of code that you’re forced to scroll through. So in general, if you want to be successful with Gemini, you need to remind it to focus on very specific problems, ignore certain pieces of code it produces, and lean on design patterns, components, existing modules, etc. to ensure that it stays on task.
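To give a flavor of what “narrow” means here, a made-up prompt (the function and class names are placeholders, not from a real project):

    In the parse_invoice function below, fix only the date handling so it
    accepts both MM/DD/YYYY and YYYY-MM-DD. Do not restructure anything else,
    and keep the existing InvoiceRecord dataclass exactly as it is.

Constraining the scope like this is what keeps Gemini from rewriting half the file.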
However, Gemini can be prone to fumbling. Occasionally, Gemini will become unresponsive for no good reason, or its answer will die partway through the response. Part of the issue is that Gemini can get caught in a weird loop when your conversation session goes on too long and it’s trying to reference older parts of the conversation. In some cases it just gets jammed up, and the only fix is to start a fresh session to “clear” its memory. I think you lose the context of the older conversation, which means you need to re-teach it stuff from the previous session. So that part really sucks when it occurs.
The fumbling can escalate when the answer gets really twisted and Gemini ends up providing something truly bad, such as outdated code. In that situation, I’ll swap to ChatGPT for a second opinion, much like going to another doctor for a diagnosis. The only reason I use ChatGPT as the alternative is that the quality I’ve seen seems better. At the same time, though, ChatGPT imposes various limits on things like frequency of usage and uploads for analysis, so you’re bottlenecked unless you decide to shell out some cash. For me, I do whatever I can to avoid entering my billing info and bail once I hit those limits.
Now, as I mentioned, I have tried the Gemini code assistant too, as part of the VS Code plugin universe. However, my experience with that has been less than stellar. I found the module far more limited: it does a bad job examining your code base, had a period where there was simply no response whatsoever, and would either overdo what was needed or completely miss the target. The code completion aspect wasn’t as bad and is probably the only piece worth using at the moment. But the vibe coding part is far from good, and I think it’s better to just use the web interface with file uploads to get reasonable code done.
The things I am excited about, though, are the GenAI and agent toolkits. GenAI is really cool, but you need very focused tasks along with good prompts, and you need to know how to structure your output. I’ve been somewhat successful with several GenAI tools I created. Part of the trick I’ve found is to have narrow prompts that connect to the structured output. Creating premade prompts with keyword or value substitution helps ensure more consistent responses, as opposed to free-form text fields. In one case, I used GenAI to analyze an image and produce data for set fields specified by a pre-generated prompt; a sketch of that pattern follows. That one really excited me and showed me the possibilities of GenAI.
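Roughly, it looked like the sketch below, using the google-generativeai Python SDK. The field names and image file are stand-ins for my actual project, and the JSON output mode is per the SDK docs; treat it as a minimal sketch rather than the exact code:

    import google.generativeai as genai
    import PIL.Image

    genai.configure(api_key="YOUR_API_KEY")  # assumes a key from env/config

    # Premade prompt with value substitution: the field list is swapped in
    # instead of typed free-form, which keeps responses consistent.
    FIELDS = ["make", "model", "color", "license_plate"]
    PROMPT = (
        "Examine the attached image and return ONLY a JSON object with these "
        f"fields: {', '.join(FIELDS)}. Use null for any field you cannot determine."
    )

    model = genai.GenerativeModel(
        "gemini-1.5-flash",
        generation_config=genai.GenerationConfig(response_mime_type="application/json"),
    )

    response = model.generate_content([PROMPT, PIL.Image.open("vehicle.jpg")])
    print(response.text)  # a JSON string, ready for json.loads()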
With agents, I have only written two so far, one of which I am using in conjunction with Pub/Sub, my homemade Eventarc simulator (listener/subscriber), and Cloud Functions that kick off an agent program. I have some other ideas about how to use them, but right now it’s only a single version. Still, I can see how the Agent Development Kit would be interesting once you start using multi-model scenarios to do things like data correction/validation. What I’m hoping to do more of in the future with agents is automation. Without going into details, I will say that mass data generation using a combination of agents and GenAI is a key thing I want to handle.
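For reference, the glue between Pub/Sub and the agent is basically a Cloud Function like this. The run_agent import is a hypothetical stand-in for my agent program’s entry point; the event plumbing itself follows the functions-framework conventions:

    import base64

    import functions_framework

    from my_agent import run_agent  # hypothetical entry point, not a real package

    @functions_framework.cloud_event
    def on_message(cloud_event):
        # Pub/Sub delivers the payload base64-encoded inside the CloudEvent.
        payload = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
        run_agent(payload)  # kick off the agent with the decoded message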
One problem that has been consistent, though, is data accuracy. With GenAI, and to a lesser extent agents, I found that data generation is not 100% accurate. In fact, what it returns is a gamble: even if you send the same prompt consecutively, the data returned might vary. This is worrisome, as you would expect to get a car each time you ask for a car. Part of the issue, I suspect, is that the data the AI was trained on might be munged up. Some people call this hallucination, and the way one supposedly corrects it is through RAG (retrieval-augmented generation) techniques. Of course, if you work with a limited agent, you can only supply limited help from RAG; in my case, the only option I really have is Google Search. However, the way the GenAI would have to be structured seems to require two GenAI calls: one grounded with RAG and the other against the source. At least in my experience, you cannot combine a RAG call with a function call in the same request, which sucks. But it also points to how expensive this can get, not just in billing but in network traffic and time.
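A minimal sketch of that two-call pattern, again with the google-generativeai SDK. The “google_search_retrieval” tool string and model name are taken from the SDK docs at the time of writing, and the prompts are placeholders; note the second call also pins temperature to 0, which in my experience reduces (but does not eliminate) run-to-run drift:

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")

    # Call 1: grounded lookup via Google Search (the RAG side). Search
    # grounding and function calling cannot be combined in one request.
    grounded = genai.GenerativeModel(
        "gemini-1.5-flash", tools="google_search_retrieval"
    )
    facts = grounded.generate_content("List the current Honda Civic trim levels.")

    # Call 2: plain generation that folds the grounded facts into the task.
    model = genai.GenerativeModel(
        "gemini-1.5-flash",
        generation_config=genai.GenerationConfig(temperature=0),
    )
    answer = model.generate_content(
        f"Using only these facts:\n{facts.text}\n\nFill out the vehicle record."
    )
    print(answer.text)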
Beyond these things, the other main piece on the horizon is MCP (Model Context Protocol). From what I’ve read, it’s more about orchestrating agents along with services. The big name that keeps getting mentioned alongside it is LangChain, which I have briefly looked at. So that’s something on my plate in the near future, as it may be the thing I use for doing more mass data generation.
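From my brief look, a first LangChain experiment for mass data generation would be something like the snippet below. The package and class names come from the langchain-google-genai integration; the prompt is obviously a placeholder:

    from langchain_core.prompts import ChatPromptTemplate
    from langchain_google_genai import ChatGoogleGenerativeAI

    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
    prompt = ChatPromptTemplate.from_template(
        "Generate {count} fictional {record_type} records as a JSON array."
    )

    chain = prompt | llm  # LCEL: pipe the filled-in template into the model
    result = chain.invoke({"count": 5, "record_type": "customer"})
    print(result.content)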
At any rate, I know I’ve barely scratched the surface of any of this. But this stuff is truly fascinating. I don’t think I’ve been this excited for technology since seeing Google Maps for the first time in 2005.