I Tried Every AI Coding Assistant
Science & Technology
Introduction
In recent years, artificial intelligence has revolutionized the way developers write code. Various AI coding assistants have emerged, each offering unique features and capabilities to enhance productivity. This article details my experience using several popular AI coding tools to evaluate their effectiveness in auto-generating code, providing explanations, and creating test cases.
TabNine
TabNine, akin to GitHub Copilot, works as an autocomplete tool within IDEs like VS Code. It can suggest code while you write, saving time and effort. For example, I attempted to create a factorial function. The autocomplete feature handled it decently, returning a correct implementation. However, the function could have been formatted better.
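A typical implementation along these lines (my own reconstruction, not TabNine's exact output) looks like this:

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed recursively."""
    if n == 0:
        return 1
    return n * factorial(n - 1)
```

Any assistant that gets the base case (`0! = 1`) and the recursive step right passes this first test.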
When testing its chat feature, I asked about the time complexity of the factorial function. Although it correctly answered O(n), it mistakenly added that the function is also O(n^2). This experience highlights these tools' potential to state incorrect information with confidence.
I also manipulated the factorial code to make it incorrect and requested TabNine to identify the issue. While it reacted correctly by recognizing the flaw in the base case, it didn't provide a complete explanation.
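To illustrate the kind of flaw I introduced (this is my own example, not the exact edit I made), a single wrong value in the base case silently breaks every result:

```python
def factorial_buggy(n):
    # Bug: the base case returns 0 instead of 1,
    # so every product collapses to 0.
    if n == 0:
        return 0
    return n * factorial_buggy(n - 1)
```

A thorough explanation should not only point at the base case but also note that the error propagates through every recursive call.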
TabNine does offer an option to explain code, where it performed well, generating a coherent discussion about how the factorial function operates. Yet, its comment generation feature was only partially successful, providing vague context.
However, when I requested test cases, it returned accurate results covering a variety of scenarios for the factorial function. Moving on to slightly more complex functions, such as merge sort, TabNine displayed much the same behavior I had seen with simpler code.
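For reference, the merge sort I prompted for is the standard divide-and-merge version (a generic sketch, not any one tool's output):

```python
def merge_sort(arr):
    """Recursively sort a list with merge sort; returns a new sorted list."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Merge the two sorted halves in order.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Because the recursion and the merge step are both well-known patterns, autocomplete tools tend to reproduce this structure reliably.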
Overall, TabNine proves resourceful but isn't the best of the bunch. Thus, I place it in the "decent" tier.
ChatGPT
Next, I evaluated ChatGPT, which has gained recognition for its conversational abilities. While it excels at providing context and explanations, I found it not as efficient when solely focusing on coding tasks. Switching between the code editor and ChatGPT introduced context-switching overhead that could impede flow.
Despite these drawbacks, ChatGPT remains useful for learning but may not be the ideal choice for rapid coding. I ultimately categorized it as "actually useful."
GitHub Copilot
GitHub Copilot, deeply integrated into IDEs like VS Code, supports both code autocompletion and chat features. When I tested it similarly to TabNine, Copilot quickly generated a functioning factorial implementation while also suggesting methods to handle negative numbers—something TabNine missed.
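The negative-number handling it suggested amounts to guarding the input before computing, along these lines (my paraphrase of the suggestion, not Copilot's verbatim output):

```python
def factorial(n):
    """Return n!, rejecting negative inputs explicitly."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```

Raising a clear error for invalid input is exactly the kind of defensive detail that separates a merely correct suggestion from a robust one.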
Copilot generated detailed, comprehensive test cases automatically when prompted, lending itself to quicker testing iterations. Additionally, it demonstrated an ability to maintain context throughout the coding process.
Like ChatGPT, Copilot also offers explanations, explaining the factorial function effectively. Its testing suite even handled cases that relied on external frameworks, showcasing its adaptability.
Overall, GitHub Copilot stood out as a highly effective tool, and I rated it in the "10x developer" category.
Bard
I then examined Bard, which mirrors some functionalities of ChatGPT but adds citations for further reading. However, my tests revealed that Bard's accuracy occasionally lagged behind ChatGPT's. Given its shortcomings but still useful capabilities, I classified Bard as "actually useful," albeit slightly below ChatGPT.
Amazon CodeWhisperer
Amazon CodeWhisperer doesn’t include chat features but focuses primarily on autocompletion. Unfortunately, it fell short in generating comprehensive test cases compared to Copilot and TabNine.
While it managed to create some accurate code snippets, its limitations became glaring once I attempted to generate comments and thorough test cases. I ultimately classified CodeWhisperer as "acceptable."
Sourcegraph Cody
Sourcegraph Cody seeks to integrate deeply with the code repository environment, providing not just autocompletion but also codebase insights. Despite its potential, Cody struggled to accurately find and suggest solutions based on an existing code structure, resulting in a less-than-optimal experience.
The integration of a chat-based AI model is a step forward, but it still requires improvements to effectively scan and understand large codebases. Given its performance, I categorized Cody as "acceptable."
Codium AI
Finally, I explored Codium AI, designed explicitly for testing rather than general-purpose coding assistance. Its test generation and code suggestions were remarkably robust, covering a wide array of scenarios for various edge cases. The documentation it produced was of high quality, and its ability to suggest improvements to the original code was impressive.
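The edge-case coverage looked roughly like the following, which is my illustrative reconstruction rather than Codium AI's actual output:

```python
def factorial(n):
    """Return n!, rejecting negative inputs."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Edge cases of the kind a test-generation tool tends to cover:
assert factorial(0) == 1            # smallest valid input
assert factorial(1) == 1            # boundary just above the base case
assert factorial(10) == 3628800     # a larger, known value
try:
    factorial(-1)                   # invalid input should raise
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for negative input")
```

Generating boundary values, a known larger value, and an invalid-input check is what distinguishes a thorough suite from the single happy-path test many assistants produce.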
As a testing-focused AI, Codium AI excelled in delivering meaningful output without being encumbered by autocompletion features. I placed it in the "10x developer" tier due to its unique and consistent focus on enhancing code quality.
Conclusion
After working through this selection of AI coding assistants, I found real potential to drive coding efficiency while also encountering clear limitations. My personal stack of go-to tools will be GitHub Copilot, ChatGPT, and Codium AI.
Keywords
AI coding tools, GitHub Copilot, TabNine, ChatGPT, Bard, Amazon CodeWhisperer, Sourcegraph Cody, Codium AI, test generation, code explanation, auto-completion.
FAQ
1. What is TabNine?
TabNine is an AI coding assistant that offers code completion and suggestions while you write within your IDE.
2. How does GitHub Copilot compare to ChatGPT?
GitHub Copilot excels in coding tasks with real-time completions and detailed explanations, while ChatGPT is better for conversations and providing context.
3. What unique feature does Codium AI offer?
Codium AI focuses primarily on generating tests and improving code quality rather than being a generic code completion tool.
4. Can Bard be used effectively for coding?
Yes, Bard is useful for coding but tends to be less accurate compared to ChatGPT, offering citations for further verification.
5. What are the tier categories mentioned in the article?
The tools were categorized as 10x developer, actually useful, acceptable, and decent based on their performance and reliability in coding tasks.