A year before we shipped the DeepSource integration for VS Code, we were working on something called Manual Issue Fixing. The idea was that besides Autofixing code quality issues, users should be able to manually fix the issues for which an Autofix does not exist, without having to go separately to their code editor or their code hosting dashboard.
It is worth noting here that DeepSource’s Autofix feature was released in late 2020, and the initial version did not use LLMs. Engineers on the team manually defined the possible fixes for a comparatively small set of issues (for instance, our JavaScript analyzer can detect ~700 issues, while Autofix exists for only ~100 of them). At that pace, manually defining fixes would not have brought us to parity between the issues that can be Autofixed and the total issues an Analyzer detects anytime soon.
To achieve parity between these two sets, we prototyped an approach to manual issue fixing, spiced up with LLMs, called Suggest a fix. (We did not want to use the Autofix brand for an experimental feature. Also, Autofix 1.0 applied deterministic code patches, while LLM outputs are anything but deterministic.)
As we were crafting prompts and playing with the results and prototypes, the start date for work on the VS Code extension was drawing closer, and alongside, code-generating LLMs were getting even better (imagine going from Codex to GPT-3 to CodeLlama to Claude to GPT-4 to CodeLlama 2). After working on manual issue fixing for four weeks, we decided to shelve it in favour of taking LLM-powered code fixes directly to where our users worked: the code editor.
Some parts of the manual issue fixing feature were carried over into building Bulk Autofix.
But we took the learnings we had from this project, and applied them first-hand to the DeepSource extension for VS Code. Learn more about the extension here 👇
<aside> 🎊 DeepSource for VS Code ↗
</aside>
Workflow to run Autofix inside VS Code 👇
<aside> 👉 You can read an in-depth account of how you can take the extension for a spin here ↗
</aside>
Research
Before we started working on the extension, we explored a bajillion linters and code formatters, and kept talking to people on our discussion forums and to our customers to figure out the best way forward. Our customers had always asked us for ways to fix code issues right in the DeepSource dashboard, without having to constantly switch between their code hosting platform, their local editor, and the DeepSource dashboard.
While talking to our users and bigger-ticket customers, we also felt that adding the ability to fix issues in the DeepSource dashboard would go against one of our fundamental principles: that we prevent issues from getting into your main code base. All of the learnings from these conversations boiled down to one simple fact: letting users fix issues right where they write code is a far better proposition than making them push code to their code hosting platform and only then analyzing it for issues.
We tested our very early prototypes with the rest of the team (we were really into dogfooding to get a sense of how the product would feel for the end users) and a handful of beta testers, and constantly improved and tweaked things based on the feedback we received. There were a few platform challenges around analysis speed that we had to accept and work with.
We also had to constantly test the AI-enabled Autofix features with our customers to iron out flaws and fine-tune the models to produce clean code. Our customers loved the approach of generating code AND running static analysis in parallel to fix possible issues with the generated code. We made the final release after extensive testing, using our customers' feedback to tweak and enhance the layout so that it gives actionable insights on their codebase.
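To make the "generate, then statically analyze" idea concrete, here is a minimal sketch. It is not how DeepSource's Autofix is implemented: the candidate strings stand in for LLM outputs, `find_issues` is a hypothetical toy check built on Python's standard `ast` module rather than a real analyzer, and the candidates are checked one after another instead of in parallel, purely to keep the example short.

```python
import ast
from typing import Iterable, List, Optional


def find_issues(source: str) -> List[str]:
    """Toy static check: report syntax errors and bare `except:` clauses."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    issues = []
    for node in ast.walk(tree):
        # A handler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            issues.append(f"bare except at line {node.lineno}")
    return issues


def pick_clean_fix(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate fix that passes the static check, if any."""
    for candidate in candidates:
        if not find_issues(candidate):
            return candidate
    return None


# Hypothetical candidate fixes, standing in for generated code.
candidates = [
    "try:\n    risky()\nexcept:\n    pass\n",             # still has a bare except
    "try:\n    risky(\nexcept ValueError:\n    pass\n",   # does not parse
    "try:\n    risky()\nexcept ValueError:\n    pass\n",  # clean
]
chosen = pick_clean_fix(candidates)
```

The point of the design is that generated code is treated as untrusted input: a suggestion is only surfaced once the same kind of analysis that flagged the original issue finds nothing wrong with the replacement.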
<aside> 🎊
If you’re reading this post after July 2024, the internal architecture has been shaken up significantly and the extension is blazing fast for Python and JavaScript codebases, because we started shipping these two most popular analyzers directly with the extension binary.
</aside>