Bugs, beware: the Terminator is here for you! GitHub’s new AI-powered Code Scanning Autofix is one of the best things developers will love to have by their side. Let’s take a deeper look at it!
Highlights:
- GitHub’s Code Scanning Autofix uses AI to find and fix code vulnerabilities.
- It is available in public beta for all GitHub Advanced Security customers.
- It covers more than 90% of alert types in JavaScript, TypeScript, Java, and Python.
What is GitHub’s Code Scanning Autofix?
GitHub’s Code Scanning Autofix is an AI-powered tool that can give code suggestions, along with detailed explanations, to fix vulnerabilities in the code and improve security. It will suggest AI-powered autofixes for CodeQL alerts during pull requests.
It has been released in public beta for GitHub Advanced Security customers and is powered by GitHub Copilot (GitHub’s AI developer tool) and CodeQL (GitHub’s code analysis engine), which together automate security checks.
Meet code scanning autofix, the new AI security expertise now built into GitHub Advanced Security! https://t.co/cTDuKZCWMv
— GitHub (@github) March 20, 2024
The tool covers more than 90% of alert types across JavaScript, TypeScript, Java, and Python, and its code suggestions can resolve more than two-thirds of identified vulnerabilities with little or no editing required.
Why Do We Need It?
GitHub’s vision for application security is an environment where found means fixed. By emphasizing the developer experience within GitHub Advanced Security, teams are already achieving a 7x faster remediation rate compared to traditional security tools.
This new Code Scanning Autofix is a major advancement, enabling developers to significantly decrease the time and effort required for remediation. It offers detailed explanations and code suggestions to address vulnerabilities effectively.
Applications remain a primary target for cyber-attacks, yet many organizations acknowledge a growing number of unresolved vulnerabilities in their production repositories. Code Scanning Autofix plays a crucial role in mitigating this by making it simpler for developers to address threats and issues during the coding phase.
This proactive approach will not only help prevent the accumulation of security risks but also foster a culture of security awareness and responsibility among development teams.
Similar to how GitHub Copilot relieves developers of monotonous and repetitive tasks, code scanning autofix will help development teams reclaim time previously dedicated to remediation efforts.
This will lead to a decrease in the number of routine vulnerabilities encountered by security teams and enable them to concentrate on implementing strategies to safeguard the organization amidst a rapid software development lifecycle.
How to Access It?
Those interested in participating in the public beta of GitHub’s Code Scanning Autofix can sign up for the AI-powered AppSec waitlist.
As the code scanning autofix beta is progressively rolled out to a wider audience, efforts are underway to gather feedback, address minor issues, and track metrics to validate the efficacy of the suggestions in addressing security vulnerabilities.
Simultaneously, efforts are underway to broaden autofix support to additional languages, with C# and Go coming soon.
How Does Code Scanning Autofix Work?
Code scanning autofix provides developers with suggested fixes for vulnerabilities discovered in supported languages. These suggestions include a natural language explanation of the fix and are displayed directly on the pull request page, where developers can choose to accept, edit, or dismiss them.
Furthermore, code suggestions provided by autofix may extend beyond the current file, encompassing modifications across multiple files. Autofix can also introduce or adjust dependencies as necessary.
The autofix feature leverages a large language model (LLM) to generate code edits that address the identified issues without altering the code’s functionality. The process involves constructing the LLM prompt, processing the model’s response, evaluating the feature’s quality, and serving it to users.
The YouTube video below explains how code scanning autofix works:
Under the hood, code scanning autofix couples the powerful CodeQL engine with a blend of heuristics and GitHub Copilot APIs. This combination enables the generation of comprehensive code suggestions that address identified issues effectively.
Additionally, it ensures a seamless integration of automated fixes into the development workflow, enhancing productivity and code quality.
Here are the steps involved:
- Autofix uses AI to provide code suggestions and explanations during the pull request.
- The developer remains in control by being able to make edits using GitHub Codespaces or a local machine.
- The developer can accept autofix’s suggestion or dismiss it if it’s not needed.
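Autofix builds on CodeQL code scanning, which is typically enabled through a GitHub Actions workflow. The sketch below shows a minimal setup using the standard `github/codeql-action` actions; the trigger branches and language list are placeholders you would adapt to your repository:

```yaml
name: "CodeQL"

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write  # needed to upload SARIF results to code scanning
    steps:
      - uses: actions/checkout@v4
      # Initialize CodeQL for the languages used in the repository
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript, python
      # Run the analysis and upload results to the code scanning API
      - uses: github/codeql-action/analyze@v3
```

With a workflow like this in place, CodeQL alerts surface on pull requests, which is where autofix attaches its suggested fixes.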
As GitHub says, Autofix transitions code security from being found to being fixed.
Inside The Architecture
When a user opens a pull request or pushes a commit, code scanning proceeds as usual, integrated into a GitHub Actions workflow or third-party CI system. The results, formatted in the Static Analysis Results Interchange Format (SARIF), are uploaded to the code scanning API. The backend service checks whether the language is supported and then invokes the fix generator as a CLI tool.
Augmented with relevant code segments from the repository, the SARIF alert data forms the basis of a prompt sent to the LLM via an authenticated API call to an internally deployed Azure service. The LLM response is filtered to block certain harmful outputs before the fix generator refines it into a concrete suggestion.
The resulting fix suggestion is stored by the code scanning backend for rendering alongside the alert in pull request views, with caching implemented to optimize LLM compute resources.
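Since SARIF is an open JSON standard, the alert data the fix generator consumes is easy to picture. The sketch below pulls the fields a fix generator would plausibly need from a minimal SARIF 2.1.0 payload; the field names follow the SARIF schema, while the example rule and file are invented for illustration:

```python
import json

# A minimal SARIF 2.1.0 payload of the kind code scanning uploads.
sarif = json.loads("""
{
  "runs": [{
    "tool": {"driver": {"name": "CodeQL"}},
    "results": [{
      "ruleId": "js/sql-injection",
      "message": {"text": "This query depends on a user-provided value."},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "src/db.js"},
          "region": {"startLine": 42}
        }
      }]
    }]
  }]
}
""")

def extract_alerts(sarif_doc):
    """Collect the rule, message, and location of each result in a SARIF document."""
    alerts = []
    for run in sarif_doc["runs"]:
        for result in run["results"]:
            loc = result["locations"][0]["physicalLocation"]
            alerts.append({
                "rule": result["ruleId"],
                "message": result["message"]["text"],
                "file": loc["artifactLocation"]["uri"],
                "line": loc["region"]["startLine"],
            })
    return alerts

print(extract_alerts(sarif))
```

Each extracted alert, augmented with the surrounding source code, is what feeds the prompt-construction step described below.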
The Prompt and Output Structure
The technology’s foundation is a request for a Large Language Model (LLM) encapsulated within an LLM prompt. CodeQL static analysis identifies a vulnerability, issuing an alert pinpointing the problematic code location and any pertinent locations. Extracted information from the alert forms the basis of the LLM prompt, which includes:
- General details about the vulnerability type, often drawn from the CodeQL query help page, including an illustrative example of the vulnerability and its remediation.
- The source-code location and contents of the alert message.
- Pertinent code snippets from various locations along the flow path, as well as any referenced code locations mentioned in the alert message.
- A specification outlining the expected format of the LLM’s response.
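Assembling the four ingredients above can be sketched as simple string composition. GitHub has not published its exact prompt wording, so the section headings, function name, and sample values here are all illustrative assumptions:

```python
def build_prompt(vuln_help, alert_message, location, snippets, response_spec):
    """Assemble an LLM prompt from CodeQL alert data (illustrative structure only)."""
    sections = [
        "## Vulnerability description\n" + vuln_help,
        f"## Alert\n{alert_message} (at {location})",
        "## Relevant code\n" + "\n---\n".join(snippets),
        "## Response format\n" + response_spec,
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    vuln_help="SQL injection occurs when untrusted input reaches a query string...",
    alert_message="This query depends on a user-provided value.",
    location="src/db.js:42",
    snippets=["const q = `SELECT * FROM users WHERE id = ${req.params.id}`;"],
    response_spec="Reply in Markdown with: instructions, code edits, new dependencies.",
)
print(prompt)
```

The real prompt is of course richer (it includes code from every location along the data-flow path), but the overall shape — context, alert, code, output contract — is what the list above describes.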
The model is then asked to show how to edit the code to fix the vulnerability. A format is outlined for the model’s output to facilitate automated processing. The model generates Markdown output comprising several sections:
- Comprehensive natural language instructions for addressing the vulnerability.
- A thorough specification outlining the necessary code edits, adhering to the predefined format established in the prompt.
- An enumeration of dependencies that need to be added to the project, particularly relevant if the fix incorporates a third-party sanitization library not currently used in the project.
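Because the model’s output follows a predefined Markdown structure, the backend can parse it mechanically. The sketch below splits such a response into its sections; the heading names and sample text are assumptions, since GitHub has not published the actual format:

```python
import re

# Example model output following the three-section structure described above.
model_output = """\
## Fix instructions
Use a parameterized query instead of string concatenation.

## Code edits
Replace line 42 of src/db.js with a prepared statement.

## Dependencies
None required.
"""

def parse_sections(markdown):
    """Split '## Heading' sections of a Markdown response into a {heading: body} dict."""
    parts = re.split(r"^## (.+)$", markdown, flags=re.MULTILINE)
    # parts[0] is any preamble; the rest alternate between heading and body.
    return {head.strip(): body.strip() for head, body in zip(parts[1::2], parts[2::2])}

sections = parse_sections(model_output)
print(sections["Dependencies"])
```

Structured output like this is what lets the fix generator turn free-form model text into a concrete, renderable suggestion on the pull request.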
Examples
Below are two examples, taken from GitHub’s official documentation for Autofix, demonstrating its ability to propose a solution within the codebase while offering a comprehensive explanation of the fix.
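As a stand-in for those screenshots, here is the kind of before-and-after transformation autofix typically proposes for a SQL injection alert. This is an illustrative sketch written for this article, not actual output from the tool:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def get_user_vulnerable(user_id):
    # The kind of code CodeQL flags: untrusted input interpolated into the query
    return conn.execute(f"SELECT name FROM users WHERE id = {user_id}").fetchone()

def get_user_fixed(user_id):
    # The kind of fix autofix suggests: bind the value as a query parameter
    return conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()

print(get_user_fixed(1))  # ('alice',)
```

The suggested change keeps the function’s behavior identical for legitimate input while closing the injection path, which is exactly the “fix without altering functionality” property described earlier.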
Conclusion
Code Scanning Autofix marks a significant advancement in automating vulnerability remediation, enabling developers to address security threats swiftly and efficiently. With its AI-powered suggestions and seamless integration into the development workflow, it empowers developers to prioritize security without sacrificing productivity!