The GitHub Blog

Stay inspired with updates, ideas, and insights from GitHub to aid developers in software design and development.

February 21, 2025  17:00:49

Debugging is an essential part of a developer’s workflow—but it’s also one of the most time-consuming. What if AI could streamline the process, helping you analyze, fix, and document code faster? Enter GitHub Copilot, your AI-powered coding assistant.

GitHub Copilot isn’t just for writing code—it’s also a powerful tool for debugging. Whether you’re troubleshooting in your IDE, using Copilot Chat’s slash commands like /fix, or reviewing pull requests (PRs) on github.com, GitHub Copilot offers flexible, intelligent solutions to speed up your debugging process. And with the free version of GitHub Copilot, available to all personal GitHub accounts, you can start exploring these features today.

In this guide, we’ll explore how to debug code with GitHub Copilot, where to use it in your workflow, and best practices to get the most out of its capabilities. Whether you’re new to GitHub Copilot or looking to deepen your skills, this guide has something for you.

Debugging code with GitHub Copilot: surfaces and workflows

Debugging code with GitHub Copilot can help you tackle issues faster while enhancing your understanding of the codebase. Whether you’re fixing syntax errors, refactoring inefficient code, or troubleshooting unexpected behavior, GitHub Copilot can provide valuable insights in your debugging journey.

So, how exactly does this work? “GitHub Copilot is recognizing patterns and suggesting solutions based on what it has learned,” says Christopher Harrison, Senior Developer Advocate. “Once you’ve identified the problem area, you can turn to GitHub Copilot and ask, ‘I’m giving this input but getting this output—what’s wrong?’ That’s where GitHub Copilot really shines.”

Let’s explore how GitHub Copilot can help you debug your code across different surfaces, from your IDE to github.com and even pull requests.

1. In Copilot Chat

Copilot Chat acts as an interactive AI assistant, helping you debug issues with natural language queries. And with Copilot Free, you get 50 chat messages per month. With Copilot Chat, you can:

  • Get real-time explanations: Ask “Why is this function throwing an error?” and Copilot Chat will analyze the code and provide insights.
  • Use slash commands for debugging: Try /fix to generate a potential solution or /explain for a step-by-step breakdown of a complex function. (More on this later!)
  • Refactor code for efficiency: If your implementation is messy or inefficient, Copilot Chat can suggest cleaner alternatives. Christopher explains, “Refactoring improves the readability of code, making it easier for both developers and GitHub Copilot to understand. And if code is easier to understand, it’s easier to debug and spot problems.”
  • Walk through errors interactively: Describe your issue in chat and get tailored guidance without ever having to leave your IDE.

2. In your IDE

When working in popular IDEs like VS Code or JetBrains, GitHub Copilot offers real-time suggestions as you type. It helps by:

  • Flagging issues: For example, if you declare a variable but forget to initialize it, GitHub Copilot can suggest a correction.
  • Code fixes: Encounter a syntax error? GitHub Copilot can suggest a fix in seconds, ensuring your code stays error-free.
  • Contextual assistance: By analyzing your workspace, GitHub Copilot provides solutions tailored to your codebase and project structure.

3. On github.com

GitHub Copilot extends beyond your IDE, offering debugging assistance directly on github.com via Copilot Chat, particularly in repositories and discussions. With this feature, you can:

  • Troubleshoot code in repositories: Open a file, highlight a problematic section, and use Copilot Chat to analyze it.
  • Generate test cases: If you’re unsure how to verify a function, GitHub Copilot can suggest test cases based on existing code.
  • Understand unfamiliar code: Reviewing an open-source project or a teammate’s PR? Ask GitHub Copilot to summarize a function or explain its logic.

4. For pull request assistance

GitHub Copilot can also streamline debugging within PRs, ensuring code quality before merging.

  • Suggest improvements in PR comments: GitHub Copilot can review PRs and propose fixes directly in the conversation.
  • Generate PR summaries: Struggling to describe your changes? Greg Larkin, Senior Service Delivery Engineer, says, “I use GitHub Copilot in the PR creation process to generate a summary of the changes in my feature branch compared to the branch I’m merging into. That can be really helpful when I’m struggling to figure out a good description, so that other people understand what I did.”
  • Explain diffs: Not sure why a change was made? Ask GitHub Copilot to summarize what’s different between commits.
  • Catch edge cases before merging: Use /analyze to identify potential issues and /tests to generate missing test cases.
  • Refactor on the fly: If a PR contains redundant or inefficient code, GitHub Copilot can suggest optimized alternatives.

By integrating Copilot into your PR workflow, you can speed up code reviews while maintaining high-quality standards. Just be sure to pair it with peer expertise for the best results.

5 slash commands in GitHub Copilot for debugging code

Slash commands turn GitHub Copilot into an on-demand debugging assistant, helping you solve issues faster, get more insights, and improve your code quality. Here are some of the most useful slash commands for debugging:

1. Use /help to get guidance on using GitHub Copilot effectively

The /help slash command provides guidance on how to interact with GitHub Copilot effectively, offering tips on structuring prompts, using slash commands, and maximizing GitHub Copilot’s capabilities.

  • How it works: Type /help in Copilot Chat to receive suggestions on your current task, whether it’s debugging, explaining code, or generating test cases.
  • Example: Need a refresher on what GitHub Copilot can do? Use /help to access a quick guide to slash commands like /fix and /explain.

2. Use /fix to suggest and apply fixes

The /fix command is a go-to tool for resolving code issues by allowing you to highlight a block of problematic code or describe an error.

  • How it works: Select the code causing issues, type /fix, and let Copilot Chat generate suggestions.
  • Example: If you have a broken API call, use /fix to get a corrected version with appropriate headers or parameters.

3. Use /explain to understand code and errors

The /explain command breaks down complex code or cryptic error messages into simpler, more digestible terms.

  • How it works: Highlight the code or error message you want clarified, type /explain, and Copilot Chat will provide an explanation. It will explain the function’s purpose, how it processes the data, potential edge cases, and any possible bugs or issues.
  • Example: Encounter a “NullPointerException”? Use /explain to understand why it occurred and how to prevent it.

4. Use /tests to generate tests

Testing is key to identifying bugs, and the /tests command helps by generating test cases based on your code.

  • How it works: Use /tests on a function or snippet, and Copilot Chat will generate relevant test cases.
  • Example: Apply /tests to a sorting function, and Copilot Chat might generate unit tests for edge cases like empty arrays or null inputs (see the sketch below).
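
To make that concrete, here’s a rough sketch in Ruby of the kind of edge-case tests Copilot might propose for a simple sorting function. The function, file names, and test cases are illustrative, not actual Copilot output; what you get depends on your code and test framework.

# sort_numbers.rb: a simple function we might ask /tests to cover
def sort_numbers(numbers)
  raise ArgumentError, "numbers must not be nil" if numbers.nil?
  numbers.sort
end

# test_sort_numbers.rb: edge-case tests Copilot might suggest (illustrative)
require "minitest/autorun"
require_relative "sort_numbers"

class TestSortNumbers < Minitest::Test
  def test_empty_array_returns_empty_array
    assert_equal [], sort_numbers([])
  end

  def test_nil_input_raises_argument_error
    assert_raises(ArgumentError) { sort_numbers(nil) }
  end

  def test_unsorted_input_is_sorted
    assert_equal [1, 2, 3], sort_numbers([3, 1, 2])
  end

  def test_duplicates_are_preserved
    assert_equal [1, 1, 2], sort_numbers([2, 1, 1])
  end
end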

5. Use /doc to generate or improve documentation

There are long-term benefits to having good text documentation—for developers and GitHub Copilot, which can draw context from it—because it makes your codebase that much more searchable. By using the /doc command with Copilot Free, you can even ask GitHub Copilot to write a summary of specific code blocks within your IDE.

The /doc command helps you create or refine documentation for your code, which is critical when debugging or collaborating with others. Clear documentation provides context for troubleshooting, speeds up issue resolution, and helps fellow developers understand your code faster.

  • How it works: Highlight a function, class, or file, then type /doc in Copilot Chat (or select it from the right-click context menu), and Copilot Chat will generate comprehensive comments or documentation.
  • Example: Apply /doc to a function, and Copilot Chat will generate inline comments detailing its purpose, parameters, and expected output.

By mastering these commands, you can streamline your debugging workflow and resolve issues faster without switching between tools or wasting time on manual tasks.

Best practices for debugging code with GitHub Copilot

Provide clear context for better results

Providing the right context helps GitHub Copilot generate even more relevant debugging suggestions. As Christopher explains, “The better that Copilot is able to understand what you’re trying to do and how you’re trying to do it, the better the responses are that it’s able to give to you.”

Since GitHub Copilot analyzes your code within the surrounding scope, ensure your files are well structured and that relevant dependencies are included. If you’re using Copilot Chat, reference specific functions, error messages, or logs to get precise answers instead of generic suggestions.

💡 Pro tip: Working across multiple files? Use the @workspace command to point GitHub Copilot in the right direction and give it more context for your prompt and intended goal.

Ask, refine, and optimize in real time

Instead of treating GitHub Copilot as a one-and-done solution, refine its suggestions by engaging in a back-and-forth process. Greg says, “I find it useful to ask GitHub Copilot for three or four different options on how to fix a problem or to analyze for performance. The more detail you provide about what you’re after—whether it’s speed, memory efficiency, or another constraint—the better the result.”

This iterative approach can help you explore alternative solutions you might not have considered, leading to more robust and efficient code.

Master the art of specific prompts

The more specific your prompt, the better GitHub Copilot’s response. Instead of asking “What’s wrong with this function?” try “Why is this function returning undefined when the input is valid?” GitHub Copilot performs best when given clear, detailed queries—this applies whether you’re requesting a fix, asking for an explanation, or looking for test cases to verify your changes.

By crafting precise prompts and testing edge cases, you can use GitHub Copilot to surface potential issues before they become production problems.

Try a structured approach with progressive debugging

Next, try a step-by-step approach to your debugging process! Instead of immediately applying fixes, use GitHub Copilot’s commands to first understand the issue, analyze potential causes, and then implement a solution. This structured workflow—known as progressive debugging—helps you gain deeper insights into your code while ensuring that fixes align with the root cause of the problem.

For example:

  1. Start with the slash command /explain on a problematic function to understand the issue.
  2. Use the slash command /startDebugging to help with configuring interactive debugging.
  3. Finally, apply the slash command /fix to generate possible corrections.

📌 Use case: If a function in your React app isn’t rendering as expected, start by running /explain on the relevant JSX or state logic, then use /startDebugging to set up a debugging session and track down mismanaged props, and finally, apply /fix for a corrected implementation.

Combine commands for a smarter workflow

Some issues require multiple levels of debugging and refinement. By combining commands, you can move from diagnosis to resolution even faster.

For example:

  • Use /explain + /fix to understand and resolve issues quickly.
  • Apply /fixTestFailure + /tests to find failing tests and generate new ones.

📌 Use case:

  • Fixing a broken function: Run the slash command /explain to understand why it fails, then use the slash command /fix to generate a corrected version.
  • Improving test coverage: Use the slash command /fixTestFailure to identify and fix failing tests, then use the slash command /tests to generate additional unit tests for the highlighted code.

Remember, slash commands are most effective when they’re used in the appropriate context, combined with clear descriptions of the problem, made part of a systematic debugging approach, and followed up with verification and testing.

Better together: AI tools with a developer in the pilot’s chair

GitHub Copilot is a powerful tool that enhances your workflow, but it doesn’t replace the need for human insight, critical thinking, and collaboration. As Greg points out, “GitHub Copilot can essentially act as another reviewer, analyzing changes and providing comments. Even so, it doesn’t replace human oversight. Having multiple perspectives on your code is crucial, as different reviewers will spot issues that others might miss.”

By combining GitHub Copilot’s suggestions with human expertise and rigorous testing, you can debug more efficiently while maintaining high-quality, reliable code.

Ready to try the free version of GitHub Copilot?
Start using GitHub Copilot today >

You can keep the learning going with these resources:
  • Debug your app with GitHub Copilot in Visual Studio
  • Example prompts for GitHub Copilot Chat
  • GitHub Copilot and VS Code tutorials

The post How to debug code with GitHub Copilot appeared first on The GitHub Blog.

February 20, 2025  17:00:22

At GitHub, we’re committed to keeping our community informed about how we govern our platform. That means being transparent about content moderation and involving users in the development of our site policies. Today we’re announcing that our Transparency Center and repo have been updated with data for all of 2024.

Our developer-first approach to content moderation is adapted to the unique environment of code collaboration and has evolved to meet specific needs through a suite of content moderation tools. We’ve discussed the nuances and challenges of moderating a code collaboration platform in the Journal of Online Trust and Safety in an effort to be transparent about our own practices and contribute to the research base of platform studies.

We want to bring this discussion directly to the developers that make GitHub what it is. Recently, we attended FOSDEM, Europe’s largest free and open source developer conference. We connected with developers and presented a deep dive into how our approach to platform moderation has been informed by the values of the FOSS community. You can watch the video of the talk here. We’ll also be presenting this talk on March 8 at SCaLE 22x, the 22nd Annual Southern California Linux Expo in Pasadena, CA. We don’t want to just share our own work—we want to hear directly from developers and maintainers about the challenges you’re facing.

Developers are an important stakeholder in how we moderate our platform, and we want to hear from you. Check out our site-policy repo to contribute constructive ideas, questions, and feedback to make our policies better. We also welcome participation in our developer-policy repo where you can share public policy opportunities and challenges to advance developers’ rights to innovation, collaboration, and equal opportunity.

The post Engaging with the developer community on our approach to content moderation appeared first on The GitHub Blog.

February 14, 2025  17:43:46

While everyone likes flowers and chocolates, why not show your love for your favorite open source projects this Valentine’s and give appreciation to the maintainers who keep them running?

Many of the open source projects we rely on every day are maintained by dedicated volunteers. Sponsoring projects isn’t just about altruism; it’s about investing in the future. Many maintainers work on open source projects in their spare time. Sponsorships can help them dedicate more time to the projects you depend on. With financial support, maintainers can help cover costs such as development and hosting.

When you sponsor a maintainer, you’re helping them continue their work and letting them know you value the creativity and love they’ve poured into their project. For this Valentine’s Day, let’s show love to the maintainers who keep our favorite projects alive and thriving.

The benefits of financial sponsorship

Sponsorship is a tangible way to show support for the open source community. It can be a huge morale boost, as well as bring visibility and validation. Sponsored projects often gain more attention, leading to a virtuous cycle of more contributors and users, which results in better software for everyone.

Getting started: How to best invest in open source

  1. Identify critical dependencies: Review your project’s dependencies to identify which open source libraries and tools are crucial to your operations. If you use tools like npm, pip, or maven, you can review their dependencies and prioritize which are critical to your project’s success.
  2. Evaluate the project activity: Check the recent and trending activity of repositories and look for signs of contributions, shipped features, bug fixes, active maintenance, and community engagement. Helpful metrics like recent commits, issue resolution, and community engagement are all good markers for where funding could help. Projects with a high impact but low funding could be great candidates to prioritize for sponsorship.
  3. Engage with the project: Take the opportunity to look at the issues and contribution guidelines. Try to understand their funding needs and how your support makes a difference. But also consider investing in ways beyond just financial support. You could provide bug fixes if you feel confident in your code. You could also take the time to tell someone about the project and why you think it’s awesome.

Return the love

Maintainers can show love for their sponsors as well. Sponsors appreciate knowing their contribution is making a difference. Whether it’s a simple mention on social media to acknowledge their contribution, featuring them on your GitHub Sponsors profile, exclusive updates and behind-the-scenes insights about your project and community, or even a brief personalized thank-you message, tokens of appreciation can help bring more funding your way.

Shower our favorite open source projects with a little extra love

This Valentine’s Day, let’s show our favorite projects a bit more appreciation. Whether it’s a heartfelt contribution to fix bugs or docs, spreading the word about why it’s a great project and helping people use it, or a sweet one-time donation, every bit helps keep our beloved digital world turning.

Ready to start sponsoring? Visit GitHub Sponsors and find the projects that matter most to you.

Happy Valentine’s Day to all the open source contributors and supporters out there! And remember, you don’t need to wait for a special occasion like Valentine’s Day to show your appreciation for maintainers—every day is a perfect day to support open source! 💖

The post Support the open source projects you love this Valentine’s Day appeared first on The GitHub Blog.

February 12, 2025  20:49:36

In January, we experienced three incidents that resulted in degraded performance across GitHub services.

January 09 1:26 UTC (lasting 31 minutes)

On January 9, 2025, between 01:26 UTC and 01:56 UTC, GitHub experienced widespread disruption to many services, with users receiving 500 responses when trying to access various functionality. This was due to a deployment which introduced a query that saturated a primary database server. The error rate for update requests averaged 6% and peaked at 6.85%.

We were able to mitigate the incident by identifying the source of the problematic query and rolling back the deployment. Our internal tooling and dashboards surfaced the relevant data that helped us quickly identify the problematic query. It took a total of 14 minutes from the time we engaged to finding the errant query.

We are also investing in tooling to detect problematic queries prior to deployment, both to prevent issues like this one and to reduce our time to detection and mitigation in the future.

January 13 23:35 UTC (lasting 49 minutes)

On January 13, 2025, between 23:35 UTC and 00:24 UTC, all Git operations were unavailable due to a configuration change related to traffic routing and testing that caused our internal load balancer to drop requests between services that Git relies upon.

We mitigated the incident by rolling back the configuration change.

We are improving our monitoring and deployment practices to improve our time to detection and automated mitigation for issues like this in the future.

January 30 14:22 UTC (lasting 26 minutes)

On January 30, 2025, between 14:22 UTC and 14:48 UTC, web requests to github.com experienced failures (at peak the error rate was 44%), with the average successful request taking over three seconds to complete.

This outage was caused by a hardware failure in the caching layer that supports rate limiting. In addition, the impact was prolonged due to a lack of automated failover for the caching layer. A manual failover of the primary to trusted hardware was performed following recovery to ensure that the issue would not reoccur under similar circumstances.

As a result of this incident, we will be moving to a high availability cache configuration and adding resilience to cache failures at this layer to ensure requests are able to be handled should similar circumstances happen in the future.


Please follow our status page for real-time updates on status changes and post-incident recaps. To learn more about what we’re working on, check out the GitHub Engineering Blog.

The post GitHub Availability Report: January 2025 appeared first on The GitHub Blog.

February 12, 2025  17:00:04

GitHub’s Product Security Engineering team writes code and implements tools that help secure the code that powers GitHub. We use GitHub Advanced Security (GHAS) to discover, track, and remediate vulnerabilities and enforce secure coding standards at scale. One tool we rely heavily on to analyze our code at scale is CodeQL.

CodeQL is GitHub’s static analysis engine that powers automated security analyses. You can use it to query code in much the same way you would query a database. It provides a much more robust way to analyze code and uncover problems than an old-fashioned text search through a codebase.

The following post will detail how we use CodeQL to keep GitHub secure and how you can apply these lessons to your own organization. You will learn why and how we use:

  • Custom query packs (and how we create and manage them).
  • Custom queries.
  • Variant analysis to uncover potentially insecure programming practices.

Enabling CodeQL at scale

We employ CodeQL in a variety of ways at GitHub.

  1. Default setup with the default and security-extended query suites
    Default setup with the default and security-extended query suites meets the needs of the vast majority of our over 10,000 repositories. With these settings, pull requests automatically get a security review from CodeQL.
  2. Advanced setup with a custom query pack
    A few repositories, like our large Ruby monolith, need extra special attention, so we use advanced setup with a query pack containing custom queries to really tailor to our needs.
  3. Multi-repository variant analysis (MRVA)
    To conduct variant analysis and quick auditing, we use MRVA. We also write custom CodeQL queries to detect code patterns that are either specific to GitHub’s codebases or patterns we want a security engineer to manually review.

The specific custom Actions workflow step we use on our monolith is pretty simple. It looks like this:

- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: ${{ matrix.language }}
    config-file: ./.github/codeql/${{ matrix.language }}/codeql-config.yml

Our Ruby configuration is pretty standard, but advanced setup offers a variety of configuration options using custom configuration files. The interesting part is the packs option, which is how we enable our custom query pack as part of the CodeQL analysis. This pack contains a collection of CodeQL queries we have written for Ruby, specifically for the GitHub codebase.

So, let’s dive deeper into why we did that—and how!

Publishing our CodeQL query pack

Initially, we published CodeQL query files directly to the GitHub monolith repository, but we moved away from this approach for several reasons:

  • It required going through the production deployment process for each new or updated query.
  • Queries not included in a query pack were not pre-compiled, which slowed down CodeQL analysis in CI.
  • Our test suite for CodeQL queries ran as part of the monolith’s CI jobs. When a new version of the CodeQL CLI was released, it sometimes caused the query tests to fail because of changes in the query output, even when there were no changes to the code in the pull request. This often led to confusion and frustration among engineers, as the failure wasn’t related to their pull request changes.

By switching to publishing a query pack to GitHub Container Registry (GCR), we’ve simplified our process and eliminated many of these pain points, making it easier to ship and maintain our CodeQL queries. So while it’s possible to deploy custom CodeQL query files directly to a repository, we recommend publishing CodeQL queries as a query pack to the GCR for easier deployment and faster iteration.

Creating our query pack

When setting up our custom query pack, we faced several considerations, particularly around managing dependencies like the ruby-all package.

To ensure our custom queries remain maintainable and concise, we extend classes from the default query suite, such as the ruby-all library. This allows us to leverage existing functionality rather than reinventing the wheel, keeping our queries concise and maintainable. However, changes to the CodeQL library API can introduce breaking changes, potentially deprecating our queries or causing errors. Since CodeQL runs as part of our CI, we wanted to minimize the chance of this happening, as this can lead to frustration and loss of trust from developers.

We develop our queries against the latest version of the ruby-all package, ensuring we’re always working with the most up-to-date functionality. To mitigate the risk of breaking changes affecting CI, we pin the ruby-all version when we’re ready to release, locking it in the codeql-pack.lock.yml file. This guarantees that when our queries are deployed, they will run with the specific version of ruby-all we’ve tested, avoiding potential issues from unintentional updates.

Here’s how we manage this setup:

  • In our qlpack.yml, we set the dependency to use the latest version of ruby-all.
  • During development, this configuration pulls in the latest version of ruby-all when running codeql pack init, ensuring we’re always up to date.
    # Our custom query pack's qlpack.yml
    
    library: false
    name: github/internal-ruby-codeql
    version: 0.2.3
    extractor: 'ruby'
    dependencies:
      codeql/ruby-all: "*"
    tests: 'test'
    description: "Ruby CodeQL queries used internally at GitHub"
    
  • Before releasing, we lock the version in the codeql-pack.lock.yml file, specifying the exact version to ensure stability and prevent issues in CI.
    # Our custom query pack's codeql-pack.lock.yml
    
    lockVersion: 1.0.0
    dependencies:
     ...
     codeql/ruby-all:
       version: 1.0.6
    

This approach allows us to balance developing against the latest features of the ruby-all package while ensuring stability when we release.

We also have a set of CodeQL unit tests that exercise our queries against sample code snippets, which helps us quickly determine if any query will cause errors before we publish our pack. These tests are run as part of the CI process in our query pack repository, providing an early check for issues. We strongly recommend writing unit tests for your custom CodeQL queries to ensure stability and reliability.

Altogether, the basic flow for releasing new CodeQL queries via our pack is as follows:

  • Open a pull request with the new query.
  • Write unit tests for the new query.
  • Merge the pull request.
  • Increment the pack version in a new pull request.
  • Run codeql pack init to resolve dependencies.
  • Correct unit tests as needed.
  • Publish the query pack to the GitHub Container Registry (GCR).
  • Repositories with the query pack in their config will start using the updated queries.

We have found this flow balances our team’s development experience while ensuring stability in our published query pack.

Configuring our repository to use our custom query pack

We won’t provide a general recommendation on configuration here, given that it ultimately depends on how your organization deploys code. We opted against locking our pack to a particular version in our CodeQL configuration file (see above). Instead, we chose to manage our versioning by publishing the CodeQL package in GCR. This results in the GitHub monolith retrieving the latest published version of the query pack. To roll back changes, we simply have to republish the package. In one instance, we released a query that had a high number of false positives and we were able to publish a new version of the pack that removed that query in less than 15 minutes. This is faster than the time it would have taken us to merge a pull request on the monolith repository to roll back the version in the CodeQL configuration file.

One of the problems we encountered with publishing the query pack in GCR was how to easily make the package available to multiple repositories within our enterprise. There are several approaches we explored.

  • Grant access permissions for individual repositories. On the package management page, you can grant permissions for individual repositories to access your package. This was not a good solution for us since we have too many repositories for it to be feasible to do manually, yet there is currently no way to configure this programmatically via an API.
  • Mint a personal access token for the CodeQL action runner. We could have minted a personal access token (PAT) that has access to read all packages for our organization and added that to the CodeQL action runner. However, this would have required managing a new token, and it seemed a bit more permissive than we wanted because it could read all of our private packages rather than ones we explicitly allow it to have access to.
  • Provide access permissions via a linked repository. We ended up implementing the third solution that we explored. We link a repository to the package and allow the package to inherit access permissions from the linked repository.

CodeQL query pack queries

We write a variety of custom queries to be used in our custom query packs. These cover GitHub-specific patterns that aren’t included in the default CodeQL query pack. This allows us to tailor the analysis to patterns and preferences that are specific to our company and codebase. Some of the types of things we alert on using our custom query pack include:

  • High-risk APIs specific to GitHub’s code that can be dangerous if they receive unsanitized user input.
  • Use of specific built-in Rails methods for which we have safer, custom methods or functions.
  • Required authorization methods not being used in our REST API endpoint definitions and GraphQL object/mutation definitions.
  • REST API endpoints and GraphQL mutations that require engineers to define access control methods to determine which actors can access them. (Specifically, the query detects the absence of this method definition to ensure that the actors’ permissions are being checked for these endpoints.)
  • Use of signed tokens so we can nudge engineers to include Product Security as a reviewer when using them.

Custom queries can be used more for educational purposes than as blockers to shipping code. For example, we want to alert engineers when they use the ActiveRecord::decrypt method. This method should generally not be used in production code, as it will cause an encrypted column to become decrypted. We use the recommendation severity in the query metadata so these alerts are treated as informational. That means the query may trigger an alert in a pull request, but it won’t cause the CodeQL CI job to fail. We use this lower severity level to allow engineers to assess the impact of new queries without immediate blocking. Additionally, this alert level isn’t tracked through our Fundamentals program, meaning it doesn’t require immediate action, reflecting the query’s maturity as we continue to refine its relevance and risk assessment.

/**
 * @id rb/github/use-of-activerecord-decrypt
 * @description Do not use the .decrypt method on AR models, this will decrypt all encrypted attributes and save
 * them unencrypted, effectively undoing encryption and possibly making the attributes inaccessible.
 * If you need to access the unencrypted value of any attribute, you can do so by calling my_model.attribute_name.
 * @kind problem
 * @problem.severity recommendation
 * @name Use of ActiveRecord decrypt method
 * @tags security
 *      github-internal
 */

import ruby
import DataFlow
import codeql.ruby.DataFlow
import codeql.ruby.frameworks.ActiveRecord

/** Match against .decrypt method calls where the receiver may be an ActiveRecord object */
class ActiveRecordDecryptMethodCall extends ActiveRecordInstanceMethodCall {
  ActiveRecordDecryptMethodCall() { this.getMethodName() = "decrypt" }
}

from ActiveRecordDecryptMethodCall call
select call,
  "Do not use the .decrypt method on AR models, this will decrypt all encrypted attributes and save them unencrypted.

Another educational query is the one mentioned above in which we detect the absence of the `control_access` method in a class that defines a REST API endpoint. If a pull request introduces a new endpoint without `control_access`, a comment will appear on the pull request saying that the `control_access` method wasn’t found and it’s a requirement for REST API endpoints. This will notify the reviewer of a potential issue and prompt the developer to fix it.

/**
 * @id rb/github/api-control-access
 * @name Rest API Without 'control_access'
 * @description All REST API endpoints must call the 'control_access' method, to ensure that only specified actor types are able to access the given endpoint.
 * @kind problem
 * @tags security
 * github-internal
 * @precision high
 * @problem.severity recommendation
 */

import codeql.ruby.AST
import codeql.ruby.DataFlow
import codeql.ruby.TaintTracking
import codeql.ruby.ApiGraphs

// Api::App REST API endpoints should generally call the control_access method
private DataFlow::ModuleNode appModule() {
  result = API::getTopLevelMember("Api").getMember("App").getADescendentModule() and
  not result = protectedApiModule() and
  not result = staffAppApiModule()
}

// Api::Admin, Api::Staff, Api::Internal, and Api::ThirdParty REST API endpoints do not need to call the control_access method
private DataFlow::ModuleNode protectedApiModule() {
  result =
    API::getTopLevelMember(["Api"])
        .getMember(["Admin", "Staff", "Internal", "ThirdParty"])
        .getADescendentModule()
}

// Api::Staff::App REST API endpoints do not need to call the control_access method
private DataFlow::ModuleNode staffAppApiModule() {
  result =
    API::getTopLevelMember(["Api"]).getMember("Staff").getMember("App").getADescendentModule()
}

private class ApiRouteWithoutControlAccess extends DataFlow::CallNode {
  ApiRouteWithoutControlAccess() {
    this = appModule().getAModuleLevelCall(["get", "post", "delete", "patch", "put"]) and
    not performsAccessControl(this.getBlock())
  }
}

predicate performsAccessControl(DataFlow::BlockNode blocknode) {
  accessControlCalled(blocknode.asExpr().getExpr())
}

predicate accessControlCalled(Block block) {
  // the method `control_access` is called somewhere inside `block`
  block.getAStmt().getAChild*().(MethodCall).getMethodName() = "control_access"
}

from ApiRouteWithoutControlAccess api
select api.getLocation(),
  "The control_access method was not detected in this REST API endpoint. All REST API endpoints must call this method to ensure that the endpoint is only accessible to the specified actor types."

Variant analysis

Variant analysis (VA) refers to the process of searching for variants of security vulnerabilities. This is particularly useful when we’re responding to a bug bounty submission or a security incident. We use a combination of tools to do this, including GitHub’s code search functionality, custom scripts, and CodeQL. We will often start by using code search to find patterns similar to the one that caused a particular vulnerability across numerous repositories. This is sometimes not good enough, as code search is not semantically aware, meaning that it cannot determine whether a given variable is an Active Record object or whether it is being used in an `if` expression. To answer those types of questions we turn to CodeQL.

When we write CodeQL queries for variant analysis we are much less concerned about false positives, since the goal is to provide results for security engineers to analyze. The quality of the code is also not quite as important, as these queries will only be used for the duration of the VA effort. Some of the types of things we use CodeQL for during VAs are:

  • Where are we using SHA1 hashes?
  • One of our internal API endpoints was vulnerable to SQLi according to a recent bug bounty report. Where are we passing user input to that API endpoint?
  • There is a problem with how some HTTP request libraries in Ruby handle the proxy setting. Can we look at places we are instantiating our HTTP request libraries with a proxy setting?

One recent example involved a subtle vulnerability in Rails. We wanted to detect when the following condition was present in our code:

  • A parameter was used to look up an Active Record object.
  • That parameter is later reused after the Active Record object is looked up.

The concern with this condition is that it could lead to an insecure direct object reference (IDOR) vulnerability because Active Record finder methods can accept an array. If the code looks up an Active Record object in one call to determine if a given entity has access to a resource, but later uses a different element from that array to find an object reference, that can lead to an IDOR vulnerability. It would be difficult to write a query to detect all vulnerable instances of this pattern, but we were able to write a query that flagged potential instances, giving us a list of code paths to analyze manually. We ran the query against a large number of our Ruby codebases using CodeQL’s MRVA.
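
To make the pattern more concrete, here is a simplified, hypothetical Rails-style sketch of the kind of code path the query surfaces for review. The controller, model, and parameter names are made up.

class RepositoriesController < ApplicationController
  def destroy
    # params[:id] can arrive as an array, e.g. ?id[]=123&id[]=456.
    # find_by(id: [...]) returns the first match, so owning just one of the
    # ids is enough for this access check to pass.
    authorized = current_user.repositories.find_by(id: params[:id])
    unless authorized
      head :forbidden
      return
    end

    # The same parameter is reused after the lookup: this call now operates on
    # every id in the array, including ones the user doesn't own, which is a
    # potential insecure direct object reference (IDOR).
    Repository.where(id: params[:id]).destroy_all
    head :no_content
  end
end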

The query, which is a bit hacky and not quite production grade, is below:

/**
 * @name wip array query
 * @description an array is passed to an AR finder object
 */

import ruby
import codeql.ruby.AST
import codeql.ruby.ApiGraphs
import codeql.ruby.frameworks.Rails
import codeql.ruby.frameworks.ActiveRecord
import codeql.ruby.frameworks.ActionController
import codeql.ruby.DataFlow
import codeql.ruby.Frameworks
import codeql.ruby.TaintTracking

// Gets the "final" receiver in a chain of method calls.
// For example, in `Foo.bar`, this would give the `Foo` access, and in
// `foo.bar.baz("arg")` it would give the `foo` variable access
private Expr getUltimateReceiver(MethodCall call) {
  exists(Expr recv |
    recv = call.getReceiver() and
    (
      result = getUltimateReceiver(recv)
      or
      not recv instanceof MethodCall and result = recv
    )
  )
}

// Names of class methods on ActiveRecord models that may return one or more
// instances of that model. This also includes the `initialize` method.
// See https://api.rubyonrails.org/classes/ActiveRecord/FinderMethods.html
private string staticFinderMethodName() {
  exists(string baseName |
    baseName = ["find_by", "find_or_create_by", "find_or_initialize_by", "where"] and
    result = baseName + ["", "!"]
  )
  // or
  // result = ["new", "create"]
}

private class ActiveRecordModelFinderCall extends ActiveRecordModelInstantiation, DataFlow::CallNode
{
  private ActiveRecordModelClass cls;

  ActiveRecordModelFinderCall() {
    exists(MethodCall call, Expr recv |
      call = this.asExpr().getExpr() and
      recv = getUltimateReceiver(call) and
      (
        // The receiver refers to an `ActiveRecordModelClass` by name
        recv.(ConstantReadAccess).getAQualifiedName() = cls.getAQualifiedName()
        or
        // The receiver is self, and the call is within a singleton method of
        // the `ActiveRecordModelClass`
        recv instanceof SelfVariableAccess and
        exists(SingletonMethod callScope |
          callScope = call.getCfgScope() and
          callScope = cls.getAMethod()
        )
      ) and
      (
        call.getMethodName() = staticFinderMethodName()
        or
        // dynamically generated finder methods
        call.getMethodName().indexOf("find_by_") = 0
      )
    )
  }

  final override ActiveRecordModelClass getClass() { result = cls }
}

class FinderCallArgument extends DataFlow::Node {
  private ActiveRecordModelFinderCall finderCallNode;

  FinderCallArgument() { this = finderCallNode.getArgument(_) }
}

class ParamsHashReference extends DataFlow::CallNode {
  private Rails::ParamsCall params;

  // TODO: only direct element references against `params` calls are considered
  ParamsHashReference() { this.getReceiver().asExpr().getExpr() = params }

  string getArgString() {
    result = this.getArgument(0).asExpr().getConstantValue().getStringlikeValue()
  }
}

class ArrayPassedToActiveRecordFinder extends TaintTracking::Configuration {
  ArrayPassedToActiveRecordFinder() { this = "ArrayPassedToActiveRecordFinder" }

  override predicate isSource(DataFlow::Node source) { source instanceof ParamsHashReference }

  override predicate isSink(DataFlow::Node sink) {
    sink instanceof FinderCallArgument
  }

  string getParamsArg(DataFlow::CallNode paramsCall) {
    result = paramsCall.getArgument(0).asExpr().getConstantValue().getStringlikeValue()
  }

  // this doesn't check for anything fancy like whether it's reused in an if/else
  // only intended for quick manual audit filtering of interesting candidates
  // so remains fairly broad to not induce false negatives
  predicate paramsUsedAfterLookups(DataFlow::Node source) {
    exists(DataFlow::CallNode y | y instanceof ParamsHashReference
    and source.getEnclosingMethod() = y.getEnclosingMethod()
    and source != y
    and getParamsArg(source) = getParamsArg(y)
    // we only care if it's used again AFTER an object lookup
    and y.getLocation().getStartLine() > source.getLocation().getStartLine())
  }
}

from ArrayPassedToActiveRecordFinder config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink) and config.paramsUsedAfterLookups(source)
select source, sink.getLocation()

Conclusion

CodeQL can be very useful for product security engineering teams to detect and prevent vulnerabilities at scale. We use a combination of queries that run in CI using our query pack and one-off queries run through MRVA to find potential vulnerabilities and communicate them to engineers. CodeQL isn’t only useful for finding security vulnerabilities, though; it is also useful for detecting the presence or absence of security controls that are defined in code. This saves our security team time by surfacing certain security problems automatically, and saves our engineers time by detecting them earlier in the development process.

Writing custom CodeQL queries

Tips for getting started

We have a large number of articles and resources for writing custom CodeQL queries that can help you get started if you haven’t written them before.

Improve the security of your applications today by enabling CodeQL for free on your public repositories, or try GitHub Advanced Security for your organization.

Michael Recachinas, GitHub Staff Security Engineer, also contributed to this blog post.

The post How GitHub uses CodeQL to secure GitHub appeared first on The GitHub Blog.

February 13, 2025  15:17:26

We’ve all been there—staring at a function that looks like it was written by an over-caffeinated goblin at 3 AM (maybe even your alter ego). You could pretend it doesn’t exist, or you could refactor it. Luckily, GitHub Copilot makes the second option less painful.

Let’s get to it.

What is code refactoring?

Feel free to breeze past this section if you already know what’s involved with refactoring code, but we wanted to take a moment to cover what we’ll be looking at today.

Think of refactoring as giving your project some digital spring cleaning—a glow up for your functions, classes, and modules. But instead of just tidying up, you’re making your code more efficient, maintainable, and readable, all without changing its external behavior.

Some standard ways of refactoring include:

  • Simplifying complex conditionals (because no one should need a PhD to read your if statements)
  • Extracting duplicated logic (so you’re not trying to maintain code in ten different places)
  • Improving variable and function names (because doThing() is a crime against humanity)
  • Converting monolithic functions into smaller, modular pieces (to prevent the dreaded “function that spans multiple screens” scenario)

Refactoring isn’t just about tidiness—it’s about making your codebase more resilient, scalable, and enjoyable to work with. Let’s find out how GitHub Copilot can help you do it faster and with fewer headaches.

Know what your code does before you refactor anything

It might seem obvious to say this, but before you can refactor any code you need to understand how it works. If you don’t know what your code is doing, you won’t know whether or not the “improvements” you’re making are changing the core functionality of the code.

Consider the following method:

public String getSound(String animal) {
  if (animal == null) {
      System.out.println("Oops! A null animal?");
  } else if (animal.equalsIgnoreCase("Dog")) {
      return "Bark";
  } else if ( animal.equalsIgnoreCase("Cat")) {
      return "Meow";
  } else if ( animal.equalsIgnoreCase("Bird")) {
      return "Tweet";
  }
  return "Unknown";
}

You might look at this and immediately think “they should use a switch statement,” and that would be one example of refactoring the code. But having that knee-jerk reaction requires you to know how if-statements and switch-statements work. You can only make that suggestion if you understand this code will continue cycling through if-statements until it finds an appropriate match. Otherwise, it will return “Unknown”.

As codebases get more complex and make calls between multiple files, this gets (much) more complicated. And this is one way Copilot can help you.

You can ask Copilot Chat to explain how some code works, either by asking in plain language or using the /explain slash command. To limit the scope of what Copilot looks at, select the code in your IDE before asking your query, or specify specific files for it to consider by using #file. While you’re at it, you can even ask it to add code comments to help you (or anyone else reading the code) in the future.

Here are some sample prompts:

  • Explain what this code does.
  • What is this code doing?
  • Add comments to this code to make it more understandable.

You should use Copilot Chat to analyze and explain your codebase until you fully understand the code you’re looking to refactor.

Try some blanket improvements to refactor your code

Like most things in life, it’s usually best to start small. When you’re first getting started with refactoring code, keep it simple: open up Copilot Chat in your project and ask “how would you improve this?” Just like when you are asking GitHub Copilot to explain some code, you can specify what it looks at by highlighting sections of code or identifying specific files by using #file.

Here are some sample prompts:

  • How would you improve this?
  • Improve the variable names in this function.
  • #file:pageInit.js, #file:socketConnector.js Offer suggestions to simplify this code.

Copilot will then offer suggestions to improve the code in the way that you specified. This is great for getting started, but Copilot can do much more if you give it some guidance.

When working with any generative AI-powered developer tool, it is often useful to include more context in your prompts, which makes them more likely to get you the results you’re looking for. By being specific about what you want the tool to do, you focus its efforts toward that end.

It’s a bit like if someone tells you to code something, and you have so many ideas and questions that you’re not quite sure where to begin. But if someone tells you they specifically want you to code a class that sorts a list, you can focus on the task at hand.

The prompts above don’t offer much specificity or context, which leaves Copilot to explore all the possible ways your code could be improved. The upside? You may see options that you might not have considered. The downside is some of the proposed solutions might not address your specific concerns.

Make a plan for refactoring your codebase

What do you want to do to your code? Do you want to make it more readable? Or do you want to find redundant code and remove it? Coming up with a plan for the improvements you want to make will help you to hit your goals. This comes back to making sure you understand your code. If you know how it works, you can come up with a plan for the type of improvements that you want to make.

Maybe your code base has a bunch of different scripts that all perform the same general function. You could strip out the code into a common module to import into each of the different scripts, making the code easier to change and maintain.

To do so, you can direct Copilot to look for these common code sections and to pull them into a single module.

Here’s a sample prompt:

Inspect all my js files for GitHub API calls and create a new class that will manage all the GitHub API calls.

Now that we have provided some guidance and additional context, Copilot will provide suggestions targeting this specific improvement that we want to make to our code.

You can also provide a laundry list of tasks, or ask Copilot to keep things in mind while it is doing the refactoring.

In that vein, here’s a sample prompt to consider:

Can you refactor the GitHubController class to:
- remove nested logic structures
- make the code more concise
- while doing this, check if the code is safe and add comments if not

The takeaway here is Copilot is very good at taking directions—and the more specific your directions are, the more it will generate outputs in line with your intended end goal.

For more guidance on creating good prompts, see our documentation on prompt engineering for GitHub Copilot.

A real-world example: Using GitHub Copilot to refactor code

Sometimes the best way to understand something is to see it in use. To help with this, we wanted to share an experience where we needed to use GitHub Copilot to refactor our own code, interspersed with some general guidance and pointers.

It started with a couple of organizations independently requesting help with their individual large migration projects, which involved moving code from other data centers into tens of thousands of repositories on GitHub. After building a proof of concept, we started creating individual systems, each tailored to the needs of the individual customer and their unique systems. While we were doing this, a third organization made a similar request for help, and we realized this was a common need.

At this point in time, we had two systems that were both suited for large-scale migrations, but each of them was heavily customized for the original company making the request. We quickly realized the best solution involved refactoring our code to improve its modularity so we could use it across different organizations. This would not be easy—the codebase was made up of several different scripts and workflows with a lot of moving parts. And that’s when we turned to Copilot for help.

First we came up with a plan. We created a set of high-level goals and used these to guide us during the refactoring process. We wanted to make sure the code was:

  • Modular and reusable for different organizations: The code should be organized into small, self-contained modules that could be reused and combined for new functionality.
  • Maintainable: The code should be easy to maintain and extend as needed. That meant making it well-organized, well-documented, and easy to understand.
  • Customizable for different organizations: The code should be flexible enough to meet the unique needs of each organization.
  • Following coding standards and best practices: The code should be consistent, readable, and easy to understand.

With this plan in place, it was time to ask Copilot for help.

Step 1: Creating the initial module

Our first step was creating a common module we could import into each of the different scripts. This would allow us to reuse the same code across many different scripts, making it easier to maintain and extend. We created a skeleton of a module called gh-migrations. Then we used Copilot to fill in the details.

Here was the prompt we gave Copilot:

I am creating a module, gh-migrations, which will hold common functionality to be used throughout my project. I have already created an index.js for the module. Please inspect the other js files for GitHub API calls, create a new class for the module which will manage all GitHub API calls. This class should also account for rate limiting and errors. Do not make edits to the javascript files, rather identify the API calls, and add the functionality to the new GitHub API class.

And here was Copilot’s response:

I'll help create a GitHub API management class based on the API calls found in the provided files. Here's a step-by-step solution:

1. From analyzing the files, I identified these common GitHub API operations: 
  * Issues creation and updates
  * Issue search
  * Rate limit handling (403 errors)
  * Common error handling patterns

2. I'll create a class that encapsulates these operations with proper error handling and rate limiting.

.github/lib/gh-migrations/index.js
Here's the GitHub API management class implementation:

<code>

This implementation:

1. Creates a GitHubApiManager class that handles all GitHub API interactions
2. Implements automatic rate limiting retry logic in withRateLimit
3. Provides methods for common operations: 
  * createIssue: Creates a new issue
  * updateIssue: Updates an existing issue
  * searchIssues: Searches for issues using GitHub's search API
  * findIssueByTitle: Helper method to find issues by title
  * getIssue: Gets a specific issue by number

Usage example:

<code>

The class handles:
  * Automatic retry on rate limits
  * Consistent error handling
  * Repository context management
  * Common GitHub API operations used across the codebase

This centralizes all GitHub API interactions and makes the code more maintainable and consistent.

After reviewing the code (as you should always do!), we made the recommended changes. The next step was putting these changes into a separate file while ensuring the index.js file could reference it.

Here was the prompt we entered for Copilot:

Rename the class GitHubApiManager to GHApi and put it in a separate file that is referenced in the index.js file.

And here was Copilot’s response:

I'll help you split the code into two files. Here's the step by step solution:

1. Create a new file gh-api.js for the GHApi class
2. Update index.js to export the renamed class from the new file

.github/lib/gh-migrations/gh-api.js
Create a new file containing the renamed class:

<code>

.github/lib/gh-migrations/index.js
Update the index file to export the renamed class from the new file:

<code>

The changes:
1. Created a new file gh-api.js with the renamed class
2. Moved all the class implementation to the new file
3. Updated the class name from GitHubApiManager to GHApi
4. Modified index.js to require and export the class from the new file

Now consumers of this module can continue to import it the same way, but the implementation is better organized.

After making these changes, we had to repeat the process for each of the files that contained GitHub API calls. Note that we didn't want to refactor the code directly at this stage. Instead, we wanted to build up a common module we could use for refactoring the code later.

Here was the prompt we entered for Copilot:

The gh-api.js file is a new class that will be used to manage all GitHub API calls. Please review the other javascript files for GitHub API calls to be added to gh-api.js. Do not edit the javascript files, only add new functionality to gh-api.js.

And here was Copilot’s response:

I'll analyze the JavaScript files and add the missing API functionality to the GHApi class.

.github/lib/gh-migrations/gh-api.js
Add methods for issue operations, searching, and deletion based on the API calls found in migration.js and application.js:

<code>

This implementation adds all the GitHub API functionality found in the other files:

1. From migration.js:
  * createComment for updating issues with comments
  * getIssue and isMigrationIssue for checking migration status
2. From application.js:
  * searchIssues and searchIssuesByTitle for finding issues
  * getIssueNodeId and deleteIssue for issue deletion via GraphQL
  * findJiraIssue helper method to get Jira ticket information

The methods maintain the same rate limiting and error handling patterns while providing a cleaner interface for the API operations.
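One detail worth spelling out: GitHub's REST API has no endpoint for deleting issues, which is why the deletion path goes through GraphQL. Here is a hedged sketch of what helpers along the lines of getIssueNodeId and deleteIssue might look like (illustrative, written as standalone functions around octokit.graphql rather than the generated class methods):

```javascript
// Illustrative sketch: issues can only be deleted via the GraphQL deleteIssue
// mutation, so first resolve the issue's node ID, then delete it.
async function getIssueNodeId(octokit, owner, repo, issueNumber) {
  const result = await octokit.graphql(
    `query ($owner: String!, $repo: String!, $number: Int!) {
       repository(owner: $owner, name: $repo) {
         issue(number: $number) { id }
       }
     }`,
    { owner, repo, number: issueNumber }
  );
  return result.repository.issue.id;
}

async function deleteIssue(octokit, owner, repo, issueNumber) {
  const issueId = await getIssueNodeId(octokit, owner, repo, issueNumber);
  return octokit.graphql(
    `mutation ($issueId: ID!) {
       deleteIssue(input: { issueId: $issueId }) { clientMutationId }
     }`,
    { issueId }
  );
}
```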

Step 2: Adding new classes

Once we had the new GHApi class to handle the GitHub API, we wanted to create some reusable classes for different entities in our application. Since our application is a migration tool managed via GitHub Issues, we wanted to create classes for the following entities:

  • Migration issues represent repository migrations from various sources. They contain configuration and metadata about the repository and how the migration should be performed.
  • Batch issues represent a batch of migrations. They contain higher-level metadata about groups of individual migration issues, such as custom application IDs and the status of the batch.
  • Team issues represent the migration of a group of users, their associated repositories, and the permissions to those repositories.
  • Rewire issues represent Azure DevOps pipelines rewire operations. They allow for the reconfiguration of Azure DevOps pipelines to point to the new GitHub repositories after a migration has been completed.

Here was the prompt we entered for Copilot:

Review the create-migration-issue.js and develop a new class called MigrationIssue that contains the properties associated with a migration. This should be added to the gh-migrations module where index.js is located.

And here was Copilot’s response:

I'll help create a new MigrationIssue class based on the migration configuration and status shown in the code. Here's the step-by-step solution:

1. Create a new file migration-issue.js in the gh-migrations directory
2. Update the index.js file to expose the new class
3. Extract configuration and status properties from the create-migration-issues.js file into the new class

migration-issue.js
Create a new class to represent a migration issue with all necessary properties and methods.

<code>

index.js
Update the index.js to expose the new MigrationIssue class.

<code>

The new MigrationIssue class encapsulates all the properties that were previously spread across various objects in the create-migration-issues.js file. It provides a clean interface for creating and managing migration issues, with helper methods for generating issue titles and source types.

You can now use this class in create-migration-issues.js by replacing the manual object creation with the MigrationIssue class. For example, the createMigrationConfig function could be replaced with:

<code>

The process for creating classes for the other entities was similar. We created BatchIssue, TeamIssue, and RewireIssue classes, each with its own properties and methods.
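To give a feel for the shape of these entity classes, here is a stripped-down, hypothetical sketch of one of them (the real generated classes carry much more configuration and metadata):

```javascript
// Hypothetical, simplified sketch of an entity class in the gh-migrations module.
class MigrationIssue {
  constructor({ sourceUrl, sourceType, targetOrg, targetRepo }) {
    this.sourceUrl = sourceUrl;   // where the repository is migrating from
    this.sourceType = sourceType; // e.g. "azure-devops" or "github-enterprise-server"
    this.targetOrg = targetOrg;   // destination GitHub organization
    this.targetRepo = targetRepo; // destination repository name
    this.status = "pending";
  }

  // Issue titles follow a predictable pattern so they can be searched later.
  generateTitle() {
    return `Migration: ${this.targetOrg}/${this.targetRepo}`;
  }
}

module.exports = MigrationIssue;
```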

Step 3: Encapsulation

After we built out our entity classes, we realized they should manage their own state. This is a common pattern in object-oriented programming, where the class is responsible for managing its state and behavior. Remember our goal about following good coding standards?

The class should include methods to get and set the object's state, as well as methods to perform actions on it. It should also be responsible for calling the GitHub API and managing its own state, rather than leaving that to the client code. This helps standardize and protect the object's state, and gives the client code a simpler interface to work with.

Here was the prompt we entered for Copilot:

The MigrationIssue class should contain an initializer that uses the gh-api.js to create the resulting GitHub Issue. The initializer should take in an optional parameter called issueNumber, if the issueNumber is omitted the initializer should create a new Migration Issue and return the issue number created, if the issueNumber is passed to the initializer, the gh-api.js should be used to return the title and body of the issue which the class will parse to create the properties of the MigrationIssue.

And here was Copilot’s response:

I'll help you modify the code to add initialization logic to the MigrationIssue class. Here's the step-by-step solution:

1. Add a static method to parse issue body into configuration
2. Add initialization methods to handle both new and existing issues
3. Modify constructor to accept GHApi instance
4. Add method to generate issue body
5. Add method to create or load issue

migration-issue.js
Update the MigrationIssue class to handle initialization from GHApi.

<code>

The main changes include:

1. Constructor now accepts a GHApi instance as first parameter
2. Added parseIssueBody static method to parse configuration from issue body
3. Added generateIssueBody method to create the issue body format
4. Added initialize async method that either creates a new issue or loads an existing one
5. The initialize method returns the issue number in both cases

To use this updated class:

<code>
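For a rough sense of the calling pattern this enables, here is a hypothetical usage sketch of our own (not the generated snippet), run inside an async function:

```javascript
// Hypothetical usage of the intermediate initialize() design (sketch only).
const api = new GHApi({ token: process.env.GITHUB_TOKEN, owner: "my-org", repo: "migrations" });

// Creating a new migration issue returns the number of the issue it creates...
const migration = new MigrationIssue(api, {
  sourceUrl: "https://dev.azure.com/contoso/project/_git/service-a",
  targetOrg: "my-org",
  targetRepo: "service-a",
});
const issueNumber = await migration.initialize();

// ...while passing an existing issue number loads and parses that issue instead.
const existing = new MigrationIssue(api, {});
await existing.initialize(issueNumber);
```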

Remember how we said it’s always important to review suggestions from Copilot? In this case, the initialize method wasn’t wired into the constructor as we would have expected. We asked Copilot to address this with an iterative prompt.

Here was the prompt we entered for Copilot:

The MigrationIssue class has an initialize function that encapsulates functionality that should be included in the constructor. Rewire the constructor method to include the initialize function.

And here was Copilot’s response:

Let's modify the class to move the initialization logic into the constructor. Here's the step-by-step solution:

1. Make constructors async
2. Move initialization logic into constructor
3. Remove separate initialize method
4. Keep static parseIssueBody method

migration-issue.js
Move initialization logic into constructor.

<code>

The main changes:

1. Removed initialize method from the class
2. Added static create and load factory methods
3. Added save method to handle issue creation
4. Constructor remains synchronous but now only handles property initialization

Usage example:

<code>

This approach follows better object-oriented principles by:

1. Using factory methods for different creation scenarios
2. Keeping the constructor simple and synchronous
3. Separating issue creation into a distinct operation
4. Making the API more intuitive

In this case, Copilot suggested a different approach than we had in mind: factory methods, which are a great way to handle the different scenarios of creating or loading an issue. Even though it wasn’t what we expected, it worked, and it kept the constructor clean and focused on initializing the object. Moreover, the save method neatly encapsulates the logic for creating a new issue, which makes the code more readable.
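For readers who haven’t used the pattern before, here is a generic sketch of the factory-method shape (our own simplification, not the code Copilot produced):

```javascript
// Illustrative sketch of the factory-method shape (simplified; body stored as JSON here).
class MigrationIssue {
  constructor(api, config) {
    this.api = api;          // GHApi instance used for all GitHub calls
    this.config = config;    // migration configuration (source, target, etc.)
    this.issueNumber = null; // set once the backing GitHub issue exists
  }

  // Parse configuration back out of an issue body; trivially JSON in this sketch.
  static parseIssueBody(body) {
    return JSON.parse(body);
  }

  // Factory for brand-new migrations: build the object, then persist it.
  static async create(api, config) {
    const issue = new MigrationIssue(api, config);
    await issue.save();
    return issue;
  }

  // Factory for existing migrations: load the issue and rebuild the object.
  static async load(api, issueNumber) {
    const { data } = await api.getIssue(issueNumber);
    const issue = new MigrationIssue(api, MigrationIssue.parseIssueBody(data.body));
    issue.issueNumber = issueNumber;
    return issue;
  }

  // Persist the migration as a GitHub issue and remember its number.
  async save() {
    const { data } = await this.api.createIssue(
      `Migration: ${this.config.targetOrg}/${this.config.targetRepo}`,
      JSON.stringify(this.config, null, 2)
    );
    this.issueNumber = data.number;
    return this.issueNumber;
  }
}
```

The appeal is that calling code never sees a half-constructed object: create gives you an issue that already exists on GitHub, and load gives you one rebuilt from an existing issue.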

Now we just needed to repeat this process with each of the new classes we added.

Step 4: Final refactoring

Once we had built the common module, we needed to refactor the client code to make it cleaner and easier to read and maintain. From there, we could also identify and add any missing functionality in the common module.

This was an iterative process: we went back and forth between the client code and the common module to make sure everything worked properly. Because of the way this project evolved, we had the opportunity to make sweeping changes to the codebase. It’s far more common to make small, incremental changes, especially to code that’s actively in use. In either case, don’t forget to write tests to ensure the code works as expected.

Here was the prompt we entered for Copilot:

Using the gh-migrations module, refactor the create-migration-issues.js

And here was Copilot’s response:

I'll help you refactor the create-migration-issues.js file to use the gh-migrations module. Here's the step-by-step solution:

1. Import the gh-migrations module
2. Create GHApi instance
3. Use MigrationIssue class for creating and managing migration issues
4. Simplify the main logic using the new classes

create-migration-issues.js
Refactor to use the gh-migrations module for better code organization and reusability.

<code>

The refactoring:

1. Removes redundant code by using the MigrationIssue class
2. Simplifies issue creation and management
3. Uses GHApi for GitHub API calls with built-in rate limiting
4. Removes duplicate code for different source types
5. Improves code organization and maintainability

The main changes:
  * Removed several helper functions as they're now part of the MigrationIssue class
  * Simplified the main logic flow
  * Added better error handling
  * Used structured configuration object
  * Maintained the same functionality while reducing code complexity
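To give a flavor of what the refactored client code looks like after these changes, here is a hypothetical slice of such a script (a sketch, not our actual create-migration-issues.js; the paths and environment variables are placeholders):

```javascript
// Hypothetical sketch of the refactored script's core loop.
const { GHApi, MigrationIssue } = require("./.github/lib/gh-migrations");

async function main() {
  const api = new GHApi({
    token: process.env.GITHUB_TOKEN,
    owner: process.env.TARGET_ORG,
    repo: "migrations",
  });

  // Each entry describes one repository to migrate; the shape is illustrative.
  const repositories = JSON.parse(process.env.REPOSITORIES || "[]");

  for (const repo of repositories) {
    const migration = await MigrationIssue.create(api, {
      sourceUrl: repo.sourceUrl,
      targetOrg: process.env.TARGET_ORG,
      targetRepo: repo.name,
    });
    console.log(`Created migration issue #${migration.issueNumber} for ${repo.name}`);
  }
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
```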

Now we needed to refactor the remaining scripts in the same manner. For example, this is how we used Copilot to help with refactoring the create-team-issues.js script.

Here was the prompt we entered for Copilot:

Using the gh-migrations module refactor the create-team-issues.js

And here was Copilot’s response:

I'll help you refactor the create-team-issues.js file to use the gh-migrations module. Here's the step-by-step solution:

1. Import the gh-migrations module
2. Replace manual issue creation with TeamIssue class
3. Simplify team creation logic using the new class
4. Maintain the same JiraIssue tracking functionality

create-team-issues.js
Refactor to use the gh-migrations module for creating team issues.

<code>

The refactoring:

1. Uses the TeamIssue class to manage team issue creation
2. Simplifies the code by removing redundant config and status objects
3. Uses GHApi for GitHub operations with built-in rate limiting
4. Maintains the same JiraIssue tracking functionality
5. Reduces code duplication by using a loop for team creation

The code is now more maintainable and follows the same pattern as the migration issues implementation.

From there, it was a matter of rinsing and repeating this process with each of our scripts.

By first creating the common module, we could guide Copilot and make sure that the final, refactored codebase was something we could use with multiple organizations. It gave us a foundation we could reuse and customize based on the needs of the system we were migrating.

Things to keep in mind

Sometimes refactoring code can be really easy, like our first example with the sequential if statements. But the more you dig into it, the more it grows, and you might find yourself facing down the Herculean task of refactoring a codebase that stretches across multiple files and was written by a larger team of developers.

Speaking from experience: It can be daunting to even know where to start.

Remember to take a step back and start with the basics. Your first step should always be improving your understanding of the codebase you want to refactor. The more you know about what you’re starting with, the greater the chance you’ll be able to think of ways to refactor it effectively. And once you come up with some ideas, you can always ask GitHub Copilot to help turn your vision into reality. It’s got your back and is more than willing to help.

Start using GitHub Copilot for free
Our free version of GitHub Copilot is included by default in personal GitHub accounts and VS Code to help you start new projects, manage existing repositories, and more.

Start using GitHub Copilot >

The post How to refactor code with GitHub Copilot appeared first on The GitHub Blog.

February 7, 2025  18:45:39

It’s no wonder developers are increasingly overwhelmed. The number of new CVEs published each year has increased by nearly 500% in the last decade. And the average project, with just 10 direct dependencies, can have hundreds of indirect dependencies. Put simply, developers are often buried under a mountain of security alerts and unable to prioritize which ones to remediate first.

While high-profile supply chain attacks like last year’s XZ Utils backdoor tend to capture attention, the danger they pose is just a fraction of the overall threat landscape. The bigger risk often comes from unpatched vulnerabilities in lesser-known open source dependencies.

GitHub’s partnership with Endor Labs cuts through the noise to help developers accurately identify, remediate, and fix the most critical vulnerabilities—without ever leaving GitHub.

With Endor Labs software composition analysis (SCA) integrated into GitHub Advanced Security and Dependabot, development teams can dismiss up to 92% of low-risk dependency security alerts to focus instead on the vulnerabilities that matter most.

A GitHub code scanning page shows several active vulnerabilities from GitHub’s advisory database labeled 'Critical' by Endor Labs.
Prioritize Endor Labs findings in GitHub based on function-level vulnerability reachability for both direct and transitive dependencies.

How it works

Endor Labs SCA brings context into open source vulnerability detection

Endor Labs SCA helps identify and prioritize dependency vulnerabilities by their potential impact, according to factors like reachability, exploitability, and more. For example, Endor Labs checks if the vulnerable function of a given dependency is actually reachable by your application or if it is just sitting on an unused corner of a transitive dependency. Security teams can also configure risk, licensing, and permission profiles to ensure developers are not bothered unless the risk is truly warranted.

Prioritize and fix open source vulnerabilities with GitHub

GitHub Advanced Security integrates crucial security practices directly into the development workflow, offering developers a streamlined way to secure their code. Its features are free for open source maintainers, including dependency review, secret scanning, code scanning, and Copilot Autofix.

Dependabot, available for free to all GitHub users, automates dependency updates, so you can spend more time building. Developers can remediate vulnerabilities by merging Dependabot-authored pull requests with the click of a button or by applying Endor Patches.

Secure your automated workflows

GitHub Actions makes it easy to automate all your software workflows, whether you want to build a container, deploy a web service, or welcome new users to your open source project. The actions you depend on are frequently updated with bug fixes and new features, and keeping those versions current takes time.

Endor Labs automatically discovers in-use actions and their dependencies to ensure they fit your risk, licensing, and permission profiles. Dependabot automatically updates your dependencies, and code scanning helps identify existing workflow configuration vulnerabilities and prevent new ones.

Get started
Sign up with Endor Labs and learn more about getting started with the Endor Labs GitHub App.

The post From finding to fixing: GitHub Advanced Security integrates Endor Labs SCA appeared first on The GitHub Blog.

February 6, 2025  20:00:13

When we introduced GitHub Copilot back in 2021, we had a clear goal: to make developers’ lives easier with an AI pair programmer that helps them write better code. The name reflects our belief that artificial intelligence (AI) isn’t replacing the developer. Instead, it’s always on their side. And like any good first officer, Copilot can also fly by itself: for example, when providing pull request feedback, autofixing security vulnerabilities, or brainstorming on how to implement an issue.

Today, we are upgrading GitHub Copilot with the force of even more agentic AI – introducing agent mode and announcing the General Availability of Copilot Edits, both in VS Code. We are adding Gemini 2.0 Flash to the model picker for all Copilot users. And we’re unveiling a first look at Copilot’s new autonomous agent, codenamed Project Padawan. From code completions, chat, and multi-file edits to workspace and agents, Copilot puts the human at the center of the creative work that is software development. AI helps with the things you don’t want to do, so you have more time for the things you do.

Agent mode available in preview 🤖

GitHub Copilot’s new agent mode is capable of iterating on its own code, recognizing errors, and fixing them automatically. It can suggest terminal commands and ask you to execute them. It also analyzes run-time errors with self-healing capabilities.

In agent mode, Copilot will iterate on not just its own output, but the result of that output. And it will iterate until it has completed all the subtasks required to complete your prompt. Instead of performing just the task you requested, Copilot now has the ability to infer additional tasks that were not specified, but are also necessary for the primary request to work. Even better, it can catch its own errors, freeing you up from having to copy/paste from the terminal back into chat.

Here’s an example where GitHub Copilot builds a web app to track marathon training:

To get started, you’ll need to download VS Code Insiders and then enable the agent mode setting for GitHub Copilot Chat:

Settings screen for Visual Studio Code showing the words 'Copilot Agent' in the settings search box, and the option for Chat Agent: Enabled activated

Then, when in the Copilot Edits panel, switch from Edit to Agent right next to the model picker:

Agent mode will change the way developers work in their editor; and as such, we will bring it to all IDEs that Copilot supports. We also know that today’s Insiders build isn’t perfect, and welcome your feedback as we improve both VS Code and the underlying agentic technology in the coming months.

Copilot Edits, now GA in VS Code 🎉

Announced at GitHub Universe in October last year, Copilot Edits combines the best of Chat and Inline Chat with a conversational flow and the ability to make inline changes across a set of files that you manage. The feedback you provided in the past was instrumental in shipping this feature as GA in VS Code today. Thank you!

In Copilot Edits you specify a set of files to be edited, and then use natural language to ask GitHub Copilot for what you need. Copilot Edits makes inline changes in your workspace, across multiple files, using a UI designed for fast iteration. You stay in the flow of your code while reviewing the suggested changes, accepting what works, and iterating with follow-up asks.

Visual Studio Code showing multiple files added to Copilot Edits

Behind the scenes, Copilot Edits leverages a dual-model architecture to enhance editing efficiency and accuracy. First, a foundation language model considers the full context of the Edits session to generate initial edit suggestions. You can choose your preferred foundation model from OpenAI’s GPT-4o, o1, and o3-mini, Anthropic’s Claude 3.5 Sonnet, and now Google’s Gemini 2.0 Flash. For the optimal experience, we developed a speculative decoding endpoint, optimized for fast application of changes in files. The proposed edits from the foundation model are sent to the speculative decoding endpoint, which then proposes those changes inline in the editor.

Copilot Edits works because it puts you in control, from setting the right context to accepting changes. The experience is iterative: when the model gets it wrong, you can review changes across multiple files, accept good ones and iterate until, together with Copilot, you arrive at the right solution. After accepting changes, you can run the code to verify the changes and, when needed, undo in Copilot Edits to get back to a previous working state. Copilot Edits is in the Secondary Side Bar (default on the right) so that you can interact with views in the Primary Side Bar, such as the Explorer, Debug, or Source Control view, while you’re reviewing proposed changes. For example, you can have unit tests running in the Testing view on the left, while using the Copilot Edits view on the right, so that in every iteration you can verify if the changes Copilot Edits proposed are passing your unit tests.

Using your voice is a natural experience while using Copilot Edits. Just talking to Copilot makes the back and forth smooth and conversational. It almost feels like interacting with a colleague with area expertise, using the same kind of iterative flow that you would use in real-life pair programming.

Next on our roadmap is to improve the performance of the apply changes speculative decoding endpoint, support transitions into Copilot Edits from Copilot Chat by preserving context, suggest files to the working set, and allow you to undo suggested chunks. If you want to be among the first to get your hands on these improvements, make sure to use VS Code Insiders and the pre-release version of the GitHub Copilot Chat extension. To help improve the feature, please file issues in our repo.

Beyond the GA in VS Code, Copilot Edits is now in preview for Visual Studio 2022.

Project Padawan: SWE agents on GitHub

We’re excited to share a first look at our autonomous SWE agent and how we envision these types of agents will fit into the GitHub user experience. When the product we are building under the codename Project Padawan ships later this year, it will allow you to directly assign issues to GitHub Copilot, using any of the GitHub clients, and have it produce fully tested pull requests. Once a task is finished, Copilot will assign human reviewers to the PR, and work to resolve feedback they add. In a sense, it will be like onboarding Copilot as a contributor to every repository on GitHub. ✨

Behind the scenes, Copilot automatically spins up a secure cloud sandbox for every task it’s assigned. It then asynchronously clones the repository, sets up the environment, analyzes the codebase, edits the necessary files, and builds, tests, and lints the code. Additionally, Copilot takes into account any discussion within the issue or PR, and any custom instruction within the repository, so it understands the full intent of its task, as well as the guidelines and conventions of the project.

And just as we did with Copilot Extensions and the model picker in Copilot, we will also provide opportunities to integrate into this AI-native workflow and work closely with partners and customers in a tight feedback loop. We believe the end-state of Project Padawan will result in transforming how teams manage critical-yet-mundane tasks, such as fixing bugs or creating and maintaining automated tests. Because ultimately, it’s all about empowering developers by allowing them to focus on what matters, and letting copilots do the rest. And don’t worry. We will have patience, so the agent won’t turn to the dark side. 😉

Awaken the agent with agent mode for GitHub Copilot in VS Code today.

The post GitHub Copilot: The agent awakens appeared first on The GitHub Blog.

February 7, 2025  18:50:16

After months or years of hard work, you’ve just pushed your open source project to GitHub and made it public. Now it’s time to tell the world about it.

Chances are you’d rather spend time writing code than getting the word out about your project. Maybe your project will go viral and you won’t have to spend much time on marketing. But chances are you’re going to need to do some work to build awareness, at least in the early days. Fortunately, there are plenty of people who have been down this path and are willing to help. In this article, experienced maintainers offer their advice on sharing open source projects with the world.

Don’t be shy about self-promotion

Start with the obvious. Post to social media about your project. Submit it to Hacker News, Reddit, Product Hunt, and similar sites. Then keep an eye out for people who have the problem you’re trying to solve. Respond to their posts and let them know you have a potential solution. Reach out to podcasts and YouTube channels. Submit talks to conferences. Offer to speak at meetups.

Keep promoting your work as you improve the project. Remember that people want to hear about helpful tools that solve real problems, as long as you’re genuinely trying to help and not just spamming your followers. You might not be comfortable with self-promotion, but you need to promote your work to get it out there. “You shouldn’t feel icky about it,” Sidecar maintainer Aaron Francis told us in a Q&A. “You put a lot of time into making something helpful.”

Focus on the problem your project solves

What should you say when you’re promoting your work? First and foremost, you need to know what problem your project solves and be able to communicate that to potential users as simply as possible. “One of the biggest mistakes I see is the use of too much technical terminology,” says Chakra UI maintainer Segun Adebayo. It might be tempting to talk about the technologies you’ve built your solution upon, or the latest buzzwords you think users might be interested in. Open source users are, after all, often your fellow developers and technical people. But it’s easy to go overboard and obscure the value of the project.

For example, your project might make clever use of decentralized computing principles, points out Tasha Drew, co-chair for Kubernetes’ Working Group for Multi-tenancy, but what people really care about is why they should use it. “What’s the message you want people to take away from your webpage or your README? It’s probably not related to the theory behind the code,” she says.

Use that core message everywhere: Social media posts and profiles, blog posts, tutorials, etc.

Document, document, document!

Getting someone’s attention is only one part of the battle. If you want people to actually use, share, and contribute to your project, you need clear, up-to-date documentation. “Write as much as you can stand to write,” Francis says. Not only will it make your user experience better, it might even improve your code. “If you find it’s hard to document a particular feature, that’s probably a sign that it’s too complicated and you need to simplify it,” he explains.

Think beyond just documenting the code. You should provide things like quick starts, tutorials, and screencasts. “Video is really helpful for a lot of people,” Adebayo says. “People learn in different ways so it’s important to provide different types of content.”

Be responsive

No matter how good your documentation is, people are still going to have questions—and, if you’re lucky, pull requests. It’s important to be responsive, especially when you’re just starting out. “Time is finite, we only get one life, so value those people who are willing to spend some of their precious resources on you,” Francis says. “That applies not just to people sending pull requests, but to people pointing out problems or making suggestions on social media as well.”

That doesn’t mean you have to be on call 24/7 to provide an immediate reply to every single question and comment. But it does mean you shouldn’t let pull requests, issues, and comments sit for too long without a response. You have to let people know your project is active, and that you value their input. “It might be intimidating at first to interact with people you don’t know, but you have to do it if you want to grow,” says Adebayo. “This is a sure way to meet new people and make new friends that might be helpful to you in the future.”

Invest time in onboarding contributors

You need to document both how to use your project, and how to contribute to it. Create CONTRIBUTING.md and CODE_OF_CONDUCT.md files with your contribution guidelines and code of conduct. These let potential contributors know that you’re open to contributions and that you’ve put some thought into working with others. It’s especially helpful to provide a list of what you would, and would not, like potential contributors to help with.

Remember that non-code contributions, like documentation, support, and graphic design, are a big part of any successful project. While these aren’t necessarily non-technical, you shouldn’t assume too much technical knowledge. “You want to make your language and project easy to understand so that people of various technical skill levels will be interested,” Drew says.

Also be sure to take advantage of the “Help wanted” and “Good first issue” labels. These can help people who are looking for ways to contribute find your project.

Get started contributing to open source now

The post 5 tips for promoting your open source project appeared first on The GitHub Blog.

January 31, 2025  19:26:05

“We have a problem. Our current search method for sifting through PDFs is extremely manual and time consuming. Is there an easier way?”

As a developer, this is one of those questions that really gets me excited. I was tasked with finding a way to transform a cumbersome, archival process into an efficient, intuitive search experience. It’s a way to make a group of people’s lives easier, and because of the organizations they work for, help them be more effective in providing humanitarian assistance to people in need around the world. I couldn’t imagine a better project to be working on.

Unlocking the United Nations’ legacy for rapid action

Since 1945, the United Nations has produced resolutions and other documents that guide international peace and security efforts. Yet accessing this wealth of knowledge remains a challenge, including for organizations such as the International Committee of the Red Cross (ICRC). Currently, delegates at ICRC’s permanent observer mission to the UN advise member states and other stakeholders on international humanitarian law and humanitarian issues. When states negotiate relevant resolutions and other UN products, leaning on pre-existing humanitarian language from UN resolutions can provide precedent. This often requires sifting through PDFs to find relevant content within documents—a time-intensive, manual process ill-suited to the fast-paced world of humanitarian diplomacy.

A live, accessible, and scalable search platform

To solve this, I built a single-page application (SPA) that enables users to input natural language queries and instantly retrieve relevant UN resolutions. The solution is live now at resolutions.projectrefuge.io and serves as a robust example of how technology can simplify access to critical information.

How it works

  1. Text extraction and structuring
    Using Amazon Textract, I extracted raw text from decades’ worth of UN Security Council Resolutions and Presidential Statements and six years of UN General Assembly Resolutions. A Go script then parsed this text using Regex matching, segmenting it into individual resolutions for easier indexing.
  2. Search-ready database with MongoDB Atlas
    I adapted a Node.js script from MongoDB to upload the parsed resolutions as embeddings into a MongoDB Atlas database. This step ensures the content is structured for fast and relevant searches.

  3. User interface built with Vue.js
    The front end is an intuitive SPA created with Vue.js. Users simply enter semantic search queries—such as “resolutions on humanitarian access in armed conflicts”—and receive results in seconds.

  4. Backend hosted on AWS
    The backend relies on AWS Lambda and API Gateway, ensuring scalability and seamless performance. The entire application is hosted as a subdomain on AWS Amplify, combining reliability with ease of access.

A reference architecture diagram summarizing the four steps above: Amazon Textract for text extraction, a Go parsing script, MongoDB Atlas for embedding storage and search, a Vue.js front end, and an AWS Lambda/API Gateway backend hosted via AWS Amplify.

This code is publicly available at projectrefuge/resolutions-search-template. This initiative will empower other organizations to adapt and expand the solution to their unique needs.
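For anyone adapting the approach, here is a heavily simplified sketch of the “store embeddings in MongoDB Atlas and query them” step in Node.js. The database, collection, and index names are placeholders, and embed() stands in for whatever embedding model you use; the real script lives in the template repository:

```javascript
// Simplified sketch: store parsed resolutions with embeddings in MongoDB Atlas,
// then query them with Atlas Vector Search. Names are placeholders.
const { MongoClient } = require("mongodb");

async function uploadAndSearch(resolutions, embed) {
  const client = new MongoClient(process.env.MONGODB_URI);
  await client.connect();
  const collection = client.db("un").collection("resolutions");

  // 1. Upload each resolution with its embedding vector.
  const docs = [];
  for (const resolution of resolutions) {
    docs.push({
      symbol: resolution.symbol,                // e.g. "S/RES/2417 (2018)"
      text: resolution.text,
      embedding: await embed(resolution.text),  // numeric vector from your embedding model
    });
  }
  await collection.insertMany(docs);

  // 2. Query with the $vectorSearch aggregation stage (requires an Atlas
  //    vector search index, assumed here to be named "resolution_index").
  const results = await collection
    .aggregate([
      {
        $vectorSearch: {
          index: "resolution_index",
          path: "embedding",
          queryVector: await embed("humanitarian access in armed conflicts"),
          numCandidates: 100,
          limit: 5,
        },
      },
      { $project: { symbol: 1, text: 1, score: { $meta: "vectorSearchScore" } } },
    ])
    .toArray();

  await client.close();
  return results;
}

module.exports = uploadAndSearch;
```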

Broader implications: a blueprint for impact

The implications of this project go far beyond the ICRC’s use case with UN Resolutions. With slight modifications, the tool could index and search any collection of legal and policy documents. This approach is a blueprint for organizations aiming to leverage technology for better decision-making and more effective action. For nonprofits, this demonstrates the power of owning your code and building tailored solutions. For developers, it’s a reminder of how open source can accelerate progress in humanitarian and public policy sectors.

Build together with open source

Projects like resolutions.projectrefuge.io highlight the potential of open source to transform how we access and use information. If you’re a nonprofit, explore GitHub for Nonprofits to discover tools and resources that can help you build your own solutions. Developers eager to contribute to impactful work can browse the For Good First Issue program to find projects that align with their skills and values.

Finally, stay tuned as we work to identify other opportunities with humanitarian actors such as the ICRC to bridge the technology and humanitarian space. Together, we can build a future where knowledge is more accessible and tools are built with collaboration in mind, ensuring that humanitarian efforts are supported by cutting-edge technology.

Let’s code for good—and make a lasting impact.

If you’d like to lend your developer skills for good, check out For Good First Issue, a curated platform of open source projects that contribute to a better future for everyone.

The post 4 steps to building a natural language search tool appeared first on The GitHub Blog.