I’m excited to share an interview with two researchers that I’ve had the privilege of collaborating with on a recently released paper studying how open source maintainers adjust their work after they start using GitHub Copilot:
Kevin: Thanks so much for chatting, Manuel and Sam! So, it seems like the paper’s really making the rounds. Could you give a quick high-level summary for our readers here?
Manuel: Thanks to the great collaboration with you, Sida Peng from Microsoft, and Frank Nagle from Harvard Business School, we study the impact of GitHub Copilot on developers and how this generative AI alters the nature of work. We find that when you give open source developers a generative AI tool that reduces the cost of the core work of coding, they increase their coding activities and reduce their project management activities. Our results are strongest in the first year after the introduction but persist even two years later. They are driven by developers working more autonomously and less collaboratively: rather than engaging with other humans to solve a problem, they can solve it with AI assistance.
Sam: That’s exactly right. We tried to understand the nature of work even further by digging into the paradigm of exploration vs. exploitation. Loosely speaking, exploitation means exerting effort toward the most lucrative of the already known options, while exploration means experimenting to find new options with a higher potential return. We tested this idea in the context of GitHub. Developers who had access to GitHub Copilot engaged in more exploration and less exploitation; that is, with access to AI they started new projects and had a lower propensity to work on older ones. Additionally, they exposed themselves to more languages that they had not worked with before, and in particular to languages that are valued more highly in the labor market. A back-of-the-envelope calculation based purely on this experimentation with new languages due to GitHub Copilot suggests a value of around half a billion USD within a year.
Kevin: Interesting! Could you provide an overview of the methods you used in your analysis?
Manuel: Would be happy to! We are using a regression discontinuity design in this work, which is as close as you can get to a randomized controlled trial when purely using pre-existing data, such as the data from GitHub, without the researcher introducing randomization. Instead, the regression discontinuity design is based on a ranking of developers and a threshold that GitHub uses to determine eligibility for free access to GitHub Copilot through its top maintainer program.
Sam: The main idea of this method is that a developer who is right below the threshold is roughly identical to a developer who is right above it. Stated differently, by chance a developer happened to have a ranking that made them eligible for the generative AI tool when they could just as easily not have been eligible. Taken together with the fact that developers know neither the threshold nor GitHub’s internal ranking, we can be confident that the changes we observe in coding, project management, and the other activities on the platform are driven by developers having access to GitHub Copilot and nothing else.
Kevin: Nice! Follow-up question: could you provide an “explain like I’m 5” overview of the methods you used in your analysis?
Manuel: Sure thing, let’s illustrate the problem a bit more. Some people use GitHub Copilot and others don’t. If we just looked at the differences between people who use GitHub Copilot vs. those who don’t, we’d be able to see that certain behaviors and characteristics are associated with GitHub Copilot usage. For example, we might find that people who use GitHub Copilot push more code than those who don’t. Crucially, though, that would be a statement about correlation and not about causation. Often, we want to figure out whether X causes Y, and not just that X is correlated with Y. Going back to the example, if it’s the case that those who use GitHub Copilot push more code than those who don’t, these are a few of the different explanations that might be at play: (1) using GitHub Copilot causes developers to push more code; (2) developers who already push more code are more likely to start using GitHub Copilot; or (3) some third factor, such as experience or available time, drives both GitHub Copilot usage and code pushes.
Because (1), (2), and (3) could each result in data showing a correlation between GitHub Copilot usage and code pushes, just finding a correlation isn’t super interesting. One way you could isolate the cause and effect relationship between GitHub Copilot and code pushes, though, is through a randomized controlled trial (RCT).
In an RCT, we randomly assign people to use GitHub Copilot (the treatment group), while others are forbidden from using GitHub Copilot (the control group). As long as the assignment process is truly random and the users comply with their assignments, any outcome differences between the treatment and control groups can be attributed to GitHub Copilot usage. In other words, you could say that GitHub Copilot caused those effects. However, as anyone in the healthcare field can tell you, large-scale RCTs over long time periods are often prohibitively expensive, as you’d need to recruit subjects to participate, monitor them to see if they complied with their assignments, and follow up with them over time.
Sam: That’s right. So, instead, wouldn’t it be great if there were a way to observe developers without running an RCT and still draw valid causal conclusions about GitHub Copilot usage? That’s where the regression discontinuity design (RDD) comes in. The random assignment aspect of an RCT allows us to compare the outcomes of two virtually identical groups. Sometimes, however, randomness already exists in a system, and we can use it as a natural experiment. In the case of our paper, this randomness came in the form of GitHub’s internal ranking for determining which open source maintainers were eligible for free access to GitHub Copilot.
Let’s walk through a simplified example. Let’s imagine that there were one million repositories that were ranked on some set of metrics and the rule was that the top 500,000 repositories are eligible for free access to GitHub Copilot. If we compared the #1 ranked repository with the #1,000,000 ranked repository, then we would probably find that those two are quite different from each other. After all, the #1 repository is the best repository on GitHub by this metric while the #1,000,000 repository is a whole 999,999 rankings away from it. There are probably meaningful differences in code quality, documentation quality, project purpose, maintainer quality, etc. between the two repositories, so we would not be able to say that the only reason why there was a difference in outcomes for the maintainers of repository #1 vs. repository #1,000,000 was because of free access to GitHub Copilot.
However, what about repository #499,999 vs. repository #500,001? Those repositories are probably very similar to each other, and it was all down to random chance as to which repository made it over the eligibility threshold and which one did not. As a result, there is a strong argument that any differences in outcomes between those two repositories are solely due to repository #499,999 having free access to GitHub Copilot and repo #500,001 not having free access. Practically, you’ll want to have a larger sample size than just two, so you would compare a narrow set of repositories just above and below the eligibility threshold against each other.
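To make that intuition concrete, here is a toy simulation in Python (not the paper’s code or data; every number is made up) of comparing repositories just above and below an eligibility cutoff:

```python
import random

random.seed(0)
THRESHOLD = 50_000   # top-ranked repositories are eligible
WINDOW = 2_000       # how close to the cutoff a repository must be to enter the comparison

repos = []
for rank in range(1, 100_001):
    quality = 100_000 - rank                  # correlated with rank everywhere...
    eligible = rank <= THRESHOLD              # ...but eligibility flips only at the cutoff
    pushes = 0.0001 * quality + (5 if eligible else 0) + random.gauss(0, 10)
    repos.append((rank, eligible, pushes))

near_cutoff = [r for r in repos if abs(r[0] - THRESHOLD) <= WINDOW]
treated = [p for _, e, p in near_cutoff if e]
control = [p for _, e, p in near_cutoff if not e]

# Repositories within the window are nearly identical except for eligibility,
# so the difference in means recovers the simulated effect of about 5 extra pushes.
print(sum(treated) / len(treated) - sum(control) / len(control))
```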
Kevin: Thanks, that’s super helpful. I’d be curious about the limitations of your paper and data that you wished you had for further work. What would the ideal dataset(s) look like for you?
Manuel: Fantastic question! Certainly, no study is perfect and there will be limitations. We are excited to better understand generative AI and how it affects work in the future. One limitation is the availability of information from private repositories. If we had information on private repositories, we could test whether more experimentation is going on in private projects, and whether project improvements made with generative AI in private spill over to the public to some degree over time.
Sam: Another limitation of our study is the language-based exercise we use to put a value on GitHub Copilot. We show that developers focus on higher-value languages that they did not know previously, and we extrapolate this estimate to all top developers. This is certainly only a partial equilibrium value, since developer wages may change over time in full equilibrium if more individuals offer their services for a given language. Despite that limitation, the figure is likely an underestimate, since it does not capture any experimentation value beyond new languages, nor any value of GitHub Copilot that is unrelated to experimentation.
Kevin: Predictions for the future? Recommendations for policymakers? Recommendations for developers?
Manuel: One simple prediction for the future is that AI incentivizes the activity for which it lowers the cost. However, it is not yet clear which activities will be incentivized by AI tools, since they can be applied to many domains. It is likely that a multitude of AI tools will incentivize different work activities, which will eventually force employees, managers at firms, and policymakers to consider which activities they want to put weight on. We also have to think about new recombinations of work activities, which are difficult to predict. Avi Goldfarb, a prolific professor at the University of Toronto, and his colleagues gave the example of the steam engine: work used to be organized around the steam engine as a power source, but once electric motors were invented, that was no longer necessary and structural changes followed. Instead of arranging all of the machinery around a giant steam engine in the center of the factory floor, electricity enabled people to design better arrangements, which led to boosts in productivity. I find this historical narrative quite compelling and can imagine, similarly, that AI’s greatest power remains to be unlocked, and that it will be unlocked once we know how work processes can be reorganized. Developers, too, can think about how their work may change in the future. Importantly, developers can actively shape that future, since they are closest to the development of machine learning algorithms and artificial intelligence technologies.
Sam: Adding on to those points, it is not clear when work processes will change or whether the change will reduce or enhance inequality. Many predictions point toward an inequality-enhancing effect, since training large language models requires substantial computing power that is often in the hands of only a few players. On the other hand, it has been documented that lower-ability individuals in particular seem to benefit the most from generative AI, at least in the short term. As such, it’s imperative to understand how the benefits of generative AI are distributed across society, and, where they are not distributed equitably, whether there are welfare-improving interventions that can correct these imbalances. An encouraging result of our study suggests that generative AI can be especially impactful for relatively less skilled workers:
Sam (continued): Counter to widespread speculation that generative AI will replace many entry level tasks, we find reason to believe that AI can also lower the costs of experimentation and exploration, reduce barriers to entry, and level the playing field in certain segments of the labor market. It would be prudent for policymakers to monitor distributional effects of generative AI, allowing the new technology to deliver equitable benefits where it does so naturally but at the same time intervening in cases where it falls short.
Kevin: I’d like to change gears a bit to chat more about your personal stories. Manuel, I know you’ve worked on research analyzing health outcomes with Stanford, diversity in TV stations, and now you’re studying nerds on the internet. Would love to learn about your journey to getting there.
Manuel: Sure! I was actually involved with “nerds on the internet” longer than my vita might suggest. Prior to my studies, I was using open source software, including Linux and Ubuntu, and programming was a hobby for me. I enjoyed the freedom that one had on the personal computer and the internet. During my studies, I discovered economics and business studies as fields of particular interest. Since I was interested in causal inference and welfare from a broader perspective, I learned how to use experimental and quasi-experimental studies to better understand social, medical, and technological innovations that are relevant for individuals, businesses, and policymakers. I focused on labor and health during my PhD, and afterwards I was able to lean a bit more into health at Stanford University. During my time at Harvard Business School, the pendulum swung back a bit toward labor. As such, I was in the fortunate position—thanks to studying the exciting field of open source software—to continuously better understand both spaces.
Kevin: Haha, great to hear your interest in open source runs deep! Sam, you also have quite the varied background, analyzing cleantech market conditions and the effects of employment verification policies, and you also seem to have been studying nerds on the internet for the past several years. Could you share a bit about your path?
Sam: I’ve been a computing and open source enthusiast since I got my hands on a copy of “OpenSUSE for Dummies” in middle school. As an undergraduate, I was drawn to the social science aspect of economics and its ability to explain or predict human behavior across a wide range of settings. After bouncing around a number of subfields in graduate school, I got the crazy idea to combine my field of study with my passion and never looked back. Open source is an incredibly data-rich environment with a wealth of research questions interesting to economists. I’ve studied the role of peer effects in driving contribution, modelled the formation of software dependency networks using strategic behavior and risk aversion, and explored how labor market competition shapes open source output.
And thanks, Kevin. I’ll be sure to work “nerds on the internet” into the title of the next paper.
Kevin: Finding a niche that you’re passionate about is such a joy, and I’m curious about how you’ve found living in that niche. What’s the day-to-day like for you both?
Manuel: The day-to-day can vary, but as an academic there are a few recurring tasks at a big-picture level: research, teaching, and other work. Let’s focus on the research bucket. I am quite busy working on causal inference papers, refining them, but also speaking to audiences to communicate our work. Some of the work is done jointly, some alone, so there is a lot of variation, and the great part of being an academic is that one can choose that variation oneself through the projects one selects. Over time one has to juggle many balls. Hence, I am working on finishing prior papers in the space of health (for example, some that you alluded to previously, on Television, Health and Happiness and Vaccination at Work), but also in the space of open source software, importantly, to improve the paper on Generative AI and the Nature of Work. We continuously have more ideas to better understand the world we are living in, and going to live in, at the intersection of open source software and generative AI. As such, it is very valuable to relate to the literature and eventually answer exciting questions around the future of work with real-world data. GitHub is a great resource for that.
Sam: As an applied researcher, I use data to answer questions. The questions can come from current events, from conversations with both friends and colleagues, or simply musing on the intricacies of the open source space. Oftentimes the data comes from publicly observed behavior of firms and individuals recorded on the internet. For example, I’ve used static code analysis and version control history to characterize the development of open source codebases over time. I’ve used job postings data to measure firm demand for open source skills. I’ve used the dependency graphs of packaging ecosystems to track the relationships between projects. I can then use my training in economic theory, econometric methodology, and causal inference to rigorously explore the question. The end result is written up in an article, presented to peers, and iteratively improved from feedback.
Kevin: Have things changed since generative AI tooling came along? Have you found generative AI tools to be helpful?
Manuel: Definitely. I use GitHub Copilot when developing experiments and programming in JavaScript together with another good colleague, Daniel Stephenson from Virginia Commonwealth University. It is interesting to observe how often Copilot makes code suggestions based on context that are correct. As such, it is an incredibly helpful tool. However, the big picture of what our needs are can only be determined by us. In my experience, Copilot does speed up the process and helps avoid some mistakes, conditional on not just blindly following the AI.
Sam: I’ve only recently begun using GitHub Copilot, but it’s had quite an impact on my workflow. Most social science researchers are not skilled software engineers, yet they still must write code to hit deadlines. Before generative AI, the delay between problem and solution was usually characterized by many search queries, parsing Q&A forums or documentation, and a significant time cost. Being able to resolve uncertainty within the IDE is incredible for productivity.
Kevin: Advice you might have for folks who are starting out in software engineering or research? What tips might you give to a younger version of yourself, say, from 10 years ago?
Manuel: I will talk a bit at a higher level. Find work, questions, or problems that you deeply care about. I would say that is a universal rule for being satisfied, be it in software engineering, research, or any other area. In a way, think about the motto “You only live once”; it is pretty applicable here. Another piece of universal advice with high relevance is to not worry too much about the things you cannot change, but focus on what you can control. Finally, think about getting advice from many different people, then pick and choose. People have different mindsets and ideas. As such, that can be quite helpful.
Sam:
Kevin: Learning resources you might recommend to someone interested in learning more about this space?
Manuel: There are several aspects we have touched upon—Generative AI Research, Coding with Copilot, Causal Inference and Machine Learning, Economics, and Business Studies. As such, here is one link for each topic that I can recommend:
I am sure that there are many other great resources out there, many that can also be found on GitHub.
Sam: If you’re interested in getting more into how economists and other social scientists think about open source, I highly recommend the following (reasonably entry-level) articles that have helped shape my research approach.
For those interested in learning more about the toolkit used by empirical social scientists:
(shameless plug) We’ve been putting together a collection of data sources for open source researchers. Contributions welcome!
Kevin: Thank you, Manuel and Sam! We really appreciate you taking the time to share about what you’re working on and your journeys into researching open source.
The post Inside the research: How GitHub Copilot impacts the nature of work for open source maintainers appeared first on The GitHub Blog.
Hey devs! We have some exciting news for ya.
So, backstory first: in case you missed it, OpenAI launched its o1-preview and o1-mini models back in September, optimized for advanced tasks like coding, science, and math. We brought them to our GitHub Copilot Chat and to GitHub Models so you could start playing with them right away, and y’all seemed to love it!
Fast forward to today, and OpenAI has just shipped the release version of o1, an update to o1-preview with improved performance on complex tasks, including a 44% gain on Codeforces’ competitive coding test.
Sounds pretty good, huh? Well, exciting news: we’re bringing o1 to you in Copilot Chat across Copilot Pro, Business, and Enterprise, as well as in GitHub Models. It’s available for you to start using right now!
In Copilot Chat in Visual Studio Code and on GitHub, you can now pick “o1 (Preview)” via the model picker if you have a paid Copilot subscription (this is not available on our newly announced free Copilot tier).
Before, it said “o1-preview (Preview)” and now it just says “o1 (Preview)” because the GitHub model picker is in preview, but not the model itself. We have proven once again that the hardest problem in computer science is naming things, but yo dawg, we no longer have previews on previews. We are a very professional business.
Anyway.
Now you can use Copilot with o1 to explain, debug, refactor, modernize, test… all with the latest and greatest that o1 has to offer!
o1 is included in your paid subscription to GitHub Copilot, allowing up to 10 messages every 12 hours. As a heads up, if you’re a Copilot Business or Copilot Enterprise subscriber, an administrator will have to enable access to o1 models first before you can use it.
Beyond the superpowers that o1 already gives you out of the box, it also pulls context from your workspace and GitHub repositories! It’s a match made in developer heaven.
If you haven’t played with GitHub Models yet, you’re in for a treat. It lets you start building AI apps with a playground and an API (and more). Learn more about GitHub Models in our documentation and start experimenting and building with a variety of AI models today on the GitHub Marketplace.
Head here to start with o1 in the playground, plus you can compare it to other models from providers such as Mistral, Cohere, Microsoft, or Meta, and use the code samples you get to start building. Every model is different, and this kind of experimentation is really helpful to get the results you want!
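If you want to jump straight from the playground to code, here’s a rough sketch of calling a GitHub Models endpoint with the OpenAI Python client. The endpoint URL, the `GITHUB_TOKEN` authentication, and the `o1` model identifier are assumptions on my part; double-check the GitHub Models documentation and the generated code samples for the exact values.

```python
import os

from openai import OpenAI

# Assumed setup: GitHub Models exposes an OpenAI-compatible API authenticated
# with a GitHub token (details may differ; see the docs).
client = OpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"],
)

response = client.chat.completions.create(
    model="o1",  # swap in another model name to compare providers
    messages=[{"role": "user", "content": "Explain what a regression discontinuity design is."}],
)
print(response.choices[0].message.content)
```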
Having choices fuels developer creativity! We don’t want you to just write better code, but for you to have the freedom to build, innovate, commit, and push in the way that works best for you.
I hope you’re as excited as I am. GitHub is doubling down on its commitment to give you the most advanced tools available with OpenAI’s new o1 model, and if you keep your eye out, there are even better things to come.
So, let’s build from here—together!
The post OpenAI’s latest o1 model now available in GitHub Copilot and GitHub Models appeared first on The GitHub Blog.
The need for software build security is more pressing than ever. High-profile software supply chain attacks like SolarWinds, MOVEit, 3CX, and Applied Materials have revealed just how vulnerable the software build process can be. As attackers exploit weaknesses in the build pipeline to inject their malicious components, traditional security measures—like scanning source code, securing production environments and source control allow lists—are no longer enough. To defend against these sophisticated threats, organizations must treat their build systems with the same level of care and security as their production environments.
These supply chain attacks are particularly dangerous because they can undermine trust in your business itself: if an attacker can infiltrate your build process, they can distribute compromised software to your customers, partners, and end-users. So, how can organizations secure their build processes, and ensure that what they ship is exactly what they intended to build?
The Supply-chain Levels for Software Artifacts (SLSA) framework was developed to address these needs. SLSA provides a comprehensive, step-by-step methodology for building integrity and provenance guarantees into your software supply chain. This might sound complicated, but the good news is that GitHub Artifact Attestations simplify the journey to SLSA Level 3!
In this post, we’ll break down what you need to know about SLSA, how Artifact Attestations work, and how they can boost your GitHub Actions build security to the next level.
When we build software, we convert source code into deployable artifacts—whether those are binaries, container images or packaged libraries. This transformation occurs through multiple stages, such as compilation, packaging and testing, each of which could potentially introduce vulnerabilities or malicious modifications.
A properly secured build process can:
By securing the build process, organizations can ensure that the software reaching end-users is the intended and unaltered version. This makes securing the build process just as important as securing the source code and deployment environments.
SLSA is a community-driven framework governed by the Open Source Security Foundation (OpenSSF), designed to help organizations systematically secure their software supply chains through a series of progressively stronger controls and best practices.
The framework is organized into four levels, each representing a higher degree of security maturity:
Provenance refers to the cryptographic record generated for each artifact, providing an unforgeable paper trail of its build history. This record allows you to trace artifacts back to their origins, allowing for verification of how, when and by whom the artifact was created.
Achieving SLSA Level 3 is a critical step in building a secure and trustworthy software supply chain. This level requires organizations to implement rigorous standards for provenance and isolation, ensuring that artifacts are produced in a controlled and verifiable manner. An organization that has achieved SLSA Level 3 is capable of significantly mitigating the most common attack vectors targeting software build pipelines. Here’s a breakdown of the specific requirements for reaching SLSA Level 3:
GitHub Artifact Attestations help simplify your journey to SLSA Level 3 by enabling secure, automated build verification within your GitHub Actions workflows. While generating build provenance records is foundational to SLSA Level 1, the key distinction at SLSA Level 3 is the separation of the signature process from the rest of your build job. At Level 3, the signing happens on dedicated infrastructure, separated from the build workflow itself.
While signing artifacts is a critical step, it becomes meaningless without verification. Simply having attestations does not provide any security advantages if they are not verified. Verification ensures that the signed artifacts are authentic and have not been tampered with.
The GitHub CLI makes this process easy, allowing you to verify signatures at any stage of your CI/CD pipeline. For example, you can verify Terraform plans before applying them, ensure that Ansible or Salt configurations are authentic before deployment, validate containers before they are deployed to Kubernetes, or use it as part of a GitOps workflow driven by tools like Flux.
GitHub offers several native ways to verify Artifact Attestations:
By verifying signatures during deployment, you can ensure that what you deploy to production is indeed what you built.
Reaching SLSA Level 3 may seem complex, but GitHub’s Artifact Attestations feature makes it remarkably straightforward. Generating build provenance puts you at SLSA Level 1, and by using GitHub Artifact Attestations on GitHub-hosted runners, you reach SLSA Level 2 by default. From this point, advancing to SLSA Level 3 is a straightforward journey!
The critical difference between SLSA Level 2 and Level 3 lies in using a reusable workflow for provenance generation. This allows you to centrally enforce build security across all projects and enables stronger verification, as you can confirm that a specific reusable workflow was used for signing. With just a few lines of YAML added to your workflow, you can gain build provenance without the burden of managing cryptographic key material or setting up additional infrastructure.
GitHub Artifact Attestations streamline the process of establishing provenance for your builds. By enabling provenance generation directly within GitHub Actions workflows, you ensure that each artifact includes a verifiable record of its build history. This level of transparency is crucial for SLSA Level 3 compliance.
Best of all, you don’t need to worry about the onerous process of handling cryptographic key material. GitHub manages all of the required infrastructure, from running a Sigstore instance to serving as a root signing certificate authority for you.
Check out our earlier blog to learn more about how to set up Artifact Attestations in your workflow.
GitHub Actions-hosted runners, executing workflows on ephemeral machines, ensure that each build process occurs in a clean and isolated environment. This model is fundamental for SLSA Level 3, which mandates secure and separate handling of key material used in signing.
When you create a reusable workflow for provenance generation, your organization can use it centrally across all projects. This establishes a consistent, trusted source for provenance records. Additionally, signing occurs on dedicated hardware that is separate from the build machine, ensuring that neither the source code nor the developer triggering the build system can influence or alter the build process. With this level of separation, your workflows inherently meet SLSA Level 3 requirements.
Below is an example of a reusable workflow that can be utilized across the organization to sign artifacts:
name: Sign Artifact

on:
  workflow_call:
    inputs:
      subject-name:
        required: true
        type: string
      subject-digest:
        required: true
        type: string

jobs:
  sign-artifact:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      attestations: write
      contents: read
    steps:
      - name: Attest Build Provenance
        uses: actions/attest-build-provenance@<version>
        with:
          subject-name: ${{ inputs.subject-name }}
          subject-digest: ${{ inputs.subject-digest }}
When you want to use this reusable workflow for signing in any other workflow, you can call it as follows:
name: Sign Artifact Workflow

on:
  push:
    branches:
      - main

jobs:
  sign:
    uses: <repository>/.github/workflows/sign-artifact.yml@<version>
    with:
      subject-name: "your-artifact.tar.gz" # Replace with actual artifact name
      subject-digest: "your-artifact-digest" # Replace with SHA-256 digest
This architecture of ephemeral environments and centralized provenance generation guarantees that signing operations are isolated from the build process itself, preventing unauthorized access to the signing process. Because signing occurs in a dedicated, controlled environment, the risk of compromising the signing workflow is greatly reduced: malicious actors cannot tamper with the signing action’s code or deviate from the intended process. Additionally, provenance is generated consistently across all builds, providing a unified record of build history for the entire organization.
To verify that an artifact was signed using this reusable workflow, you can use the GitHub CLI with the following command:
gh attestation verify <file-path> --signer-workflow <owner>/<repository>/.github/workflows/sign-artifact.yml
This verification process ensures that the artifact was built and signed using the anticipated pipeline, reinforcing the integrity of your software supply chain.
GitHub Artifact Attestations bring the assurance and structure of SLSA Level 3 to your builds without having to manage additional security infrastructure. By simply adding a few lines of YAML and moving the provenance generation into a reusable workflow, you’re well on your way to achieving SLSA Level 3 compliance with ease!
Ready to strengthen your build security and achieve SLSA Level 3?
Start using GitHub Artifact Attestations today or explore our documentation to learn more.
The post Enhance build security and reach SLSA Level 3 with GitHub Artifact Attestations appeared first on The GitHub Blog.
Annotated Logger is a Python package that lets you decorate functions and classes so that they log when they complete, and lets them request a customized logger object with additional fields pre-added. GitHub’s Vulnerability Management team created this tool to make it easier to find and filter logs in Splunk.
We have several Python projects that have grown in complexity over the years, and we use Splunk to ingest and search their logs. We have always sent our logs in as JSON, which makes it easy to add extra fields. However, there were a number of fields, like which Git branch was deployed, that we wanted to send with every message, plus fields, like the CVE name of the vulnerability being processed, that we wanted to add only for messages in a given function. Both are possible with the base Python logger, but it’s a lot of manual work: repeating the same thing over and over, or building and managing a dictionary of extra fields to include in every log message.
The Annotated Logger started out as a simple decorator in one of our repositories, but was extracted into a package in its own right as we started to use it in all of our projects. As we’ve continued to use it, its features have grown and been updated.
Now that I’ve gotten a bit of the backstory out of the way, here’s what it does, why you should use it, and how to configure it for your specific needs. At its simplest, you decorate a function with `@annotate_logs()` and it will “just work.” If you’d like to dive right in and poke around, the example folder contains examples that fully exercise the features.
@annotate_logs()
def foo():
    return True

>>> foo()
{"created": 1733176439.5067494, "levelname": "DEBUG", "name": "annotated_logger.8fcd85f5-d47f-4925-8d3f-935d45ceeefc", "message": "start", "action": "__main__:foo", "annotated": true}
{"created": 1733176439.506998, "levelname": "INFO", "name": "annotated_logger.8fcd85f5-d47f-4925-8d3f-935d45ceeefc", "message": "success", "action": "__main__:foo", "success": true, "run_time": "0.0", "annotated": true}
True
Here is a more complete example that makes use of a number of the features. Make sure to install the package first: `pip install annotated-logger`.
import os

from annotated_logger import AnnotatedLogger

al = AnnotatedLogger(
    name="annotated_logger.example",
    annotations={"branch": os.environ.get("BRANCH", "unknown-branch")}
)

annotate_logs = al.annotate_logs

@annotate_logs()
def split_username(annotated_logger, username):
    annotated_logger.annotate(username=username)
    annotated_logger.info("This is a very important message!", extra={"important": True})
    return list(username)
>>> split_username("crimsonknave")
{"created": 1733349907.7293086, "levelname": "DEBUG", "name": "annotated_logger.example.c499f318-e54b-4f54-9030-a83607fa8519", "message": "start", "action": "__main__:split_username", "branch": "unknown-branch", "annotated": true}
{"created": 1733349907.7296104, "levelname": "INFO", "name": "annotated_logger.example.c499f318-e54b-4f54-9030-a83607fa8519", "message": "This is a very important message!", "important": true, "action": "__main__:split_username", "branch": "unknown-branch", "username": "crimsonknave", "annotated": true}
{"created": 1733349907.729843, "levelname": "INFO", "name": "annotated_logger.example.c499f318-e54b-4f54-9030-a83607fa8519", "message": "success", "action": "__main__:split_username", "branch": "unknown-branch", "username": "crimsonknave", "success": true, "run_time": "0.0", "count": 12, "annotated": true}
['c', 'r', 'i', 'm', 's', 'o', 'n', 'k', 'n', 'a', 'v', 'e']
>>>
>>> split_username(1)
{"created": 1733349913.719831, "levelname": "DEBUG", "name": "annotated_logger.example.1c354f32-dc76-4a6a-8082-751106213cbd", "message": "start", "action": "__main__:split_username", "branch": "unknown-branch", "annotated": true}
{"created": 1733349913.719936, "levelname": "INFO", "name": "annotated_logger.example.1c354f32-dc76-4a6a-8082-751106213cbd", "message": "This is a very important message!", "important": true, "action": "__main__:split_username", "branch": "unknown-branch", "username": 1, "annotated": true}
{"created": 1733349913.7200255, "levelname": "ERROR", "name": "annotated_logger.example.1c354f32-dc76-4a6a-8082-751106213cbd", "message": "Uncaught Exception in logged function", "exc_info": "Traceback (most recent call last):\n File \"/home/crimsonknave/code/annotated-logger/annotated_logger/__init__.py\", line 758, in wrap_function\n result = wrapped(*new_args, **new_kwargs) # pyright: ignore[reportCallIssue]\n File \"<stdin>\", line 5, in split_username\nTypeError: 'int' object is not iterable", "action": "__main__:split_username", "branch": "unknown-branch", "username": 1, "success": false, "exception_title": "'int' object is not iterable", "annotated": true}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<makefun-gen-0>", line 2, in split_username
  File "/home/crimsonknave/code/annotated-logger/annotated_logger/__init__.py", line 758, in wrap_function
    result = wrapped(*new_args, **new_kwargs) # pyright: ignore[reportCallIssue]
  File "<stdin>", line 5, in split_username
TypeError: 'int' object is not iterable
There are a few things going on in this example. Let’s break it down piece by piece.
- We create an instance of the `AnnotatedLogger` class. This class contains all of the configuration for the loggers. (Note that the logger name must start with `annotated_logger`, or there will be nothing configured to log your messages.)
- The configuration adds a `branch` annotation that will be sent with all log messages.
- We decorate the function with `@al.annotate_logs()` and it requests an `annotated_logger`. This `annotated_logger` variable can be used just like a standard `logger` object but has some extra features.
- The `annotated_logger` argument is added by the decorator before calling the decorated method, so callers don’t supply the `annotated_logger` parameter (see how it’s called with just the name).
- We call the `annotate` method, which will add whatever kwargs we pass to the `extra` field of all log messages that use the logger.
- Everything else behaves just like a standard `logger`.
- The second call fails because it passed an `int` to the name field and `list` threw an exception; the decorator logged the exception and re-raised it.
Let’s break down each of the fields in the log message:
| Field | Source | Description |
|---|---|---|
| created | logging | Standard Logging field. |
| levelname | logging | Standard Logging field. |
| name | annotated_logger | Logger name (set via class instantiation). |
| message | logging | Standard Logging field for log content. |
| action | annotated_logger | Method name the logger was created for. |
| branch | AnnotatedLogger() | Set from the configuration’s branch annotation. |
| annotated | annotated_logger | Boolean indicating if the message was sent via Annotated Logger. |
| important | annotated_logger.info | Annotation set for a specific log message. |
| username | annotated_logger.annotate | Annotation set by user. |
| success | annotated_logger | Indicates if the method completed successfully (True/False). |
| run_time | annotated_logger | Duration of the method execution. |
| count | annotated_logger | Length of the return value (if applicable). |
The `success`, `run_time`, and `count` fields are added automatically to the message (“success”) that is logged after a decorated method completes without an exception being raised.
The Annotated Logger interacts with `logging` via two main classes: `AnnotatedAdapter` and `AnnotatedFilter`. `AnnotatedAdapter` is a subclass of `logging.LoggerAdapter` and is what all `annotated_logger` arguments are instances of. `AnnotatedFilter` is a subclass of `logging.Filter` and is where the annotations are actually injected into the log messages. As a user, outside of config and plugins, the only parts of the code you will interact with are the `AnnotatedAdapter` in methods and the decorator itself. Each instance of the `AnnotatedAdapter` class has an `AnnotatedFilter` instance—the `AnnotatedAdapter.annotate` method passes those annotations on to the filter, where they are stored. When a message is logged, that filter will calculate all the annotations it should have and then update the existing `LogRecord` object with those annotations.

Because each invocation of a method gets its own `AnnotatedAdapter` object, it also has its own `AnnotatedFilter` object. This ensures that there is no leaking of annotations from one method call to another.
The Annotated Logger is fully type hinted internally and fully supports type hinting of decorated methods, but a little bit of additional detail is required in the decorator invocation. The `annotate_logs` method takes a number of optional arguments. For type hinting, `_typing_self`, `_typing_requested`, `_typing_class`, and `provided` are relevant. The three arguments that start with `_typing` have no impact on the behavior of the decorator and are only used in method signature overrides for type hinting. Setting `provided` to `True` tells the decorator that the `annotated_logger` should not be created and will be provided by the caller (thus the signature shouldn’t be altered).

`_typing_self` defaults to `True` as that is how most of my code is written. `provided`, `_typing_class`, and `_typing_requested` default to `False`.
class Example:
    @annotate_logs(_typing_requested=True)
    def foo(self, annotated_logger):
        ...

e = Example()
e.foo()
There are a number of plugins that come packaged with the Annotated Logger. Plugins allow for the user to hook into two places: when an exception is caught by the decorator and when logging a message. You can create your own plugin by creating a class that defines the `filter` and `uncaught_exception` methods (or inherits from `annotated_logger.plugins.BasePlugin`, which provides noop methods for both).
The `filter` method of a plugin is called when a message is being logged. Plugins are called in the order they are set in the config. They are called by the AnnotatedFilter object of the AnnotatedAdapter and work like any `logging.Filter`. They take a record argument, which is a `logging.LogRecord` object. They can manipulate that record in any way they want and those modifications will persist. Additionally, just like any logging filter, they can stop a message from being logged by returning `False`.
The `uncaught_exception` method of a plugin is called when the decorator catches an exception in the decorated method. It takes two arguments, `exception` and `logger`. The `logger` argument is the `annotated_logger` for the decorated method. This allows the plugin to annotate the log message stating that there was an uncaught exception, which is about to be logged once the plugins have all processed their `uncaught_exception` methods.
Here is an example of a simple plugin. The plugin inherits from the `BasePlugin`, which isn’t strictly needed here since it implements both `filter` and `uncaught_exception`, but if it didn’t, inheriting from the `BasePlugin` means that it would fall back to the default noop methods. The plugin has an init so that it can take and store arguments. The `filter` and `uncaught_exception` methods will end up with the same result: `flagged=True` being set if a word matches. But they do it slightly differently: `filter` is called while a given log message is being processed, so the annotation it adds goes directly onto that record, while `uncaught_exception` is called if an exception is raised and not caught during the execution of the decorated method, so it doesn’t have a specific log record to interact with and instead sets an annotation on the logger. The only difference in outcome would be if another plugin emitted a log message during its `uncaught_exception` method after `FlagWordPlugin`; in that case, the additional log message would also have `flagged=True` on it.
from annotated_logger.plugins import BasePlugin

class FlagWordPlugin(BasePlugin):
    """Plugin that flags any log message/exception that contains a word in a list."""

    def __init__(self, *wordlist):
        """Save the wordlist."""
        self.wordlist = wordlist

    def filter(self, record):
        """Add annotation if the message contains words in the wordlist."""
        for word in self.wordlist:
            if word in record.msg:
                record.flagged = True

    def uncaught_exception(self, exception, logger):
        """Add annotation if exception title contains words in the wordlist."""
        for word in self.wordlist:
            if word in str(exception):
                logger.annotate(flagged=True)

AnnotatedLogger(plugins=[FlagWordPlugin("danger", "Will Robinson")])
Plugins are stored in a list and the order they are added can matter. The `BasePlugin` is always the first plugin in the list; any that are set in configuration are added after it.
When a log message is being sent, the `filter` methods of each plugin will be called in the order they appear in the list. Because the `filter` methods often modify the record directly, one filter can break another if, for example, one filter removed or renamed a field that another filter used. Conversely, one filter could expect another to have added or altered a field before its run and would fail if it was ahead of the other filter. Finally, just like in the `logging` module, the `filter` method can stop a log from being emitted by returning False. As soon as a filter does so, the processing ends and any plugins later in the list will not have their `filter` methods called.
If the decorated method raises an exception that is not caught, then the plugins will again execute in order. The most common interaction is plugins attempting to set/modify the same annotation. The `BasePlugin` and `RequestsPlugin` both set the `exception_title` annotation. Since the `BasePlugin` is always first, the title it sets will be overridden. Other interactions would be one plugin setting an annotation before or after another plugin that emits a log message or sends data to a third party. In both of those cases the order will impact if the annotation is present or not.
Plugins that come with the Annotated Logger:
- `GitHubActionsPlugin`—Set a level of log messages to also be emitted in actions notation (`notice::`).
- `NameAdjusterPlugin`—Add a pre/postfix to a name to avoid collisions in any log processing software (`source` is a field in Splunk, but we often include it as a field and it’s just hidden).
- `RemoverPlugin`—Remove a field. Exclude `password`/`key` fields and set an object’s attributes to the log if you want, or ignore fields like `taskName` that are set when running async, but not sync.
- `NestedRemoverPlugin`—Remove a field no matter how deep in a dictionary it is.
- `RenamerPlugin`—Rename one field to another (don’t like `levelname` and want `level`, this is how you do that).
- `RequestsPlugin`—Adds a title and status code to the annotations if the exception inherits from `requests.exceptions.HTTPError`.
- `RuntimeAnnotationsPlugin`—Sets dynamic annotations.

When adding the Annotated Logger to an existing project, or one that uses other packages that log messages (flask, django, and so on), you can configure all of the Annotated Logger via `dictConfig` by supplying a `dictConfig`-compliant dictionary as the `config` argument when initializing the Annotated Logger class. If, instead, you wish to do this yourself, you can pass `config=False` and reference `annotated_logger.DEFAULT_LOGGING_CONFIG` to obtain the config that is used when none is provided and alter/extract as needed.
There is one special case where the Annotated Logger will modify the config passed to it: if there is a filter named `annotated_filter`, that entry will be replaced with a reference to a filter that is created by the instance of the Annotated Logger that’s being created. This allows any annotations or other options set to be applied to messages that use that filter. You can instead create a filter that uses the AnnotatedFilter class, but it won’t have any of the config the rest of your logs have.
`dictConfig` partly works when merging dictionaries. I have found that some parts of the config are not overwritten, but other parts seem to lose their references. So, I would encourage you to build up a logging config for everything and call it once only. If you pass `config`, the Annotated Logger will call `logging.config.dictConfig` on your config after it has the option to add/adjust the config.
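As a rough sketch of the do-it-yourself route (the shape of the config dictionary and the `"my_app"` logger name are assumptions for illustration; see the `logging_config.py` example mentioned below for the real thing):

```python
import logging.config

import annotated_logger
from annotated_logger import AnnotatedLogger

# Start from the config the package would otherwise apply on its own and merge
# in your own loggers/handlers (for flask, django, etc.). The "my_app" logger
# added here is just a placeholder.
combined_config = annotated_logger.DEFAULT_LOGGING_CONFIG
combined_config.setdefault("loggers", {})["my_app"] = {"level": "INFO"}

# config=False tells the Annotated Logger not to call dictConfig itself...
al = AnnotatedLogger(name="annotated_logger.my_app", config=False)

# ...so you apply the single combined config exactly once, yourself.
logging.config.dictConfig(combined_config)
```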
The `logging_config.py` example has a much more detailed breakdown and set of examples.
Included with the package is a pytest mock to assist in testing for logged messages. I know that there are some strong opinions about testing log messages, and I don’t suggest doing it extensively, or frequently, but sometimes it’s the easiest way to check a loop, or the log message is tied to an alert and it is important how it’s formatted. In these cases, you can ask for the `annotated_logger_mock` fixture, which will intercept, record, and forward all log messages.
def test_logs(annotated_logger_mock):
    with pytest.raises(KeyError):
        complicated_method()
    annotated_logger_mock.assert_logged(
        "ERROR",  # Log level
        "That's not the right key",  # Log message
        present={"success": False, "key": "bad-key"},  # annotations and their values that are required
        absent=["fake-annotations"],  # annotations that are forbidden
        count=1  # Number of times log messages should match
    )
The `assert_logged` method makes use of pychoir for flexible matching. None of the parameters are required, so feel free to use whichever makes sense. Below is a breakdown of the default and valid values for each parameter.
| Parameter | Default Value | Valid Values | Description |
|---|---|---|---|
| level | Matches anything | String or string-based matcher | Log level to check (e.g., “ERROR”). |
| message | Matches anything | String or string-based matcher | Log message to check. |
| present | Empty dictionary | Dictionary with string keys and any value | Annotations required in the log. |
| absent | Empty set | `ALL`, set, or list of strings | Annotations that must not be present in the log. |
| count | All positive integers | Integer or integer-based matcher | Number of times the log message should match. |
The `present` key is often what makes the mock truly useful. It allows you to require the things you care about and ignore the things you don’t care about. For example, nobody wants their tests to fail because the `run_time` of a method went from `0.0` to `0.1`, or fail because the hostname is different on different test machines. But both of those are useful things to have in the logs. This mock should replace everything you use the `caplog` fixture for, and more.
Classes can be decorated with `@annotate_logs` as well. These classes will have an `annotated_logger` attribute added after the init (I was unable to get it to work inside the `__init__`). Any decorated methods of that class will have an `annotated_logger` that’s based on the class logger. Calls to `annotate` that pass `persist=True` will set the annotations on the class Annotated Logger, and so subsequent calls of any decorated method of that instance will have those annotations. The class instance’s `annotated_logger` will also have an annotation of `class` specifying which class the logs are coming from.
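Here’s a minimal sketch of what that looks like, reusing the `annotate_logs` decorator configured earlier (the class, method, and annotation names are made up for illustration):

```python
@annotate_logs()
class RepoSyncer:
    @annotate_logs()
    def start(self, annotated_logger, job_id):
        # persist=True stores the annotation on the instance's annotated_logger,
        # so later decorated calls on this instance include it automatically.
        annotated_logger.annotate(job_id=job_id, persist=True)
        annotated_logger.info("sync started")

    @annotate_logs()
    def finish(self, annotated_logger):
        # This message also carries job_id (plus the automatic `class` annotation).
        annotated_logger.info("sync finished")
```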
The Annotated Logger also supports logging iterations of an `enumerable` object. `annotated_logger.iterator` will log the start, each step of the iteration, and when the iteration is complete. This can be useful for pagination in an API if your results object is enumerable, logging each time a page is fetched instead of sitting for a long time with no indication if the pages are hanging or there are simply many pages.

By default the `iterator` method will log the value of each iteration, but this can be disabled by setting `value=False`. You can also specify the level to log the iterations at if you don’t want the default of `info`.
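A sketch of the pagination use case follows; the exact positional arguments that `iterator` accepts are an assumption here (only `value=False` and the level option come from the description above), and `pages` is a stand-in for whatever enumerable your API client returns:

```python
@annotate_logs()
def fetch_all_pages(annotated_logger, pages):
    results = []
    # Logs the start of the iteration, one message per page, and a completion
    # message; value=False skips logging each page's contents.
    for page in annotated_logger.iterator("pages", pages, value=False):
        results.extend(page)
    return results
```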
Because each decorated method gets its own `annotated_logger`, calls to other methods will not have any annotations from the caller. Instead of simply passing the `annotated_logger` object to the method being called, you can specify `provided=True` in the decorator invocation. This does two things: first, it means that this method won’t have an `annotated_logger` created and passed automatically; instead, it requires that the first argument be an existing `annotated_logger`, which it will use as a basis for the `annotated_logger` object it creates for the function. Second, it adds a `subaction` annotation with the decorated function’s name as its value, while the `action` annotation is preserved from the method that called and provided the `annotated_logger`. Annotations are not persisted from a method decorated with `provided=True` back to the method that called it, unless the class of the calling method was decorated and the called method annotated with `persist=True`, in which case the annotation is set on the instance’s `annotated_logger` and shared with all methods, as is normal for decorated classes.
The most common use of this is with private methods, especially ones created during a refactor to extract some self contained logic. But other uses are for common methods that are called from a number of different places.
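A sketch of that pattern with a private helper (the function names here are invented, and the decorator usage assumes the `annotate_logs` set up earlier):

```python
@annotate_logs()
def sync_repository(annotated_logger, repo_name):
    annotated_logger.annotate(repo=repo_name)
    return _archive_releases(annotated_logger, repo_name)

# provided=True: no new annotated_logger is created; the caller passes its own
# in as the first argument, and these messages gain a `subaction` annotation
# while keeping the caller's `action`.
@annotate_logs(provided=True)
def _archive_releases(annotated_logger, repo_name):
    annotated_logger.info("archiving releases")
    return True
```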
Long messages wreak havoc on log parsing tools. I’ve encountered cases where the HTML of a 500 error page was too long for Splunk to parse, causing the entire log entry to be discarded and its annotations to go unprocessed. Setting `max_length` when configuring the Annotated Logger will break long messages into multiple log messages, each annotated with `split=True`, `split_complete=False`, `message_parts=#`, and `message_part=#`. The last part of the long message will have `split_complete=True` when it is logged.
Only messages can be split like this; annotations will not trigger the splitting. However, a plugin could truncate any values with a length over a certain size.
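Configuration is a single setting; this sketch assumes `max_length` is passed as a keyword to `AnnotatedLogger`, and the 2,000-character threshold is an arbitrary example:

```python
al = AnnotatedLogger(
    name="annotated_logger.example",
    max_length=2000,  # messages longer than this are split and annotated as described above
)
```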
You can register hooks that are executed before and after the decorated method is called. The `pre_call` and `post_call` parameters of the decorator take a reference to a function, which will be called right before or right after the decorated method, passing in the same arguments that the decorated function will be/was called with. This allows the hooks to add annotations and/or log anything that is desired (assuming the decorated function requested an `annotated_logger`).

Examples of this would be having a set of annotations that annotate fields on a model and a `pre_call` that sets them in a standard way, or a `post_call` that logs if the function left a model in an unsaved state.
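A sketch of the unsaved-model example (the `Order` class and the hook name are invented for illustration; the decorator is the `annotate_logs` configured earlier):

```python
class Order:
    def __init__(self):
        self.total = 0
        self.saved = False

def warn_if_unsaved(annotated_logger, order):
    # post_call hook: invoked after update_order, with the same arguments it received.
    if not order.saved:
        annotated_logger.annotate(left_unsaved=True)
        annotated_logger.warning("Order was modified but not saved")

@annotate_logs(post_call=warn_if_unsaved)
def update_order(annotated_logger, order):
    order.total = 42
    return order
```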
Most annotations are static, but sometimes you need something that’s dynamic. These are achieved via the `RuntimeAnnotationsPlugin` in the Annotated Logger config. The `RuntimeAnnotationsPlugin` takes a dict of names and references to functions. These functions will be called and passed the log record when the plugin’s filter method is invoked, just before the log message is emitted. Whatever is returned by the function will be set as the value of the annotation on the log message currently being logged.
A common use case is to annotate a request/correlation id, which identifies all of the log messages that were part of a given API request. For Django, one way to do this is via django-guid.
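A sketch of wiring up such a dynamic annotation; the import path for `RuntimeAnnotationsPlugin` and the `correlation_id` attribute on the record are assumptions here:

```python
from annotated_logger import AnnotatedLogger
from annotated_logger.plugins import RuntimeAnnotationsPlugin  # assumed location

def request_id(record):
    # Called with the log record just before it is emitted; the return value
    # becomes the value of the `request_id` annotation on that message.
    return getattr(record, "correlation_id", "unknown")

al = AnnotatedLogger(
    name="annotated_logger.example",
    plugins=[RuntimeAnnotationsPlugin({"request_id": request_id})],
)
```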
Finally, a few tips from our experience using the Annotated Logger:

- Configure it in a single module, such as a `log.py`. That allows you to `from project.log import annotate_logs` everywhere you want to use it, and you know it’s all configured and everything will be using the same setup.
- If you configure logging via `dictConfig`, you will want to have a single config that has everything for all Annotated Loggers.
- A plugin can send an `exception` log message to a service like Sentry (see the `RequestsPlugin`).
- Dynamic annotations, like request IDs, can be added at log time (via the `RuntimeAnnotationsPlugin`).

We’d love to hear any questions, comments or requests you might have in an issue. Pull requests welcome as well!
The post Introducing Annotated Logger: A Python package to aid in adding metadata to logs appeared first on The GitHub Blog.
GitHub has a long history of offering free products and services to developers. Starting with free open source and public collaboration, we added free private repos, free minutes for GitHub Actions and GitHub Codespaces, and free package and release storage. Today, we are adding GitHub Copilot to the mix by launching GitHub Copilot Free.
Copilot is now automatically integrated into VS Code, and all of you have access to 2,000 code completions and 50 chat messages per month, simply by signing in with your personal GitHub account, or by creating a new one. And just last week, we passed the mark of 150M developers on GitHub.
Copilot Free gives you the choice between Anthropic’s Claude 3.5 Sonnet or OpenAI’s GPT-4o model. You can ask a coding question, explain existing code, or have it find a bug. You can execute edits across multiple files. And you can access Copilot’s third-party agents or build your own extension.
Did you know that Copilot Chat is now directly available from the GitHub dashboard and it works with Copilot Free, so you can start using it today? We couldn’t be more excited to make Copilot available to the 150M developers on GitHub.
Happy coding!
P.S. Students, educators, and open source maintainers: your free access to unlimited Copilot Pro accounts continues, unaffected!
The post Announcing 150M developers and a new free tier for GitHub Copilot in VS Code appeared first on The GitHub Blog.
In this blog post, I’ll show the results of my recent security research on GStreamer, the open source multimedia framework at the core of GNOME’s multimedia functionality.
I’ll also go through the approach I used to find some of the most elusive vulnerabilities, generating a custom input corpus from scratch to enhance fuzzing results.
GStreamer is an open source multimedia framework that provides extensive capabilities, including audio and video decoding, subtitle parsing, and media streaming, among others. It also supports a broad range of codecs, such as MP4, MKV, OGG, and AVI.
GStreamer is distributed by default on any Linux distribution that uses GNOME as the desktop environment, including Ubuntu, Fedora, and openSUSE. It provides multimedia support for key applications like Nautilus (Ubuntu’s default file browser), GNOME Videos, and Rhythmbox. It’s also used by tracker-miners, Ubuntu’s metadata indexer, an application that my colleague, Kev, was able to exploit last year.
This makes GStreamer a very interesting target from a security perspective, as critical vulnerabilities in the library can open numerous attack vectors. That’s why I picked it as a target for my security research.
It’s worth noting that GStreamer is a large library that includes more than 300 different sub-modules. For this research, I decided to focus on only the “Base” and “Good” plugins, which are included by default in the Ubuntu distribution.
During my research I found a total of 29 new vulnerabilities in GStreamer, most of them in the MKV and MP4 formats.
Below you can find a summary of the vulnerabilities I discovered:
GHSL | CVE | DESCRIPTION |
---|---|---|
GHSL-2024-094 | CVE-2024-47537 | OOB-write in isomp4/qtdemux.c |
GHSL-2024-115 | CVE-2024-47538 | Stack-buffer overflow in vorbis_handle_identification_packet |
GHSL-2024-116 | CVE-2024-47607 | Stack-buffer overflow in gst_opus_dec_parse_header |
GHSL-2024-117 | CVE-2024-47615 | OOB-Write in gst_parse_vorbis_setup_packet |
GHSL-2024-118 | CVE-2024-47613 | OOB-Write in gst_gdk_pixbuf_dec_flush |
GHSL-2024-166 | CVE-2024-47606 | Memcpy parameter overlap in qtdemux_parse_theora_extension leading to OOB-write |
GHSL-2024-195 | CVE-2024-47539 | OOB-write in convert_to_s334_1a |
GHSL-2024-197 | CVE-2024-47540 | Uninitialized variable in gst_matroska_demux_add_wvpk_header leading to function pointer overwriting |
GHSL-2024-228 | CVE-2024-47541 | OOB-write in subparse/gstssaparse.c |
GHSL-2024-235 | CVE-2024-47542 | Null pointer dereference in id3v2_read_synch_uint |
GHSL-2024-236 | CVE-2024-47543 | OOB-read in qtdemux_parse_container |
GHSL-2024-238 | CVE-2024-47544 | Null pointer dereference in qtdemux_parse_sbgp |
GHSL-2024-242 | CVE-2024-47545 | Integer underflow in FOURCC_strf parsing leading to OOB-read |
GHSL-2024-243 | CVE-2024-47546 | Integer underflow in extract_cc_from_data leading to OOB-read |
GHSL-2024-244 | CVE-2024-47596 | OOB-read in FOURCC_SMI_ parsing |
GHSL-2024-245 | CVE-2024-47597 | OOB-read in qtdemux_parse_samples |
GHSL-2024-246 | CVE-2024-47598 | OOB-read in qtdemux_merge_sample_table |
GHSL-2024-247 | CVE-2024-47599 | Null pointer dereference in gst_jpeg_dec_negotiate |
GHSL-2024-248 | CVE-2024-47600 | OOB-read in format_channel_mask |
GHSL-2024-249 | CVE-2024-47601 | Null pointer dereference in gst_matroska_demux_parse_blockgroup_or_simpleblock |
GHSL-2024-250 | CVE-2024-47602 | Null pointer dereference in gst_matroska_demux_add_wvpk_header |
GHSL-2024-251 | CVE-2024-47603 | Null pointer dereference in gst_matroska_demux_update_tracks |
GHSL-2024-258 | CVE-2024-47778 | OOB-read in gst_wavparse_adtl_chunk |
GHSL-2024-259 | CVE-2024-47777 | OOB-read in gst_wavparse_smpl_chunk |
GHSL-2024-260 | CVE-2024-47776 | OOB-read in gst_wavparse_cue_chunk |
GHSL-2024-261 | CVE-2024-47775 | OOB-read in parse_ds64 |
GHSL-2024-262 | CVE-2024-47774 | OOB-read in gst_avi_subtitle_parse_gab2_chunk |
GHSL-2024-263 | CVE-2024-47835 | Null pointer dereference in parse_lrc |
GHSL-2024-280 | CVE-2024-47834 | Use-After-Free read in Matroska CodecPrivate |
Nowadays, coverage-guided fuzzers have become the “de facto” tools for finding vulnerabilities in C/C++ projects. Their ability to discover rare execution paths, combined with their ease of use, has made them the preferred choice among security researchers.
The most common approach is to start with an initial input corpus, which is then successively mutated by the different mutators. The standard method to create this initial input corpus is to gather a large collection of sample files that provide a good representative coverage of the format you want to fuzz.
But with multimedia files, this approach has a major drawback: media files are typically very large (often in the range of megabytes or gigabytes). So, using such large files as the initial input corpus greatly slows down the fuzzing process, as the fuzzer usually goes over every byte of the file.
There are various minimization approaches that try to reduce file size, but they tend to be quite simplistic and often yield poor results. And, in the case of complex file formats, they can even break the file’s logic.
It’s for this reason that for my GStreamer fuzzing journey, I opted for “generating” an initial input corpus from scratch.
An alternative to gathering files is to create an input corpus from scratch. Or in other words, without using any preexisting files as examples.
To do this, we need a way to transform the target file format into a program that generates files compliant with that format. Two possible solutions arise:
Of course, the second solution is more time-consuming, as we need not only to understand the file format structure but also to analyze how the target software works.
But at the same time, it solves two problems in one shot:
This is the method I opted for and it allowed me to find some of the most interesting vulnerabilities in the MP4 and MKV parsers–vulnerabilities that until then, had not been detected by the fuzzer.
In this section, I will explain how I created an input corpus generator for the MP4 format. I used the same approach for fuzzing the MKV format as well.
To start, I will show a brief description of the MP4 format.
MP4, officially known as MPEG-4 Part 14, is one of the most widely used multimedia container formats today, due to its broad compatibility and widespread support across various platforms and devices. It supports packaging of multiple media types such as video, audio, images, and complex metadata.
MP4 is basically an evolution of Apple’s QuickTime media format, which was standardized by ISO as MPEG-4. The .mp4 container format is specified by the “MPEG-4 Part 14: MP4 file format” section.
MP4 files are structured as a series of “boxes” (or “atoms”), each containing specific multimedia data needed to construct the media. Each box has a designated type that describes its purpose.
These boxes can also contain other nested boxes, creating a modular and hierarchical structure that simplifies parsing and manipulation.
Each box/atom includes the following fields:
Some boxes may also include:
An MP4 file is typically structured in the following way:
Once we understand how an MP4 file is structured, we might ask ourselves, “Why are fuzzers not able to successfully mutate an MP4 file?”
To answer this question, we need to take a look at how coverage-guided fuzzers mutate input files. Let’s take AFL–one of the most widely used fuzzers out there–as an example. AFL’s default mutators can be summarized as follows:
The main problem lies in the latter category of mutators. As soon as the fuzzer modifies the data within an MP4 box, the size field of the box must also be updated to reflect the new size. Furthermore, if the size of a box changes, the size fields of all its parent boxes must also be recalculated and updated accordingly.
Implementing this functionality as a simple mutator can be quite complex, as it requires the fuzzer to track and update the implicit structure of the MP4 file.
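To make that cascade concrete, here is a small Python sketch of my own (not taken from the GStreamer research) that serializes nested boxes using the standard layout of a 4-byte big-endian size field, which covers the header itself, followed by a 4-byte type. Growing a child’s payload by a single byte forces every ancestor’s size field to change as well:

import struct

def serialize_box(fourcc: bytes, payload: bytes = b"", children=()) -> bytes:
    # An MP4 box is: 4-byte big-endian size (header included), 4-byte type, body.
    body = payload + b"".join(serialize_box(*child) for child in children)
    return struct.pack(">I", 8 + len(body)) + fourcc + body

# A tiny moov/trak hierarchy: enlarging the innermost payload by one byte
# changes the size fields of both 'trak' and its parent 'moov'.
original = serialize_box(b"moov", children=[(b"trak", b"\x00" * 8, ())])
mutated = serialize_box(b"moov", children=[(b"trak", b"\x00" * 9, ())])
print(len(original), len(mutated))  # 24 25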
The algorithm I used for implementing my generator follows these steps:
Structurally, an MP4 file can be visualized as a tree-like structure, where each node corresponds to an MP4 box. Thus, the first step in our generator implementation involves creating a set of unlabelled trees.
In this phase, we create trees with empty nodes that do not yet have a tag assigned. Each node represents a potential MP4 box. To make sure we have a variety of input samples, we generate trees with various structures and different node counts.
In the following code snippet, we see the constructor of the RandomTree class, which generates a random tree structure with a specified total number of nodes (total_nodes):
RandomTree::RandomTree(uint32_t total_nodes){
    uint32_t curr_level = 0;

    //Root node
    new_node(-1, curr_level);
    curr_level++;

    uint32_t rem_nodes = total_nodes - 1;
    uint32_t current_node = 0;

    while(rem_nodes > 0){
        uint32_t num_children = rand_uint32(1, rem_nodes);
        uint32_t min_value = this->levels[curr_level-1].front();
        uint32_t max_value = this->levels[curr_level-1].back();

        for(int i=0; i<num_children; i++){
            uint32_t parent_id = rand_uint32(min_value, max_value);
            new_node(parent_id, curr_level);
        }

        curr_level++;
        rem_nodes -= num_children;
    }
}
This code traverses the tree level by level (Level Order Traversal), adding a random number (rand_uint32) of children nodes (num_children). This approach of assigning a random number of child nodes to each parent node will generate highly diverse tree structures.
After all children are added for the current level, curr_level is incremented to move to the next level.
Once rem_nodes is 0, the RandomTree generation is complete, and we move on to generate another new RandomTree.
Once we have a set of unlabelled trees, we proceed to assign random tags to each node.
These tags correspond to the four-character codes (FOURCCs) used to identify the types of MP4 boxes, such as moov, trak, or mdat.
In the following code snippet, we see two different arrays of fourcc_info structs: FOURCC_LIST, which represents the leaf nodes of the tree, and CONTAINER_LIST, which represents the rest of the nodes.
The fourcc_info struct includes the following fields:
const fourcc_info CONTAINER_LIST[] = {
{FOURCC_moov, "movie", 0,},
{FOURCC_vttc, "VTTCueBox 14496-30", 0},
{FOURCC_clip, "clipping", 0,},
{FOURCC_trak, "track", 0,},
{FOURCC_udta, "user data", 0,},
…
const fourcc_info FOURCC_LIST[] = {
{FOURCC_crgn, "clipping region", 0,},
{FOURCC_kmat, "compressed matte", 0,},
{FOURCC_elst, "edit list", 0,},
{FOURCC_load, "track load settings", 0,},
Then, the MP4_labeler constructor takes a RandomTree instance as input, iterates through its nodes, and assigns a label to each node based on whether it is a leaf (no children) or a container (has children):
…
MP4_labeler::MP4_labeler(RandomTree *in_tree) {
    …
    for(int i=1; i < this->tree->size(); i++){
        Node &node = this->tree->get_node(i);
        …
        if(node.children().size() == 0){
            //LEAF
            uint32_t random = rand_uint32(0, FOURCC_LIST_SIZE-1);
            fourcc = FOURCC_LIST[random].fourcc;
            …
        }else{
            //CONTAINER
            uint32_t random = rand_uint32(0, CONTAINER_LIST_SIZE-1);
            fourcc = CONTAINER_LIST[random].fourcc;
            …
        }
        …
        node.set_label(label);
    }
}
After this stage, all nodes will have an assigned tag:
The next step is to add a random-size data field to each node. This data simulates the content within each MP4 box.
In the following code, we first set the minimum size (min_size) of the padding to the value specified in the selected fourcc_info from FOURCC_LIST. Then, we append padding null bytes (\x00) to the label:
if(node.children().size() == 0){
    //LEAF
    …
    padding = FOURCC_LIST[random].min_size;
    random_data = rand_uint32(4, 16);
}else{
    //CONTAINER
    …
    padding = CONTAINER_LIST[random].min_size;
    random_data = 0;
}
…
std::string label = uint32_to_string(fourcc);
label += std::string(padding, '\x00');
label += std::string(random_data, '\x41');
By varying the data sizes, we make sure the fuzzer has sufficient space to inject data into the box data sections, without needing to modify the input file size.
Finally, we calculate the size of each box and recursively update the tree accordingly.
The traverse method recursively traverses the tree structure, serializing the node data and calculating the resulting box size (size). Then, it propagates size updates up the tree (traverse(child)) so that parent boxes include the sizes of their child boxes:
std::string MP4_labeler::traverse(Node &node){
    …
    for(int i=0; i < node.children().size(); i++){
        Node &child = tree->get_node(node.children()[i]);
        output += traverse(child);
    }
    uint32_t size;
    if(node.get_id() == 0){
        size = 20;
    }else{
        size = node.get_label().size() + output.size() + 4;
    }
    std::string label = node.get_label();
    uint32_t label_size = label.size();
    output = uint32_to_string_BE(size) + label + output;
    …
}
The number of generated input files can vary depending on the time and resources you can dedicate to fuzzing. In my case, I generated an input corpus of approximately 4 million files.
You can find my C++ code example here.
A big thank you to the GStreamer developer team for their collaboration and responsiveness, and especially to Sebastian Dröge for his quick and effective bug fixes.
I would also like to thank my colleague, Jonathan Evans, for managing the CVE assignment process.
The post Uncovering GStreamer secrets appeared first on The GitHub Blog.
In November, we experienced one incident that resulted in degraded performance across GitHub services.
November 19 10:56 UTC (lasting 1 hour and 7 minutes)
On November 19, 2024, between 10:56:00 UTC and 12:03:00 UTC, the notifications service was degraded and stopped sending notifications to all our dotcom customers. On average, notification delivery was delayed by about one hour. This was due to a database host coming out of a regular maintenance process in read-only mode.
We mitigated the incident by making the host writable again. Following this, notification delivery recovered, and any delivery jobs that had failed during the incident were successfully retried. All notifications were fully recovered and available to users by 12:36 UTC.
To prevent similar incidents moving forward, we are working to improve our observability across database clusters to reduce our time to detection and improve our system resilience on startup.
Please follow our status page for real-time updates on status changes and post-incident recaps. To learn more about what we’re working on, check out the GitHub Engineering Blog.
The post GitHub Availability Report: November 2024 appeared first on The GitHub Blog.
Large language models (LLMs), such as those used by GitHub Copilot, do not operate directly on bytes but on tokens, which can complicate scaling. In this post, we explain how we solved that challenge at GitHub to support the growing number of Copilot users and features since first launching Copilot two years ago.
Tokenization is the process of turning bytes into tokens. The byte-pair encoding (BPE) algorithm is such a tokenizer, used (for example) by the OpenAI models we use at GitHub. The fastest of these algorithms not only have at least O(n log(n)) complexity, they are also not incremental, and are therefore badly suited for our use cases, which go beyond encoding an upfront-known input. This limitation resulted in a number of scaling challenges that led us to create a novel algorithm to address them. Our algorithm not only scales linearly, but also easily out-performs popular libraries for all inputs.
Read on to find out more about why and how we created the open source bpe algorithm that substantially improves state of the art BPE implementations in order to address our broader set of use cases.
Retrieval augmented generation (RAG) is essential for GitHub Copilot’s capabilities. RAG is used to improve model output by augmenting the user’s prompt with relevant snippets of text or code. A typical RAG approach works as follows:
Tokenization is important for both of these steps. Most code files exceed the number of tokens that can be encoded into a single embedding, so we need to split the files into chunks that are within the token limit. When building a prompt there are also limits on the number of tokens that can be used. The amount of tokens can also impact response time and cost. Therefore, it is common to have some kind of budgeting strategy, which requires being able to track the number of tokens that components of the prompt contribute. In both of these cases, we are dynamically constructing a text and constantly need the updated token count during that process to decide how to proceed. However, most tokenizers provide only a single operation: encode a complete input text into tokens.
When scaling to millions of repositories and billions of embeddings, the efficiency of token calculation really starts to matter. Additionally, we need to consider the worst-case performance of tokenization for the stability of our production systems. A system that processes untrusted user input in the form of billions of source code files cannot allow data that, intentionally or not, causes pathological run times and threatens availability. (See this discussion of potential denial-of-service issues in the context of OpenAI’s tiktoken tokenizer.)
Some of the features that can help address these needs are:
Implementing these operations using current tokenization algorithms would result in at least quadratic runtime, when we would like the runtime to be linear.
We were able to make substantial improvements to the state of the art BPE implementations in order to address the broader set of use cases that we have. Not only were we able to support more features, but we do so with much better performance and scalability than the existing libraries provide.
Our implementation is open source with an MIT license and can be found at https://github.com/github/rust-gems. The Rust crates are also published to crates.io as bpe and bpe-openai. The former contains the BPE implementation itself. The latter exposes convenience tokenizers (including pre-tokenization) for recent OpenAI token models.
Read on for benchmark results and an introduction to the algorithm itself.
We compare performance with two benchmarks. Both compare tokenization on randomly generated inputs of different sizes.
We used OpenAI’s o200k_base token model and compared our implementation with tiktoken-rs, a wrapper for OpenAI’s tiktoken library, and Huggingface’s tokenizers. All benchmarks were run single-threaded on an Apple M1 MacBook Pro.
Here are the results, showing single-threaded throughput in MiB/s:
The first figure shows the results for the benchmark that includes pre-tokenization. We see that our tokenizer outperforms tiktoken by almost 4x and Huggingface by about 10x. (These numbers are in line with tiktoken’s reported performance results. Note that our single-threaded performance is only matched when tiktoken uses eight threads.) The second figure shows the worst-case complexity difference between our linear implementation, Huggingface’s heap-based implementation, and tiktoken’s quadratic implementation.
The rest of this post details how we achieved this result. We explain the basic principle of byte-pair encoding, the insight that allows the faster algorithm, and a high-level description of the algorithm itself.
BPE is a technique to encode text as a sequence of tokens from a token dictionary. The token dictionary is just an ordered list of tokens. Each token is either a single byte, or the concatenation of a pair of previously defined tokens. A string is encoded by replacing bytes with single-byte tokens and token pairs by concatenated tokens, in dictionary order.
Let’s see how the string abacbb is tokenized using the following dictionary:
a b c ac bb ab acbb
Initially, the string is tokenized into the single-byte tokens. Next, all occurrences (left to right) of the token pair a c are replaced by the token ac. This procedure is repeated until no more replacements are possible. For our input string abacbb, tokenization proceeds as follows:
1. a b a c b b
2. a b ac b b
3. a b ac bb
4. ab ac bb
5. ab acbb
Note that initially we have several pairs of single-byte tokens that appear in the dictionary, such as a b and a c. Even though ab appears earlier in the string, ac is chosen because that token appears first in the token dictionary. It is this behavior that makes BPE non-incremental with respect to string operations such as slicing or appending. For example, the substring abacb is tokenized as ab ac b, but if another b is added, the resulting string abacbb is tokenized as ab acbb. Two tokens from the prefix abacb are gone, and the encoding for the longer string even ends up being shorter.
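To make this rule concrete, here is a small Python sketch of the naive procedure described above (my own illustration, not the optimized algorithm discussed later); it reproduces the abacbb example exactly:

def bpe_encode(text: bytes, dictionary: list[bytes]) -> list[bytes]:
    """Naive BPE: repeatedly merge the pair whose concatenation appears
    earliest in the dictionary, taking occurrences left to right."""
    rank = {token: i for i, token in enumerate(dictionary)}
    tokens = [bytes([b]) for b in text]  # start from single-byte tokens
    while True:
        best = None  # (rank, position) of the best mergeable pair
        for i in range(len(tokens) - 1):
            r = rank.get(tokens[i] + tokens[i + 1])
            if r is not None and (best is None or r < best[0]):
                best = (r, i)
        if best is None:
            return tokens
        i = best[1]
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]

DICTIONARY = [b"a", b"b", b"c", b"ac", b"bb", b"ab", b"acbb"]
print(bpe_encode(b"abacbb", DICTIONARY))  # [b'ab', b'acbb']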
The two main strategies for implementing BPE are:
However, all tokenizers require the full text up front. Tokenizing a substring or an extended string means starting the encoding from scratch. For that reason, the more interesting use cases from above quickly become very expensive (at least O(n^2 log(n))). So, how can we do better?
The difficulty of the byte-pair encoding algorithm (as described above) is that token pair replacements can happen anywhere in the string and can influence the final tokens at the beginning of the string. However, it turns out that there is a property, which we call compatibility, that allows us to build up tokenizations left-to-right:
Given a valid encoding, we can append an additional token to produce a new valid encoding if the pair of the last token and the appended token are a valid encoding.
A valid encoding means that the original BPE algorithm produces that same encoding. We’ll show what this means with an example, and refer to the crate’s README for a detailed proof.
The sequence ab ac is a valid encoding for our example token dictionary. Is ab ac b a valid encoding? Check if the pair ac b is compatible:
1. a c b
2. ac b
We got the same tokens back, which means ab ac b is a valid encoding.
Is ab ac bb a valid encoding? Again, check if the pair ac bb is compatible:
1. a c b b
2. ac b b
3. ac bb
4. acbb
In this case, the tokens are incompatible, and ab ac bb is not a valid encoding.
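In code, the compatibility check is simply a retokenization of the candidate pair. A small sketch, reusing bpe_encode and DICTIONARY from the example above:

def is_compatible(last_token: bytes, new_token: bytes) -> bool:
    """Appending new_token to a valid encoding stays valid iff re-encoding the
    pair's bytes yields exactly that pair again."""
    return bpe_encode(last_token + new_token, DICTIONARY) == [last_token, new_token]

print(is_compatible(b"ac", b"b"))   # True:  ab ac b is a valid encoding
print(is_compatible(b"ac", b"bb"))  # False: a c b b re-merges all the way to acbb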
The next section explains how we can go from building valid encodings to finding the encoding for a given input string.
Using the compatibility rule, we can implement linear encoding with a dynamic programming algorithm.
The algorithm works by checking for each of the possible last tokens whether it is compatible with the tokenization of the remaining prefix. As we saw in the previous section, we only need the last token of the prefix’s encoding to decide this.
Let’s apply this idea to our example abacbb and write down the full encodings for every prefix:
a      -> a
ab     -> ab
aba    -> ab a
abac   -> ab ac
abacb  -> ab ac b
abacbb -> ab acbb
We only store the last token for every prefix. This gives us a ab a ac b for the first five prefixes. We can find the last token for a prefix with a simple lookup in the list. For example, the last token for ab is ab, and the last token for abac is ac.
For the last token of abacbb we have three token candidates: b, bb, and acbb. For each of these we must check whether it is compatible with the last token of the remaining prefix: b b, ac bb, or ab acbb. Retokenizing these combinations gives bb, acbb, and ab acbb, which means acbb is the only valid choice here. The algorithm works forward by computing the last token for every position in the input, using the last tokens for previous positions in the way we just described.
The resulting algorithm looks roughly like this:
let mut last_tokens = vec![];
for pos in 0..text.len() {
    for candidate in all_potential_tokens_for_suffix(text[0..pos + 1]) {
        if token_len(candidate) == pos + 1 {
            last_tokens.push(candidate);
            break;
        } else if is_compatible(
            last_tokens[pos + 1 - token_len(candidate)],
            candidate,
        ) {
            last_tokens.push(candidate);
            break;
        }
    }
}
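For readers who want to see the whole procedure end to end, here is a runnable (deliberately brute-force) Python version of the same idea, reusing DICTIONARY and is_compatible from the sketches above; the real implementation replaces the inner candidate search with a string-matching automaton:

def encode_linear(text: bytes) -> list[bytes]:
    """last_token[i] holds the last token of the valid encoding of text[:i + 1]."""
    token_set = set(DICTIONARY)
    last_token = []
    for pos in range(len(text)):
        for length in range(pos + 1, 0, -1):
            candidate = text[pos + 1 - length:pos + 1]
            if candidate not in token_set:
                continue
            # Either the candidate covers the whole prefix, or it must be
            # compatible with the last token of the remaining prefix.
            if length == pos + 1 or is_compatible(last_token[pos - length], candidate):
                last_token.append(candidate)
                break
    # Walk backwards through last_token to recover the full token sequence.
    tokens, pos = [], len(text) - 1
    while pos >= 0:
        tokens.append(last_token[pos])
        pos -= len(last_token[pos])
    return tokens[::-1]

print(encode_linear(b"abacbb"))  # [b'ab', b'acbb']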
How do we implement this efficiently?
The string matching automaton is linear for the input length. Both the number of overlapping tokens and retokenization are bounded by constants determined by the token dictionary. Together, this gives us a linear runtime.
The Rust crate contains several different encoders based on this approach:
We have explained our algorithm at a high level so far. The crate’s README contains more technical details and is a great starting point for studying the code itself.
The post So many tokens, so little time: Introducing a faster, more flexible byte-pair tokenizer appeared first on The GitHub Blog.
Gradio is a Python web framework for demoing machine learning applications, which in the past few years has exploded in popularity. In this blog, you’ll follow along with the process in which I modeled (that is, added support for) the Gradio framework, finding 11 vulnerabilities to date in a number of open source projects, including AUTOMATIC1111/stable-diffusion-webui—one of the most popular projects on GitHub, which was included in the 2023 Octoverse report and 2024 Octoverse report. Check out the vulnerabilities I’ve found on the GitHub Security Lab’s website.
Following the process outlined in this blog, you will learn how to model new frameworks and libraries in CodeQL and scale your research to find more vulnerabilities.
This blog is written to be read standalone; however, if you are new to CodeQL or would like to dig deeper into static analysis and CodeQL, you may want to check out the previous parts of my CodeQL zero to hero blog series. Each deals with a different topic: static analysis fundamentals, writing CodeQL, and using CodeQL for security research.
Each also has accompanying exercises, which are in the above blogs, and in the CodeQL zero to hero repository.
CodeQL uses data flow analysis to find vulnerabilities. It uses models of sources (for example, an HTTP GET request parameter) and sinks in libraries and frameworks that could cause a vulnerability (for example, cursor.execute from MySQLdb, which executes SQL queries), and checks if there is a data flow path between the two. If there is, and there is no sanitizer on the way, CodeQL will report a vulnerability in the form of an alert.
Check out the first blog of the CodeQL zero to hero series to learn more about sources, sinks, and data flow analysis. See the second blog to learn about the basics of writing CodeQL. Head on to the third blog of the CodeQL zero to hero series to implement your own data flow analysis on certain code elements in CodeQL.
CodeQL has models for the majority of the most popular libraries and frameworks, and new ones are continuously being added to improve the detection of vulnerabilities.
One such example is Gradio. We will go through the process of analyzing it and modeling it today. Looking into a new framework or a library and modeling it with CodeQL is a perfect chance to do research on the projects using that framework, and potentially find multiple vulnerabilities at once.
Hat tip to my coworker, Alvaro Munoz, for giving me a tip about Gradio, and for his guide on researching Apache Dubbo and writing models for it, which served as inspiration for my research (if you are learning CodeQL and haven’t checked it out yet, you should!).
Frameworks have sources and sinks, and that’s what we are interested in identifying, and later modeling in CodeQL. A framework may also provide user input sanitizers, which we are also interested in.
The process consisted, among others, of:
Let me preface here that it’s natural that a framework like Gradio has sources and sinks, and there’s nothing inherently wrong about it. All frameworks have sources and sinks—Django, Flask, Tornado, and so on. The point is that if the classes and functions provided by the frameworks are not used in a secure way, they may lead to vulnerabilities. And that’s what we are interested in catching here, for applications that use the Gradio framework.
Gradio is a Python web framework for demoing machine learning applications, which in the past few years has become increasingly popular.
Gradio’s documentation is thorough and gives a lot of good examples to get started with using Gradio. We will use some of them, modifying them a little bit, where needed.
We can create a simple interface in Gradio by using the Interface class.
import gradio as gr

def greet(name, intensity):
    return "Hello, " + name + "!" * int(intensity)

demo = gr.Interface(
    fn=greet,
    inputs=[gr.Textbox(), gr.Slider()],
    outputs=[gr.Textbox()])

demo.launch()
In this example, the Interface class takes three arguments:
The fn argument takes a reference to a function that contains the logic of the program. In this case, it’s the reference to the greet function.
The inputs argument takes a list of input components that will be used by the function passed to fn. Here, inputs takes the values from a text (which is equivalent to a gr.Textbox component) and a slider (equivalent to a gr.Slider component).
The outputs argument specifies how the value returned by the function passed to fn in the return statement will be presented. Here, the output will be returned as text (so, gr.Textbox).
Running the code will start an application with the following interface. We provide example inputs, “Sylwia” in the textbox and “3” in the slider, and submit them, which results in an output, “Hello, Sylwia!!!”.
Another popular way of creating applications is by using the gr.Blocks class with a number of components; for example, a dropdown list, a set of radio buttons, checkboxes, and so on. Gradio documentation describes the Blocks class in the following way:
Blocks offers more flexibility and control over: (1) the layout of components (2) the events that trigger the execution of functions (3) data flows (for example, inputs can trigger outputs, which can trigger the next level of outputs).
With gr.Blocks, we can use certain components as event listeners, for example, a click of a given button, which will trigger execution of functions using the input components we provided to them.
In the following code we create a number of input components: a slider, a dropdown, a checkbox group, a radio buttons group, and a checkbox. Then, we define a button, which will execute the logic of the sentence_builder function on a click of the button and output the results of it as a textbox.
import gradio as gr

def sentence_builder(quantity, animal, countries, place, morning):
    return f"""The {quantity} {animal}s from {" and ".join(countries)} went to the {place} in the {"morning" if morning else "night"}"""

with gr.Blocks() as demo:
    gr.Markdown("Choose the options and then click **Run** to see the output.")
    with gr.Row():
        quantity = gr.Slider(2, 20, value=4, label="Count", info="Choose between 2 and 20")
        animal = gr.Dropdown(["cat", "dog", "bird"], label="Animal", info="Will add more animals later!")
        countries = gr.CheckboxGroup(["USA", "Japan", "Pakistan"], label="Countries", info="Where are they from?")
        place = gr.Radio(["park", "zoo", "road"], label="Location", info="Where did they go?")
        morning = gr.Checkbox(label="Morning", info="Did they do it in the morning?")
    btn = gr.Button("Run")
    btn.click(
        fn=sentence_builder,
        inputs=[quantity, animal, countries, place, morning],
        outputs=gr.Textbox(label="Output")
    )

if __name__ == "__main__":
    demo.launch(debug=True)
Running the code and providing example inputs will give us the following results:
Given the code examples, the next step is to identify how a Gradio application might be written in a vulnerable way, and what we could consider a source or a sink. A good point to start is to run a few code examples, use the application the way it is meant to be used, observe the traffic, and then poke at the application for any interesting areas.
The first interesting points that stood out to me for investigation were the variables passed to the inputs keyword argument in the Interface example app above, and to the click button event handler in the Blocks example.
Let’s start by running the above Gradio Interface example app in your favorite proxy (or observing the traffic in your browser’s DevTools). Filling out the form with example values (the string "Sylwia" and the integer 3) shows the data being sent:
The values we set are sent as a string “Sylwia” and an integer 3 in a JSON, in the value of the “data” key. Here are the values as seen in Burp Suite:
The text box naturally allows for setting any string value we would like, and that data will later be processed by the application as a string. What if I try to set it to something other than a string, for example, an integer 1000?
Turns out that’s allowed. What about a slider? You might expect that we can only set the values that are restricted in the code (so, here these should be integer values from 2 to 20). Could we send something else, for example, a string "high"?
We see an error:
File "/**/**/**/example.py", line 4, in greet
return "Hello, " + name + "!" * int(intensity)
^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'high'
That’s very interesting. 🤔
The error didn’t come up from us setting a string value on a slider (which should only allow integer values from 2 to 20), but from the int function, which converts a value to an integer in Python. Meaning, until that point, the value from the slider can be set to anything, and can be used in any dangerous functions (sinks) or otherwise sensitive functionality. All in all, a perfect candidate for a source.
We can do a similar check with a more complex example with gr.Blocks. We run the example code from the previous section and observe our data being sent:
The values sent correspond to the values given in the inputs list, which comes from the components:
A Slider with values from 2 to 20.
A Dropdown with values: “cat”, “dog”, “bird”.
A CheckboxGroup with values: “USA”, “Japan”, “Pakistan”.
A Radio with values: “park”, “zoo”, “road”.
A Checkbox.
Then, we test what we can send to the application. We pass values that are not expected for the given components:
Slider—a string "a thousand".
Dropdown—a string "turtle", which is not one of the dropdown options.
CheckboxGroup—a list with ["USA", "Japan", "Poland"], of which "Poland" is not one of the options.
Radio—a list ["a", "b"]. Radio is supposed to allow only one value, and not a list.
Checkbox—an integer 123. Checkbox is supposed to take only true or false.
No issues reported—we can set the source values to anything no matter which component they come from. That makes them perfect candidates for modeling as sources and later doing research at scale.
Except.
The gr.Dropdown example
Not long after I had a look at Gradio version 4.x.x and wrote models for its sources, the Gradio team asked Trail of Bits (ToB) to conduct a security audit of the framework, which resulted in Maciej Domański and Vasco Franco creating a report on Gradio’s security with a lot of cool findings. The fixes for the issues reported were incorporated into Gradio 5.0, which was released on October 9, 2024. One of the issues that ToB’s team reported was TOB-GRADIO-15: Dropdown component pre-process step does not limit the values to those in the dropdown list.
The issue was subsequently fixed in Gradio 5.0, and so submitting values that are not valid choices in gr.Dropdown, gr.Radio, and gr.CheckboxGroup results in an error, namely:
gradio.exceptions.Error: "Value: turtle is not in the list of choices: ['cat', 'dog', 'bird']"
In this case, the vulnerabilities that may result from these sources may not be exploitable in Gradio version 5.0 and later. There were also a number of other security-related changes to the Gradio framework in version 5.0, which can be explored in the ToB report on Gradio security.
The change made me ponder whether to update the CodeQL models I had written and added to CodeQL. However, since the sources can still be misused in applications running Gradio versions below 5.0, I decided to leave them as they are.
We have now identified a number of potential sources. We can then write CodeQL models for them, and later use these sources with existing sinks to find vulnerabilities in Gradio applications at scale. But let’s go back to the beginning: how do we model these Gradio sources with CodeQL?
Recall that to run CodeQL queries, we first need to create a CodeQL database from the source code that we are interested in. Then, we can run our CodeQL queries on that database to find vulnerabilities in the code.
We start with intentionally vulnerable source code using gr.Interface that we want to use as our test case for finding Gradio sources. The code is vulnerable to command injection via both the folder and logs arguments, which end up in the first argument to an os.system call.
import gradio as gr
import os

def execute_cmd(folder, logs):
    cmd = f"python caption.py --dir={folder} --logs={logs}"
    os.system(cmd)

folder = gr.Textbox(placeholder="Directory to caption")
logs = gr.Checkbox(label="Add verbose logs")

demo = gr.Interface(fn=execute_cmd, inputs=[folder, logs])

if __name__ == "__main__":
    demo.launch(debug=True)
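To see why this is command injection, consider what the formatted command becomes when an attacker controls the folder value (a purely illustrative payload):

# Illustration: with an attacker-controlled folder value, the formatted string
# below becomes two shell commands instead of one.
folder = "images; touch /tmp/pwned"
cmd = f"python caption.py --dir={folder} --logs=False"
print(cmd)  # python caption.py --dir=images; touch /tmp/pwned --logs=False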
Let’s create another example using gr.Blocks and gr.Button.click to mimic our earlier examples. Similarly, the code is vulnerable to command injection via both the folder and logs arguments. The code is a simplified version of a vulnerability I found in an open source project.
import gradio as gr
import os

def execute_cmd(folder, logs):
    cmd = f"python caption.py --dir={folder} --logs={logs}"
    os.system(cmd)

with gr.Blocks() as demo:
    gr.Markdown("Create caption files for images in a directory")
    with gr.Row():
        folder = gr.Textbox(placeholder="Directory to caption")
        logs = gr.Checkbox(label="Add verbose logs")
    btn = gr.Button("Run")
    btn.click(fn=execute_cmd, inputs=[folder, logs])

if __name__ == "__main__":
    demo.launch(debug=True)
I also added two more code snippets which use positional arguments instead of keyword arguments. The database and the code snippets are available in the CodeQL zero to hero repository.
Now that we have the code, we can create a CodeQL database for it by using the CodeQL CLI. It can be installed either as an extension to the gh tool (recommended) or as a binary. Using the gh tool with the CodeQL CLI makes it much easier to update CodeQL.
First, install gh, using the installation instructions for your system. Then install the CodeQL CLI extension:
gh extensions install github/gh-codeql
To be able to run codeql commands directly (without prefixing them with gh), run:
gh codeql install-stub
To update to the latest CodeQL version, run:
codeql set-version latest
After installing the CodeQL CLI, we can create a CodeQL database. First, we move to the folder where all our source code is located. Then, to create a CodeQL database called gradio-cmdi-db for the Python code in the folder gradio-tests, run:
codeql database create gradio-cmdi-db --language=python --source-root='./gradio-tests'
This command creates a new folder gradio-cmdi-db with the extracted source code elements. We will use this database to test and run CodeQL queries on.
I assume here you already have the VS Code CodeQL starter workspace set up. If not, follow the instructions to create a starter workspace and come back when you are finished.
To run queries on the database we created, we need to add it to the VS Code CodeQL extension. We can do it with the “Choose Database from Folder” button, pointing it to the gradio-cmdi-db folder.
Now that we are all set up, we can move on to actual CodeQL modeling.
Let’s have another look at our intentionally vulnerable code.
import gradio as gr
import os

def execute_cmd(folder, logs):
    cmd = f"python caption.py --dir={folder} --logs={logs}"
    os.system(cmd)
    return f"Command: {cmd}"

folder = gr.Textbox(placeholder="Directory to caption")
logs = gr.Checkbox(label="Save verbose logs")
output = gr.Textbox()

demo = gr.Interface(
    fn=execute_cmd,
    inputs=[folder, logs],
    outputs=output)

if __name__ == "__main__":
    demo.launch(debug=True)
In the first example with the Interface class, we had several components on the website. In fact, on the left side we have one source component which takes user input – a textbox. On the right side, we have an output text component. So, it’s not enough that a component is, for example, of a gr.Textbox() type to be considered a source.
To be considered a source of untrusted input, a component has to be passed to the inputs keyword argument, which takes an input component or a list of input components that will be used by the function passed to fn and processed by the logic of the application in a potentially vulnerable way. So, not all components are sources. In our case, any values passed to inputs in the gr.Interface class are sources. We could go a step further, and say that if gr.Interface is used, then anything passed to the execute_cmd function is a source – so here the folder and logs arguments.
The same situation happens in the second example with the gr.Blocks class and the gr.Button.click event listener. Any arguments passed to inputs in the gr.Button.click method, and thus to the execute_cmd function, are sources.
gr.Interface
Let’s start by looking at the gr.Interface class.
Since we are interested in identifying any values passed to inputs in the gr.Interface class, we first need to identify any calls to gr.Interface.
CodeQL for Python has a library for finding references to classes and functions defined in library code, called ApiGraphs, which we can use to identify any calls to gr.Interface. See CodeQL zero to hero part 2 for a refresher on writing CodeQL and CodeQL zero to hero part 3 for a refresher on using ApiGraphs.
We can get all references to gr.Interface calls with the following query. Note that:
We set @kind problem, which will format the results of the select as an alert.
In from, we define a node variable of the API::CallNode type, which gives us the set of all API::CallNodes in the program.
In where, we filter the node to be a gr.Interface call.
In select, we choose our output to be node and the string “Call to gr.Interface”, which will be formatted as an alert due to setting @kind problem.
./**
* @id codeql-zero-to-hero/4-1
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
from API::CallNode node
where node =
API::moduleImport("gradio").getMember("Interface").getACall()
select node, "Call to gr.Interface"
Run the query by right-clicking and selecting “CodeQL: Run Query on Selected Database”. If you are using the same CodeQL database, you should see two results, which proves that the query is working as expected (recall that I added more code snippets to the test database. The database and the code snippets are available in the CodeQL zero to hero repository).
Next, we want to identify values passed to inputs in the gr.Interface class, which are passed to the execute_cmd function. We could do it in two ways—by identifying the values passed to inputs and then linking them to the function referenced in fn, or by looking at the parameters of the function referenced in fn directly. The latter is a bit easier, so let’s focus on that solution. If you are interested in the other solution, check out the Taint step section.
To sum up, we are interested in getting the folder and logs parameters.
import gradio as gr
import os

def execute_cmd(folder, logs):
    cmd = f"python caption.py --dir={folder} --logs={logs}"
    os.system(cmd)
    return f"Command: {cmd}"

folder = gr.Textbox(placeholder="Directory to caption")
logs = gr.Checkbox(label="Save verbose logs")
output = gr.Textbox()

demo = gr.Interface(
    fn=execute_cmd,
    inputs=[folder, logs],
    outputs=output)

if __name__ == "__main__":
    demo.launch(debug=True)
We can get folder and logs with the query below:
/**
* @id codeql-zero-to-hero/4-2
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
from API::CallNode node
where node =
API::moduleImport("gradio").getMember("Interface").getACall()
select node.getParameter(0, "fn").getParameter(_), "Gradio sources"
To get the function reference passed to fn (or in the 1st positional argument) we use the getParameter(0, "fn") predicate – 0 refers to the 1st positional argument and "fn" refers to the fn keyword argument. Then, to get the parameters of that function, we use the getParameter(_) predicate. Note that an underscore here is a wildcard, meaning it will output all of the parameters to the function referenced in fn. Running the query results in 3 alerts.
We can also encapsulate the logic of the query into a class to make it more portable. This query will give us the same results. If you need a refresher on classes and using the exists mechanism, see CodeQL zero to hero part 2.
/**
* @id codeql-zero-to-hero/4-3
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.RemoteFlowSources
class GradioInterface extends RemoteFlowSource::Range {
GradioInterface() {
exists(API::CallNode n |
n = API::moduleImport("gradio").getMember("Interface").getACall() |
this = n.getParameter(0, "fn").getParameter(_).asSource())
}
override string getSourceType() { result = "Gradio untrusted input" }
}
from GradioInterface inp
select inp, "Gradio sources"
Note that for the GradioInterface class we start with the RemoteFlowSource::Range supertype. This allows us to add the sources contained in the query to the RemoteFlowSource abstract class.
An abstract class is a union of all its subclasses, for example, the GradioInterface class we just modeled as well as the sources already added to CodeQL, for example, from the Flask, Django or Tornado web frameworks. An abstract class is useful if you want to group multiple existing classes together under a common name.
Meaning, if we now query for all sources using the RemoteFlowSource class, the results will include the results produced from our class above. Try it!
/**
* @id codeql-zero-to-hero/4-4
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.RemoteFlowSources
class GradioInterface extends RemoteFlowSource::Range {
GradioInterface() {
exists(API::CallNode n |
n = API::moduleImport("gradio").getMember("Interface").getACall() |
this = n.getParameter(0, "fn").getParameter(_).asSource())
}
override string getSourceType() { result = "Gradio untrusted input" }
}
from RemoteFlowSource rfs
select rfs, "All python sources"
For a refresher on RemoteFlowSource, and how to use it in a query, head to CodeQL zero to hero part 3.
Note that since we modeled the new sources using the RemoteFlowSource abstract class, all Python queries that already use RemoteFlowSource will automatically use our new sources if we add them to library files, like I did in this pull request to add Gradio models. Almost all CodeQL queries use RemoteFlowSource. For example, if you run the SQL injection query, it will also include vulnerabilities that use the sources we’ve modeled. See how to run prewritten queries in CodeQL zero to hero part 3.
gr.Button.click
We model gr.Button.click in a very similar way.
/**
* @id codeql-zero-to-hero/4-5
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
from API::CallNode node
where node =
API::moduleImport("gradio").getMember("Button").getReturn()
.getMember("click").getACall()
select node.getParameter(0, "fn").getParameter(_), "Gradio sources"
Note that in the code we first create a Button object with gr.Button() and then call the click() event listener on it. Due to that, we need to use API::moduleImport("gradio").getMember("Button").getReturn() to first get the nodes representing the result of calling gr.Button(), and then we continue with .getMember("click").getACall() to get all calls to gr.Button.click. Running the query results in 3 alerts.
We can encapsulate the logic of this query into a class, too.
/**
* @id codeql-zero-to-hero/4-6
* @severity error
* @kind problem
*/
import python
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.RemoteFlowSources
class GradioButton extends RemoteFlowSource::Range {
GradioButton() {
exists(API::CallNode n |
n = API::moduleImport("gradio").getMember("Button").getReturn()
.getMember("click").getACall() |
this = n.getParameter(0, "fn").getParameter(_).asSource())
}
override string getSourceType() { result = "Gradio untrusted input" }
}
from GradioButton inp
select inp, "Gradio sources"
We can now use our two classes as sources in a taint tracking query, to detect vulnerabilities that have a Gradio source and, continuing with our command injection example, an os.system sink (the first argument to the os.system call is the sink). See CodeQL zero to hero part 3 to learn more about taint tracking queries.
The os.system call is defined in the OsSystemSink class, and the sink, that is, the first argument to the os.system call, is defined in the isSink predicate.
/**
* @id codeql-zero-to-hero/4-7
* @severity error
* @kind path-problem
*/
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.RemoteFlowSources
import MyFlow::PathGraph
class GradioButton extends RemoteFlowSource::Range {
GradioButton() {
exists(API::CallNode n |
n = API::moduleImport("gradio").getMember("Button").getReturn()
.getMember("click").getACall() |
this = n.getParameter(0, "fn").getParameter(_).asSource())
}
override string getSourceType() { result = "Gradio untrusted input" }
}
class GradioInterface extends RemoteFlowSource::Range {
GradioInterface() {
exists(API::CallNode n |
n = API::moduleImport("gradio").getMember("Interface").getACall() |
this = n.getParameter(0, "fn").getParameter(_).asSource())
}
override string getSourceType() { result = "Gradio untrusted input" }
}
class OsSystemSink extends API::CallNode {
OsSystemSink() {
this = API::moduleImport("os").getMember("system").getACall()
}
}
private module MyConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) {
source instanceof GradioButton
or
source instanceof GradioInterface
}
predicate isSink(DataFlow::Node sink) {
exists(OsSystemSink call |
sink = call.getArg(0)
)
}
}
module MyFlow = TaintTracking::Global<MyConfig>;
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "Data Flow from a Gradio source to `os.system`"
Running the query results in 6 alerts, which show us the path from source to sink. Note that the os.system sink we used is already modeled in CodeQL, but we are using it here to illustrate the example.
Similarly to the GradioInterface class, since we have already written the sources, we can actually use them (as well as the query we’ve written above) on any Python project. We just have to add them to library files, like I did in this pull request to add Gradio models.
We can actually run any query on up to 1,000 projects at once using a tool called multi-repository variant analysis (MRVA).
But before that.
Based on the tests we did in the Identifying attack surface in Gradio section, we can identify other sources that behave in a similar way. For example, there’s gr.LoginButton.click, an event listener that also takes inputs and could be considered a source. I’ve modeled these cases and added them to CodeQL, which you can see in the pull request to add Gradio models. The modeling of these event listeners is very similar to what we’ve done in the previous section.
We’ve mentioned that there are two ways to model gr.Interface and other Gradio sources—by identifying the values passed to inputs and then linking them to the function referenced in fn, or by looking at the parameters of the function referenced in fn directly.
import gradio as gr
import os

def execute_cmd(folder, logs):
    cmd = f"python caption.py --dir={folder} --logs={logs}"
    os.system(cmd)
    return f"Command: {cmd}"

folder = gr.Textbox(placeholder="Directory to caption")
logs = gr.Checkbox(label="Save verbose logs")
output = gr.Textbox()

demo = gr.Interface(
    fn=execute_cmd,
    inputs=[folder, logs],
    outputs=output)

if __name__ == "__main__":
    demo.launch(debug=True)
As it turns out, machine learning applications written using Gradio often use a lot of input variables, which are later processed by the application. In this case, the inputs argument gets a list of variables, which at times can be very long. I’ve found several cases which used a list with 10+ elements.
In these cases, it would be nice to be able to track the source all the way to the component that introduces it—in our case, gr.Textbox and gr.Checkbox.
To do that, we need to use a taint step. Taint steps are usually used in case taint analysis stops at a specific code element and we want to make it propagate forward. In our case, however, we are going to write a taint step to track a variable in inputs, that is, an element of a list, and track it back to the component.
The Gradio.qll file in CodeQL upstream contains all the Gradio source models and the taint step, if you’d like to see the whole modeling.
We start by identifying the variables passed to inputs in, for example, gr.Interface:
class GradioInputList extends RemoteFlowSource::Range {
GradioInputList() {
exists(GradioInput call |
// limit only to lists of parameters given to `inputs`.
(
(
call.getKeywordParameter("inputs").asSink().asCfgNode() instanceof ListNode
or
call.getParameter(1).asSink().asCfgNode() instanceof ListNode
) and
(
this = call.getKeywordParameter("inputs").getASubscript().getAValueReachingSink()
or
this = call.getParameter(1).getASubscript().getAValueReachingSink()
)
)
)
}
override string getSourceType() { result = "Gradio untrusted input" }
}
Next, we identify the function in fn and link the elements of the list of variables in inputs to the parameters of the function referenced in fn.
class ListTaintStep extends TaintTracking::AdditionalTaintStep {
override predicate step(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
exists(GradioInput node, ListNode inputList |
inputList = node.getParameter(1, "inputs").asSink().asCfgNode() |
exists(int i |
nodeTo = node.getParameter(0, "fn").getParameter(i).asSource() |
nodeFrom.asCfgNode() =
inputList.getElement(i))
)
}
}
Let’s explain the taint step, step by step.
exists(GradioInput node, ListNode inputList |
In the taint step, we define two temporary variables in the exists mechanism—node of type GradioInput and inputList of type ListNode.
inputList = node.getParameter(1, "inputs").asSink().asCfgNode() |
Then, we set our inputList to the value of inputs. Note that because inputList has type ListNode, we are looking only for lists.
exists(int i |
nodeTo = node.getParameter(0, "fn").getParameter(i).asSource() |
nodeFrom.asCfgNode() = inputList.getElement(i))
Next, we identify the function in fn and link the parameters of the function referenced in fn to the elements of the list of variables in inputs, by using a temporary variable i.
All in all, the taint step provides us with a nicer display of the paths, from the component used as a source to a potential sink.
CodeQL zero to hero part 3 introduced Multi-Repository Variant Analysis (MRVA) and variant analysis. Head over there if you need a refresher on the topics.
In short, MRVA allows you to run a query on up to 1,000 projects hosted on GitHub at once. It comes preconfigured with dynamic lists of the top 10, 100, and 1,000 most popular repositories for each language. You can configure your own lists of repositories to run CodeQL queries on and potentially find more variants of vulnerabilities that use our new models. @maikypedia wrote a neat case study about using MRVA to find SSTI vulnerabilities in Ruby and unsafe deserialization vulnerabilities in Python.
MRVA is used together with the VS Code CodeQL extension and can be configured in the extension, in the “Variant Analysis” section. It uses GitHub Actions to run, so you need a repository, which will be used as a controller to run these actions. You can create a public repository, in which case running the queries will be free, but in this case, you can run MRVA only on public repositories. The docs contain more information about MRVA and its setup.
Using MRVA, I’ve found 11 vulnerabilities to date in several Gradio projects. Check out the vulnerability reports on GitHub Security Lab’s website.
Today, we learned how to model a new framework in CodeQL, using Gradio as an example, and how to use those models for finding vulnerabilities at scale. I hope that this post helps you with finding new cool vulnerabilities! 🔥
If CodeQL and this post helped you to find a vulnerability, we would love to hear about it! Reach out to us on GitHub Security Lab on Slack or tag us @ghsecuritylab on X.
If you have any questions, issues with challenges or with writing a CodeQL query, feel free to join and ask on the GitHub Security Lab server on Slack. The Slack server is open to anyone and gives you access to ask questions about issues with CodeQL, CodeQL modeling or anything else CodeQL related, and receive answers from a number of CodeQL engineers. If you prefer to stay off Slack, feel free to ask any questions in CodeQL repository discussions or in GitHub Security Lab repository discussions.
The post CodeQL zero to hero part 4: Gradio framework case study appeared first on The GitHub Blog.
Three years from today, all obligations of the EU Cyber Resilience Act (CRA) will be fully applicable, with vulnerability reporting obligations applying from September 2026. By that time, the CRA, from idea to implementation, will have been high on GitHub’s developer policy agenda for nearly six years.
Together with our partners, GitHub has engaged with EU lawmakers to ensure that the CRA avoids unintended consequences for the open source ecosystem, which forms the backbone of any modern software stack. This joint effort has paid off: while the first draft of the CRA created significant legal uncertainty for open source projects, the end result allocates responsibility for cybersecurity more clearly with those entities that have the resources to act.
As an open source developer, you’re too busy to become a policy wonk in your spare time. That’s why we’re here: to explain the origins and purpose of the CRA and to help you assess whether it applies to you and your project.
The reasons why the EU decided to start regulating software products are clear: cheap IoT devices that end up being abused in botnets, smartphones that fail to receive security updates just a couple of years after purchase, apps that accidentally leak user data, and more. These are not just a nuisance for the consumers directly affected by them, they can be a starting point for cyberattacks that weaken entire industries and disrupt public services. To address this problem, the CRA creates cybersecurity, maintenance, and vulnerability disclosure requirements for software products on the EU market.
Cybersecurity is just as important for open source as it is for any other software. At the same time, open source has flourished precisely because it has always been offered as-is, free to re-use by anyone, but without any guarantees offered by the community that developed it. Despite their great value to society, open source projects are frequently understaffed and underresourced.
That’s why GitHub has been advocating for a stronger focus on supporting, rather than regulating, open source projects. For example, GitHub supports scaling up public funding programs like Germany’s Sovereign Tech Agency, which is joining us for a GitTogether today to discuss open source sustainability. This approach has also informed the GitHub Secure Open Source Fund, which will help projects improve their cybersecurity posture through funding, education, and free tools. Applications are open until January 7.
The CRA creates obligations for software manufacturers, importers, and distributors. If you contribute to open source in a personal capacity, the most important question will be whether the CRA applies to you at all. In many cases, the answer will be “no,” but some uncertainty remains because the EU is still in the process of developing follow-up legislation and guidance on how the law should be applied. As with any law, courts will have the final word on how the CRA is to be interpreted.
From its inception, the CRA sought to exclude any free and open source software that falls outside of a commercial activity. To protect the open source ecosystem, GitHub has fought for a clear definition of “commercial activity” throughout the legislative process. In the end, the EU legislature has added clarifying language that should help place the principal burden of compliance on companies that benefit from open source, not on the open source projects they are relying on.
In determining whether the CRA applies to your open source project, the key will be whether the open source software is “ma[de] available on the market,” that is, intended for “distribution or use in the course of a commercial activity, whether in return for payment or free of charge” (Art. 3(22); see also Art. 3(1), (22), (48); Recitals 18-19). Generally, as long as it’s not, the CRA’s provisions won’t apply at all. If you intend to monetize the open source software yourself, then you are likely a manufacturer under the CRA, and subject to its full requirements (Art. 3(13)).
If you are part of an organization that supports and maintains open source software that is intended for commercial use by others, that organization may be classified as an “open source software steward” under the CRA (Art. 3(14), Recital 19).
Following conversations with the open source community, EU lawmakers introduced this new category to better reflect the diverse landscape of actors that take responsibility for open source. Examples of stewards include “certain foundations as well as entities that develop and publish free and open-source software in a business context, including not-for-profit entities” (Recital 19). Stewards have limited obligations under the CRA: they have to put in place and document a cybersecurity policy, notify cybersecurity authorities of actively exploited vulnerabilities in their projects that they become aware of, and promote sharing of information on discovered vulnerabilities with the open source community (Art. 24). The details of these obligations will be fleshed out by industry standards over the next few years, but the CRA’s text makes clear a key distinction—that open source stewards should be subjected to a “light-touch . . . regulatory regime” (Recital 19).
Developers should be mindful of the CRA’s key provisions as they decipher whether they will be deemed manufacturers, stewards, or wholly unaffected by the law.
Even with the clarifications added during the legislative process, many open source developers still struggle to make sense of whether they’re covered or not. If you think your open source activities may fall within the scope of the CRA (either as a manufacturer or as a steward), the Eclipse ORC Working Group is a good place to get connected with other open source organizations that are thinking about the CRA and related standardization activities. The Linux Foundation is currently co-hosting a workshop with OpenSSF to help open source stewards and manufacturers prepare for compliance, and OpenSSF is publishing resources about the CRA.
While it’s good news that some classes of open source developers will face minimal obligations, that probably won’t stop some companies from contacting projects about their CRA compliance. In the best case, this will lead to constructive conversations, with companies taking responsibility for improving the cybersecurity of the open source projects they depend upon. The CRA offers opportunities for that: for example, it allows manufacturers to finance a voluntary security attestation of open source software (Art. 25), and it requires manufacturers who have developed a fix for a vulnerability in an open source component to contribute that fix back to the project (Art. 13(6)). In the worst case, companies working on their own CRA compliance will send requests phrased as demands to overworked open source maintainers, who will spend hours talking to lawyers only to find out, in many cases, that they are not legally required to help manufacturers.
To avoid the worst-case scenario and help realize the opportunities of the CRA for open source projects, GitHub’s developer policy team is engaging with the cybersecurity agencies and regulators, such as the European Commission, that are creating guidance on how the rules will work in practice.
Most urgently, we call on the European Commission to issue guidance (required under Art. 26 (2) (a)) that will make it easy for every open source developer to determine whether they face obligations under the CRA or not. We are also giving input to related legislation, such as the NIS2 Directive, which defines supply chain security rules for certain critical services.
Our north star is advocating for rules that support open source projects rather than unduly burden them. We are convinced that this approach will lead to better security outcomes for everyone and a more sustainable open source ecosystem.
The post What the EU’s new software legislation means for developers appeared first on The GitHub Blog.