To Fork or Not to Fork

How you handle your dependencies will change how you secure your software supply chain

Ben Cotton

June 27, 2024

It’s a common question in development: how do you handle your dependencies? You can choose to pull them in from an external repository at build time, cache them to an internal repository, or create a static copy (commonly called “vendoring” or, if you’re maintaining the copy, “forking”). Each of these approaches comes with advantages and disadvantages. So what’s the security-conscious developer to do?

HashiCorp founder Mitchell Hashimoto recently offered his take:

Unpopular opinion: you should copy/fork/DIY your dependencies for everything but the most complicated or sensitive functionality (GUI, crypto, networking, etc.). Blindly depending on trivial functionality or having a deep dependency tree causes more problems than it solves.

Reasons to fork

It’s true that dependency management is one of the most challenging parts of maintaining supply chain security. Creating a static copy of your upstreams, either directly in your application’s source code or as an external repository, gives you a predictable experience. Forking or vendoring your dependencies can eliminate some supply chain hazards entirely.

Most notably, it means you don’t have to worry about the availability or security of the repository after you’ve made the initial copy. Outages in the language-specific repository (for example, PyPI) or the code forge (for example, GitHub) don’t affect you because you already have the code you need. Plus, you’re protected against maintainers who decide to vandalize or remove their project (like in the left-pad incident of 2016) and against compromises in the repository (like in 2023, when malicious actors impersonated GitHub’s Dependabot to inject malware into projects).

Beyond the availability and integrity considerations, a vendored dependency simplifies the developer experience in a couple of ways. For one, it means that the behavior of the dependency doesn’t change unexpectedly. If you’re pulling in the latest version of a library or application and one day there’s a new version with new behavior, your project may fail to build — or worse: build with subtle changes in behavior.

Vendoring dependencies also avoids a particular flavor of “dependency hell.” Although we often talk about a supply chain or imagine our dependencies as a tree, the reality is that the relationships form a directed acyclic graph (DAG). Looking at only the dependency name doesn’t tell you enough; two of your upstreams may each require a different, mutually exclusive version of the same library. Vendoring those indirect dependencies into your application avoids this difficult-to-solve problem.
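
To make the conflict concrete, here is a minimal sketch of how you might spot it. The package names, versions, and the `find_conflicts` helper are all made up for illustration, and the sketch assumes every requirement is an exact version; real resolvers work with ranges and are considerably more involved.

```python
from collections import defaultdict

# Hypothetical dependency graph: each package maps to the exact versions
# of the packages it requires. All names and versions are invented.
REQUIREMENTS = {
    "app": {"web-framework": "2.1.0", "report-tool": "1.4.0"},
    "web-framework": {"json-lib": "1.9.0"},
    "report-tool": {"json-lib": "2.3.0"},
    "json-lib": {},
}


def find_conflicts(requirements):
    """Return packages that are required at more than one version."""
    wanted = defaultdict(dict)  # package -> {version: [packages requiring it]}
    for requirer, deps in requirements.items():
        for name, version in deps.items():
            wanted[name].setdefault(version, []).append(requirer)
    return {name: versions for name, versions in wanted.items() if len(versions) > 1}


conflicts = find_conflicts(REQUIREMENTS)
for name, versions in conflicts.items():
    for version, requirers in sorted(versions.items()):
        print(f"{name} {version} required by {', '.join(requirers)}")
```

Here the application never asks for `json-lib` directly, yet it inherits two incompatible requests for it. Vendoring one copy alongside each upstream is one way out of that bind.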

Reasons not to fork

Vendoring indirect dependencies points to a challenge with forking your direct dependencies: you may have to fork your indirect dependencies, too. With the full dependency chain encompassing dozens or hundreds of independent dependencies, that’s a lot of work to avoid what could be a minor inconvenience.

Of course, the big issue with forking your dependencies is that you don’t get any bug fixes that land after you’ve made your copy. Sure, you might avoid vulnerabilities introduced in later releases, but you have to track upstream fixes and apply them yourself. That’s a lot of work, especially when you consider that the vast majority of vulnerabilities come from indirect dependencies: you may be tracking dozens or hundreds of packages. And the task becomes more difficult the more the upstream project has diverged from the point where you copied it.
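
As a rough illustration of that bookkeeping, the sketch below compares the version each vendored copy was taken from against the first upstream version containing a fix. The inventory, the advisory IDs, and the `needs_backport` helper are hypothetical, and the sketch assumes every version before the fix is affected, which real advisories do not guarantee.

```python
# Hypothetical inventory: vendored package -> upstream version it was copied from.
VENDORED = {
    "json-lib": (1, 9, 0),
    "http-client": (3, 2, 1),
    "left-pad": (1, 3, 0),
}

# Hypothetical advisories: (package, advisory ID, first upstream version with the fix).
ADVISORIES = [
    ("json-lib", "EXAMPLE-0001", (1, 9, 4)),
    ("http-client", "EXAMPLE-0002", (3, 1, 0)),
]


def needs_backport(vendored, advisories):
    """List advisories whose fix landed after the version that was copied."""
    pending = []
    for name, advisory_id, fixed_in in advisories:
        copied = vendored.get(name)
        # Tuples compare element by element, so (1, 9, 0) < (1, 9, 4).
        if copied is not None and copied < fixed_in:
            pending.append((name, advisory_id))
    return pending


for name, advisory_id in needs_backport(VENDORED, ADVISORIES):
    print(f"{name}: apply the fix for {advisory_id} by hand")
```

Finding the gap is the easy half. Applying each fix to a copy that has drifted from upstream is where the real effort goes.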

And what happens when a new upstream release contains features that you want to start using? If you’ve made a simple copy, then you can update your copy and debug any issues that crop up, just like you would with an external dependency. If you’ve made a “true” fork where you’ve made local changes to the copy, then you have to bring those changes forward into the new code base. Again, depending on how much the code bases have diverged, this may be a difficult task.

What to do?

This is a complicated problem; anyone who claims to have a simple answer is trying to sell you something. The real answer is: it depends.

For trivial dependencies, vendoring the code is a reasonable solution; left-pad is a great example. It’s a few lines of code, with no real functionality to add (in fact, it’s now obviated by a built-in function) or likelihood of new bugs. In general, the smaller and more “done” a project is, the better a candidate it is for vendoring.
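
To show how little code is at stake, here is a Python analogue of a left-pad style helper. The original left-pad is a JavaScript package, and the built-in that superseded it there is `String.prototype.padStart`; the `left_pad` function below is an illustrative stand-in, not a port of the original.

```python
def left_pad(text, length, fill=" "):
    """Pad text on the left with fill until it is at least length characters long."""
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    missing = length - len(text)
    # max() keeps the padding at zero when text is already long enough.
    return fill * max(missing, 0) + text


print(left_pad("42", 5, "0"))   # prints 00042
print("42".rjust(5, "0"))       # Python's built-in equivalent, prints 00042
```

A function this small is cheap to copy and cheap to audit, which is what makes it a good vendoring candidate.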

Another approach that can help is pinning dependencies to a particular version or range of versions. In day-to-day use, pinning to a specific version is not much different from vendoring or forking, but it does make the process of updating simpler. In particular, it makes it easy to test the new version and quickly roll back if it doesn’t work. If you use a range of versions, as many language ecosystems support, then you can automatically upgrade to point releases that contain bug fixes, but not automatically go to a new major version that may include behavior changes. This middle ground works in a lot of cases.
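
The range idea can be sketched in a few lines. This is a simplified illustration, not how any particular package manager implements it: the `parse`, `satisfies`, and `best_match` helpers are hypothetical, version strings are assumed to be plain MAJOR.MINOR.PATCH, and the range accepts anything at or above the minimum within the same major version.

```python
def parse(version):
    """Turn '1.4.2' into (1, 4, 2). Assumes plain MAJOR.MINOR.PATCH strings."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies(candidate, minimum):
    """True if candidate is at least minimum and shares its major version."""
    cand, low = parse(candidate), parse(minimum)
    return cand[0] == low[0] and cand >= low


def best_match(available, minimum):
    """Pick the highest available version inside the allowed range, or None."""
    allowed = [v for v in available if satisfies(v, minimum)]
    # Compare numerically; comparing strings would rank "1.4.9" above "1.10.0".
    return max(allowed, key=parse) if allowed else None


available = ["1.4.2", "1.4.9", "1.10.0", "2.0.0"]
print(best_match(available, "1.4.2"))  # prints 1.10.0
```

The new major version, 2.0.0, is left alone until you choose to move to it. Real resolvers also handle pre-release tags and stricter range operators, which this sketch ignores.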

If you have the resources, doing regular scratch builds with the latest versions of the dependencies can help you find issues early. By keeping pace with your dependencies, making an urgent update to address a vulnerability becomes a small step instead of a giant leap.

Simplify, simplify

Simplifying the software supply chain is an important part of reducing security risk. Every line of code is another potential bug or attack vector. When you use only what you need, you avoid unnecessary risks. When you can’t entirely remove a dependency, forking it can be a way of simplifying by turning it into something under your control. The tradeoff is that you are now directly responsible for the security of that code.

The first step to making a simpler supply chain is understanding what your supply chain looks like in the first place. Using a tool like Graph for Understanding Artifact Composition (GUAC) to get a view of the supply chain across all of your applications will help you understand the current state. From there, you can examine each of your dependencies to determine the best approach for keeping them secure in the future.
