Security Concepts for Developers: Dependency Confusion Attacks

State-sponsored hacking teams now conduct military campaigns over the wire. In 2017, NotPetya was unleashed on Ukraine by the Russian advanced persistent threat group known as Sandworm, resulting in over $10 billion in damages.

In March 2022, Brandon Nozaki Miller (aka RIAevangelist) released two open source packages: peacenotwar and oneday-test. To many, Miller’s intentions appeared to be a simple act of protest against the ongoing conflict between Russia and Ukraine. The description, which was broadcast on the screens of those affected, read:

This code serves as a non-destructive example of why controlling your node modules is important. It also serves as a non-violent protest against Russia's aggression that threatens the world right now. This module will add a message of peace on your users' desktops, and it will only do it if it does not already exist just to be polite.

However, some developers using the Vue.js framework did not receive polite messages of protest. Miller, who also maintains the widely used node-ipc library (a dependency of the Vue CLI tooling), purposely sabotaged certain versions of the package. These versions targeted installations in Russia and Belarus by IP address and destroyed files on the host system by overwriting their contents with a heart emoji.

This event highlights how vulnerable developers can be to dependency attacks.

In this post we’ll discuss one of the strategies used to deliver malware-infested packages: the dependency confusion attack. But before we get into that, let’s quickly cover the basics.

What is a Dependency?

A dependency is any external component that a project or application needs to operate correctly. Dependencies are sourced from either public or private registries. Because these components are integrated directly into your projects, threat actors can exploit them to execute code on your device.

This is the second installment of Arcjet’s Security Concepts for Developers series. For more foundational reading on dependencies, see Security Concepts for Developers: Package Hijacking.

How Dependency Confusion Attacks are Executed

When an organization uses a private registry, packages published within it aren’t also pushed to public registries. This separation is intentional, but it means identically named packages can exist in both.

Dependency confusion attacks are carried out by abusing the prioritization mechanism used by some package managers for dependencies when there is such a name collision.

Any package manager that checks both registries for a package can be vulnerable to dependency confusion. If the package exists in both, a vulnerable package manager installs it from whichever source offers the highest version number.

If a package initially exists only in the private registry, its name is still available in the public registry. An attacker can create the name collision by publishing a package with the same name there and giving it a higher version number, which completes the dependency confusion attack.
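The prioritization described above can be sketched in a few lines of Python. This is a simplified model, not the resolver of any real package manager: the `resolve` function, the index names and the `acme-billing` package are all hypothetical, and versions are assumed to be plain dotted integers.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(name: str, indexes: Dict[str, Dict[str, List[str]]]) -> Tuple[str, str]:
    """Return (index_name, version) for the highest version found in any index."""
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any index")
    _, index_name, version = max(candidates)
    return index_name, version


# Hypothetical registries: the internal package and an attacker's public copy.
indexes = {
    "private": {"acme-billing": ["1.0.0", "1.2.0"]},
    "public": {"acme-billing": ["99.0.0"]},  # published by the attacker
}
print(resolve("acme-billing", indexes))  # ('public', '99.0.0')
```

Because the model only compares version numbers, the attacker's copy wins as soon as its version is higher than anything in the private registry.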

How Do Attackers Find the Names of Private Packages?

The names of private packages held in private registries should be difficult to enumerate due to their private nature…right?

On February 9, 2021, security researcher Alex Birsan published a blog post covering a proof-of-concept dependency confusion attack. To identify the names of private packages, Birsan relied on simple Open Source Intelligence (OSINT) gathering tactics.

He showed how internal package names were easily discovered on GitHub, public internet forums, inside JavaScript and package.json files and within internal packages that had accidentally been published publicly.

Python example - CVE-2018-20225

Python’s package installer, pip, includes the --extra-index-url command line argument. This option directs pip to check for the package in both the specified private repository and the public one (PyPI):

pip install <package-name> --extra-index-url <private-repo-url>

When the command above runs, pip does not give either index priority. It collects candidate versions from both the supplied address and the public index. If the package is available in both, pip defaults to the prioritization method discussed earlier: the highest version number wins.

This behavior led to the compromise of PyTorch nightly builds in December 2022, when a malicious package uploaded to PyPI used the same name as PyTorch’s torchtriton dependency.

Although this behavior has been reported to the pip team, it is considered intended functionality, and the responsibility for appropriate use falls on the developer. The issue has been marked CLOSED WONTFIX.
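Since the responsibility sits with the developer, one modest first step is to find out where your own projects configure an extra index. The sketch below is an assumption-laden helper, not part of pip: `find_extra_index_lines` is a hypothetical function that scans the text of a requirements file or pip configuration file for the option, and the registry URL is a placeholder.

```python
from typing import List


def find_extra_index_lines(text: str) -> List[str]:
    """Return non-comment lines that configure an extra package index."""
    flagged = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue
        # Matches `--extra-index-url <url>` in requirements files and
        # `extra-index-url = <url>` in pip configuration files.
        if "extra-index-url" in line:
            flagged.append(line)
    return flagged


# Hypothetical requirements file content.
requirements = """\
# internal registry (placeholder URL)
--extra-index-url https://registry.example.invalid/simple
acme-billing==1.2.0
requests==2.31.0
"""
print(find_extra_index_lines(requirements))
```

Any line it reports is worth a closer look, because a name that exists in both indexes may resolve to the public copy. Pointing `--index-url` at a single registry that you control, so that only one index is consulted, is one commonly suggested alternative.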

Dependency Confusion Attack Risk Mitigation: A Checklist for Developers

Although it may be tempting to leave the responsibility of preventing dependency confusion attacks to the security team, the risk of attack is greatly reduced when everyone involved in the project development cycle is security conscious.

Even if your team uses software composition analysis (SCA) tooling, this is not sufficient to prevent dependency attacks, as these tools only report known vulnerabilities.

To mitigate your risk, there are several manual steps you can take to review your environment. You should know exactly which dependencies your project relies on and what they do, and you should maintain accurate, up-to-date asset inventory records.

Being aware of these manual review steps is vital, but performing them constantly can become tedious. To automate the process, Arcjet recommends using socket.dev. It’s what we use on all our GitHub repos.

With that in mind, proceed with the following checklist:

Verify the Source

Only incorporate packages that are known and reliable. Before adding any package as a dependency, verify its source and maintainer. Popular libraries, while not immune to attack, are more likely to be inspected and reported on because more eyes are on them. This can be as simple as checking the GitHub star count and the number of package downloads (if reported, such as on npm).

Make sure to validate both the signature and the checksum of the installation to prove the ownership and integrity of the package and its updates.
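The checksum half of that advice can be sketched with Python’s standard library. This is a minimal illustration, assuming the publisher provides a SHA-256 digest. The function names are hypothetical, and the temporary file stands in for a downloaded archive. Signature verification is a separate step that is not shown here.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one in constant time."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Demo: a throwaway file stands in for a downloaded package archive.
with tempfile.NamedTemporaryFile(delete=False) as archive:
    archive.write(b"example package contents")
published = hashlib.sha256(b"example package contents").hexdigest()
demo_ok = matches_published_checksum(archive.name, published)
demo_tampered = matches_published_checksum(archive.name, "0" * 64)
os.remove(archive.name)
print(demo_ok, demo_tampered)  # True False
```

A matching checksum shows only that the file is the one the digest was computed for. It says nothing about who published it, which is why the signature check matters as well.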

Additionally, carefully review the spelling and formatting of all scopes and package names included in your project against official documentation. Just like with domain names, typo-squatting package names is common.
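As a rough aid for that review, the sketch below flags dependency names that are suspiciously close to names you already trust. It is a heuristic under stated assumptions: `possible_typosquats` is a hypothetical helper, the trusted list is something you would maintain yourself, and the 0.85 similarity cutoff is an arbitrary starting point.

```python
import difflib
from typing import Dict, List


def possible_typosquats(
    dependencies: List[str], trusted: List[str], cutoff: float = 0.85
) -> Dict[str, List[str]]:
    """Map each unrecognized dependency to the trusted names it closely resembles."""
    suspicious = {}
    for name in dependencies:
        if name in trusted:
            continue  # exact matches are the names we expect
        close = difflib.get_close_matches(name, trusted, n=3, cutoff=cutoff)
        if close:
            suspicious[name] = close
    return suspicious


# Hypothetical inputs: "reqeusts" is a transposition of "requests".
trusted_names = ["requests", "numpy", "flask"]
declared = ["requests", "reqeusts", "flask"]
print(possible_typosquats(declared, trusted_names))
```

A hit is only a prompt to check the name against official documentation. Legitimate packages can have similar names, and a determined attacker can choose a name the cutoff misses.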

Actively research any packages you use. Community discussions, reported issues and general searches can provide valuable insight into the current state of a dependency. If available, subscribe to the security advisory mailing lists of the dependencies you use, and at a minimum use a dependency update tool like Dependabot.

In cases where the dependency resolves to a remote URL, check the spelling and formatting of the URL and ensure the dependency is fetched over HTTPS. Encryption provided by a secure connection ensures data integrity and defends against man-in-the-middle attacks.
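The HTTPS part of this check is easy to automate. The sketch below is a minimal example, assuming you have already collected the dependency URLs into a list. `non_https_urls` is a hypothetical helper and the example URLs are placeholders.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def non_https_urls(urls: Iterable[str]) -> List[str]:
    """Return the URLs whose transport is anything other than HTTPS."""
    flagged = []
    for url in urls:
        scheme = urlparse(url).scheme.lower()
        # pip-style VCS URLs look like "git+https://..."; keep the transport part.
        transport = scheme.rsplit("+", 1)[-1]
        if transport != "https":
            flagged.append(url)
    return flagged


# Placeholder URLs for illustration only.
dependency_urls = [
    "https://example.invalid/pkg-1.0.0.tgz",
    "http://example.invalid/pkg-1.0.0.tgz",
    "git+https://example.invalid/org/repo.git",
    "git://example.invalid/org/repo.git",
]
print(non_https_urls(dependency_urls))
```

The helper deliberately flags everything that is not HTTPS, so other transports are reported too and left for you to judge.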

Question the Maintenance

View the release history of the dependency to ensure trustworthy entities actively maintain it. Dependencies developed by large organizations in good standing are more likely to be secure. Consider how often updates are released and how active the commit history is. Are issues resolved in a reasonable time?

To avoid malicious packages that were added to a registry via expired domain abuse, check the list of package maintainers to enumerate their email addresses. Query each email domain against a WHOIS provider or domain registrar. If the domain status returns as expired or available, remove the package as a dependency and contact the official maintainers to notify them of the situation. Tools such as npm_domain_check by JFrog have been developed to automate this process.
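The first half of that process, turning a maintainer list into a set of domains to look up, might be sketched as follows. `maintainer_email_domains` is a hypothetical helper and the addresses are made up. The WHOIS or registrar query itself is left out because it depends on the provider you use.

```python
from typing import Iterable, Set


def maintainer_email_domains(emails: Iterable[str]) -> Set[str]:
    """Collect the unique, lowercased domains from maintainer email addresses."""
    domains = set()
    for email in emails:
        _, separator, domain = email.strip().rpartition("@")
        if separator and domain:  # skip entries that are not email addresses
            domains.add(domain.lower())
    return domains


# Made-up maintainer addresses for illustration.
maintainers = ["Alice@Example.invalid", "bob@example.invalid", "not-an-email"]
print(maintainer_email_domains(maintainers))  # {'example.invalid'}
```

Each resulting domain can then be checked by hand or passed to whatever lookup tooling you trust.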

Research the vulnerability history of a dependency. Analysis of past vulnerabilities can give you an idea of how responsive maintainers are to security concerns. Thorough analysis can also give insight into vulnerabilities that have been persistent, even after patches were released. This can hint toward the level of security awareness behind the dependency.

Changes in ownership or maintenance should be investigated as changes can lead to unfamiliarity with the code base, inconsistent quality and security risks.

If the dependency is not being actively maintained by a trusted author, consider forking essential packages and maintaining them yourself. This requires a lot of effort, so it is likely only feasible for a large engineering organization. It’s something Google does, for instance. Keeping a copy of the source code is called vendoring, which is similar, but not exactly the same.

Evaluate Necessity

Regularly audit the list of members that have access to private packages. Remove any members that no longer require access.

Remove any dependencies that are no longer required for your project to work properly. These may include ones that were only used for development and are not necessary in a production environment. If a package contains fewer than 35 lines of code, it is considered a trivial package. Instead of importing it, integrate (vendor) the code directly into your project, with the appropriate license comments. These measures will reduce the supply chain attack surface of your project.

Access to privileged areas or components such as the file system, network, environment variables and shell should be thoroughly inspected to ensure the external code only reads, writes and accesses what it claims to. This is a common exfiltration route for things like environment variables and config files. The permission level should be the bare minimum required to serve the functional purpose of the dependency. Evaluating these aspects may reveal that tampering has occurred, especially if the dependency suddenly requires a new, broad set of permissions.

Code Review

Review your code periodically to ensure that the code being executed in your system is known and trusted.

Be suspicious of any obfuscated/minified code - malware can be injected using only a few lines of code. Even if it has no ill intent, it can add unnecessary complexity, making it challenging to maintain and audit.

Native code (code written in low-level languages or compiled to binary format) can present a higher level of risk as it runs directly on the system’s hardware. Evaluate whether the dependency really needs to be written in that language, and look for alternative dependencies written in high-level languages that provide the same functionality.

Version Pin

Version pinning involves defining the exact versions of an application’s dependencies. This ensures that only the specified versions are downloaded - reducing the risk of installing a potentially malicious version with the same name. Package lock files are used to guarantee that exact versions of dependencies are consistently used.

When version pinning, be sure to actively monitor your dependencies for any official updates as these patches may include vulnerability fixes.
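To see how much of a Python project is actually pinned, a small check like the one below can help. It is a sketch with stated limits: `unpinned_requirements` is a hypothetical helper that assumes one requirement per line in a requirements file and no per-requirement options such as hashes. Anything it cannot read as `name==exact.version`, including wildcard versions and URL requirements, is reported for manual review.

```python
import re
from typing import List

# name, optional [extras], "==", an exact version, optional environment marker
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[A-Za-z0-9._+!-]+\s*(;.*)?$"
)


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that are not pinned to one exact version."""
    loose = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as --index-url
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements file content.
sample = """\
requests==2.31.0
flask>=2.0
numpy
django==4.*
uvicorn[standard]==0.30.1 ; python_version >= "3.8"
"""
print(unpinned_requirements(sample))  # ['flask>=2.0', 'numpy', 'django==4.*']
```

A check like this could run in continuous integration, so that a loosely specified dependency is caught before it reaches a build.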

If you are importing a Git dependency, pin a specific commit or tag rather than a branch to ensure stability.
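For pip-style Git requirements, which place the ref after an `@` in the URL, the check can be sketched as below. The helper names and repository URLs are hypothetical. The sketch is also stricter than the advice above: it accepts only a full 40-character commit hash, on the assumption that a tag can later be moved to a different commit while a commit hash cannot.

```python
import re

FULL_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")


def git_ref(requirement_url: str) -> str:
    """Return the ref after '@' in the repository path, or '' if there is none."""
    without_fragment = requirement_url.split("#", 1)[0]  # drop "#egg=..." etc.
    after_scheme = without_fragment.split("://", 1)[-1]
    # Skip the host part so "git@host" in SSH URLs is not mistaken for a ref.
    _, slash, path = after_scheme.partition("/")
    if not slash or "@" not in path:
        return ""
    return path.rsplit("@", 1)[1]


def is_pinned_to_commit(requirement_url: str) -> bool:
    """True only when the URL names a full 40-character commit hash."""
    return bool(FULL_COMMIT_SHA.match(git_ref(requirement_url)))


# Placeholder repository and commit hash for illustration.
commit = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
base = "git+https://example.invalid/org/project.git"
print(is_pinned_to_commit(f"{base}@{commit}#egg=project"))  # True
print(is_pinned_to_commit(f"{base}@main"))                   # False
```

If your team does rely on tags, the regular expression could be relaxed, at the cost of trusting that those tags are never rewritten.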

Stay Current on Platform Advisories and Changelogs

Subscribe to security advisories, notifications and mailing lists of the package managers and version control system platforms that host your packages. Using a tool like Dependabot is the easiest way to do this.

Defending Users from Dependency Confusion Attacks: A Checklist for Developers

To protect those who depend on your packages, proceed with the following checklist:

Remove Public Mentions of Private Package Names

Use tools such as Google’s Advanced Search feature or archive indexes such as the Wayback Machine to identify sources of exposed names. If the source is a public forum, edit or remove the posts if possible, or request removal from a forum administrator.

Remove or restrict access to any repositories that have exposed internal package names. Consider making them private or deleting sensitive information from public ones.

If internal packages have already been accidentally published publicly, unpublish or deprecate them.

If removal is impossible, change the names within your internal registry. Beware that this is essentially security by obscurity i.e. don’t rely on it!

Reserve the Name/Namespace

The best approach to defending against dependency confusion attacks is to claim the same name of your internal package in the default/public registry. As the entire attack depends on the availability of a matching name in the public registry - this proactive measure can be taken to ensure a malicious package won’t take precedence over your intended installation.
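Before reserving names, it helps to know which internal names are still unclaimed. The sketch below assumes the public registry is PyPI and uses its JSON endpoint, which returns a 404 for unknown projects. The function names and the internal package names are hypothetical, and the lookup is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON endpoint knows a project with this name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other errors are not evidence either way


def unclaimed_names(
    internal_names: Iterable[str], exists: Callable[[str], bool] = exists_on_pypi
) -> List[str]:
    """Return the internal names that nobody has registered publicly yet."""
    return [name for name in internal_names if not exists(name)]


# Offline demo with a stand-in lookup; the names are hypothetical.
already_public = {"acme-billing"}
print(unclaimed_names(["acme-billing", "acme-auth"], exists=already_public.__contains__))
# ['acme-auth']
```

Names the check reports as unclaimed are candidates to reserve. A name that already exists publicly deserves its own investigation, since it may not be yours.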

If the registry also supports namespaces (identifying prefixes to package names), reserve one for your organization to provide a second layer of specification, just as we at Arcjet have. If the exact name of your company or desired namespace is already claimed, ensure that any internal code that referenced the claimed namespace is changed to match the one you register.
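To find packages that would benefit from a namespace, you could list the unscoped names in a manifest. The sketch below assumes an npm-style package.json, where scoped names start with `@`. `unscoped_dependencies` and the package names in the example are hypothetical.

```python
import json
from typing import List

DEPENDENCY_SECTIONS = (
    "dependencies",
    "devDependencies",
    "optionalDependencies",
    "peerDependencies",
)


def unscoped_dependencies(package_json_text: str) -> List[str]:
    """Return the sorted dependency names that do not use an @scope prefix."""
    manifest = json.loads(package_json_text)
    names = set()
    for section in DEPENDENCY_SECTIONS:
        names.update(manifest.get(section, {}))
    return sorted(name for name in names if not name.startswith("@"))


# Hypothetical manifest: one scoped and two unscoped names.
manifest_text = json.dumps({
    "dependencies": {"@acme/billing": "1.0.0", "left-pad": "1.3.0"},
    "devDependencies": {"acme-internal-utils": "2.0.0"},
})
print(unscoped_dependencies(manifest_text))  # ['acme-internal-utils', 'left-pad']
```

Most unscoped names will be ordinary public packages. The ones to act on are internal packages that were never moved under your organization’s scope.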

In cases where your organization, for whatever reason, is reluctant to publish a private package name to a public registry, do not use predictable naming conventions internally. Attackers who pick up on a naming pattern can make educated guesses when trying to match names.

Conclusion

Mitigating the risk of a dependency confusion attack requires a multifaceted approach to ensure the integrity and security of your project.

By adopting manual review practices, you can reduce the likelihood of falling victim to these attacks. Educating your teams about the risks and best practices will produce a team of security-conscious developers. As third-party component integrations become more widespread, staying vigilant and proactive in dependency management will continue to grow in importance.