Security Concepts for Developers: Dependency Confusion Attacks
Strategies used to deliver malware infested packages - via a dependency confusion attack - and how to mitigate them.
Insights into how to stay vigilant against malicious actors tampering with your dependencies.
Maintaining an accurate asset ledger in an organization can be challenging. Both major and minor changes can result in an area of ownership being overlooked or completely forgotten. As new team members onboard - the responsibilities and management scopes of previous members can fade from memory.
This oversight creates a risky attack surface, where stagnant dependency packages are particularly vulnerable. Such packages can be exploited by malicious attackers in a package hijacking attack.
In this post, we'll discuss the key concepts behind package hijacking and how library maintainers can improve their security. This is the first in a series of posts covering core security concepts from the perspective of developers.
A dependency is any external component that a project or application requires to function properly.
Dependencies include:
As developers, third-party modules and libraries enable us to take a modular approach, collaborate with others and save the time and effort that would be spent writing their contents from scratch.
A package is the distribution format for libraries. Libraries, metadata and documentation are all bundled together into a single unit, facilitating easier distribution, installation and management of code.
Dependencies are managed by using configuration files (such as requirements.txt in Python and package.json in JavaScript) and package managers (such as Python’s Pip Installs Packages (pip) or Node’s npm).
When installing a dependency using a standard package manager - have you ever questioned the security of the installation? The third-party code has been fully vetted and is safe to use…right?
Generally speaking, a dependency attack is one in which a developer or system is tricked into downloading a malicious package instead of the intended one. These packages are injected with malware granting threat actors access to your application and its data - enabling remote code execution, backdoor installation, service disruption, infection propagation to other systems and/or data exfiltration.
Even more concerning, the compromised dependency usually mirrors the original - continuing to provide the functionality required by the application. This close resemblance may make it hard to notice that malware has been installed, as only small changes in code may have been made.
These attacks can have massive scope, as they affect every project that depends on a compromised package. Examples include the exploitation of AWS CodeArtifact and of globally recognized names such as Apple, Shopify and PayPal.
Public registries are community-driven package repositories that are accessible to anyone.
Private registries are repositories only accessible to authorized users and organizations. The hosted packages are only visible to chosen collaborators that have been given read or read/write access. Usually, authentication tokens/details are configured in a dedicated configuration file to obtain private packages.
Package hijacking attacks result from compromised package registries. Once unauthorized access is achieved, malicious attackers can inject malware into the packages you serve. If your package is popular - this could lead to devastating consequences. Malicious attackers can gain unauthorized access to the registry in various ways:
Stolen or leaked credentials are one of the easiest ways to gain registry access. These stolen credentials, extracted in security breaches, are available to download on certain websites used by hackers. Even credentials obtained in a breach that occurred externally to your organization pose a threat. Members that reuse credentials (or credentials with slight variations) across many accounts may still be a risk.
Malicious attackers may conduct phishing campaigns to trick registry maintainers into disclosing their credentials. Often, these phishing attempts involve impersonating trusted individuals or organizations to increase the likelihood of the target disclosing their login information.
The registry itself may even be vulnerable to exploitation, allowing attackers to use web application hacking techniques such as SQL injection (SQLi) and Cross-Site Scripting (XSS) to steal credentials. Without rate limiting, attackers can also send an unlimited number of login requests and find valid credentials through brute force.
Weak access controls can also result in stolen credentials.
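To make the rate limiting point above concrete, here is a minimal sketch of a sliding-window limiter for login attempts. The class name, the thresholds and the choice of key (username or IP address) are all assumptions for illustration; a real registry would typically enforce this in shared storage rather than in process memory.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Sliding-window limiter: at most `max_attempts` per `window_seconds` per key.

    Hypothetical helper for illustration only; state lives in process memory.
    """

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._attempts = defaultdict(deque)

    def allow(self, key):
        """Return True and record the attempt if `key` is under its limit, else False."""
        now = self.clock()
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

With a limit like this in place, a brute force attack is capped at a handful of guesses per window instead of an unlimited stream of requests.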
If a package's hosting domain expires, an attacker can register the expired domain and then host a malware-infected package under that domain. This is accomplished either through hosting directly on the domain or by recreating expired email addresses that were used to register accounts on an official registry.
Once an attacker has purchased the domain and created the same email used by a maintainer - password reset functionality can be used to gain access to the inactive user account and subsequently any package it maintains.
In 2024, a Chinese company acquired the domain (polyfill.io) which served Polyfill. Polyfill, an open source project that allowed the use of modern JavaScript in older browsers, was then tampered with, redirecting users to adult content and betting websites.
Even though older browsers such as Internet Explorer had already lost popularity - tens of millions of websites still referenced the polyfill.io domain.
A thorough investigation led to the discovery that the same Chinese company also managed three other domains that were all serving malicious code.
A sibling attack vector to package hijacking is repo jacking. Instead of targeting package management systems through registries - repo jacking targets packages sourced from version control systems by exploiting inactive or abandoned repository names through previously trusted URLs.
In November of 2021, GitHub disclosed a vulnerability that was given the moniker “Chainjacking” by the security researchers who discovered it - Alik Koldobsky and Dr. Joakim Kennedy.
Go's build tools do not have a central registry. Instead, Go tooling sources directly from version control systems such as GitHub. User accounts on GitHub are able to change their usernames. When a user changed their name, all traffic to repository URLs under the old name was redirected to the new one. Once a username is abandoned, it becomes available to anyone.
Before name change:
github.com/<old-username>/<repository-name>
After name change:
github.com/<new-username>/<cloned-repository-name>
While the redirect served as a protective measure, it was broken as soon as the new owner of the old account name created a repository with the same name. Anyone still using the old URL could be compromised.
The reissued username also lends credibility to the attacker who inherits it, as users unaware of the legitimate owner's name change may not check who is now maintaining the repository.
To mitigate this vulnerability, GitHub enacted a username retirement mechanism: any repository with more than 100 clones at the time of the name switch would be considered “retired”, meaning the URL path formed by the username and repository name couldn’t be used by others.
However, this retirement mechanism was bypassed on four separate occasions: via another renaming vulnerability, by transferring a repository, by restoring deleted repositories, and through a race condition vulnerability.
Although it may be enticing to leave the responsibility of preventing package hijacking attacks to the security team - if all members throughout the project development cycle are security conscious, the risk of attack is greatly reduced.
There are multiple steps you can take to review your environment. We'll run through these below, but you can also automate many of these checks. At Arcjet, we use and recommend socket.dev because it can help reveal issues as you're updating & managing dependencies.
Use data breach monitoring services to be alerted of any compromised credentials. In the event that any members of your organization are included in a breach - ensure the credentials are changed.
Have I Been Pwned is the most popular of these services and is integrated into password managers like 1Password.
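As a rough illustration of how such a check can work without sending the secret anywhere, the sketch below uses the Pwned Passwords range endpoint from Have I Been Pwned: only the first five characters of the password's SHA-1 hash are sent, and the returned list of suffixes is matched locally. The function names and the User-Agent string are assumptions made for this example, and `pwned_count` makes a live network request.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password):
    """Return the (5-char prefix, 35-char suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body, suffix):
    """Find `suffix` in a range response made of 'SUFFIX:COUNT' lines; 0 if absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip() or 0)
    return 0


def pwned_count(password, timeout=10):
    """Query the range endpoint. Only the hash prefix leaves the machine."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return count_in_range(response.read().decode("utf-8"), suffix)
```

A non-zero count means the password has appeared in a known breach and should be changed wherever it is used.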
Enable multi-factor authentication for accessing package registries and use strong passwords. Regularly rotate credentials and manage them with secret management tools. Avoid SMS and use One Time Passwords, or ideally a physical security key.
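For readers curious what a one-time password actually is, here is a minimal sketch of time-based one-time passwords (TOTP, RFC 6238) built on HOTP (RFC 4226) using only the standard library. It assumes the common defaults of HMAC-SHA1 and a 30-second step, and it is meant to show the mechanism rather than replace an authenticator app or a vetted library.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a bytes secret and integer counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks a 4-byte window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, at=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_totp(secret, submitted, at=None):
    """Compare in constant time against the code for the current step only."""
    return hmac.compare_digest(totp(secret, at=at), submitted)
```

Because the code is derived from a shared secret and the clock, an attacker holding only a leaked password cannot produce it. A physical security key goes further by also resisting phishing of the code itself.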
Traditional phishing awareness training may seem like a benefit, but according to a study performed by Google the focus should be on effective incident reporting rather than complete avoidance. Phishing drills don't work.
Using auto renewal is the easiest way to avoid expiring domains. If a domain expires, there is a cooling off period where you can reclaim it. You might be able to re-register if it’s completely expired, but that is increasingly unlikely with automated domain squatting. Check that the contact emails are monitored.
Expiry notices and domain transfer requests could be piped into a ticket system to ensure they are properly triaged. Various services exist to monitor domains and send separate alerts for upcoming expiry dates.
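The expiry check itself is simple once the dates are known. The sketch below assumes you already keep a mapping of domain names to expiry dates (for example, exported from your registrar). The function name, the 60-day window and the example domains in the usage note are illustrative assumptions.

```python
from datetime import date


def domains_needing_attention(expiry_dates, today, warn_within_days=60):
    """Return (domain, days_left) pairs expiring within the window, soonest first.

    Already-expired domains appear with a negative `days_left`.
    """
    flagged = []
    for domain, expires_on in expiry_dates.items():
        days_left = (expires_on - today).days
        if days_left <= warn_within_days:
            flagged.append((domain, days_left))
    return sorted(flagged, key=lambda item: item[1])
```

Calling it with `{"example.com": date(2024, 1, 31)}` and `today=date(2024, 1, 1)` flags `example.com` with 30 days left. The result could then be turned into tickets or alerts.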
If the domain has already been reclaimed by an unknown entity, immediately remove any registry maintainers whose accounts use email addresses on that domain.
Implement a formal release management process where each release is documented and approved by multiple stakeholders. This includes maintaining a changelog and ensuring releases are tracked through a centralized system. Verify any changes, releases or publishes were in fact vetted by this system.
For the Arcjet JS SDK we use Release Please which generates our changelog and creates the GitHub release. It's managed through GitHub Pull Requests so we have a history and audit log. You can find our config here.
Subscribe to security advisories, notifications and mailing lists of the package managers and platforms that host your packages.
This is a classic recommendation that is easier said than done. Try finding a single source for updates for all the packages you use! In reality, this means using a tool like Dependabot for your libraries & dependencies.
Regularly review and update repository references to avoid risks associated with inactive or abandoned repositories. Unless necessary, avoid changing account names.
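One way to review repository references is to check whether a referenced URL now redirects to a different owner or repository name, which suggests a rename that should be reflected in your dependency files. The following is a minimal sketch under that assumption. The helper names are hypothetical, `resolve` makes a live network request, and the owner and repository names in the usage note are placeholders.

```python
import urllib.request
from urllib.parse import urlparse


def repo_path(url):
    """Return 'owner/repo' (lower-cased) from a GitHub-style URL or Go-style import path."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"not an owner/repo reference: {url!r}")
    repo = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
    return f"{parts[0]}/{repo}".lower()


def was_redirected(referenced_url, final_url):
    """True if the URL we depend on resolved to a different owner or repository."""
    return repo_path(referenced_url) != repo_path(final_url)


def resolve(url, timeout=10):
    """Follow HTTP redirects and return the final URL (network call)."""
    full = url if "://" in url else "https://" + url
    with urllib.request.urlopen(full, timeout=timeout) as response:
        return response.geturl()
```

For example, `was_redirected("github.com/old-owner/widget", resolve("github.com/old-owner/widget"))` returning `True` means the reference is stale. Note that the absence of a redirect does not prove the reference is safe, since a repo jacker may already have recreated the repository under the old name.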
Ensure that code is signed during development and validated in both build and production environments to prevent tampering. While checksums alone do not provide authentication - they are useful for verifying data integrity.
Enable vigilant mode in GitHub to flag all unsigned commits as unverified.
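As a small illustration of the checksum point above, the sketch below computes the SHA-256 digest of a downloaded artifact and compares it with an expected value. It assumes the expected digest was obtained from a source you trust. As noted, a matching checksum shows the file was not altered in transit, not who produced it. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with a published hex digest in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A build step could call `matches_expected` for each fetched artifact and fail the build on any mismatch.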
Some specific things to watch out for in code review:
Be suspicious of any obfuscated/minified code - malware can be injected using only a few lines of code. Even if it has no ill intent, it can add unnecessary complexity, making it challenging to maintain and audit.
Native code (code written in low-level languages) can present a higher level of risk as it runs directly on the system’s hardware. Evaluate whether the code really needs to be written in that language, and look for alternative dependencies that provide the same functionality in a high-level language.
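For the obfuscated/minified point above, a crude first-pass filter can help reviewers decide where to look. The sketch below flags lines that are unusually long or have unusually high character entropy. The thresholds are arbitrary assumptions, not established cut-offs, and the heuristic will produce both false positives (legitimate minified bundles, embedded data) and false negatives.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def suspicious_lines(source, max_length=500, min_entropy=5.0, min_entropy_length=80):
    """Return (line_number, reason) pairs for lines worth a closer manual look.

    All three thresholds are illustrative guesses; tune them for your codebase.
    """
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            findings.append((number, "very long line"))
        elif len(line) >= min_entropy_length and shannon_entropy(line) > min_entropy:
            findings.append((number, "high-entropy line"))
    return findings
```

Anything this flags still needs a human to read it. The goal is only to point review time at the parts of a diff that are hardest to audit by eye.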
Mitigating the risk of a package hijack attack requires solid understanding of your attack surface to ensure the integrity and security of your project.
Adopting manual review practices can help reduce the likelihood of falling victim to these attacks, but it's better to automate as much as possible. Use tools like socket.dev, GitHub's Dependabot and vigilant mode to prevent mistakes and save time.
As third-party component integrations become more widespread - staying vigilant and proactive in dependency management will continue to grow in importance.