If you’ve ever used the Python programming language or installed software written in Python, you’ve probably used PyPI before, even if you didn’t realize it at the time.
PyPI is short for the Python Package Index, and it currently hosts just under 300,000 open source add-ons (290,614 of them when we checked [2021-03-07T00:10Z]).
You can download and install any of these modules automatically by simply running a command such as
pip install [nameofpackage], or by letting a software installer fetch the missing components for you.
The complete list includes, to put it bluntly, some peculiar projects, with the first five in alphanumeric order being …
0 0-._.-._.-._.-._.-._.-._.-0 00000a 0.0.1 007
… and the bottom five do their best to be the last on the list:
zzzfs zzzutils zzz-web zzzz zzzZZZzzz
As you probably know, many contemporary programming ecosystems such as Python, Node.js, and Ruby provide huge free public repositories like this, and come with easy-to-use tools to grab all the add-ons you need and install them automatically.
If you suddenly realize that you want to use the Python module called
asteroid, for example, you can just do
pip install asteroid, after which your own Python programs may say
import asteroid, and start using the package.
asteroid is not a clone of Atari’s Asteroids game, nor does it have anything to do with astronomy. It is an audio processing system that claims to be able to split voice recordings with multiple participants into separate channels for each speaker.
The ease with which trusting users can download and install new Python components (and Node.js, and Ruby, etc.) has led to a series of cybercrime attacks on package managers.
Crooks sometimes Trojanize the repository of a legitimate project, usually by guessing or cracking the package owner’s account password, or by helpfully but dishonestly offering to “help” with a project whose original owner no longer has time to look after it.
Once the fake version is uploaded to the genuine repository, users of the now-hacked package are automatically infected as soon as they update to the new version, which works as before, except that it includes hidden malware that crooks can exploit.
Another trick is to create Trojanized public versions of private packages that the attacker knows are being used internally by a software company.
The public version of the package is given a higher version number than the internal version, and if the company has not properly secured its automatic update processes, the attacker may be able to trick the company’s entire development team, or even the organization’s software build system, into updating private code from an untrusted (and malicious) external source.
Cybersecurity researcher Alex Birsan recently earned over $100,000 in bug bounties by supplying external versions of supposedly internal software to dozens of IT giants, including Apple, PayPal, Microsoft and Shopify.
This kind of trick is known as a supply chain attack, for obvious reasons.
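To see why the higher version number matters, here is a deliberately simplified Python sketch of the version-number trick described above. It is not how pip or any real installer resolves packages: the function names, the index labels and the version numbers are all made up for illustration, and it assumes plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a simple dotted version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def naive_pick(candidates):
    """Pick the highest version, regardless of which index offered it."""
    return max(candidates, key=lambda c: parse_version(c["version"]))


def pinned_pick(candidates, trusted_index):
    """Only consider candidates offered by the index we trust for this package."""
    trusted = [c for c in candidates if c["index"] == trusted_index]
    return max(trusted, key=lambda c: parse_version(c["version"]))


# Hypothetical offers for one internal package name, seen by a build system
# that consults both an internal index and a public one.
candidates = [
    {"index": "internal", "version": "1.4.2"},
    {"index": "public", "version": "99.0.0"},  # the attacker's upload
]

print(naive_pick(candidates)["index"])               # the impostor wins
print(pinned_pick(candidates, "internal")["index"])  # the real package wins
```

The point of the sketch is simply that "highest version wins" is only safe if you also control where each package is allowed to come from.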
In a supply chain attack, crooks do not break into your network and install the malware directly.
Instead, they push their malware upstream of you, implanting it into someone else’s network, repository, or distribution mechanism, and wait for the infection to pass down the chain until it reaches you.
A third type of supply chain attack (a rather less sophisticated attack with no guarantee of success, but extremely easy to perform) involves creating a fake package with a deceptive name that hurried users might download and install by mistake.
Much like typosquatting in the world of websites, where crooks register near-miss domain names in the hope that you won’t notice you’re on the wrong site (e.g. by typing
c0mpany instead of
company), package squatters register near-miss or believable package names that they hope you’ll fetch by mistake.
Recent, now-deleted examples that appeared in the Python Package Index last week include:
Fake name       Possible target  Function of real package  Difference
--------------  ---------------  ------------------------  -----------------------
asteroids       asteroid         Audio processing          Plural, not singular
beauitfulsoup4  beautifulsoup4   HTML/XML parsing          Typo (letters swapped)
llvm            llvmpy           LLVM compiler             Suffix left off
winpty          winpy            Windows functions         Extra letter inserted
wwebsite        website          HTML manipulation         Doubled letter at start
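As a rough illustration, near-miss names like the ones listed above can be caught with a simple similarity check before anything is installed. The sketch below uses Python’s standard difflib module; the allow-list, the function name and the 0.75 cutoff are our own assumptions for the example, not anything that PyPI or pip does for you.

```python
import difflib

# Hypothetical allow-list: the packages your team actually intends to use.
KNOWN_PACKAGES = ["asteroid", "beautifulsoup4", "llvmpy", "winpy", "website"]


def lookalike_targets(requested, known=KNOWN_PACKAGES, cutoff=0.75):
    """Return known package names that `requested` closely resembles.

    An exact match returns an empty list, because that name is already trusted.
    """
    if requested in known:
        return []
    return difflib.get_close_matches(requested, known, n=3, cutoff=cutoff)


for name in ["asteroids", "beauitfulsoup4", "llvm", "winpty", "wwebsite"]:
    print(name, "looks like", lookalike_targets(name))
```

A non-empty result doesn’t prove that a package is malicious, only that the name is close enough to a trusted one to deserve a second look.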
Interference considered harmful
As far as we know, none of these bogus packages contained outright malware, or even permanent package code.
However, some of them (if not all – it’s hard to verify now that they’ve been removed) included a Python command that was meant to run when installing the package, rather than when using it.
The command looked like this:
url = "h"+"t"+"t"+"p"+":"+"/"+"/"+[REDACTED IP NUMBER]+"/name?FAKEPACKAGENAME"
requests.get(url, timeout=30)
This is a crude but simple way of doing what is known in the jargon as telemetry: in other words, remotely tracking who downloaded and installed the package.
The code above simply calls home to a remote web server with the name of the installed package in the URL and ignores any data that comes back, if any.
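If you want to spot this sort of install-time phoning home before running anything, one option is to inspect a package’s setup script without executing it. Here is a minimal sketch using Python’s standard ast module; the list of network modules, the function name and the sample script are hypothetical (the IP number in the sample comes from a range reserved for documentation), and a determined crook could easily evade a check this simple.

```python
import ast

# Assumed list of modules that suggest a script may make network connections.
NETWORK_MODULES = {"requests", "urllib", "urllib3", "http", "socket"}


def network_imports(source):
    """Return the sorted network-related top-level modules imported by `source`.

    The source is only parsed, never executed.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in NETWORK_MODULES:
                found.add(top)
    return sorted(found)


# Hypothetical setup script that calls home when the package is installed.
sample_setup = '''
from setuptools import setup
import requests
requests.get("http://203.0.113.10/name?examplepkg", timeout=30)
setup(name="examplepkg", version="0.0.1")
'''

print(network_imports(sample_setup))
```

This only catches ordinary import statements, so treat a clean result as "nothing obvious" rather than "safe".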
Presumably the IP number written in the URL above (it’s a Tencent cloud server hosted in Tokyo, Japan, for what it’s worth) is operated by the uploader of the above packages …
… who goes by the unusual and slightly ungrammatical nickname RemindSupplyChainRisks.
Fascinatingly, if needlessly, this user uploaded not only the five bogus libraries listed above but, according to the Wayback Machine, a grand total of 3951 totally bogus PyPI packages.
Oddly enough, many, if not most, of the package names were either incongruous or unlikely to be chosen in error.
We have not been able to work out where or how our mystery RemindSupplyChainRisks user generated the list of fake package names, but perhaps having just a small number of “real” typosquats among a vast sea of fake and even ridiculous names was part of the plan?
Anyway, it looks like RemindSupplyChainRisks subscribes to the idea that a job worth doing (or, as in this case, a job not really worth doing) is worth overdoing.
Fortunately, the Python team has already removed all these offending packages …
… although we couldn’t help but notice that there is already a new fake
beautifulsoup4 impostor in the PyPI database, this time titled
beatufulsoup4, uploaded 03/03/2021.
This one doesn’t contain any code, but it has the this-would-be-wittier-if-it-were-not-wearing-a-bit-thin-by-now project title “You may want to install beautifulsoup4, not beautfulsoup4”, to prove a point that didn’t really need proving again.
What to do?
- Don’t do bogus bulk uploads like this to prove a point. We appreciate the message you’re trying to convey, but it’s already been documented, so you’re just distracting other people who could more usefully be doing something else for the project.
- Don’t choose a PyPI package just because the name looks right. Check that you are really downloading the right module from the right publisher. Even legitimate modules sometimes have names that clash, compete, or cause confusion.
- Don’t inadvertently link internal projects to external repositories. If you’re using Python packages that you haven’t published externally, the one thing you can be sure of is that any external copies of “your” package are impostors, and possibly malware.
- Don’t blindly download package updates into your own development or build systems. Test and review everything you download before approving it for use. Keep in mind that packages usually include update scripts that run when you update, so malware infections can be delivered as part of the update process itself, rather than by the source code of the module that finally gets installed.
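One way to enforce “review first, then approve” is to record a cryptographic hash of each file you have reviewed, and to refuse anything that doesn’t match. The sketch below shows the idea using Python’s standard hashlib module; the function names, the file name and the file contents are hypothetical, and in practice you would more likely rely on your package manager’s own hash-checking features than on hand-rolled code.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_approved(filename: str, data: bytes, approved_hashes: dict) -> bool:
    """Accept a download only if its hash matches the one recorded at review time."""
    expected = approved_hashes.get(filename)
    return expected is not None and sha256_of(data) == expected


# Hypothetical package file, as it looked when someone reviewed it.
reviewed = b"print('hello from examplepkg 1.0.0')\n"
approved_hashes = {"examplepkg-1.0.0.tar.gz": sha256_of(reviewed)}

# The same file name arriving later with different contents is rejected.
tampered = reviewed + b"import requests\n"
print(is_approved("examplepkg-1.0.0.tar.gz", reviewed, approved_hashes))
print(is_approved("examplepkg-1.0.0.tar.gz", tampered, approved_hashes))
```

Files that were never reviewed have no recorded hash, so they are rejected too.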