In short: the Python programming language carries a security flaw that programmers have known about for years. Trellix researchers recently rediscovered the bug, highlighted the risk it poses to hundreds of thousands of software projects, and have created fixes for about 11,000 of them.
As one of the most popular programming languages in the world, Python is both an opportunity and a risk for programs and the open source software supply chain. A case in point: researchers have rediscovered a security flaw that has been sitting in Python for 15 years. The Python developers maintain that the behavior is by design; others disagree and are trying to provide fixes for the affected projects.
First discovered in 2007 and listed as CVE-2007-4559, the vulnerability is in the tarfile module, which Python programs use to read and write tar archives. The issue is a path traversal bug: an archive member whose name contains ".." sequences or an absolute path can be written outside the intended extraction directory. An attacker could exploit this to overwrite arbitrary files on the system, possibly leading to execution of malicious code.
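To make the path traversal concrete, here is a minimal sketch of a containment check applied before extraction. It is not the fix proposed by Trellix or by the Python developers; the helper names `is_within_directory` and `safe_extract` are made up for this illustration, and the sketch takes the conservative step of rejecting every link entry in the archive.

```python
import io
import os
import tarfile


def is_within_directory(base_dir: str, member_name: str) -> bool:
    """Return True if member_name, resolved against base_dir, stays inside it."""
    base = os.path.realpath(base_dir)
    # os.path.join discards base when member_name is absolute, so an entry
    # such as "/etc/passwd" resolves outside base and is rejected below.
    target = os.path.realpath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


def safe_extract(archive: tarfile.TarFile, dest: str) -> None:
    """Extract the archive only if every member passes the checks (hypothetical helper)."""
    for member in archive.getmembers():
        if not is_within_directory(dest, member.name):
            raise ValueError(f"blocked path traversal attempt: {member.name!r}")
        # Assumption for this sketch: refuse symbolic and hard links outright,
        # since a link can point outside dest even when its name looks harmless.
        if member.issym() or member.islnk():
            raise ValueError(f"blocked link entry: {member.name!r}")
    archive.extractall(dest)


def build_demo_archive() -> io.BytesIO:
    """Build an in-memory tar whose single member tries to escape via '..'."""
    buf = io.BytesIO()
    payload = b"demo"
    with tarfile.open(fileobj=buf, mode="w") as tf:
        info = tarfile.TarInfo(name="../evil.txt")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf
```

Calling `safe_extract` on the archive returned by `build_demo_archive()` raises `ValueError` instead of writing `evil.txt` one level above the destination directory.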
Since the initial report 15 years ago, the tarfile vulnerability has not received any patches or fixes, just a warning about the existing risk. To be fair, there have been no reports of attacks exploiting CVE-2007-4559.
However, Trellix recently published a reminder about the flaw. Its researchers said they came across the old tarfile bug while investigating an unrelated vulnerability.
Discussing the issue on the Python bug tracker, the devs once again concluded that CVE-2007-4559 is not a bug: “tarfile.py doesn’t do anything wrong,” they said, and there is “no known or possible practical exploit”. The official Python documentation has been updated yet again with a warning about the possible danger of extracting archives from untrusted sources.
Trellix researchers, however, have a completely different view on the matter: CVE-2007-4559 is indeed a security vulnerability, they said. As proof, they described and demonstrated a simple exploit that uses the flaw in Spyder, a development environment for scientific programming.
Trellix also examined how widespread CVE-2007-4559 is, analyzing both open source and closed source projects. The researchers initially found a vulnerability rate of 61% across 257 different code repositories, a figure that rose to 65% after an automated check. They then analyzed a larger dataset of 588,840 unique repositories hosted on GitHub.
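As a rough idea of what an automated check for this pattern could look like, the sketch below flags `extract` and `extractall` calls in Python source files that import tarfile. It is not Trellix's tool and makes no attempt to confirm that a flagged call is actually exploitable; the function name `find_extract_calls` and the matching heuristic are assumptions made for this illustration.

```python
import ast

# Method names that write archive members to disk.
SUSPECT_METHODS = {"extract", "extractall"}


def imports_tarfile(tree: ast.AST) -> bool:
    """Return True if the module imports tarfile in any form."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name == "tarfile" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom) and node.module == "tarfile":
            return True
    return False


def find_extract_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, method name) for each suspicious call (hypothetical helper).

    Heuristic only: any `.extract(...)` or `.extractall(...)` call in a file
    that imports tarfile is reported, whether or not the receiver really is
    a TarFile object and whether or not the member names are validated.
    """
    tree = ast.parse(source)
    if not imports_tarfile(tree):
        return []
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in SUSPECT_METHODS
        ):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)
```

A heuristic like this produces candidates for manual review rather than confirmed vulnerabilities, which may be one reason a headline rate can shift once results are checked more carefully.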
All things considered, Trellix estimates that more than 350,000 projects could be vulnerable to CVE-2007-4559, many of them used by machine learning tools that help developers complete a project faster. Taking a stand on the matter, the researchers have already created patches for around 11,000 projects, with many more expected to follow in the coming weeks.