A good description of Python venvs and containers

memjr over on the Raspberry Pi forums posted an excellent reply to my rant of a question about current versions of Raspberry Pi’s operating systems forcing venvs and containers down our throats:

You completely miss the point of what Python venvs are for.

  1. The python that comes with the OS is there for the OS, not you. That is why that python distribution is set up with its libraries where the system’s python interpreter can find them. The OS uses python for a lot of stuff. The fact that you too can use it without installing another python interpreter is purely a bonus.

Because it is the system’s python distribution, the system demands that the libraries and versions it requires exist and NOT be messed with by the end user.

That is why system libraries blessed by the OS are installed with APT and NOT pip. APT and PIP are not mutually exclusive. PIP installs things that have not been tested and confirmed to work with the OS. If you install python libraries with apt, you get what is known to work with the system’s requirements. If you install with PIP, you get the latest version of whatever libraries you are installing, and those latest versions can and do break things for a system that does not yet contain the fixes required to use them.

Therefore, the change was made. Thou shalt not install python stuff over the system’s distribution with PIP, because PIP is NOT the system’s package manager; apt is. And the OS maintainers can control what versions of the libraries are installed with apt.
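For a concrete sense of the split, here is a sketch assuming a Bookworm-era Raspberry Pi OS (`python3-requests` is just an example package; the exact error wording may differ):

```bash
# The OS-blessed way: apt installs a version tested against the system.
sudo apt install python3-requests

# The refused way: a system-wide pip install on Bookworm now fails with
# something along the lines of:
sudo pip install requests
#   error: externally-managed-environment
#   × This environment is externally managed
#   ╰─> To install Python packages system-wide, try apt install python3-xyz ...
```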

Notice that so far this has nothing, NOTHING, to do with venvs.

It has everything to do with providing a safety net so users cannot royally f up their system by running sudo pip. You still can if you turn off that safety net (plenty of posts on how to do that, both here and elsewhere on the internet). The caveat is: if you choose to do that and something breaks, you’re on the hook to fix it. The OS maintainers gave you a very clear warning NOT to screw with it and you did; it’s not their fault, figure it out, they’re too busy working on their own OS fixes.
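For reference, the usual escape hatches look something like this (a sketch: the marker path assumes Bookworm’s python 3.11, and `some-package` is a placeholder):

```bash
# One-off override (pip 23+): you accept the breakage risk for this command only.
sudo pip install --break-system-packages some-package

# Permanent override: delete the marker file that enables the safety net.
# On Bookworm it lives under the system interpreter's lib directory:
sudo rm /usr/lib/python3.11/EXTERNALLY-MANAGED
```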

In the real world out there, systems have multiple python distributions installed: the system one and the ones that are usually installed in /opt. The sysadmin then sets things up so that the system’s python is not found/used by end users, and sets a global PYTHONPATH so that the defaults use the non-system distribution.

So install your own python distribution and leave the system’s alone. Install your libraries in the user context; those you can upgrade to your heart’s desire. And you can still use libraries already installed in the system if they’re on the path.
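As a sketch of that setup (the `/opt/python3.12` path is hypothetical; the point is that your interpreter shadows the system one, and `--user` installs land under your home directory):

```bash
# Put your own interpreter first on PATH so the system python stays hidden.
export PATH=/opt/python3.12/bin:$PATH

# Installs go into ~/.local/... under your own interpreter, not the system tree.
python3 -m pip install --user requests
python3 -c "import requests; print(requests.__version__)"
```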

“That’s ridiculous! Why do I need another install of Python!!!”

And you don’t have to, but you should. Need more proof? If your system comes with python 3.7 and you upgrade it in place to 3.9 or something else, be prepared to end up with a very broken system. Search here and you’ll find plenty of people asking for help after destroying their system by upgrading the system’s python distribution.

Just because you’ve always been able to use a system that was less than optimally configured, it does not mean that you should, or that the OS maintainers could not fix that. And fix it they did, whether you like it or not. It’s their OS, not yours; you do not need to use it. Feel free to use another OS distribution that lets you do whatever you want.

So what’s one to do?

  1. Learn to use the system’s python without disturbing it. Install your own python distribution, RTFM, and configure it correctly, which more often than not includes hiding the system’s python and setting up PYTHONPATH as needed.

  2. Use a venv, which was never intended to fix any of the above, but does so very well (a minimal walk-through follows this list).
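The walk-through, as a minimal sketch (the path and the flask package are just examples):

```bash
# Create the venv; the interpreter is borrowed, the site-packages are private.
python3 -m venv ~/venvs/myproject

# Activate it; pip and python now refer to the venv's copies.
source ~/venvs/myproject/bin/activate
pip install flask            # lands inside ~/venvs/myproject, nowhere else
python -c "import flask; print(flask.__version__)"

deactivate                   # back to the untouched system environment
```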

A venv is not a FAD. Python venvs have been a thing for a LOOOOOOOOOOOOOOOOOOOOONG time. What has happened is that more and more people are figuring out that python projects require certain things, and those are very hard to make happen on a system that runs multiple python projects at the same time, when some of them use the same libraries but REQUIRE different versions of them.

This is hardly new. It was a thing in Unix and Linux too, but most people did not do admin in *nix; they had their own PC, where MS always gave them full admin by default. Back in the day, if you installed anything in Windows, you’d end up with DLLs in the folders where you installed the software. Then you installed another piece of software and you’d end up with copies of the same DLLs in its folders too. Install more software and you’d end up with even more copies of those DLLs.

Back then, people liked to install every piece of software they could copy/pirate from someone on their systems, just because they thought they’d one day use it.

Storage was expensive. Drives were small. Systems had limitations. For the most part, people had no idea how to add a new drive to their system, and those who could do it, if they could afford it, would need a new case, new disk controllers, etc.

So what was done was to install DLLs in a central system location. That did not go well.

You installed Office XYZ and it put adsf.dll v1.0 in /windows/system.

Then you installed AutoCAD and it also needed to put adsf.dll in /windows/system, but it was written against v2.0, and stuff that ran with v1.0 would not run with v2.0.

The more software you installed, the more software you’d break. This became known as DLL Hell.

You had to either wait for MS to release a new version of Office that used v2.0 of the same dll, or not have Office and AutoCAD installed at the same time.

Eventually things went back to installing DLLs in the software’s own folder, along with changes to how the OS worked so that it could manage the system’s DLLs, many of which were not part of the OS proper but were now distributed with it. This is when Windows started to become super bloated: whether you needed stuff on your PC or not, it was installed there, because otherwise installing a piece of software that needed it would fail.

Now does “keeping the DLLs a piece of software needs in the folder it is installed in” sound familiar? Yes. Yes it does. It’s what a python virtual environment does.

If you install something that uses Flask with sudo, it will install all its dependencies where the system’s libraries are installed. If one of Flask’s dependencies is xyzlib v1.2 and the system also does something that uses the same xyzlib but requires v1.1, congratulations, you just broke your system.

Or you install Flask and don’t break the system, but then you install something else that uses the same libraries but requires different versions of them: you just broke your Flask install.

So you use a VENV. Whether you use the system’s interpreter or an interpreter you installed that runs side by side with it, with a VENV you keep all of the libraries your software needs inside the VENV, and it does not break anything already installed outside of it.

Now you can install a ton of software, each with its own version requirements, and none of them ever break another.
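To see that in action, here is a sketch with two venvs pinning different versions of the same library (the flask versions are illustrative):

```bash
# Two projects, same library, different pinned versions -- no conflict:
python3 -m venv ~/venvs/proj-a && ~/venvs/proj-a/bin/pip install "flask==2.3.3"
python3 -m venv ~/venvs/proj-b && ~/venvs/proj-b/bin/pip install "flask==3.0.0"

~/venvs/proj-a/bin/python -c "import flask; print(flask.__version__)"  # 2.3.3
~/venvs/proj-b/bin/python -c "import flask; print(flask.__version__)"  # 3.0.0
```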

When bookworm was being worked on, long before it was released, its maintainers and the python maintainers were looking for a way to fix two things:

  1. Not breaking the system when user software was installed.
  2. Not breaking user software when the OS was updated.

VENVs had existed for a long time and were proven to work very, very well, so the VENV was the answer. Especially because with a VENV you can install your own libraries in isolation while still having the ability to use stuff already installed in the system (like gpio), if you created the venv with the switch that tells it to add the system’s paths to what the interpreter can see. On top of that, if a library you wanted had been upgraded upstream but was not yet available in the system’s distribution via APT, you could just install it in your VENV with PIP. Because python looks in the VENV’s libs before looking in the system, if it finds the new one there, it ignores the old one in the system.
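The switch in question is `--system-site-packages`; a sketch of both behaviors (`some-library` is a placeholder name):

```bash
# A venv that can still see apt-installed system libraries (gpio and friends):
python3 -m venv --system-site-packages ~/venvs/robot
source ~/venvs/robot/bin/activate

# Need a newer version than apt currently ships? Install it in the venv;
# the venv's site-packages sit earlier on sys.path, so it shadows the
# system copy without touching it.
pip install --upgrade some-library
```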

Those who do not understand what VENVs are and how they work, and refuse to do a little reading (because it’s really simple stuff) or simply don’t want to bother learning about them, just bitch and moan about them over and over and over.

The reason you think venvs are a fad is that you never had to use them before and can’t possibly think why they would ever be needed, and now someone is forcing you to use them. More and more people are catching up to them, what they are, how they work, and having to use them. So everyone will tell you “use a venv”, and rightly so.

It is not just Debian. Other distros are jumping on board with the whole “leave the system python alone” thing. And rightly so.

Right now you CAN turn off that restriction in bookworm, install stuff with sudo pip, and break your system if that’s what ends up happening. But the day when you will no longer be able to do that is coming. So get used to it now.

Containers.

Containers are a whole different story, though in the end they serve a similar purpose. The point of a container is to prepackage software in a manner that is redistributable without having to make any changes at all to the system where that software is going to run.

That is IT.

It is super convenient, especially when it comes to developing, testing, and deploying software in a rapid manner, over and over and over.

In the old days you’d have to install software, configure the system, and test. If installing/configuring broke things, you’d have to rebuild your entire system before you could try again.

Enter VMs. VMs gave you the ability to run a computer as code, thus automating the whole process. You no longer had to fix your system before getting it to run, or after a QA run that required a fresh install for another round of QA. You pre-created a VM image and just spun instances up. When you were done, you’d throw them away, spin another one up, deploy, test, etc.

But you had the overhead of running an entire OS on top of another OS on the bare metal.

Enter containers. Just like python venvs, containers are NOTHING MORE than isolating whatever you install in the container to a very specific and secured set of folders in the host OS. Yes, everything in a container is just stuff that gets copied to a folder in /opt/docker/xxxxxxxxxxx or wherever else they are putting it today.

They don’t run an OS on top of another OS. Other than the stuff you install in the container, they use the host’s kernel, drivers, and everything else. And if you need a newer version of something than the one on the base system, you install it in the container, and now the container uses that one instead of overwriting the one used by the base OS, which could lead to the OS not working anymore.

The big plus of containers is that YOU DON’T HAVE TO INSTALL ANY SOFTWARE on the system at all in order to run it and upgrade it over and over. The software has already been installed in the container; you just put the container on the system and spin it up. Furthermore, containers give you the ability to run multiple instances of that software on the same system, isolated from each other, without having to install it again for each instance you want to run on your behemoth hardware. All the installation has already been done, fixed, tested, and troubleshot by the people who wrote the software and got it working in the container before you even thought of downloading and running it.
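As a sketch, using the stock `nginx` image purely as a stand-in:

```bash
# Nothing is installed on the host; the software ships inside the image.
docker run -d --name web1 -p 8081:80 nginx
docker run -d --name web2 -p 8082:80 nginx   # second isolated instance

docker ps                  # both running from the same untouched image
docker rm -f web1 web2     # throw them away; the host is unchanged
```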

Containers are not fads. Not only do they make the entire internet go around, they have existed for a very, very LOOOOOOOOOOOOOOOOOOONG time too, just like venvs have.

People just seem to have found a very good use for them relatively recently, and they solved a very big problem, which has allowed anything that uses computers or the internet to grow at an exponential rate.

With both venvs and containers, you can install and run software in a fraction of the time and cost it used to take.

You need not jump on board if you don’t need or have to use them. But if you can make good use of them and you choose not to, then you’re a fool <== I’m not trying to be offensive here, just trying to convey the severity of the idea; I hope you understand that.

If ice cream were a piece of software, with a container you’d take it from the freezer and eat it. Without a container you’d need to go to the store, buy the ingredients, and follow the instructions to make it, hopefully without having to install an extra power outlet in your kitchen to run the mixer that makes the ice cream mix before you can finally put it in the freezer. And if you did everything right, and only if you did everything right, you’d finally be able to open the freezer and eat the ice cream.

If ice cream were a piece of python software, a venv would let you make tons of flavors of ice cream in the same kitchen without your strawberry ice cream getting tainted, or worse, ruined, by you making chocolate ice cream right next to it.

Install strawberry ice cream and chocolate ice cream in a container, each with its own venv, and you can tweak your ice cream flavors independently without changing the other. The end user can download your container, throw a party, and know that if you change how you whip the cream for the strawberry flavor, it will not change what they expect to find when they try the chocolate.


Bottom line to me: Ok Boomer.

My take: Software was a configuration nightmare and still is in episode 42.

If we are not fluent in apt, pip3, venv, and Docker, and have not completed “Introduction to Python 3.9+” and “Advanced Python Programming” (and “Introduction to Docker” and “Advanced Docker Concepts”), we should not have sudo privileges and should only be allowed to use systems managed by the software gods. (And perhaps we should also be fluent in sed, awk, Bash, SSL, systemctl, and three different window managers.)

If the software gods deem our computer (robot) no longer a future business unicorn, we are on our own, “without a paddle”. It is too much to learn in isolation; these days one has to belong to a group going in the same direction.

I bought the latest PiCamera and the latest Pi computer (Pi5) and want to run a program abandoned last year on a 7-year-old robot? Yeah, I’m a fool. Try to use a 2-year-old book I bought on programming my robot? Yeah, I’m a sucker. The OS is no longer available and the ROS version was EOL’d.

(Hope you bought that god a coffee. Very good write-up.)


There’s not even a “like” feature there, let alone coffee.

Frankly, venvs or not, there’s one thing that totally infuriates me about Python: it’s too much of a moving target.


As a programmer who has attempted to stay relevant for the last 56 years, I can say that all of computing has been a moving target, with the same pressure to “keep up” on Python as on C++, Java, and JavaScript. (Ada and Fortran have also needed periodic major enhancement.)

Even the GoPiGo API had a massive overhaul in 2017 that required me to sign up for a Python class to learn Python Classes.

What “totally infuriates me” is the diversity, nigh on to Brownian motion, in networking, in operating systems, and in processors, which makes the computing domain too large to keep up with in its entirety. The Python virtual environment feature (and the virtual machine → now the Docker container) is a result of trying to isolate that chaos outside the programming language.

Robotics programs in any language have no such mechanisms to isolate the program from the chaos, so we are forced to re-invent with every major OS release and every major processor release.

What you are feeling as rapid Python changes is in part due to the loss of our GoPiGo3 programming “god”, @RobertLucian (Chiriac), guiding us. We are ships without rudders.


He replied with some answers to my questions and offered the following link about Python venvs:

(I had asked for a good treatment of venvs that didn’t read like a MIL-SPEC or something from a German peer-reviewed chemistry journal.)

This is what he suggested:

So far, looks good.


The old instructions for setting up the GoPiGo3 API in a virtual environment supposedly set up the API without the desktop tools. Perhaps it works on the current PiOS Legacy 32-bit; I have not paid enough attention to virtual environments, not having been officially inducted into the “Python Programming Practice” community.
