Tag: opensolaris

Linux vs. Solaris packaging: it’s a philosophical thing

I thought this was a post worth making because this was the hangup that kept me, as an eight-year Linux user, from really getting Solaris.

One of the biggest questions I see repeated all across the Internet is, “why can’t Solaris’s package management be more like Linux’s?” Criticisms abound both of Solaris’s SysV packaging format and of the way that Solaris packages have to be installed. Solaris’s opponents claim that the Linux packaging system is far superior, that Solaris’s is stuck in the 20th century, and that Solaris has to adapt or die. OpenSolaris introduced the Image Packaging System (IPS) as part of Project Indiana, led by Ian Murdock, the founder of the Debian project, largely to bring many of Solaris’s detractors back into the fold by providing another way of doing things. But how much difference does it make in the long run for Solaris as a platform?

Many of the questions and doubts about the Solaris packaging model stem from a very Linux-centric way of thinking. What I would like to explain is why the impedance mismatch between Linux and Solaris packaging is not so much a technological divide as it is a philosophical one.

I’m going to start by explaining how FreeBSD does things, because I think it fits neatly right in the middle of the Linux and the Solaris way of managing installed software.

A FreeBSD installation consists of two discrete layers. The first is the base system, which is a set of system binaries and core services like FTP, NTP, DNS, DHCP and SMTP software. These are considered to be part of the operating system; they are managed by the installer and updated when you update the OS to a new release. The base system is installed under / and /usr, and other programs not part of the operating system should not be installed there.

The second is the third-party application layer, which consists of binary packages and “ports,” which are instructions for how to build an application from source. You might compare it to Gentoo’s Portage system, or maybe to building all of your Red Hat packages from source RPMs. The ports system goes beyond a simple “./configure && make && make install” in that it provides automatic dependency resolution, menu-driven interfaces to common compile options, installation registries and pre/post install/uninstall scripts the same way that a binary package manager would. Packages from the third-party ports/packages system are installed under /usr/local, separate from the base system.

The goal of this system is to keep the two layers as orthogonal as possible, meaning that it limits the surface area where they touch. The base system, for example, contains a copy of OpenSSL. But if you build an application in the ports tree, it will pull in its own copy of OpenSSL that will be used by the programs in /usr/local. The idea is that if you keep the two layers as separate as possible, you can upgrade the underlying system trivially without worrying about all of your third-party dependencies breaking on you. You can also keep your third-party programs from breaking your OS upgrade. And unlike in Linux, where you rely mostly on vendor-supplied libraries, it’s still very easy to install very modern software on a not-very-modern version of your OS.

In Linux, the solution to a major operating system upgrade is to back up your important data files, reformat your partitions and create your system from scratch on the new operating system. This is fine for systems of trivial complexity, but becomes very burdensome when you have an enterprise product like an ERP system or a digital collections manager and you would really, really like to just be able to upgrade the OS without everything breaking on you. One of the obnoxious idiosyncrasies of Linux is that when you go to upgrade, your vendor’s new packages may conflict with something in a third-party RPM you’ve installed. Third-party software can actually break your ability to upgrade the base system because everything shares the same hierarchies and you may encounter a lot of unintended conflicts.
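To make the shared-hierarchy problem concrete, here’s a rough Python sketch of how a package manager might notice that two packages claim the same path. The package names, the paths and the find_conflicts helper are all made up for illustration; this is not how RPM or any real tool implements its conflict checks.

```python
def find_conflicts(manifests):
    """Return {path: sorted package names} for every path claimed by 2+ packages."""
    owners = {}
    for package, paths in manifests.items():
        for path in paths:
            owners.setdefault(path, []).append(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


# Hypothetical manifests: each set lists the paths a package installs.
vendor_openssl = {"/usr/lib/libssl.so.1", "/usr/bin/openssl"}
# A third-party package that drops its own libssl into the shared hierarchy.
thirdparty_app = {"/usr/lib/libssl.so.1", "/usr/bin/myapp"}
# The same application packaged into its own isolated prefix instead.
isolated_app = {"/opt/myorg/lib/libssl.so.1", "/opt/myorg/bin/myapp"}

shared = find_conflicts({"vendor-openssl": vendor_openssl,
                         "thirdparty-app": thirdparty_app})
separate = find_conflicts({"vendor-openssl": vendor_openssl,
                           "isolated-app": isolated_app})

print("shared hierarchy:", shared)
print("separate prefixes:", separate)
```

When everything lives under the same hierarchy, the overlap shows up as a conflict at upgrade time; with a separate prefix there is nothing for the vendor’s packages to collide with.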

Solaris’s packaging system has historically been the SysV package, which provides dependency resolution and many of the other amenities of modern packaging systems, but there was never a simple mechanism for Internet- or network-based delivery. Many organizations NFS-mount a directory full of packages instead. In many ways, it’s closer to Slackware’s idea of packages than to most modern formats like .rpm or .deb. Blastwave was the first community organization to bring Internet-based package management, complete with automatic dependency resolution, to Solaris, but it did so with its own packages, not by touching the base system.

Solaris takes the FreeBSD approach to a more extreme level, partly out of fragmentation and partly out of necessity. Third-party packaging groups like Blastwave and SunFreeware operate independently of one another. Because of this, rather than a /usr vs. /usr/local separation, each Solaris packaging group basically builds its own platform, isolated in its own directory hierarchy. Blastwave uses /opt/csw, SunFreeware uses /usr/sfw, and the old Cool Stack suite of web stack packages (which is now part of the Glassfish Web Stack) resides in /opt/coolstack.

The consequence of this approach is that if you, as an internal packager producing packages for your organization, want to take a piece of software and make a SysV package out of it, you need to build the platform underneath it first. It’s not as simple as writing an Apache package, because you need to rely on your own complex hierarchy of libraries too. When you’re now maintaining 40 packages instead of the 1 you really wanted to build, it becomes simpler to just rely on rsync from a reference system instead. And if you’re running OpenSolaris in production (and there are lots of perfectly valid reasons to do so), you probably don’t want to rely too heavily on vendor-supplied packages because the distribution is a moving target that changes dramatically every six months.

In many environments, the orthogonal-platform approach isn’t a bad thing. You’re probably dealing heavily with change control in the enterprise anyway, and it’s nice to not have to worry quite as much about a Solaris patch bringing down critical system services. Visible Ops teaches us that the most highly-available IT organizations patch far less frequently and rely more on good release management processes and on testing updates as a group. In a highly change-controlled environment, you’re essentially going to be building your own distribution, whether that involves rsyncing out Solaris binaries or manually creating well-tested update channels in a Red Hat Network Satellite server. And as with FreeBSD, when you need to perform a major OS upgrade on a highly complex system, it dramatically reduces the chances that something is going to break as a result of the vendor’s updates.

In many other scenarios, it is a bad thing. Many server configurations are very simple (LAMP stacks or Mailman servers, for instance), and you don’t need to put the same effort into maintaining them that you would into an ERP or CRM system, a single sign-on portal or other important enterprise services. If the system breaks horribly, it can be rebuilt very easily. For the majority of organizations, most systems are like this, and the ability to very quickly bootstrap a system with needed services is still a big draw to the enterprise consumer. And from a security perspective, keeping four different copies of a library on your system, each used by different programs, means that there are four times as many security updates to make, and four times as many chances to let something slip through the cracks. Often it means several different configurations to maintain. For this reason, many organizations ignore Blastwave entirely. (Lots of others spurn third-party packages altogether out of security concerns, quite understandably.)
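Here’s a small Python sketch of the “four copies” problem: given a listing of installed files, it groups shared libraries by name so you can see how many copies each security fix has to reach. The file listing is invented for illustration (including the /opt/myorg prefix), and the library_copies helper is my own, not part of any Solaris tool.

```python
import posixpath
from collections import defaultdict


def library_copies(installed_paths):
    """Group shared-library paths by library name (the text before '.so')."""
    copies = defaultdict(list)
    for path in installed_paths:
        filename = posixpath.basename(path)
        if ".so" in filename:
            name = filename.split(".so", 1)[0]
            copies[name].append(path)
    return {name: sorted(paths) for name, paths in copies.items()}


# Hypothetical listing: one libssl per isolated hierarchy, plus a single libz.
installed = [
    "/usr/lib/libssl.so.0.9.7",
    "/usr/sfw/lib/libssl.so.0.9.7",
    "/opt/csw/lib/libssl.so.0.9.8",
    "/opt/myorg/lib/libssl.so.0.9.8",
    "/opt/csw/lib/libz.so.1",
]

for name, paths in sorted(library_copies(installed).items()):
    print(f"{name}: {len(paths)} copies to patch")
```

On a real system you would feed this from your package database or a filesystem walk rather than a hard-coded list.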

Linux attempts to create an all-inclusive platform where all software is on the same playing field, so to speak. Third-party packages rely on system libraries in the same way that the vendor’s packages do, for better or for worse, and everything benefits from (or breaks from) updates to system packages. For minor updates, this is a great thing. For major updates, this prevents the majority of systems with sufficiently complex configurations from ever being able to perform an in-place upgrade. The downside is mitigated a little bit by the fact that the package management system makes it quite a bit easier to get the new system up and running again.

But what makes Linux special among these three approaches is that there’s absolutely nothing keeping you from designing your own isolated platform using your own dependencies, just like you would on BSD or Solaris. BSD and Solaris try to enforce this separation, while Linux gives you enough rope to hang yourself with if you’re so inclined.

There’s perfectly valid reasoning for all of these approaches, and I don’t think it’s a bad thing that administrators are able to pick which platform to use based on the situation. It’s important to remember that Solaris isn’t lagging in the 20th century — it’s just a grizzled war veteran who understands the realities of enterprise IT administration.

ZFS Inline Deduplication

Those of you who have been following the lists, the bug trackers or Planet OpenSolaris know this already, but for the rest of you, Sun’s ZFS filesystem has just seen inline dedup support merged into OpenSolaris trunk, presumably to appear in the next major OS release.

Jeff Bonwick has, as always, a very detailed blog entry about it, but here’s the only part you really need to know:

If you have a storage pool named ‘tank’ and you want to use dedup, just type this:

zfs set dedup=on tank

That’s it.

Just as simple as you would have imagined, given how easy everything else is in ZFS and OpenSolaris.

ZFS’s implementation is pretty neat. The filesystem was already well suited to deduplication because ZFS has always kept end-to-end checksums of its data to ensure integrity. Now those checksums also serve to identify blocks that are already stored.
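If you’re wondering how a checksum turns into deduplication, here’s a toy Python sketch of the general idea: key each block by its checksum and store it only once. This is an illustration under my own assumptions (an in-memory store, SHA-256 as the checksum, a tiny block size, a made-up DedupStore class), not ZFS’s actual implementation.

```python
import hashlib


class DedupStore:
    """Toy block store: identical blocks are stored once, keyed by checksum."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}     # checksum -> block data, stored once
        self.refcounts = {}  # checksum -> number of references to that block

    def write(self, data):
        """Split data into blocks and return the checksums that point at them."""
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            checksum = hashlib.sha256(block).hexdigest()
            if checksum not in self.blocks:
                self.blocks[checksum] = block  # first time we see this content
            self.refcounts[checksum] = self.refcounts.get(checksum, 0) + 1
            pointers.append(checksum)
        return pointers

    def read(self, pointers):
        """Reassemble the original data from a list of block checksums."""
        return b"".join(self.blocks[c] for c in pointers)


store = DedupStore(block_size=4)
first = store.write(b"AAAABBBBAAAA")   # blocks: AAAA, BBBB, AAAA
second = store.write(b"AAAACCCC")      # blocks: AAAA, CCCC

print("blocks written:", len(first) + len(second))
print("blocks stored: ", len(store.blocks))
```

Five blocks get written but only three distinct ones are kept, and both writes can still be read back in full from their checksum lists.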

© 2019 @jgoldschrafe
