New job!

The three of you who have been following this blog for a while have probably noticed that around February of this year, the number of topics I’ve blogged about has dropped pretty significantly. That’s because I left my jack-of-all-trades systems engineer job to take a position as a systems integration lead with Time Inc., a position dealing primarily with the difficult tasks of systems automation and configuration management.

While I love the job, and have a great deal of fondness for the people I work with, I do have to say that the amount of new technology I’ve gotten exposure to has been fairly limited. Though I’ve learned a lot about how a really well-oiled machine runs things, most of my technical posts have been about the somewhat generic subjects of Puppet and Linux, and they haven’t been as varied in scope or dimension as I’d really like.

However, I’ve just accepted a position as Systems and Storage Manager for Cold Spring Harbor Laboratory, where I expect to spend a lot more time working with the open-source community and with a team to develop clever solutions to the problems faced by many cash-constrained IT organizations. Given that our mission is to better the world through scientific research, what better place to contribute to open source?

(CSHL has a long and storied history of contributions to open-source software, likely dating back even further than Lincoln Stein’s still-used CGI.pm CPAN module.)

Expect a lot more good stuff on this blog towards the end of the year as I get to finish up several things I never thought I’d have a chance to (IBM SAN data recovery, I’m looking at you).

Repo updates aplenty

I’ve just pushed a pile of important updates into the holyhandgrenade repo. Here’s a quick rundown of the most important changes:

-thirdparty repo

I’ve added another repository, holyhandgrenade-thirdparty, in which I redistribute rebuilds from other people’s SRPMs in an attempt to cut down on the number of unnecessary dependencies. In particular, I’m trying to kill all the dependencies on the RBEL repo, which I’ve become increasingly unhappy with on account of them doing everything totally differently from Fedora upstream (shit, there are unmodified rebuilds of openSUSE packages in there!).

I could probably push most of this stuff into the main repo, but I don’t want to seem like I’m taking authorship credit away from some people who really deserve it, like T.C. Hollingsworth who has put a ton of work into his packages in the Node.JS ecosystem. Since he hasn’t provided builds for RHEL 6, I have.

Moving right along.

Node.JS packages

I’ve started keeping a large supply of Node.JS packages supporting Etsy’s statsd and some other endeavors. As it stands, it’s more than enough to support Node.JS standalone, but not enough to daemonize it the de facto standard way (the Forever library). Stay tuned, as this is where I’ll be focusing most of my packaging attention in the next few weeks.

Many of these are in -thirdparty, but a large number that I’m writing will start to make their way into the main -stable and -testing repos.

More statsd and Graphite goodness

I’ve added a bunch of other statsd/Graphite-related packages, specifically:

  • collectd-carbon (collectd Python plugin to export statistics to Graphite)
  • collectd-graphite (collectd Perl plugin to export statistics to Graphite)
  • python-statsd (synchronous statsd client for Python)
  • python-gstatsd (asynchronous/Twisted statsd client/server for Python)

Additionally, the Graphite packages (carbon, whisper, and graphite-web) received significant updates.

A pure-C implementation of statsd will be pushed as soon as I get around to checking it out.

RHEL/CentOS init scripts for Carbon

As part of the recent set of updates I’m pushing to the holyhandgrenade-testing repo, I pushed some updated Graphite packages which contain three init scripts for Carbon:

  • carbon-aggregator
  • carbon-cache
  • carbon-relay

As before, I’m making a special post to draw search engine attention to these in case they end up being useful for anyone not using my packages. As usual, you can find these scripts on GitHub:

Note: These are specific to my Graphite packages, which means they specify carbon-{aggregator,cache,relay}.py files in /usr/bin instead of /opt/graphite. If you are using the default /opt/graphite hierarchy, you must change the $exec variables in the scripts.

Happy graphing! 

CentOS/RHEL init script for uWSGI

I created this as part of the uWSGI package that I’m publishing later this week, but I thought this might also be useful to people not using the package, so here’s a separate post! Hopefully it saves somebody some work.

This script, inspired by many scripts before it for Mongrel and other app servers, looks through /etc/uwsgi and launches an instance for each .ini/.json/.xml/.yaml/.yml file it finds. It expects the directories /var/log/uwsgi and /var/run/uwsgi to exist.

You can find the script on my GitHub page for the RPM:

https://github.com/jgoldschrafe/rpm-uwsgi/blob/master/SOURCES/uwsgi.init

FHS-compliant Graphite packages for RHEL/CentOS 6

Well, it took me a number of hours of beating on it, but I wrestled Graphite into being FHS-compliant and packaged it up on the holyhandgrenade-testing repo. The packages are largely untested and a bit rougher around the edges than I’d like, but they seem to work.

The current version in the repo is 0.9.7c, as it was much easier to rip apart the version I was already using. I’m hoping to have the latest 0.9.9 version up soon.

Update: The packages in the holyhandgrenade-testing repo are now up to date with version 0.9.9.

As with my other packages, you can track changes to the specs through the GitHub repos:

Note the following changes from the standard distribution:

  • Python libraries, including Django templates, are installed into the standard Python sitelib.
  • Static assets are in /usr/share/graphite-web.
  • Configuration files, including local_settings.py, are in /etc/graphite.

With a tiny bit of love, they could be backported to RHEL 5, but be aware that they require Python 2.6 or higher, so you’ll have to tweak the package name and the %{__python} macro to have it build appropriately.

Introducing the holyhandgrenade yum repo

You’ve probably figured out by now that I’m completely insane. I typically don’t let this leak out and affect other people, but it seems that a chunk of my home lab has found its way onto the Internet. As a result, now I have a yum repo.

It’s just for RHEL6 and derivatives right now, and only on x86_64 (is anyone still using i386?), but I’ll probably start cross-compiling for CentOS 5 if anyone has a need.

Right now, the holyhandgrenade repo contains Ruby Enterprise Edition (existing packages on the Internet don’t build for RHEL6) and all Rubygem prerequisites for Chef, built as RPMs against Ruby Enterprise Edition. RBEL is still needed for the things this repo doesn’t contain (CouchDB, RabbitMQ, etc.).

As a bonus, if you install Chef from this repo, it will actually work. As of this writing, that’s not the case with RBEL. Hooray!

You can install the repo with:

I was in a rush to get this live, so there’s no GPG signing of packages yet. That will happen soon, I promise.

I’ve also created separate GitHub projects for each package. You can view my GitHub page here.

Now I can start work on that Chef tutorial.

Disk Performance, Part 2: RAID Layouts and Stripe Sizing

In Part 1, I discussed how storage performance is typically measured in random IOPS, and talked about how to calculate them for a single spinning disk and a RAID array. Today, I’m going to get into the nitty-gritty of striping in RAID-5 and RAID-6, and discuss how to determine the optimal stripe width for your server configuration.

For a lot of workloads, this will be premature optimization. I’d advise you not to think too hard about your storage subsystem unless you’re actually worried that you will be I/O-constrained. Most of these considerations, implemented appropriately, will cut down on your total number of disk operations, but won’t make things faster on an undersubscribed system, where rotational latency and seek times are probably your only pertinent bottlenecks. It’s a better idea to invest your time elsewhere, like finding ways to make your systems easier to manage.

Also note that this article won’t tell you how to get all the numbers you need to properly size your array (I plan on getting to that in the near future), but I hope to give you an understanding of what to watch out for as well as a starting point for figuring out how to profile your own applications.

Disk Performance, Part 1: How Performance Is Measured

When we as computer users think of disk performance, we usually think about streaming, sequential performance, otherwise known as throughput. Desktop operating systems have trained us to think in this way, because the most prominent display of disk speed that your average person sees is an Explorer or Finder window showing file copy progress — we know that our music collection is being copied at 25 MB per second, for example. This measurement is a good fit for the task, because it gives us the best approximation of how long it will take until the file copy is finished.

In the server world, though, this generally isn’t how disk performance is measured. Servers are shared resources that do much more complicated things with data than typical desktop systems. Most database access is highly random — you pull a record here, a record there, and piece them together in the application. Rows in a MySQL table are usually no more than a couple of kilobytes each, and the rows you need to join together to service one complex query typically live all over the disk. For most other server-side applications, small files are accessed a lot, and large files are accessed infrequently. So instead of throughput, which is measured in bytes/sec, we typically work with a different measurement called IOPS (pronounced i-ops).

IOPS stands for I/O Operations Per Second, and it refers to the average number of random small reads and writes that a disk drive can perform in one second. Let’s start looking at some numbers and calculating something useful.
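
As a taste of the arithmetic, here is a rough Ruby sketch of a common back-of-the-envelope model: one random I/O on a spinning disk costs an average seek plus, on average, half a rotation. The method name and the seek times below are illustrative assumptions, not measured figures for any particular drive.

```ruby
# Back-of-the-envelope estimate of random IOPS for a single spinning disk.
# Assumption: each random I/O costs one average seek plus half a rotation.
def estimated_iops(rpm:, avg_seek_ms:)
  # One full rotation takes 60,000 ms / RPM; on average we wait for half of it.
  rotational_latency_ms = 0.5 * 60_000.0 / rpm
  1000.0 / (avg_seek_ms + rotational_latency_ms)
end

# Seek times here are assumed ballpark values, purely for illustration.
sata_7200 = estimated_iops(rpm: 7_200, avg_seek_ms: 8.5)
sas_15k   = estimated_iops(rpm: 15_000, avg_seek_ms: 3.5)

puts format('7,200 RPM disk:  ~%.0f IOPS', sata_7200)
puts format('15,000 RPM disk: ~%.0f IOPS', sas_15k)
```

Under those assumptions the 7,200 RPM disk lands at roughly 79 IOPS and the 15,000 RPM disk at roughly 182, which is why a drive that streams happily at tens of megabytes per second can still crawl under a random workload.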

Runbooks are stupid and you’re doing them wrong

Well, maybe you are and maybe you aren’t. I have no idea. But if your shop is anything like the majority of IT shops I’ve seen, then this assessment is probably on the money.

The runbook is one of the most pervasively mediocre, poorly thought-out, and badly implemented concepts in the entire IT industry. For those of you who are unfamiliar with the term, the runbook is basically a “how can grandma run this application?” document.

Their use should be very strongly scrutinized.

PSA regarding Puppet template variables and default values

Inside of a Puppet template, you would think that one of these would work to set a default value for something not explicitly defined in your manifest:

It doesn’t. This will clobber the variable, every time.

To understand why this happens, you need to consider two critical pieces of information:

  1. The architecture of Puppet is stupidly complicated, which leads to unexpected behaviors all over the place.
  2. Because Puppet variables are lazy-loaded, the Puppet developers had to reinvent how variables are accessed in the ERb templating system. In their infinite wisdom, they decided to do this by using method_missing and dumping leaky abstractions all over the place.

To summarize, this doesn’t work because there really isn’t a variable named my_var at all. ERb tries to find a symbol called my_var and can’t, because that thing that looks like a variable is really syntactic sugar over something completely different happening under the covers.

The correct way to do this is to force a lookup through the scope object as follows:
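
For the curious, here is a minimal Ruby sketch of the effect. It does not use Puppet itself: FakeTemplateWrapper is a hypothetical stand-in that serves manifest variables through method_missing and exposes a scope method with lookupvar, mimicking the scope.lookupvar call that Puppet templates provide. The template strings and variable values are illustrative.

```ruby
require 'erb'

# Hypothetical stand-in for the object Puppet evaluates templates against.
# This is NOT Puppet's real implementation; it only mimics the relevant trick:
# manifest variables are served by method_missing, not by real Ruby locals.
class FakeTemplateWrapper
  def initialize(vars)
    @vars = vars
  end

  # Mimics scope.lookupvar('name'); in this sketch it returns nil when unset.
  def lookupvar(name)
    @vars[name]
  end

  # In this sketch the wrapper acts as its own scope object.
  def scope
    self
  end

  def method_missing(name, *args)
    key = name.to_s
    return @vars[key] if @vars.key?(key)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @vars.key?(name.to_s) || super
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

# The assignment makes Ruby treat my_var as a brand-new local (initially nil),
# so method_missing is never consulted and the default always wins.
NAIVE  = "<% my_var ||= 'default' %><%= my_var %>"
# Forcing the lookup through the scope object reads the manifest value first.
FORCED = "<% my_var = scope.lookupvar('my_var') || 'default' %><%= my_var %>"

defined_case   = FakeTemplateWrapper.new('my_var' => 'from manifest')
undefined_case = FakeTemplateWrapper.new({})

naive_result    = defined_case.render(NAIVE)
forced_result   = defined_case.render(FORCED)
fallback_result = undefined_case.render(FORCED)

puts naive_result
puts forced_result
puts fallback_result
```

In this sketch the naive template clobbers the manifest value, while the forced lookup keeps it and still falls back when nothing is set. Real Puppet versions differ in what lookupvar returns for an unset variable, so check yours before relying on a plain || fallback.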

The more invested I get in Puppet, the more I want to try out Chef.

© 2019 @jgoldschrafe