Hey, you sass that hoopy Bill Napier? There's a frood who really knows where his towel is.

Borg, Borglets, and Borgmasters. Oh My!

I've been messing with EC2 and cloud computing for some time now. And it's always frustrated me how hard it was to deploy software, versus how easy it was for me to deploy software at work.

Prior to containers like Docker, bringing up a new machine was a chore. If you took good notes the last time you did it, that helped (I never did). But chances are, even with your notes, some stuff had changed since the last time you did it.

Enter Docker and containers. I've talked (briefly) about Docker before, but as a refresher: you build a container, and can simply deploy it on any machine you like (among many other important things). I say simply, and at first I truly believed it. But once I started running it, I found quite a few gnarly points to deal with. Like the fact that when your container crashes, it's up to you to restart it. Or if you want logs, feel free to configure that on the host. So much for easily moving containers from host to host; you still need to take notes on how your machine was originally set up.

At Google, this has been a solved problem for quite some time. Our production infrastructure handles almost all this for you. You build an image that contains the binaries you need, list the dependent images you need, fill out some constraints on how you want it to run, and hand it to the Borgmaster and it takes care of it for you. Now, I wouldn't call this system user friendly. The constraint language is SUPER complicated (and also SUPER flexible), but at least it's documented. Personally, I came up with a config file back in 2009 and have been using it as the basis for each new service I need to bring up.

I got really excited when I attended OSCON last year and learned about Kubernetes (from Google). Being familiar with Borg, I knew immediately what they were talking about. They were rebuilding our internal Borg systems to run on EC2 (and of course, Google Compute Engine).

And now Google has gone ahead and made a research paper available describing it (as it did for other core Google tech like Bigtable and Spanner). If you run an internet stack somewhere, you should go read it. If you find research papers to be a little slow going and want a quick read, check out this blog post from the Kubernetes team describing how lessons learned while developing Borg influenced design decisions in Kubernetes.

One of the reasons I'm uncomfortably excited about Kubernetes is that I trust the people that brought me Borg to get their second system right. I don't have that same trust in the Docker team, as I've seen them punt on issues that I felt they should also be solving, and in general not always go in the direction I felt was right.

So let's be honest. This isn't all for the greater good; Google expects to turn a profit on this. Google offers Google Container Engine, which is a hosted Kubernetes solution, just like Borg. I'm really looking forward to this system getting more mature, so I can cut down on the amount of sysadmin work I need to do to run the small internet services I run.

Code First, or Design First

This post is pretty much a response to this blog post from 18F: “18F Discussion: Should Project Teams Code First Or Design First?” While it pretty much stands alone as well, you might want to go read the first article to put things in context.

Let me put my old guy hat on for a bit. I’ve been doing this a long time and the answer has always been design first. I think that developers today get too tied up with the power of a strong framework. Yes, your framework and your skills are good enough that you can do this in a few hours and have something passable for the client to view. The question is really, should you? And I think (rather strongly) the answer is no.

The underlying problem is that writing code is slow. Drawing pictures on paper (or whiteboard) is fast. If I sketch something out and it’s bad, or has bad parts, I just throw the paper out and start over again. Maybe incorporate good things from the initial draft and leave out the bad things.

I’ve always instilled in my teams the power of the design review. Once you’re done with a design, sit down and talk to the rest of the team about it. They’ll hopefully help you find things that are bad, things where their experience may give you feedback you wouldn’t have seen yourself, etc. And the best part is, the only thing you’re invested in is the documentation. It’s easy to move words around in a document, or move blocks in a diagram.

With code that even works a little bit, there is an impetus. You’ve invested time in getting it to work this well, you’re going to be hesitant to change it, which can keep you closed minded when discussing alternative solutions. Or even worse, the customer sees your demo and latches on to it, thinking it’s only a trivial amount of work to get it to launch.

When I worked at Hillcrest Labs, we had the prototype that wouldn’t die. It was all fake stuff that would only ever run on a PC (we were targeting an embedded platform). It took a lot of work to convince our management that we needed to concentrate on only working on the real product and remove everyone from working on the prototype. And eventually we did, but it was an uphill battle.

We can also talk about the quality of prototype code versus production code. The thinking is “This is just a prototype, I don’t need to handle this error here.” Which means that when it becomes time to do it for real, you either go through all your existing (crappy) code and try to patch it up to production quality, or you start over and all your work was for naught.

So, while it’s an interesting article, I don’t think there is really much of a discussion. Coding first just has too many risks.

Dynamic Queries to your App Engine Data Store

I'm using AppEngine for a project at work. Exactly what it does is not important, but at some point it stores an Event entry into the AppEngine datastore for each thing that we do. I put this in so we could easily show the last 5 events on a webpage, and also because it felt like a good idea at the time.

This tool has been running for about 6 months now, and we've got a bunch of events stored. I need to look at this data to see how long the events are taking, so I can have an idea of what a good event timeout value would be. A simple approach would be to write some Python code in my application to render a custom admin page that could display this information for me. This approach is fraught with issues:

  • More code to write, which means more passes through the write, deploy, test, debug cycle. (The debug datastore doesn't have enough interesting information to do this locally.)
  • I've got over 7000 entries in the datastore. To do this serially is going to be kinda slow. I'm not positive I can do it before the AppEngine request deadline kicks in. Or if I can beat the deadline now, I won't be able to in the future.

So I started looking around and came up with another approach. AppEngine supports a datastore backup feature, and BigQuery can use that data as an input. Bingo. I can now treat my datastore as if it were SQL and write dynamic queries to explore the data that way.

The Nitty Gritty

So there are a few things to set up. First, you need a Google Cloud Storage bucket to store your stuff in, and you need to give your application access to write to it. If you are lucky enough to be using the default bucket associated with your app, then you have nothing to do but look up your bucket name (it's in the cloud console for your App Engine app).

Setup Cloud Storage

Start by going into your AppEngine settings page and go to "Application Settings". Under there we want to find and note your "Service Account Name". This is the Google user your app runs as. Remember it, we'll need it for the next step.

Go into the cloud console for your specific app. If you didn't create one explicitly, Google created one for you. The application id for your app should appear somewhere on this page. Go into it and create a new bucket, and then give the Service Account from above write access to this bucket. (Again, if you're using the default bucket for your app, this is already set up for you.)

Do the backup

Find the "Datastore Admin" tab in your AppEngine settings page, select the entity type to back up, and start your backup. Make sure to point it to your Cloud Storage bucket when asked.

Setup BigQuery

Go back to the cloud console for your specific app and scroll down to find BigQuery. You'll now need to create a dataset (call it whatever you want, I called mine test) and then create a table inside that dataset (I also called my table test), and point it to the backup_info file inside your Cloud Storage bucket. Mine looked something like:

gs://napier-bigstore-test/gfhjakghdkjaghfkjdaghfkjdaghfad.Event.backup_info

Now you're ready to start examining your data.

Use BigQuery

BigQuery's query language is very similar to (but not exactly) SQL.

A simple example, how many events did I have?

SELECT
  COUNT(*)
FROM
  test.test

Told me I had over 7000 entries. But I needed something more interesting. My events can be grouped around a key, and I'm looking for information on how long they took. The query I used was this one:

SELECT
  key,
  AVG((END-start)/1000/1000/60) AS avg_duration,  
  MAX((END-start)/1000/1000/60) AS max_duration,
  MIN((END-start)/1000/1000/60) AS min_duration,
  COUNT(*) AS cnt
FROM
  test.test
WHERE 
  status == "complete"
GROUP BY
  key
ORDER BY
  avg_duration DESC

Which basically says: For each key, show me the min, avg, and max duration of "complete" events, ordered by average duration. For ease of the human eye, I did all my durations in minutes, because that worked for my data.
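
If you want to sanity-check the arithmetic, here is a rough Python sketch of what that query computes. It assumes the start and end columns hold timestamps in microseconds (which is why dividing by 1000, 1000, and 60 gives minutes). The sample events and the duration_stats name are made up for illustration; nothing here comes from the real datastore.

```python
from collections import defaultdict

MICROS_PER_MINUTE = 1000 * 1000 * 60  # same /1000/1000/60 as in the query


def duration_stats(events):
    """Group "complete" events by key and summarize durations in minutes."""
    durations = defaultdict(list)
    for event in events:
        if event["status"] != "complete":
            continue  # mirrors the WHERE clause
        minutes = (event["end"] - event["start"]) / MICROS_PER_MINUTE
        durations[event["key"]].append(minutes)

    rows = [
        {
            "key": key,
            "avg_duration": sum(values) / len(values),
            "max_duration": max(values),
            "min_duration": min(values),
            "cnt": len(values),
        }
        for key, values in durations.items()
    ]
    # mirrors ORDER BY avg_duration DESC
    return sorted(rows, key=lambda row: row["avg_duration"], reverse=True)


# Hypothetical sample data, assumed to be microsecond timestamps.
events = [
    {"key": "a", "status": "complete", "start": 0, "end": 120_000_000},
    {"key": "a", "status": "complete", "start": 0, "end": 240_000_000},
    {"key": "b", "status": "complete", "start": 0, "end": 600_000_000},
    {"key": "b", "status": "failed", "start": 0, "end": 60_000_000},
]

for row in duration_stats(events):
    print(row)
```

This is fine for a handful of rows on a laptop. The point of BigQuery is that I didn't have to pull the data down or write the loop myself.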

Summary

7000 entries? I've forgotten how to count that low. -- BigQuery

So yes, BigQuery is overkill for this dataset. I could have just whipped up a Python script, or dumped it into MySQL and done it that way. But the advantage of this way is that all the bits are already set up. I just needed to hook up the outputs to the inputs and I was ready. I didn't even need to download anything to my local machine; it all took place "in the cloud".

But I view this setup as just the start of something. Yes, I solved my immediate question, but what else could I do? How about setting up daily backups, and then scripting up a daily import of those into BigQuery. Then whenever I need to do a quick query on something, the data is already there.

Or going even further, how about some data visualizations based on BigQuery? If I have a snapshot for every day, I could graph some things like events per day, average durations per day, errors per day, etc.

So yes, it's still a bit of work to get all this glued together such that we can work on it. In my mind, there is still lots of room for improvement, but at least in this case Google delivers tools that work with other Google products in a meaningful fashion. That's not always the case.

I Read These 10 Books, and You'll Never Believe What Happened Next!

What happened next is that I got on with my life. But it got you to click through, didn't it? Anyway...

Margaret and I were discussing our 10 favorite books of all time one night. The first 5 were pretty easy to come up with off the top of my head, but filling in the last 5 was a bit harder. What helped fill it in was going to my bookshelf and taking a look. I tend to keep books I like.

Rather than just list these out on Facebook, I decided that it would be more interesting to actually write a little about each book. So here goes.

The Books

The Hitchhiker's Guide to the Galaxy (by Douglas Adams)

I've lost track of how many times I've read this series cover to cover. I've got a well-worn copy of the first four in the series and a standalone copy of Mostly Harmless. Adams has such a unique style of storytelling. Kinda manic and all over the place, with a lot of (typically British) humor of the absurd. It's never really overtly comic, and yet still has a number of laugh-out-loud moments.

Hey, you sass that hoopy Bill Napier? There's a frood who really knows where his towel is.

The name of this blog actually comes from this book.

Lord of the Rings (by JRR Tolkien)

'Twas in the darkest depths of Mordor, I met a girl so fair. / But Gollum, and the evil one crept up and slipped away with her, her, her....yeah.

I was introduced to LotR rather late compared to most other geeks. I didn't read it the first time until High School. And the only reason I really learned of it was because I was listening to a lot of Led Zeppelin at the time, and a few of their songs make reference to the series (Misty Mountain Hop).

LotR was my first exposure to what I call "Epic Fantasy". There is a huge, detailed world and the reader is thrust into it with no background and has to figure things out on the fly, much like the characters in the book. I love how detailed Tolkien made Middle-earth, where even things like the phase of the moon are properly described.

Snow Crash (by Neal Stephenson)

Didn't find out about this novel or Stephenson until college, again kinda late to the game. The opening of the book is really what grabbed me. Name another book that starts off with a pizza delivery and the mob. Where does he come up with this stuff?

Snow Crash was published in 1992 and most of what Stephenson talked about was totally fantastic, definite Science Fiction. I reread the book sometime later (2004?) and was gobsmacked to see how many of the things Stephenson made up were now actual products. Google Earth. Second Life. I bet startup companies today are going back to this book to find their next idea. I'm sure it's in there.

Ender's Game (by Orson Scott Card)

Remember, the enemy's gate is down.

Again, discovered this book in college (same guy who recommended Snow Crash). This was back before Card started running his mouth off about gay people, so it was still OK to read him.

A lot of Ender's Game could be described as pretty typical Young Adult SF, like Heinlein's "Juveniles". Exceptional kid, picked to save the world, smarter than the adults, has to overcome the bullies jealous of him. We all wanted to be Ender. All pretty typical YA coming of age story. If it weren't for the ending, this book never would have made the list. I won't spoil it for you here (even though the movie spoiled it in the trailer...), but there's quite a twist at the end that catapults a good story into a great one.

Sadly, pretty much everything else Card has written is dreck.

The Name of the Wind (by Patrick Rothfuss)

Words are pale shadows of forgotten names. As names have power, words have power. Words can light fires in the minds of men. Words can wring tears from the hardest hearts.

Of all things, Margaret randomly bought this one for me. She has a history of picking winners for me, and this one is no exception. Another epic fantasy, this time with a believable magic system underpinning the story. If you've read the Dresden Files, you'll be familiar with the magic system, as Rothfuss has borrowed heavily from it (he admits he's a fan).

But the story is so much more than that. Starts off as a pretty standard YA coming of age story, like a more adult version of Harry Potter (including the school!). Except Kvothe gets kicked out of school and then we're off on an adventure.

Have no idea how this one's going to end. He's promised a trilogy and has only written two of them. There's quite a bit of story to cover in a single book....

Dune (by Frank Herbert)

My own name is a killing word

I took a course in Science Fiction as a Junior at Penn. In addition to lecture (which basically covered a lot of the genre's history), we had to read two books each week and be able to write about them for recitation.

This week's theme was "Epic Novels" (I'm sensing a theme). We had to read Le Guin's The Dispossessed and Herbert's Dune. I started with Le Guin's book and it took most of the week for me to get through it. Even though it won basically every SF award possible (Nebula, Hugo, Locus), to this day I have no idea what it was about.

I was in a bind. I had recitation in under 24 hours and all 500+ pages of Dune to read. My plan was to get through enough of the book to be able to talk about it, and finish it after recitation. What ended up happening was that I stayed up all night to finish it. I couldn't put it down. Again, another Epic story of a fantastic world and of interesting people. None of the movies for Dune do the book justice.

The Robots of Dawn (by Isaac Asimov)

The robot had no feelings, only positronic surges that mimicked those feelings. (And perhaps human beings had no feelings, only neuronic surges that were interpreted as feelings.)

It was summer, probably during high school. Church Youth Group trip to Ocean City, NJ. I needed something to read, so I picked this up at the bookstore. I proceeded to get very sunburnt on the back of my legs while reading it on the beach.

This book affected me so much that I still, 20 years later, remember where I read it. It was probably the first hard science fiction I'd read (at least that made this list), and some of the ideas still stick with me. Of course, Asimov's 3 laws of robotics (and the 0th law). But also the "people mover" idea, where there are treads moving at different rates, and you can pay more to ride a faster tread. Or pay a premium and get a seat.

So this got me hooked on Asimov. I went back and read the earlier books in the series, but this was by far the best. This one is where the ideas were more mature, more distilled. The earlier books still seemed a touch simplistic.

Why not Foundation? To be honest, it's probably the better series. But I couldn't tell you where I was when I read it the first time (The second time was spring break during college for class).

Hackers (by Steven Levy)

Systems are organic, living creations: if people stop working on them and improving them, they die.

In college I got really hooked on the history of computing. Still am. Soul of a New Machine (Data General). Revolution in The Valley (Apple). In The Plex (Google). Some book on a company who failed (Sorry, don't remember the name). The book on the history of the iPod.

But this book is by Steven Levy (just like the iPod book and the Google book). It's one of his earlier works, and I actually like it better than some of his later works. Levy has a tendency to write his books as a series of magazine articles. In Hackers, it works because he split the book into a series of separate stories. In his other books it's more annoying.

And it talks about all the stuff that happened when I was a kid, too young to participate. The Altair. The original Apple. Woz and Jobs back when they were the dynamic duo. Homebrew Computer Club. The good old days.

On a Pale Horse (by Piers Anthony)

What kind of fool had he been, to throw away romance untried?

I read a lot of Piers Anthony as a kid. And by a lot, I mean pretty much everything he wrote. I loved his work. Eventually I realized that most of his most famous books (The Xanth Series) were super formulaic and started to get boring.

On the other hand, his Incarnations of Immortality series and Bio of a Space Tyrant were pretty good, at least in the mind of a 12 year old boy.

Imagine that Death is a job that you can work for eternity. He (and his friends Time, War, Fate, and Nature) are in a constant battle against Satan. It blew my mind and I really enjoyed reading it.

Recently I started re-reading the series. It's been nearly 30 years, and I still remember quite a bit of it. And I'm finding that it doesn't hold up that well. Not that it's dated, but more that I'm older now and things that seemed profound to a 12 year old have been proven false via experience. Plus Anthony seems to be a touch of a woman-hater, which can make it hard to read.

Phantom Tollbooth (by Norton Juster)

Ok, this is a weird one. I read it in junior high. I don't really remember why it was so impactful on me, but anytime a book for a 10 year old comes up, this is the one that I point out. Looking forward to Joshua being old enough to read it so I can get him a copy.

Didn't Make The List

There were a few books that I like, but felt just got bumped off the list. I'll list them here in case you're interested.

  • Old Man's War by John Scalzi. Very good book, really enjoyed reading it and the rest of the series, but nothing from it has really stuck with me.
  • Game of Thrones by George RR Martin. Only started reading this due to the HBO series, trying to stay ahead of spoilers. It's very very good, but I don't ever see myself re-reading it. The good points are too few and far between, especially in the later books.

So, what are some of your favorite books?

My Heart is Bleeding

A funny name for a serious bug. -- Bill Napier

You may have heard about it in the news, but may be unsure of what to do about it. I know this because I've had a number of people already ask me what they should do about it. Let me enlighten you.

What's the problem?

So I'll try and explain it two ways. Semi-technical, at a level most people who are familiar with computers should understand. And then as an analogy, for those of you who don't understand the first explanation.

Technical

The bug is in OpenSSL, which is a library used in many (many) places to do secure communications. If you see "https" in your browser, there is a chance that the site you're talking to is protected via OpenSSL.

Without going into specifics on the bug (check out http://heartbleed.com/ or the CVE for more specifics there), the bug (essentially) allows an attacker to access anything the webserver can access. For most websites, this could mean everything. Usernames, passwords, credit cards, SSNs, tax returns, etc. Or even use it as a starting-off point for exploiting another bug and creating a backdoor.
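
For the curious, the core mistake can be modeled in a few lines of Python. This is a toy simulation, not OpenSSL code: the SECRET contents and the function names are invented for illustration. What it mirrors is the general shape of the bug, where the heartbeat handler echoed back as many bytes as the request claimed to contain without checking that claim against the real payload size.

```python
# Toy model: the heartbeat payload sits in memory right next to unrelated
# secrets. The secret below is invented for illustration.
SECRET = b"user=bill;password=hunter2"


def handle_heartbeat_buggy(payload: bytes, claimed_length: int) -> bytes:
    """Echo back claimed_length bytes, trusting the client's length field."""
    memory = payload + SECRET  # payload lands next to other data
    return memory[:claimed_length]  # no bounds check: reads past the payload


def handle_heartbeat_fixed(payload: bytes, claimed_length: int) -> bytes:
    """Drop requests whose claimed length exceeds the real payload."""
    if claimed_length > len(payload):
        return b""
    return payload[:claimed_length]


print(handle_heartbeat_buggy(b"hi", 2))   # honest request: just the payload
print(handle_heartbeat_buggy(b"hi", 28))  # lying request: payload plus secret
print(handle_heartbeat_fixed(b"hi", 28))  # patched: nothing leaks
```

The attacker never has to log in or break any crypto. They just lie about a length, which is why this one was so bad.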

Analogy

When you leave your house, you lock all the doors, right? Imagine that your door lock had a bug in it (a design flaw) that allowed an attacker (thief?) access to your home without you even knowing that they have been there. Obviously they can do the easy stuff. Steal your TV and your jewels. They could also be rather annoying and steal your Social Security card and birth certificate and passport and start impersonating you (identity theft). Or they could come in and just put up hidden cameras and bugs and a backdoor into the house so they can come and go and do whatever they want, even after you've changed the locks.

Hold me, I'm scared

So first off, not every website is affected. There are some sites (I like LastPass) that will check whether the services you use are affected. As you can see from this infographic, a lot of the financial sites are fine.

infographic

Should I change my passwords then?

Short and sweet? Maybe.

Here's the tricky part. The bug has existed for 2 years. As far as we know, nobody knew about it until December (at which point the bug was fixed and a release pushed). There is a slight chance that black-hat hackers silently discovered the bug first and have been using it prior to December, but that's not thought to be a huge risk. So even if a site you use had the bug for a bit, you may be safe.

Prior to this week's announcement, the good guys (white-hats) knew about it, but that doesn't really change the risk profile. Things changed this week when a tool was released that exploited the bug to recover recent traffic sent to and from sites with the bug. This made things much riskier because it made the attack easier. Anybody could download this tool, click a few buttons, and possibly catch your username and password while you were logging into the site. And chances are people did.

Keep in mind that it makes no sense to change your password until the site has the bug fixed. Otherwise your new password will be at just as much risk as your old one.

Levels of Paranoia

If you're a tin-foil-hat-wearing kinda guy, you should probably change any and all passwords for any site that has ever been affected by the bug (once they have fixed it of course). This isn't feasible for most people. In my LastPass vault I've easily got 150 passwords, and I know there are some that it doesn't know about (usually due to laziness).

A more reasonable approach would be to change the password for any affected site that you've accessed since (say) 4/5/2014 (again, once they have fixed the bug). And by affected I mean affected after the tool was released. Sites (like Google) that patched it prior to the tool's release are probably OK. Even if you haven't entered the password during that "risky" window, you should still do it to get a new login cookie. This should protect you against any of those "Script Kiddies" who downloaded the exploit and immediately started snooping traffic. This is a case of fixing things that we know are a problem (we know that people are going to be doing this) vs. fixing things we think may have happened (a more sophisticated custom attack).

I can't recommend doing nothing at all. At the least you should check any important sites (as defined by you) and change your password if accessed inside that risky window. For me, this is anything to do with money. PayPal, Google (wallet/play/drive/gmail), banks, credit cards, etc. Should hopefully be a smaller list than the full 150+.

Any other recommendations?

LastPass

I'm going to plug LastPass again. The basic gist of what they do is keep track of all your passwords. This is handy so if something like this happens again, you have a list of sites you go to rather than trying to figure that list out. LastPass also has a proven record of trying to inform and protect their customers, and their tool can already tell me which of my sites have been affected and if it's time to change my password there or not (depending on fixed status).

An additional protection is its ability to handle site-specific passwords, where each site has its own unique password. If you don't have site-specific passwords, it may be possible for an attacker to gain your username/password from a Heartbleed-affected site and then start trying it against "safe sites". If each site has its own password, this isn't an issue.
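
To make the idea concrete, here is a minimal Python sketch of giving every site its own random password. It is not how LastPass works internally (I have no idea what they do under the hood). The site names, the 20-character length, and the new_password helper are all made up for illustration.

```python
import secrets
import string

# Letters and digits only, to keep the sketch simple.
ALPHABET = string.ascii_letters + string.digits


def new_password(length: int = 20) -> str:
    """Return a random password drawn from letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Hypothetical site list; each site gets its own independent password.
sites = ["bank.example", "mail.example", "shop.example"]
vault = {site: new_password() for site in sites}

for site, password in vault.items():
    print(site, password)
```

If shop.example leaks its password, the attacker learns nothing useful about bank.example, which is the whole point.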

And I promise I'm not a LastPass shill. I get no compensation for this post, just a VERY happy customer.

Second Factor

Please turn on second factor authentication on any site that offers it. This also provides defense-in-depth: even if an attacker gains your password via another bug, without access to your second factor they cannot access your account. I wish more services had this kind of setup (banks, I'm looking at you).

Further References

http://krebsonsecurity.com/2014/04/heartbleed-bug-what-can-you-do/
Krebs is great. You should just read his stuff because.

Beer Bug: Bringing moar data to your brew

Getting started in brewing beer has a minimal capital investment. For around $80, you can get all the equipment you need to brew a beer better than most of the beer you can buy in the store.

What you'll quickly find is that yes, you can make good beer with a starter kit. But there are parts of it that just suck. Like trying to get a siphon going to rack your beer (buy an auto-siphon). Or post boil, getting the temperature down to where you can pitch the yeast (get a wort chiller). Bit by bit you keep getting things that make the process a little easier.

The Beer Bug

So I signed up to kickstart this thing back in November 2012, expecting to have it by Christmas. It ended up arriving in February 2014. This is pretty much par for the course for Kickstarter.

Anyway, what does this thing do? In short, it measures the temperature and specific gravity of your brew, and uses Wi-Fi to upload that data to the cloud. On their website, they have pretty graphs and other stuff so you can keep track of how your brew is going.

Tracking temperature is actually a really good way to make your beer even better. Basically you want to keep it within a small temperature range to control the flavor of your beer. Too warm or too cold and the yeast may give off undesirable flavors. Without the data, there's no way to control that temperature range without resorting to guessing.

Example Graphs

Specific gravity takes a bit more explaining, as it's not a measurement most people have heard of. Roughly, it measures the amount of solids dissolved in a solution. For brewing, this means fermentables (basically, sugar) to feed the yeast. By taking a measurement at the beginning (the original gravity, or OG) and at the end (the final gravity) of the brew, you can calculate how much alcohol your brew will end up with.
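
As a rough worked example, a common homebrewer's rule of thumb estimates alcohol by volume as the drop in gravity times 131.25. The Python below is just that approximation, not whatever the Beer Bug computes, and the readings are made up.

```python
def estimate_abv(original_gravity: float, final_gravity: float) -> float:
    """Estimate alcohol by volume (percent) from two gravity readings.

    Uses the common rule-of-thumb factor of 131.25; it is an approximation.
    """
    return (original_gravity - final_gravity) * 131.25


# Made-up readings: 1.050 before fermentation, 1.010 after.
print(round(estimate_abv(1.050, 1.010), 2))
```

So a beer that starts at 1.050 and finishes at 1.010 lands a bit over 5% alcohol.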

Gravity is also the only way to know when you're done fermenting. If you look at the green line in the graph above, you can see the gravity leveling off towards the end, indicating the fermenting is done.
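
"Leveling off" is easy to turn into a check once you have the readings as numbers. The sketch below is my own guess at a reasonable rule, not anything the Beer Bug does: it calls fermentation done when the last few readings barely move. The window of 3 readings and the 0.001 tolerance are arbitrary assumptions, and the sample readings are invented.

```python
def fermentation_done(readings, window=3, tolerance=0.001):
    """True if the last `window` gravity readings vary by at most `tolerance`."""
    if len(readings) < window:
        return False  # not enough data to judge
    recent = readings[-window:]
    return max(recent) - min(recent) <= tolerance


# Invented readings, oldest first.
still_going = [1.050, 1.040, 1.030]
leveled_off = [1.050, 1.030, 1.015, 1.011, 1.011, 1.011]

print(fermentation_done(still_going))
print(fermentation_done(leveled_off))
```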

Without the Beer Bug, measuring gravity is tricky. To do it safely (i.e. with no contamination risk), you need to sneak a sample out and use your (rather fragile) hydrometer in a tall flask. It's a pain. I usually just end up guessing when the fermentation is done (based on time) and then taking the final measurement.

I plan on getting a brew together in the next couple weeks to try my new toy out. Very much looking forward to it!