The two-man rule in engineering

In nuclear weapons design, there is a two-man rule that prevents any single individual from accidentally — or maliciously — launching nuclear weapons. Each step requires knowledge and consent from two individuals to proceed. Even when the President initiates a launch order, he must jointly authenticate with the Secretary of Defense (they’re given separate codes, even though the President has sole authority).

When the order reaches the launch control center, two people are required to authenticate and initiate the launch, for example by (vastly simplifying…) turning two keys simultaneously.

The benefits are at least twofold. First, it’s much harder to compromise or impersonate two people simultaneously than it is to compromise one. Second, it also provides error correction. When two people are involved in a process, it’s much more likely that if someone is about to make an oversight or error, it will be caught. This works better when the roles are asymmetric, because then they won’t both be on the same “wavelength.” Most good processes of this type seem to be asymmetric in some way.
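To make the idea concrete, here is a minimal sketch of a two-person gate in Python. Everything in it is hypothetical and for illustration only: the `TwoPersonGate` class, the "operator" and "reviewer" role names, and the rule that one person can't fill both roles are my assumptions, not a description of any real launch or deployment system. The point is just that the action refuses to run until two different people, in two different roles, have signed off.

```python
class TwoPersonGate:
    """Hypothetical gate: an action runs only after two distinct people,
    each in a different role, have approved it."""

    def __init__(self, required_roles=("operator", "reviewer")):
        # Asymmetric roles (assumed names), so both approvers aren't
        # checking the exact same thing.
        self.required_roles = set(required_roles)
        self.approvals = {}  # role -> person who approved in that role

    def approve(self, person, role):
        if role not in self.required_roles:
            raise ValueError(f"unknown role: {role}")
        # One person must not be able to turn both keys.
        for other_role, other_person in self.approvals.items():
            if other_person == person and other_role != role:
                raise PermissionError(
                    f"{person} already approved as {other_role}"
                )
        self.approvals[role] = person

    def is_authorized(self):
        # Every required role must have an approval on record.
        return set(self.approvals) == self.required_roles

    def run(self, action):
        if not self.is_authorized():
            raise PermissionError("two distinct approvers are required")
        return action()


gate = TwoPersonGate()
gate.approve("alice", "operator")
gate.approve("bob", "reviewer")
result = gate.run(lambda: "launched")
```

If "alice" tried to approve in both roles, `approve` would raise before the action could ever run, which is the whole point of the rule.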

There are many contexts where we want error correction and extra security: executing large financial transfers, preparing patients for surgery, performing space shuttle launch checks, or running nuclear reactors. It also comes up a lot in software development, which is what got me thinking about this. Let’s count the ways we implement the two-man rule:

Code review: Everyone is either doing this or making bad excuses for why they shouldn’t. But it’s the clearest and most accessible example of a two-man rule in software engineering.

Spec review: An essential part of any sizable project is a review of the specification to make sure, in particular, that 1) the right thing is being built in the right way, and 2) the right people and teams are aware of any impact the work might have on them.

Continuous integration: The branch built on your machine, but does it build on another one? This turns up countless “oh right I added this config variable/package and forgot to propagate the change” incidents before they become blocking.

Pair programming: I think of this as just real-time code review. It has all the same benefits and more, with the downside that it can’t be done asynchronously.

Deployments: I wish we did this closer to 100% of the time, but it has definitely been helpful to have a second person on hand for deployments in addition to the primary engineer. This is especially critical during complex deployments that happen in phases or involve many moving parts. Ideally the second person’s role is limited to going through the checklist one last time (“says there are database migrations, are we expecting downtime or can we keep pre-boot on, and if so is the config correct?”), and in the event of an issue, helping to investigate or doing the checklist in reverse to roll back.

Mind the Gap

As we continue to grow, there are a few areas where I think a more consistent two-man rule will lead to high return on effort in the future:

  • manually rebooting servers, changing server counts or container types
  • adding/scaling services
  • running one-off commands against the production database

And yes, every once in a blue moon we deploy tiny changes to production without full code review, or force a failing build onto staging — something that is intentionally difficult and unwieldy to do. This has gone from rare to extremely rare, and I expect this trend to continue. But I like processes to be developed and enforced bottom-up if possible, and prefer values over inflexible rules. So far this tenet hasn’t failed us, and we still trust each other with good judgment above all else.

However, as the stakes get higher every day, the cost/benefit equation will eventually tip towards a standard operating procedure that can be summarized as “trust, but verify.” If that doesn’t sound like a good proverb to live by, maybe a second opinion is in order?

 

Don’t tweak all the variables at once

I have been at Privy for a year. I’m proud of the team and product we’ve built, and I was excited to sit down and make a list of some of the new things I learned during my time here. Then I realized that most of these “lessons” would’ve been covered if I had just re-read everything ever written by Fred Brooks, Martin Fowler and Eric Ries…but that doesn’t make a good blog post.

So that got me thinking about the things I already sorta-knew that had been validated. Perhaps there was some pattern there. And so I made my first-order list, which I present below.

I have learned virtually nothing about…

  1. Using a stack in the middle of the adoption curve: Ruby on Rails.
    • Ruby/MRI is between 2x and 50x slower than a statically typed language running on the JVM, but even a slight increase in developer productivity more than makes up for the operations cost.
    • The advantage of using a really fancy stack (more cool factor for recruiting, etc) really doesn’t seem to compare favorably to the disadvantages (more uncertainty, smaller pool of technical talent).
    • The evidence that startups regularly die due to technology stack is vanishingly flimsy, so no need to dwell here.
  2. Building a local team.
    • Geographically distributed teams and the “work anywhere cuz we have Slack lol” bandwagon seem all the rage today, but the early team is more important than the early product, and the best teams are in the same place every day.
    • Resisting the urge to go remote has been something of a useful filtering mechanism: does this individual believe enough in our vision to consider moving here for the job?[1]
  3. Having some really solid cultural values (or aspirations, as they may be) that aren’t totally groundbreaking.
    • It’s more important that we live up to great values than come up with amazing ones. I’ll leave the latter to the management consultants.
  4. Using traditional engineering management.
    • We basically do agile: there are weekly-ish sprints; we do higher level planning on a monthly basis; a couple times a year we work on a strategic roadmap. We write software specifications before we code, and we ship daily with continuous integration and lower test coverage than I’d like to admit. Yawn.
    • We don’t use “flat” organizations or Holacracy or whatever trendy hipster management structure is in vogue. What the hell kind of problem is this trying to solve anyway? My theory is it’s got something to do with cool factor for recruiting, but I have a feeling the people trying this are no more certain than I am.

What’s the big meta lesson here?

If anything, it probably goes a little bit like this: the available levers to pull in a startup are numerous, but there are only a few that make a measurable difference. The things that are most likely to kill us are the things that kill most startups: having a subpar team, building a product that nobody wants, executing poorly on feedback loops, that kind of thing.

These are the things that, in Paul Graham terminology, make you “default dead” until you figure out how to get them right. And it’s critically important to realize that things like “what do we build?” and “who do we sell it to?” are the things that startups are doing “wrong by default” and need to diagnose and fix as quickly as possible.

But then there are the other things, like “how do we write a scalable system to respond to HTTP requests?” or “how should we manage engineering teams?” in which there are essentially no forced errors, and where (barring a well-articulated exception[2]) the correct answer is the default one. So almost all of the risks here seem to be to the downside, and any upside is probably insignificant compared to the scale and difficulty of the hard problem: building a novel product under uncertainty.

There are certainly going to be exceptions to this. There are going to be teams that have figured out how to deviate from orthodoxy and are reaping benefits from it. I’m OK with this, and my theory is that it either doesn’t matter (e.g., they were going to be a success anyway) or it won’t rescue them (they’re doomed and they didn’t differentiate in a way that mattered).

And so it must follow that the majority of our iterating and tweaking is on the thing that will make us a great company: what do we build? Who do we sell it to? There are enough variables in there that I don’t really have any brainpower left over to do anything except reach for a generic Ruby/Python/JavaScript framework and use engineering/recruiting/management techniques that were old 30 years ago.

 

[1] This isn’t all roses, since it biases us significantly towards younger folks who don’t have as many attachments, the net effect of which is…debatable, but obviously not lethal in a vibrant tech city like Boston.

[2] Example: One excuse I’ve used to provision real hardware in a real datacenter as opposed to just spinning up an EC2 instance is “I’ve done the math and TCO in AWS is literally 25X more expensive.”

How to uninstall the default Windows 10 apps and disable web search

If you’re like me, you’ve been enjoying Windows 10 for quite some time now. Couple things annoy me:

1. I accidentally changed all my file associations to the new default Windows apps, because the (intentionally) misleading first-run experience presented fine print I glossed over.
2. I don’t like searching the web from the Windows Start menu, because I’d rather not transmit everything I type there over the network. Call me old fashioned.

Remove default apps

Open up a PowerShell prompt and run this to remove most of the default apps:

Get-AppxPackage *onenote* | Remove-AppxPackage
Get-AppxPackage *zunevideo* | Remove-AppxPackage
Get-AppxPackage *bingsports* | Remove-AppxPackage
Get-AppxPackage *windowsalarms* | Remove-AppxPackage
Get-AppxPackage *windowscommunicationsapps* | Remove-AppxPackage
Get-AppxPackage *windowscamera* | Remove-AppxPackage
Get-AppxPackage *skypeapp* | Remove-AppxPackage
Get-AppxPackage *getstarted* | Remove-AppxPackage
Get-AppxPackage *zunemusic* | Remove-AppxPackage
Get-AppxPackage *windowsmaps* | Remove-AppxPackage
Get-AppxPackage *soundrecorder* | Remove-AppxPackage

Turn off Web Search

Next, open up Group Policy Editor (gpedit.msc) and navigate to:

Computer Configuration -> Administrative Templates -> Windows Components -> Search. Enable the policies:

  • Do not allow web search
  • Don’t search the web or display web results in Search
  • Don’t search the web or display web results in Search over metered connections

Finally, open up “Cortana and Search Settings” and disable “Search online and enable web results”.

Heroku Pricing Changes

Couple of quick points on Heroku’s pricing changes which I’ve been meaning to get out:

  • It’s not an across-the-board price cut. While the dyno pricing has decreased, they also got rid of the $36ish/month in free dyno credits.
  • New free tier replaces the free dyno credit. Minimum 6 hours of sleep per day means no more abusing the free tier by pinging your app every few minutes to keep it from sleeping. Seems a lot of people were doing this to run production apps for free; good riddance.
  • New $7/month hobby tier is a great new option for people who were previously hosting production apps for free and need them live 24/7. This is a great deal since you can even have worker/background dynos for the same price. Makes sense for Heroku too – they’ll derive a good deal of long tail revenue from folks who would’ve previously just stuck with the free tier (maybe using the ping hack to prevent idling). Honestly I think the revenue is not the point – it’s more just preventing people from abusing the free tier while giving enough folks a no-excuses carrot to use the platform so it’ll be a no-brainer when they “go pro.”
  • Professional dyno pricing drop is great, but it’s going to be a wash for the majority of paying users because the free credit is going away. Basically there’s no longer a big cliff where you go from free->paid, but the steepness of the pricing increases is somewhat lower. My intuition is that the winners are the 4-5 figure/month customers, which makes sense since that’s around the time they start thinking about moving to AWS directly for cost savings. More of them will just consider staying.

Why Work at a Startup?

Because I’m tired of explaining to everyone, I’m going to make this list to refer to anyone who asks. While I don’t think any of these are particularly original, it makes a handy checklist for anyone considering a similar jump[1].

  • Faster time to market. At Privy, we routinely ship code that was written earlier in the day or week. Seems petty, but as an engineer, it’s frustrating to improve something and then not have it in the hands of customers for weeks or months.
  • More hats to wear. The diversity of work at a startup appeals to me. I can work on product, recruiting, and engineering. Before lunch. The pace of work is faster and the scope is longer term, and I like being involved in multiple parts of the business.
  • Be judged by customers, not managers. A startup makes each person less insulated from the market. Therefore the correlation between performance and rewards tends to be much closer.
  • Less politics. As a consequence of the last point, politics becomes less important. It’s much harder to bullshit accomplishments in a startup when the entire company fits into a small room or two. Tired of carrying teammates who aren’t pulling their own weight? Join a startup.
  • Incredible learning. As another corollary to being closer to market forces, I’ve learned a lot about how to run a business that provides value to customers in exchange for money. I’ve in turn been able to apply experience I gained elsewhere that I never would’ve been able to use at a larger company, because my job title would’ve prevented me from doing anything other than engineering.
  • Challenging the status quo, not defending it. Name recognition is cool, but I never got the sense that my role at Office was about reshaping how people work – probably because our market share had nowhere to go but down. But I’ve found I don’t mind playing the underdog as long as I have a thesis about how the future should change for the better.

 

[1] In a necessary but not sufficient way (i.e. if these don’t apply to you, a startup is probably a bad idea; but if they do apply to you, a startup could still be a bad idea).

Don’t get a Master’s in Computer Science

I am pretty sure most software engineers should get a BS in computer science. I’ve written extensively about this. But I’m often asked by prospective engineers whether it’s worth the effort to get the MS too. In the past I’ve mostly dodged on this, with a hedged answer I would charitably paraphrase as “umm, probably no, but maybe yes, if you find a subfield you really like.”

Today I realized that this is terrible advice. If you have to ask, you should not get a master’s degree in computer science.

Why? Because all you MS CS candidates suck at the most basic interviews.

Seriously.

Like I sometimes have trouble differentiating between people with an MS and people who have literally never coded in their lives. But maybe that’s because they aren’t mutually exclusive:

  • I don’t do this anymore, but I used to just ask fizzbuzz over the phone, and the candidates who routinely failed this were either master’s students or master’s grads looking for their first job.
  • For some MS CS grads, reversing a string is literally a half-hour affair, and doing it in-place without an O(n) memory allocation is considered “tricky.”
  • I once had a poor soul with a master’s degree spend 10 minutes failing to name a way to communicate between 2 computers.
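For anyone who hasn't seen these questions, here is roughly what passing answers look like in Python. This is a sketch under a couple of assumptions: I'm using the conventional statement of fizzbuzz (count from 1 to n, say "Fizz" for multiples of 3, "Buzz" for multiples of 5, "FizzBuzz" for both), and since Python strings are immutable, the in-place reversal works on a list of characters rather than a string. The function names are just for illustration.

```python
def fizzbuzz(n):
    """Return the fizzbuzz sequence for 1..n as a list of strings."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:       # divisible by both 3 and 5
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out


def reverse_in_place(chars):
    """Reverse a list of characters in place with two pointers.

    Only a constant amount of extra memory is used: the two indices.
    """
    left, right = 0, len(chars) - 1
    while left < right:
        chars[left], chars[right] = chars[right], chars[left]
        left += 1
        right -= 1


chars = list("hello")
reverse_in_place(chars)
reversed_word = "".join(chars)  # "olleh"
first_five = fizzbuzz(5)        # ['1', '2', 'Fizz', '4', 'Buzz']
```

The two-pointer swap is what interviewers usually mean by "in-place": no second buffer the size of the input gets allocated.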

I don’t know what’s going on here.

But I have a few theories:

1) Software engineering experience compounds, but instruction in CS fundamentals offers diminishing returns after 4 years. I might be suffering from some Dunning-Kruger here as I only have a BS, but the vast majority of fundamental, broadly applicable theory seems to taper out after ~3 years of quality instruction, in my experience.

2) MS programs lack even remotely standardized curriculum or admissions requirements. Master’s programs seem to fall into two camps: the “we’re vetting you for a PhD” camp, and the “professional degree” camp (which is very likely a cash cow for the university). Both camps assume you have prior exposure to the subject matter, and therefore won’t have a well-structured curriculum in fundamentals. But if an MS CS program doesn’t teach CS fundamentals (that’s what the BS is for, right?), and doesn’t require a BS CS for admission, how does that ensure graduates have a baseline level of knowledge upon graduation? It doesn’t.

3) MS students have low or no exposure to actual coding. A lot of MS degree work I’ve seen either involved studying esoteric algorithms or mathematical proofs, or research that mostly involved bragging about how the machine running a neural network has 256GB of RAM. I took a few graduate level courses back in my day, and I’d venture at least half of them required no coding whatsoever. Now recall the part about no structured curriculum, and you are well on your way to a choose-your-own-adventure degree that could easily see you to graduation day writing about as much code as a real engineer might deploy to production before lunch today.

Of course, it goes without saying this isn’t all candidates from all schools. But it is a pattern, and these days I just reflexively de-prioritize talking to MSCS candidates because to do otherwise is a setup for disappointment.

The truth is, I suspect this state of affairs is a mix of correlation and causation. I know it’s wrong, but “if this candidate was any good, he would’ve gotten a job on the strength of his skills rather than making his resume fancier while waiting out the recession or whatever” has crept into the back of my mind before.

It’s simple. We, uh, kill the batman.

It doesn’t really have to be this way. If your goal is to be the best engineer that you can, those 2ish years of extra experience you get in the industry make a big difference. Those are your learning years where you absorb hard-won experience from your seniors on engineering trade-offs and how to work on teams with existing codebases under real multidimensional constraints.

And if your goal is to make the most money you can, an MS almost never pays off unless you just happened to specialize in something that is both rare and highly in demand. Otherwise, if you are lucky, you are looking at a pay bump of ~$10k over a fresh BS CS grad. Maybe. Forget comparing to someone who graduated with the BS CS one or two years ago; they’ve left you in the dust.

This should be obvious, if you think about it for a moment. New grad engineers increase their skills and value tremendously over 2 years; they get commensurate increases in salary to reflect this[1], and the average person who took those 2 years to get an MS CS is starting from an experience deficit and never catches up. It’s no wonder then that it only offers a ~$5-10k salary bump: it isn’t all that valuable on its own.

So don’t get a master’s degree[2]. It probably won’t pay off, and your engineering career will suffer. There are exceptions, but they don’t apply to Joe Schmoe with an MS from Nowheresville.

[1] Mostly by changing jobs, because employers in this industry seem to routinely under-level new grad engineers as they gain experience, but that’s another rant for another time.

[2] But if you do, get a BS CS first. I see again and again that most successful people with master’s degrees started with the BS.