"Dave, you need to stop poaching from that team."

It wasn't the first time in my career I'd heard that type of statement.

"It wasn't me," I said, mimicking Shaggy.

"The third person this week has said they want to transfer to your org. You need to make it stop."

"It's not my fault. Their attrition is more than twice the level of attrition in the rest of Amazon. They're quitting Amazon, and looking for other openings. They asked to join my team because their co-workers came here. I don't reach out, they do."

My unintentional burning of bridges with my peer came down to a difference in team culture. People on their team said that Amazon was a terrible place to work. They had to hire more every year because they lost more to attrition than other organizations.

But everything was blamed on the type of work, the difficulty of their work, or the complexity of the space. Anything apart from what I believe was the simple truth: their org's culture was dreadful.

Why should you care about team culture?

You spend 40+ hours a week with your co-workers. It's awesome to spend that time with people you enjoy, and horrible to spend that time with people you don't like. Culture can make or break your overall life happiness. I can't overstate how important it is.

It's more fun to work hard with a team you like. Team sports are a thing for a reason. We're genetically programmed to feel inspired and energetic when we're working with a group on a common goal.

Years ago, we ran into a messy deployment of our software onto mobile devices. Mobile devices are super tricky. You see, with a normal server in a datacenter, you can always wipe out your software and restart whatever you got wrong. In almost all scenarios, you won't break the Linux box.

In the worst-case scenario (which rarely happens), you can have an employee physically hit the reboot button on the Linux box you broke.

But what happens if you, for example, break network connectivity on customers' tablets or phones? Well, you can't do anything about it. You can't touch those devices physically to fix them. You can't give them new software (since they can't update themselves), so they're stuck with your broken software unless you do some type of recall. It has the potential to be a terrible disaster.

So we deployed some software which had the potential to "brick" our customers' devices. To brick a device means to turn a fancy electronic device into a useless brick. It's not a good thing. In the worst-case scenario, what it means is that the software on a device is so bad that it can't get updated versions of your software anymore. That means you might end up in that recall situation.

Now, ignoring all the technical details, here's what happened.

Our on-call figured out we had a disaster on our hands in the early afternoon. The entire team dropped what they were doing. Time for teamwork.

One engineer pulled logs from devices. Another couple dove deep into the code to make certain we knew exactly what went wrong. Yet another looked into options we had to fix the devices without requiring them to be mailed back to Amazon, which was absolutely our least favorite choice.

The manager of the team ordered pizza because they knew we'd be there for a while. I authorized the ordering of pizza. Bam, middle management to the rescue.

While standing around with slices of pizza, a couple of engineers figured out a super tricky way to convince these 95% broken devices to phone home to Amazon's servers, request a software update, and reboot themselves.

After testing on internal devices, they verified it worked! It was a gigantic relief, and a really neat innovation.

While waiting for the deployment, and eating enough pizza to make the team collectively gain 5 lbs, we discussed and wrote up a proposal to build our tricky deployment tool into a tool other mobile teams could use.

The team ended up going home around 8-9pm.

Was the team demoralized for working a 12-hour day? Did they hate their job?

Heck no! I mean, it wasn't all rainbows and puppies. One engineer had a young baby at home, and reluctantly left the office early. I suspect they would have preferred to hang with their co-workers and eat pizza. And there was a good amount of stress when the team thought our disaster was unrecoverable.

But months later, the team still laughed and joked about how well they came together to prevent disaster. Literally years later, engineers would say, "Remember the time we almost bricked hundreds of thousands of devices?! Haha."

Every member of the team liked their teammates more, not less. No one was blamed. No one was yelled at. Everyone felt like they contributed (even the managers!), and the result was a pretty good success.

You enjoy your job more when you care about your co-workers. You have more success when you enjoy your job. People accomplish more when they're inspired and care.

What are the components of a good team culture?

If I looked at a team and said, "That's a great team culture" or "Their team culture is terrible", what would differentiate it? Beyond the output metrics (attrition, for example), what are the input metrics?

  1. Feeling safe. You can't relax and feel like you're a part of a team if you're worried about your career safety. That's certainly a big downside of any top-grading process.
  2. Co-workers encouraging your growth. If people cut each other down, you're not on their team, you're competing against them. You need your team to raise each other up. You want to help those who help you.
  3. A purpose you can get behind. If you disagree with your team's mission, you can't lean into it. This doesn't need to be about curing cancer. It can be an internal mission. Like, "We're awesome at building internal tools!" I'll get to this tricky one a bit later.
  4. Co-workers you know as humans. If you don't know your co-workers, it's hard to connect with them. They're the slow engineer, not Tabitha. They're the manager, not Vernon. And you treat individuals differently than you treat titles, or nameless co-workers.

One - How do you make a team feel safe?

The biggest thing is that no one should ever point fingers. As has always been said, "praise in public, correct privately". This means that the manager, and all co-workers, should focus on supporting each other. Through everything.

Before I continue, I'll acknowledge that a manager's job is also to watch for team members who aren't pulling their weight, or need corrective feedback. That's all private. I'm talking about the importance of public support and collaboration.

When mistakes happen, a core element of the COE (Correction of Error) process is that you look at processes and mechanisms, not human error. A person pushes bad code? How do we improve our QA processes? A person reboots the wrong servers? How can we improve our administration tools?

Yes, someone might have screwed up. But blaming them for failure will never make them feel like a part of a team. And it certainly doesn't improve their performance next time.

You know what improves a team, and an individual's performance? Support. Seek opportunities to support each other, particularly when someone is feeling vulnerable.

Remember my story above of that horrible issue which almost broke hundreds of thousands of devices? Who caused that? What'd they do wrong?

I don't remember. It came up early as part of our investigation into what code change broke things. Everyone ignored who made the change; they were simply focused on fixing it. We discussed ways of making our code more robust, and our rollback procedures stronger. But I don't remember who made the mistake because I (and the team) really didn't care.
