Scaling a team is a difficult task that has been the downfall of many startups.
You can’t just hire people and watch great results pile up.
As a leader, your job is to get your team ready to scale and create an environment where efficient work and communication are a given.
How do you get your team to that level of maturity?
We bring you an answer to that question from Joseph Gefroh, Engineering Director at HealthSherpa, and seasoned startup leader. He takes a deep dive into his experiences, and gives you checklists and practices to focus on when you face growing pains. This blog post is based on an interview by Karolina Toth in episode 54 of the Level-up Engineering podcast.
Joseph lives in Seattle, Washington, and he’s currently the engineering director at HealthSherpa, a health insurance marketplace based in San Francisco. He’s held various software engineering and leadership positions over the past ten years. He’s worked at several startups throughout his career, helping their developer teams grow and improve their practices.
He’s worked for a state government consultancy, he’s founded his own startup, and even though it failed, he’s been working at startups in different stages of maturity ever since. He mentors developers and consults managers, and he’s a startup advisor as well.
His hobbies include watching TV shows, playing video games, and trading in the stock market.
I’ve never thought that a team is ready for hypergrowth. Building up a team always feels like laying the railroad tracks as you’re speeding across them while trying to figure out how to make the train fly.
You need to perfect the onboarding process even if it’s remote onboarding, so you can turn the zero or negative productivity of a new hire into a net positive. You need to develop processes to turn tacit knowledge into explicit knowledge that requires minimal work to share with new hires. For example, you can use documentation and shared norms to do that.
Alignment is a common problem; it’s difficult to scale your culture as your company grows. You have to make sure to keep moving in the right direction and to align everyone with the goals, priorities and the practices of the organization.
Culture includes knowledge sharing, continuous improvement, and creating an atmosphere that empowers everyone to ask questions. A team without psychological safety can’t scale up, because growth comes with the risk of failure.
Psychological safety enables everyone to take risks. Without it, engineers won’t become productive, because they are more concerned about not looking dumb than they are about learning. You can’t punish new engineers for mistakes that most new hires make, because that stops them from contributing to the best of their ability.
As long as you only need to onboard a couple of new engineers a year, you can take your time and let your senior engineers put a lot of effort into mentoring developers. When you’re doubling or tripling your size each year, you can’t afford to do that with every new hire. You need to make onboarding work without slowing the team down or sacrificing your clean code standards.
Startup teams often don’t have any documentation because they are constantly focused on building the next feature. You need to change that mindset and show them that documenting their work will save them the effort of explaining the same thing repeatedly.
Documentation is the most efficient way to share knowledge because it’s easily repeatable. You turn what you’ve learned into a playbook, and send it to the new hires. You can make repositories out of these documents covering everything new engineers need to know about the system, so you can just send one link with all the info.
When you lack documentation, new engineers are forced to rely on asking questions, making it difficult for them to pick up the pace. Answering these questions takes time away from both your established engineers and the new hires. Not answering these questions leads to the new engineers making avoidable mistakes while figuring out the system by trial and error.
In some organizations, it takes weeks for new engineers to set up their local development environment because nothing is documented or automated. Early-stage startups tend to be far from the gold standard of new developers being able to deploy to production on day one.
Rapidly growing organizations often have issues with stability.
During onboarding, new hires can’t learn everything about your existing systems, but they need to be productive, so they make changes. They don’t have enough domain expertise yet, and more experienced engineers often aren’t available to review their code. These changes can impact other parts of the system, so bugs and mistakes inevitably happen.
You can combat this by introducing foundational observability and traceability into your technology and process. Otherwise things are going to break, customers are going to get mad, and internal operations teams won’t be able to do their job. And the worst part is, engineering will be slow to find out about it, so they’ll be slow to fix it.
Place some developers who know your systems into each growing team. Make sure to have logging, auditing, and observability in place to allow you to trace problems to their source and restore anything that breaks.
I worked in an organization where we had tracking errors pile up for a year. By the time we realized that the tracking was broken, we couldn’t fix it because the incorrect tracking had led to a complete loss of that data. That data would’ve been invaluable at that point.
As your engineering team grows, the rest of the organization always wants to take the extra capacity and apply it to new problems, whether it’s new features or new business lines. Small teams have to prioritize hard, but large teams often lose this mindset.
As you scale your team, innovation can’t be your sole focus anymore. Incremental changes may not result in 10X or 100X revenue potential, but they can provide a lot of value for your users. You always have to balance innovation with supporting your existing product and responding to customer feedback.
Product companies tend to focus too much on innovation when they gain additional capacity. You can’t keep building new things; you also need to run what you’ve built, and that takes a lot of engineering capacity by itself.
If you don’t put enough effort into running your product, it’s going to break, which may take away your organization’s ability to support building new features. You’re going to prioritize putting out fires, and interrupt your product roadmap to keep your users on board.
You have to bake documentation into the process of writing code. You need to make maintaining what you’ve built a part of your engineering culture. Explaining how it works is an integral part of maintaining a code base.
My previous teams had this mindset, “If you can’t explain how it works, you don’t truly understand it.” This often resonates with engineers because they want to understand things. This can help you to get them to share what they’ve built.
Documentation is critical in the onboarding process and when a team hands over a project to another team.
You can hold events that support knowledge sharing like book clubs or lunch-and-learn get-togethers. Create opportunities for your team to demonstrate what they’re working on; the best way to share knowledge is to take a peek under the hood. These demos require documentation, which can be a presentation or any type of document that details how the code works.
Involve engineering leaders in creating documentation by making it their responsibility to ensure that anything their team builds is properly documented. They can delegate these tasks to the engineers writing the code. Making this a formal expectation guarantees that it will be done, and it helps leaders make documentation a part of your engineering culture.
This way you utilize both authority and the desire of engineers to learn from each other.
Some engineers prefer written documentation, while others love to talk to people and ask questions. The ones who prefer asking questions often find that they don’t get all the answers or they get them slowly, so they need to utilize documentation as well.
Whatever they may prefer, you need your engineers to take initiative and do their due diligence. As long as the documentation is available, the teams should take advantage of it.
You don’t want engineers to spend hours going through the documentation looking for info they could get from another engineer in five minutes. At the same time, you don’t want them bombarding more experienced engineers with questions that they could easily find the answer to. You need to balance these approaches.
Create an engineering culture that provides space for both active and passive knowledge sharing. You want to provide psychological safety to ask questions and utilize the advantages of passive knowledge sharing.
Start by standardizing or automating your deployment process. Make sure that delivering code to production is repeatable and consistent regardless of which developer does it.
Begin by documenting every step. Once you have the documentation, it’s easy to automate the process. For example, you can turn your rules into a container.
Monitoring is another technical practice worth standardizing. It gives metrics to the engineers about their code in production like response times or error rates. Automate this as well, if possible.
There are a lot of tools out there that are easy to set up and that save you a lot of time in the long run. Use a tool that notifies you when something breaks; some of them can even send you a Slack message. These tools help a lot by saving you the trouble of regularly checking if there’s something wrong.
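To make the idea of an automated alert concrete, here is a minimal sketch of an error-rate check. It is an illustration, not any particular tool's API: the function names, the 5% threshold, and the `notify` callback are all assumptions. In practice the callback would be whatever your monitoring tool uses to post to a chat channel.

```python
from typing import Callable


def error_rate(total_requests: int, failed_requests: int) -> float:
    """Share of requests that failed; 0.0 when there was no traffic."""
    if total_requests == 0:
        return 0.0
    return failed_requests / total_requests


def check_error_rate(
    total_requests: int,
    failed_requests: int,
    threshold: float,
    notify: Callable[[str], None],
) -> bool:
    """Call notify() and return True when the error rate exceeds the threshold."""
    rate = error_rate(total_requests, failed_requests)
    if rate > threshold:
        notify(f"Error rate {rate:.1%} exceeds threshold {threshold:.1%}")
        return True
    return False


# Hypothetical usage: print stands in for a real notification channel.
check_error_rate(1000, 70, threshold=0.05, notify=print)
```

Passing the notifier in as a callback keeps the check itself independent of where the alert ends up, so the same rule can be reused if you switch tools.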
Things break in tech, and your engineers need to know what to do to reduce the time it takes to restore the system. If they have to try to figure it out during an incident, you’re going to have unnecessarily long outages.
On the other hand, if your team has automated monitoring and a standardized incident response process in place, they’ll know what broke, and they just need to follow the protocol. This allows even junior developers to track down the source of the failure and fix it by following the checklist.
Once you’ve nailed down the technical processes, you can start looking at the human side. Make sure that your leaders have regular one-on-one meetings with their engineers to keep them happy, productive, and growing.
Keep your engineering culture on track as well. For example, some leaders may take an incident as an opportunity to point fingers. Create a blameless culture and leave room to fail.
People need to be able to make mistakes so they can learn from them and grow. If they avoid mistakes at any cost, they stop growing.
When the engineering organization is in order, focus on your relationship with your cross-functional peers. Smooth out the working relationship between the engineering, product, and design departments. Improve communication, starting with ticket quality, so that requirements stay clear as they pass between the different functions.
Make your operating model clear to everyone involved, whether it’s waterfall or agile.
For example, many organizations say that they use Scrum, but they don’t actually work incrementally. They just break down a waterfall process into two-week mini-waterfalls, but the goal and the scope never change. You don’t need agile; just make sure that your process works for your organization.
Use a process that aligns with your organization’s values, and delivers the results that you’re looking for. When your team tries to use Scrum but they violate the Scrum contracts every day, they may be better off switching to Kanban or another approach.
Work out the ways the product, engineering, and design (PED) departments interact with the rest of the company, like sales, marketing, and account management. Standardize a process that lets the other departments know what’s going on in the PED organization.
You need a process for your entire company to be able to make requests from engineering. Make it transparent, so every party involved knows what to expect.
This is the point where you establish an intake process, and let everybody in the company know how to submit a request. Specify what information to provide to maximize the chance of engineering accepting their request, and what they can expect in response time, commitment, and scheduling.
People always do what gets the job done in the easiest way. If the operations team can directly ask an engineer to take care of their request, they will go that route. When this starts happening across the entire organization, these requests will disrupt your roadmap and make your engineering execution unreliable.
You need to establish a formal intake process to keep up execution speed and prioritize these tasks.
The lack of an intake process leads to chaos. Engineering turns into a black box where no one knows the status of their request. Everyone keeps sending messages to the engineers, and that slows everything down even more.
Create an intake process, and make sure to prioritize the requests, so you’re always working on what’s most valuable.
Prioritize by first focusing on the process that addresses the biggest challenge for your organization at the moment. For example, if your company doesn’t have incidents, you may delay coming up with an incident response plan and focus on engineering culture.
You can work on several processes at the same time. You don’t need to have automated deployments before you start working out your intake process. You don’t need to have observability in place before you start working out an SDLC plan with your product and design peers.
Organizations with 10 people don’t need an intake process, but they will as they scale up to 50 people. When an engineering leader gets pinged every day about the status of a ticket, it’s time to focus on making your intake process more transparent.
When you hire people, they come with a range of different ideas and they lack a shared context. Aligning 3-5 people in the same room is easy, but it becomes a different challenge with 100 people on different floors, in different offices, or in a fully remote engineering team.
You can get away without formal processes in small companies, but they become necessary as you grow. Reaching alignment becomes far more challenging at scale, because the number of communication channels grows much faster than the headcount.
In a team of 9 people, you have 36 communication channels. In a team of 50 people, you jump to 1,225 communication channels. At that point, you can’t keep everyone up to date on a one-on-one basis, like you did before.
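These figures come from counting every possible pair of people: a team of n people has n × (n − 1) / 2 one-to-one channels. A short sketch of that arithmetic (the function name is just for illustration):

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct one-to-one pairs in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


for size in (5, 9, 50, 100):
    print(f"{size:>3} people -> {communication_channels(size):>5} channels")
```

Growing the team roughly fivefold, from 9 to 50 people, multiplies the number of channels by about 34, which is why one-on-one updates stop working long before the team feels large.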
You need standards and processes to overcome this. For example, when submitting a ticket, you need to set a standard for minimal necessary information.
At one point, I was in a fundraising company that had no intake process. We had different functions, like finance operations, account management, sales, product, etc.
Whenever someone in a different department wanted something from engineering, they directly messaged an engineer or had a conversation with them in the office. The engineers usually agreed to help out, and the rest of the company took it as a commitment that engineering would get them what they needed.
This led to weird commitments that nobody was tracking and no one was working on. When the deadlines passed, the teams that made these requests got mad and blamed engineering for committing to help in the first place. The engineering organization turned into a black box, and from the perspective of the rest of the company, we were missing deadlines that we didn’t even know about.
This also hurt our execution. If we made a commitment to a key stakeholder like the CFO, and he got mad because his issue wasn’t addressed, we had to drop everything and deliver on his request.
We ended up establishing an intake process. We created a Jira board where anyone in the company could submit a request. The engineering leaders committed to look at that board twice a day.
We created a template for the requests with basic questions like:
The person submitting the ticket had to commit to answering any related questions that came up. When we needed more information, they would make themselves available, or assign someone to answer our questions.
When you set up an intake process, you have to commit to make it work; otherwise, no one will use it. At this company, we set up a team of two developers whose job was responding to every request, prioritizing them, and executing urgent tasks. Sometimes they decided that a request would take too many resources or couldn’t be done, but that was helpful information for the stakeholders as well.
When we implemented the process, no one followed it. They could still get things done by directly messaging engineers, which was simpler for them, so they had no reason to start using the intake process.
I told all my engineers to never act upon a direct message they get from another team. I asked them to send a link to the Jira board if anyone sent them a message about an issue.
Getting this message through required a lot of repetition. Some engineers found it difficult to say no, because they liked helping their colleagues, but they were dropping important work in the process. Eventually, they all stopped taking tasks from other stakeholders, and word got around the company that we only take issues via the intake process.
This made managing the workload easier.
We empowered the engineers on the response team to do what they could to prevent the same issues from coming up again. We wanted to get ahead of the work and stop certain requests from reaching the engineering organization in the first place.
The intake process highlighted patterns in the requests. For example, we saw a lot of password resets and disabled accounts. We could build internal tooling to handle these issues, and many others, so we empowered the other stakeholders in the organization to solve their problems.
We used to get requests to calculate the aggregate financial transactions and list them in a certain way. We ended up creating an Excel report and automatically delivered it to the stakeholders every week. We never received those requests again.
This removed entire classes of requests from the pipeline.
Over time, we kept receiving fewer requests, so we could work with fewer interruptions. This made the company more stable. The focus on prevention worked out in the long term, as it freed up capacity to build the systems we were supposed to.
Walking through the journey of a typical startup CTO highlights when you start needing management. Every company is different, but a senior IC or the technical founder generally becomes CTO by default.
As the team grows, the CTO’s work piles up. Their meeting calendar goes from a couple of meetings per week to several meetings per day. This forces the CTO to leave the IC path and commit to management, which they may not be happy about.
With the most senior engineer out of the front line, other engineers often become unhappy, and delivery can become inconsistent. When the CTO is taking on work that could add up to several full-time jobs, time management becomes impossible and things start to slip through the cracks.
The right time to hire or promote a management layer is when your CTO starts struggling. They can pick up the slack in the areas your CTO isn’t excellent at. Look at the skillset of your CTO in categories like:
When you’re evaluating your CTO’s work and the engineering organization as a whole, ask these questions:
For example, your CTO is great at coding and architecture, but they lack experience in the people management and the execution management area. In that case, you want to bring in a layer of engineering leadership to take that burden off your CTO. Improving management will make it possible for you to keep growing and maximize engineering productivity.
I haven’t seen a successful attempt at scaling a team without managers.
When you try to scale without managers, certain individual contributors start acting like managers, but the lack of formal authority makes it difficult for them. Google experimented with getting rid of management, and it turned out poorly.
Not every organization works well with managers. Before you bring them in, make sure that your organization has the mindset to appreciate management work. Managers streamline processes, improve them over time, and provide visibility into processes, practices, and execution.
If your organization doesn’t appreciate management work, you’re setting up the new managers to fail.
At one time, I was told at an all-hands meeting that managers are useless and engineers just need to do their jobs. This happened because the organization didn’t understand its own complexity, or that managers were necessary to support engineers in doing their jobs. This organization set the expectation that engineers should do everything.
You generally want to avoid your engineers having to chase down stakeholders and interpret product requirements on top of writing code.
When you plan to bring in a management layer, communicate to everyone what the managers will do and how it will impact them and the company. You need to educate them about the role and the upsides of management. You may utilize different tactics depending on whether the resistance comes from the individual contributor or the executive leadership level.
Measurements and metrics can help you convince individual contributors. They often think that they don’t need managers to run the processes, and it’s difficult to argue against their convictions without evidence. On the other hand, you can get your point across if you can point out shortcomings in failure rate, bug rate, or delivery frequency.
The executive leadership may be stuck in a startup mindset, such as, “If you’re not contributing directly, you’re not contributing at all.”
You can reason with executive leadership by talking about leverage. You can work on building the next feature, or you can work on a mechanism to make your next 100 features faster and higher quality. Managers always focus on long-term goals.
A manager can handle about nine direct reports, and working with 15-20 direct reports forces managers to spend all their time in one-on-one meetings. It becomes too much and they start to lose focus. When the manager’s attention splits up among too many direct reports, everyone suffers.
There are natural team size restrictions. In my experience, seasoned managers can work with 7-12 direct reports, while junior managers can start with 3-7 direct reports. When the team gets too large, add new managers, or split up the team.
As your company grows, the same events will happen at different levels. The managers first report to the CTO, but when the CTO has too many direct reports, you may need a new layer of management. At that point, senior managers or directors can alleviate this burden and reduce the number of communication channels one person has to deal with.
Re-evaluate your processes at least every quarter.
You can use metrics to track the impact of a process, and iterate as soon as you see them drop. For example, you can track DevOps metrics like change failure rate, deployment frequency, change volume, and cycle time. As soon as you see a negative impact on these metrics, you can iterate on your processes; you don’t need to wait months.
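As a rough illustration of how two of these metrics can be computed, here is a minimal sketch. It assumes you can export a list of deployments, each with a flag saying whether it caused a failure. The `Deployment` record, the function names, and the sample data are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Deployment:
    day: date
    caused_failure: bool  # True if the deploy led to an incident or rollback


def change_failure_rate(deployments: Sequence[Deployment]) -> float:
    """Share of deployments that caused a failure; 0.0 if there were none."""
    if not deployments:
        return 0.0
    failures = sum(1 for d in deployments if d.caused_failure)
    return failures / len(deployments)


def deployments_per_week(deployments: Sequence[Deployment], period_days: int) -> float:
    """Average number of deployments per 7 days over the observed period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deployments) * 7 / period_days


# Made-up sample: four deployments over a 14-day period, one of which failed.
sample = [
    Deployment(date(2024, 1, 2), caused_failure=False),
    Deployment(date(2024, 1, 4), caused_failure=True),
    Deployment(date(2024, 1, 9), caused_failure=False),
    Deployment(date(2024, 1, 11), caused_failure=False),
]
print(f"Change failure rate: {change_failure_rate(sample):.0%}")
print(f"Deployments per week: {deployments_per_week(sample, period_days=14):.1f}")
```

Computing the same numbers for consecutive periods gives you the trend to watch after each process change.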
If you empower your engineering managers and leaders, you can enable them to build self-managed teams and let them experiment at their own pace. It’s similar to agile software development; you improve your processes in an iterative, incremental manner. Experiment, measure the effects, and update them based on the feedback you get.
As your company goes through different phases of maturity, your previous processes may stop working at the next stage. When you go from an innovation model into a maintenance model, you are going to need to reassess your processes.
This is how you combat process cruft building up in your organization, where some of your processes create friction, or the reason they exist is gone.
Stay vigilant, and constantly check whether your processes still provide value. You can measure their effects with a set of metrics, a qualitative analysis, or a sentiment analysis. You can survey your developers with questions like, “Do you feel better about your job?”
About the author:
Gabor Zold is a content marketer and tech writer, focusing on software development technologies and engineering management. He has extensive knowledge about engineering management-related topics and has been doing interviews with accomplished tech leaders for years. He is the audio wizard of the Level-up Engineering podcast.