Managing platform teams is an interesting leadership challenge. Few leaders have had the opportunity to run platform teams, so the know-how isn’t well-documented in engineering management literature.
When do you need to start your platform team?
What kind of engineers do you want in your platform team?
How do you smooth out the notoriously difficult communication and avoid creating bad blood with the product team?
We bring you answers to these questions from legendary leader Camille Fournier. She’s currently running platform engineering as Managing Director at Two Sigma. This post is based on the interview in episode 50 of the Level-up Engineering podcast, hosted by Karolina Toth.
Camille is currently leading platform engineering at Two Sigma as Managing Director. She lives in New York City, she has two kids, and her passion is traveling, when COVID doesn’t get in the way.
Two Sigma is a financial services company providing data-driven services across quantitative investing, insurance and private markets, including real estate, private equity and venture capital. The business also includes a platform for investment allocators. They combine a scientific and a technology-driven mindset in each field to approach different financial challenges.
Previously, Camille was CTO at a startup called Rent the Runway. She wrote the book The Manager's Path, and she regularly publishes articles in various places about management, engineering leadership, architecture, and more.
The platform team at Two Sigma is the software infrastructure part of the engineering team.
Historically, you might have called teams doing similar work infrastructure teams. The difference is that infrastructure engineering tends to focus on physical infrastructure.
In recent years, companies have started using cloud services, so the platform team’s role has been changing. Cloud services don’t erase the need for platform teams, because leaving every product team to figure out the cloud for themselves would lead to chaos.
Platform teams make the different cloud offerings work together in a maintainable manner.
Our platform team builds most of the shared software infrastructure within Two Sigma. For example, we build software development tools, Kubernetes clusters, or the software we use as glue for our public cloud usage. Bespoke builds and test platforms live inside the platform team as well.
The idea behind the platform engineering team is that they build development tools that multiple product engineering teams can use. It may be multiple teams or multiple different parts of the business. It’s also the platform team’s job to maintain these tools until they are decommissioned.
You don’t involve the platform team when you need a one-off custom tool.
Platform teams always have internal customers, which opens up the opportunity to get a better understanding of their needs. Product teams tend to work with external customers, so it takes extra effort to translate business goals and to figure out what they need to build. Translating a business idea or a user need into software can be difficult when your customers aren’t engineers themselves.
At Two Sigma, we are working primarily for internal customers. We work directly with our internal clients, understand their challenges, and build software to meet their needs.
On the other hand, Rent the Runway was building a product for a wide consumer audience. When you’re building for a large group, you don’t have a deep understanding of your customer, because you can’t get individualized feedback. It’s the job of product managers to figure out what users want, and to work with engineering to serve their needs.
Platform teams tend to build longer-running projects with a more operational focus. Platforms are meant to be long-running systems where the key values include the following:
Stability can be less important for product teams. Product developers often focus on building prototypes, where reliability isn’t mission-critical. In this case, it’s not about stability; it’s about working with product managers and quickly turning a spec into a working demo.
Platform teams have to look into the future to see how their product is going to be used, how it may evolve, and how a future transition can be managed. You can’t say to a thousand engineers that you’re going to shut down an old tool tomorrow and they have to migrate everything to the new tool by then.
Platform engineering teams have to be ahead of demands coming from the product teams. If you aren’t looking far enough into the future, you can’t react to their needs quickly enough, and they may end up building their own tools. That’s okay when only one team is going to use it, but scaling a platform product that was hacked together to serve one highly specific purpose can be a problem.
At times, the platform team has to stop product teams from implementing new technology because it doesn’t meet security standards. Another common occurrence is that you can’t let in new tools simply because you haven’t had a chance to evaluate them yet.
Platform teams need to work on becoming enablers. You don't want to be the bottleneck when others are excited about trying out cutting-edge technologies. It’s better to support them, run safe experiments, and see what you may learn from them.
Partnering with early adopters can be a great opportunity for the platform team to learn the value of new tools, and how you may adapt them to the platform. This sounds exciting, but it can be boring work when it comes to learning how to operate, scale and install these tools.
You can draw useful conclusions from these experiments, for example:
The platform team has to drive the product teams in replacing unstable legacy tools. You have to let them know which tool is outdated and when the replacement is ready, and make sure the transition is smooth.
This is extra work for product teams because migrating to new technology isn’t a trivial process. They can get frustrated with demands for upgrading and migrating. Nobody’s happy about moving from Python 2 to Python 3 when it shows up on the product team’s roadmap out of nowhere.
It’s the platform team’s job to make these migrations as easy as possible for the product teams. The platform team also takes on the responsibility of modernizing security, stability, reliability, and scalability aspects, so the product teams don’t need to worry about it.
We’re always evolving, but Two Sigma has been running a heavyweight yearly planning process. Each product team partners with an engineer from the platform team during planning to map out tooling requests. This also allows the platform team to let them know when there’s a migration coming up, so they can put it into their plans.
Coordination during planning works better from platform to product than the other way around. This is because platform teams are building fewer large systems, so they have a clearer picture of what they will build and release during the year. Product teams often learn on the go that they need to launch something in a few months, and they need a new tool to build that.
We have product managers throughout engineering, and sometimes leadership coordinates products through them. You can’t have product managers be the only channel of communication, especially when it comes to coordinating technical details. You have to involve engineering leaders, engineering managers, tech leads, and senior developers as well.
Communication works best when a product team reaches out directly to a team in the platform organization. They can communicate quickly about what they need and find a solution. The more you involve senior leadership in early stages, the slower and more complex the process can be.
Sometimes you don’t have a choice but to escalate the situation because the platform team you’re in contact with lacks the bandwidth to address your problem. I aim to provide flexibility for my platform teams to work with their product counterparts, but it has to be balanced.
If the platform team is constantly working on fulfilling one-off requests, it hurts productivity. If you see that happening, you need to figure out what the product teams are trying to do, and plan ahead to provide the necessary tooling.
It’s important for engineers to keep some slack in their schedules. Sometimes the right decision is to deprioritize a large project to do something else and unblock another team. It works in any situation where there is a dependency between teams; giving up some local productivity increases the overall engineering productivity.
Sometimes the platform team doesn’t get looped in at the right time to build ahead of the product team’s needs. The product team often ends up using old systems to solve its problems, hacking them together to make things work. By the time the platform team finds out about it, the product team may be close to launch.
Often, you could have built a better tool for them, or they may have built new features based on a tool so old you’re about to decommission it.
Product teams can see the platform team as a bottleneck and avoid involving them. Other times product engineers may take on technologies that belong in the platform team’s domain because they want to play with new technologies.
Sometimes they actively try to avoid bringing in the platform team because that brings in extra coordination requirements. They’re right, but we all need to work with people outside our teams, which comes with some overhead, even if engineering leaders do their best to make it easier.
If you let this go on, you can end up with a massive shadow IT.
The problem comes when the product engineers call in the platform team to take over a tool they built. This causes resentment and frustration on both sides.
The product engineers were just trying to move fast, and they don’t think they did anything wrong. At the same time, the platform team has to take over a tool they had no hand in building, and it may even be inoperable. At this point, the platform team has to make it useful beyond that one team and support it moving forward.
It’s almost impossible to stop this from happening. Engineers always want to move fast, and they can’t estimate the long-term cost of adding new platform elements.
I try to prevent this by going out there and trying to get involved as early as possible. Resources are still limited, so this may happen despite your best effort.
Platform engineers need to have an interest in and knowledge of the interface between the software and the machine. For example, this may be a virtual machine, or it may be bare metal depending on your company.
Here are some current technologies in platform engineering:
I look for customer empathy when hiring software engineers for my platform teams. It’s important, because providing support for the engineers using your platform is a big part of the job.
In platform teams, especially in teams building software development tools, often you’re building a tool while you’re also using it. This gives you a deep understanding of that system.
In this situation, platform engineers tend to forget that their counterparts working on a product may be capable engineers, but they’re not living and breathing those tools. For example, product engineers don’t need to have a perfect mental model of Git’s innermost workings.
You want to hire engineers for platform teams who can be patient with technical customers, and have the ability to step back and appreciate their points of view as well. This soft skill has become important in every field of software development, but it’s critical in platform teams, even though your customers are engineers as well.
You can check for customer empathy in interviews by asking candidates about times when they built a tool that another engineer on their team used and had problems with. The point is to see how they work with other engineers to understand the problem, and how they help them.
Not everyone has built development tools, but most engineers have worked on a team with other engineers before. In this case, you may ask them, “How do you think your colleagues will be able to interact with the software you build?” You can get a sense of empathy if you see that they aim to write code that others can read, or by seeing their approach to helping others work with their code.
From there, you can lead the conversation to see how they react to other engineers having a different perspective of their code. You want to see whether they can snap out of their own perspective and recognize that it’s important to help their colleagues.
You don’t want to see the attitude that they only care about writing code. A lack of openness to helping others understand their code is a red flag when you’re hiring for a platform team.
There are great engineers who write great code but aren’t collaborative. You may want to hire people like that for specific roles, but probably not for a platform engineering team, which requires a lot of collaboration.
It depends on the size and age of your company. For example, startups often don’t have a platform, but they may have a team that builds business services at the core of everything else.
Rent the Runway had built solid back-end services around core business features like taking reservations or expanding the product catalog. It didn’t make sense for strategic, product-focused teams to own these services, so we built teams specifically to keep them operational and scalable.
Teams responsible for the company’s storage systems, or DevOps & SRE teams, are generally close to being platform teams.
As companies evolve, you often see specific engineers starting to own their core business services that many product teams work with. These people tend to keep taking on more functionalities. Sometimes this work happens in the DevOps & SRE teams as they invest in scaling, stability, and reliability, and they start to build their own tools.
Tech companies are bound to create platform teams as they scale up. Your goal may be simply to make sure that product teams won’t make different decisions about what databases or storage systems to use.
When you look at your engineering teams, and see that a couple of people in each team are working on cloud integrations, it’s time to form a dedicated platform team. You don’t want 10 engineers across 5-10 teams to build this. You can do a better job with five people in one team supporting all the other teams while freeing up the other five engineers to work on the product.
Centralizing platform work can slow down the evolution and add extra coordination. But there is a point where the gains in security, reliability, and product team bandwidth outweigh the difficulties in communication.
In my experience, platform teams tend to fail when they get caught up in the hype of a new technology, for example Kubernetes. They start building an offering for it without thinking through what they’re already using, and how they can migrate from the legacy tool to the new one. It may not be the first thing you think about, but considering these aspects is also the job of platform teams.
Platform engineers have to be patient and look further ahead in the future than most product engineers. People often underestimate this aspect, which leads to failed projects and teams.
Platform engineering isn’t just about discovering new tech or maintaining old tech. You have to move the old tech into the new as seamlessly as possible. Working in a platform team requires a specific type of patience.
Building new stuff is great, but the job is more than that.
You have to get the product teams to use it, understand their problems and fix them. You have to look at the long tail of people using the old tech, and think about what improvements you can offer them to get them to migrate. You also have to figure out how to make the migration as easy as possible, going as far as even doing it for the product teams in some cases.
About the author:
Gabor Zold is a content marketer and tech writer, focusing on software development technologies and engineering management. He has extensive knowledge about engineering management-related topics and has been doing interviews with accomplished tech leaders for years. He is the audio wizard of the Level-up Engineering podcast.