Our infrastructure team manages our data center and high-performance computing clusters. This includes running and scaling Kubernetes, deploying on-prem hardware, capacity planning, and working with other teams on experiment and tooling design. See our recent blog post to get a sense of the kinds of challenges we solve in our day-to-day work. This position closely resembles an infrastructure/DevOps role at a very large-scale startup.
We look for a track record of the following:
In this role, you will work closely with researchers and directly accelerate their work, but you don’t need to become a machine learning expert yourself. We value people who can quickly gain a deep technical understanding of new domains and who enjoy being self-directed and identifying the most important problems to solve. Experience with high-performance computing or open-source contributions is a bonus.
Time commitment: Full time