Snorkel helps the world's largest organizations solve their toughest ML challenges. To continue our growth and achieve our product initiatives around foundation and large language models, we needed to fully redesign our interactive ML systems so that our products stay performant under increasing data and model scale.
However, building low-latency ML products for enterprises is challenging. Some enterprises are on-premises only and have limited compute resources. Others have very large datasets that need to be processed interactively. Still others want to use the latest and greatest large language models. How do you go from sprawling requirements to an architecture that is performant for everyone?
In this talk, we share how we went from customer requirements all the way to using Ray to design and build the interactive ML system that now powers our flagship enterprise product, Snorkel Flow. We'll dive into:
Distributed data/task parallelism to run ML workloads of any scale for resource-constrained customers.
Scalable in-memory processing for blazing fast ML workloads for resource-abundant customers.
How we combined the above two approaches into a single architecture deployable in any customer environment.
Lessons learned from using Ray to build performant ML systems.
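To make the first bullet concrete, here is a minimal stand-in sketch of the data/task-parallel pattern: split a dataset into partitions, fan each partition out to a worker, and gather the results. This uses only the Python standard library for illustration; the function names and data are hypothetical, not Snorkel's actual code. Ray's `@ray.remote` tasks follow the same scatter/gather shape but scale across machines.

```python
from concurrent.futures import ThreadPoolExecutor

def featurize(partition):
    # Hypothetical per-partition ML workload; here, a trivial
    # per-record computation standing in for real featurization.
    return [len(record) for record in partition]

def run_parallel(partitions, max_workers=4):
    # Scatter: submit one task per partition; gather: collect results
    # in partition order. With Ray, each call would be a remote task
    # (featurize.remote(p)) gathered via ray.get().
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(featurize, partitions))
```

The key design point the talk's bullets hint at is that the same partitioned-task interface can be backed by either a distributed scheduler (for resource-constrained, on-premises deployments) or an in-memory executor (for resource-abundant ones).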
Will is a software engineer at Snorkel AI and the tech lead for interactive ML infrastructure, enabling enterprises to run heavy ML workloads over large datasets at low latency. Prior to Snorkel, Will was a cofounder of include.ai, a Sequoia-backed company. Before that, Will spent time at DeepMind and Google Brain, where he was a coauthor on the reinforcement learning for chip design effort. Will studied computer science at Stanford.
John Allard is a Member of Technical Staff at OpenAI, where he works on the fine-tuning product. Prior to OpenAI, he worked at the intersection of infrastructure and backend systems as a staff engineer at Snorkel AI. A UC Santa Cruz computer science graduate, John is passionate about API design, distributed systems, and large-scale inference.
Come connect with the global community of thinkers and disruptors who are building and deploying the next generation of AI and ML applications.