Ray Deep Dives

Deploying Many Models Efficiently with Ray Serve

September 19, 1:00 PM - 1:30 PM

Serving many models has become essential as businesses pursue diverse, customized use cases. This raises a challenge: how to deploy and manage these models efficiently while keeping them easy to use and cost-effective. This talk surveys patterns for serving many models with Ray Serve. We will delve into how three Ray Serve features (model composition, multi-application, and model multiplexing) enable deployment of many models while optimizing resource utilization.


• Discuss common industry patterns for serving many models.

• Learn how to simplify management and enhance performance of many-model serving through Ray Serve's model composition, multi-application, and model multiplexing features.

• Deep dive into case studies of Ray Serve users running many-model applications in production.

About Sihan

Sihan is a software engineer at Anyscale and a contributor to Ray Serve. Before joining Anyscale, he was a software engineer at Pinterest, where he worked on the ML inference service.

About Jon

Jon Park is a Principal ML Engineer at Clari, where he leads ML Platform. Previously, he worked as a Director of API Platform Engineering at TIBCO. He holds a BA in Computer Science and an MBA from UC Berkeley, and a Master of Computer Science in Data Science from UIUC.

About Cindy

Cindy Zhang is a software engineer focusing on Ray Serve and Ray infrastructure at Anyscale.

Sihan Wang

Software Engineer, Anyscale

Jon Park

Principal ML Engineer, Clari

Cindy Zhang

Software Engineer, Anyscale
Ray Summit 23 logo


