Ray Deep Dives

Developing and Serving RAG-Based LLM Applications in Production

September 18, 2:30 PM - 3:00 PM

Developing and serving LLM applications involves many moving pieces. This talk provides a comprehensive guide to developing retrieval augmented generation (RAG) based LLM applications, with a focus on scale (embed, index, serve, etc.), evaluation (component-wise and overall) and production workflows. We’ll also explore more advanced topics such as hybrid routing to close the gap between OSS and closed LLMs.


• Evaluating RAG-based LLM applications is crucial for identifying and productionizing the best configuration.

• Developing your LLM application with scalable workloads involves minimal changes to existing code.

• Mixture of Experts (MoE) routing allows you to close the gap between OSS and closed LLMs.

About Philipp

Philipp Moritz is one of the creators of Ray, an open source system for scaling AI. He is also co-founder and CTO of Anyscale, the company behind Ray. He is passionate about machine learning, artificial intelligence and computing in general and strives to create the best open source tools for developers to build and scale their AI applications.

About Goku

Goku works on education, engineering and product at Anyscale Inc. He was the founder of Made With ML, a platform that educates data scientists and MLEs on MLOps first principles and production-grade implementation. He has worked as a machine learning engineer at Apple and, prior to that, was an ML lead at Ciitizen (a16z health).

Philipp Moritz

Co-founder and CTO, Anyscale

Goku Mohandas

AI Education Lead, Anyscale

Ready to Register?

Come connect with the global community of thinkers and disruptors who are building and deploying the next generation of AI and ML applications.

Join the Conversation

Ready to get involved in the Ray community before the conference? Ask a question in the forums. Open a pull request. Or share why you’re excited with the hashtag #RaySummit on Twitter.