EPISODE 1 March 3, 2026 50 min

How Wayfair uses Arize for Reliable Agentic AI

Tags: agentic AI, evals, retail, logistics, Wayfair, Arize AI
Summary

Host Aparna Sinha discusses the current landscape of enterprise AI with guests Victor Sulaiman from Wayfair and Aman Khan from Arize AI. They explore the integration of AI in retail, the importance of evaluating AI systems, and the emerging field of Agentic Commerce. The conversation covers optimizing logistics and support with AI, the need for human oversight of autonomous agents, and techniques for measuring the ROI of enterprise AI investments.

Why Listen

If you're trying to figure out where the ROI actually lives in Agentic AI, this is the episode for you. Learn how one of the world's largest retailers moved beyond the pilot phase into production-grade agentic systems — from customer support automation to freight logistics optimization.

Key Takeaways

  • Wayfair reduced ticket resolution times from 7 days to 2 days by automating low-lift tickets
  • Using multiple LLMs as a jury to deliberate on decisions avoids costly errors
  • LLM reasoning can optimize freight capacity beyond traditional algorithms
  • Sometimes a simple reflex agent beats a complex hierarchical one — avoid over-engineering
  • Real-world feedback loops are essential before scaling AI agents to production

In the debut episode of Enterprise Aligned AI, host Aparna Sinha sits down with Victor Sulaiman, Senior Product Manager at Wayfair, and Aman Khan, Head of Product at Arize AI, to get into the weeds of how one of the world’s largest retailers moved beyond the “pilot phase” into production-grade agentic AI systems.

AI in Retail: More Than Chatbots

Victor walks through how Wayfair is deploying AI across the business — from customer support automation that reduced ticket resolution times from 7 days to just 2 days, to agentic commerce experiences that help customers find and visualize products in entirely new ways.

The Evaluation Problem

Aman explains why building an AI agent is easy but making it reliable is the real challenge. Arize’s approach to scalable AI evaluation and governance — including open-source tools — gives teams the confidence to ship agents to production. The key insight: you need to evaluate agents in the real world, not just in test environments.

Agentic Logistics

One of the most fascinating segments explores how Wayfair uses LLM reasoning — not just traditional algorithms — to optimize freight capacity and container volume. This is agentic AI applied to hard operational problems where the ROI is immediately measurable.

LLM as a Jury

Why does Wayfair use multiple LLMs to “deliberate” on decisions like furniture translations across languages? Because a single model makes hilarious (and costly) errors. The jury approach with human oversight creates a reliability layer that makes autonomous decisions safe for production.
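As a rough illustration of the jury idea, the sketch below polls several models, accepts the majority answer only when enough of them agree, and otherwise routes the decision to a human reviewer. It is a minimal sketch under stated assumptions, not Wayfair's or Arize's implementation: the `Juror` type, the `jury_decide` function, the `min_agreement` threshold, and the stub jurors are all hypothetical names introduced here, and a real system would call actual LLM APIs in place of the stubs.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical juror type: any callable that maps a prompt to a short answer.
# In practice each juror would wrap a call to a different LLM.
Juror = Callable[[str], str]


@dataclass
class Verdict:
    answer: Optional[str]  # majority answer, or None when escalated
    agreement: float       # share of jurors that backed the top answer
    needs_human: bool      # True when the jury did not agree strongly enough


def jury_decide(prompt: str, jurors: Sequence[Juror],
                min_agreement: float = 2 / 3) -> Verdict:
    """Collect one vote per juror and accept the top answer only if its
    share of votes reaches min_agreement (an assumed threshold)."""
    if not jurors:
        raise ValueError("at least one juror is required")
    # Normalize answers so trivial formatting differences do not split votes.
    votes = Counter(juror(prompt).strip().lower() for juror in jurors)
    top_answer, count = votes.most_common(1)[0]
    agreement = count / len(jurors)
    if agreement < min_agreement:
        # Weak consensus: hand the decision to a human reviewer.
        return Verdict(answer=None, agreement=agreement, needs_human=True)
    return Verdict(answer=top_answer, agreement=agreement, needs_human=False)


# Stub jurors standing in for real models (assumption, for illustration only).
agreeing_jury = [lambda p: "sofa", lambda p: "Sofa ", lambda p: "couch"]
split_jury = [lambda p: "sofa", lambda p: "couch", lambda p: "bench"]

accepted = jury_decide("Translate 'canapé' for a US listing", agreeing_jury)
escalated = jury_decide("Translate 'canapé' for a US listing", split_jury)
print(accepted)
print(escalated)
```

The design choice worth noting is that disagreement is treated as a signal rather than an error: a split jury produces no automatic answer at all, which is one simple way to keep a human in the loop for the cases most likely to be wrong.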

Build vs. Buy and Scaling Up

The episode concludes with practical insights on the build vs. buy dilemma for AI tooling and strategies for scaling AI from pilot projects to full production deployment.

Guests

Aman Khan

Head of Product @ Arize AI

Aman is Head of Product at Arize AI, an AI development and evaluation platform used by companies like Uber, Duolingo, Reddit, Instacart, Booking.com and Wayfair. At Arize, Aman helps teams launch and improve their AI agents and systems. He recently led a popular deeplearning.ai course on Evaluating AI Agents, and has been featured several times in Lenny's Newsletter on AI product management and evals. Prior to Arize, Aman led products at Spotify, Cruise and Apple.

Victor Sulaiman

Senior Product Manager @ Wayfair

Victor is a Senior Product Manager at Wayfair, where he leads AI and agentic workflows for retail and marketplace platforms. He has over a decade of experience driving enterprise AI adoption, including roles at Uber and Meta, and has also worked on foundational model research. He specializes in building scalable, reliable agent systems that put the user at the center. Victor is passionate about the intersection of product strategy, marketplace economics, and responsible AI deployment.
