The document discusses deploying machine learning models in production environments. It outlines challenges with current approaches, such as models being treated as opaque objects and tooling that focuses on training rather than prediction. It then proposes six requirements for an architecture that serves live traffic directly from trained models: 1) easy integration, 2) high performance, 3) fault tolerance, 4) scalability, 5) maintainability, and 6) extensibility. Finally, it introduces Dato Predictive Services, a platform that meets these requirements by deploying models as low-latency REST services that scale elastically and include monitoring and model-management capabilities.
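To make the "model as a REST service" idea concrete, here is a minimal sketch using only Python's standard library. This is not Dato's actual API; the linear scorer, the `/predict` route, and the feature names are all stand-ins for illustration, and a real deployment would load a serialized model artifact and run behind a production server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in model: a fixed linear score over two hypothetical features."""
    weights = {"x1": 0.4, "x2": -0.2}
    bias = 0.1
    return bias + sum(weights[k] * features.get(k, 0.0) for k in weights)

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Accept JSON like {"features": {"x1": 1.0, "x2": 1.0}}
        # and return {"score": ...}.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        score = predict(payload.get("features", {}))
        body = json.dumps({"score": score}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

# To serve locally (blocking call):
# HTTPServer(("127.0.0.1", 8080), PredictHandler).serve_forever()
```

A platform like the one the document describes would wrap this same request/response shape with elastic scaling, fault tolerance, and monitoring around many such model endpoints.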