This document provides an overview of installing and programming with Apache Spark on the Hortonworks Data Platform (HDP). It discusses how Spark fits within HDP and can be used for batch processing, streaming, SQL queries and machine learning. The document outlines how to install Spark on HDP using Ambari and describes Spark programming with Resilient Distributed Datasets (RDDs), transformations, actions and caching/persistence. It provides examples of Spark APIs and programming patterns.