This document discusses random decision forests and how they work at scale. It gives an overview of decision trees and of random decision forests. An individual decision tree is prone to overfitting; a random decision forest addresses this by growing many trees, each on a randomly selected subset of the data and features, and then combining their predictions. Averaging over the ensemble generally yields better accuracy than any single tree. The document demonstrates how to build random decision forests with Spark MLlib and discusses tunable hyperparameters such as the number of trees and the feature subset strategy.
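
The ensemble idea summarized above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Spark MLlib code the document demonstrates: each "tree" is reduced to a one-split decision stump, the toy data is made up, and every name (`train_stump`, `train_forest`, `predict_forest`, `num_trees`, `features_per_tree`) is hypothetical. The two parameters loosely correspond to the hyperparameters mentioned above, which Spark MLlib exposes as `numTrees` and `featureSubsetStrategy`.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_indices):
    """Fit a one-split tree: pick the (feature, threshold, left label)
    with the fewest training errors among the allowed features."""
    best = None
    for f in feature_indices:
        for t in sorted({r[f] for r in rows}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if r[f] <= t else 1 - left_label) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    _, f, t, left_label = best
    return f, t, left_label


def predict_stump(stump, row):
    f, t, left_label = stump
    return left_label if row[f] <= t else 1 - left_label


def train_forest(rows, labels, num_trees, features_per_tree, seed=0):
    """Grow num_trees stumps, each on a bootstrap sample of the rows
    and a random subset of the features."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    forest = []
    for _ in range(num_trees):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Random feature subset (assumption: chosen once per stump, which
        # equals "per split" here because a stump has a single split).
        feats = rng.sample(range(d), features_per_tree)
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict_forest(forest, row):
    """Combine the ensemble by majority vote over the individual stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Made-up toy data: class 0 has small feature values, class 1 large ones.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.1), (1.2, 1.9),
        (6.0, 5.5), (6.5, 6.2), (7.0, 5.9), (5.8, 6.8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels, num_trees=25, features_per_tree=1)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (9.0, 9.0)))
```

An odd `num_trees` is used so that a binary vote cannot tie. A full random forest would grow deeper trees and re-draw the feature subset at every split; the stump keeps the sketch short while preserving the two sources of randomness and the vote.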