Apache Spark has changed dramatically in the past year, from new APIs in Spark 1.4 to major execution improvements and even better APIs in 2.0. In this intermediate-level tutorial, I'll address the question of which Spark APIs to use through a series of brief technical explanations and demos that highlight best practices, the latest APIs, and new features.
We'll examine how Dataset and DataFrame behave in Spark 2.0, explore Whole-Stage Code Generation, and walk through a simple example of Spark 2.0 Structured Streaming (streaming with DataFrames) that you can run in your own free instance of Databricks.
Spark Training from ProTech
If you're just getting started with Spark development, check out our 3-day Spark Programming course page to see upcoming public classes or request onsite training for your team.