Apache Spark Optimization with Scala
Unlock the secrets to writing high-performance code with Apache Spark and Scala. This course covers the essential tools and techniques for optimizing your applications so they run blazing fast. Learn from industry experts and master the practices top developers use to write efficient, effective code.
- Duration: 9h of 4K content
- Lessons: 27 lessons
By Daniel Ciocîrlan
Money-back guarantee · Unlimited access · Free updates
Course Roadmap
Skills You'll Learn
- Understand Spark internals and predict job performance
- Read query plans before jobs run
- Read and interpret DAGs while jobs are running
- Understand performance differences between Spark APIs
- Write broadcast joins to eliminate expensive shuffles
- Optimize joins with column pruning and pre-partitioning
- Use bucketing for fast data access
- Fix data skews, straggling tasks, and OOMs
- Use RDD broadcast joins and cogrouping for multi-way joins
- Apply the iterator-to-iterator pattern for efficient processing
- Leverage JVM optimizations for high-performance Spark jobs
- Configure and deploy Spark in multiple ways
Goal
Why the $&*(# is my Spark job running so slow?
I’ve had my fair share of pain with Spark, and if you’re reading this, you’ve probably seen this too: you run a 4-line job on a gig of data, with two innocent joins, and it takes a bloody hour to run. Or another one: you have an hour-long job which was progressing smoothly, until task 1149/1150, where it hangs, and after two more hours you decide to kill it because you don’t know if it’s you or a bug in Spark. Usually, PIBKAC - problem is between keyboard and chair - but in desperation, the only idea you have is to turn it off and on again.
Then you go like, “hm, maybe my Spark cluster is too small, let me bump some CPU and mem”. Then… same thing. Amazon’s probably laughing now and you’re paying for it. So this has to be the million-dollar question.
This is the only course on the web where you can learn how to optimize Spark jobs and master Spark optimization techniques. With the strategies you learn in this Spark optimization course you will save yourself time, headaches and money.
Let’s improve that Spark performance.
In this Spark optimization course, we cut the weeds at the root. We dive deep into Spark performance optimization and you will learn how it works under the hood. We’ll see that we have incredible leverage, IF we write intelligent code, and you will do exactly that. You will learn 20+ Spark optimization techniques and strategies. Each of them individually can give at least a 2x perf boost for your jobs (some of them even 10x), and I show it on camera.
What Our Students Say
-
My team is expanding the use of Akka in our products so I needed a quick introduction on this topic. I have tried a couple of courses but the introduction to Akka was always too abrupt, too hard to comprehend. I blamed Akka for this as being too hard to explain. This was until I was exposed to the Rock The JVM courses which were an absolute delight when it comes to presenting such complex topics in such an easy to understand way. And Daniel has not stopped at Akka but has added to his portfolio amazing courses on Scala and Spark too. It seems like he is quite enjoying taking such challenges like complex technologies and making them so simple for everyone. I have instantly recommended Daniel’s work to my team, which helped them immensely with taking their skills to a new level, and I do recommend these courses to anyone who wants to have the fastest ramp-up in these tough but popular technologies.
Mihai Fecioru, Adobe · California
-
From Scala, to Akka, to Spark, Daniel delivers exceptional material in each and every one of these technologies. I’ve been using them for a long time and there is always something new I will discover from him. The level of detail he gets into as well as the way he delivers material is mindblowing. I personally find his latest course Spark Optimization pure gold and one of a kind. I’ve been using Spark for a year now and I haven’t even thought how much you can leverage query plans to make such optimizations. I can’t stop thinking every time, how he manages to go so deep - because using a technology is one thing, but knowing its internals so well and how everything works behind the scenes is another story when it comes to distributed systems. Long story short Daniel is definitely the best instructor I’ve come across and each one of his courses is the best resource you can find online. Kudos for all your work and knowledge sharing.
Giannis Polyzos, Ververica · Greece
-
Daniel’s courses on Scala and Big Data are the best in class. I’ve been in touch with Daniel’s teaching and courses since early 2018. The first course that I took from him was Scala & Functional Programming; I was skeptical about it because over the internet there are many courses you can find, but few really worthy. I remember the very first day when Daniel started to speak and shared his examples - I started to love Scala, and then more as we went on. I am with Scala for the last 5 years now, but never ever has anyone explained to me or gave me comparable resources to Rock the JVM. Daniel gave me a shift in life and helped me crack top tech company interviews. His courses on big data are a must for any aspiring big data developer or data enthusiast. I highly recommend Daniel as an educator both online and on campus.
Anirban Goswami, Apple · California
What's Included
Optimizing DataFrame Transformations (Mostly Joins)
Optimizing RDD Transformations (Mostly Joins)
Meet Rock the JVM
Daniel Ciocîrlan
Founder, Rock the JVM
I'm a software engineer and the founder of Rock the JVM.
I started Rock the JVM out of love for Scala and the technologies it powers. They are amazing tools, and I want to share as much of my experience with them as I can.
I've taught Java, Scala, Kotlin and related technologies such as Cats, ZIO and Spark to 100,000+ students at various levels. I've held live training sessions for companies including Adobe and Apple, taught university students who now work at Google and Facebook, run Hour of Code for 7-year-olds, and taught more than 50,000 kids to code.
I have a Master's Degree in Computer Science and I wrote my Bachelor's and Master's theses on Quantum Computation. Before learning programming, I won medals at international Physics competitions.
Enroll now!
All-Access Membership
Full (and growing) catalog
$195 billed yearly · Save 54%
Unlimited access to every Rock the JVM course
- 348 hours of 4K content
- All Scala courses
- All Kotlin courses
- All Typelevel courses
- All ZIO courses
- All Apache Spark courses
- All Apache Flink courses
- All Akka/Pekko courses
- Access to the private Rock the JVM community
- New courses included automatically
The Apache Spark Bundle with Scala
4 courses, one price
$180 · All courses in this bundle with a one-time payment
- 4 courses included
- 38 hours of 4K content
- All PDF slides
- Free updates
- Lifetime access
- Access to the private Rock the JVM community
Apache Spark Optimization with Scala
Lifetime license
$85 · Just this course with a one-time payment
- 9 hours of 4K content
- All PDF slides
- Free updates
- Lifetime access
- Access to the private Rock the JVM community
100% Money Back Guarantee
If you're not happy with this course, I want you to have your money back. Contact me with a copy of your welcome email and I will refund you.
Fewer than 0.05% of students have ever asked for a refund, and every payment was returned in under 72 hours.