Join in Distributed SQL Engines

Join in MapReduce

https://www.edureka.co/blog/mapreduce-example-reduce-side-join/

  • Replicated Join (Map Side Join)
  • Reduce Side Join
  • Reduce Side Join with Bloom Filter
  • Composite Join
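
To make the reduce side join concrete, here is a minimal single-process sketch in plain Python. It only imitates the map, shuffle and reduce steps. It does not use Hadoop, and the function names (map_phase, shuffle, reduce_phase, reduce_side_join) and the sample tables are made up for illustration. The idea is that each mapper tags a record with its source table, the shuffle groups records by join key, and the reducer pairs up the two sides for each key.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, tag, key_index):
    """Emit (join_key, (tag, record)) so the reducer can tell the inputs apart."""
    for record in records:
        yield record[key_index], (tag, record)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every left record with every right record (inner join)."""
    for key in sorted(groups):
        left = [rec for tag, rec in groups[key] if tag == "L"]
        right = [rec for tag, rec in groups[key] if tag == "R"]
        for l_rec, r_rec in product(left, right):
            yield l_rec + r_rec


def reduce_side_join(left, right, left_key=0, right_key=0):
    pairs = list(map_phase(left, "L", left_key))
    pairs += list(map_phase(right, "R", right_key))
    return list(reduce_phase(shuffle(pairs)))


# Hypothetical sample data: (customer_id, name) and (customer_id, item).
customers = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]
joined = reduce_side_join(customers, orders)
```

A replicated (map side) join would skip the shuffle entirely: the small table is loaded into memory in every mapper and looked up per record. The Bloom filter variant keeps the shuffle but drops records in the mapper whose key cannot match the other side.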

Join in Spark SQL

https://medium.com/datakaresolutions/optimize-spark-sql-joins-c81b4e3ed7da

  • Sort-Merge Join
  • Broadcast Join
  • Shuffle Hash Join
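
The merge step of a sort-merge join can be sketched in plain Python as below. This is not Spark code, and sort_merge_join and the sample rows are hypothetical. In Spark the same idea would run per partition after both sides have been shuffled and sorted on the join key. Here everything is in one list for simplicity.

```python
def sort_merge_join(left, right, key=lambda row: row[0]):
    """Inner equi-join by sorting both inputs and merging them with two cursors."""
    left = sorted(left, key=key)
    right = sorted(right, key=key)
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Emit the cross product of the two runs, then skip past them.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append(l_row + r_row)
            i, j = i_end, j_end
    return out


# Hypothetical sample rows: (key, value).
left_rows = [(2, "b"), (1, "a"), (2, "c")]
right_rows = [(2, "x"), (3, "y"), (2, "z")]
merged = sort_merge_join(left_rows, right_rows)
```

Because both inputs are sorted, each side is scanned once and no hash table over a whole input has to fit in memory, which is the usual argument for preferring this strategy when both tables are large.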

Join in Impala

https://impala.apache.org/docs/build/html/topics/impala_perf_joins.html

  • Broadcast Join
  • Partitioned Join
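
The difference between the two strategies is in how rows move between nodes before a local hash join runs. The following is a rough sketch in plain Python, with nodes modelled as lists. It is not Impala code. The names hash_join, broadcast_join and partitioned_join and the sample tables are hypothetical.

```python
from collections import defaultdict


def hash_join(build, probe):
    """Local hash join: build a table on one input, probe it with the other."""
    table = defaultdict(list)
    for row in build:
        table[row[0]].append(row)
    return [b + p for p in probe for b in table.get(p[0], [])]


def broadcast_join(small, big_partitions):
    """Every node gets a full copy of the small table; big-table rows stay put."""
    return [hash_join(small, part) for part in big_partitions]


def partitioned_join(left, right, num_nodes):
    """Both inputs are redistributed by hash of the join key, then joined per node."""
    left_parts = [[] for _ in range(num_nodes)]
    right_parts = [[] for _ in range(num_nodes)]
    for row in left:
        left_parts[hash(row[0]) % num_nodes].append(row)
    for row in right:
        right_parts[hash(row[0]) % num_nodes].append(row)
    return [hash_join(l, r) for l, r in zip(left_parts, right_parts)]


# Hypothetical sample data: a small dimension table and a fact table on 2 nodes.
dims = [(1, "x"), (2, "y")]
fact_partitions = [[(1, "a"), (3, "b")], [(2, "c"), (1, "d")]]
all_facts = [row for part in fact_partitions for row in part]

via_broadcast = sorted(r for part in broadcast_join(dims, fact_partitions) for r in part)
via_partitioned = sorted(r for part in partitioned_join(dims, all_facts, 2) for r in part)
```

Both strategies produce the same rows. Broadcast copies the small table to every node, so its network cost grows with the number of nodes, while partitioned sends each row of both tables to exactly one node, which tends to pay off when both inputs are large.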

Join in Presto

https://prestodb.io/docs/current/admin/properties.html#general-properties

  • Broadcast Join
  • Distributed Hash Join