Map Side Join in Hive

Vaibhavvaidya
2 min read · Jun 1, 2022


One of the most important join optimizations in Hive is the Map Side join.

Before that, let's understand why we need a Map Side join.
When we join two tables in Hive, behind the scenes the query runs as a MapReduce job. In an ordinary join the reducer has far more work to do than the mapper: the mapper output has to be shuffled across the network and sorted before rows can be matched, and this shuffle and sort is the heaviest and most time-consuming part of the job.
This is where the Map Side join comes into play.

So what is a Map Side join?
In a Map Side join the reducer has nothing to do: all the work is done on the mapper side. The small table is loaded into memory on each mapper, and the rows of the big table are matched against it as they are read. No shuffle or sort is needed, and the mapper output is the final output.

There is only one condition for using a Map Side join:
of the two tables being joined, one should be small (small enough to fit in memory) and the other big.

So What exactly is meant by the term small? 🤔

The threshold is defined by a Hive property, which we can also customize:
set hive.mapjoin.smalltable.filesize;

The value of this property is a size in bytes. If a table's size is under this value, the table is considered a small table; otherwise it is considered a big table ✅

By default, the property is set to 25000000 bytes (about 25 MB).
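
To judge whether a table falls under the threshold, one option is to compare its on-disk size with the property value. The sketch below assumes the orders table from the example later in this post and that statistics can be gathered for it; the 50000000 value is purely illustrative, not a recommendation.

```sql
-- Gather basic statistics so the table size is recorded in the metastore
ANALYZE TABLE orders COMPUTE STATISTICS;

-- Look for totalSize (in bytes) under "Table Parameters" in the output
DESCRIBE FORMATTED orders;

-- Print the current threshold, in bytes
SET hive.mapjoin.smalltable.filesize;

-- Raise the threshold for this session only (illustrative value, about 50 MB)
SET hive.mapjoin.smalltable.filesize=50000000;
```

A larger threshold lets more joins qualify, but the small table still has to fit in each mapper's memory, so it is worth raising it cautiously.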

If one table is small and the other is big, Hive will automatically try to execute a Map Side join, but only if all the criteria are met. 📋
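
One way to check which join Hive actually chose is to look at the query plan. This is a minimal sketch, assuming the customers and orders tables used in the example further down. When the join has been converted, the plan usually shows a Map Join Operator instead of a reduce-side Join Operator.

```sql
-- Automatic conversion is on by default; set here only to make it explicit
SET hive.auto.convert.join=true;

-- Print the execution plan instead of running the query
EXPLAIN
SELECT c.customer_id, o.order_id
FROM customers c
JOIN orders o
  ON c.id = o.customer_id;
```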

If you want to request a Map Side join explicitly, you need to set the following properties:
set hive.auto.convert.join = false;
set hive.ignore.mapjoin.hint = false;

Both properties are set to true by default ✔️

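Once the properties are set, you can explicitly request a Map Side join with a hint. Note that the + must come immediately after /*, otherwise Hive treats the hint as an ordinary comment:
select /*+ MAPJOIN(o) */ c.customer_id, o.order_id
from customers c join orders o
on c.id = o.customer_id;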

NOTE: MAPJOIN(o) is a hint that tells Hive that o (the orders table) is the small table.

Since it is a Map Side join, no reducer is involved and all the work is done on the mapper side.
Benefits:
1. Improves processing time
2. Reduces data transfer between machines

Whenever possible, we should try to write the join so that it runs as a Map Side join.
