database - How to optimize a join between a small table and a large table?


A small database (on one machine, n1) and a large database (on machine n2, with a billion records) need to be joined. The app server needs to read the data from the DB servers into memory. Should I read the small DB first and then read the second DB?

How can the join be executed fastest? How is this done in real life in general?

Generally, you should try to push the processing to the database. Maybe the big database server can pull the small one locally and process the join on the server.
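As a rough sketch of that idea, the small data set can be copied into a temporary table on the big server so that the join runs there and only the joined rows travel back. Here `sqlite3` merely stands in for the large database server, and the table and column names (`orders`, `tmp_customers`, `orderid`, `name`) as well as the helper `join_on_big_server` are assumptions made up for illustration.

```python
import sqlite3


def join_on_big_server(big_conn, small_rows):
    """Copy the small data set to the big server and join there.

    big_conn   -- connection to the server holding the large table
    small_rows -- (id, name) tuples read from the small database
    """
    # Stage the small data set in a temporary table next to the big one.
    big_conn.execute(
        "CREATE TEMP TABLE tmp_customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    big_conn.executemany(
        "INSERT INTO tmp_customers (id, name) VALUES (?, ?)", small_rows
    )
    # The database engine now performs the join; only results come back.
    cursor = big_conn.execute(
        "SELECT o.orderid, c.name "
        "FROM orders AS o JOIN tmp_customers AS c ON o.customerid = c.id "
        "ORDER BY o.orderid"
    )
    return cursor.fetchall()


# Hypothetical stand-in for the large database (normally on machine n2).
big_conn = sqlite3.connect(":memory:")
big_conn.execute(
    "CREATE TABLE orders (orderid INTEGER PRIMARY KEY, customerid INTEGER)"
)
big_conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(10, 2), (11, 3), (12, 1)]
)

# Rows read from the small database (normally on machine n1).
rows = join_on_big_server(big_conn, [(1, "Ann"), (2, "Bob")])
```

Whether this is practical depends on having permission to create temporary tables on the big server.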

If you want to process the join in the application, a common and optimal strategy is to perform a hash join. Convert the small data set into a hash table. Then you can probe the items of the big data set against the hash table. This requires little memory and little CPU, and you can stream the big data set instead of loading it all at once.

This strategy works if the join condition is an equality (e.g. orders.customerid = customers.id) and one of the two sets is small enough to fit in memory.
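A minimal sketch of such a hash join follows, assuming rows arrive as dictionaries and the big data set is any iterable (for example a database cursor read row by row). The function name `hash_join` and the parameters `small_key` and `big_key` are hypothetical names chosen for this illustration.

```python
def hash_join(small_rows, big_rows, small_key="id", big_key="customerid"):
    """Equality-join two row sets, yielding (big_row, small_row) pairs.

    small_rows -- the set that fits in memory (read completely)
    big_rows   -- any iterable; consumed one row at a time
    """
    # Build phase: index the small data set by its join key.
    # A list per key allows for duplicate keys on the small side.
    table = {}
    for row in small_rows:
        table.setdefault(row[small_key], []).append(row)

    # Probe phase: stream the big data set and look up each row's key.
    for row in big_rows:
        for match in table.get(row[big_key], ()):
            yield row, match


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
# An iterator stands in for a streamed result set from the big database.
orders = iter([
    {"orderid": 10, "customerid": 2},
    {"orderid": 11, "customerid": 3},  # no matching customer, dropped
    {"orderid": 12, "customerid": 1},
])

joined = [(o["orderid"], c["name"]) for o, c in hash_join(customers, orders)]
```

Because the function is a generator, memory use is bounded by the size of the small set, while each row of the big set is handled once and then discarded. This sketch has inner-join semantics: big-side rows with no match are skipped.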

