Description
Currently, merge join and merge cogroup use two DAGs: the first DAG creates the index file in HDFS, and the second DAG performs the merge join. As with replicated join, we can broadcast the index, cache it, and use it in merge join and merge cogroup. This should improve performance and also eliminate the need for the second DAG.
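To illustrate how a cached index could be consulted once it has been broadcast, here is a minimal sketch. It is not Pig's implementation: the class `MergeJoinIndexSketch`, its methods, and the choice of a `TreeMap` keyed by the first join key of each right-side split are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory index built from broadcast data
// rather than read back from an index file in a second DAG.
public class MergeJoinIndexSketch {
    // Assumed layout: first join key of each right-side split -> split number.
    private final TreeMap<String, Integer> firstKeyToSplit = new TreeMap<>();

    public void addSplit(String firstKey, int splitId) {
        // If several splits begin with the same key, keeping the last one is
        // enough: lookups use strictly lower keys, so the earlier of those
        // splits can hold only that same key.
        firstKeyToSplit.merge(firstKey, splitId, Integer::max);
    }

    // Returns the right-side split where scanning should begin for leftKey,
    // or -1 if the index is empty.
    public int startingSplit(String leftKey) {
        // Use the last split whose first key is strictly lower than leftKey,
        // because leftKey may also appear at the end of that split.
        Map.Entry<String, Integer> entry = firstKeyToSplit.lowerEntry(leftKey);
        if (entry != null) {
            return entry.getValue();
        }
        // No split starts before leftKey: begin at the first split.
        return firstKeyToSplit.isEmpty() ? -1 : firstKeyToSplit.firstEntry().getValue();
    }
}
```

With splits starting at "apple" (0), "kiwi" (1), and "pear" (2), a left-side key of "mango" would start the scan at split 1, while "kiwi" would start at split 0 because matching records may sit at the end of the preceding split.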
Attachments
Issue Links
- links to