KUDU-2814

Improve the kudu-mapreduce handling of KuduTableOutputFormat

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.9.0
    • Fix Version/s: NA
    • Component/s: java
    • Labels: None

    Description

      https://github.com/apache/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableOutputFormat.java#L134-L135

       

      • The MULTITON registry allows only one KuduTableOutputFormat instance per thread, so a second unit test running on the same thread fails. This would likely also cause problems when MapReduce's JVM reuse feature is enabled.
      • The MULTITON exists only so that a static `KuduTableOutputFormat.getKuduTable` can back the `getTableFromContext` method in KuduTableMapReduceUtil.
      • `getTableFromContext` is used by two sample mapper implementations in kudu-client-tools, ImportCsvMapper and ImportParquetMapper, which build their inserts directly on the table:
        • Insert insert = this.table.newInsert();
          PartialRow row = insert.getRow();
      • The whole point of an OutputFormat is that mapper/reducer code does not have to worry about how its data gets written.
      • You write to the OutputFormat; you do not fetch its internals and do the writing yourself.
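
The per-thread limitation in the first bullet can be illustrated with a simplified model. This is only a sketch, not the Kudu source: `ThreadKeyedFormat`, `register`, `lookup` and `MultitonSketch` are hypothetical names, and the assumption is that the registry is a static map keyed by the current thread ID that refuses a second registration under the same key.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an output format that registers itself in a
// static registry keyed by the current thread ID. Not the Kudu source.
class ThreadKeyedFormat {
    static final ConcurrentHashMap<String, ThreadKeyedFormat> MULTITON =
            new ConcurrentHashMap<>();

    final String tableName;

    ThreadKeyedFormat(String tableName) {
        this.tableName = tableName;
    }

    // Models the configuration step: register under the thread ID and
    // refuse a second registration for the same key.
    String register() {
        String key = String.valueOf(Thread.currentThread().getId());
        if (MULTITON.putIfAbsent(key, this) != null) {
            throw new IllegalStateException(
                    "thread " + key + " already has an instance");
        }
        return key;
    }

    // Static lookup by key, similar in spirit to a static table getter.
    static ThreadKeyedFormat lookup(String key) {
        return MULTITON.get(key);
    }
}

class MultitonSketch {
    // Returns true when a second registration on the calling thread is rejected.
    static boolean secondRegistrationFails() {
        ThreadKeyedFormat.MULTITON.clear();
        new ThreadKeyedFormat("table_a").register();
        try {
            // Same thread, so same key: this models the second unit test.
            new ThreadKeyedFormat("table_b").register();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second registration rejected: "
                + secondRegistrationFails());
    }
}
```

Under that assumption, any two instances configured on one thread collide, which is the situation a test runner or a reused task JVM would create.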

       

      To do: remove those static methods and work out how ImportParquetMapper and ImportCsvMapper can properly use KuduTableOutputFormat without relying on its internals.

      Remove the MULTITON if it is no longer needed.
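
One possible direction for the to-do above, sketched without any Kudu or Hadoop dependencies: the mapper emits a neutral row and only the output side knows about the table. Every name here (`RowWriter`, `SketchOutputFormat`, `CsvLineMapper`, `OutputFormatSketch`, `example_table`) is hypothetical and chosen for illustration; this is not a proposal for the actual Kudu API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical writer interface, standing in for the role a record writer plays.
interface RowWriter {
    void write(Map<String, Object> row);
}

// Hypothetical output format: it alone knows the table and builds the insert.
class SketchOutputFormat {
    // Stands in for a client session; records what would be applied.
    private final List<String> appliedInserts = new ArrayList<>();

    RowWriter newWriter(String tableName) {
        return row -> appliedInserts.add("INSERT INTO " + tableName + " " + row);
    }

    List<String> applied() {
        return appliedInserts;
    }
}

// Hypothetical CSV mapper: parses a line into a column-to-value row and
// hands it to the writer. It never sees a table handle.
class CsvLineMapper {
    private final String[] columns;

    CsvLineMapper(String... columns) {
        this.columns = columns;
    }

    void map(String line, RowWriter out) {
        String[] fields = line.split(",", -1);
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.length && i < fields.length; i++) {
            row.put(columns[i], fields[i]);
        }
        out.write(row);
    }
}

class OutputFormatSketch {
    public static void main(String[] args) {
        SketchOutputFormat format = new SketchOutputFormat();
        new CsvLineMapper("id", "name")
                .map("1,kudu", format.newWriter("example_table"));
        System.out.println(format.applied());
    }
}
```

With this split no static lookup is needed on the mapper side, so a thread-keyed registry would have nothing left to serve.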

People

    • Assignee: Unassigned
    • Reporter: Clemens Valiente (cvaliente)