HCatalog / HCATALOG-294

HCatRecordReader does not take advantage of the laziness of LazyHCatTuple

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.4
    • Fix Version/s: 0.4.1
    • Component/s: mapreduce
    • Labels: None

    Description

      Currently, HCatRecordReader copies the underlying LazyHCatTuple into a DefaultHCatTuple. When the underlying record comes from an RCFile, this is very inefficient, because the copy forces a read of all the columns. In this case we need a way to copy only the underlying Writable returned by RCFileInputFormat and then annotate it with the necessary partition columns. The early projection provided by the caller should be ignored in these cases, though some remapping of the schema may be necessary so that column positions appear correct to the caller.
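
      The approach described above could be sketched roughly as follows. This is an illustration only, not HCatalog code: the class PartitionAnnotatedRecord, its positionMap encoding, and the use of an IntFunction as a stand-in for the lazy record are all assumptions made for the example. It shows a view that leaves the data columns unread until requested, appends partition values, and remaps positions so the caller sees the expected schema order.

```java
import java.util.function.IntFunction;

/**
 * Hypothetical sketch, not HCatalog code: a read-only record view that
 * defers column access to an underlying lazy record and appends
 * partition values without copying the data columns.
 */
final class PartitionAnnotatedRecord {
    // Stand-in for the lazy record: fetches one column on demand.
    private final IntFunction<Object> lazyColumn;
    // Partition values, which come from metadata rather than the file.
    private final Object[] partitionValues;
    // positionMap[i] >= 0: index of a data column in the underlying record.
    // positionMap[i] <  0: partition value at index (-1 - positionMap[i]).
    private final int[] positionMap;

    PartitionAnnotatedRecord(IntFunction<Object> lazyColumn,
                             Object[] partitionValues,
                             int[] positionMap) {
        this.lazyColumn = lazyColumn;
        this.partitionValues = partitionValues.clone();
        this.positionMap = positionMap.clone();
    }

    /** Number of columns visible to the caller. */
    int size() {
        return positionMap.length;
    }

    /** Returns the value at the caller-visible position, reading lazily. */
    Object get(int position) {
        int source = positionMap[position];
        return source >= 0
                ? lazyColumn.apply(source)        // touches only this column
                : partitionValues[-1 - source];   // no file read needed
    }
}
```

      With a map such as {2, -1, 0}, position 0 resolves to underlying column 2, position 1 to the first partition value, and position 2 to underlying column 0. Columns that are never requested are never fetched from the underlying record.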

            People

              Assignee: Unassigned
              Reporter: Alan Gates (gates)
              Votes: 0
              Watchers: 3