IMPALA-3571

Thrift serialization can be very inefficient

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: None
    • Component/s: Distributed Exec

    Description

      I saw a table whose deserialized text form takes 3 MB of disk space, but whose serialized in-memory form is more than 180 MB. I cannot attach the table here (it comes from a customer); you can find it in CDH-37506.

      The statestore stores only the serialized form and then sends it to Impala. After deserialization, the metadata can become smaller than expected, so the frontend is able to store large metadata once it has been deserialized. The problem usually happens while serializing in the backend or deserializing in the frontend. Finer control of the split update size (see IMPALA-3499) can indirectly control how much memory the frontend uses to deserialize metadata.
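
      The idea of bounding memory through the split update size can be sketched roughly as follows. This is a minimal illustration, not Impala's implementation: the function names (`serialize`, `split_update`, `apply_update`) and the `max_chunk_bytes` parameter are hypothetical, and JSON stands in for Thrift only so that the sketch is self-contained.

```python
import json
from typing import Iterable, Iterator, List


def serialize(item: dict) -> bytes:
    # Stand-in for Thrift serialization (assumption: JSON keeps the sketch runnable).
    return json.dumps(item, sort_keys=True).encode("utf-8")


def split_update(items: Iterable[dict], max_chunk_bytes: int) -> Iterator[List[bytes]]:
    """Group serialized items into chunks whose payload stays within max_chunk_bytes.

    An item that is larger than the limit on its own is emitted alone in its own chunk.
    """
    chunk: List[bytes] = []
    chunk_bytes = 0
    for item in items:
        payload = serialize(item)
        # Close the current chunk before it would exceed the limit.
        if chunk and chunk_bytes + len(payload) > max_chunk_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(payload)
        chunk_bytes += len(payload)
    if chunk:
        yield chunk


def apply_update(chunks: Iterable[List[bytes]]) -> int:
    """Deserialize one chunk at a time and return the largest chunk payload seen."""
    peak = 0
    for chunk in chunks:
        peak = max(peak, sum(len(p) for p in chunk))
        for payload in chunk:
            json.loads(payload)  # a real receiver would merge this into its cache
    return peak
```

      With a limit in place, the receiver only ever holds one chunk of serialized bytes at a time, so the peak serialized payload is bounded by the chunk limit rather than by the size of the whole update.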

      We need to study this more. It may be related to IMPALA-3254.

      Does anyone know why? Did I miss anything?

            People

              Assignee: Unassigned
              Reporter: Huaisi Xu
              Votes: 0
              Watchers: 10

              Dates

                Created:
                Updated:
                Resolved: