IMPALA / IMPALA-2787

Support de-duplicating records in Impala

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: None
    • Component/s: Backend
    • Labels: None

      Description

      Two use cases:

      Use Case 1: Remove duplicate rows where all data in the row is identical
      Use Case 2: Remove duplicate rows where all data in the row is identical except for a small number of columns

      Rather than using SELECT DISTINCT to copy rows from one table into another, it would be great if Impala could support de-duplication natively and remove duplicate records from the table itself, without creating a new table.
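
      For illustration, the copy-to-a-new-table workaround for both use cases might look like the sketch below. It is not part of the original report. It assumes a hypothetical table `events` with columns `id`, `name`, and `load_ts`, and it runs the SQL through Python's built-in SQLite so it can be executed anywhere; SQLite only stands in for Impala here, and Impala's exact syntax or behavior may differ.

```python
import sqlite3

# SQLite stands in for Impala so the sketch is runnable anywhere.
# The table and column names (events, id, name, load_ts) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT, load_ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "a", "2015-07-01"),
        (1, "a", "2015-07-01"),  # exact duplicate (use case 1)
        (2, "b", "2015-07-01"),
        (2, "b", "2015-07-02"),  # differs only in load_ts (use case 2)
    ],
)

# Use case 1: copy only the distinct rows into a second table.
conn.execute("CREATE TABLE events_distinct AS SELECT DISTINCT * FROM events")

# Use case 2: rows count as duplicates when id and name match;
# keep one row per group, carrying the latest load_ts.
conn.execute(
    "CREATE TABLE events_latest AS "
    "SELECT id, name, MAX(load_ts) AS load_ts FROM events GROUP BY id, name"
)

distinct_rows = conn.execute(
    "SELECT * FROM events_distinct ORDER BY id, load_ts"
).fetchall()
latest_rows = conn.execute(
    "SELECT * FROM events_latest ORDER BY id"
).fetchall()
```

      Both statements write their result into a second table, which is the extra step this request asks Impala to make unnecessary.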

              People

              • Assignee: Unassigned
              • Reporter: Eric Lin (ericlin)
              • Votes: 0
              • Watchers: 3
