  1. Spark
  2. SPARK-36094

Group SQL component error messages in Spark error class JSON file

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: Spark Core, SQL
    • Labels:
      None

      Description

      To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the mailing list and introduced in SPARK-34920).
      In this file, each error message should be labeled with a consistent error class and, where one applies, a SQLSTATE.
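
      As a rough illustration of the idea, the sketch below shows what entries in such a file might look like and how a message could be looked up and filled in. The entry names, the `message` and `sqlState` field names, the `%s` placeholders, and the helper functions are assumptions made for this example; they are not copied from Spark's actual file or API.

      ```python
      import json

      # Hypothetical excerpt of an error class file. Entry names, field names,
      # and SQLSTATE values are illustrative assumptions, not Spark's real data.
      ERROR_CLASSES_JSON = """
      {
        "DIVIDE_BY_ZERO": {
          "message": ["divide by zero"],
          "sqlState": "22012"
        },
        "MISSING_COLUMN": {
          "message": ["cannot resolve '%s' given input columns: [%s]"],
          "sqlState": "42000"
        }
      }
      """

      ERROR_CLASSES = json.loads(ERROR_CLASSES_JSON)


      def get_message(error_class, params=()):
          """Look up an error class and fill its message template with params."""
          entry = ERROR_CLASSES[error_class]
          # A message is stored as a list of lines; join them before formatting.
          template = "\n".join(entry["message"])
          return template % tuple(params)


      def get_sql_state(error_class):
          """Return the SQLSTATE for an error class, or None if it has none."""
          return ERROR_CLASSES[error_class].get("sqlState")
      ```

      With the message text kept in one place, a call site only names the error class and supplies parameters, for example `get_message("MISSING_COLUMN", ["a", "b, c"])`.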

      We will start with the SQL component first.
      As a starting point, we can build off the exception grouping done in SPARK-33539. In total, there are ~1000 error messages to group, spread across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). For this ticket, each of these files is split into chunks of ~20 errors, and each chunk is refactored separately.

      Here is an example PR that groups a few error messages in the QueryCompilationErrors class: PR 33309.

      Guidelines:

      • Error classes should be unique and sorted in alphabetical order.
      • Error classes should be unified as much as possible to improve auditing. If error messages are similar, group them into a single error class and add parameters to the error message.
      • SQLSTATE should match the ANSI/ISO standard, without introducing new classes or subclasses. See the error guidelines; if none of them match, the SQLSTATE field should be empty.
      • The Throwable should extend SparkThrowable; see SparkArithmeticException as an example of how to mix SparkThrowable into a base Exception type.
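
      The guidelines above could be checked mechanically. The following is a minimal sketch, not Spark's own tooling: it assumes the JSON layout of a top-level object keyed by error class with an optional `sqlState` field, treats an "empty" SQLSTATE as an omitted field, and only checks the five-character shape of a SQLSTATE rather than its membership in the standard. `HasErrorClass` and `ClassifiedArithmeticError` are hypothetical Python stand-ins for SparkThrowable and SparkArithmeticException, shown only to illustrate the mix-in pattern.

      ```python
      import json
      import re

      # A SQLSTATE is five characters drawn from digits and upper-case letters.
      SQLSTATE_PATTERN = re.compile(r"^[0-9A-Z]{5}$")


      def _reject_duplicates(pairs):
          """object_pairs_hook that fails on repeated keys instead of overwriting."""
          result = {}
          for key, value in pairs:
              if key in result:
                  raise ValueError(f"duplicate key: {key}")
              result[key] = value
          return result


      def validate_error_classes(text):
          """Return a list of guideline violations found in the JSON text."""
          try:
              classes = json.loads(text, object_pairs_hook=_reject_duplicates)
          except ValueError as e:  # also covers malformed JSON
              return [str(e)]
          problems = []
          names = list(classes)  # dicts keep the order used in the file
          if names != sorted(names):
              problems.append("error classes are not in alphabetical order")
          for name, entry in classes.items():
              sql_state = entry.get("sqlState")
              if sql_state is None:
                  continue  # assumption: no matching SQLSTATE means the field is omitted
              if not SQLSTATE_PATTERN.match(sql_state):
                  problems.append(f"{name}: malformed SQLSTATE {sql_state!r}")
          return problems


      class HasErrorClass:
          """Hypothetical stand-in for the SparkThrowable interface."""
          error_class = None
          sql_state = None


      class ClassifiedArithmeticError(ArithmeticError, HasErrorClass):
          """A base exception type with the error-class interface mixed in."""

          def __init__(self, error_class, sql_state, message):
              super().__init__(message)
              self.error_class = error_class
              self.sql_state = sql_state
      ```

      Because the exception still derives from the base type, existing handlers that catch `ArithmeticError` keep working, while newer code can read `error_class` and `sql_state` from the same object.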

      We will improve error message quality as a follow-up.

        Sub-Tasks

        1. Refactor a few query compilation errors to use error classes (Sub-task, Resolved, Karen Feng)
        2. add error class for StructType.findNestedField (Sub-task, Resolved, Wenchen Fan)
        3. Define the new exception that mix SparkThrowable for all base exe in QueryExecutionErrors (Sub-task, Resolved, PengLei)
        4. Refactor first set of 20 query execution errors to use error classes (Sub-task, Resolved, PengLei)
        5. Refactor second set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        6. Refactor third set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        7. Refactor fourth set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        8. Refactor fifth set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        9. Refactor sixth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        10. Refactor seventh set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        11. Refactor eighth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        12. Refactor ninth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        13. Refactor tenth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        14. Refactor eleventh set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        15. Refactor twelfth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        16. Refactor thirteenth set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        17. Refactor fourteenth set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        18. Refactor fifteenth set of 20 query execution errors to use error classes (Sub-task, In Progress, Unassigned)
        19. Refactor sixteenth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        20. Refactor seventeenth set of 20 query execution errors to use error classes (Sub-task, Open, Unassigned)
        21. Refactor first set of 20 query parsing errors to use error classes (Sub-task, In Progress, Unassigned)
        22. Refactor second set of 20 query parsing errors to use error classes (Sub-task, Open, Unassigned)
        23. Refactor third set of 20 query parsing errors to use error classes (Sub-task, Open, Unassigned)
        24. Refactor fourth set of 20 query parsing errors to use error classes (Sub-task, Open, Unassigned)
        25. Refactor first set of 20 query compilation errors to use error classes (Sub-task, Open, Unassigned)
        26. Rename error classes with _ERROR suffix (Sub-task, Resolved, dgd_contributor)

            People

            • Assignee: Unassigned
            • Reporter: Karen Feng (karenfeng)
            • Votes: 0
            • Watchers: 6
