Spark / SPARK-22814

JDBC support date/timestamp type as partitionColumn

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.2, 2.2.1
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels:
      None
    • Flags:
      Patch

      Description

      In Spark, you can partition a JDBC (e.g. MySQL) read by a partitionColumn:

        val df = spark.read.jdbc(
          url = jdbcUrl,
          table = "employees",
          columnName = "emp_no",   // the partition column
          lowerBound = 1L,
          upperBound = 100000L,
          numPartitions = 100,
          connectionProperties = connectionProperties)
        display(df)

      However, partitionColumn must currently be a numeric column of the table,
      while many tables have no numeric primary key but do have date/timestamp
      indexes. It would be useful to also accept date/timestamp columns as the
      partitionColumn.
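      The numeric restriction exists because Spark splits the range
      [lowerBound, upperBound] into numPartitions strides and turns each stride
      into a per-partition WHERE clause. The sketch below is illustrative only
      (it is not Spark's actual JDBCRelation code, and the object and method
      names are made up for this example); it shows how the same stride logic
      could extend to a DATE column by mapping the bounds to epoch days.

      ```scala
      import java.time.LocalDate

      // Illustrative sketch: derive Spark-style JDBC partition predicates
      // for a DATE partition column. Bounds are mapped to epoch days, the
      // range is split into numPartitions strides, and each stride becomes
      // one WHERE predicate (i.e. one JDBC partition).
      object DatePartitionSketch {
        def predicates(column: String,
                       lower: LocalDate,
                       upper: LocalDate,
                       numPartitions: Int): Seq[String] = {
          // A single partition means "read the whole table": no real filter.
          if (numPartitions <= 1) return Seq("1=1")
          val lo = lower.toEpochDay
          val hi = upper.toEpochDay
          val stride = math.max((hi - lo) / numPartitions, 1L)
          (0 until numPartitions).map { i =>
            val start = LocalDate.ofEpochDay(lo + i * stride)
            val end   = LocalDate.ofEpochDay(lo + (i + 1) * stride)
            if (i == 0)
              // First partition also picks up NULLs, so no row is dropped.
              s"$column < '$end' OR $column IS NULL"
            else if (i == numPartitions - 1)
              // Last partition is open-ended so rows above upperBound survive.
              s"$column >= '$start'"
            else
              s"$column >= '$start' AND $column < '$end'"
          }
        }
      }
      ```

      For example, splitting a "hired" column over 2020-01-01..2020-01-05 into
      4 partitions yields four non-overlapping predicates that together cover
      every row, including NULL dates.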

    People

    • Assignee: Takeshi Yamamuro (maropu)
    • Reporter: Yuechen Chen (charliechen)
    • Votes: 0
    • Watchers: 7


    Time Tracking

    • Original Estimate: 168h
    • Remaining Estimate: 168h
    • Time Spent: Not Specified