  1. Spark
  2. SPARK-33445

Can't parse decimal type from csv file


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.1.3, 2.2.3, 2.3.4, 2.4.7
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      The attached file is a one-column CSV file containing decimals.

      Execute: mydf2 = spark_session.read.csv("tsd.csv", header=True, inferSchema=True)

      Then invoking mydf2.schema results in the error:

      ValueError: Could not parse datatype: decimal(6,-7)
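
      The attachment is not reproduced here, so the following is only a sketch of one way a negative scale such as decimal(6,-7) can arise and why a type string like that may fail to parse. It assumes the file holds values in scientific notation (the sample value below is made up), and it assumes the type-string parser accepts only unsigned digits for the scale. The helper precision_and_scale and both regular expressions are hypothetical illustrations, not PySpark's actual code.

```python
import re
from decimal import Decimal


def precision_and_scale(text):
    """Return (precision, scale) for a numeric string.

    Precision is the number of significant digits; scale is the negated
    exponent, so a value such as 123456 x 10^7 has scale -7.
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    return len(digits), -exponent


# Hypothetical, simplified pattern: the scale must be unsigned digits.
UNSIGNED_SCALE = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")

# Hypothetical variant that also accepts a negative scale.
SIGNED_SCALE = re.compile(r"decimal\(\s*(\d+)\s*,\s*(-?\d+)\s*\)")

# Made-up sample value: six significant digits, exponent 7.
precision, scale = precision_and_scale("1.23456E+12")
type_string = f"decimal({precision},{scale})"

print(type_string)                                   # decimal(6,-7)
print(UNSIGNED_SCALE.fullmatch(type_string))         # None
print(SIGNED_SCALE.fullmatch(type_string).groups())  # ('6', '-7')
```

      Under these assumptions, reading the column with an explicit schema instead of inferSchema=True would avoid inferring a negative-scale decimal in the first place.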


          People

            Assignee: Unassigned
            Reporter: Punit Shah (bullsoverbears)
            Votes: 0
            Watchers: 3
