Spark / SPARK-20156

Java String toLowerCase "Turkish locale bug" causes Spark problems


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: Spark Shell
    • Labels: None
    • Environment: Ubuntu 16.04
      Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121)

    Description

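      If the operating system's regional setting is Turkish, the well-known Java locale problem occurs (https://jira.atlassian.com/browse/CONF-5931 or https://issues.apache.org/jira/browse/AVRO-1493): String.toLowerCase() and String.toUpperCase() use the JVM default locale, and Turkish case rules map "I" to dotless "ı" and "i" to dotted "İ".
      For example:

      "SERDEINFO" lowercases to "serdeınfo"
      "uniquetable" uppercases to "UNİQUETABLE"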

      Workaround:
      In spark-shell.sh, add -Duser.country=US -Duser.language=en to the end of the line

      SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"
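The conversions reported above can be reproduced without changing the operating system's settings by passing a Turkish locale explicitly. The sketch below is illustrative only: the class name TurkishLocaleDemo and its helper methods are hypothetical and are not taken from Spark. It also shows a common locale-independent alternative, Locale.ROOT, which is an assumption about how such code is typically hardened rather than a statement of what the Spark fix does.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Spark.
final class TurkishLocaleDemo {

    // Lowercase using an explicit locale instead of the JVM default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Uppercase using an explicit locale instead of the JVM default.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Turkish rules: 'I' -> dotless 'ı' (U+0131), 'i' -> dotted 'İ' (U+0130).
        String brokenLower = lower("SERDEINFO", turkish);   // "serdeınfo"
        String brokenUpper = upper("uniquetable", turkish); // "UNİQUETABLE"

        // Locale.ROOT applies locale-neutral case mapping.
        String safeLower = lower("SERDEINFO", Locale.ROOT);   // "serdeinfo"
        String safeUpper = upper("uniquetable", Locale.ROOT); // "UNIQUETABLE"

        System.out.println(brokenLower.equals("serdeinfo")); // false
        System.out.println(safeLower.equals("serdeinfo"));   // true
        System.out.println(brokenUpper.equals("UNIQUETABLE")); // false
        System.out.println(safeUpper.equals("UNIQUETABLE"));   // true
    }
}
```

Code that compares identifiers or keywords case-insensitively is affected whenever it calls toLowerCase() or toUpperCase() without a locale argument, because those overloads fall back to Locale.getDefault().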

          People

            Assignee: srowen Sean R. Owen
            Reporter: serkan_tas Serkan Taş