Spark / SPARK-32815

Fix LibSVM data source loading error on file paths with glob metacharacters

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.6, 3.0.1, 3.1.0
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: MLlib
    • Labels: None

    Description

      SPARK-32810 fixed a long-standing bug in which a few of Spark's built-in data sources failed to read files whose names contain glob metacharacters such as [, ], {, and }.
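
      The failure mode described above can be reproduced outside Spark. The following is a minimal sketch using only Python's standard glob module, not Spark's or Hadoop's actual code: a literal file path is handed to a glob matcher, once as-is and once after escaping. The helper names (write_sample, naive_lookup, escaped_lookup) and the file name are hypothetical. Python's glob syntax also differs from Hadoop's (it has no {a,b} alternation), so only the [ and ] case is shown.

```python
import glob
import os
import tempfile


def write_sample(dir_path, file_name):
    """Create a one-line LibSVM-formatted file and return its full path."""
    path = os.path.join(dir_path, file_name)
    with open(path, "w") as f:
        f.write("1 1:0.5 3:2.0\n")
    return path


def naive_lookup(path):
    # Buggy pattern: the literal path is interpreted as a glob pattern,
    # so "[abc]" means "one of the characters a, b, c".
    return glob.glob(path)


def escaped_lookup(path):
    # Fixed pattern: escape metacharacters first so the path matches itself.
    return glob.glob(glob.escape(path))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name containing glob metacharacters.
    target = write_sample(tmp, "data[abc].libsvm")
    naive = naive_lookup(target)
    escaped = escaped_lookup(target)

print("naive lookup:  ", naive)
print("escaped lookup:", escaped)
```

      In this sketch the naive lookup finds nothing even though the file exists, while the escaped lookup returns the file's path. A LibSVM regression test along the lines this ticket describes would exercise the same situation through the data source itself.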

      The CSV and JSON data sources on the Spark side were affected. The LibSVM data source had the same code pattern that leads to the bug, so the fix in https://github.com/apache/spark/pull/29659 covered that data source as well, but it did not include a test for it.

      This ticket tracks adding a test case for LibSVM, similar to the ones for CSV and JSON, to verify that the fix works as intended.

          People

            Assignee: Max Gekk (maxgekk)
            Reporter: Max Gekk (maxgekk)