SPARK-1622: Expose input split(s) accessed by a task in UI or logs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Web UI
    • Labels: None

    Description

      Right now it's hard to debug which input files or blocks therein have invalid data. The InputSplit for a HadoopRDD is not even exposed programmatically in Scala/Java (it's private[spark]).
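
      To illustrate the kind of debugging this request is after, here is a minimal, self-contained sketch in plain Python. It does not use Spark or Hadoop APIs. `SplitInfo` and `parse_with_split_context` are hypothetical names invented for this example, and the `path:start+length` label is just one possible way to print a split. The idea is only that, if a task knew which split it was reading, a bad record could be reported together with its file and byte range.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class SplitInfo:
    """Hypothetical stand-in for an input split: a byte range of one file."""
    path: str
    start: int
    length: int


def parse_with_split_context(
    split: SplitInfo,
    records: Iterable[str],
    parse: Callable[[str], int],
) -> Tuple[List[int], List[str]]:
    """Parse all records of one split and keep failures with split context.

    Returns the successfully parsed values and one error message per
    record that raised ValueError, each prefixed with the split it came from.
    """
    parsed: List[int] = []
    errors: List[str] = []
    for index, record in enumerate(records):
        try:
            parsed.append(parse(record))
        except ValueError as exc:
            # Attach the split (file and byte range) to the failure report.
            errors.append(
                f"{split.path}:{split.start}+{split.length} record {index}: {exc}"
            )
    return parsed, errors
```

      With something like this, a task that hits invalid data can log the file and offset range it was reading instead of only the exception, which is the information the issue asks to surface in the UI or logs.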

          People

            Assignee: Unassigned
            Reporter: Matei Alexandru Zaharia (matei)
            Votes: 1
            Watchers: 5
