Spark / SPARK-19751

Create Data frame API fails with a self referencing bean


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      The createDataset API fails with a stack overflow when we try to create a
      Dataset using a bean encoder and the bean is self-referencing, that is, it
      has a property whose type contains the bean class itself.

      BEAN:
      import java.io.Serializable;
      import java.util.List;

      public class HierObj implements Serializable {
          String name;
          // Self reference: the element type of this list is HierObj itself.
          List<HierObj> children;

          public String getName() { return name; }

          public void setName(String name) { this.name = name; }

          public List<HierObj> getChildren() { return children; }

          public void setChildren(List<HierObj> children) { this.children = children; }
      }

      // create an object
      HierObj hierObj = new HierObj();
      hierObj.setName("parent");
      List<HierObj> children = new ArrayList<>();

      HierObj child1 = new HierObj();
      child1.setName("child1");
      HierObj child2 = new HierObj();
      child2.setName("child2");
      children.add(child1);
      children.add(child2);

      hierObj.setChildren(children);

      // create a dataset; this is the call that fails with the stack overflow
      Dataset<HierObj> ds = sparkSession().createDataset(Arrays.asList(hierObj), Encoders.bean(HierObj.class));
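
      A likely explanation for the overflow is that a bean encoder has to derive a
      schema from the bean class by walking its properties, and HierObj contains
      List<HierObj>, so an unguarded walk never terminates. The sketch below is not
      Spark's code. It is a minimal, standalone illustration that uses only the JDK,
      and the names BeanSchemaSketch, inferSchema, and Leaf are hypothetical. It
      assumes a toy type system (String, List<T>, and nested beans) and shows one
      way to turn the unbounded recursion into a clear error: track the bean
      classes on the current path and fail fast when one reappears.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BeanSchemaSketch {

    // Same shape as the bean in the report: a list property whose element type is the bean itself.
    public static class HierObj implements Serializable {
        private String name;
        private List<HierObj> children;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public List<HierObj> getChildren() { return children; }
        public void setChildren(List<HierObj> children) { this.children = children; }
    }

    // Hypothetical non-recursive bean, used as the control case.
    public static class Leaf implements Serializable {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Hypothetical helper: renders a schema string for String, List<T>, and bean types.
    // `seen` holds the bean classes on the current recursion path. Without the
    // `seen.add` check, HierObj -> children -> HierObj -> ... would recurse until
    // the JVM raises StackOverflowError.
    public static String inferSchema(Type type, Set<Class<?>> seen) {
        if (type instanceof ParameterizedType) {
            ParameterizedType pt = (ParameterizedType) type;
            Type raw = pt.getRawType();
            if (raw instanceof Class && List.class.isAssignableFrom((Class<?>) raw)) {
                return "array<" + inferSchema(pt.getActualTypeArguments()[0], seen) + ">";
            }
        }
        if (type == String.class) {
            return "string";
        }
        if (!(type instanceof Class)) {
            throw new UnsupportedOperationException("Unsupported type: " + type);
        }
        Class<?> cls = (Class<?>) type;
        if (!seen.add(cls)) {
            // The class is already being expanded further up the call stack: a cycle.
            throw new UnsupportedOperationException(
                "Circular reference in bean class " + cls.getName());
        }
        try {
            StringBuilder sb = new StringBuilder("struct<");
            String sep = "";
            // Object.class as the stop class keeps getClass() out of the property list.
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(cls, Object.class).getPropertyDescriptors()) {
                if (pd.getReadMethod() == null) {
                    continue;
                }
                sb.append(sep).append(pd.getName()).append(':')
                  .append(inferSchema(pd.getReadMethod().getGenericReturnType(), seen));
                sep = ",";
            }
            return sb.append('>').toString();
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Leaving this class: sibling properties may legitimately reuse it.
            seen.remove(cls);
        }
    }

    public static void main(String[] args) {
        System.out.println(inferSchema(Leaf.class, new HashSet<>())); // prints struct<name:string>
        try {
            inferSchema(HierObj.class, new HashSet<>());
        } catch (UnsupportedOperationException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

      Removing the class from `seen` in the `finally` block means only true cycles
      are rejected; a bean type that appears in two sibling properties is still
      accepted.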


          People

            Assignee: maropu (Takeshi Yamamuro)
            Reporter: avinashmeda (Avinash Venkateshaiah)
            Votes: 0
            Watchers: 4
