SPARK-7804

Incorrect results from JDBCRDD -- one record repeated

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.3.0, 1.3.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      The RDD returns a single record repeated once per table row, with the same field values each time.

      I have a table like:

      attuid  name  email
      12      john  john@appp.com
      23      tom   tom@appp.com
      34      tony  tony@appp.com
      

      My code:

      JavaSparkContext sc = new JavaSparkContext(sparkConf);

      String url = "....";
      Properties prop = new Properties();

      // A single partition whose WHERE clause matches every row.
      List<JDBCPartition> partitionList = new ArrayList<>();
      partitionList.add(new JDBCPartition("1=1", 0));

      // All three columns are read as nullable strings.
      List<StructField> fields = new ArrayList<>();
      fields.add(DataTypes.createStructField("attuid", DataTypes.StringType, true));
      fields.add(DataTypes.createStructField("name", DataTypes.StringType, true));
      fields.add(DataTypes.createStructField("email", DataTypes.StringType, true));
      StructType schema = DataTypes.createStructType(fields);

      JDBCRDD jdbcRDD = new JDBCRDD(sc.sc(),
              JDBCRDD.getConnector("oracle.jdbc.OracleDriver", url, prop),
              schema,
              " USERS",
              new String[]{"attuid", "name", "email"},
              new Filter[]{},
              partitionList.toArray(new JDBCPartition[0]));

      System.out.println("count before to Java RDD=" + jdbcRDD.cache().count());
      JavaRDD<Row> jrdd = jdbcRDD.toJavaRDD();
      System.out.println("count=" + jrdd.count());

      // Print every field of every collected row.
      List<Row> lr = jrdd.collect();
      for (Row r : lr) {
          for (int ii = 0; ii < r.length(); ii++) {
              System.out.println(r.getString(ii));
          }
      }
      

      Result:

      34
      tony
      tony@appp.com
      34
      tony
      tony@appp.com
      34
      tony
      tony@appp.com
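
      The report does not say why this was resolved as Invalid. One plausible explanation, offered here as an assumption and not as a confirmed diagnosis, is that the row iterator reuses a single mutable row object, so caching or collecting the rows without copying them keeps many references to the same object, which ends up holding the last record. The sketch below reproduces that pattern in plain Java without Spark. `RowReuseDemo`, `reusingIterator`, and `collect` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RowReuseDemo {

    // Sample data taken from the table in the report.
    static final String[][] USERS = {
        {"12", "john", "john@appp.com"},
        {"23", "tom", "tom@appp.com"},
        {"34", "tony", "tony@appp.com"},
    };

    // Hypothetical stand-in for a row source that fills ONE shared buffer
    // on every call to next() and returns that same buffer each time.
    static Iterator<String[]> reusingIterator(String[][] table) {
        final int width = table.length == 0 ? 0 : table[0].length;
        final String[] buffer = new String[width];
        final Iterator<String[]> source = Arrays.asList(table).iterator();
        return new Iterator<String[]>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public String[] next() {
                String[] record = source.next();
                System.arraycopy(record, 0, buffer, 0, width);
                return buffer; // the same object on every call
            }
        };
    }

    // Gathers rows into a list, optionally taking a defensive copy of each.
    static List<String[]> collect(Iterator<String[]> rows, boolean copy) {
        List<String[]> out = new ArrayList<>();
        while (rows.hasNext()) {
            String[] row = rows.next();
            out.add(copy ? row.clone() : row);
        }
        return out;
    }

    public static void main(String[] args) {
        // Without copying: the last record is printed three times.
        for (String[] r : collect(reusingIterator(USERS), false)) {
            System.out.println(String.join(" ", r));
        }
        // With copying: each record is printed once.
        for (String[] r : collect(reusingIterator(USERS), true)) {
            System.out.println(String.join(" ", r));
        }
    }
}
```

      If the same cause applies here, copying each row (for example with `Row.copy()`) before calling `cache()` or `collect()`, or reading the table through the public JDBC data source instead of constructing `JDBCRDD` directly, would likely avoid the repetition.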
      

    People

      Assignee: Unassigned
      Reporter: Paul Wu (zwu.net@gmail.com)