Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: 1.3.0, 1.3.1
None
None
Description
The RDD returns a single record repeated for every row, so each field value is repeated as well.
I have a table like:
attuid  name  email
12      john  john@appp.com
23      tom   tom@appp.com
34      tony  tony@appp.com
My code:
JavaSparkContext sc = new JavaSparkContext(sparkConf);
String url = "....";
java.util.Properties prop = new Properties();
List<JDBCPartition> partitionList = new ArrayList<>();
//int i;
partitionList.add(new JDBCPartition("1=1", 0));

List<StructField> fields = new ArrayList<StructField>();
fields.add(DataTypes.createStructField("attuid", DataTypes.StringType, true));
fields.add(DataTypes.createStructField("name", DataTypes.StringType, true));
fields.add(DataTypes.createStructField("email", DataTypes.StringType, true));
StructType schema = DataTypes.createStructType(fields);

JDBCRDD jdbcRDD = new JDBCRDD(sc.sc(),
        JDBCRDD.getConnector("oracle.jdbc.OracleDriver", url, prop),
        schema,
        " USERS",
        new String[]{"attuid", "name", "email"},
        new Filter[]{ },
        partitionList.toArray(new JDBCPartition[0])
);

System.out.println("count before to Java RDD=" + jdbcRDD.cache().count());
JavaRDD<Row> jrdd = jdbcRDD.toJavaRDD();
System.out.println("count=" + jrdd.count());
List<Row> lr = jrdd.collect();
for (Row r : lr) {
    for (int ii = 0; ii < r.length(); ii++) {
        System.out.println(r.getString(ii));
    }
}
===========================
The result is:
34 tony tony@appp.com
34 tony tony@appp.com
34 tony tony@appp.com
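
The report does not state a cause. One plausible explanation, offered as an assumption and not a confirmed diagnosis, is that the internal JDBCRDD iterator fills a single mutable row object for every record, so cache() and collect() keep several references to that one object and every entry shows the last row read. The sketch below reproduces that pattern in plain Java without Spark; ReusedRowDemo, reusingIterator, and collect are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReusedRowDemo {
    // Sample data taken from the table in the report.
    static final String[][] USERS = {
        {"12", "john", "john@appp.com"},
        {"23", "tom", "tom@appp.com"},
        {"34", "tony", "tony@appp.com"},
    };

    // Hypothetical iterator that overwrites one shared buffer per record,
    // standing in for a source that recycles its row object.
    static Iterator<String[]> reusingIterator(final String[][] table) {
        final String[] shared = new String[3];
        return new Iterator<String[]>() {
            private int next = 0;

            public boolean hasNext() {
                return next < table.length;
            }

            public String[] next() {
                System.arraycopy(table[next++], 0, shared, 0, shared.length);
                return shared; // the same object is returned on every call
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Gathers all rows, optionally copying each one before storing it.
    static List<String> collect(Iterator<String[]> it, boolean copy) {
        List<String[]> rows = new ArrayList<String[]>();
        while (it.hasNext()) {
            String[] row = it.next();
            rows.add(copy ? row.clone() : row);
        }
        List<String> out = new ArrayList<String>();
        for (String[] r : rows) {
            out.add(String.join(" ", r));
        }
        return out;
    }

    public static void main(String[] args) {
        // Without copying: every stored reference points at the shared buffer.
        System.out.println(collect(reusingIterator(USERS), false));
        // With copying: each record keeps its own values.
        System.out.println(collect(reusingIterator(USERS), true));
    }
}
```

If this assumption holds, the first call reproduces the repeated "34 tony tony@appp.com" rows and the second returns the three distinct records. It would also fit the Invalid resolution: JDBCRDD appears to be an internal class, and reading the table through Spark SQL's public JDBC data source may avoid the problem.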