Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-25523

Multi thread execute sparkSession.read().jdbc(url, table, properties) problem

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Cannot Reproduce
    • 2.3.0
    • 2.3.0
    • SQL
    • None

    Description

      public static void test2() throws Exception{
      String ckUrlPrefix="jdbc:clickhouse://";
      String quote = "`";
      JdbcDialects.registerDialect(new JdbcDialect() {
      @Override
      public boolean canHandle(String url)

      { return url.startsWith(ckUrlPrefix); }

      @Override
      public String quoteIdentifier(String colName)

      { return quote + colName + quote; }

      });

      SparkSession spark = initSpark();
      String ckUrl = "jdbc:clickhouse://192.168.2.148:8123/default";
      Properties ckProp = new Properties();
      ckProp.put("user", "default");
      ckProp.put("password", "");

      String prestoUrl = "jdbc:presto://192.168.2.148:9002/mysql-xxx/xxx";
      Properties prestoUrlProp = new Properties();
      prestoUrlProp.put("user", "root");
      prestoUrlProp.put("password", "");

      // new Thread(()->

      { // spark.read() // .jdbc(ckUrl, "ontime", ckProp).show(); // }

      ).start();

      System.out.println("------------------------------------------------------");

      new Thread(()->

      { spark.read() .jdbc(prestoUrl, "tx_user", prestoUrlProp).show(); }

      ).start();

      System.out.println("------------------------------------------------------");

      new Thread(()->

      { Dataset<Row> load = spark.read() .format("com.vertica.spark.datasource.DefaultSource") .option("host", "192.168.1.102") .option("port", 5433) .option("user", "dbadmin") .option("password", "manager") .option("db", "test") .option("dbschema", "public") .option("table", "customers") .load(); load.printSchema(); load.show(); }

      ).start();
      System.out.println("------------------------------------------------------");
      }

      public static SparkSession initSpark() throws Exception

      { return SparkSession.builder() .master("spark://dsjkfb1:7077") //spark://dsjkfb1:7077 .appName("Test") .config("spark.executor.instances",3) .config("spark.executor.cores",2) .config("spark.cores.max",6) //.config("spark.default.parallelism",1) .config("spark.submit.deployMode","client") .config("spark.driver.memory","2G") .config("spark.executor.memory","3G") .config("spark.driver.maxResultSize", "2G") .config("spark.local.dir", "d:\\tmp") .config("spark.driver.host", "192.168.2.148") .config("spark.scheduler.mode", "FAIR") .config("spark.jars", "F:\\project\\xxx\\vertica-jdbc-7.0.1-0.jar," + "F:\\project\\xxx\\clickhouse-jdbc-0.1.40.jar," + "F:\\project\\xxx\\vertica-spark-connector-9.1-2.1.jar," + "F:\\project\\xxx\\presto-jdbc-0.189-mining.jar")  .getOrCreate(); }

       

       

      ----------------------------  The above is code ------------------------------

      question: If i open vertica jdbc , thread will pending forever.

      And driver loging like this:

       

      2018-09-26 10:32:51 INFO SharedState:54 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/C:/Users/admin/Desktop/test-project/sparktest/spark-warehouse/').
      2018-09-26 10:32:51 INFO SharedState:54 - Warehouse path is 'file:/C:/Users/admin/Desktop/test-project/sparktest/spark-warehouse/'.
      2018-09-26 10:32:51 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2f70d6e2{/SQL,null,AVAILABLE,@Spark}
      2018-09-26 10:32:51 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1d66833d{/SQL/json,null,AVAILABLE,@Spark}
      2018-09-26 10:32:51 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65af6f3a{/SQL/execution,null,AVAILABLE,@Spark}
      2018-09-26 10:32:51 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@55012968{/SQL/execution/json,null,AVAILABLE,@Spark}
      2018-09-26 10:32:51 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@59e3f5aa{/static/sql,null,AVAILABLE,@Spark}
      2018-09-26 10:32:52 INFO StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
      2018-09-26 10:32:52 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.4.232:49434) with ID 0
      2018-09-26 10:32:52 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.4.233:44834) with ID 2
      2018-09-26 10:32:52 INFO BlockManagerMasterEndpoint:54 - Registering block manager 192.168.4.232:35380 with 1458.6 MB RAM, BlockManagerId(0, 192.168.4.232, 35380, None)
      2018-09-26 10:32:52 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.4.231:42504) with ID 1
      2018-09-26 10:32:52 INFO BlockManagerMasterEndpoint:54 - Registering block manager 192.168.4.233:40882 with 1458.6 MB RAM, BlockManagerId(2, 192.168.4.233, 40882, None)
      2018-09-26 10:32:52 INFO BlockManagerMasterEndpoint:54 - Registering block manager 192.168.4.231:44682 with 1458.6 MB RAM, BlockManagerId(1, 192.168.4.231, 44682, None)
      2018-09-26 10:33:29 INFO ApplicationHandler:353 - Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...

       

      no more.

      if I Annotate vertica jdbc code, everything will be ok.

      The vervica jdbc url is wrong , it's doesn't work.

      if i run vertica jdbc code alone, it will throw connection exception.That's right.

       

      but, when it run with other , it will pending. and spark ui show no any job exist.

       

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            zhiyin1233 huanghuai
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: