PHOENIX-4503

Phoenix-Spark plugin doesn't release zookeeper connections


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.11.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: HBase 1.2 on Linux (Ubuntu, CentOS)

    Description

      1. The Phoenix-Spark plugin does not release ZooKeeper connections.
      Example:

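      // Load the same table 50 times through the Phoenix-Spark data source.
      for (int i = 0; i < 50; i++) {
          Dataset<Row> df = sqlContext.read().format("org.apache.phoenix.spark")
                  .option("table", "\"Sales\"").option("zkUrl", "localhost:2181")
                  .load();
          df.show(2);
      }
      // Keep the JVM alive for 60 seconds so the open connections can be observed.
      Thread.sleep(1000 * 60);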
      

      When the above snippet is executed, the number of connections to port 2181 keeps increasing. The connections are not released until the main thread wakes up from sleep and the program ends, as shown below (14 is the connection count before the program starts to run):
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      16:52:05
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      22
      16:52:15
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      38
      16:52:18
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      68
      16:52:23
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      100
      16:52:27
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      116
      16:52:32
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      116
      16:52:38
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      116
      16:52:52
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      116
      16:53:00
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      116
      16:53:24
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      16:53:32
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      16:53:34
      root@user1 ~ $
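
      The manual netstat check above can also be scripted. The following is only a minimal sketch, assuming Linux "netstat -anp" style output in which the local address, foreign address and state are the fourth, fifth and sixth columns. The class name, method name and sample lines are hypothetical and are not part of Phoenix or Spark. Unlike the plain grep, it matches the port only at the end of an address, so it would not count unrelated lines that merely contain "2181".

```java
import java.util.List;

final class ZkConnectionCounter {

    /**
     * Counts netstat lines describing ESTABLISHED TCP connections whose
     * local or foreign address ends in ":<port>".
     */
    static long countEstablished(List<String> netstatLines, int port) {
        String suffix = ":" + port;
        return netstatLines.stream()
                .map(line -> line.trim().split("\\s+"))
                // Expect at least: proto, recv-q, send-q, local, foreign, state.
                .filter(cols -> cols.length >= 6)
                .filter(cols -> cols[0].startsWith("tcp"))
                .filter(cols -> cols[3].endsWith(suffix) || cols[4].endsWith(suffix))
                .filter(cols -> "ESTABLISHED".equals(cols[5]))
                .count();
    }

    public static void main(String[] args) {
        // Illustrative lines only, not captured from the system in this report.
        List<String> sample = List.of(
                "tcp   0  0 127.0.0.1:52344  127.0.0.1:2181   ESTABLISHED 4242/java",
                "tcp6  0  0 127.0.0.1:2181   127.0.0.1:52344  ESTABLISHED 1200/java",
                "tcp   0  0 0.0.0.0:2181     0.0.0.0:*        LISTEN      1200/java",
                "tcp   0  0 127.0.0.1:52350  127.0.0.1:16020  ESTABLISHED 4242/java");
        System.out.println(countEstablished(sample, 2181));
    }
}
```

      When ZooKeeper runs on the same host as the client, each connection appears twice in netstat (once per endpoint), so a count taken this way, like the grep above, includes both ends.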

      2. If the "jdbc" format is used instead to create the Spark DataFrame, the connection count does not shoot up.
      Example:

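      // Load the same table 50 times through Spark's generic JDBC data source.
      for (int i = 0; i < 50; i++) {
          Dataset<Row> df = sqlContext.read().format("jdbc")
                  .option("url", "jdbc:phoenix:localhost:2181")
                  .option("dbtable", "\"Sales\"")
                  .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
                  .load();
          df.show(2);
      }
      // Keep the JVM alive for 60 seconds so the open connections can be observed.
      Thread.sleep(1000 * 60);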
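
      The two snippets differ only in the reader options. As a rough sketch of that mapping, the helper below builds the "jdbc" options from the same values the Phoenix-Spark reader takes. It assumes the simple host:port form of zkUrl used in this report. The class and method names are hypothetical and exist in neither Phoenix nor Spark.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PhoenixReaderOptions {

    /**
     * Builds the options used with format("jdbc") in this report from the
     * zkUrl and table values used with format("org.apache.phoenix.spark").
     * Assumes zkUrl is a plain "host:port" string.
     */
    static Map<String, String> jdbcOptions(String zkUrl, String table) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("url", "jdbc:phoenix:" + zkUrl);
        opts.put("dbtable", table);
        opts.put("driver", "org.apache.phoenix.jdbc.PhoenixDriver");
        return opts;
    }

    public static void main(String[] args) {
        Map<String, String> opts = jdbcOptions("localhost:2181", "\"Sales\"");
        opts.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

      The resulting map could then be passed to the reader in one call, for example sqlContext.read().format("jdbc").options(opts).load(), instead of chaining individual option(...) calls.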
      

      Connection counts while the "jdbc" snippet runs (14 being the count before execution starts):

      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      17:00:42
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      17:00:43
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:00:46
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:00:50
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:00:55
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:12
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:18
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:28
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:34
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:37
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      16
      17:01:39
      root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
      14
      17:02:07
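
      To put the two runs side by side, the sketch below computes how many netstat entries each run holds above the baseline at its peak. It uses only the counts reported above; the class and method names are hypothetical.

```java
final class LeakSummary {

    /** Returns the highest sample minus the baseline (0 if no sample exceeds it). */
    static int peakAboveBaseline(int[] samples, int baseline) {
        int peak = baseline;
        for (int sample : samples) {
            peak = Math.max(peak, sample);
        }
        return peak - baseline;
    }

    public static void main(String[] args) {
        int baseline = 14;    // count before either program starts
        int iterations = 50;  // loop iterations in both snippets

        // netstat counts copied from the two runs in this report.
        int[] phoenixSpark = {14, 22, 38, 68, 100, 116, 116, 116, 116, 116, 14, 14};
        int[] jdbc = {14, 14, 16, 16, 16, 16, 16, 16, 16, 16, 16, 14};

        int sparkExtra = peakAboveBaseline(phoenixSpark, baseline); // 116 - 14 = 102
        int jdbcExtra = peakAboveBaseline(jdbc, baseline);          // 16 - 14 = 2

        System.out.println("phoenix-spark extra entries at peak: " + sparkExtra);
        System.out.println("jdbc extra entries at peak: " + jdbcExtra);
        System.out.println("phoenix-spark extra entries per iteration: "
                + (double) sparkExtra / iterations);
    }
}
```

      With the Phoenix-Spark format the count grows by roughly two netstat entries per loop iteration, while with the "jdbc" format it stays at two in total for the whole run. Whether two entries correspond to one connection or two depends on whether both endpoints are on the same host, which the report does not state explicitly.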

            People

              Assignee: Unassigned
              Reporter: Suhas Nalapure (snalapure@dataken.net)
              Votes: 0
              Watchers: 7
