Apache HAWQ (Retired) / HAWQ-1549

Re-syncing standby fails even when stop mode is fast

Details

    Description

      Recently observed while re-syncing the standby from the hawq command line: the resync fails if a client connection is open, even when stop mode fast is requested.

      Steps to reproduce:

      1 - Open a client connection to hawq using psql
      2 - From a different terminal, run: hawq init standby -n -v -M fast
      3 - Standby resync fails with the following errors:

      20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-There are other connections to this instance, shutdown mode smart aborted
      
      20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-Either remove connections, or use 'hawq stop master -M fast' or 'hawq stop master -M immediate'
      
      20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-See hawq stop --help for all options
      
      20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[ERROR]:-Active connections. Aborting shutdown...
      
      20171113:03:49:21:158143 hawq_init:hdp3:gpadmin-[ERROR]:-Stop hawq cluster failed, exit
      

      4 - Expected behaviour: when -M (stop mode) is passed as fast, existing client connections should be terminated and the resync should proceed.

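      The source of this issue appears to be the _resync_standby method in tools/bin/hawq_ctl. The stop command it builds is hard-coded as "hawq stop master -a" and does not include the stop mode passed in the arguments, so the master is stopped in smart mode, which aborts while connections are open.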

      def _resync_standby(self):
          logger.info("Re-sync standby")
          cmd = "%s; hawq stop master -a;" % source_hawq_env
          check_return_code(local_ssh(cmd, logger), logger, "Stop hawq cluster failed, exit")
          ......
          ......
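
      One possible shape for the fix is to build the stop command from the requested stop mode instead of hard-coding it. The following is only a minimal sketch, not the actual patch: the helper name build_stop_master_cmd is hypothetical, and it assumes the stop mode parsed from -M is available to _resync_standby and that smart is the mode used when -M is not given.

```python
# Hypothetical helper, not part of hawq_ctl: shows how the stop command
# could carry the stop mode that was passed on the command line.

# Stop modes named in the hawq_stop log output above.
VALID_STOP_MODES = ("smart", "fast", "immediate")


def build_stop_master_cmd(source_hawq_env, stop_mode="smart"):
    """Return the shell command that stops the master before a resync.

    source_hawq_env: shell snippet that sources the HAWQ environment.
    stop_mode: value of -M; assumed to default to "smart".
    """
    if stop_mode not in VALID_STOP_MODES:
        raise ValueError("unsupported stop mode: %r" % (stop_mode,))
    # Same command as in _resync_standby, plus the -M option.
    return "%s; hawq stop master -a -M %s;" % (source_hawq_env, stop_mode)
```

      With a placeholder environment string such as "source env.sh", build_stop_master_cmd("source env.sh", "fast") yields "source env.sh; hawq stop master -a -M fast;", which _resync_standby could then pass to local_ssh as before.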
      

      I can start working on this and submit a PR once the changes are done.

            People

              outofmemory Shubham Sharma