  1. Apache HAWQ (Retired)
  2. HAWQ-1433

ALTER RESOURCE QUEUE DDL does not check the format of the attributes MEMORY_LIMIT_CLUSTER and CORE_LIMIT_CLUSTER

Details

    Description

      Shubham Sharma <topologicalqubit@gmail.com>
      2:11 PM (2 hours ago)

      to user, sebastiao.gone.
      Hello Sebastio, I think you have encountered the following issue:

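      1 - Problem - CORE_LIMIT_CLUSTER or MEMORY_LIMIT_CLUSTER is set to a value without a percent sign, and the statement is accepted, e.g. alter resource queue pg_default with (CORE_LIMIT_CLUSTER=90);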

      gpadmin=# select * from pg_resqueue;
      rsqname | parentoid | activestats | memorylimit | corelimit | resovercommit | allocpolicy | vsegresourcequota | nvsegupperlimit | nvseglowerlimit | nvsegupperlimitperseg | nvseglowerlimitperseg | creationtime | updatetime | status
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      pg_root | 0 | -1 | 100% | 100% | 2 | even | | 0 | 0 | 0 | 0 | | | branch
      pg_default | 9800 | 20 | 50% | 50% | 2 | even | mem:256mb | 0 | 0 | 0 | 0 | | 2017-04-12 22:45:55.056102+01 |
      (2 rows)

      gpadmin=# alter resource queue pg_default with (CORE_LIMIT_CLUSTER=90);
      ALTER QUEUE

      gpadmin=# select * from test;
      a

      (0 rows)
      gpadmin=# \q

      2 - Restart the HAWQ cluster

      3 - ERROR

      [gpadmin@hdp3 ~]$ psql
      psql (8.2.15)
      Type "help" for help.
      gpadmin=# select * from test;
      WARNING: FD 31 having errors raised. errno 104
      ERROR: failed to register in resource manager, failed to receive content (pquery.c:787)

      Switching back with alter resource queue pg_default with (CORE_LIMIT_CLUSTER/MEMORY_LIMIT_CLUSTER=50%); is not allowed at this point either:
      alter resource queue pg_default with (CORE_LIMIT_CLUSTER=50%);
      WARNING: FD 33 having errors raised. errno 104
      ERROR: failed to register in resource manager, failed to receive content (resqueuecommand.c:364)

      4 - How to fix - Please be extra careful with this, because it modifies the system catalog pg_resqueue directly.
      gpadmin=# begin;
      BEGIN
      gpadmin=# set allow_system_table_mods='dml';
      SET
      gpadmin=# select * from pg_resqueue where corelimit=90;
      rsqname | parentoid | activestats | memorylimit | corelimit | resovercommit | allocpolicy | vsegresourcequota | nvsegupperlimit | nvseglowerlimit | nvsegupperlimitperseg | nvseglowerlimitperseg | creationtime | updatetime | status
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      pg_default | 9800 | 20 | 50% | 90 | 2 | even | mem:256mb | 0 | 0 | 0 | 0 | | 2017-04-12 22:59:30.092823+01 |
      (1 row)
      gpadmin=# update pg_resqueue set corelimit='50%' where corelimit=90;
      UPDATE 1
      gpadmin=# commit;
      COMMIT

      5 - System should be back to normal

      gpadmin=# select * from test;
      a

      (0 rows)

      Regards,
      Shubh
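
The report comes down to a missing format check on the two cluster limits. Below is a minimal sketch of such a check, written in Python purely for illustration; it is not HAWQ's implementation. It assumes that a valid limit is an integer percentage such as "50%", as in the pg_resqueue rows above, and that the accepted range is 1 to 100. The function names are hypothetical.

```python
import re

# Assumption: a valid cluster limit is an integer followed by "%",
# like the "50%" and "100%" values shown in pg_resqueue in this report.
_PERCENT_RE = re.compile(r"([0-9]+)%")


def is_valid_cluster_limit(value: str) -> bool:
    """Return True if value looks like an integer percentage in 1..100.

    The 1..100 range is an assumption made for this sketch.
    """
    match = _PERCENT_RE.fullmatch(value.strip())
    if match is None:
        return False
    return 1 <= int(match.group(1)) <= 100


def find_bad_limits(rows):
    """Given (rsqname, memorylimit, corelimit) tuples, return the names of
    queues where either limit fails the format check."""
    return [
        name
        for name, memorylimit, corelimit in rows
        if not (is_valid_cluster_limit(memorylimit)
                and is_valid_cluster_limit(corelimit))
    ]


if __name__ == "__main__":
    # Values copied from the pg_resqueue output after the faulty ALTER.
    rows = [("pg_root", "100%", "100%"), ("pg_default", "50%", "90")]
    print(find_bad_limits(rows))
```

Run against the values from the broken state, the sketch flags pg_default, whose corelimit of 90 lacks the percent sign. A check of this kind at DDL time would reject the statement before the bad value reaches the catalog.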


            People

              xsheng Xiang Sheng
              yjin Yi Jin