Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.0.1
    • Fix Version/s: 3.1.1, 3.0.4
    • Component/s: Configuration, Manager
    • Labels:
      None

      Description

      traffic_server will exit if exists bad rules in remap.config, whenever startup or reload ( traffic_line -x ).

      If remap.config is not edited correctly, the server/cluster will crash when reloading.

      It's better to pre-check the remap table and not switch to the new table if remap.config is not enough correct.

        Activity

        Hide
        Leif Hedstrom added a comment -

        Please try the changes on trunk, I'm closing this for now.

        Show
        Leif Hedstrom added a comment - Please try the changes on trunk, I'm closing this for now.
        Hide
        Conan Wang added a comment -

        My test:
        1. duplicate remap rule and adding non-exist remap plugin now will not crash running ATS. I can see "WARNING: failed to reload remap.config, not replacing!" in traffic.out.

        2. though bad remap.config is not reloaded, seem "TSRemapDeleteInstance" (TSfree ihandle) of my remap plugin is called according to my debug log. However, my remap plugin work well like before, that is, I can still retrieve the stored variable in ihandle. I don't know whether there is potential problem.

        3. return TS_ERROR in "TSRemapNewInstance" still crash.

        Server {0x1079f6000} ERROR: Failed to create new instance for plugin /Users/conan/box/ts-trunk/libexec/trafficserver/add_header.so (non-zero retval)... bailing out
        FATAL: UnixUDPNet.cc:295: failed assert `event != NULL`
        /Users/conan/box/ts-trunk/bin/traffic_server - STACK TRACE: 
        0   libtsutil.3.dylib                   0x0000000102fb05ab ink_fatal_va + 283
        1   libtsutil.3.dylib                   0x0000000102fb08b4 ink_fatal + 356
        2   libtsutil.3.dylib                   0x0000000102fadcff _ink_assert + 271
        3   traffic_server                      0x0000000102684f45 _ZN19UDPReadContinuationD1Ev + 85
        4   traffic_server                      0x00000001026850c6 _ZN14ClassAllocatorI19UDPReadContinuationE4$_44D1Ev + 24
        5   traffic_server                      0x000000010268f59d _ZN14ClassAllocatorI19UDPReadContinuationED1Ev + 37
        6   traffic_server                      0x00000001026850eb __tcf_3 + 27
        7   libsystem_c.dylib                   0x00007fff872217c8 __cxa_finalize + 274
        8   libsystem_c.dylib                   0x00007fff87221652 exit + 18
        9   traffic_server                      0x00000001024aca9c _ZN10UrlRewrite17load_remap_pluginEPPciP11url_mappingS0_iiPi + 8284
        10  traffic_server                      0x00000001024b5473 _ZN10UrlRewrite10BuildTableEv + 12757
        11  traffic_server                      0x00000001024b8dc5 _ZN10UrlRewriteC1EPKc + 2165
        12  traffic_server                      0x00000001023b450e _Z16reloadUrlRewritev + 350
        13  traffic_server                      0x00000001023b5296 _ZN21UR_UpdateContinuation19file_update_handlerEiPv + 24
        14  traffic_server                      0x00000001023d7434 _ZN12Continuation11handleEventEiPv + 148
        15  traffic_server                      0x00000001026a787a _ZN7EThread13process_eventEP5Eventi + 418
        16  traffic_server                      0x00000001026a6c85 _ZN7EThread7executeEv + 191
        17  traffic_server                      0x00000001026a6174 _ZL21spawn_thread_internalPv + 132
        18  libsystem_c.dylib                   0x00007fff8722e8bf _pthread_start + 335
        19  libsystem_c.dylib                   0x00007fff87231b75 thread_start + 13
        
        Show
        Conan Wang added a comment - My test: 1. duplicate remap rule and adding non-exist remap plugin now will not crash running ATS. I can see "WARNING: failed to reload remap.config, not replacing!" in traffic.out. 2. though bad remap.config is not reloaded, seem "TSRemapDeleteInstance" (TSfree ihandle) of my remap plugin is called according to my debug log. However, my remap plugin work well like before, that is, I can still retrieve the stored variable in ihandle. I don't know whether there is potential problem. 3. return TS_ERROR in "TSRemapNewInstance" still crash. Server {0x1079f6000} ERROR: Failed to create new instance for plugin /Users/conan/box/ts-trunk/libexec/trafficserver/add_header.so (non-zero retval)... bailing out FATAL: UnixUDPNet.cc:295: failed assert `event != NULL` /Users/conan/box/ts-trunk/bin/traffic_server - STACK TRACE: 0 libtsutil.3.dylib 0x0000000102fb05ab ink_fatal_va + 283 1 libtsutil.3.dylib 0x0000000102fb08b4 ink_fatal + 356 2 libtsutil.3.dylib 0x0000000102fadcff _ink_assert + 271 3 traffic_server 0x0000000102684f45 _ZN19UDPReadContinuationD1Ev + 85 4 traffic_server 0x00000001026850c6 _ZN14ClassAllocatorI19UDPReadContinuationE4$_44D1Ev + 24 5 traffic_server 0x000000010268f59d _ZN14ClassAllocatorI19UDPReadContinuationED1Ev + 37 6 traffic_server 0x00000001026850eb __tcf_3 + 27 7 libsystem_c.dylib 0x00007fff872217c8 __cxa_finalize + 274 8 libsystem_c.dylib 0x00007fff87221652 exit + 18 9 traffic_server 0x00000001024aca9c _ZN10UrlRewrite17load_remap_pluginEPPciP11url_mappingS0_iiPi + 8284 10 traffic_server 0x00000001024b5473 _ZN10UrlRewrite10BuildTableEv + 12757 11 traffic_server 0x00000001024b8dc5 _ZN10UrlRewriteC1EPKc + 2165 12 traffic_server 0x00000001023b450e _Z16reloadUrlRewritev + 350 13 traffic_server 0x00000001023b5296 _ZN21UR_UpdateContinuation19file_update_handlerEiPv + 24 14 traffic_server 0x00000001023d7434 _ZN12Continuation11handleEventEiPv + 148 15 traffic_server 0x00000001026a787a _ZN7EThread13process_eventEP5Eventi + 418 16 traffic_server 0x00000001026a6c85 _ZN7EThread7executeEv + 191 17 traffic_server 0x00000001026a6174 _ZL21spawn_thread_internalPv + 132 18 libsystem_c.dylib 0x00007fff8722e8bf _pthread_start + 335 19 libsystem_c.dylib 0x00007fff87231b75 thread_start + 13
        Hide
        Leif Hedstrom added a comment -

        On 2, yeas, as expected. It still goes through all instantiatiobs, but "unwinds" on failure . Since this is no different for your plugin whether it succeeds or not, it shouldn't matter. You still have to deal with proper uninstantiation on every reload.

        On 3, bah, missed that special case, nice catch. I'll look at it tomorrow.

        Show
        Leif Hedstrom added a comment - On 2, yeas, as expected. It still goes through all instantiatiobs, but "unwinds" on failure . Since this is no different for your plugin whether it succeeds or not, it shouldn't matter. You still have to deal with proper uninstantiation on every reload. On 3, bah, missed that special case, nice catch. I'll look at it tomorrow.
        Hide
        Leif Hedstrom added a comment -

        Reopen to deal with plugins failing to reinstatiate,

        Show
        Leif Hedstrom added a comment - Reopen to deal with plugins failing to reinstatiate,
        Hide
        Leif Hedstrom added a comment -

        Let me know if this last commit fixes it.

        Show
        Leif Hedstrom added a comment - Let me know if this last commit fixes it.
        Hide
        Leif Hedstrom added a comment -

        Resolving again, please reopen with more issues .

        Show
        Leif Hedstrom added a comment - Resolving again, please reopen with more issues .
        Hide
        Conan Wang added a comment -

        Fix that. But returning TS_ERROR in "TSRemapInit" also crashes. Sorry for missing that test in advance.

        Show
        Conan Wang added a comment - Fix that. But returning TS_ERROR in "TSRemapInit" also crashes. Sorry for missing that test in advance.
        Hide
        Conan Wang added a comment -

        My test:
        return TS_ERROR directly in TSRemapInit

        TSReturnCode⋅
        TSRemapInit(TSRemapInterface* api_info, char* errbuf, int errbuf_size)
        {
           return TS_ERROR;
        
        [Oct 15 14:51:28.411] Server {0x7fff739b6960} WARNING: Failed to initialize plugin /Users/conan/box/ts-trunk/libexec/trafficserver/add_header.so (non-zero retval) ... bailing out
        FATAL: UnixUDPNet.cc:295: failed assert `event != NULL`
        /Users/conan/box/ts-trunk/bin/traffic_server - STACK TRACE: 
        0   libtsutil.3.dylib                   0x0000000105af6e3b ink_fatal_va + 283
        1   libtsutil.3.dylib                   0x0000000105af7144 ink_fatal + 356
        2   libtsutil.3.dylib                   0x0000000105af458f _ink_assert + 271
        3   traffic_server                      0x00000001051c9ed5 _ZN19UDPReadContinuationD1Ev + 85
        4   traffic_server                      0x00000001051ca056 _ZN14ClassAllocatorI19UDPReadContinuationE4$_44D1Ev + 24
        5   traffic_server                      0x00000001051d452d _ZN14ClassAllocatorI19UDPReadContinuationED1Ev + 37
        6   traffic_server                      0x00000001051ca07b __tcf_3 + 27
        7   libsystem_c.dylib                   0x00007fff85e547c8 __cxa_finalize + 274
        8   libsystem_c.dylib                   0x00007fff85e54652 exit + 18
        9   traffic_server                      0x0000000104ff0b18 _ZN10UrlRewrite17load_remap_pluginEPPciP11url_mappingS0_iiPi + 4552
        10  traffic_server                      0x0000000104ffa383 _ZN10UrlRewrite10BuildTableEv + 12757
        11  traffic_server                      0x0000000104ffdcf5 _ZN10UrlRewriteC1EPKc + 2165
        12  traffic_server                      0x0000000104ef9ada _Z18init_reverse_proxyv + 138
        13  traffic_server                      0x0000000104f623bd _Z20init_HttpProxyServerv + 13
        14  traffic_server                      0x0000000104ecebd9 main + 6073
        15  traffic_server                      0x0000000104e582a4 start + 52
        [Oct 15 14:51:28.747] Manager {0x7fff739b6960} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 6: Abort trap: 6
        
        Show
        Conan Wang added a comment - My test: return TS_ERROR directly in TSRemapInit TSReturnCode⋅ TSRemapInit(TSRemapInterface* api_info, char * errbuf, int errbuf_size) { return TS_ERROR; [Oct 15 14:51:28.411] Server {0x7fff739b6960} WARNING: Failed to initialize plugin /Users/conan/box/ts-trunk/libexec/trafficserver/add_header.so (non-zero retval) ... bailing out FATAL: UnixUDPNet.cc:295: failed assert `event != NULL` /Users/conan/box/ts-trunk/bin/traffic_server - STACK TRACE: 0 libtsutil.3.dylib 0x0000000105af6e3b ink_fatal_va + 283 1 libtsutil.3.dylib 0x0000000105af7144 ink_fatal + 356 2 libtsutil.3.dylib 0x0000000105af458f _ink_assert + 271 3 traffic_server 0x00000001051c9ed5 _ZN19UDPReadContinuationD1Ev + 85 4 traffic_server 0x00000001051ca056 _ZN14ClassAllocatorI19UDPReadContinuationE4$_44D1Ev + 24 5 traffic_server 0x00000001051d452d _ZN14ClassAllocatorI19UDPReadContinuationED1Ev + 37 6 traffic_server 0x00000001051ca07b __tcf_3 + 27 7 libsystem_c.dylib 0x00007fff85e547c8 __cxa_finalize + 274 8 libsystem_c.dylib 0x00007fff85e54652 exit + 18 9 traffic_server 0x0000000104ff0b18 _ZN10UrlRewrite17load_remap_pluginEPPciP11url_mappingS0_iiPi + 4552 10 traffic_server 0x0000000104ffa383 _ZN10UrlRewrite10BuildTableEv + 12757 11 traffic_server 0x0000000104ffdcf5 _ZN10UrlRewriteC1EPKc + 2165 12 traffic_server 0x0000000104ef9ada _Z18init_reverse_proxyv + 138 13 traffic_server 0x0000000104f623bd _Z20init_HttpProxyServerv + 13 14 traffic_server 0x0000000104ecebd9 main + 6073 15 traffic_server 0x0000000104e582a4 start + 52 [Oct 15 14:51:28.747] Manager {0x7fff739b6960} ERROR: [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 6: Abort trap: 6
        Hide
        Leif Hedstrom added a comment -

        Argh, I missed one place ... Fixing it now, thanks for cathing it.

        And just to be sure, note that we'll fail to startup the "first" time if the remap file is broken. This is intentional, and I would be hard to convince otherwise. What the changes do is to prevent a running server to exit when you reload the config.

        Show
        Leif Hedstrom added a comment - Argh, I missed one place ... Fixing it now, thanks for cathing it. And just to be sure, note that we'll fail to startup the "first" time if the remap file is broken. This is intentional, and I would be hard to convince otherwise. What the changes do is to prevent a running server to exit when you reload the config.
        Hide
        Brian Geffon added a comment -

        Backported to 3.0.x in commit 2f9ccb0e21cdcf8392ee15f909f2b12d8b35d991

        Show
        Brian Geffon added a comment - Backported to 3.0.x in commit 2f9ccb0e21cdcf8392ee15f909f2b12d8b35d991

          People

          • Assignee:
            Brian Geffon
            Reporter:
            Conan Wang
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development