Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-9570

Logs backed up on restart are kept forever

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.3, 7.0
    • Component/s: scripts and tools
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:

      Description

      When (re)starting Solr, the start script will backup any existing solr.log or solr_gc.log to a file solr_log_<date> and solr_gc_log_<date> respectively. That may be all good, but these old copies are never cleaned up, as they are not under the control of log4j.
      This issue will instead rotate solr.log properly on startup, delete old time-stamped files taking up place, back up (one generation only) of console-log and solr_gc.log in $SOLR_LOGS_DIR/archived/.

      1. SOLR-8370.patch
        9 kB
        Jan Høydahl

        Issue Links

          Activity

          Hide
          elyograg Shawn Heisey added a comment -

          If old logs are moved into a subdirectory, then the shell script can use the "find" utility to clean them up. This utility exists on most POSIX systems and although there are differences in capability between different systems, I think there is strong effort to accept the options for core usage like this on all platforms.

          It's possible to do something similar in Windows, but in typical style, it requires a syntax change depending on the Windows version.

          Show
          elyograg Shawn Heisey added a comment - If old logs are moved into a subdirectory, then the shell script can use the "find" utility to clean them up. This utility exists on most POSIX systems and although there are differences in capability between different systems, I think there is strong effort to accept the options for core usage like this on all platforms. It's possible to do something similar in Windows, but in typical style, it requires a syntax change depending on the Windows version.
          Hide
          janhoy Jan Høydahl added a comment -

          Actually my plan is as follows:
          For the main solr.log I propose to stop moving the old log before startup, and instead rely on log4j's default of continuing to append the existing file. Although it is nice with a new file on each startup, I don't think it is a must? This single change will fix the issue. You can just search for "Welcome to Apache Solr" to find each new startup.

          Wrt the solr_gc.log I'm also going to enable log rotation, which is supported since JRE7. But the gc log rotation works differently, it will first write to solr_gc.log.0.current until size limit and then rename it to solr_gc.log.0 and start writing solr_gc.log.1.current etc. On every restart it unfortunately always starts overwriting solr_gc.log.0.current. One option could be to on startup create $SOLR_LOGS_DIR/old/ (or wipe it if exists) and move all current gc logs there.

          Show
          janhoy Jan Høydahl added a comment - Actually my plan is as follows: For the main solr.log I propose to stop moving the old log before startup, and instead rely on log4j's default of continuing to append the existing file. Although it is nice with a new file on each startup, I don't think it is a must? This single change will fix the issue. You can just search for "Welcome to Apache Solr" to find each new startup. Wrt the solr_gc.log I'm also going to enable log rotation, which is supported since JRE7. But the gc log rotation works differently, it will first write to solr_gc.log.0.current until size limit and then rename it to solr_gc.log.0 and start writing solr_gc.log.1.current etc. On every restart it unfortunately always starts overwriting solr_gc.log.0.current . One option could be to on startup create $SOLR_LOGS_DIR/old/ (or wipe it if exists) and move all current gc logs there.
          Hide
          dsmiley David Smiley added a comment -

          I like the idea of keeping with Log4j's options rather than manual/hand-rolled file juggling. Can we tell Log4j, at some very early of point of startup (like before we've logged anything) to rotate the log? That'd be nice. I wonder if Log4j2 or Logback support such a configuration option – it seems like something very desirable. If we don't find such an option, I'm fine with Jan's recommendation of just accepting that the logs since boot aren't necessarily at the top.

          Show
          dsmiley David Smiley added a comment - I like the idea of keeping with Log4j's options rather than manual/hand-rolled file juggling. Can we tell Log4j, at some very early of point of startup (like before we've logged anything) to rotate the log? That'd be nice. I wonder if Log4j2 or Logback support such a configuration option – it seems like something very desirable. If we don't find such an option, I'm fine with Jan's recommendation of just accepting that the logs since boot aren't necessarily at the top.
          Hide
          elyograg Shawn Heisey added a comment -

          I think we could avoid creating a console logfile at all in most conditions, and not worry about rotating it at all. Here's what I think should happen with the console logging:

          • background, console NOT muted: redirect to console log.
          • background, console muted: redirect to /dev/null.
          • foreground: don't mute. Output to console, do not redirect or copy to console log.

          I don't know whether the first condition would ever actually happen, but if it can happen, we could have console log rotation only in that situation, rotating the console log through a fixed and limited set of filenames where the oldest one is simply deleted.

          The main log is rotated by log4j, but the script currently renames the active log on startup, so that a fresh log is always from server start. If a predictable string of text is logged only at service start and can be found with a backwards search from the end of the file, we might not need to have the script rename solr.log at startup. If log4j can do startup rotation, that would take care of the issue entirely.

          I'll take your word on the GC log. I haven't looked into it.

          Show
          elyograg Shawn Heisey added a comment - I think we could avoid creating a console logfile at all in most conditions, and not worry about rotating it at all. Here's what I think should happen with the console logging: background, console NOT muted: redirect to console log. background, console muted: redirect to /dev/null. foreground: don't mute. Output to console, do not redirect or copy to console log. I don't know whether the first condition would ever actually happen, but if it can happen, we could have console log rotation only in that situation, rotating the console log through a fixed and limited set of filenames where the oldest one is simply deleted. The main log is rotated by log4j, but the script currently renames the active log on startup, so that a fresh log is always from server start. If a predictable string of text is logged only at service start and can be found with a backwards search from the end of the file, we might not need to have the script rename solr.log at startup. If log4j can do startup rotation, that would take care of the issue entirely. I'll take your word on the GC log. I haven't looked into it.
          Hide
          janhoy Jan Høydahl added a comment -

          The file solr-8983-console.log is simply a redirect of all stdout/err from the start script. It is only written when in backgoround mode, and log4j's ConsoleAppender is always muted when the file is written. The file is useful since it will capture any errors happening outside of the logging system such as System.out.println or stdout/stderr printouts from some of Solr's dependencies etc. So we should never redirect stdout/err to /dev/null. I think we should try to do some simple rotation not to lose potential important info after a crash.

          The main log will append, and when we switch to Log4j2 we can implement startup rotation.

          Show
          janhoy Jan Høydahl added a comment - The file solr-8983-console.log is simply a redirect of all stdout/err from the start script. It is only written when in backgoround mode, and log4j's ConsoleAppender is always muted when the file is written. The file is useful since it will capture any errors happening outside of the logging system such as System.out.println or stdout/stderr printouts from some of Solr's dependencies etc. So we should never redirect stdout/err to /dev/null . I think we should try to do some simple rotation not to lose potential important info after a crash. The main log will append, and when we switch to Log4j2 we can implement startup rotation.
          Hide
          elyograg Shawn Heisey added a comment -

          The file is useful since it will capture any errors happening outside of the logging system

          Good point, and I did not think of that.

          If we'll always be attempting to mute log4j's CONSOLE output when in background mode, perhaps we can let the console log grow, because it will not normally get much output. Alternately we could rotate it through a simple fixed set of 10 filenames, changing a single 0-9 digit ... so there would be history for several start attempts.

          Show
          elyograg Shawn Heisey added a comment - The file is useful since it will capture any errors happening outside of the logging system Good point, and I did not think of that. If we'll always be attempting to mute log4j's CONSOLE output when in background mode, perhaps we can let the console log grow, because it will not normally get much output. Alternately we could rotate it through a simple fixed set of 10 filenames, changing a single 0-9 digit ... so there would be history for several start attempts.
          Hide
          janhoy Jan Høydahl added a comment -

          Another option is to log stdout/err to the System log. In Linux that can be done by piping java ... 2>&1 | logger. Personally I like a plain file better...

          Show
          janhoy Jan Høydahl added a comment - Another option is to log stdout/err to the System log. In Linux that can be done by piping java ... 2>&1 | logger . Personally I like a plain file better...
          Hide
          erickerickson Erick Erickson added a comment -

          Reasonable points, although I can quibble about whether there should ever be System.out.println calls anywhere. I suppose there are some situations where we may want to give feedback before the logging stuff is initialized.

          I can report from "in the field" that more than one client has run out of disk space because the CONSOLE log grew without bound. That's taken care of recently, so just as long as we keep it so ....

          Show
          erickerickson Erick Erickson added a comment - Reasonable points, although I can quibble about whether there should ever be System.out.println calls anywhere. I suppose there are some situations where we may want to give feedback before the logging stuff is initialized. I can report from "in the field" that more than one client has run out of disk space because the CONSOLE log grew without bound. That's taken care of recently, so just as long as we keep it so ....
          Hide
          janhoy Jan Høydahl added a comment -

          Patch:

          * SOLR-9570: Various log tidying now happens at Solr startup:
            Old solr_log_<date> and solr_gc_log_<date> files are removed, avoiding disks to fill up,
            solr.log.X files are rotated, preserving solr.log from last run in solr.log.1, solr.log.1 => solr.log.2 etc
            solr-*-console.log files are moved into $SOLR_LOGS_DIR/archived/ instead of being overwritten
            Last JVM garbage collection log solr_gc.log is moved into $SOLR_LOGS_DIR/archived/, overwriting previous (janhoy)  
          

          This adds a new UtilsTool to SolrCLI, which handles the tasks in Java code, removing complexity from scripts. The scripts simply call the tool like this

          run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -remove_old_solr_logs 7 || echo "Failed removing old solr logs"
          run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -archive_gc_logs        || echo "Failed archiving old GC logs"
          run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -archive_console_logs   || echo "Failed archiving old console logs"
          run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -rotate_solr_logs 9     || echo "Failed rotating old solr logs"
          

          See patch for documentation of each tool option.

          The tool -rotate_solr_logs <gen> in particular, is a workaround for backing up {solr.log}}, in that it will rotate existing solr.log.N the same way that log4j does. This code can be removed when we upgrade to newer log4j which can handle this itself.

          Potential issues:

          • We now keep only the last solr_gc.log in logs/archived/, will be overwritten on next start. However, this will be fixed in SOLR-7506 when we introduce GC log rotation
          • We now archive latest solr-NN-console.log into logs/archived/. This is a 1-generation rotation only, and on two consecutive restarts only the last run console log is kept. Earlier we kept timestamped logs forever...

          Tested on OSX and on Windows 10. Further test and feedback welcome. I'll commit this on thursday if all is sunshine

          Show
          janhoy Jan Høydahl added a comment - Patch: * SOLR-9570: Various log tidying now happens at Solr startup: Old solr_log_<date> and solr_gc_log_<date> files are removed, avoiding disks to fill up, solr.log.X files are rotated, preserving solr.log from last run in solr.log.1, solr.log.1 => solr.log.2 etc solr-*-console.log files are moved into $SOLR_LOGS_DIR/archived/ instead of being overwritten Last JVM garbage collection log solr_gc.log is moved into $SOLR_LOGS_DIR/archived/, overwriting previous (janhoy) This adds a new UtilsTool to SolrCLI, which handles the tasks in Java code, removing complexity from scripts. The scripts simply call the tool like this run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -remove_old_solr_logs 7 || echo "Failed removing old solr logs" run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -archive_gc_logs || echo "Failed archiving old GC logs" run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -archive_console_logs || echo "Failed archiving old console logs" run_tool utils -s "$DEFAULT_SERVER_DIR" -l "$SOLR_LOGS_DIR" -rotate_solr_logs 9 || echo "Failed rotating old solr logs" See patch for documentation of each tool option. The tool -rotate_solr_logs <gen> in particular, is a workaround for backing up {solr.log}}, in that it will rotate existing solr.log.N the same way that log4j does. This code can be removed when we upgrade to newer log4j which can handle this itself. Potential issues: We now keep only the last solr_gc.log in logs/archived/ , will be overwritten on next start. However, this will be fixed in SOLR-7506 when we introduce GC log rotation We now archive latest solr-NN-console.log into logs/archived/ . This is a 1-generation rotation only, and on two consecutive restarts only the last run console log is kept. Earlier we kept timestamped logs forever... Tested on OSX and on Windows 10. Further test and feedback welcome. I'll commit this on thursday if all is sunshine
          Hide
          janhoy Jan Høydahl added a comment -

          This patch also fixes an old bug in the bin\solr.cmd script, where backing up of the solr.log file is locale-dependent, see

          REM backup log files (use current timestamp for backup name)
          For /f "tokens=2-4 delims=/ " %%a in ('date /t') do (set mydate=%%c-%%a-%%b)
          For /f "tokens=1-2 delims=/:" %%a in ("%TIME%") do (set mytime=%%a%%b)
          

          This code assumed a date format of MM/DD/YY while in Norwegian locale date /t prints as 17.10.2016. This code will be gone after this patch

          Show
          janhoy Jan Høydahl added a comment - This patch also fixes an old bug in the bin\solr.cmd script, where backing up of the solr.log file is locale-dependent, see REM backup log files (use current timestamp for backup name) For /f "tokens=2-4 delims=/ " %%a in ('date /t') do (set mydate=%%c-%%a-%%b) For /f "tokens=1-2 delims=/:" %%a in ( "%TIME%" ) do (set mytime=%%a%%b) This code assumed a date format of MM/DD/YY while in Norwegian locale date /t prints as 17.10.2016 . This code will be gone after this patch
          Hide
          janhoy Jan Høydahl added a comment - - edited

          Sorry, commit log had wrong issue-number.

          Committed master (97761966f30557c33b3bbb131ce64ea7905ae213)
          https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=97761966f30557c33b3bbb131ce64ea7905ae213

          Branch_6x (ed3b268d62f23e212c410cb35aa4318afa088f55)
          https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=ed3b268d62f23e212c410cb35aa4318afa088f55

          Show
          janhoy Jan Høydahl added a comment - - edited Sorry, commit log had wrong issue-number. Committed master (97761966f30557c33b3bbb131ce64ea7905ae213) https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=97761966f30557c33b3bbb131ce64ea7905ae213 Branch_6x (ed3b268d62f23e212c410cb35aa4318afa088f55) https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=ed3b268d62f23e212c410cb35aa4318afa088f55
          Hide
          janhoy Jan Høydahl added a comment -

          Resolving this. If you discover issues around the new log cleanup/move tasks, please reopen this, or create a new issue.

          Show
          janhoy Jan Høydahl added a comment - Resolving this. If you discover issues around the new log cleanup/move tasks, please reopen this, or create a new issue.
          Hide
          noble.paul Noble Paul added a comment -
          Show
          noble.paul Noble Paul added a comment - is this causing the failure https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1996/
          Hide
          janhoy Jan Høydahl added a comment -

          Reopening to fix a test failure

          Show
          janhoy Jan Høydahl added a comment - Reopening to fix a test failure
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit af88e7f54d2042a2ff5c3bef7b6016084ad15cec in lucene-solr's branch refs/heads/master from Jan Høydahl
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=af88e7f ]

          SOLR-9570: Fix test failures and start using SolrTestCaseJ4's createTempDir mm

          Show
          jira-bot ASF subversion and git services added a comment - Commit af88e7f54d2042a2ff5c3bef7b6016084ad15cec in lucene-solr's branch refs/heads/master from Jan Høydahl [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=af88e7f ] SOLR-9570 : Fix test failures and start using SolrTestCaseJ4's createTempDir mm
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1b626fa0ca0152a024cd68fbf17e042469e6e3a2 in lucene-solr's branch refs/heads/branch_6x from Jan Høydahl
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1b626fa ]

          SOLR-9570: Fix test failures and start using SolrTestCaseJ4's createTempDir mm

          (cherry picked from commit af88e7f)

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1b626fa0ca0152a024cd68fbf17e042469e6e3a2 in lucene-solr's branch refs/heads/branch_6x from Jan Høydahl [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1b626fa ] SOLR-9570 : Fix test failures and start using SolrTestCaseJ4's createTempDir mm (cherry picked from commit af88e7f)
          Hide
          shalinmangar Shalin Shekhar Mangar added a comment -

          Closing after 6.3.0 release.

          Show
          shalinmangar Shalin Shekhar Mangar added a comment - Closing after 6.3.0 release.

            People

            • Assignee:
              janhoy Jan Høydahl
              Reporter:
              janhoy Jan Høydahl
            • Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development