Key: NUTCH-342
Type: Bug
Status: Open
Priority: Minor
Assignee: Unassigned
Reporter: Chris Schneider
Votes: 1
Watchers: 0
Nutch

Nutch commands log to nutch/logs/hadoop.log by default

Created: 05/Aug/06 03:03 PM   Updated: 18/Aug/06 06:06 AM
Component/s: None
Affects Version/s: 0.8
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
  NUTCH-342.patch (Text File, 0.4 kB), 2006-08-05 09:18 PM, Chris Schneider. Licensed for inclusion in ASF works.


Description
If (by default) Nutch commands are going to send their output to a file named "hadoop.log", then it seems like the default location for this file should be the same location where Hadoop is putting its hadoop.log file (i.e., $HADOOP_LOG_DIR). Currently, if I set HADOOP_LOG_DIR to a special location (via hadoop-env.sh), this has no effect on where Nutch commands send their output.

Some would probably suggest that I could just set NUTCH_LOG_DIR to $HADOOP_LOG_DIR myself. I still think that it should be defaulted this way in the nutch script. However, I'm unaware of an elegant way to modify such Nutch environment variables anyway. The hadoop-env.sh file provides a convenient place to modify Hadoop environment variables, but doing the same for Nutch environment variables presumably requires you to modify .bash_profile or a similar user script file (which is the way I used to accomplish this kind of thing with Nutch 0.7).



Comments
Chris Schneider added a comment - 05/Aug/06 09:18 PM
Here's a patch that defaults NUTCH_LOG_DIR to $HADOOP_LOG_DIR and NUTCH_LOGFILE to $HADOOP_LOG_FILE.
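
The kind of defaulting described here could look roughly like the sketch below. This is not the content of NUTCH-342.patch; the function name nutch_log_defaults and the fallback values ($NUTCH_HOME/logs, hadoop.log) are assumptions for illustration, and the Hadoop variable names are taken as written in this comment. It only has an effect if the HADOOP_* variables are actually present in the environment when the script runs.

```shell
# Hypothetical sketch, not the attached patch.
# Precedence: explicit NUTCH_* value, then the Hadoop value, then a built-in default.
nutch_log_defaults() {
  # Fallback so the snippet runs standalone; the real script sets NUTCH_HOME itself.
  NUTCH_HOME="${NUTCH_HOME:-$PWD}"
  NUTCH_LOG_DIR="${NUTCH_LOG_DIR:-${HADOOP_LOG_DIR:-$NUTCH_HOME/logs}}"
  NUTCH_LOGFILE="${NUTCH_LOGFILE:-${HADOOP_LOG_FILE:-hadoop.log}}"
}

nutch_log_defaults
echo "Nutch would log to $NUTCH_LOG_DIR/$NUTCH_LOGFILE"
```

Nested ${VAR:-default} expansion keeps any value the user has already exported, so an explicit NUTCH_LOG_DIR still wins over HADOOP_LOG_DIR.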

Chris Schneider added a comment - 06/Aug/06 08:08 AM
I apologize for my confusion. I had been thinking that hadoop-env.sh was getting sourced when a Nutch command was run; it is not. Thus, $HADOOP_LOG_DIR and $HADOOP_LOG_FILE are not set when executing Nutch commands. For now, I think it makes most sense for me to set NUTCH_LOG_DIR and NUTCH_LOGFILE to the same locations as $HADOOP_LOG_DIR and $HADOOP_LOG_FILE via .bash_profile, etc. I consider this awkward, but am unsure about how best to address this design problem. I'm beginning to think that NUTCH_LOGFILE should default to something like "nutch-$USER-$COMMAND-`hostname`.log", which would seem more appropriate to find within the $NUTCH_HOME/logs directory.
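
A minimal sketch of the per-command file name suggested above, assuming a hypothetical helper nutch_default_logfile that receives the Nutch command name as its first argument; nothing like it exists in the nutch script as far as this issue shows.

```shell
# Hypothetical helper: builds "nutch-$USER-$COMMAND-`hostname`.log".
nutch_default_logfile() {
  command_name="$1"                   # e.g. "crawl"
  user_name="${USER:-$(id -un)}"      # fall back to id if USER is unset
  echo "nutch-${user_name}-${command_name}-$(hostname).log"
}

# Only apply the default when the user has not set NUTCH_LOGFILE.
NUTCH_LOGFILE="${NUTCH_LOGFILE:-$(nutch_default_logfile crawl)}"
echo "$NUTCH_LOGFILE"
```

Including the user, command, and host in the name would keep output from different commands and machines in separate files under $NUTCH_HOME/logs.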

Stefan Groschupf added a comment - 18/Aug/06 06:06 AM
We should clean up logging in Nutch in general ASAP!
The way things are configured today is anything but elegant or clean.