Issue Details (XML | Word | Printable)

Key: HDFS-436
Type: Sub-task Sub-task
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Konstantin Boudnik
Reporter: Konstantin Boudnik
Votes: 0
Watchers: 10
Operations

If you were logged in you would be able to see more operations.
Hadoop HDFS
HDFS-435

AspectJ framework for HDFS code and tests

Created: 09/Jun/09 06:29 PM   Updated: 23/Jun/09 04:09 PM
Return to search
Component/s: test
Affects Version/s: None
Fix Version/s: 0.21.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-22 07:17 PM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-21 05:38 PM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-20 01:40 AM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-20 01:37 AM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-20 01:34 AM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-19 10:00 PM Konstantin Boudnik 16 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-17 09:06 PM Konstantin Boudnik 15 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-16 10:48 PM Konstantin Boudnik 13 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-15 06:30 PM Konstantin Boudnik 13 kB
Text File Licensed for inclusion in ASF works HADOOP-6003.patch 2009-06-09 08:00 PM Konstantin Boudnik 13 kB
File Licensed for inclusion in ASF works HADOOP-6003.sh 2009-06-17 09:06 PM Konstantin Boudnik 0.5 kB
File Licensed for inclusion in ASF works HADOOP-6003.sh 2009-06-09 08:00 PM Konstantin Boudnik 0.5 kB

Resolution Date: 23/Jun/09 05:07 AM


 Description  « Hide
This subtask takes care about HDFS part of Hadoop only. Others will be added later as needed: it will include only new aspects development and modifications of build.xml file

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Konstantin Boudnik added a comment - 09/Jun/09 08:00 PM
Shell script for folders creation and the patch

Konstantin Boudnik added a comment - 09/Jun/09 08:10 PM - edited
This patch includes the following additions:
  • AspectJ framework (version 1.6.4) is added to the Ivy resolver's configuration
  • the implementation of a simple probability calculation and configuration needed by fault injection
  • two aspects for datanode's classes BlockReceiver and FSDataset are created and tested

It is expected to see unit tests failing with faults in place. We might need to develop different kind of tests to utilize fault injection in a better way.

The interface of the new framework is as follows:

  • ant injectfaults will weave the aspects in place after the normal compilation of HDFS classes is complete
  • ant run-test-hdfs will execute unit tests as usual, but faults will be injected according to the rules
  • ant jar will create Hadoop's jar as usual, but if 'injectfaults' has been executed before then the jar file will include instrumented classes, e.g. with fault invocations

The rules of faults injection probability calculation are as follows:

  • default probability level is set to 0. Thus even with aspects weaved into the classes faults won't be injected/executed unless specified explicitly
  • to set certain class' faults probability level one needs to specify system property in the following format
    • ant run-test-hdfs -Dfault.probability.FSDataset=3

      which will set the probability of faults injections into FSDataset class at about 3%


Hadoop QA added a comment - 12/Jun/09 12:06 AM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12410248/HADOOP-6003.patch
against trunk revision 783672.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 10 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

-1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/491/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/491/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/491/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/491/console

This message is automatically generated.


Konstantin Boudnik added a comment - 12/Jun/09 05:46 PM
As far as I can tell this failing test isn't related to the patch at all, because of its orthogonality. Test failure seems to be connected to some sort of configuration issues.

Review of the patch will be highly appreciated!


Philip Zeyliger added a comment - 15/Jun/09 03:52 AM
It looks like some of the patches have inconsistent usage of tabs/spaces; you may want to clean that up.

It is expected to see unit tests failing with faults in place. We might need to develop different kind of tests to utilize fault injection in a better way.

I think you're right on here. It seems like having a global setting for injection probability will mostly lead to test results that are hard to grok. Have you thought about setting up a separate package/target for tests that are designed to interact with injected faults? Once one or two of those tests exists, it'll be clearer whether "ant run-test-hdfs -Dfault.probability.FSDataset=3" is the right API.


Konstantin Boudnik added a comment - 15/Jun/09 06:30 PM
Indentation inconsistency is fixed.

As for the separate tests development: I agree on that. Different tests would have to be developed with FI in minds. However, for the Phase 1 (e.g. the framework introduction) we might use existing tests if in need to verify certain aspects of the error handling.

On the API: I think current proposal will not stand when we'll be talking about complex injection scenarios and would have to replaced with something else. I'm agree that we need to develop some of these scenarios before a better API might be developed.


Tsz Wo (Nicholas), SZE added a comment - 15/Jun/09 09:51 PM
I think the ant command may look like:
ant test -Dfault.injection.enable -Dfault.injection.conf=CONF_FILE

All the FI parameters, included probability values, are specified in CONF_FILE. Then, we only have to change conf file but not the build script for the later development.


Konstantin Boudnik added a comment - 15/Jun/09 10:33 PM
Agree, config file option seems like a good idea. I was planning to add this
later and have this framework to be in place ASAP, but I can surely add this
feature right now.

One more thing: I believe it would be good to have dynamic configuration i.e.
through system properties as well as the config file. This ability to pass
system properties provides ad-hoc testing opportunity if one needs to quickly
check/reproduce something.

I'll provide config file option shortly.


Konstantin Boudnik added a comment - 16/Jun/09 10:47 PM
This version of the patch includes a way to define probability levels through a standard Hadoop configuration file. Default location of the config file is conf/fi-site.xml
An alternative location might be set through -Dfault.probability.config= system property

Also, one can use a standard build.property file to specify all needed probability levels in the runtime


Konstantin Boudnik added a comment - 17/Jun/09 09:06 PM
Final modifications:
  • new package is renamed to org.apache.hadoop.fi
  • FIConfig class is renamed to FiConfig
  • system property prefix is made to be 'fi' instead of rather lengthy fault.probability
  • default config file for FI framework is added
  • missed Apache license boiler plate is attached to one of the files

Tsz Wo (Nicholas), SZE added a comment - 19/Jun/09 02:00 AM
  • Tried the following
  1. ant injectfaults
  2. set fi.* to 1
  3. ant test-core -Dtestcase=TestFileCreation
    It still passed the test. Have I done something wrong?
  • ant clean failed.
    bash-3.2$ ant clean
    Buildfile: build.xml
    ...
    BUILD FAILED
    d:\@sze\hadoop\latest\build.xml:1604: Unable to delete file d:\@sze\hadoop\latest\build\ivy\lib\Hadoop\common\aspectjtools-1.6.4.jar
    
    Total time: 0 seconds
    
  • I think fi-site.xml should not be placed in conf directory. It may confuse cluster administrators. It is better to put everything under test.

Konstantin Boudnik added a comment - 19/Jun/09 10:00 PM - edited
Slightly different version of the original patch:
conf/fi-site.xml is moved to src/test. build.xml is modified to copy it for test runs.
<iajc> task definition is moved inside of the target 'injectfaults' to guarantee that it always defined

Couldn't reproduce the issue with

ant clean

works for me every time. Is it possible that you had some files permission issues?

Wrt to TestFileCreation: I have ran this test many times and seen a failure only once. The problem with this test is that while I can confirm that faults methods are called they aren't called frequent enough (i.e. < 80 calls during the test execution) to reach the necessary threshold of 1% to inject a fault. In other words, the instrumented isn't being called often enough. I'd suggest to run TestDirectoryScanner instead where these instrumented functions are called pretty often so the test fails on every run.


Tsz Wo (Nicholas), SZE added a comment - 20/Jun/09 12:11 AM
  • I thought fi.* is the probability, but it actually is the probability in percentage. It seems to me that it is not common to specify probability in percentage. I suggest that if it is a percentage, it needs a %. i.e.
    • <value>1</value> means probability = 1
    • <value>1%</value> means probability = 0.01
  • Tried the latest patch. Everything worked fine: TestFileCreation failed when setting fi.* = 100.0. ant clean succeeded. I probably messed something up last time. My faults!

Konstantin Boudnik added a comment - 20/Jun/09 01:34 AM
I agree with the point - it is misleading when a level of probability set as an integer.
I have fixed the config file comment and ProbabilityModel code to work with the value as a float between 0.00 and 1.00
I think adding a special case to handle '%' is excessive.

Konstantin Boudnik added a comment - 20/Jun/09 01:37 AM
Default probability for all faults has to be 0.00

Konstantin Boudnik added a comment - 20/Jun/09 01:40 AM
Debug output has to have log4j's debug level

Tsz Wo (Nicholas), SZE added a comment - 20/Jun/09 06:08 PM
Codes look good. Two more minor comments:
  • javadoc need to be quoted by /** */. For example,
    +/*
    + * This class wraps the logic around fault injection configuration file
    + * Default file is expected to be found in src/test/fi-site.xml
    + * This default file should be copied by JUnit Ant's tasks to 
    + * build/test/extraconf folder before tests are ran
    + * An alternative location can be set through
    + *   -Dfault.property.config=<file_name>
    + */
  • Unnecessary space changes should be reverted. e.g.
    --- build.xml	(revision 786191)
    +++ build.xml	(working copy)
    @@ -428,7 +428,7 @@
           <fileset dir="${mapred.src.dir}" includes="mapred-default.xml"/>
         </copy>
       </target>
    -
    +  
       <target name="compile-hdfs-classes" depends="compile-core-classes">
         <jsp-compile
          uriroot="${src.webapps}/hdfs"
    @@ -452,7 +452,7 @@
         </jsp-compile>
     
         <!-- Compile Java files (excluding JSPs) checking warnings -->
    -    <javac 
    +   <javac 
          encoding="${build.encoding}" 
          srcdir="${hdfs.src.dir};${build.src}" 
          includes="org/apache/hadoop/**/*.java"

Tsz Wo (Nicholas), SZE added a comment - 20/Jun/09 06:11 PM
After the project split, this issue belongs to hdfs.

Konstantin Boudnik added a comment - 21/Jun/09 05:38 PM
JavaDoc comments tags are fixed
Better documentation for the FI classes is written.

Other comments are considered and fixes are made.


Konstantin Boudnik added a comment - 22/Jun/09 07:17 PM - edited
Patch is modified (mostly the modifications of the build.xml file) to reflect the split between projects and renaming of the core to common.

It has been testes and verified that it works with new project structure.


Tsz Wo (Nicholas), SZE added a comment - 23/Jun/09 05:07 AM
+1 patch looks good.

Tsz Wo (Nicholas), SZE added a comment - 23/Jun/09 05:07 AM
I have committed this. Thanks, Cos!

Tsz Wo (Nicholas), SZE added a comment - 23/Jun/09 01:35 PM
Forgot to post test-patch results earlier:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec] 
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

The javadoc warnings are not related. See HDFS-440.


Konstantin Boudnik added a comment - 23/Jun/09 04:09 PM
Thanks a bunch, Nicholas!

I'm so sorry to cause so many remarks - it's my very first real patch in the
Hadoop and I was kinda sluggish.

Thanks for your patience and commitment!
Cos