Hadoop Map/Reduce / MAPREDUCE-2343

repeated replace() function calls damage the performance


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: tools/rumen
    • Labels: None

    Description

      In the files

      hadoop-0.21.0/mapred/src/tools/org/apache/hadoop/tools/rumen/LoggedTaskAttempt.java (line 362)

      hadoop-0.21.0/mapred/src/tools/org/apache/hadoop/tools/rumen/LoggedTask.java (line 249)

      replace() is called several times in a row to remove special characters. Each call scans the whole string again and may allocate a new String. This is more than 5 times slower than replacing all of the characters in a single for loop.

      e.g.
       - str = str.replace('a', '|');
       - str = str.replace('b', '|');

       + StringBuilder sb = new StringBuilder(str.length());
       + for (int i = 0; i < str.length(); i++)
       + {
       +     char c = str.charAt(i);
       +     if (c == 'a' || c == 'b')
       +         sb.append('|');   // special character: substitute
       +     else
       +         sb.append(c);     // any other character: copy unchanged
       + }
       + str = sb.toString();
      

      This is the same problem as MySQL bug 45699: http://bugs.mysql.com/bug.php?id=45699
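
      The single-pass idea described above could be packaged as a small helper, sketched below. The class name SinglePassReplace, both method names, and the "specials" parameter are hypothetical and are not part of Rumen; the actual set of characters that LoggedTaskAttempt and LoggedTask replace is not shown in this report, so it is passed in as an argument. chainedReplace mirrors the current repeated-replace() pattern for comparison only.

```java
// Hypothetical helper, not part of Rumen: compares the chained-replace
// pattern with a single-pass replacement over the same input.
final class SinglePassReplace {
    private SinglePassReplace() {}

    /** Baseline: one String.replace call per special character; each call rescans the string. */
    public static String chainedReplace(String str, String specials, char replacement) {
        String result = str;
        for (int i = 0; i < specials.length(); i++) {
            result = result.replace(specials.charAt(i), replacement);
        }
        return result;
    }

    /** Single pass: visits each character once and builds the result in one buffer. */
    public static String singlePassReplace(String str, String specials, char replacement) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            // Substitute if c is one of the special characters, otherwise copy it unchanged.
            sb.append(specials.indexOf(c) >= 0 ? replacement : c);
        }
        return sb.toString();
    }
}
```

      Both methods should return the same string for the same arguments, so the single-pass version could replace the chained calls without changing behavior. The sketch does not measure the speedup; the "more than 5 times" figure is the reporter's.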

            People

              Assignee: Unassigned
              Reporter: Xiaoming Shi (nancyesmis)
              Votes: 0
              Watchers: 1
