Hadoop HDFS / HDFS-196

File length not reported correctly after application crash

Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved

    Description

      Our application (Hypertable) creates a transaction log in HDFS. This log is written with the following pattern:

      out_stream.write(header, 0, 7);
      out_stream.sync();
      out_stream.write(data, 0, amount);
      out_stream.sync();
      [...]

      However, if the application crashes and is then restarted, the following statement

      length = mFilesystem.getFileStatus(new Path(fileName)).getLen();

      returns the wrong length, even though every write was followed by a sync(). Apparently this is because getFileStatus() fetches the length from the NameNode, whose record for a file that was never closed is stale. Ideally, a call to getFileStatus() would return the accurate file length by fetching the size of the last block from the primary datanode.
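One possible workaround, sketched here under assumptions and not confirmed by this report, is to ignore the reported length and count the bytes that can actually be read from the file. The helper below works on any java.io.InputStream; the class and method names (StreamLength, countReadableBytes) are hypothetical. Whether a stream opened on an unclosed HDFS file exposes all synced bytes depends on the HDFS version, so treat the result as a lower bound, not a guaranteed length.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: not part of Hadoop or Hypertable.
public class StreamLength {

    // Counts bytes by reading until end-of-stream, ignoring any length metadata.
    public static long countReadableBytes(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a log holding a 7-byte header plus 100 bytes of data.
        byte[] log = new byte[7 + 100];
        try (InputStream in = new ByteArrayInputStream(log)) {
            System.out.println(countReadableBytes(in));
        }
    }
}
```

Against HDFS, the stream would presumably come from mFilesystem.open(new Path(fileName)), since the stream it returns is a java.io.InputStream. This costs a full read of the log, so it only makes sense once at recovery time.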

          People

            Assignee: Unassigned
            Reporter: Doug Judd (nuggetwheat)
            Votes: 2
            Watchers: 8