Lucene - Core
  1. Lucene - Core
  2. LUCENE-3230

Make FSDirectory.fsync() public and static

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 3.3, 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I find FSDirectory.fsync() (today protected and instance method) very useful as a utility to sync() files. I'd like create a FSDirectory.sync() utility which contains the exact same impl of FSDir.fsync(), and have the latter call it. We can have it part of IOUtils too, as it's a completely standalone utility.

      I would get rid of FSDir.fsync() if it wasn't protected (as if encouraging people to override it). I doubt anyone really overrides it (our core Directories don't).

      Also, while reviewing the code, I noticed that if IOE occurs, the code sleeps for 5 msec. If an InterruptedException occurs then, it immediately throws ThreadIE, completely ignoring the fact that it slept due to IOE. Shouldn't we at least pass IOE.getMessage() on ThreadIE?

      The patch is trivial, so I'd like to get some feedback before I post it.

        Activity

        Hide
        Robert Muir added a comment -

        bulk close for 3.3

        Show
        Robert Muir added a comment - bulk close for 3.3
        Hide
        Shai Erera added a comment -

        I opened LUCENE-3237 to improve how fsync works. After we move sync() to IndexOutput, a public static sync() API won't make much sense.

        Show
        Shai Erera added a comment - I opened LUCENE-3237 to improve how fsync works. After we move sync() to IndexOutput, a public static sync() API won't make much sense.
        Hide
        Michael McCandless added a comment -

        OK, I think close this issue, and open another (to switch to syncing the actual IOs we opened)... this will be a challenge for Lucene though.

        Show
        Michael McCandless added a comment - OK, I think close this issue, and open another (to switch to syncing the actual IOs we opened)... this will be a challenge for Lucene though.
        Hide
        Shai Erera added a comment -

        I opened 2 RAF instances over the same file and called getFD() on each. I've received 2 instances of FD, each had a different 'handle' value. So I agree it's not clear whether syncing one FD affects the other.

        I also found some posts on the web claiming that FS.sync() doesn't really work on all OS, and that some hardware manufacturers enable "hardware write caching", so even if the OS obeys the sync() call, the HW may not. I guess there's not much we can do about the HW case.

        So perhaps we shouldn't make it a public API after all. If someone can sync() on the same OutputStream he used to flush/close, it's better than how we do it today. I hate to introduce public API w/ big warnings.

        As for our case (Lucene usage of sync()), it'd be good if we can sync() in the IndexOutput that wrote the data. So maybe we should add sync() to IO? Not sure how it will play out in Directory, which today syncs file names, and not IndexOutput instances.

        Shall I close this issue then? Or rename it to handle the IO.sync() issue?

        Show
        Shai Erera added a comment - I opened 2 RAF instances over the same file and called getFD() on each. I've received 2 instances of FD, each had a different 'handle' value. So I agree it's not clear whether syncing one FD affects the other. I also found some posts on the web claiming that FS.sync() doesn't really work on all OS, and that some hardware manufacturers enable "hardware write caching", so even if the OS obeys the sync() call, the HW may not. I guess there's not much we can do about the HW case. http://hardware.slashdot.org/story/05/05/13/0529252/Your-Hard-Drive-Lies-to-You http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072.html (bad fsync on Mac OS X) So perhaps we shouldn't make it a public API after all. If someone can sync() on the same OutputStream he used to flush/close, it's better than how we do it today. I hate to introduce public API w/ big warnings. As for our case (Lucene usage of sync()), it'd be good if we can sync() in the IndexOutput that wrote the data. So maybe we should add sync() to IO? Not sure how it will play out in Directory, which today syncs file names, and not IndexOutput instances. Shall I close this issue then? Or rename it to handle the IO.sync() issue?
        Hide
        Michael McCandless added a comment -

        This seems OK, but my only worry is.... I'm not sure this way of fsync'ing really "works"? Ie, this code opens a r/w RAF, calls sync, closes it. It's not clear that this is guaranteed to sync file handles open in the past against the same file. This is something we separately should look into / fix, but with this uncertainty it makes me nervous exposing this as a public API... maybe we could expose it with a big warning.

        Also, while reviewing the code, I noticed that if IOE occurs, the code sleeps for 5 msec. If an InterruptedException occurs then, it immediately throws ThreadIE, completely ignoring the fact that it slept due to IOE. Shouldn't we at least pass IOE.getMessage() on ThreadIE?

        +1

        Show
        Michael McCandless added a comment - This seems OK, but my only worry is.... I'm not sure this way of fsync'ing really "works"? Ie, this code opens a r/w RAF, calls sync, closes it. It's not clear that this is guaranteed to sync file handles open in the past against the same file. This is something we separately should look into / fix, but with this uncertainty it makes me nervous exposing this as a public API... maybe we could expose it with a big warning. Also, while reviewing the code, I noticed that if IOE occurs, the code sleeps for 5 msec. If an InterruptedException occurs then, it immediately throws ThreadIE, completely ignoring the fact that it slept due to IOE. Shouldn't we at least pass IOE.getMessage() on ThreadIE? +1

          People

          • Assignee:
            Shai Erera
            Reporter:
            Shai Erera
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development