Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core, mk
    • Labels:
      None

      Description

      A file system abstraction allows new features (cross-cutting concerns) to be added in a modular way, for example:

      • detection and special behavior of out-of-disk space situation
      • profiling and statistics over JMX
      • re-try on file system problems
      • encryption
      • file system monitoring
      • replication / real-time backup on the file system level (for clustering)
      • caching (improved performance for CRX)
      • makes it easy to switch to faster file system APIs (FileChannel, memory-mapped files)
      • debugging (for example, logging all file system operations)
      • allows S3 / Hadoop / MongoDB / ... file systems to be implemented, not only by us but also by third parties, possibly the end user
      • zip file system (for example to support read-only, compressed repositories)
      • testing: simulating out of disk space and out of memory (ensure the repository doesn't get corrupted in this case)
      • testing: simulate very large files (using an in-memory file system)
      • splitting very large files into 2 GB blocks (for FAT and other file systems that don't support large files)
      • data compression (if needed)
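      As a rough illustration of the modular approach described above, here is a minimal sketch of a file system interface, an in-memory implementation, and a wrapper that simulates an out-of-disk-space situation. All names (SimpleFileSystem, MemoryFileSystem, DiskFullFileSystem) are hypothetical and chosen for this example only; this is not the API proposed in this issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal abstraction, for illustration only. */
interface SimpleFileSystem {
    OutputStream openOutput(String path) throws IOException;
    byte[] readAll(String path) throws IOException;
}

/** In-memory implementation, e.g. for tests that should not touch the disk. */
class MemoryFileSystem implements SimpleFileSystem {
    private final Map<String, ByteArrayOutputStream> files =
            new HashMap<String, ByteArrayOutputStream>();

    public OutputStream openOutput(String path) {
        // Opening a path for output replaces any previous content.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(path, out);
        return out;
    }

    public byte[] readAll(String path) throws IOException {
        ByteArrayOutputStream out = files.get(path);
        if (out == null) {
            throw new FileNotFoundException(path);
        }
        return out.toByteArray();
    }
}

/** Wrapper that fails writes once a fixed byte budget is used up. */
class DiskFullFileSystem implements SimpleFileSystem {
    private final SimpleFileSystem base;
    private long remaining;

    DiskFullFileSystem(SimpleFileSystem base, long capacityBytes) {
        this.base = base;
        this.remaining = capacityBytes;
    }

    public OutputStream openOutput(String path) throws IOException {
        // FilterOutputStream routes array writes through write(int),
        // so overriding this one method covers all write calls.
        return new FilterOutputStream(base.openOutput(path)) {
            @Override
            public void write(int b) throws IOException {
                if (remaining <= 0) {
                    throw new IOException("Simulated: no space left on device");
                }
                remaining--;
                out.write(b);
            }
        };
    }

    public byte[] readAll(String path) throws IOException {
        return base.readAll(path);
    }
}
```

      Because the wrapper implements the same interface as the file system it wraps, code under test does not need to know whether it is writing to memory, to disk, or to a simulated full disk. Other concerns from the list (logging, retry, encryption) could be layered in the same way.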

        Activity

        Stefan Guggisberg added a comment -

        so far i didn't have the need for a file system abstraction in the microkernel.
        until such need arises i'd rather stick with the standard java file i/o.
        i don't see the point in using an abstraction just for the sake of it.

        Thomas Mueller added a comment -

        > just for the sake of it.

        Could you provide more details? I have listed 15 reasons why I think it's useful (see above).

        Jukka Zitting added a comment -

        Agreed with Stefan. Components above the MK should ideally require no other storage mechanism than the MK, so this would only be needed by a specific MK implementation. Given that most of the foreseeable MK implementations are explicitly not directly based on the file system (i.e. cloud storage, etc.), I don't see too much value in such an abstraction.

        Stefan Guggisberg added a comment -

        >> just for the sake of it.
        > Could you provide more details? I have listed 15 reasons why I think it's useful (see above).

        as i said, so far i didn't have a need for encryption, zip file support etc in the current microkernel implementation.

        Thomas Mueller added a comment -

        > this would only be needed by a specific MK implementation

        Sure.

        > foreseeable MK implementations are explicitly not directly based on the file system

        Actually, that's not correct: the default data store is file based,
        and there is a plan to create a file based persistence manager within Adobe.

        > so far i didn't have a need for encryption,
        > zip file support etc in the current microkernel implementation.

        Probably you didn't know that customers were asking for such features,
        and you didn't know about existing features within Day CRX.
        Among the features already supported are:

        • detection and special behavior of out-of-disk space situation
        • profiling and statistics over JMX
        • caching (improved performance)
        • testing: simulating out of disk space and out of memory
          (ensure the repository doesn't corrupt in this case)

        Requested features are:

        • re-try on file system problems
        • encryption
        • file system monitoring
        • replication / real-time backup on the file system level (for clustering)
        • allows S3 / Hadoop / MongoDB / ... file systems to be implemented

        Additional features that would be nice for development:

        • debugging (for example, logging all file system operations)
        • testing: simulate very large files (using an in-memory file system)
        Jukka Zitting added a comment -

        I guess this should be better approached from the top instead of from the bottom.

        Do we want to implement a MK with features like the ones you describe? If yes, should that be the default implementation included in oak-mk or a separate component?

        Once we have consensus on those questions, it should be pretty straightforward to tell whether an abstraction like this is needed and where it should be located.

        Thomas Mueller added a comment -

        > I guess this should be better approached from the top instead of from the bottom.

        Some things are easier if you use a service provider interface. Java itself uses service provider interfaces quite a lot. We had a file system abstraction in Jackrabbit 2.x; unfortunately, we didn't use it in many cases. Please note that Java 7 has a file system abstraction as well, and the proposed API is very similar to the one in Java 7. Once Java 7 is the minimum requirement, we can remove our own abstraction and use the Java features. But I guess we will have to support Java 6 for quite some time still.
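        For reference, a minimal sketch of what the Java 7 file system abstraction (java.nio.file) looks like in use. It writes and reads an entry through the JDK's built-in zip file system provider, so the same Files calls work on a zip archive as on the default file system. The class and method names (ZipFsSketch, roundTrip) are made up for this example; the zip file passed in is assumed not to exist yet.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class ZipFsSketch {

    /** Writes text to an entry inside a new zip file, then reads it back. */
    static String roundTrip(Path zipFile, String entry, String text) throws IOException {
        // The zip provider is addressed with a "jar:" URI wrapping the file URI.
        URI uri = URI.create("jar:" + zipFile.toUri());

        Map<String, String> env = new HashMap<String, String>();
        env.put("create", "true"); // create the archive if it does not exist

        // Closing the file system flushes the archive to disk.
        try (FileSystem zip = FileSystems.newFileSystem(uri, env)) {
            Files.write(zip.getPath(entry), text.getBytes(StandardCharsets.UTF_8));
        }

        // Re-open the archive and read the entry with the same Files API.
        try (FileSystem zip = FileSystems.newFileSystem(uri, new HashMap<String, String>())) {
            byte[] data = Files.readAllBytes(zip.getPath(entry));
            return new String(data, StandardCharsets.UTF_8);
        }
    }
}
```

        The calling code only deals with Path and Files; which provider sits behind the Path is decided when the FileSystem is obtained.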

        > Do we want to implement a MK with features like the ones you describe?

        Yes. We have to, if we want to replace CRX 2.x without losing features. Of course we can implement the whole MK stack again in CRX, with the abstraction, but I think that's kind of pointless (especially for the data store). Or we can do it as we did in CRX, that is, implement the features without using an abstraction. But I would like to avoid that, because this resulted in a non-optimal and very inflexible solution.

        > Do we want to implement a MK with features like the ones you describe?
        > If yes, should that be the default implementation included in oak-mk or a separate component?

        Yes, that's what I propose.

        Thomas Mueller added a comment -

        Additional use case:

        • Detect unclosed files (add a special wrapper that detects this, so that
          tests that would fail in Windows also fail in Mac OS and Linux)
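        One way such a wrapper could work, as a minimal sketch: every stream handed out is registered under its file name and deregistered on close, so a test can check at the end that nothing was left open, regardless of the operating system. The class name OpenFileTracker and its methods are hypothetical and not part of any existing code in this project.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

class OpenFileTracker {
    // Streams use identity equality, so each tracked stream is its own key.
    private final Map<InputStream, String> open =
            new ConcurrentHashMap<InputStream, String>();

    /** Wraps a stream so that closing it removes it from the open set. */
    InputStream track(String name, InputStream in) {
        InputStream tracked = new FilterInputStream(in) {
            @Override
            public void close() throws IOException {
                open.remove(this);
                super.close();
            }
        };
        open.put(tracked, name);
        return tracked;
    }

    /** Names of all files that were opened but not yet closed. */
    Set<String> unclosed() {
        return new TreeSet<String>(open.values());
    }

    /** Intended to be called at the end of a test. */
    void assertAllClosed() {
        if (!open.isEmpty()) {
            throw new IllegalStateException("Unclosed files: " + unclosed());
        }
    }
}
```

        A file system wrapper would call track() in its open method, and a test teardown would call assertAllClosed().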
        Jukka Zitting added a comment -

        Where are we with this? Is there a particular place in the code where this functionality is needed, or should we resolve this perhaps as Later until such a need arises?

        On a related note, what's the status of the related o.a.j.mk.fs code in oak-core? Unless we need more of the functionality than we currently do (one-liners in MicroKernelFactory and NodeMapInDb), I'd rather replace it with Commons IO or even just plain Java IO.

        If, as it sounds like, the code will mostly be needed for the MicroKernel implementation, we should at least move the o.a.j.mk.fs package from oak-core to oak-mk.

        Thomas Mueller added a comment -

        Is there a particular reason why it needs to be removed?

        > Is there a particular place in the code where this functionality is needed

        It's not needed right now, but in my view it will be needed later this year.
        I guess it would have been simpler if it wasn't removed.

        > we should at least move the o.a.j.mk.fs package from oak-core to oak-mk.

        Yes. That's where it was originally.

        Julian Reschke added a comment -

        If it's not needed right now, it (IMHO) shouldn't be in.

        Michael Dürig added a comment -

        > If it's not needed right now, it (IMHO) shouldn't be in.

        +1

        Thomas Mueller added a comment -

        Julian, what's the point to remove it now if it will be needed later on?

        Dominique Pfister added a comment -

        > On a related note, what's the status of the related o.a.j.mk.fs code in oak-core? Unless we need more of the functionality than we currently do (one-liners in MicroKernelFactory and NodeMapInDb), I'd rather replace it with Commons IO or even just plain Java IO.

        I'd replace it with Commons IO and its FileUtils features: this is already in use in other projects (such as Jackrabbit itself) and much better tested.

        > If, as it sounds like, the code will mostly be needed for the MicroKernel implementation, we should at least move the o.a.j.mk.fs package from oak-core to oak-mk.

        I don't see a need for this in the MicroKernel implementation: instead of moving it back into this project, I'd rather move it to oak-commons and rename the package to o.a.j.commons.fs, if Tom insists on keeping it.

        Dominique Pfister added a comment -

        > commons and rename the package to o.a.j.commons.fs

        Sorry, should be o.a.j.oak.commons.fs, of course.

        Julian Reschke added a comment -

        > Julian, what's the point to remove it now if it will be needed later on?

        What's the point in keeping unused code in the project when it can be re-added when it's needed?

        Thomas Mueller added a comment -

        > I don't see a need for this in the MicroKernel implementation

        Well, I see a need.

        How about we discuss whether a file system abstraction is needed (in the long term) at the next meeting / F2F / Oakathon, before we move stuff around?

        > I'd replace it with Commons IO

        Or, as an alternative, Commons IO could be used within the file system abstraction.

        Thomas Mueller added a comment -

        > What's the point in keeping unused code in the project when it can be re-added when it's needed?

        If you already know it's needed in the long term, and if it's already there, why remove it first and add it later?

        Jukka Zitting added a comment -

        I moved the code to oak-mk in revision 1333504 and replaced the dependencies in oak-core with simpler Java IO alternatives.

        I'm fine with leaving the code in svn for now if there's a reasonable expectation that we'll be using it down the line for a native Oak disk persistence instead of always using an externally developed database for storage.

        Thomas Mueller added a comment -

        Your (new) deleteRecursive implementation doesn't detect if the files / directories couldn't be deleted.
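        For illustration, a minimal sketch of a recursive delete that does report failures, using only plain java.io (Java 6 compatible). This is not the code from revision 1333504; the class name DeleteSketch is hypothetical.

```java
import java.io.File;
import java.io.IOException;

class DeleteSketch {

    /**
     * Deletes a file or directory tree and throws if anything is left behind.
     * Note: listFiles() follows symbolic links to directories, so this sketch
     * is not safe for trees containing such links.
     */
    static void deleteRecursive(File file) throws IOException {
        File[] children = file.listFiles(); // null if not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursive(child);
            }
        }
        // File.delete() signals failure only through its return value,
        // so it has to be checked explicitly.
        if (!file.delete() && file.exists()) {
            throw new IOException("Could not delete " + file.getAbsolutePath());
        }
    }
}
```

        The extra exists() check keeps the method from failing when the file was already gone.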

        Thomas Mueller added a comment -

        > I'm fine with leaving the code in svn for now if there's a reasonable expectation that we'll be using it down the line for a native Oak disk persistence instead of always using an externally developed database for storage.

        We already have a "native Oak disk persistence": the file data store.

        Jukka Zitting added a comment -

        > Your (new) deleteRecursive implementation doesn't detect if the files / directories couldn't be deleted.

        Yep, I know.

        I think that bit of code should go away entirely, so I didn't want to spend too much effort on it. IMHO the MicroKernelFactory shouldn't be in charge of MK lifecycle and thus shouldn't be in the business of removing directories, etc. See also the OAK-32 TODO I left in the code.

        Stefan Guggisberg added a comment -

        resolving as "won't fix" for now, as discussed with thomas off-list.

        we will revisit this topic once we have a concrete use case/need for a file system abstraction in oak-mk.


          People

          • Assignee:
            Thomas Mueller
            Reporter:
            Thomas Mueller
          • Votes:
            0
            Watchers:
            2
