Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0-incubating
    • Fix Version/s: enhancer-0.10.0
    • Component/s: Enhancer
    • Labels: None

    Description

      Currently, different types of ContentItem define their own constructors that fit their specific implementation. For example, the InMemoryBlob defines constructors that allow the content to be passed as a byte array. This makes complete sense for this implementation, because it allows the data to be passed directly if it is already loaded in memory. The WebContentItem, as another example, cannot support a constructor taking a byte array, because at the time of construction only the URL of (that is, the reference to) the content is available. For a file-based ContentItem implementation, a constructor taking a byte array would also not be preferable, as the whole point of such an implementation is to avoid loading the whole content into memory.

      However, with the introduction of a factory pattern to construct ContentItems, the interfaces used to pass content MUST be normalized, because they are part of the API of the ContentItemFactory interface. To solve this, the following two interfaces are added to the Stanbol Enhancer API.

      First, the _ContentSource_ interface, intended to be used for already dereferenced content:

          /** the content as stream */
          + getStream() : InputStream
          /** the content as byte array */
          + getData() : byte[]
          /** optionally the media type of the content */
          + getMediaType() : String
          /** optionally the file name of the content */
          + getFileName() : String
          /** optionally additional headers */
          + getHeaders() : Map<String,List<String>>

      With the following default implementations:

      • StreamSource: A ContentSource wrapping an InputStream. Multiple calls to #getStream() are not supported. Calls to #getData() load the content provided by the stream into memory.
      • ByteArraySource: A ContentSource implementation that internally uses a byte array. To be used in cases where users need to pass content to the Stanbol Enhancer that is already loaded in memory. Calls to #getData() MUST NOT copy the internal byte array.
      • StringSource: A ContentSource implementation that allows a String instance to be passed directly.
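
      A minimal Java sketch of the ContentSource interface and the three default implementations described above. The method names follow the listing in this issue; the constructor signatures, the IOException declarations, the UTF-8 encoding used by StringSource and the IllegalStateException thrown on a second #getStream() call are assumptions made for illustration, not the committed Stanbol code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Already dereferenced content (method names as listed in this issue). */
interface ContentSource {
    InputStream getStream() throws IOException;   // throws clause is an assumption
    byte[] getData() throws IOException;          // throws clause is an assumption
    String getMediaType();                        // optional, may be null
    String getFileName();                         // optional, may be null
    Map<String, List<String>> getHeaders();       // optional, empty if none
}

/** Wraps a byte array; getData() returns the array itself, without copying. */
class ByteArraySource implements ContentSource {
    private final byte[] data;
    private final String mediaType;

    ByteArraySource(byte[] data, String mediaType) { // hypothetical constructor
        this.data = data;
        this.mediaType = mediaType;
    }
    public InputStream getStream() { return new ByteArrayInputStream(data); }
    public byte[] getData() { return data; }
    public String getMediaType() { return mediaType; }
    public String getFileName() { return null; }
    public Map<String, List<String>> getHeaders() { return Collections.emptyMap(); }
}

/** Encodes the String once (UTF-8 assumed) and reuses ByteArraySource. */
class StringSource extends ByteArraySource {
    StringSource(String text) {
        super(text.getBytes(StandardCharsets.UTF_8), "text/plain; charset=UTF-8");
    }
}

/** Wraps an InputStream that can be handed out only once. */
class StreamSource implements ContentSource {
    private final InputStream in;
    private final String mediaType;
    private boolean consumed;
    private byte[] data;

    StreamSource(InputStream in, String mediaType) { // hypothetical constructor
        this.in = in;
        this.mediaType = mediaType;
    }
    public synchronized InputStream getStream() {
        if (consumed) {
            throw new IllegalStateException("stream was already consumed");
        }
        consumed = true;
        return in;
    }
    /** Reads the wrapped stream fully into memory on the first call. */
    public synchronized byte[] getData() throws IOException {
        if (data == null) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (InputStream stream = getStream()) {
                byte[] chunk = new byte[8192];
                int n;
                while ((n = stream.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
            }
            data = buffer.toByteArray();
        }
        return data;
    }
    public String getMediaType() { return mediaType; }
    public String getFileName() { return null; }
    public Map<String, List<String>> getHeaders() { return Collections.emptyMap(); }
}
```

      In this sketch, a caller that already holds the bytes would use ByteArraySource, so no copy is made, while a caller holding only a stream would use StreamSource and leave the choice between #getStream() and #getData() to the consumer.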

      Note that ContentItem/Blob implementations that

      • store the content in-memory should prefer to call ContentSource#getData() to retrieve the content from the ContentSource
      • stream the content to a file/database/CMS need to use ContentSource#getStream() to avoid loading the whole content in-memory!
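
      The two recommendations above can be sketched as follows. InMemoryBlobSketch and FileBlobSketch are hypothetical classes written only for this illustration, and ContentSource is trimmed here to its two content accessors; the real Blob implementations may look different.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Trimmed to the two content accessors; metadata getters omitted for brevity. */
interface ContentSource {
    InputStream getStream() throws IOException;
    byte[] getData() throws IOException;
}

/** Hypothetical in-memory blob: takes the byte array via getData(). */
class InMemoryBlobSketch {
    private final byte[] data;

    InMemoryBlobSketch(ContentSource source) throws IOException {
        // A ByteArraySource can hand over its array here without any copy.
        this.data = source.getData();
    }
    InputStream getStream() { return new ByteArrayInputStream(data); }
    long getContentLength() { return data.length; }
}

/** Hypothetical file-backed blob: streams the content to disk. */
class FileBlobSketch {
    private final Path file;

    FileBlobSketch(ContentSource source) throws IOException {
        this.file = Files.createTempFile("content-", ".bin");
        // Copying from the stream means the content is never held as one byte array.
        try (InputStream in = source.getStream()) {
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
        }
    }
    InputStream getStream() throws IOException { return Files.newInputStream(file); }
    long getContentLength() throws IOException { return Files.size(file); }
    Path getFile() { return file; }
}
```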

      Second, the _ContentReference_ interface, intended to be used to create ContentItems/Blobs for content where only a reference is available:

      /** the Reference to the content */
      + getReference() : String
      /** dereferences the content */
      + dereference() : ContentSource

      With the following default implementation:

      • UrlReference: Allows any Java URL to be used to reference content. This is essentially a replacement for the current WebContentItem implementation.
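
      A minimal sketch of how ContentReference and a URL-based implementation could fit together. The constructor, the IOException on dereference(), the single-use stream and the trimmed ContentSource are assumptions for this illustration; only the method names getReference() and dereference() come from the listing above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

/** Trimmed view of ContentSource, enough for this sketch. */
interface ContentSource {
    InputStream getStream() throws IOException;
    byte[] getData() throws IOException;
    String getMediaType();
}

/** A reference to content that is not yet loaded. */
interface ContentReference {
    String getReference();
    ContentSource dereference() throws IOException; // throws clause is an assumption
}

/** References content by URL; the connection is opened only on dereference(). */
class UrlReference implements ContentReference {
    private final URL url;

    UrlReference(URL url) { // hypothetical constructor
        if (url == null) {
            throw new IllegalArgumentException("the URL must not be null");
        }
        this.url = url;
    }

    public String getReference() { return url.toString(); }

    public ContentSource dereference() throws IOException {
        URLConnection connection = url.openConnection();
        final String mediaType = connection.getContentType();
        final InputStream in = connection.getInputStream();
        // Single-use source backed by the open connection stream.
        return new ContentSource() {
            public InputStream getStream() { return in; }
            public byte[] getData() throws IOException {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                try (InputStream stream = in) {
                    byte[] chunk = new byte[8192];
                    int n;
                    while ((n = stream.read(chunk)) != -1) {
                        buffer.write(chunk, 0, n);
                    }
                }
                return buffer.toByteArray();
            }
            public String getMediaType() { return mediaType; }
        };
    }
}
```

      Because nothing is fetched until dereference() is called, a factory in this sketch could accept the reference cheaply and let the ContentItem/Blob implementation decide whether to read the resulting ContentSource as a stream or as a byte array.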

      Both interfaces and implementations will be part of the Stanbol Enhancer Services API module.

          People

            rwesten Rupert Westenthaler