Legal Discuss / LEGAL-117

Aggregation of GPL dictionaries with Apache OpenOffice (incubating) binary releases

    Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Labels:
      None

      Description

      Localized versions of OpenOffice.org have traditionally included dictionaries (a term used to designate data files for writing aids in general, like spell-checking dictionaries and thesauri) under the GPL license. These dictionaries are provided in the form of data files.

      Dictionaries are not a dependency of OpenOffice.org: they are packaged, even in the installer for native builds, as extensions. Any Windows version of OpenOffice.org is shipped as one file, containing separate modules for OpenOffice.org and for each linguistic extension (i.e., the dictionaries).

      This is possible because OpenOffice.org dictionaries, as confirmed by the Free Software Foundation in 2007 (https://issues.apache.org/ooo/show_bug.cgi?id=65039), fall under the "mere aggregation" provision of the GPL license (http://www.gnu.org/licenses/gpl-faq.html#MereAggregation).

      The only remaining obstacle to including GPL dictionaries in Apache OpenOffice is thus the Apache policy (http://www.apache.org/legal/resolved.html), which forbids GPL software from being included in Apache projects. But the rationale for this choice (http://www.apache.org/licenses/GPL-compatibility.html) clearly states that "This licensing incompatibility applies only when some Apache project software becomes a derivative work of some GPLv3 software", which is definitely not the case under discussion.

      In light of the above, can Apache OpenOffice include GPL spell-checking dictionaries with its binary releases?

        Activity

        Sam Ruby added a comment -

        Just for completeness: is there any possibility that the dictionaries could be relicensed under more favorable terms? How difficult would it be to obtain alternatives?

        Is the proposal that dictionaries be something that are actively developed at the ASF, or is the proposal merely that they be bundled, as is, with no modification?

        How would the license for these artifacts be identified to the licensee?

        Andrea Pescetti added a comment -

        In most cases it will be impossible to relicense dictionaries (as OpenOffice.org dictionaries are old projects with a long history of contributions) or find appropriate replacements (as they are the de facto standard dictionaries: other projects, like Mozilla, use the same dictionaries, possibly as extensions).

        In this issue we only focus on bundling the unmodified dictionaries: the OpenOffice.org source tree did contain dictionaries, but they were snapshots of externally developed dictionaries anyway, so it would be fine for development to occur elsewhere.

        License notices for dictionaries, as for all extensions, are contained in each extension package and are displayed to the user from within the OpenOffice.org (or Apache OpenOffice) interface too.

        Rob Weir added a comment -

        To add to what Andrea said: We're thinking of this bundle for the binary release only, not the source release. The binary release would have the standard LICENSE and NOTICE file, which would include any licenses and required notices for 3rd party modules, including the dictionaries. Since we're an end-user facing GUI application, the currently proposed approach is to surface these as well in the Help/About dialog, probably by including the text of the NOTICE file and linking to the local copy of the LICENSE file.

        Henri Yandell added a comment -

        Seems like it could be considered Category B. The core issues seem to be:

        • Do we have any issues with the statement that it is aggregation from John Steele Scott?
        • What's the risk that this could be mispackaged and stop being aggregation? Is it easy, or would it be an unnatural use of dictionary files to cross that line?
        • Ensuring it is flagged sufficiently for users who generally don't expect this in ASF work. Bear in mind that license simplicity is going to be an attraction for Apache OO, so it wants to be very clear.
        Niclas Hedhman added a comment -

        Isn't there an option to have the dictionaries downloaded on demand from a third-party site (such as Maven Central), and present the licensing information to the end-user at that point in time?

        This would not only satisfy everything at ASF, but also make the distro size significantly smaller.

        Andrea Pescetti added a comment -

        Risks that this could be mispackaged and stop being aggregation do not exist: modern versions of OpenOffice.org (and thus, Apache OpenOffice) only provide and use dictionaries in the form of extensions, so the packaging is separate by design.

        Downloading: indeed it is an option we discussed, but it has several major drawbacks with respect to mere aggregation, like requiring a redesign of significant portions of the application and being impractical for many of the Apache OpenOffice use cases we expect (e.g., usage in countries where Internet access is not ubiquitous, or usage in structures like schools that may apply strict policies on downloads). At the moment it would also require us to restructure the extensions hosting, which already has traffic problems and would not be able to sustain the additional traffic.

        Being clear on licenses: as Rob wrote, we can take advantage of the fact that Apache OpenOffice is a GUI application to give effective information to users.

        Pedro Giffuni added a comment -

        One additional issue to consider here is that most dictionaries include hyphenation files under the LaTeX Project Public License (LPPL):

        https://issues.apache.org/ooo/show_bug.cgi?id=74283

        (I closed that issue and related ones since we no longer carry those files in the repository, but apparently there were legal issues that the Oracle OpenOffice.org project hadn't cleared up yet).

        Very few dictionaries are GPL-only: most are under MPL/LGPL/GPL. My question here is: is GPL-only incompatible with LPPL, or does the aggregation concept cover this completely?

        Andrea Pescetti added a comment -

        @Pedro: this is indeed an interesting issue, but I would keep the current ticket focused (i.e., on clarifying whether we can continue to include GPL dictionaries in binary builds) since that would be the most problematic part. If you repost your LPPL-related comments on ooo-dev, I'll answer you with some historical details and then we might want to open a separate JIRA ticket about it, since two focused JIRA issues will be easier to follow than a discussion of two different licensing issues in the same ticket.

        Sam Ruby added a comment -

        Andrea (or others), can you identify at least one dictionary that is GPL-only that the ooo PMC would feel is important to ship? While I agree that LPPL is a separate issue, the dual (or tri-) license comment is very relevant. In particular, the question as to how to handle code for which one of the licenses under which it is made available is either MPL 1.0 or MPL 1.1 has already been answered:

        http://www.apache.org/legal/resolved.html#category-b

        (P.S. http://www.apache.org/legal/resolved.html#no-modification might be helpful in considering how to handle LPPL)

        Andrea Pescetti added a comment - - edited

        The Italian dictionary is GPL-only, for example. The Czech dictionary mentioned in the old issue I linked to seems to be GPL-only too.

        Sam Ruby added a comment -

        What timeframe would the ooo PPMC be looking at wanting to create a release containing these dictionaries?

        Rob Weir added a comment -

        We're task-driven more than date-driven, but at the current pace I'd guess Q1 2012.

        Sam Ruby added a comment -

        In that case, I would like to leave this issue open over the upcoming holidays in order to give everybody an ample opportunity to participate.

        Lawrence Rosen added a comment -

        Another data point you might want to add to the evaluation of LEGAL-117 and the broader issues of GPL compatibility with Apache software:

        At a conference earlier this week, I learned from an open source counsel at a large open source company that this entire issue of GPL dictionaries as "derivative works" had been considered by him also for his company's important products. Apparently lots of us express the same concerns and resolve ourselves to live with the GPL regardless of what the FSF FAQ literally says. [I won't reveal the name of the attorney who shared this tidbit, but I will say that his company is a very well known contributor to Apache; several of its employees are members here. And no, it is not the same company for whom I wrote such a formal legal opinion recently.]

        So perhaps we should resolve ourselves to listen also to others outside Apache who have analyzed the legal and cultural aspects of this question. Ask your own attorneys and company executives what they think we should do to take advantage of GPL software without "contaminating" our own Apache License 2.0 software.

        I agree with Sam that we should leave this issue open over the upcoming holidays and we should encourage anyone with a serious opinion to participate in the discussion on LEGAL-117.

        /Larry

        Lawrence Rosen added a comment -

        BTW, the title of LEGAL-117 implies that this question applies only for binary distributions. That's not right.

        If we distribute binary GPL software we will also distribute the source. The GPL license requires that!

        /Larry

        Andrea Pescetti added a comment -

        In this case, since dictionaries are data files (text files), "binary" and "source" form are actually the same thing. So the extension package already contains both.

        To be picky, dictionaries do come in two forms, a "compressed" and an "uncompressed" one (both in the form of text files); the extension stores the "compressed" form, but the "uncompressed" one can be straightforwardly obtained from it and vice versa; some authors prefer to make modifications on the "compressed" version and some on the "uncompressed" one. Thus I see no reason to distinguish between "binary" and "source" form here.
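
        To make the "compressed" versus "uncompressed" distinction concrete, here is a minimal sketch that expands a compressed word list into plain words. It assumes a Hunspell-style layout (a *.dic word list whose entries carry single-character flags, plus suffix rules from a *.aff file); the thread itself does not spell out the format, the sample rules and words are invented for illustration, and prefixes, multi-character flags, and other real-world features are deliberately ignored.

```python
import re


def parse_suffix_rules(aff_text):
    """Collect suffix rules per flag from simplified, Hunspell-style .aff text.

    Only rule lines of the form 'SFX <flag> <strip> <add> <condition>' are
    read; header lines (which have four fields) are skipped.
    """
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[0] == "SFX":
            _, flag, strip, add, condition = parts[:5]
            rules.setdefault(flag, []).append((
                "" if strip == "0" else strip,   # "0" means "strip nothing"
                "" if add == "0" else add,       # "0" means "add nothing"
                re.compile(condition + "$"),     # condition applies to the word ending
            ))
    return rules


def expand_dic(dic_text, rules):
    """Expand 'stem/FLAGS' entries into the stem plus every derived form."""
    words = []
    for entry in dic_text.splitlines()[1:]:      # first line holds the entry count
        entry = entry.strip()
        if not entry:
            continue
        stem, _, flags = entry.partition("/")
        words.append(stem)
        for flag in flags:                       # assumption: one character per flag
            for strip, add, condition in rules.get(flag, []):
                if condition.search(stem) and stem.endswith(strip):
                    words.append(stem[:len(stem) - len(strip)] + add)
    return words


# Invented sample data, not taken from any shipped dictionary.
SAMPLE_AFF = "SFX S Y 2\nSFX S 0 s [^y]\nSFX S y ies y\n"
SAMPLE_DIC = "2\ncat/S\npony/S\n"

if __name__ == "__main__":
    print(expand_dic(SAMPLE_DIC, parse_suffix_rules(SAMPLE_AFF)))
```

        Both forms are plain text, and the expansion is mechanical, which is consistent with the point that neither form is more "source" than the other.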

        Dennis E. Hamilton added a comment -

        @Andrea @Sam,

        I would like to confine this to the specific question of dictionaries (i.e., writing aids) to be distributed and installed in user run-times via Apache OpenOffice binaries created by the podling as release companions or whatever the proper term is. It would be useful to pick a problematic one and see how the following questions can be answered:

        1. How are these distributed/made-available in the first place, by their authors, and how are licensing and other notices affixed and how is the location of available "source" indicated?

        2. Is the writing aid meant to be editable or modifiable by users for more than their own private purposes? There are already questions on other lists about being able to modify the stock dictionaries via user function (e.g., being able to add a word spelling to the distributed spell-checking dictionary rather than a supplement carried by the run-time).

        3. Is it appropriate (and feasible) to ensure that the released binary's run-time cannot be used to modify such writing aids and thereby create derivative works using features of the Apache OpenOffice release itself?

        4. Is the GPL (or other type-B or type-X) notice affixed to the artifact in a way where it cannot go unnoticed? That is, if someone fishes a writing-aid out from where it is stored by installation of the binary run-time, it is unmistakeable that there are non-ALv2 license terms, that the artifact's presence satisfies the non-ALv2 conditions, and that separate distribution is subject to the non-ALv2 conditions.

        Perhaps this narrows the question enough to satisfy any ASF concern about packaging user-separatable non-ALv2 artifacts in the released binary.

        Andrea Pescetti added a comment -

        Answering Dennis using the Italian dictionary as an example:

        1. The download location for the independent product is http://extensions.services.openoffice.org/node/1204 (site is currently unstable, you will need to reload the page several times). The package contains all license notices, displayed upon installation. It doesn't indicate the location where to get "source" code since we are in the other case foreseen by the GPL, i.e., we convey the "source" together with the "binary" (they are the same files).

        2. Shipped dictionaries are not editable when installed and there are no mechanisms within OOo or AOO to do so. Since they are text files, a user can still modify them before installation, but this is not part of this scenario.

        3. Same as 2, all dictionaries are installed read-only; the user and the suite only have access to "user dictionaries", that are handled in a completely different way and use another file format.

        4. To fish out a dictionary you need to explore the installation tree and open the "extensions" subfolder (which should make you realize this is an extension), reach the extension's own subfolder and find the corresponding README and LICENSE files for that dictionary alongside the dictionary files. So it can be safely assumed that a user will notice the licensing terms.
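
        The claim in point 4 could be checked mechanically. The following is a rough sketch, not part of any OpenOffice tooling: it walks an "extensions" folder and reports extension subfolders that lack a README or LICENSE file. The function name and the assumption that notice files start with "README" or "LICENSE" are illustrative only; actual file names vary between extensions.

```python
from pathlib import Path

# Assumption: notice files are recognizable by these name prefixes.
NOTICE_KINDS = ("README", "LICENSE")


def extensions_missing_notices(extensions_root):
    """Map each extension subfolder to the notice kinds it does not contain."""
    missing = {}
    for ext_dir in sorted(Path(extensions_root).iterdir()):
        if not ext_dir.is_dir():
            continue
        # Look at every file below the extension folder, case-insensitively.
        names = [p.name.upper() for p in ext_dir.rglob("*") if p.is_file()]
        lacking = [kind for kind in NOTICE_KINDS
                   if not any(name.startswith(kind) for name in names)]
        if lacking:
            missing[ext_dir.name] = lacking
    return missing
```

        An empty result would mean every extension folder carries both kinds of notice next to its dictionary files; the path to pass in depends on the platform and installation.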

        Dennis E. Hamilton added a comment -

        @Andrea,

        This is a great help. I have analyzed your downloadable dict-it.oxt Writing Aids extension. Nice work. (For those following along at home, the *.oxt file can be opened with a Zip utility. It is in a jar-like format that goes back to what appear to be pre-ODF times.)

        SCOPE ISSUE

        It needs to be clear that we are talking about Writing Aids extensions. These are compilations that typically contain at least six files and can contain many more. (The dict-it.oxt that Andrea distributes contains 34 files.) The parts may be individually licensed and the package itself may have an overall license. They are named as dictionary extensions, but each such Native Language extension packages several parts, including spelling dictionaries, thesauri, and hyphenation rules. For some languages, several variants are covered in the same dictionary extension. (English is this way, with en-US, en-GB, and en-ZA all in one extension.)
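
        For anyone who wants to repeat this kind of inspection, here is a minimal sketch using Python's standard zipfile module, relying only on the observation that an *.oxt file opens with a Zip utility. The function name and the list of name hints are assumptions made for illustration; they are not a convention defined by the extension format.

```python
import zipfile

# Assumption: files carrying licensing information have one of these
# fragments somewhere in their name.
NOTICE_HINTS = ("license", "licence", "copying", "readme", "notice")


def summarize_oxt(path):
    """Return (all file names in the package, names that look like notices)."""
    with zipfile.ZipFile(path) as oxt:
        # Skip directory entries so that only real files are counted.
        names = [n for n in oxt.namelist() if not n.endswith("/")]
    notices = [n for n in names if any(h in n.lower() for h in NOTICE_HINTS)]
    return names, notices
```

        Calling summarize_oxt("dict-it.oxt") on a downloaded package would give the file count and the candidate notice files to read by hand. A name match is only a hint, since license text may also sit inside the dictionary files themselves.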

        THE QUESTIONS

        Building on the original questions and the responses from Andrea Pescetti. All of the specifics are for Windows. Other platforms will vary:

        1. Dictionary extensions incorporated in a binary release are installed into subfolders of the run-time configuration directly. Dictionary extensions obtained separately are in dict-<NL>.oxt packages that can be read by the run-time and expanded into folders that the run-time has access to.
        Licensing is all over the place. The dictionary extension is itself a combined work and may have an extension-level license as well as notices of licenses of material that is employed. In addition, the individual parts may be subject to separate license conditions described in part-related README files and also in readable text of the part itself. There is generally no differentiation between source and non-source and whether or not what is provided is source (as defined by the GPL) or object. In many cases, it appears that this can be handled by cleaning up some wording and details in the many README files.
        For English, French, Spanish, German, and Italian, I found GPL, LGPL, MPL, BSD, Princeton WordNet (BSD-like with a retention-of-title clause), LaTeX Project Public License, and a special COPYING-OASIS (having nothing to do with OASIS) that prohibited use with any product that does not have ODF as a native format. Finally, these all appear to be derivative works, and there are change logs and multiple copyright notices that reflect that.

        2. It is clear that these are not editable by the run-time. The ones shipped as part of the binary-release installation are in places that are normally read-only to all but administrators.

        3. Ones that are installed separately do not require administrator privileges to set up, but they are placed in "Application Data/" (a hidden folder name) and on a non-obvious path where the parent folder has a generated *.tmp/ name. The dictionary extension files themselves are not read-only, however. Still, there is no provision to modify them via the run-time, and additions of further spellings are not made to these dictionaries.

        4. I agree that the README files and also extension-folder license notices provide ample indication that there are various licenses applicable to the extension and its components. I think there is adequate notice. The present README information is typical.

        ON REFLECTION

        @Sam Ruby

        It appears that there is so much variability in how licensing arises in individual dictionary extensions, and in the content of the extensions themselves, that a simple question about GPL being applicable to a dictionary data file (actually, two files, *.dic and *.aff, where it is difficult to know how either of them is a source for use together) is not necessarily going to provide the answer that is needed to deal with these cases.

        I think the holiday pause can also be exploited to relook at the variety of dictionary-extension cases. It needs to be seen whether the cases can be reparsed into some small number of specific, concrete reusable practices. Then we can determine if ASF concerns are satisfied by what those might be.

        Show
        Sam Ruby added a comment -

        @Dennis:

        First a general comment: a number of permissive licenses are one-way compatible with more restrictive licenses, such as the GPL: http://www.gnu.org/licenses/quick-guide-gplv3-compatibility.png

        As to specific other licenses, those questions can be pursued separately. If we determine that the inclusion of even one part of GPL-licensed content makes such dictionaries something that can't be included in binary distributions by the ASF, there probably isn't a need for follow-up questions. If we determine that it is OK to include GPL-licensed content in dictionaries as used by Apache OpenOffice, then the follow-up questions are likely to be easier to resolve.

        In particular, and if we get to that point, we would only need to evaluate licenses which are incompatible with the GPL in that they contain restrictions over and above what the GPL license allows. The following lists may be helpful in that effort:

        http://www.gnu.org/licenses/license-list.html#GPLCompatibleLicenses
        http://www.gnu.org/licenses/license-list.html#GPLIncompatibleLicenses

        Dennis E. Hamilton added a comment -

        @Sam Ruby

        Thanks. Those are useful links.

        There is some question about conflicts with GPL-compatibility in some cases that may not be matters of dual-licensing.

        With regard to GPL specifically, I suspect there is a way to accomplish this.

        Now that I see the extent of how writing-aid extensions are created, distributed, bundled, and installed, I agree that the focus on GPL is important.

        With the prospect that circumscribed use of GPLed artifacts is not a show-stopper, I am going to look ahead to see exactly how GPL is honored in these derivatives (the major ones are under active maintenance) and how the source versus object question can be resolved in a clean way that leaves no doubt that GPL is dealt with properly.

        There is a related consideration about where the "up-stream" for these is and how they come into the hands of Apache OpenOffice for either bundling or availability as downloadable extensions. Since Andrea, for one, seems to have developed a comprehensive OOo-external approach with other Italian contributors, I want to consult them to understand exactly what the development process is and how the relationship with Apache OpenOffice can be kept sanitary.

        That's a side project, but something is needed. I don't doubt that releasing binaries without writing aids would be a giant fail. (I'm also assuming that not releasing binaries is a bigger fail.)

        Dennis E. Hamilton added a comment -

        @Andrea @Sam

        A few more Sunday reflections. (The technicalities will be discussed back on the project and perhaps on the project wiki.)

        Here's a hypothetical steady state:

        1. Bundled dictionary extensions (and any other extensions that are bundled, for that matter) are installed by binary releases as *.oxt extension packages in appropriate locations under the installation location of the binary run-time. That is, they are indistinguishable from the same *.oxt downloaded from an extension site, but installed within the binary release installation configuration (and vetted for that purpose). All use of the extension content is cached privately as necessary and exercised by the run-time using the standard, license-agnostic functions that the run-time has for relying on dictionary extensions of any origin. So it is clear that these aggregations are kept intact and only bundled, but not integrated any more than if a user had downloaded them independently.

        2. OpenOffice.org already provides an extension-management interface (usable for installing separately downloaded extensions) that also allows extensions to be removed or disabled. It is not and would not be possible to export extensions. However, it is not a stretch to consider that the extension management panel shown for an individual extension could also provide access to license information and an authoritative location where the *.oxt was/is obtainable from. If a "source code offer" is required, it could be presented there as well.

        3. Bundling of dictionary extensions would be limited to ones that allow literal distribution without restriction and have no limitations that impact field of use of the run-time. The vetting of bundled dictionary extensions (and any other bundled extensions) would be to ensure that any license conditions on the *.oxt-contained artifacts are being honored in the occurrence of those artifacts in the *.oxt and that having an additional level of bundling in a binary-release installer is clearly allowed and appropriate.

        There remains some question about what the upstream source is and how that is separate from Apache OpenOffice when a *.oxt is not an ALv2-licensed object, but that is something the project can work out so that the deployment approach of (1-3) can be implemented in a sanitary way.

        Sam Ruby added a comment -

        re: "Bundling of dictionary extensions would be limited to ones that allow literal distribution without restriction and have no limitations that impact field of use of the run-time"

        The OOO PMC is welcome to impose restrictions over and above what the Legal Affairs Committee requires. But meanwhile the scope of this particular issue remains limited to GPL dictionaries. Please continue to bring forward individual licenses for consideration, as you encounter specific needs required to support the Apache OpenOffice effort.

        Dennis E. Hamilton added a comment -

        Perhaps this will clarify the situation better:

        1. When we talk about GPL dictionaries, we might be talking about dict-NL.oxt packages that have content under the GPL or the specific cases of NL.dic and NL.aff files that are distributed inside a dict-NL container. The dict-NL can contain dictionary pairs for one or more NL variants (e.g., en-US and en-GB) accompanied by separate, optional hyphenation-rule data and thesaurus databases. I can't speak for Andrea, but that is something that needs to be clear for this request. Both cases are spoken of as dictionaries. There is considerable commingling of licensed material in dict-NL aggregates. Any of the variant parts may have its own copyright and license notices and may be under a GPL license.

        2. With regard to "Bundling of dictionary extensions would be limited to ones that allow literal distribution without restriction and have no limitations that impact field of use of the run-time" it is apparently better to be more specific about GPL. In this case, we are talking GPL2 sometimes and GPL3 other times. (And sometimes LGPL.) It is rare for there to be but a single license and only GPL applicable to works within the aggregate. What I had in mind was anything permitting aggregation as defined in GPL3 clause 5 and permitting conveyance without change (including via aggregation at one or more levels) as defined in GPL3 clause 4. And that the license satisfy the OSD in all other respects, of course. All use of the artifact content by the run-time is ephemeral, although caching of the writing-aid data can be done for performance reasons.

        3. In selecting such an artifact for bundled distribution in a binary release, the PPMC would want to ascertain that GPL3 (or equivalent terms of GPL2) clause 5 was satisfied in the artifact. I suppose that is a matter for PPMC diligence, but it seems necessary to ensure that there are clean hands in providing the artifact in accordance with (2). This applies to other licenses too. I suppose this aspect falls under IP clearance for the binary release. None of these artifacts are produced by the PPMC and their source forms would not be in the Apache SVN. Where the dict-NL[.oxt] assemblages are retained for import into binary releases as bundled dictionary extensions is not clear.

        It might be useful to examine an important single candidate to ground this further. I think the dict-en for English Language writing aids would be particularly interesting. The dict-it for Italian Language writing aids that Andrea and others have produced is useful as a contrasting case.
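
        For anyone who wants to ground the examination of a single candidate, here is a minimal Python sketch of how the parts of one extension package could be inventoried. It relies only on what has been said in this thread: the package can be opened as a Zip archive, and it carries *.dic and *.aff files alongside README and license notices. The function name, the grouping labels, the notice-file hints, and the member names in the demo are all hypothetical and for illustration only; this is not a project tool.

```python
import io
import posixpath
import zipfile
from collections import defaultdict

# Assumption: these suffixes mark the dictionary data files named in the thread.
DICTIONARY_SUFFIXES = {".dic", ".aff"}
# Assumption: notice files can be spotted by these substrings in the file name.
NOTICE_HINTS = ("readme", "license", "copying")


def inventory(oxt):
    """Group the members of an extension package (a Zip archive) by rough kind.

    `oxt` may be a path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(oxt) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            base = posixpath.basename(name).lower()
            suffix = posixpath.splitext(base)[1]
            if any(hint in base for hint in NOTICE_HINTS):
                groups["notice"].append(name)
            elif suffix in DICTIONARY_SUFFIXES:
                groups[suffix].append(name)
            else:
                groups["other"].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory; the member names are made up.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for member in ("xx.dic", "xx.aff", "README_xx.txt", "description.xml"):
            archive.writestr(member, "placeholder")
    for kind, names in sorted(inventory(buffer).items()):
        print(kind, names)
```

        A listing like this only shows which parts and notices are present. It says nothing about which license actually governs each part, which still has to be read out of the notices by hand.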

        Pedro Giffuni added a comment -

        @Andrea and @Sam concerning LPPL.
        I think the LPPL is not completely untied to this discussion: GPL and LPPL are both restricted and in a certain sense it would seem like accepting the GPL would also open the doors to accepting other GPL'd content (images, icons, fonts) or other restricted licenses like the LPPL. Unfortunately the FSF declared LPPL incompatible with the GPL and this causes some inconsistencies. If we accept the GPL here we would be imposing a restriction on third parties: they (and we) can't redistribute the LPPL'd hyphenation "code".

        I see some similarity here to what happened with PostgreSQL on Debian: some Debian purists complained about pgsql (BSD) using GNU readline (GPL) along with the GPL-incompatible OpenSSL (with an advertising clause).

        Now, a perhaps bigger issue here is that it is unclear to me if the FSF has a double standard with respect to applying the GPL to non-code. In the case of fonts the FSF says this:

        http://www.gnu.org/licenses/license-list.html#GPLFonts
        "The GNU GPL can be used for fonts. However, note that it does not permit embedding the font in a document unless that document is also licensed under the GPL."

        The reason for this appears to be that they consider fonts can contain embedded software. Perhaps the hyphenation rules can also be considered software?

        Of course there's also the issue of what is "Mere Aggregation": if I use the dictionary to clean the language in a document I am writing that may be OK, unless that document is actually a dictionary.

        These are purely theoretical issues. I think the GPL is unsuitable for non-code, and in practice I doubt it is enforceable in this case, but I guess I will entertain myself finding out if lawyers consider these to be "risks".

        Sam Ruby added a comment -

        re "it would seem like accepting the GPL would also open the doors to accepting other GPL'd content (images, icons, fonts) or other restricted licenses like the LPPL."

        The scope of this issue is "Aggregation of GPL dictionaries with Apache OpenOffice (incubating) binary releases". Any other podling/project, any other license, or any use other than aggregation of dictionaries with binary releases is outside of the scope of this issue.

        Sam Ruby added a comment -

        Unless somebody steps forward with a clear objection, I plan to approve this on Monday morning, US EST.

        Pedro Giffuni added a comment - - edited

        @Sam
        I think I have found a relevant issue that is really specific to ASF policies and may have to be dealt with in this case.

        It has been stated here that 'since dictionaries are data files (text files), "binary" and "source" form are actually the same thing'.

        For reciprocal licenses the Third-Party Licensing Policy is clear: "Note that works written in a scripting language without a binary form cannot be included in any ASF product under one of these licenses (see Transition and Exceptions)."

        This would, in principle, apply to GPL'd dictionaries but most importantly it applies fully to weak copyleft (MPL) too.
        Reading further, Apache policies have specifically dealt with this situation by defining a special rule for incubating projects. According to this link:

        http://www.apache.org/legal/3party.html#transition

        We need an authorized exception for this case and according to the General Rule this will give us time until the second ASF release. I understand this issue is also under current revision (every 6 months?).

        Sam Ruby added a comment -

        Once approved (something that at this point looks very likely), this approval will represent an authorized exception for the aggregation of GPL dictionaries with Apache OpenOffice. If other exceptions are needed, they should be pursued as separate issues.

        Dennis E. Hamilton added a comment - - edited

        There are other cases that will need to be covered, since many of the writing tools extensions that are used have content with multiple, different licenses. I accept that is immaterial here.

        With regard to the specific request, I have been examining these packages and I find the source=object claim to be questionable. It may hold up, but I think it requires more evidence. Some of these do not strike me as the "preferred form" for maintenance and modification. I know that some are hand-modified, but that strikes me as comparable to using debug except that the distributed form is text, but text that is very difficult to reliably hand-craft.

        That may not matter, but I am concerned we are giving source=object too much weight as a factor. The problem is not ours, since if there is a preferred form, there is nothing about it in the packages as currently constructed and licensed. It would be more solid if source=object is immaterial to the resolution.

        Aside: It would also be great if the contributors of the packages provided more information on what the dictionary files are derived from and how that was done. I think that is an immaterial factor on this issue but one that we need to attend to in regard to the provenance of the material and the asserted license. (This seems to be a general consideration for 3rd-party material.)

        Pedro Giffuni added a comment -

        @ Sam, @ Myself
        Apparently the issue I presented previously is only mentioned explicitly in the Draft of the policy and does not appear in the definitive document.
        Once approved (which seems likely) I don't think raising other issues contrary to the decision would make any sense.

        It would further save us some time asking right away if there is any impediment in having the (properly labelled) dictionaries in subversion. This makes sense as apparently it is hereby being considered that source=object in this case.

        Andrea Pescetti added a comment -

        So, may we assume this is approved? We are now ready to start the technical work if the legal review is successful.

        Sam Ruby added a comment -

        Approved. People have been given ample opportunity to comment, and no issues were identified. Note that this approval is narrow: it only applies to the aggregation of GPL dictionaries with Apache OpenOffice binary releases.


          People

          • Assignee:
            Sam Ruby
            Reporter:
            Andrea Pescetti