Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: Jena 2.12.0
    • Component/s: None
    • Labels:

      Description

      Hadoop RDF Tools is a set of experimental modules developed internally at my employer (Cray). Cray has agreed to open source the code by donating it to the Jena project so that the wider community can further its development.

      The donated code base comprises 4 modules:

      • Hadoop RDF Common - Set of Writable implementations for representing primitive RDF types, i.e. nodes, triples and quads
      • Hadoop RDF IO - Set of InputFormat, RecordReader, OutputFormat and RecordWriter implementations for consuming/producing RDF from Hadoop programs
      • Hadoop RDF Map/Reduce - Set of Mapper and Reducer implementations that provide some common operations that users are likely to want to perform on RDF data
      • Hadoop RDF Stats - Demo application that uses the other libraries to analyse some RDF data and produce various statistics on it
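
      As a rough illustration of the Writable pattern mentioned for Hadoop RDF Common, the sketch below mirrors the two-method shape of Hadoop's Writable interface (write and readFields) using only java.io. The class SketchTriple and its helper methods are hypothetical names made up for this example; they are not classes from the donated code, and the real modules presumably wrap Jena's node and triple types rather than plain strings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical stand-in for a triple record; NOT a class from Hadoop RDF Common. */
final class SketchTriple {
    String subject = "";
    String predicate = "";
    String object = "";

    /** No-arg constructor: a framework would create an empty instance, then call readFields. */
    SketchTriple() {
    }

    SketchTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    /** Same shape as Writable.write(DataOutput): serialize the fields in a fixed order. */
    void write(DataOutput out) throws IOException {
        // writeUTF is limited to 65535 encoded bytes per string; fine for a sketch.
        out.writeUTF(subject);
        out.writeUTF(predicate);
        out.writeUTF(object);
    }

    /** Same shape as Writable.readFields(DataInput): read the fields back in the same order. */
    void readFields(DataInput in) throws IOException {
        subject = in.readUTF();
        predicate = in.readUTF();
        object = in.readUTF();
    }

    /** Helper for the example: serialize a triple to a byte array. */
    static byte[] toBytes(SketchTriple triple) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            triple.write(out);
        }
        return buffer.toByteArray();
    }

    /** Helper for the example: rebuild a triple from bytes produced by toBytes. */
    static SketchTriple fromBytes(byte[] bytes) throws IOException {
        SketchTriple triple = new SketchTriple();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            triple.readFields(in);
        }
        return triple;
    }
}
```

      Round-tripping a value through SketchTriple.toBytes and SketchTriple.fromBytes should give back equal subject, predicate and object fields, which is the property Hadoop relies on when it moves records between map and reduce tasks.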

      Since this code was developed outside of the Apache process, it is required to go through the IP Clearance procedure, which is managed by the Incubator - https://incubator.apache.org/ip-clearance/index.html

      This issue will act as a tracking point for tasks related to carrying out the IP Clearance process.

      1. Hadoop-RDF-Tools.patch
        636 kB
        Rob Vesse
      2. Incubator-Website.patch
        6 kB
        Rob Vesse
      3. Hadoop RDF Tools - External Version.pptx
        736 kB
        Rob Vesse
      4. Hadoop RDF Tools - External Version.pdf
        1.93 MB
        Rob Vesse
      5. Incubator-Website.patch
        6 kB
        Rob Vesse

        Activity

        Transition | Time In Source Status | Execution Times | Last Executer | Last Execution Date
        Open -> In Progress | 1h 45m | 1 | Rob Vesse | 01/Apr/14 12:16
        In Progress -> Resolved | 55d 22h 55m | 1 | Rob Vesse | 27/May/14 11:12
        Resolved -> Closed | 10s | 1 | Rob Vesse | 27/May/14 11:12
        Closed -> Reopened | 250d 9h 4m | 1 | Andy Seaborne | 01/Feb/15 19:16
        Reopened -> Closed | 3m 55s | 1 | Andy Seaborne | 01/Feb/15 19:20
        Andy Seaborne made changes -
        Status Reopened [ 4 ] -> Closed [ 6 ]
        Resolution Done [ 11 ]
        Fix Version/s Jena 2.12.0 [ 12326844 ]
        Andy Seaborne added a comment -

        Set fix version 2.12.0.

        Bulk change to issues that did not have a fix version and were closed between 2.11.0 and 2.12.0.

        Andy Seaborne made changes -
        Resolution Done [ 11 ]
        Status Closed [ 6 ] -> Reopened [ 4 ]
        Andy Seaborne added a comment -

        Reopen to set fix version.

        Rob Vesse made changes -
        Status Resolved [ 5 ] -> Closed [ 6 ]
        Rob Vesse made changes -
        Status In Progress [ 3 ] -> Resolved [ 5 ]
        Resolution Done [ 11 ]
        Rob Vesse added a comment -

        Resolving as Done. The Incubator IP Clearance vote has passed by lazy consensus and the result email should show up in the archives shortly.

        The Hadoop RDF Tools code can now be actively developed against. I'll send an email to the list laying out some ideas around this.

        Rob Vesse added a comment -

        The PMC IP Clearance vote passed with 4 +1s

        The Incubator IP Clearance vote is now in progress which is the final hurdle to accepting the code and being able to start actively developing it

        ASF subversion and git services added a comment -

        Commit 1594851 from Rob Vesse
        [ https://svn.apache.org/r1594851 ]

        Minor corrections to LICENSE and add missing license to POM per comments from Andy (JENA-666)

        Rob Vesse added a comment -

        The PMC IP Clearance vote is currently in progress

        Rob Vesse added a comment -

        It should be. I think any fixes we need for the current state of the code to build are in the 2.11.1 release.

        However, there are known issues that will need us to work off of 2.11.2-SNAPSHOTs once we get the code IP cleared and are able to start actively developing it again.

        Moritz Hoffmann added a comment -

        Would it be possible to bump the Jena dependency to 2.11.2-SNAPSHOT (it is currently 2.11.1-SNAPSHOT), or maybe move away from SNAPSHOT versions?

        ASF subversion and git services added a comment -

        Commit 1585801 from rvesse@apache.org
        [ https://svn.apache.org/r1585801 ]

        Align license header formatting with rest of Jena (JENA-666)

        Rob Vesse added a comment -

        Right, I think this is now almost in a state in which we could take an IP Clearance vote.

        I just need to replicate the L&N into the sub-modules and clean up the copyright header formatting.

        ASF subversion and git services added a comment -

        Commit 1585750 from rvesse@apache.org
        [ https://svn.apache.org/r1585750 ]

        Add license headers to relevant XML files (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585747 from rvesse@apache.org
        [ https://svn.apache.org/r1585747 ]

        Trim down NOTICE per discussion with Andy on JENA-666

        ASF subversion and git services added a comment -

        Commit 1585734 from rvesse@apache.org
        [ https://svn.apache.org/r1585734 ]

        Change groupId to org.apache.jena (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585727 from rvesse@apache.org
        [ https://svn.apache.org/r1585727 ]

        Complete migrating to org.apache.jena package (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585725 from rvesse@apache.org
        [ https://svn.apache.org/r1585725 ]

        Continue migrating to org.apache.jena package (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585724 from rvesse@apache.org
        [ https://svn.apache.org/r1585724 ]

        Continue migrating to org.apache.jena package (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585723 from rvesse@apache.org
        [ https://svn.apache.org/r1585723 ]

        Continue migrating to org.apache.jena package (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1585720 from rvesse@apache.org
        [ https://svn.apache.org/r1585720 ]

        Start migrating to org.apache.jena package (JENA-666)

        Rob Vesse added a comment -

        Ok, the NOTICE file can likely be slimmed down then to just include the Jena notices. Though since the other Jena libraries are dependencies, is that even necessary, given that those notices primarily apply to historical code in the core modules?

        Yes, I plan to change the Maven group ID and Java packages; I just haven't got round to it yet.

        I can also easily re-do the license headers to match the formatting elsewhere. I automated the conversion anyway, so I can easily do it again.

        In terms of releases, I suspect we might put out an individual release of this in the short term so people can start playing with it, knowing it may be buggy, and then just include it in future Jena releases as it matures.

        Andy Seaborne added a comment -

        Would this be part of a single Jena release or a separate release? Or separate for now (maybe it would change more frequently?), and later in the main release cycle?

        Andy Seaborne added a comment -

        Non-legal comments:

        1. It's <groupId>com.yarcdata.urika</groupId>, not org.apache.jena, and the same goes for the code package structure. Is there any problem with changing that? It's only com.hp... because of all the code already out there at the time of joining ASF.
        2. POM files should have a license as per our other POM files.
        3. The Apache header is in a slightly different format (same words) to the rest of the code base. Minor/trivial, but it would be nice for it to be identical in case we automate some conversion sometime.
        Andy Seaborne added a comment -

        For things that are dependencies and not shipped (source, binary), there isn't a need to do anything. Even if AL code is binary-shipped, it does not need to be in NOTICE. It is nice to put it in the LICENSE of any combined product. If it's all Maven dependencies and the release is only via Maven, then there is no need for anything and NOTICE must not contain anything (the minimal principle).

        Same for airline.

        I've run RAT (it's clean except POM files don't have a license) and looked at the dependency tree (clean). Looks good from the contribution point of view.

        Rob Vesse added a comment -

        Andy

        I have now updated all the copyright headers appropriately and stubbed out a basic LICENSE and NOTICE.

        The primary dependency is just Apache Hadoop, which I believe is handled simply by mentioning it in the NOTICE file? Most of it is also marked as provided, since it will be part of a Hadoop installation, so if and when we ever create convenience binaries, Hadoop and its transitive dependencies should not be an issue. Regardless, anything that is a transitive dependency via Apache Hadoop must be allowed in Apache distributions anyway, or the Hadoop PMC couldn't release it.

        The demo app (hadoop-rdf-stats) uses airline (https://github.com/airlift/airline) to provide the CLI, and this is distributed under the ALv2, so again I assume just a mention in NOTICE for that module is appropriate?

        ASF subversion and git services added a comment -

        Commit 1584619 from rvesse@apache.org
        [ https://svn.apache.org/r1584619 ]

        Add stub LICENSE and NOTICE to hadoop-rdf module (JENA-666)

        ASF subversion and git services added a comment -

        Commit 1584617 from rvesse@apache.org
        [ https://svn.apache.org/r1584617 ]

        Add standard Apache Copyright headers (JENA-666)

        Andy Seaborne added a comment - - edited

        Before anything else, could you (Rob), as the contributing Cray person, please change the copyright notices and add the boilerplate from "How to apply the Apache License to your work"? That makes it Apache-licensed code that has not yet been accepted.

        The ideal from my point of view is to end up without copyright notices in source files and with the contributor license statement on each file like the rest of the code: "Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements". Is that Cray's expectation here?

        At the moment, a sample of files I looked at have:

         * Copyright 2013 YarcData LLC All Rights Reserved.
        

        What dependencies does the code have (and what licenses do they have)?

        ASF subversion and git services added a comment -

        Commit 1583942 from Rob Vesse
        [ https://svn.apache.org/r1583942 ]

        Initial import of Hadoop RDF Tools code which is undergoing IP Clearance process (JENA-666)

        Rob Vesse made changes -
        Attachment Incubator-Website.patch [ 12638223 ]
        Rob Vesse added a comment -

        Updated version of Incubator Website patch which I have now committed

        Rob Vesse made changes -
        Attachment Hadoop RDF Tools - External Version.pdf [ 12638058 ]
        Rob Vesse added a comment -

        Adding PDF version of the presentation

        Rob Vesse made changes -
        Rob Vesse added a comment -

        Added a PowerPoint presentation which discusses features, known bugs/limitations and future work

        Rob Vesse made changes -
        Assignee Rob Vesse [ rvesse ] -> Andy Seaborne [ andy.seaborne ]
        Rob Vesse added a comment -

        Assigning to Andy because, while I will be doing most of the legwork, an ASF Member has to carry out the actual formal parts of the process (like calling the votes)

        Rob Vesse made changes -
        Attachment Incubator-Website.patch [ 12638048 ]
        Rob Vesse added a comment -

        Attaching patch file containing the patch for the Incubator website that adds the IP Clearance form to track the progress of the clearance with the Incubator

        Rob Vesse made changes -
        Status Open [ 1 ] -> In Progress [ 3 ]
        Rob Vesse made changes -
        Field Original Value New Value
        Attachment Hadoop-RDF-Tools.patch [ 12638047 ]
        Rob Vesse added a comment -

        Attaching patch file containing the donated code

        Rob Vesse created issue -

          People

          • Assignee:
            Andy Seaborne
            Reporter:
            Rob Vesse
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved:
