Bigtop / BIGTOP-282

The licensing status of the MovieLens data files needs to be cleared up

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.8.0
    • Component/s: Tests
    • Labels:
      None

      Activity

      Stephen Chu added a comment -

      Do we still need the MovieLens data files?

      In BIGTOP-611, Wing Yew decided to use his own randomly generated data instead of the MovieLens data files. I don't think any of the other tests rely on the MovieLens data.

      Wing Yew Poon added a comment -

      The problem is with bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data.

      Sean Mackrory added a comment -

      It appears from that license we shouldn't be distributing this without permission. If we don't need that data any more, I think the best course of action is to cease distributing it. I will post a patch shortly.

      Konstantin Boudnik added a comment -

      What's the status on this ticket? Are we removing the files after all? Can people familiar with the matter chime in?

      Sean Mackrory added a comment -

      Attached is a patch that will remove every reference to the files that I see in the tests. I think we should also remove the entirety of the bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data directory, but doing so puts this patch over the size limit for Apache attachments, so I've left it out of my post.

      While it would be nice to run the tests and make sure it all works before committing, I've been meaning to do that for a long time and just never get to it. Unless someone else is able to run them all quickly (I don't have a good environment to run them all in right now and I'm not super familiar with doing so), I propose we just drop the files we shouldn't be distributing, and if it fails on Jenkins we fix it when it fails. Any thoughts on that?

      Konstantin Boudnik added a comment -

      Looks good. Please commit along with the removal of the directory. I agree to play the 'wait-and-see' game to make sure that no tests are failing after this patch.

      jay vyas added a comment -

      +1, and let's minimize the amount of data files we distribute in the future! BigPetStore-generated data can, I think, be used for meaningful Hive, Mahout, and wordcount smoke tests.

      Sean Mackrory added a comment -

      Forgot to resolve after committing. I haven't heard of any puppies or other innocent animals being hurt because of this change, so it looks like we're good.


        People

        • Assignee:
          Sean Mackrory
          Reporter:
          Roman Shaposhnik
        • Votes:
          0
          Watchers:
          5
