Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Tests
    • Labels:

      Description

      Add a test for HBase's importtsv tool.

      1. BIGTOP-611.txt
        368 kB
        Wing Yew Poon

        Activity

        Stephen Chu added a comment -

        Submitted (to trunk) a test written by Wing Yew Poon and two .tsv resource files to run the test.

        Wing Yew Poon added a comment -

        We need permission to use MovieLens data.
        Lacking that permission, we should use different data.

        Stephen Chu added a comment -

        I'll ask for permission.

        Johnny Zhang added a comment -

        Is this related to the license?

        Wing Yew Poon added a comment -

        The MovieLens README contains the usage license/conditions. It says,

        "The user may not redistribute the data without separate permission."

        I believe that publishing data to a public repository constitutes redistribution.

        Stephen Chu added a comment -

        I emailed MovieLens and got the following response. I think it may be better to find or create other .tsv/.psv files so that we don't have to set up a license-acceptance step.

        Hi Stephen,

        The data sets come with READMEs that outline the acceptable usages. Since you are with Apache, I believe you would not be using this for commercial purposes so you satisfy that clause. The main roadblock I see is what is involved with using the data set as a test resource. Would this mean the data would be hosted in an SCM somewhere for anyone to download?

        Our LensKit framework uses the 100k data set as a test resource, but has Maven download it automatically after using a property to "accept" the license. If you were able to configure your testing environment similarly, then I think using our data would be acceptable.

        HTH,
        Michael Ludwig
        GroupLens
        University of Minnesota

        Roman Shaposhnik added a comment -

        Stephen, thanks a lot for following up on this. I believe we can totally do the Maven property thing.
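
The property-gated idea from the email above could look roughly like the following. This is a minimal sketch in Python rather than actual Maven configuration; the property name `DATASET_LICENSE_ACCEPTED` and the `download` callable are hypothetical, and nothing here reflects how LensKit or Bigtop actually implement it.

```python
import os

# Hypothetical property name; a real build would define its own.
LICENSE_PROPERTY = "DATASET_LICENSE_ACCEPTED"


def license_accepted(env=None):
    """Return True only if the property is explicitly set to 'true'."""
    env = os.environ if env is None else env
    return env.get(LICENSE_PROPERTY, "").strip().lower() == "true"


def fetch_test_data(download, env=None):
    """Call download() only after the license has been accepted.

    Returns None when the license was not accepted, so a test can skip
    itself instead of failing. `download` is any zero-argument callable
    supplied by the caller (an assumption for this sketch).
    """
    if not license_accepted(env):
        return None
    return download()
```

With this shape the data never lives in the repository: it is fetched at test time, and only by a user who has opted in.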

        Stephen Chu added a comment -

        I'll look into how to do the Maven property thing.

        Wing Yew, where did you find the MovieLens .tsv files? I'm having trouble finding where they're hosted on the MovieLens/GroupLens websites.

        Wing Yew Poon added a comment -

        The psv file came from MovieLens. I can't remember which size download I got it from. It is the list of movies in the dataset. The tsv file I produced by replacing the '|' separators in the psv file with tabs.
        It is simpler not to use those files. I'm going to generate new files with random data. I don't think we need to obtain any permission in that case.
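
The two steps described above (swapping '|' separators for tabs, and generating replacement files from random data) can be sketched as follows. This is an illustration only, not the script used for the attached patch; the function names, the row-key format, and the column widths are all assumptions.

```python
import random
import string


def psv_to_tsv(psv_text):
    """Replace '|' field separators with tabs, one line at a time."""
    return "".join(
        "\t".join(line.split("|")) + "\n" for line in psv_text.splitlines()
    )


def random_rows(n_rows, n_cols, seed=0):
    """Build rows of random alphanumeric fields.

    The first field of each row is a unique key (an assumed layout);
    the remaining n_cols fields are 8-character random strings.
    A fixed seed keeps the generated file reproducible.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    rows = []
    for i in range(n_rows):
        values = [
            "".join(rng.choice(alphabet) for _ in range(8))
            for _ in range(n_cols)
        ]
        rows.append([f"row{i:06d}"] + values)
    return rows


def to_delimited(rows, sep):
    """Join each row with the given separator, one row per line."""
    return "".join(sep.join(row) + "\n" for row in rows)
```

Because the alphabet contains neither '|' nor tab characters, the same rows can be written once with each separator to produce matching .psv and .tsv files.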

        Wing Yew Poon added a comment -

        TestImportTsv and associated resource files.

        Wing Yew Poon added a comment -

        Patch available.

        Roman Shaposhnik added a comment -

        +1 and committed


          People

          • Assignee:
            Wing Yew Poon
            Reporter:
            Stephen Chu
          • Votes:
            0
            Watchers:
            4
