  1. Pig
  2. PIG-2531

Filter function for IsTupleInBag and IsTupleInTuple


Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.9.1
    • Fix Version/s: None
    • Component/s: piggybank
    • Labels: None
    • Patch Info: Patch Available

    Description

      It would be nice to have a FilterFunc that filters records based on whether a tuple in the stream is contained in either another tuple or a bag.

      Data (e.g. session data joined with e.g. follow-up sessions):
      > BAG: {('/login'), ('/show'), ('/logout?user_id=2000')}, TUPLE: ('/logout?user_id=2000')
      > BAG: {('/home'), ('/about')}, TUPLE: ('/admin')
      > BAG: {('login')}, TUPLE: ('/logout')

      It would be great to be able to filter based on criteria such as <B1 CONTAINS T1> or <T1 CONTAINS T2>. In the above case, only the first entry would pass such a filter, because its tuple ('/logout?user_id=2000') appears in its bag; the other two entries would be dropped. It should be obvious that this is useful.
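
      The intended semantics can be sketched in plain Java, without any Pig dependency. This is only an illustration, not the code in the attached patch. Tuples are modelled as `List<Object>` and bags as collections of such lists. The class and method names (`TupleContainment`, `isTupleInBag`, `isTupleInTuple`, `tuple`) are hypothetical. The reading of <T1 CONTAINS T2> as "T2 appears as a nested field of T1" is an assumption, since the description does not spell it out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class TupleContainment {

    // Hypothetical helper: builds a "tuple" as an ordered list of fields.
    public static List<Object> tuple(Object... fields) {
        return Arrays.asList(fields);
    }

    // <B1 CONTAINS T1>: true if some tuple in the bag equals the candidate.
    // Equality here is List.equals (field by field, in order); a real Pig UDF
    // would compare org.apache.pig.data.Tuple instances instead.
    public static boolean isTupleInBag(Collection<List<Object>> bag, List<Object> candidate) {
        if (bag == null || candidate == null) {
            return false;
        }
        for (List<Object> t : bag) {
            if (candidate.equals(t)) {
                return true;
            }
        }
        return false;
    }

    // <T1 CONTAINS T2>: ASSUMED to mean that the candidate tuple appears
    // as one of the (nested) fields of the outer tuple.
    public static boolean isTupleInTuple(List<Object> outer, List<Object> candidate) {
        if (outer == null || candidate == null) {
            return false;
        }
        for (Object field : outer) {
            if (candidate.equals(field)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The three sample records from the description.
        List<List<Object>> bag1 = Arrays.asList(
                tuple("/login"), tuple("/show"), tuple("/logout?user_id=2000"));
        List<List<Object>> bag2 = Arrays.asList(tuple("/home"), tuple("/about"));
        List<List<Object>> bag3 = Arrays.asList(tuple("login"));

        System.out.println(isTupleInBag(bag1, tuple("/logout?user_id=2000"))); // true
        System.out.println(isTupleInBag(bag2, tuple("/admin")));               // false
        System.out.println(isTupleInBag(bag3, tuple("/logout")));              // false
    }
}
```

      In an actual piggybank UDF, the same checks would presumably live in the `exec` method of a class extending `FilterFunc`, reading the bag and the tuple from the input fields. That wiring is omitted here.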

      Attachments

        1. PIG-2531.patch
          11 kB
          Florian Leibert (flo)

        Activity

          People

            Assignee: Unassigned
            Reporter: Florian Leibert (flo)
            Votes: 1
            Watchers: 4

            Dates

              Created:
              Updated:

              Time Tracking

                Original Estimate: 2h
                Remaining Estimate: 2h
                Time Spent: Not Specified