Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-593

RegExLoader stops an non-matching line

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Duplicate
    • 0.1.0
    • None
    • impl
    • None
    • Patch Available

    Description

      Class RegExLoader and all its subclasses stop if some of lines does not match provided regular expression.

      In particular, I have noticed this when CombinedLogLoader stopped at the following line:

      58.210.62.24 - - [29/Dec/2008:23:06:57 -0800] "GET /tor/browse/?id=24746&rel=FLY
      999%40Jack's+Teen+America+22%2FFLY999原創%40單掛D.C.資訊交流網+Jack's+Teen+Ameri
      ca+22+cd1.avi HTTP/1.1" 8952 200 "http://img252.imageshack.us/tor/browse/?id=247
      46&rel=FLY999%40Jack%27s+Teen+America+22" "Mozilla/4.0 (compatible; MSIE 6.0; Wi
      ndows NT 5.1; )" "-"

      Looks like some japanese characters here do not match \S expression used.

      In general I expect it to skip such lines, not to stop processing data file.

      Attachments

        1. PIG-593.diff
          0.6 kB
          Vadim Zaliva

        Activity

          People

            Unassigned Unassigned
            vzaliva Vadim Zaliva
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: