NUTCH-1980

Jexl expressions for CrawlDbReader

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.11
    • Component/s: crawldb
    • Labels:
      None

      Description

      Add Jexl expression support to the CrawlDbReader. This allows you to select items from the database by their metadata, combining conditions with boolean logic for flexibility. Some examples:

      * Get all English pages
      -expr "lang == 'en'"
      
      * Get all English pages that have a response time above 5000
      -expr "lang == 'en' && _rs_ > 5000"
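
To make the intent of the two filters above concrete, here is a minimal plain-Python sketch of the selection they describe. It does not use Jexl or Nutch; the records, URLs, and helper names (`select`, `is_english`, `is_english_and_over_5000`) are hypothetical and exist only for illustration. Only the metadata keys `lang` and `_rs_` come from the examples.

```python
# Hypothetical CrawlDb-like records: a URL plus a metadata dict.
# The keys "lang" and "_rs_" are taken from the examples above;
# the URLs and values are invented for illustration.
records = [
    {"url": "http://example.org/a", "metadata": {"lang": "en", "_rs_": 7200}},
    {"url": "http://example.org/b", "metadata": {"lang": "en", "_rs_": 300}},
    {"url": "http://example.org/c", "metadata": {"lang": "nl", "_rs_": 9100}},
    {"url": "http://example.org/d", "metadata": {"_rs_": 6000}},  # no "lang" key
]


def select(records, predicate):
    """Return the URLs of records whose metadata satisfies the predicate."""
    return [r["url"] for r in records if predicate(r["metadata"])]


def is_english(md):
    # First example: the language metadata equals "en".
    return md.get("lang") == "en"


def is_english_and_over_5000(md):
    # Second example: English AND a response-time value above 5000.
    # A missing "_rs_" is treated as 0 here, which is an assumption of this sketch.
    return md.get("lang") == "en" and md.get("_rs_", 0) > 5000


english = select(records, is_english)
english_over_5000 = select(records, is_english_and_over_5000)

print(english)
print(english_over_5000)
```

Records without a `lang` entry are simply not selected in this sketch; how the actual reader treats missing metadata is not stated in this issue.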
      

        Attachments

        1. NUTCH-1980.patch
          7 kB
          Markus Jelsma
        2. NUTCH-1980-1.9.patch
          10 kB
          Markus Jelsma
        3. NUTCH-1980-1.9.patch
          10 kB
          Markus Jelsma
        4. NUTCH-1980-1.9.patch
          9 kB
          Markus Jelsma

            People

            • Assignee:
              markus17 Markus Jelsma
              Reporter:
              markus17 Markus Jelsma
            • Votes:
              0
              Watchers:
              3