Apache Blur / BLUR-344

Expose a Scanner capability that allows various implementations (e.g. ExportScanner)

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Blur
    • Labels: None

      Description

      Blur should support "scanner" plugins that, given a query, are handed all of the records matching that query. From the Thrift API perspective, these would be asynchronous, long-running calls.

      The scanner would essentially be given a collector of the hits, with each hit carrying the fields defined by the passed-in selector.
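To make the collector idea concrete, here is a minimal Java sketch of what a scanner plugin contract might look like. Every name in it (HitCollector, ScannerPlugin, CountingScanner, ScanDriver) is hypothetical and not part of Blur. Records are modelled as plain string maps, and the selector as a list of column names, purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical contract: receives one call per matching record.
interface HitCollector {
    // selectedFields holds only the columns named by the selector.
    void collect(Map<String, String> selectedFields);
}

// Hypothetical plugin lifecycle wrapped around the collector.
interface ScannerPlugin extends HitCollector {
    // Called once before the first hit, with the request's properties.
    void open(Map<String, String> properties);

    // Called once after the last hit, even if collection failed.
    void close();
}

// Trivial plugin that counts hits and remembers the most recent one.
class CountingScanner implements ScannerPlugin {
    private long count;
    private Map<String, String> last;
    private boolean closed;

    @Override public void open(Map<String, String> properties) { count = 0; }
    @Override public void collect(Map<String, String> selectedFields) {
        count++;
        last = selectedFields;
    }
    @Override public void close() { closed = true; }

    long count() { return count; }
    Map<String, String> last() { return last; }
    boolean closed() { return closed; }
}

// Stand-in for the server side: projects each match through the selector
// and hands the result to the plugin.
class ScanDriver {
    static void run(List<Map<String, String>> matches,
                    List<String> selectedColumns,
                    ScannerPlugin scanner,
                    Map<String, String> properties) {
        scanner.open(properties);
        try {
            for (Map<String, String> record : matches) {
                Map<String, String> projected = new LinkedHashMap<>();
                for (String column : selectedColumns) {
                    if (record.containsKey(column)) {
                        projected.put(column, record.get(column));
                    }
                }
                scanner.collect(projected);
            }
        } finally {
            scanner.close();
        }
    }
}
```

The plugin never sees the query itself in this sketch; it only sees the projected hits, which keeps implementations such as an export small.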

      The client would ask for a scan, then poll for its status periodically and, depending on the Scanner implementation, pick up the results in whatever form they were requested.
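The request-then-poll flow described above might look roughly like this on the client side. This is a sketch under assumptions: ScanClient is a hypothetical stand-in for a generated Thrift client (a real one would also throw BlurException), ScanPoller and ScriptedClient are illustrative names, and the poll interval and poll limit are arbitrary parameters.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Mirrors the ScanStatus enum proposed in this issue.
enum ScanStatus { COMPLETE, RUNNING, ERROR }

// Hypothetical stand-in for the generated Thrift client.
interface ScanClient {
    ScanStatus statusScan(String scanId);
}

class ScanPoller {
    // Polls until the scan leaves RUNNING or maxPolls is reached.
    // Returns RUNNING if the scan is still in progress when we give up.
    static ScanStatus awaitScan(ScanClient client, String scanId,
                                long pollMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            ScanStatus status = client.statusScan(scanId);
            if (status != ScanStatus.RUNNING) {
                return status;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop polling.
                Thread.currentThread().interrupt();
                return ScanStatus.RUNNING;
            }
        }
        return ScanStatus.RUNNING;
    }
}

// Test double that replays a fixed sequence of statuses,
// repeating the final one forever. Requires at least one status.
class ScriptedClient implements ScanClient {
    private final Queue<ScanStatus> script;
    int calls = 0;

    ScriptedClient(ScanStatus... statuses) {
        script = new ArrayDeque<>(Arrays.asList(statuses));
    }

    @Override public ScanStatus statusScan(String scanId) {
        calls++;
        return script.size() > 1 ? script.poll() : script.peek();
    }
}
```

A caller that gets RUNNING back after the poll limit can decide whether to keep waiting or to cancel the scan.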

      For a concrete implementation, think of export. The ExportScanner would be given a location in HDFS, scan over all the results, and drop them in that directory, perhaps in a particular requested format. The Scanner pattern could have many other useful implementations, though: for example, inserting a subset of the data into a new Blur table.
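As a rough illustration of the export case, the sketch below writes each hit as one tab-separated line of column=value pairs. It is not the proposed implementation: a local directory and java.nio stand in for HDFS, and the class shape, the output file name part-00000.tsv, and the line format are all assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical export scanner. A local directory stands in for HDFS here.
class ExportScanner implements AutoCloseable {
    private final BufferedWriter out;
    private long written;

    ExportScanner(Path exportDir) {
        this.out = open(exportDir);
    }

    // Creates the export directory if needed and opens a single output file.
    private static BufferedWriter open(Path exportDir) {
        try {
            Files.createDirectories(exportDir);
            // The file name is an arbitrary choice for this sketch.
            return Files.newBufferedWriter(
                exportDir.resolve("part-00000.tsv"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes one hit as a line of tab-separated column=value pairs.
    void collect(Map<String, String> selectedFields) {
        StringJoiner line = new StringJoiner("\t");
        for (Map.Entry<String, String> field : selectedFields.entrySet()) {
            line.add(field.getKey() + "=" + field.getValue());
        }
        try {
            out.write(line.toString());
            out.newLine();
            written++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long written() { return written; }

    // Flushes and closes the output file.
    @Override public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A different output format, or a scanner that inserts into another table, would only need to change what collect does with each hit.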

      Here are some client API thoughts:

      struct ScannerQuery {
        1:Query query,
        2:Selector selector,
        3:string id,
        4:string userContext,
        5:string scannerName,
        6:i64 startTime = 0,
        7:map<string,string> properties
      }

      enum ScanStatus {
        COMPLETE,
        RUNNING,
        ERROR
      }

      // Starts an asynchronous, long-running scan.
      void scan(
        1:ScannerQuery scannerQuery
      ) throws (1:BlurException ex)

      // Lists scans (presumably by scan id).
      list<string> scanList(
      ) throws (1:BlurException ex)

      // Polled by the client to check on a scan.
      ScanStatus statusScan(
        1:string scanId
      ) throws (1:BlurException ex)

      // Cancels a scan.
      void cancelScan(
        1:string scanId
      ) throws (1:BlurException ex)
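One way the server side could track these calls is sketched below in Java, using an in-memory map of futures. This is an illustration, not a proposal for the actual implementation. InMemoryScanService is a hypothetical name, the scan work is passed in as a plain Runnable rather than a ScannerQuery, and reporting a cancelled scan as ERROR is an assumption, since the proposed enum has no cancelled state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Mirrors the ScanStatus enum proposed in this issue.
enum ScanStatus { COMPLETE, RUNNING, ERROR }

// Hypothetical in-memory tracker for the four proposed calls.
class InMemoryScanService {
    // Daemon threads so that an unfinished scan never blocks JVM exit.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "scan-worker");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, Future<?>> scans = new ConcurrentHashMap<>();

    // Starts the work in the background and returns immediately.
    synchronized void scan(String scanId, Runnable work) {
        if (scans.containsKey(scanId)) {
            throw new IllegalArgumentException("duplicate scan id: " + scanId);
        }
        scans.put(scanId, pool.submit(work));
    }

    // Returns the ids of all known scans, finished or not.
    List<String> scanList() {
        return new ArrayList<>(scans.keySet());
    }

    ScanStatus statusScan(String scanId) {
        Future<?> future = require(scanId);
        if (!future.isDone()) {
            return ScanStatus.RUNNING;
        }
        if (future.isCancelled()) {
            return ScanStatus.ERROR; // assumption: no CANCELLED state exists
        }
        try {
            future.get(); // already done, so this does not block
            return ScanStatus.COMPLETE;
        } catch (ExecutionException e) {
            return ScanStatus.ERROR; // the scan work threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return ScanStatus.ERROR;
        }
    }

    // Interrupts the worker thread if the scan is still running.
    void cancelScan(String scanId) {
        require(scanId).cancel(true);
    }

    void shutdown() {
        pool.shutdownNow();
    }

    private Future<?> require(String scanId) {
        Future<?> future = scans.get(scanId);
        if (future == null) {
            throw new IllegalArgumentException("unknown scan id: " + scanId);
        }
        return future;
    }
}
```

Because scan returns void, this sketch relies on the caller supplying the id up front, which matches the id field on ScannerQuery.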
      

          People

          • Assignee: Tim Williams
          • Reporter: Tim Williams
          • Votes: 0
          • Watchers: 2
