  1. Apache Blur
  2. BLUR-344

Expose a Scanner capability that allows various implementations (e.g. ExportScanner)

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Blur
    • Labels: None

    Description

      Blur should support "scanner" plugins that, given a query, are handed every record matching that query. From the Thrift API's perspective, these would be asynchronous, long-running calls.

      The scanner would essentially be given a collector of the hits, with each hit carrying the fields defined by the passed-in selector.
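
To make the collector idea concrete, here is a minimal Python sketch of what such a plugin contract might look like. Everything in it is an assumption made for illustration: the `Scanner` base class, the `collect`/`finish` method names, the flat `Record` type, and the `run_scan` driver are hypothetical and are not part of Blur's API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List

# Hypothetical flat record: column name -> value.
Record = Dict[str, str]


class Scanner(ABC):
    """Hypothetical plugin interface: the engine pushes each matching hit in."""

    @abstractmethod
    def collect(self, hit: Record) -> None:
        """Receive one hit, already reduced to the selected fields."""

    def finish(self) -> None:
        """Called once after the last hit; the default does nothing."""


def project(record: Record, selector: List[str]) -> Record:
    """Keep only the columns named by the selector."""
    return {name: record[name] for name in selector if name in record}


def run_scan(matches: Iterable[Record], selector: List[str], scanner: Scanner) -> int:
    """Feed every matching record, projected through the selector, to the scanner."""
    count = 0
    for record in matches:
        scanner.collect(project(record, selector))
        count += 1
    scanner.finish()
    return count


class ListScanner(Scanner):
    """Trivial implementation that just remembers what it was handed."""

    def __init__(self) -> None:
        self.hits: List[Record] = []

    def collect(self, hit: Record) -> None:
        self.hits.append(hit)
```

The push-style `collect` call keeps the scanner unaware of how matches are produced, so the same plugin could be driven per shard or from a merged result stream.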

      The client would ask for a scan, then periodically poll for its status and, depending on the Scanner implementation, pick up the results in whatever form was requested.
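
The ask-then-poll flow could look roughly like the following client-side sketch. The `wait_for_scan` helper, its parameters, and the idea of passing the status call in as a plain function are assumptions for illustration; the three status names mirror the `ScanStatus` enum proposed in this issue, and a real client would call the generated Thrift stub instead.

```python
import time
from enum import Enum
from typing import Callable


class ScanStatus(Enum):
    # Mirrors the enum proposed in this issue; numeric values are illustrative.
    COMPLETE = 0
    RUNNING = 1
    ERROR = 2


def wait_for_scan(
    status_scan: Callable[[str], ScanStatus],
    scan_id: str,
    interval_s: float = 5.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> ScanStatus:
    """Poll until the scan leaves RUNNING, or give up after max_polls checks."""
    for _ in range(max_polls):
        status = status_scan(scan_id)
        if status is not ScanStatus.RUNNING:
            return status  # COMPLETE or ERROR: nothing more to wait for
        sleep(interval_s)
    raise TimeoutError(f"scan {scan_id!r} still running after {max_polls} polls")
```

Injecting `sleep` keeps the loop testable without real delays; in production the default `time.sleep` applies.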

      For a concrete implementation, think of export. The ExportScanner would be given a location in HDFS, scan over all the results, and drop them in that directory, possibly in a particular requested format. The Scanner pattern could have many other useful implementations, though: for example, inserting a subset of the data into a new Blur table.
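
As a rough illustration of the export case, the sketch below writes each hit as one JSON line into a part file under a target directory. It is only an approximation under stated assumptions: a local directory stands in for HDFS, JSON lines stand in for "a particular requested format", and the class layout, the `collect`/`finish` method names, and the `part-00000.jsonl` file name are all hypothetical.

```python
import json
import os
from typing import Dict


class ExportScanner:
    """Hypothetical export plugin: one JSON document per line, one part file."""

    def __init__(self, output_dir: str, part_name: str = "part-00000.jsonl") -> None:
        # A local directory stands in for the HDFS location in this sketch.
        os.makedirs(output_dir, exist_ok=True)
        self.path = os.path.join(output_dir, part_name)
        self._out = open(self.path, "w", encoding="utf-8")
        self.exported = 0

    def collect(self, hit: Dict[str, str]) -> None:
        """Append one hit; keys are sorted so the output is deterministic."""
        self._out.write(json.dumps(hit, sort_keys=True) + "\n")
        self.exported += 1

    def finish(self) -> None:
        """Flush and close the part file once the scan has delivered all hits."""
        self._out.close()
```

A table-to-table scanner would have the same shape, with `collect` issuing a mutation against the destination table instead of writing a line.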

      Here are some client API thoughts:

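      // Request object describing a single scan.
      struct ScannerQuery {
        1:Query query,
        2:Selector selector,
        3:string id,
        4:string userContext,
        5:string scannerName,
        6:i64 startTime = 0,
        7:map<string,string> properties
      }
      
      // Coarse state of a scan, returned when the client polls.
      enum ScanStatus {
        COMPLETE,
        RUNNING,
        ERROR
      }
      
      // The methods below would be added to the Blur Thrift service.
      
        // Starts an asynchronous scan.
        void scan(
          1:ScannerQuery scannerQuery
        ) throws (1:BlurException ex)
      
        // Lists the known scans.
        list<string> scanList(
        ) throws (1:BlurException ex)
      
        // Reports the current state of one scan.
        ScanStatus statusScan(
          1:string scanId
        ) throws (1:BlurException ex)
      
        // Requests that a running scan be stopped.
        void cancelScan(
          1:string scanId
        ) throws (1:BlurException ex)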
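
To show how the four proposed calls might fit together on the server side, here is a small in-memory Python model. It is a sketch under assumptions, not Blur's implementation: the `ScanManager` class, the snake_case method names, the use of one thread per scan, and the `join` helper are all hypothetical. The proposed `ScanStatus` enum has no cancelled state, so in this sketch a cancelled scan simply ends in whatever state its worker finishes with.

```python
import threading
from enum import Enum
from typing import Callable, Dict, List, Optional


class ScanStatus(Enum):
    # Mirrors the enum proposed above; numeric values are illustrative.
    COMPLETE = 0
    RUNNING = 1
    ERROR = 2


class ScanManager:
    """Hypothetical in-memory bookkeeping for scan/scanList/statusScan/cancelScan."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._status: Dict[str, ScanStatus] = {}
        self._cancel: Dict[str, threading.Event] = {}
        self._threads: Dict[str, threading.Thread] = {}

    def scan(self, scan_id: str, work: Callable[[threading.Event], None]) -> None:
        """Register the scan as RUNNING and start it on a background thread."""
        with self._lock:
            if scan_id in self._status:
                raise ValueError(f"scan id already in use: {scan_id}")
            cancelled = threading.Event()
            thread = threading.Thread(
                target=self._run, args=(scan_id, work, cancelled), daemon=True
            )
            self._status[scan_id] = ScanStatus.RUNNING
            self._cancel[scan_id] = cancelled
            self._threads[scan_id] = thread
        thread.start()

    def _run(self, scan_id: str, work: Callable[[threading.Event], None],
             cancelled: threading.Event) -> None:
        try:
            work(cancelled)  # the worker is expected to check the event itself
            outcome = ScanStatus.COMPLETE
        except Exception:
            outcome = ScanStatus.ERROR
        with self._lock:
            self._status[scan_id] = outcome

    def scan_list(self) -> List[str]:
        with self._lock:
            return sorted(self._status)

    def status_scan(self, scan_id: str) -> ScanStatus:
        with self._lock:
            return self._status[scan_id]

    def cancel_scan(self, scan_id: str) -> None:
        """Signal the worker to stop; it decides how quickly to honour that."""
        with self._lock:
            self._cancel[scan_id].set()

    def join(self, scan_id: str, timeout: Optional[float] = None) -> None:
        """Test helper (not part of the proposed API): wait for the worker."""
        self._threads[scan_id].join(timeout)
```

Whether cancellation deserves its own status value, and how long finished scans should remain visible in the list, are questions the proposal leaves open.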
      

          People

            Assignee: williamstw (Tim Williams)
            Reporter: williamstw (Tim Williams)
            Votes: 0
            Watchers: 2
