Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: REST
    • Labels:
      None

      Description

      This is a PHP class to interact with the REST interface. This is my first copy, so there could be bugs and changes to come as the REST interface changes. I will make this into a patch once I am done with it. There are lots of comments in the file and notes on usage, but here is some basic stuff to get you started. You are welcome to suggest changes to make it faster or more usable.

      Basic usage is shown below; more details are in the notes with each function.

      // open a new connection to the REST server. The HBase master's default port is 60010
      $hbase = new hbase_rest($ip, $port);

      // get list of tables
      $tables = $hbase->list_tables();

      // get table column family names and compression stuff
      $table_info = $hbase->table_schema("search_index");

      // get start and end row keys of each region
      $regions = $hbase->regions($table);

      // select data from hbase
      $results = $hbase->select($table,$row_key);

      // insert data into hbase; $column and $data can be arrays so more than one column is inserted in one request
      $hbase->insert($table,$row,$column,$data);

      // delete a column from a row. Cannot use * at this point to remove all columns; I think there are plans to add this.
      $hbase->remove($table,$row,$column);

      // start a scanner on a set range of table
      $handle = $hbase->scanner_start($table,$cols,$start_row,$end_row);

      // pull the next row of data for a scanner handle
      $results = $hbase->scanner_get($handle);

      // delete a scanner handle
      $hbase->scanner_delete($handle);
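
      For example, combining the calls above into a simple insert followed by a read of the same row could look like the sketch below (the table name, row key and column name are made up for illustration):

      $hbase = new hbase_rest($ip, $port);

      // write one cell; $column and $data could also be arrays for a multi-column put
      $hbase->insert("search_index","row_0001","info:title","hello world");

      // read the row back and dump the result
      $results = $hbase->select("search_index","row_0001");
      print_r($results);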

      Example of using a scanner; this will loop over each row until it is out of rows.

      include("hbase_rest.php");
      $hbase = new hbase_rest($ip, $port);
      $handle = $hbase->scanner_start($table,$cols,$start_row,$end_row);
      $results = true;
      while ($results) {
          $results = $hbase->scanner_get($handle);
          if ($results) {
              foreach ($results['column'] as $key => $value) {
                  // .... code here to work with the $key/column name and the $value of the column ....
              } // end foreach
          } // end if
      } // end while
      $hbase->scanner_delete($handle);

      1. hbase_rest.php
        17 kB
        Billy Pearson
      2. hbase_restv2.php
        17 kB
        Billy Pearson


          Activity

          Billy Pearson added a comment -

           Just added this to help out others. Closed.

          stack added a comment -

          Can we resolve this issue Billy? Folks seem to have success going via thrift from PHP.

          Billy Pearson added a comment -

           Bug fix on insert: a ":" was getting added to the end of the data on inserts.

          Billy Pearson added a comment -

           This would make it easy for reads, but for puts I would still have to keep it the same. I forgot about keep-alive; it has been a long time since I had to use fsockopen myself, but I will work on adding keep-alives into the code to keep the connection open so we do not have to tear down and rebuild the connection per transaction. This will help speed up the code.
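
           The idea, as a minimal sketch (this is not code from the attached class; $host, $port and $path are placeholders), is to reuse one fsockopen() socket across requests by asking the server to keep the connection open:

               // open one socket and reuse it for several requests
               $fp = fsockopen($host, $port, $errno, $errstr, 30);
               if (!$fp) {
                   die("connect failed: $errstr ($errno)\n");
               }

               // HTTP/1.1 request asking the server not to close the connection
               $request  = "GET $path HTTP/1.1\r\n";
               $request .= "Host: $host:$port\r\n";
               $request .= "Connection: keep-alive\r\n";
               $request .= "\r\n";
               fwrite($fp, $request);

               // read the response headers; the body would then be read based on
               // Content-Length (or chunked encoding)
               while (($line = fgets($fp)) !== false && trim($line) !== "") {
                   // header line
               }

               // the same $fp can now be used for the next request instead of
               // tearing the connection down and rebuilding it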

          Michael Bieniosek added a comment -

          Hi Billy,

          It might be easier for you to use php's built-in http client, rather than writing the http client yourself. That way, you won't have to deal with chunked-encoding, keepalives, etc. (which you ignore, but you might get some performance improvements from them).

          To do the cell fetching, for example, you could just do:

              $xml = simplexml_load_file("http://$host:60050/api/$table/row/$row");
          
              $content = array();
          
              foreach ($xml->column as $column) {
                  $content[base64_decode($column->name)] = base64_decode($column->value);
              }
          
              return $content;
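
           Wrapped into a helper, the snippet above could be used like this (the function name rest_select and the example arguments are only for illustration):

               // hypothetical wrapper around the simplexml-based cell fetch shown above
               function rest_select($host, $table, $row) {
                   // fetch the row as XML from the REST interface and decode each column
                   $xml = simplexml_load_file("http://$host:60050/api/$table/row/$row");
                   $content = array();
                   foreach ($xml->column as $column) {
                       $content[base64_decode($column->name)] = base64_decode($column->value);
                   }
                   return $content;
               }

               $row_data = rest_select("localhost", "search_index", "row_0001");
               print_r($row_data);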
          
          Billy Pearson added a comment -

           There is still a problem, and I am not sure if it is my code or something within the REST code, but there is a memory issue: after using it to insert a few million rows, the master and/or the separate REST process (HADOOP-2316) runs out of heap memory; something is using up memory and not releasing it. I have seen it on inserts but have not tested for it much beyond that. I run a separate REST process outside of the master so I can kill that without having to restart the master and the whole cluster.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Billy Pearson
            • Votes:
              0
            • Watchers:
              0

              Dates

              • Created:
              • Updated:
              • Resolved:
