Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: nutchgora
    • Fix Version/s: nutchgora
    • Component/s: storage
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      This issue should have been dealt with as part of its parent issue; however, as it is a fairly large task in itself, I think it needs to be done independently. The gora.properties file should, amongst other settings and besides the extremely basic defaults for sqlstore, include defaults for connecting to HBase, Cassandra, etc. servers on their default ports. Leaving this down to individual interpretation puts a huge onus on the user, hence constructing a barrier to entry for getting the configuration settings up and running.

      1. NUTCH-1189-v4.patch
        2 kB
        Lewis John McGibbney
      2. NUTCH-1189-v3.patch
        1 kB
        Ferdy Galema
      3. NUTCH-1189-v2.patch
        1.0 kB
        Lewis John McGibbney
      4. NUTCH-1189.patch
        0.8 kB
        Lewis John McGibbney

        Activity

        Hudson added a comment -

        Integrated in Nutch-nutchgora #240 (See https://builds.apache.org/job/Nutch-nutchgora/240/)
        NUTCH-1189 (Update gora.properties for HBase to reflect Gora 0.2) (Revision 1330744)

        Result = SUCCESS
        ferdy :
        Files :

        • /nutch/branches/nutchgora/conf/gora.properties
        Ferdy Galema added a comment -

        FYI: I just committed a change to update the HBaseStore properties section.

        Hudson added a comment -

        Integrated in Nutch-nutchgora #130 (See https://builds.apache.org/job/Nutch-nutchgora/130/)
        commit to address NUTCH-1189 and update to CHANGES.txt

        lewismc : http://svn.apache.org/viewvc/nutch/branches/nutchgora/viewvc/?view=rev&root=.&revision=1230234
        Files :

        • /nutch/branches/nutchgora/CHANGES.txt
        • /nutch/branches/nutchgora/conf/gora.properties
        Lewis John McGibbney added a comment -

        Committed @ revision 1230234 in Nutchgora branch

        Lewis John McGibbney added a comment -

        Final patch attachment for now. Hopefully we will revisit this issue when more data stores become available in Gora in the forthcoming months. Thanks Ferdy for the HBase commentary.

        Lewis John McGibbney added a comment -

        Yes, it certainly should. I have a couple of things to sort out just now, then I'll come back to this and get the other properties added in a commented fashion. After that we should be good to fire this one off.
        Thanks for now.

        Ferdy Galema added a comment -

        Hi Lewis,

        I took the liberty of adding the HBaseStore documentation to the patch. This should cover HBaseStore for now.

        Lewis John McGibbney added a comment -

        In addition, there is scope to provide a much richer information resource within this file, but I will get round to that later.

        Lewis John McGibbney added a comment -

        2nd edition added to acknowledge some pointers from the dev list. Admittedly I probably won't have much time to work on this over the next while, so I am attaching it before it gets lost.

        Ferdy Galema added a comment -

        I see what you're getting at. We could indeed include this most important 'hbase.zookeeper.quorum' Configuration property within gora.properties. But I'm not fully sure how to implement it. We could:

        A) Explicitly read this specific property in HBaseStore and paste it onto Configuration. (Requires modification of HBaseStore.)
        B) Make it flexible so that, for example, every property starting with 'hbaseconf.' will be pasted onto Configuration. (Requires modification of HBaseStore.)
        C) Even more generic, make it so that EVERY store benefits from Configuration property overrides. (This could be done in either Nutch or Gora code.)
        D) Just mention the fact in gora.properties.

        For now I feel that D is the best solution, because it requires the least modifications and fits perfectly within the scope of this issue. Also it keeps a strict separation between gora properties and Configuration properties. This will avoid any confusion. If you agree with me then you could just paste something along the lines of my previous comment into the properties file.
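
        For illustration only, option B could look roughly like the sketch below. It is not HBaseStore code: the class name PrefixedOverrides and the example host zk1.example.com are made up for this example, and a plain Map stands in for the Hadoop Configuration object so that the snippet needs nothing beyond the JDK. It collects every gora.properties entry whose key starts with 'hbaseconf.' and strips the prefix, leaving the pairs that would then be set on the Configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Gora or Nutch.
final class PrefixedOverrides {
    // Prefix suggested in option B of the comment above.
    static final String PREFIX = "hbaseconf.";

    // Returns the entries whose key starts with PREFIX, with the prefix removed.
    // Keys consisting of the bare prefix are skipped.
    static Map<String, String> extract(Properties goraProps) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (String key : goraProps.stringPropertyNames()) {
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                overrides.put(key.substring(PREFIX.length()), goraProps.getProperty(key));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("gora.datastore.default", "org.apache.gora.hbase.store.HBaseStore");
        // Assumed example value; replace with the real ZooKeeper host(s).
        props.setProperty("hbaseconf.hbase.zookeeper.quorum", "zk1.example.com");
        // Prints {hbase.zookeeper.quorum=zk1.example.com}
        System.out.println(extract(props));
    }
}
```

        In a real store each resulting pair would presumably be applied with Configuration.set(key, value) before the HBase connection is opened. That wiring is exactly the HBaseStore modification that option D avoids.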

        Lewis John McGibbney added a comment -

        Hi Ferdy, does this have any knock-on effect on what we would wish to include within gora.properties? I understand that you can manually add properties to your HBASEHOME/conf/hbase-site.xml; however, if you think any additional properties would add value to this patch, please re-submit the patch. Your usage of HBase far exceeds my use case, so please feel free.

        Ferdy Galema added a comment - - edited

        You are right Lewis, HBase does not need any special properties. For completeness though: when running Zookeeper on an address other than localhost:2181, one is required to specify it using the hbase.zookeeper.quorum property in the Configuration object. This is possible either by including it in a hbase-site.xml on the classpath or by setting it directly on the Configuration prior to instantiating the DataStore, although this is as yet only available in Gora 0.2 (for this see GORA-26, GORA-48).

        (sorry for the edits)
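
        As a rough aid for the first route (hbase-site.xml on the classpath), the sketch below checks whether such a file is visible to the class loader and reads hbase.zookeeper.quorum from it. It is a JDK-only diagnostic, not how HBase or Gora load their configuration. The class name HBaseSiteCheck is invented for this example, and the parser assumes the usual Hadoop layout of <property> elements with <name> and <value> children.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic for illustration; not part of HBase, Gora or Nutch.
final class HBaseSiteCheck {
    // Returns the value of the named property in a Hadoop-style XML file,
    // or null if the property is absent.
    static String readProperty(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Looks for hbase-site.xml at the root of the classpath.
        try (InputStream in = HBaseSiteCheck.class.getClassLoader()
                .getResourceAsStream("hbase-site.xml")) {
            if (in == null) {
                System.out.println("hbase-site.xml is not on the classpath");
            } else {
                System.out.println("hbase.zookeeper.quorum = "
                        + readProperty(in, "hbase.zookeeper.quorum"));
            }
        }
    }
}
```

        If the file is missing or the property is absent, a non-default quorum has to be supplied the other way, by setting hbase.zookeeper.quorum on the Configuration before the DataStore is created.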

        Lewis John McGibbney added a comment -

        So as far as I am aware, HBase doesn't need any additional properties specified within the gora.properties file; however, both the SQL and Cassandra stores do. By default the minimum properties for the SQL store are included, with extra security features commented out. Finally, all the 'expected' Cassandra properties are included and commented out by default. This is a work in progress to lower the barrier to entry for Cassandra users.

        Lewis John McGibbney added a comment -

        Ferdy, would it be possible for you to attach a patch for HBase (if required)? I will work on the Cassandra stuff, then hopefully we can knock our heads together with some others to get the remaining back ends included within the gora.properties file.


          People

          • Assignee:
            Lewis John McGibbney
            Reporter:
            Lewis John McGibbney
          • Votes:
            0
            Watchers:
            1
