Nutch / NUTCH-896

Gora-based tests need to have their own config files

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: nutchgora
    • Component/s: None
    • Labels:
      None

      Description

      The tests extending AbstractNutchTest (Injector, Generator, Fetcher) have hard-coded properties for GORA. It would be better to rely on a gora.properties file used only for the tests, just as we do with the nutch-*.xml config files (see CrawlTestUtil). That way the tests would not pick up the configs set in the main /conf, which could be specific to a given GORA backend, e.g. MySQL vs. HSQLDB. It would also make it easier to run the tests with a non-default GORA backend.

      We need to modify GORA and make the method DataStoreFactory.setProperties public.

          Activity

          Sertan Alkan added a comment -

          I am not quite sure why we chose to hide the setProperties method in o.g.s.DataStoreFactory, but instead of hard-coding the properties in AbstractNutchTest, I guess we could do one of the following:

          • We can place a different gora.properties file in src/test which includes these hard-coded settings and let this one be used by the test classes. This will require a slight change on the GORA side, as DataStoreFactory currently has no mechanism for selecting the resource to read properties from (though that will be a minor change in GORA). The problem with this is that every subclass of AbstractNutchTest currently uses its own database by setting a different jdbc.url. Is there a specific reason why every subclass needs a different database?
          • We could create a different properties file for each implementing test case and put these under src/test. This will again require the same change in GORA mentioned above.
          • Or, we can create a different configuration file containing these settings and add this file to the Configuration object. At some point, we are again going to need to import these settings into DataStoreFactory, possibly by changing the visibility of the setProperties method.

          I am leaning towards the first but any comments on the track are welcome.
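
As a rough illustration of the first option, the sketch below loads a shared test-only gora.properties from the classpath and overrides only the JDBC URL per test class, so each test can still get its own database. It is a sketch under stated assumptions, not the change that was made in Nutch or GORA: the class GoraTestProps is hypothetical, the key name gora.sqlstore.jdbc.url and the in-memory HSQLDB URL are assumptions to check against the backend in use, and handing the result to DataStoreFactory would still need the setProperties visibility change discussed above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of Nutch or GORA. */
public class GoraTestProps {

    // Assumed key name for the JDBC URL; verify it against your GORA version.
    static final String JDBC_URL_KEY = "gora.sqlstore.jdbc.url";

    /** Loads a properties resource from the classpath; empty if it is absent. */
    static Properties load(String resource) {
        Properties props = new Properties();
        ClassLoader cl = GoraTestProps.class.getClassLoader();
        try (InputStream in = cl.getResourceAsStream(resource)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Copies the shared settings and gives the named test its own database. */
    static Properties forTest(Properties shared, String testName) {
        Properties copy = new Properties();
        copy.putAll(shared);
        // Assumption: an in-memory HSQLDB database named after the test class.
        copy.setProperty(JDBC_URL_KEY, "jdbc:hsqldb:mem:" + testName);
        return copy;
    }

    public static void main(String[] args) {
        // A gora.properties under src/test would be found here if on the classpath.
        Properties shared = load("gora.properties");
        Properties props = forTest(shared, "TestInjector");
        System.out.println(props.getProperty(JDBC_URL_KEY));
        // These properties would then be passed to DataStoreFactory, which is
        // not done here because setProperties is not public.
    }
}
```

Keeping the per-test override in code means a single properties file can serve all subclasses of AbstractNutchTest, which sidesteps the need for one file per test case as in the second option.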

          Lewis John McGibbney added a comment -

          This has taken some time to get round to, but I am going to embark on the task of fixing o.a.n.storage.TestGoraStorage. We have some pretty nasty issues as described above, which need to be thought through before a fix is possible. I am also leaning towards the first option; having looked at the tests after compiling, it makes the most sense at the moment.
          Is it possible for someone to explain why there are different jdbc.url values? I am getting the following:

          lewis@lewis-01:~/ASF/nutchgora/runtime/local$ bin/nutch junit org.apache.nutch.storage.TestGoraStorage
          .E.E
          Time: 0.931
          There were 2 errors:
          1) testMultithread(org.apache.nutch.storage.TestGoraStorage)org.apache.gora.util.GoraException: java.io.IOException: java.sql.SQLTransientConnectionException: java.net.ConnectException: Connection refused
          	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:110)
          	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:118)
          	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:87)
          	at org.apache.nutch.storage.StorageUtils.createDataStore(StorageUtils.java:43)
          	at org.apache.nutch.storage.TestGoraStorage.setUp(TestGoraStorage.java:47)
          Caused by: java.io.IOException: java.sql.SQLTransientConnectionException: java.net.ConnectException: Connection refused
          	at org.apache.gora.sql.store.SqlStore.getConnection(SqlStore.java:747)
          	at org.apache.gora.sql.store.SqlStore.initialize(SqlStore.java:160)
          	at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:81)
          	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:104)
          	... 14 more
          

          This would suggest that there is a configuration error, i.e. that a service is not listening on a specific port.
          Can we simplify this in an attempt to get this particular test working again?
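
One quick way to confirm the "service not listening" theory is to try a plain TCP connection to the host and port taken from the configured jdbc.url. The sketch below does only that, using java.net.Socket. The class name PortCheck is hypothetical, and the defaults of localhost and port 9001 (the usual HSQLDB server port) are assumptions; substitute whatever the JDBC URL in use actually points at.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic helper; not part of Nutch or GORA. */
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts and unresolvable hosts alike.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the host and port from your own jdbc.url instead.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9001;
        System.out.println(host + ":" + port + " listening: "
                + isListening(host, port, 1000));
    }
}
```

If the check reports that nothing is listening, the test is most likely configured for a database server that has not been started, rather than failing inside GORA itself.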

          Lewis John McGibbney added a comment -

          Set and classify

          Ferdy Galema added a comment -

          Fixed with NUTCH-1205.


            People

            • Assignee: Julien Nioche
            • Reporter: Julien Nioche
            • Votes: 0
            • Watchers: 2
