Apache Gobblin / GOBBLIN-134

Config management regression test


Details

    • Task
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      Github Url : https://github.com/linkedin/gobblin/pull/758
      Github Reporter : tuGithub
      Github Assignee : chavdar
      Github Created At : 2016-02-24T23:59:33Z
      Github Updated At : 2017-04-22T18:45:51Z

      Comments


      tuGithub wrote on 2016-02-25T00:00:20Z : @chavdar, @sahilTakiar, please review it when you have time.

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-188526386


      stakiar wrote on 2016-02-29T21:04:34Z : Can you add a description of what this PR entails?

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190391806


      stakiar wrote on 2016-02-29T21:31:56Z : High Level Comment:

      • Is there any way we can get this to run via `TestNG` and not via Azkaban?
      • The current implementation seems to require running a job via Azkaban, which means these tests won't run unless someone manually uploads the job to Azkaban and runs it.
      • However, while these are integration tests, I don't see anything that requires them to run on Azkaban or on a real HDFS cluster; they could simply run on top of the local file system (e.g. `FileSystem.getLocal()`).
      • The advantage would be that these tests get run during each build, which will help us catch any unexpected bugs earlier
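
A minimal sketch of the idea in the bullets above: a file-backed store exercised against a throwaway local directory, so the check can run in every build. It uses `java.nio.file` as a dependency-free stand-in for Hadoop's local `FileSystem`, and the class name, the `node.properties` file name, and the node layout are all hypothetical, not taken from the PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical stand-in for a file-backed config store; not a Gobblin class.
class LocalConfigStoreSketch {
    private final Path root;

    LocalConfigStoreSketch(Path root) {
        this.root = root;
    }

    // Loads the properties stored for one node, e.g. "tags/tag1".
    Properties load(String nodePath) {
        Properties props = new Properties();
        Path file = root.resolve(nodePath).resolve("node.properties");
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Builds a throwaway single-node store under a temp directory,
    // the way a test fixture could before each test method.
    static LocalConfigStoreSketch withSingleNode(String nodePath, String contents) {
        try {
            Path root = Files.createTempDirectory("config-store-test");
            Path nodeDir = Files.createDirectories(root.resolve(nodePath));
            Files.write(nodeDir.resolve("node.properties"),
                    contents.getBytes(StandardCharsets.UTF_8));
            return new LocalConfigStoreSketch(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a TestNG suite, the fixture call would typically sit in a setup method and the `load` call in a test method; no Azkaban upload or cluster is involved.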

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190405625


      tuGithub wrote on 2016-03-01T18:28:41Z : Added the design doc in the ticket: https://jira01.corp.linkedin.com:8443/browse/ETL-4050

      If we only test through TestNG unit tests, we are not actually testing the HDFS store, especially through the config client.

      @chavdar , please advise.

      Min

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190843030


      chavdar wrote on 2016-03-02T00:48:45Z : In general, unit tests should be used for testing a single component in isolation. Dependencies should be mocked or use simplified test versions.
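
To illustrate the unit-test side of this distinction, here is a small sketch in which a component depends on an interface and the test supplies a simplified in-memory version of that dependency. Every name here (`ConfigSourceSketch`, `InMemoryConfigSourceSketch`, `RetentionPolicySketch`, the `retention.days` key, and the default of 30) is an assumption made for illustration, not part of Gobblin.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical dependency: anything that can look up a raw config value.
interface ConfigSourceSketch {
    Optional<String> get(String key);
}

// Simplified test version of the dependency, backed by a map instead of HDFS.
class InMemoryConfigSourceSketch implements ConfigSourceSketch {
    private final Map<String, String> values = new HashMap<>();

    // Fluent setter so a test can build its fixture in one expression.
    InMemoryConfigSourceSketch with(String key, String value) {
        values.put(key, value);
        return this;
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(values.get(key));
    }
}

// Hypothetical component under test: reads an integer setting with a default.
class RetentionPolicySketch {
    private final ConfigSourceSketch source;

    RetentionPolicySketch(ConfigSourceSketch source) {
        this.source = source;
    }

    int retentionDays() {
        // Falls back to 30 when the key is absent (an arbitrary default for this sketch).
        return source.get("retention.days").map(Integer::parseInt).orElse(30);
    }
}
```

Because the component only sees the interface, the same test logic could later be pointed at a real store in an integration test without changing the component.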

      Integration tests should use multiple realistic components, potentially running on different Hadoop clusters. Eventually, we may have the tooling to set up one or more local Hadoop clusters for testing, but we don't have that right now. Loading a test on Azkaban and running it there should be a temporary solution.

      What this PR is missing is a high-level description of how the integration test framework works. It seems like RegressionTest is the entry point, but I am still trying to figure this out.

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190995520


      chavdar wrote on 2016-03-02T00:50:49Z : Do these tests rely on manual deployment of the config files on HDFS? If yes, we should integrate it with the config deployment code.

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190996362


      stakiar wrote on 2016-03-02T20:41:15Z : @chavdar why are tests added to TestNG restricted to unit tests?

      • The current store implementations only require working with a `FileSystem`; I'm not sure why we can't use the `SimpleLocalHDFSConfigStoreFactory` for this
      • We do have the ability to set up HDFS clusters using the [MiniClusters](https://wiki.apache.org/hadoop/HowToDevelopUnitTests); `GobblinYarnAppLauncherTest` is already doing this
      • This should also allow us to spawn multiple clusters, I'm not sure why these integration tests need that though
      • I've used `MiniDFSCluster` before and it works well
      • Getting this to run via TestNG also ensures the tests get run on each build
      • Correct me if I am wrong but we currently have to run the integration tests on Azkaban manually; and even if we did automate runs on Azkaban that would only happen internally

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-191420336


      tuGithub wrote on 2016-03-08T22:38:47Z : High-level structure:

      1. RegressionTest is the entry point of the test.
      2. Based on the configuration, ExpectedResultBuilder builds all the expected results by parsing the JSON file expected.conf for each node.
      3. Validation happens at multiple levels:
         A. through SimpleHDFSConfigStore
         B. through the InMemoryTopology and InMemoryValueInspector
         C. through the ConfigClient
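
The structure described above could be sketched roughly as follows: one table of expected values per node, checked against several access layers in turn. This is only an illustration under stated assumptions. The class and method names are hypothetical, the expected values are passed in as a map rather than parsed from expected.conf, and each layer is modeled as a plain function from node path to resolved values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of layered validation; none of these names come from the PR.
class LayeredValidationSketch {

    // Compares what each access layer returns for each node against the
    // expected values. Returns one description per mismatch; an empty
    // list means every layer agreed with the expectations.
    static List<String> validate(
            Map<String, Map<String, String>> expectedByNode,
            Map<String, Function<String, Map<String, String>>> layers) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, Function<String, Map<String, String>>> layer : layers.entrySet()) {
            for (Map.Entry<String, Map<String, String>> expected : expectedByNode.entrySet()) {
                Map<String, String> actual = layer.getValue().apply(expected.getKey());
                if (!expected.getValue().equals(actual)) {
                    mismatches.add(layer.getKey() + " @ " + expected.getKey()
                            + ": expected " + expected.getValue() + " but got " + actual);
                }
            }
        }
        return mismatches;
    }
}
```

Reporting all mismatches instead of failing on the first one makes it easier to see whether a problem is specific to one layer (for example, only the client) or shared by all of them.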

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-194001228


      abti wrote on 2017-01-12T03:18:49Z : @vasanthrajamani Can you please prioritize?

      Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-272065387

          People

            Unassigned Unassigned
            abti Abhishek Tiwari