Details
Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Description
Github Url : https://github.com/linkedin/gobblin/pull/758
Github Reporter : tuGithub
Github Assignee : chavdar
Github Created At : 2016-02-24T23:59:33Z
Github Updated At : 2017-04-22T18:45:51Z
Comments
tuGithub wrote on 2016-02-25T00:00:20Z : @chavdar, @sahilTakiar, please review this when you have time.
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-188526386
stakiar wrote on 2016-02-29T21:04:34Z : Can you add a description of what this PR entails?
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190391806
stakiar wrote on 2016-02-29T21:31:56Z : High Level Comment:
- Is there any way we can get this to run via `TestNG` rather than via Azkaban?
- The current implementation seems to require running a job via Azkaban, which means these tests won't run unless someone manually uploads the job to Azkaban and runs it.
- However, although these are integration tests, I don't see anything that requires them to run on Azkaban or against a real HDFS cluster. They could simply run on top of the local file system (e.g. `FileSystem.getLocal()`).
- The advantage would be that these tests run during each build, which would help us catch unexpected bugs earlier.
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190405625
tuGithub wrote on 2016-03-01T18:28:41Z : I added the design doc to the ticket: https://jira01.corp.linkedin.com:8443/browse/ETL-4050
If we only test through TestNG unit tests, we are not actually testing the HDFS store, especially when it is accessed through the config client.
@chavdar, please advise.
Min
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190843030
chavdar wrote on 2016-03-02T00:48:45Z : In general, unit tests should be used for testing a single component in isolation. Dependencies should be mocked or use simplified test versions.
Integration tests should use multiple realistic components, potentially running on different Hadoop clusters. Eventually, we may have the tooling to set up one or more local Hadoop clusters for testing, but we don't have that right now. Loading a test on Azkaban and running it there should be a temporary solution.
What this PR is missing is a high-level description of how the integration test framework works. It seems like RegressionTest is the entry point, but I am still trying to figure this out.
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190995520
chavdar wrote on 2016-03-02T00:50:49Z : Do these tests rely on manual deployment of the config files on HDFS? If yes, we should integrate it with the config deployment code.
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-190996362
stakiar wrote on 2016-03-02T20:41:15Z : @chavdar why are tests added to TestNG restricted to unit tests?
- The current store implementations only require working with a `FileSystem`, so I'm not sure why we can't use the `SimpleLocalHDFSConfigStoreFactory` for this.
- We do have the ability to set up HDFS clusters using the [MiniClusters](https://wiki.apache.org/hadoop/HowToDevelopUnitTests); `GobblinYarnAppLauncherTest` is already doing this.
- This should also allow us to spawn multiple clusters, although I'm not sure why these integration tests need that.
- I've used `MiniDFSCluster` before and it works well.
- Getting this to run via TestNG also ensures the tests get run on each build.
- Correct me if I am wrong, but we currently have to run the integration tests on Azkaban manually, and even if we automated runs on Azkaban, that would only happen internally.
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-191420336
tuGithub wrote on 2016-03-08T22:38:47Z : High level structure:
1. RegressionTest is the entry point of the test
2. Based on the configuration, ExpectedResultBuilder builds all the expected results by parsing the JSON file expected.conf for each node
3. There are multiple levels of validation:
   A. Validate through SimpleHDFSConfigStore
   B. Validate through the InMemoryTopology and InMemoryValueInspector
   C. Validate through the ConfigClient
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-194001228
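For readers following the thread, the structure described above can be pictured with a small stand-alone sketch. This is not the PR's code. `RegressionSketch`, `ConfigReader`, `MapReader`, `props`, `reader` and `validateAll` are hypothetical names, the in-memory maps are stand-ins for the three access paths (store, in-memory topology, config client), and the expected results are hard-coded instead of being parsed from expected.conf. It only illustrates the idea of checking one set of expected per-node results against several independent read paths.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RegressionSketch {

    /** Hypothetical abstraction: one way of resolving the config of a node. */
    interface ConfigReader {
        String name();
        Map<String, String> read(String node);
    }

    /** Stand-in reader backed by an in-memory map instead of a real store or client. */
    static final class MapReader implements ConfigReader {
        private final String name;
        private final Map<String, Map<String, String>> data;

        MapReader(String name, Map<String, Map<String, String>> data) {
            this.name = name;
            this.data = data;
        }

        public String name() {
            return name;
        }

        public Map<String, String> read(String node) {
            Map<String, String> value = data.get(node);
            // An unknown node resolves to an empty config in this sketch.
            return value == null ? new LinkedHashMap<String, String>() : value;
        }
    }

    static ConfigReader reader(String name, Map<String, Map<String, String>> data) {
        return new MapReader(name, data);
    }

    /** Builds a property map from alternating key/value arguments. */
    static Map<String, String> props(String... keyValues) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            result.put(keyValues[i], keyValues[i + 1]);
        }
        return result;
    }

    /** Checks every expected node against every reader and collects the mismatches. */
    static List<String> validateAll(Map<String, Map<String, String>> expected,
                                    List<ConfigReader> readers) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : expected.entrySet()) {
            for (ConfigReader r : readers) {
                Map<String, String> actual = r.read(entry.getKey());
                if (!entry.getValue().equals(actual)) {
                    failures.add(r.name() + " mismatch at " + entry.getKey()
                            + ": expected " + entry.getValue() + " but got " + actual);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Hard-coded expected results; the node path and property are made up.
        Map<String, Map<String, String>> expected = new LinkedHashMap<>();
        expected.put("/data/tracking", props("retention", "30d"));

        // Three validation levels, all served from the same data in this sketch.
        List<ConfigReader> readers = new ArrayList<>();
        readers.add(reader("store", expected));
        readers.add(reader("topology", expected));
        readers.add(reader("client", expected));

        List<String> failures = validateAll(expected, readers);
        System.out.println(failures.isEmpty() ? "all levels agree" : failures.toString());
    }
}
```

Collecting mismatches in a list, rather than failing on the first one, makes it easier to see whether a problem is confined to one access path or shared by all of them.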
abti wrote on 2017-01-12T03:18:49Z : @vasanthrajamani Can you please prioritize?
Github Url : https://github.com/linkedin/gobblin/pull/758#issuecomment-272065387