Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Labels: None
Description
About generator.SparkDriver, bigpetstore-spark/README.md says:
You will need to change the master if you want to run on a cluster.
In practice, however, running it on a cluster succeeds but produces empty (0-byte) output files:
[sekikn@mobile bigpetstore-spark]$ HADOOP_CONF_DIR=/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop spark-submit --master yarn-cluster --class org.apache.bigtop.bigpetstore.spark.generator.SparkDriver build/libs/bigpetstore-spark-1.1.0-SNAPSHOT-all.jar generated_data 10 1000 365.0 345
(snip)
15/11/18 00:12:30 INFO Client: Application report for application_1447772975157_0003 (state: FINISHED)
15/11/18 00:12:30 INFO Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 192.168.0.4
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1447773097856
    final status: SUCCEEDED
    tracking URL: http://mobile.local:8088/proxy/application_1447772975157_0003/
    user: sekikn
15/11/18 00:12:30 INFO ShutdownHookManager: Shutdown hook called
15/11/18 00:12:30 INFO ShutdownHookManager: Deleting directory /private/var/folders/n2/1bnspz7j4q7100jmh610zd200000gn/T/spark-ccfcde0c-ea95-4361-b2b3-b709a92bee59
[sekikn@mobile bigpetstore-spark]$ hdfs dfs -ls generated_data/transactions
15/11/18 00:13:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r-- 3 sekikn supergroup 0 2015-11-18 00:12 generated_data/transactions/_SUCCESS
-rw-r--r-- 3 sekikn supergroup 0 2015-11-18 00:12 generated_data/transactions/part-00000
-rw-r--r-- 3 sekikn supergroup 0 2015-11-18 00:12 generated_data/transactions/part-00001
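This happens because simulationLength is a variable rather than a constant, and inside the RDD function it is always -1: the value assigned on the driver is apparently not visible to the tasks running on the cluster. It must be either a constant or a broadcast variable.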
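The broadcast approach could look roughly like the sketch below. This is not the actual SparkDriver source: the object BroadcastSketch, the method lengthsSeenByTasks, and the nStores parameter are hypothetical names used for illustration, and the mutable field with a -1.0 default is only assumed to mirror the situation described above. The value is copied into a broadcast variable on the driver, and the RDD function reads it from there instead of from the field.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the driver; not the actual SparkDriver code.
object BroadcastSketch {
  // Mutable field with a placeholder default (assumed to mirror the bug report).
  var simulationLength: Double = -1.0

  // Returns the simulation length observed inside the RDD function, once per store.
  def lengthsSeenByTasks(sc: SparkContext, nStores: Int): Array[Double] = {
    // Copy the driver-side value into a broadcast variable before building the RDD.
    val lengthBc = sc.broadcast(simulationLength)
    sc.parallelize(1 to nStores, 2)
      .map(_ => lengthBc.value) // reads the broadcast value, not the object field
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Parsed on the driver, e.g. 365.0 from the command line.
    simulationLength = args(0).toDouble
    // The master is expected to come from spark-submit (--master ...).
    val sc = new SparkContext(new SparkConf().setAppName("BroadcastSketch"))
    try println(lengthsSeenByTasks(sc, 10).mkString(","))
    finally sc.stop()
  }
}
```

For a single small value, copying the field into a local val before the map call should have a similar effect, since a local val referenced by the closure is serialized along with it; a broadcast variable makes the intent explicit.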