Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Environment: 9-node cluster, 256 GB RAM and 48 virtual CPUs per node; Spark running with 43 executors, 30 GB memory and 8 cores per executor.
- Attachment: Patch
Description
When running spot-ml on flow data of roughly 1 TB, the last job, count at InvalidDataHandler.scala:43, takes 3.3 hours. After reviewing the Spark UI, I noticed that this count re-executes the same join that was already performed by the map job at FlowSuspiciousConnectsAnalysis.scala:42 (6.4 hours).
The count at InvalidDataHandler.scala:43 filters corrupt records out of scoredFlowRecords, and the join appears to be recomputed rather than reused.
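Spark recomputes a DataFrame's full lineage for every action unless the DataFrame is persisted, so running both the scoring job and the corrupt-record count against an unpersisted scoredFlowRecords repeats the upstream join. A minimal sketch of the usual remedy, assuming hypothetical names and filter predicates (the real spot-ml code paths and the actual patch may differ):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.storage.StorageLevel

// scoredFlowRecords stands in for the output of the expensive join/score
// stage; the name is taken from the description, the logic is illustrative.
def splitScoredRecords(scoredFlowRecords: DataFrame): Unit = {
  // Persist once so every subsequent action reuses the computed partitions
  // instead of re-running the join from FlowSuspiciousConnectsAnalysis.
  scoredFlowRecords.persist(StorageLevel.MEMORY_AND_DISK)

  // First action: materializes the lineage (the expensive 6.4-hour job).
  // "score" and its invalid-record predicate are hypothetical here.
  val invalidCount = scoredFlowRecords.filter("score < 0").count()

  // Second action: reads the cached partitions instead of re-joining,
  // avoiding the extra 3.3-hour recomputation reported above.
  val validCount = scoredFlowRecords.filter("score >= 0").count()

  println(s"invalid=$invalidCount valid=$validCount")
  scoredFlowRecords.unpersist()
}
```

MEMORY_AND_DISK is a reasonable default for a ~1 TB dataset on 30 GB executors, since partitions that do not fit in memory spill to local disk rather than being dropped and recomputed.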
Steps to reproduce:
- Start spot-ml for flow data with a data set of 1 TB or larger.
- Wait until it completes and check how long the last job took.
Attachments
Issue Links
- links to