Spot (Retired) / SPOT-171

[ML] No suspicious connections running spot-ml

Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Information Provided

    Description

      Hello,

      I'm trying spot-ml with a large pcap capture. The capture is from

      https://mcfp.felk.cvut.cz/publicDatasets/CTU-Mixed-Capture-1/

      After ingesting the flow and DNS data into Apache Spot, the spot-ml results are always empty. This is how I ran spot-ml:

      ./ml_ops.sh 20170601 flow 1e-20 400
      17/06/01 15:15:00 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
      Deleted /user/spot/flow/scored_results/20170601/scores
      17/06/01 15:15:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      17/06/01 15:15:03 INFO Remoting: Starting remoting
      17/06/01 15:15:03 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@172.18.0.4:41981]
      17/06/01 15:15:13 WARN spark.SparkContext: Dynamic Allocation and num executors both set, thus dynamic allocation disabled.
      17/06/01 15:15:19 INFO SuspiciousConnectsAnalysis: Loading data from: /user/spot/flow/hive/y=2017/m=06/d=01/
      17/06/01 15:15:29 INFO SuspiciousConnectsAnalysis: Starting flow suspicious connects analysis.
      17/06/01 15:15:30 INFO SuspiciousConnectsAnalysis: Fitting probabilistic model to data
      17/06/01 15:15:30 INFO SuspiciousConnectsAnalysis: Training netflow suspicious connects model from /user/spot/flow/hive/y=2017/m=06/d=01/
      17/06/01 15:15:32 INFO SuspiciousConnectsAnalysis: 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
      17/06/01 15:15:32 INFO SuspiciousConnectsAnalysis: calculating byte cuts ...
      17/06/01 15:15:35 INFO SuspiciousConnectsAnalysis: 93.0,146.0,204.0,288.0,313.0,630.0,905.0,1581.0,2840.0,9.5489883E7
      17/06/01 15:15:35 INFO SuspiciousConnectsAnalysis: calculating pkt cuts
      17/06/01 15:15:36 INFO SuspiciousConnectsAnalysis: 2.0,4.0,6.0,9.0,164961.0
      17/06/01 15:15:43 INFO SuspiciousConnectsAnalysis: Running Spark LDA with params alpha = 1.02 beta = 1.001 Max iterations = 20 Optimizer = em
      [Stage 61:==================================================> (187 + 3) / 200]17/06/01 15:16:20 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
      17/06/01 15:16:20 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
      17/06/01 15:23:15 INFO SuspiciousConnectsAnalysis: Identifying outliers
      17/06/01 15:23:15 INFO SuspiciousConnectsAnalysis: Netflow suspicious connects analysis completed.
      17/06/01 15:23:15 INFO SuspiciousConnectsAnalysis: Saving results to : /user/spot/flow/scored_results/20170601/scores
      17/06/01 15:23:55 WARN SuspiciousConnectsAnalysis: Saving invalid records to /user/spot/flow/scored_results/20170601/scores/invalid_records
      SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
      SLF4J: Defaulting to no-operation (NOP) logger implementation
      SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
      17/06/01 15:23:57 WARN SuspiciousConnectsAnalysis: Total records discarded due to NULL values in key fields: 12 . Please go to /user/spot/flow/scored_results/20170601/scores/invalid_records for more details.

      real 8m56.911s
      user 1m41.616s
      sys 0m11.856s
      root@hadoop-master:~/incubator-spot/spot-ml# hadoop fs -ls /user/spot/flow/scored_results/20170601/scores
      Found 3 items
      -rw-r--r-- 2 root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/_SUCCESS
      -rw-r--r-- 2 root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/flow_results.csv
      drwxr-xr-x - root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/invalid_records
      root@hadoop-master:~/incubator-spot/spot-ml#
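
      To make the "empty results" observation concrete, here is a minimal Python sketch that scans a `hadoop fs -ls` listing like the one above and reports regular files whose size column is 0. The helper name `zero_byte_files` is hypothetical, and the column layout (permissions, replication, owner, group, size, date, time, path) is assumed from the listing shown here; paths containing spaces are not handled.

```python
def zero_byte_files(listing: str) -> list:
    """Return paths of regular files whose size column is 0 in `hadoop fs -ls` output."""
    empty = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed columns: perms, replication, owner, group, size, date, time, path.
        # Skip the "Found N items" header and directory entries.
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        if fields[4] == "0":
            empty.append(fields[7])
    return empty


# Listing copied from the terminal output in this issue.
listing = """Found 3 items
-rw-r--r-- 2 root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/_SUCCESS
-rw-r--r-- 2 root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/flow_results.csv
drwxr-xr-x - root supergroup 0 2017-06-01 15:23 /user/spot/flow/scored_results/20170601/scores/invalid_records
"""

for path in zero_byte_files(listing):
    print("empty:", path)
```

      `_SUCCESS` is a marker file and is expected to be empty; the relevant entry is `flow_results.csv`, which has size 0 even though the job reported completion.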

      With the DNS capture I don't get any suspicious connections either. Is this normal behaviour? The ML job itself seems to run fine, since it goes through more than 1000 stages. For DNS ML, what does USER_DOMAIN_CMD mean?
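
      One possible reading of the empty output, stated as an assumption rather than a confirmed diagnosis: if the third argument to ml_ops.sh (1e-20 above) acts as a score threshold and the fourth (400) as a cap on the number of rows returned, then a very small threshold could filter out every record. The Python sketch below only illustrates that filtering idea; it is not spot-ml code, and `select_suspicious` and the sample scores are hypothetical.

```python
from typing import List


def select_suspicious(scores: List[float], threshold: float, max_results: int) -> List[float]:
    """Keep scores strictly below the threshold, lowest first, capped at max_results."""
    below = sorted(s for s in scores if s < threshold)
    return below[:max_results]


# Made-up scores for illustration only; they do not come from the run above.
sample_scores = [3.2e-4, 1.5e-7, 8.0e-3, 2.1e-9]

# Nothing is below 1e-20, so the strict run selects no rows.
strict = select_suspicious(sample_scores, threshold=1e-20, max_results=400)

# With a permissive threshold every row passes, up to the cap.
loose = select_suspicious(sample_scores, threshold=1.0, max_results=400)

print(len(strict), len(loose))
```

      Under that assumption, rerunning with a larger threshold would be a quick way to tell whether the model scored nothing or the threshold simply removed everything.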

      Thanks,

          People

            rabarona Ricardo Barona
            jpferrero Jose Pablo Ferrero