Samza / SAMZA-598

Provide a tool to debug state stores

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.0
    • Component/s: kv
    • Labels: None

      Description

      There is currently no easy way to debug a Samza job when there are problems with the state store data. It would be nice to provide a tool that will:

      1. Connect to a local KV state store, and dump its content.
      2. Consume a changelog and materialize a state store locally.

      Optionally, it would be nice to allow users to fetch values for a given key via a REPL. This is a little tricky, though, since the key might be serialized in Protobuf, Avro, etc.

      Between (1) and (2), users will be able to inspect local stores of jobs that are running (1), and create a local copy of a state store for jobs that have lost their local state already (2).
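
As a rough illustration of items (1) and (2), the sketch below replays a changelog into an in-memory map and then dumps it. It is not the tool's implementation: the class and method names are hypothetical, keys and values are plain strings rather than serialized bytes, and it assumes that a null value in the changelog marks a delete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and types are illustrative, not Samza APIs.
final class ChangelogReplaySketch {

    /** One changelog message. Assumption: a null value marks a delete. */
    static final class Message {
        final String key;
        final String value;

        Message(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Replays messages in offset order, so the last write per key wins. */
    static TreeMap<String, String> materialize(List<Message> changelog) {
        TreeMap<String, String> store = new TreeMap<>();
        for (Message m : changelog) {
            if (m.value == null) {
                store.remove(m.key);
            } else {
                store.put(m.key, m.value);
            }
        }
        return store;
    }

    /** Dumps the store as key=value lines, in the map's iteration order. */
    static List<String> dump(Map<String, String> store) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : store.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        return lines;
    }
}
```

A real tool would read the messages from the changelog stream and write them into a local KV store instead of a TreeMap, but the replay logic would have the same shape.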

          Activity

          criccomini Chris Riccomini added a comment -

          This should probably look similar to CheckpointTool, in terms of both implementation and CLI usage.

          closeuris Yan Fang added a comment -

          If no one is working on this, I am going to take it.

          "users will be able to inspect local stores of jobs that are running (1)"

          The store is locked by the running process, so I do not think we can inspect the local stores of running jobs. Right? We can only inspect them after the job is done or killed.

          "create a local copy of a state store for jobs that have lost their local state already (2)"

          Users will have to provide the path where they want to put the local state, though that is a little error-prone. The reason is that, once YARN host affinity (SAMZA-617) is implemented, the local state will be stored in a location like

          ${yarn.nodemanager.localdirs}/usercache/${user}/appcache/application_${appid}/container_${contid}
          

          Reading the config file/stream will not provide enough information to build the path of the local state, because it is missing the applicationId and containerId. The only way is to provide the path manually.
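
To make the missing pieces concrete, here is a minimal sketch that fills in the path template above. The class, method, and parameter names are hypothetical; the point is only that the application ID and container ID have to be supplied by the user, since they are not in the job config.

```java
// Hypothetical sketch: fills in the path template quoted above.
final class LocalStatePathSketch {

    /** Builds the container directory; every part must be supplied by the caller. */
    static String containerDir(String localDir, String user, String appId, String containerId) {
        return String.format(
            "%s/usercache/%s/appcache/application_%s/container_%s",
            require("localDir", localDir),
            require("user", user),
            require("appId", appId),
            require("containerId", containerId));
    }

    // Rejects missing parts instead of silently producing a wrong path.
    private static String require(String name, String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(name + " must be provided");
        }
        return value;
    }
}
```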

          jghoman Jakob Homan added a comment -

          "Optionally, it would be nice to allow users to fetch values for a given key via a REPL. This is a little tricky, though, since the key might be serialized in Protobuf, Avro, etc."

          Gradle provides the scalaConsole target, which brings up the Scala REPL with all the class files loaded on the classpath. This would be a good entry point for such an interactive tool. That way the user would have full access to Scala and just need a few primitives to get to the state store. Very Sparky...
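
One possible shape for such a primitive is sketched below: a wrapper that takes a human-readable key, serializes it with a caller-supplied function, and looks up the raw bytes. Everything here is an assumption for illustration (the class name, the use of an in-memory map in place of the real store, and string-typed keys and values); a real tool would plug in the job's configured serdes.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: an in-memory map stands in for the byte-oriented store.
final class SerdeLookupSketch {
    // ByteBuffer compares by content, so it can serve as a map key for raw bytes.
    private final Map<ByteBuffer, byte[]> raw = new HashMap<>();
    private final Function<String, byte[]> keyToBytes;
    private final Function<byte[], String> valueFromBytes;

    SerdeLookupSketch(Function<String, byte[]> keyToBytes, Function<byte[], String> valueFromBytes) {
        this.keyToBytes = keyToBytes;
        this.valueFromBytes = valueFromBytes;
    }

    /** Stores already-serialized bytes, as the underlying store would hold them. */
    void putRaw(byte[] key, byte[] value) {
        raw.put(ByteBuffer.wrap(key.clone()), value.clone());
    }

    /** Serializes the key, looks it up, and deserializes the value (null if absent). */
    String get(String key) {
        byte[] value = raw.get(ByteBuffer.wrap(keyToBytes.apply(key)));
        return value == null ? null : valueFromBytes.apply(value);
    }
}
```

With a wrapper like this loaded in the REPL, the user only has to choose the serialization functions that match how the job wrote its keys and values.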

          criccomini Chris Riccomini added a comment -

          Yan Fang, feel free to take this ticket on. It sounds like most of the issues you've raised are for connecting to a local store (1) in the initial description, correct? I think (2) should be pretty do-able.

          For (1), I agree, it's problematic if you can't access the DB from a separate process. I haven't tried this, though. Are you sure it's permanently locked from outside files? Based on RocksDB's administration page, it sounds like they support some way to poke at the data from the CLI.

          closeuris Yan Fang added a comment -

          "Gradle provides the scalaConsole target, which brings up the Scala REPL with all the class files loaded on the classpath. This would be a good entry point for such an interactive tool."

          That sounds very promising. I will investigate it after finishing (1) and (2), maybe in another ticket.

          "For (1), I agree, it's problematic if you can't access the DB from a separate process. I haven't tried this, though. Are you sure it's permanently locked from outside files? Based on RocksDB's administration page, it sounds like they support some way to poke at the data from the CLI."

          I did some further investigation. I actually cannot find the ldb tool mentioned on the RocksDB administration page. However, after the 3.7 release, RocksDB has an openReadOnly method, which supports multiple read-only clients. This is what we need. Since we are currently using the 3.5.1 release, Samza needs to update to a later version (SAMZA-442).

          criccomini Chris Riccomini added a comment -

          Cool! If you want, we can break this out into two subtasks (listed above), and focus on the changelog restoration version of the tool first. The 3.9.1 RocksDB release was built against JDK8, so we can't pull that in. I checked with the FB folks a week or two ago, and they said that they were about to ship the next release. If it takes too long, we can always rebuild 3.9.1 against JDK6 and release that, so we can pull it in. I have access to do RocksDB JNI builds and Maven releases, and so does Naveen Somasundaram.

          navina Navina Ramesh added a comment -

          Yi Pan (Data Infrastructure), with SAMZA-625 and SAMZA-626, I think users can comfortably debug running jobs. Are we good to resolve this JIRA for now?

          nickpan47 Yi Pan (Data Infrastructure) added a comment -

          Hi, Navina Ramesh, I agree.

          nickpan47 Yi Pan (Data Infrastructure) added a comment -

          Closing as discussed in the comments.


            People

            • Assignee: closeuris Yan Fang
            • Reporter: criccomini Chris Riccomini