CASSANDRA-7001

Windows launch feature parity - augment launch process using PowerShell to match capabilities of *nix launching

    Details

      Description

      The current .bat-based launching has neither the logic nor robustness of a bash or PowerShell-based solution. In pursuit of making Windows a 1st-class citizen for C*, we need to augment the launch-process using something like PowerShell to get as close to feature-parity as possible with Linux.

      1. 7001_v1.txt
        35 kB
        Joshua McKenzie
      2. 7001_v2.txt
        34 kB
        Joshua McKenzie
      3. 7001_v3.txt
        33 kB
        Joshua McKenzie
      4. 7001_v4.txt
        34 kB
        Joshua McKenzie

        Activity

        JoshuaMcKenzie Joshua McKenzie added a comment -
        What works:
        • added +x to all .bat files so it's more convenient on cygwin
        • upped heap to 2G on legacy .bat startup
        • runtime determination of cassandra-env.ps1 based on similar rules as used in cassandra on *nix
        • background startup, no window popping up
        • foreground startup
        • pidfile watch and deletion
        • -H for heap dump output
        • -E for error file output
        • -verbose to print startup JVM params
        • new stop-server.bat to send control+c to running cassandra process for graceful shutdown
        • heap and new-gen based on available system memory
        • tested on Win8 and Win7
        What doesn't:
        • numactl. There's nothing analogous on Windows; we can look into winbase.h, WinNT.h and native access to kernel32.dll later if we feel this is worth pursuing (link: http://archive.msdn.microsoft.com/64plusLP)
        • MALLOC_ARENA_MAX. libc specific per-thread memory arena, nothing analogous on Windows.
        What wasn't tested:
        • All the optional / commented out params in cassandra-env.ps1, gc etc. Default options for the JVM on Windows should be in parity with linux w/the current config.
        Performance:
        • Got about an 8% bump on writes and 5% on reads from the various startup changes. Reads still look to be about 20% behind OSX when forcing buffered I/O across the board on both. On the plus side, write performance on buffered I/O was at parity with OSX before the changes and is 8% faster than OSX after.
        jbellis Jonathan Ellis added a comment -

        I'm okay with skipping numactl and malloc_arena_max.

        (Losing to OS X is embarrassing for anyone, so I'm still interested in optimizing the buffered path.)

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Yeah, that was my assumption on the OS X front. I don't have a bare metal linux install on this laptop so I couldn't compare apples-to-apples; I may end up going that route as I'd really like to know our benchmark of what we're up against regarding Windows performance. A concern more for later but I'm definitely curious.

        JoshuaMcKenzie Joshua McKenzie added a comment - - edited

        While doing some win7 bare-metal vs. ubuntu 14.04 bare-metal performance comparisons I came across the fact that my env changes broke VisualGC functionality. Took some digging, but it turns out that

        • the cygwin install I had on my previous win8 config was screwing with the TMP/TEMP environment variables and getting in the way of .NET's invoking of libraries from powershell. Clearing those variables wasn't strictly necessary, and...
        • if you clear TMP and TEMP before invoking a jvm, it doesn't know where to put some files that are necessary for jps, visualgc, and likely some other utilities to work.

        I've attached a v2 that fixes that.

        As for performance: Windows writes look to be about 6% slower than linux at saturation. On reads, Windows looks much more CPU-heavy than linux without memory-mapped I/O: the 4 HT haswell pegs at 100% at 54 threads on Windows vs. 181 on linux. However, with memory-mapped index file I/O, Windows read performance looks surprisingly good.

        I'm planning on shelving performance profiling until after cleaning up the unit and dtests, but I wanted to have a general baseline of where we are currently as well as confirm no performance regressions from this patch.

        Edit: I should clarify - the tests vs. OSX earlier had the stress client running on the same machine as the cassandra instance. Given Windows being CPU-bottlenecked on buffered I/O, the # of threads on the stress client was mucking with the results.

        jbellis Jonathan Ellis added a comment -

        I haven't found anyone to review a bunch of PowerShell code so I'm going to ask Ryan to get it tested and then we'll call it good.

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Jeremiah Jordan did some Powershell work while scripting Cassandra in the past - maybe he'd have some feedback?

        jjordan Jeremiah Jordan added a comment -

        Mine was all straight bat files. No powershell here.

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Attaching v3 - fixed -p functionality. Had it hard-coded to pid.txt before - oversight.

        Also cleaned up some of the output printing and added logic to check for bad file path passed to -p (permissions)
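        A minimal sketch of the kind of check described for the path passed to -p, written in Python for illustration rather than taken from the patch. The function name can_write_pidfile is hypothetical; the idea is simply to try opening the target for append and treat any OS-level failure (missing directory, permissions) as a bad path.

```python
import os


def can_write_pidfile(path):
    """Return True if a pid file could be created or appended to at path."""
    existed = os.path.exists(path)
    try:
        # Append mode creates the file if needed without truncating it.
        with open(path, "a"):
            pass
    except OSError:
        # Covers missing parent directories and permission errors.
        return False
    if not existed:
        # Do not leave an empty probe file behind.
        os.remove(path)
    return True
```

        A launcher could warn and continue when this returns False, which matches the reported behaviour of starting up with a warning on invalid output paths.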

        philipthompson Philip Thompson added a comment - - edited

        From a testing perspective this all looks good Josh. The dtests will start running soon when I get this ccm pull request to Sylvain. Give me a couple hours to get a link to that and the test plan up.

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Working w/Philip directly on https://github.com/josh-mckenzie/cassandra/tree/7001-trunk. A couple small issues (logback, cassandra-foreground) ironed out. I'll post a patch here once we've finished ironing out the kinks.

        philipthompson Philip Thompson added a comment -

        Summary of manual testing (dtest integration coming. See CASSANDRA-7202):
        All functionality was tested by running the .bat files through cmd and powershell, on both Windows 7 and Windows Server 2008.

        All of the command line arguments were functional (foreground, background, -p, -H, -E, -verbose, -help).
        System still started up correctly if invalid paths for output files were specified, but warning messages were displayed. An issue was found with the -p command line argument, which was fixed in v3 of the patch.

        Heap and new-gen memory were correctly set based on system memory (according to rules defined in comments in cassandra.ps1). The .bat file correctly handles problems with the PowerShell script execution policy.
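        For reference, a sketch of the sizing rule as it is documented in the *nix cassandra-env.sh: max heap is max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)) and new-gen is min(100 MB per core, 1/4 of the heap). That the PowerShell scripts use exactly these numbers is an assumption here; the ticket only says the rules are similar. The function name is hypothetical and the code is Python for illustration.

```python
def calculate_heap_sizes(system_memory_mb, cpu_cores):
    """Return (max_heap_mb, new_gen_mb) using the cassandra-env.sh style rule."""
    # Half of RAM, capped at 1 GB.
    half_ram_capped = min(system_memory_mb // 2, 1024)
    # A quarter of RAM, capped at 8 GB.
    quarter_ram_capped = min(system_memory_mb // 4, 8192)
    max_heap_mb = max(half_ram_capped, quarter_ram_capped)
    # Young generation: 100 MB per core, but never more than a quarter of the heap.
    new_gen_mb = min(100 * cpu_cores, max_heap_mb // 4)
    return max_heap_mb, new_gen_mb


if __name__ == "__main__":
    print(calculate_heap_sizes(16384, 4))
```

        Under this rule a 16 GB, 4-core machine would get a 4096 MB heap and a 400 MB new-gen.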

        For stop-server.bat, all command line arguments were tested, in both cmd and powershell, on both Windows 7 and Windows Server 2008. Stop-server.bat correctly handles receiving invalid pid files, as well as valid pid files containing invalid pids.
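        The first of those two cases (an invalid pid file) can be sketched as follows. This is an illustrative Python helper, not code from stop-server.bat; read_pid is a hypothetical name, and checking whether the pid belongs to a live process is deliberately left out because that step is platform-specific.

```python
def read_pid(pidfile_path):
    """Return the pid stored in pidfile_path, or None if it is missing or malformed."""
    try:
        with open(pidfile_path, "r", encoding="utf-8") as handle:
            text = handle.read().strip()
    except (OSError, UnicodeDecodeError):
        # Missing, unreadable, or not a text file.
        return None
    # Accept plain ASCII digits only; anything else is a malformed pid file.
    if not (text.isascii() and text.isdigit()):
        return None
    pid = int(text)
    return pid if pid > 0 else None
```

        A caller would treat None as an invalid pid file and report it instead of attempting to signal a process.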

        It should be noted that CASSANDRA-7202 is tracking improving ccm compatibility with this patch, at which point the bootstrap tests will be an effective test for this patch, with a few modifications. It will also ensure that Cassandra is fully functional using these startup scripts.

        jbellis Jonathan Ellis added a comment - - edited

        v3 patch actually fails to apply for me against 2.1 branch:

        patching file bin/cassandra.bat
        Hunk #1 FAILED at 18.
        Hunk #2 FAILED at 26.
        Hunk #3 FAILED at 125.
        
        JoshuaMcKenzie Joshua McKenzie added a comment -

        I'll rebase that later today.

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Looks like it was a line-ending issue from generating the file on windows. Without using dos2unix on it in cygwin it wouldn't apply but that broke application on linux (note: fun times). v4 passes --check and --stat on cassandra-2.1 on a linux box for me locally.
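        To illustrate the line-ending problem: a patch generated on Windows typically carries CRLF line endings, and normalizing them to LF is what dos2unix does. A small Python sketch of that normalization follows; the function names are hypothetical and this is not how the v4 patch was actually produced.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the byte string contains any Windows-style line ending."""
    return b"\r\n" in data


def crlf_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving existing LF endings alone."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    patch = b"--- a/bin/cassandra.bat\r\n+++ b/bin/cassandra.bat\r\n"
    print(has_crlf(patch), has_crlf(crlf_to_lf(patch)))
```

        Working on bytes rather than text avoids any implicit newline translation by the file layer.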

        jbellis Jonathan Ellis added a comment -

        committed!


          People

          • Assignee:
            JoshuaMcKenzie Joshua McKenzie
            Reporter:
            JoshuaMcKenzie Joshua McKenzie
            Reviewer:
            Philip Thompson
            Tester:
            Philip Thompson