Notes so far:
- sstable filenames are controlled by the io/sstable/Descriptor class, which encapsulates a few parameters including "generation" – the increasing integer in question.
- dropping generation in favor of a UUID seems questionable, given that generation is used by a wide variety of clients in the codebase. So the most likely approach is UUID + generation side by side.
- using the host id as the UUID is conceptually easy, but it would violate layering, because code in io would start to depend on db and/or service. There is also a potential bootstrapping problem: system sstables need to be initialized early during boot, and it's not clear whether the unique host id is available early enough to feed into system sstable descriptors.
- random UUIDs are also tricky, because sstable names would no longer be discoverable without directory lookups. Some code (particularly in unit tests) leans on the ability to synthesize sstable names without touching the filesystem. It's possible to persist these UUIDs in one of the system tables, but it would have to be a local table and, regardless, changing the system schema could make this a breaking change.
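To make the trade-off in the notes above concrete, here is a rough sketch of why a random UUID breaks name synthesis. SketchDescriptor and the filename layout are made up for illustration; this is not the real io/sstable/Descriptor and not Cassandra's actual naming format.

```java
import java.util.UUID;

/** Hypothetical stand-in for a descriptor; NOT the real io/sstable/Descriptor. */
final class SketchDescriptor {
    final String keyspace;
    final String table;
    final int generation;
    final UUID uuid; // null means a legacy name without a UUID

    SketchDescriptor(String keyspace, String table, int generation, UUID uuid) {
        this.keyspace = keyspace;
        this.table = table;
        this.generation = generation;
        this.uuid = uuid;
    }

    /** Illustrative layout only: keyspace-table-[uuid-]generation-Data.db */
    String dataFileName() {
        String prefix = keyspace + "-" + table + "-";
        String id = (uuid == null) ? "" : uuid + "-";
        return prefix + id + generation + "-Data.db";
    }
}
```

With uuid == null the name follows from keyspace, table and generation alone, which is what the unit tests rely on. Once a random UUID is part of the name, a caller has to learn it from somewhere (a directory listing or a local system table) before it can build the same string.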
I haven't yet found a cost-effective fix that would involve actually modifying the existing naming scheme.
The latest idea I have is to create a directory that will hold symlinks to real sstables (symlinks are available in Java 7). Symlink names will contain the UUIDs. The only extra piece of code would be creating and tearing down symlinks when real sstables are created and deleted. End users could then access sstables through this symlink directory whenever doing related maintenance. The last piece would be making sure that appropriate clients, such as the compactor, can consume sstables with and without UUIDs.
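A minimal sketch of the symlink bookkeeping, using the java.nio.file API from Java 7. The SSTableSymlinks class, its method names and the "<uuid>-<original file name>" link naming are assumptions for illustration only; where the hooks would actually be called from in the sstable lifecycle is still open.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical helper: keeps a directory of UUID-named symlinks to real sstable files. */
final class SSTableSymlinks {
    private final Path linkDir;

    SSTableSymlinks(Path linkDir) throws IOException {
        // Creates the directory if needed; a no-op if it already exists.
        this.linkDir = Files.createDirectories(linkDir);
    }

    /** Assumed link naming: "<uuid>-<original file name>". */
    private Path linkPath(Path sstable, UUID id) {
        return linkDir.resolve(id + "-" + sstable.getFileName());
    }

    /** To be called after a real sstable file is created; returns the symlink path. */
    Path link(Path sstable, UUID id) throws IOException {
        return Files.createSymbolicLink(linkPath(sstable, id), sstable.toAbsolutePath());
    }

    /** To be called when the real sstable is deleted; removes only the link, not the target. */
    boolean unlink(Path sstable, UUID id) throws IOException {
        return Files.deleteIfExists(linkPath(sstable, id));
    }
}
```

The real sstable names stay untouched, so existing code that synthesizes names from the descriptor keeps working. Note that createSymbolicLink can throw UnsupportedOperationException or fail on platforms or filesystems without symlink support, which would need handling.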
I'll work on this some more tomorrow, but it'll probably spill into next week (or later).