Nice catch. PartitionUrlByHost does indeed seem broken.
I suggest we use the existing o.a.n.crawl.URLPartitioner class, which supports three URL partition modes (host, domain, IP) and is already used by the GeneratorJob.
Pros: support for different partition modes in the Fetcher + no duplicate code.
Or is there a reason why the Fetcher has its own partition logic?
Note that URLPartitioner is a Partitioner<SelectorEntry, WebPage> rather than a Partitioner<IntWritable, FetchEntry>, so the Fetcher cannot use it as is. You could perhaps extract the shared logic into a method that both classes call, or create one URLPartitioner with two inner classes, one for the Generator and one for the Fetcher.
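
To make the second option concrete, here is a rough sketch of one outer class holding the shared logic plus two thin inner classes. It is not the actual Nutch code: UrlPartitionSketch, SimplePartitioner, GeneratorSide, FetcherSide, partitionFor and naiveDomain are all hypothetical names, the key/value types are plain-Java stand-ins for SelectorEntry/WebPage and IntWritable/FetchEntry, and where each job really keeps its URL is an assumption. The IP mode is left out because it needs a DNS lookup, and the domain extraction is deliberately naive.

```java
import java.net.URI;
import java.util.Locale;

public class UrlPartitionSketch {

    /** Partition modes covered by this sketch (IP mode omitted, see above). */
    enum Mode { HOST, DOMAIN }

    /** Stand-in for the Hadoop Partitioner contract, so the sketch has no dependencies. */
    interface SimplePartitioner<K, V> {
        int getPartition(K key, V value, int numPartitions);
    }

    /** Shared logic: map a URL to a partition number in [0, numPartitions). */
    static int partitionFor(String url, Mode mode, int numPartitions) {
        String key = url; // fall back to the raw string if no host can be parsed
        try {
            String host = URI.create(url).getHost();
            if (host != null) {
                key = host.toLowerCase(Locale.ROOT);
                if (mode == Mode.DOMAIN) {
                    key = naiveDomain(key);
                }
            }
        } catch (IllegalArgumentException e) {
            // malformed URL: keep the raw string as the partition key
        }
        // Mask the sign bit so the result is never negative
        // (the same idiom Hadoop's HashPartitioner uses).
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Naive domain: last two labels of the host. Real code would use a public-suffix list. */
    static String naiveDomain(String host) {
        String[] parts = host.split("\\.");
        int n = parts.length;
        return n <= 2 ? host : parts[n - 2] + "." + parts[n - 1];
    }

    /** Generator side: assumes the URL is carried by the key (String stands in for SelectorEntry). */
    static class GeneratorSide implements SimplePartitioner<String, Object> {
        private final Mode mode;
        GeneratorSide(Mode mode) { this.mode = mode; }
        @Override
        public int getPartition(String urlKey, Object page, int numPartitions) {
            return partitionFor(urlKey, mode, numPartitions);
        }
    }

    /** Fetcher side: assumes the URL is carried by the value (String stands in for FetchEntry). */
    static class FetcherSide implements SimplePartitioner<Integer, String> {
        private final Mode mode;
        FetcherSide(Mode mode) { this.mode = mode; }
        @Override
        public int getPartition(Integer key, String urlValue, int numPartitions) {
            return partitionFor(urlValue, mode, numPartitions);
        }
    }

    public static void main(String[] args) {
        int n = 4;
        int a = new GeneratorSide(Mode.HOST).getPartition("http://example.org/a", null, n);
        int b = new FetcherSide(Mode.HOST).getPartition(7, "http://example.org/b", n);
        System.out.println("same host, same partition: " + (a == b));
    }
}
```

With this layout both jobs would agree on the partition for a given URL, and the partition mode would be picked in one place.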