Just to get a rough idea of performance, I uploaded one of my CSV test files (765MB, 100M docs, 7 small string fields per doc).
Time to complete indexing was 42% longer, and the transaction log grew to 1.8GB; the Lucene index itself was 1.2GB. The log was on the same device as the index, so the main impact may have been disk I/O.
I think this is far from what we can really do here. I haven't looked too closely at the code yet, but it seems you are doing blocking writes, which might not be ideal here at all. What we could do instead is allocate the space we need per record and write concurrently on a channel (see FileChannel#write(ByteBuffer src, long position)); the same is true for reads (FileChannel#read(ByteBuffer dst, long position)). All we need to keep in main memory for realtime get is the offset and the length of each record.
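A rough sketch of that idea, just to make it concrete: each append reserves its region with an AtomicLong, writes at an absolute position via FileChannel#write(ByteBuffer, long), and keeps only (offset, length) in memory for the realtime-get lookup. All names here (TLogSketch, LogPointer, append, lookup) are made up for illustration, not anything in Solr:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class TLogSketch implements AutoCloseable {
  /** Offset and length of one record -- all we keep in main memory. */
  static final class LogPointer {
    final long offset;
    final int length;
    LogPointer(long offset, int length) { this.offset = offset; this.length = length; }
  }

  private final FileChannel channel;
  private final AtomicLong filePointer = new AtomicLong();
  private final ConcurrentHashMap<String, LogPointer> pointers = new ConcurrentHashMap<>();

  private TLogSketch(FileChannel channel) { this.channel = channel; }

  public static TLogSketch open(Path path) {
    try {
      return new TLogSketch(FileChannel.open(path, StandardOpenOption.CREATE,
          StandardOpenOption.READ, StandardOpenOption.WRITE));
    } catch (IOException e) { throw new UncheckedIOException(e); }
  }

  /** Reserve space with getAndAdd, then write at an absolute position -- no global lock. */
  public void append(String id, byte[] record) {
    long start = filePointer.getAndAdd(record.length);
    ByteBuffer src = ByteBuffer.wrap(record);
    long pos = start;
    try {
      while (src.hasRemaining()) pos += channel.write(src, pos);
    } catch (IOException e) { throw new UncheckedIOException(e); }
    pointers.put(id, new LogPointer(start, record.length));
  }

  /** Realtime-get path: positional read driven by the in-memory pointer. */
  public byte[] lookup(String id) {
    LogPointer p = pointers.get(id);
    if (p == null) return null;
    ByteBuffer dst = ByteBuffer.allocate(p.length);
    long pos = p.offset;
    try {
      while (dst.hasRemaining()) {
        int n = channel.read(dst, pos);
        if (n < 0) throw new IOException("unexpected EOF");
        pos += n;
      }
    } catch (IOException e) { throw new UncheckedIOException(e); }
    return dst.array();
  }

  @Override public void close() throws IOException { channel.close(); }
}
```

Since both the writes and the reads are positional, no thread ever has to seek a shared file pointer, so appends and realtime gets can proceed concurrently.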
To take that one step further, it might be good to keep the already-serialized data around if possible: if a binary update is used, can we piggyback the bytes on the SolrInputDocument somehow? If not, I think we should use a faster hand-written serialization instead of Java serialization, which is proven to be freaking slow.
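A hand-written codec doesn't have to be fancy to beat java.io serialization; something as simple as a field count followed by length-prefixed UTF-8 strings would do. A minimal sketch (RecordCodec is a made-up name, and real documents obviously have typed field values, not just strings):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class RecordCodec {
  /** Write a document as: field count, then (name, value) pairs via writeUTF. */
  public static byte[] write(Map<String, String> fields) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bos);
      out.writeInt(fields.size());
      for (Map.Entry<String, String> e : fields.entrySet()) {
        out.writeUTF(e.getKey());
        out.writeUTF(e.getValue());
      }
      out.flush();
      return bos.toByteArray();
    } catch (IOException e) { throw new UncheckedIOException(e); }
  }

  /** Read the same layout back; no reflection, no class metadata in the stream. */
  public static Map<String, String> read(byte[] bytes) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
      int n = in.readInt();
      Map<String, String> fields = new LinkedHashMap<>();
      for (int i = 0; i < n; i++) fields.put(in.readUTF(), in.readUTF());
      return fields;
    } catch (IOException e) { throw new UncheckedIOException(e); }
  }
}
```

The win over ObjectOutputStream is that none of the class descriptors and handle bookkeeping end up in the stream, so the records are both smaller and cheaper to decode.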
Another, totally different idea for RT get is to spend more time on a RAM reader that is capable of doing exact seeks on the BytesRefHash we use anyway. I don't think this would be too far away, since the biggest problem here is providing an efficiently sorted dictionary. Maybe this should be a long-term goal for the RT get feature.
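The shape of that exact seek would be a binary search over a sorted view of the hash's term ids. Plain String/int arrays stand in here for the BytesRefHash and its byte pool, and the class name RamTermDict is invented, so this is only a sketch of the lookup, not of the Lucene plumbing:

```java
import java.util.Arrays;
import java.util.Comparator;

public class RamTermDict {
  private final String[] terms;      // id -> term; stand-in for the hash's byte pool
  private final Integer[] sortedIds; // ids ordered by term, the "sorted dictionary"

  public RamTermDict(String[] terms) {
    this.terms = terms;
    sortedIds = new Integer[terms.length];
    for (int i = 0; i < terms.length; i++) sortedIds[i] = i;
    // Sorting ids by their terms gives the dictionary order without moving bytes.
    Arrays.sort(sortedIds, Comparator.comparing((Integer id) -> this.terms[id]));
  }

  /** Exact seek: binary search the sorted ids; returns the term id or -1. */
  public int exactSeek(String term) {
    int lo = 0, hi = sortedIds.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      int cmp = terms[sortedIds[mid]].compareTo(term);
      if (cmp < 0) lo = mid + 1;
      else if (cmp > 0) hi = mid - 1;
      else return sortedIds[mid];
    }
    return -1;
  }
}
```

The point is that only the small id array needs sorting; the term bytes stay where the indexing chain already put them.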
Since we are already doing write-behind here, we could also try some compression, especially if the source data is large. I'm not sure that will pay off, though, since we are not keeping the logs around forever.
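If we tried it, the JDK's Deflater at its fastest level would be the obvious first experiment, since write-behind means the CPU cost is off the indexing path anyway. A sketch, assuming per-record compression (whether that granularity pays off is exactly the open question):

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class LogCompression {
  /** Compress one record with the cheapest deflate level. */
  public static byte[] compress(byte[] data) {
    Deflater deflater = new Deflater(Deflater.BEST_SPEED);
    deflater.setInput(data);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) out.write(buf, 0, deflater.deflate(buf));
    deflater.end();
    return out.toByteArray();
  }

  /** Inflate a record back for the realtime-get / recovery path. */
  public static byte[] decompress(byte[] data) {
    try {
      Inflater inflater = new Inflater();
      inflater.setInput(data);
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      while (!inflater.finished()) out.write(buf, 0, inflater.inflate(buf));
      inflater.end();
      return out.toByteArray();
    } catch (DataFormatException e) { throw new IllegalStateException(e); }
  }
}
```

Small string-field documents compress well because field names repeat across every record, but the decompress cost lands on the realtime-get path, which is the trade-off to measure.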
Eventually I think this should be a feature that lives outside of Solr, since many Lucene applications could make use of it. ElasticSearch, for instance, has pretty similar features that could be adapted into something like a DurableIndexWriter wrapper.