Thanks Andrew Wang, I update the patch for your comments.
About hard limit.
I talked with Colin Patrick McCabe about compatibility offline ....
Yes, I agree you and Colin. In fact, when I added the hard limit, the first thing I considered is the compatibility. I did it based on two reasons: 1). as you said too, it's unlikely anyone's ever changed the max size. 2). The max limit is to restrict user's (user/trusted) namespace xattrs, and it doesn't break the existing HDFS features.
I felt it was OK to do this modification on trunk and branch-2 directly (Certainly we can still use 4 bytes, but I wanted to save 2 bytes from it ).
I see you agree with the hard limit generally, one thing is do we need to only have the hard limit in trunk? How about do it in the branch-2 if we think it's fine?
Could we exit on setting an xattr size bigger than the hard limit, rather than doing a silent min? We should mention this new hard limit somewhere as well.
Sure, updated in the new patch. User will see the hard limit from error msg. I updated the description of the xattr max size in hdfs-default.xml and mentioned the hard limit too.
Any comment on the max size supported by other filesystems like ext4 or btrfs, for reference as to what's reasonable here?
It's a long time since I read xattr of ext4 and other fs last time. I remember some fs have limit about the xattr length, and some fs don't have. It's detail, small difference should be OK.
Typo in FSDirectory: "should" -> "should be". Also the line below it says "unlimited" but that'll never get triggered now.
sure, I forgot to remove the unlimited...
Regarding configuration, we could simplify by just using the hard limit. Admins would still have the option of disabling xattrs entirely; is there really any value in being able to set something smaller than 32KB? This would definitely make it a trunk change.
Agree, How about doing it in a follow-on if we want later?
More review comments:
FSDirectory#addToInodeMap, do we need that new return?
right, no need. I removed it. Maybe I added it suddenly
A bit out of scope and so optional, but I think everywhere we say "prefixName" we really want to say "prefixedName" because "prefixName" sounds more like the name of the prefix rather than a name with a prefix.
Good idea, prefixedName is much more better, I updated all those in the new patch.
Some unrelated import changes in FSNamesystemLock and INodeAttributeProvider
OK, I reverted this modification since they are unrelated (I saw them, I was thinking to remove them).
XAttrFeature has an extra import
What's the reason for switching from ImmutableList to List in some places? The switch is also not complete, since I still see some usages of ImmutableList. I remember we liked ImmutableList originally since it made the need to set very explicit.
I think originally we use ImmutableList is mainly because it's immutable, and we keep it in XAttrFeature, we don't want outside modification affects it. Now it's packed to byte in XAttrFeature, so no need immutable list.
I use ArrayList instead of ImmutableList is because when building ImmutableList, it needs additional list copy (from an internal ArrayList). Then the performance is a bit better.
I missed to switch one ImmutableList in XAttrStorage and fixed it now.
Mind adding Javadoc for SerialNumberMap, and an interface audience private annotation?
XAttrsFormat's class javadoc goes over 80 chars, could use interface audience private also.
Sure, updated them. One thing is XAttrFormat is package-private, so no need to add audience private annotation for it.
Can we add an IllegalStateException to SerialNumberMap#get(T) for Integer overflow? Also there's the case that the int from the map doesn't fit in the 29 bits in XAttrFormat; check that in XAttrsFormattoBytes?
Sure, I also add the check in XAttrFormat#toBytes. Actually I want to create a follow-on to restrict the total number of xattr names for user's (user/trust) xattrs. For HDFS kernel, the number of xattr names are less than 10 currently, but if users create many various xattrs (maybe by mistake), then it will cause unexpected behavior.
Consider also renaming XAttrsFormat to XAttrFormat, so it's named like XAttrStorage
Is it worthwhile to do the same dictionary encoding for the FSImage as well? If the # xattrs is large enough to affect memory footprint, it'd also affect loading times which can already be minutes. Can be a follow-on JIRA for sure.
Actually we already have this compaction for XAttr in FSImage, please see XAttrCompactProto in fsimage.proto.