SID cells are 32-bit aligned, and a multiple of 32 bits in length. The
only outlier is the thermal sensor calibration data, which is 16 bits
per sensor. However a whole 64 bits is allocated for this purpose, so
we could consider it conforming to the rule above.
Also, the register read-out method assumes native endian, unlike the
direct MMIO method, which assumes big endian. Thus no endian conversion
is involved.
Under these assumptions, the register read-out method can be slightly
optimized. Instead of reading one word then discarding 3 bytes, read
the whole word directly into the buffer. However, for reads under 4
bytes or trailing bytes, we still use a scratch buffer to extract the
requested bytes.
We could go one step further if .word_size was 4, but changing that
would affect the sysfs interface's behavior.