mt76 core uses ffs() to find the next free bit. This works well for 32 bit
architectures where BITS_PER_LONG is 32. ffs only checks 32 bit values, so
allocation fails on 64 bit architectures.
Additionally, the wcid mask array was too small in cases where the array
was not a multiple of BITS_PER_LONG.
Fix this by making the wcid mask array u32 instead and use DIV_ROUND_UP
for the size, just in case we ever bump it to a value that's not a multiple
of 32.
Reported-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>