Some TLS TX counters increment per socket/connection, and are not
protected against parallel modifications from several cores.
Switch them to atomic counters by taking them out of the SQ stats into
the global atomic TLS stats.
In this patch, we touch a single counter 'tx_tls_ctx' that counts the
number of device-offloaded TX TLS connections added.
Now that this counter can be increased without the for having the SQ
context in hand, move it to the mlx5e_ktls_add_tx() callback where it
really belongs, out of the fast data-path.
This change is not needed for counters that increment only in NAPI
context or under the TX lock, as they are already protected.
Keep them as tls_* counters under 'struct mlx5e_sq_stats'.