From b1af2676f2501b6178c90d14361809f54ef0d211 Mon Sep 17 00:00:00 2001
From: Harrison Mutai
Date: Wed, 8 Mar 2023 12:01:48 +0000
Subject: [PATCH] docs(psci): expound runtime instrumentation docs

Change-Id: I3c30b44d4196c30fd07373282150e543959fce1a
Signed-off-by: Harrison Mutai
---
 docs/perf/index.rst                        |   4 +-
 docs/perf/psci-performance-instr.rst       | 117 +++++++++++++++++++++
 docs/perf/psci-performance-methodology.rst |  55 ++++++++++
 3 files changed, 175 insertions(+), 1 deletion(-)
 create mode 100644 docs/perf/psci-performance-instr.rst
 create mode 100644 docs/perf/psci-performance-methodology.rst

diff --git a/docs/perf/index.rst b/docs/perf/index.rst
index bccad006c..b83c6e390 100644
--- a/docs/perf/index.rst
+++ b/docs/perf/index.rst
@@ -5,10 +5,12 @@ Performance & Testing
    :maxdepth: 1
    :caption: Contents
 
+   psci-performance-instr
    psci-performance-juno
+   psci-performance-methodology
    tsp
    performance-monitoring-unit
 
 --------------
 
-*Copyright (c) 2019-2020, Arm Limited. All rights reserved.*
+*Copyright (c) 2019-2023, Arm Limited. All rights reserved.*
diff --git a/docs/perf/psci-performance-instr.rst b/docs/perf/psci-performance-instr.rst
new file mode 100644
index 000000000..16f386fb9
--- /dev/null
+++ b/docs/perf/psci-performance-instr.rst
@@ -0,0 +1,117 @@
+PSCI Performance Measurement
+============================
+
+TF-A provides two instrumentation tools for performing analysis of the PSCI
+implementation:
+
+* PSCI STAT
+* Runtime Instrumentation
+
+This page explains how they may be enabled and used to perform all varieties of
+analysis.
+
+Performance Measurement Framework
+---------------------------------
+
+The Performance Measurement Framework `PMF`_ is a framework that provides
+mechanisms for collecting and retrieving timestamps at runtime from the
+Performance Monitoring Unit (`PMU`_). The PMU is a generalized abstraction for
+accessing CPU hardware registers used to measure hardware events. This means,
+for instance, that the PMU might be used to place instrumentation points at
+logical locations in code for tracing purposes.
+
+TF-A utilises the PMF as a backend for the two instrumentation services it
+provides: PSCI Statistics and Runtime Instrumentation. The PMF is used by
+these services to facilitate collection and retrieval of timestamps. For
+instance, the PSCI Statistics service registers the PMF service
+``psci_svc`` to track its residency statistics.
+
+The PMF reserves a unique ID, name, and space in memory for this service. The
+framework provides a convenient interface for PSCI Statistics to retrieve
+values from ``psci_svc`` at runtime. Alternatively, the service may be
+configured such that the PMF dumps those values to the console. A platform may
+choose to expose SMCs that allow retrieval of these timestamps from the
+service.
+
+This feature is enabled with the Boolean flag ``ENABLE_PMF``.
+
+PSCI Statistics
+---------------
+
+PSCI Statistics is a runtime service that provides residency statistics for
+power states used by the platform. The service tracks residency time and
+entry count. Residency time is the total time spent in a particular power
+state by a PE. The entry count is the number of times the PE has entered
+the power state. PSCI Statistics implements the optional functions
+``PSCI_STAT_RESIDENCY`` and ``PSCI_STAT_COUNT`` from the `PSCI`_
+specification.
+
+
+.. c:macro:: PSCI_STAT_RESIDENCY
+
+    :param target_cpu: Contains a copy of affinity fields in the MPIDR register
+      for identifying the target core (See section 5.1.4 of `PSCI`_
+      specifications for more details).
+    :param power_state: identifier for a specific local
+      state. Generally, this parameter takes the same form as the power_state
+      parameter described for CPU_SUSPEND in section 5.4.2.
+
+    :returns: Time spent in ``power_state``, in microseconds, by ``target_cpu``
+      and the highest level expressed in ``power_state``.
+
+
+.. c:macro:: PSCI_STAT_COUNT
+
+    :param target_cpu: follows the same format as ``PSCI_STAT_RESIDENCY``.
+    :param power_state: follows the same format as ``PSCI_STAT_RESIDENCY``.
+
+    :returns: Number of times the state expressed in ``power_state`` has been
+      used by ``target_cpu`` and the highest level expressed in
+      ``power_state``.
+
+The implementation provides residency statistics only for low power states,
+and does this regardless of the entry mechanism into those states. The
+statistics it collects are set to 0 during shutdown or reset.
+
+PSCI Statistics is enabled with the Boolean build flag
+``ENABLE_PSCI_STAT``. All Arm platforms utilise the PMF unless another
+collection backend is provided (``ENABLE_PMF`` is implicitly enabled).
+
+Runtime Instrumentation
+-----------------------
+
+The Runtime Instrumentation Service is an instrumentation tool that wraps
+around the PMF to provide timestamp data. Although the service is not
+restricted to PSCI, it is used primarily in TF-A to quantify the total time
+spent in the PSCI implementation. The tool can be used to instrument other
+components in TF-A as well. It is enabled with the Boolean flag
+``ENABLE_RUNTIME_INSTRUMENTATION``, and as with PSCI STAT, requires PMF to
+be enabled.
+
+In PSCI, this service provides instrumentation points in the
+following code paths:
+
+* Entry into the PSCI SMC handler
+* Exit from the PSCI SMC handler
+* Entry to low power state
+* Exit from low power state
+* Entry into cache maintenance operations in PSCI
+* Exit from cache maintenance operations in PSCI
+
+The service captures the cycle count, which allows for the time spent in the
+implementation to be calculated, given the counter frequency.
+
+PSCI SMC Handler Instrumentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The timestamp during entry into the handler is captured as early as possible
+during the runtime exception, prior to entry into the handler itself. All
+timestamps are stored in memory for later retrieval. The exit timestamp is
+captured after normal return from the PSCI SMC handler, or, if a low power state
+was requested, it is captured in the warm boot path.
+
+*Copyright (c) 2023, Arm Limited. All rights reserved.*
+
+.. _PMF: ../design/firmware-design.html#performance-measurement-framework
+.. _PMU: performance-monitoring-unit.html
+.. _PSCI: https://developer.arm.com/documentation/den0022/latest/
diff --git a/docs/perf/psci-performance-methodology.rst b/docs/perf/psci-performance-methodology.rst
new file mode 100644
index 000000000..a9f379d20
--- /dev/null
+++ b/docs/perf/psci-performance-methodology.rst
@@ -0,0 +1,55 @@
+Runtime Instrumentation Methodology
+===================================
+
+This document outlines steps for undertaking performance measurements of key
+operations in the Trusted Firmware-A Power State Coordination Interface (PSCI)
+implementation, using the in-built Performance Measurement Framework (PMF) and
+runtime instrumentation timestamps.
+
+Framework
+~~~~~~~~~
+
+The tests are based on the ``runtime-instrumentation`` test suite provided by
+the Trusted Firmware Test Framework (TFTF). The release build of this framework
+was used because the results in the debug build became skewed; the console
+output prevented some of the tests from executing in parallel.
+
+The tests consist of both parallel and sequential tests, which are broadly
+described as follows:
+
+- **Parallel Tests** This type of test powers on all the non-lead CPUs and
+  brings them and the lead CPU to a common synchronization point. The lead CPU
+  then initiates the test on all CPUs in parallel.
+
+- **Sequential Tests** This type of test powers on each non-lead CPU in
+  sequence. The lead CPU initiates the test on a non-lead CPU then waits for the
+  test to complete before proceeding to the next non-lead CPU. The lead CPU then
+  executes the test on itself.
+
+Note there is very little variance observed in the values given (~1 us), although
+the values for each CPU are sometimes interchanged, depending on the order in
+which locks are acquired. Also, there is very little variance observed between
+executing the tests sequentially in a single boot or rebooting between tests.
+
+Given that runtime instrumentation using PMF is invasive, there is a small
+(unquantified) overhead on the results. PMF uses the generic counter for
+timestamps, which runs at 50 MHz on Juno.
+
+Metrics
+~~~~~~~
+
+.. glossary::
+
+   Powerdown Latency
+        Time taken from entering the TF PSCI implementation to the point the hardware
+        enters the low power state (WFI). Referring to the TF runtime instrumentation points, this
+        corresponds to: ``(RT_INSTR_ENTER_HW_LOW_PWR - RT_INSTR_ENTER_PSCI)``.
+
+   Wakeup Latency
+        Time taken from the point the hardware exits the low power state to exiting
+        the TF PSCI implementation. This corresponds to: ``(RT_INSTR_EXIT_PSCI -
+        RT_INSTR_EXIT_HW_LOW_PWR)``.
+
+   Cache Flush Latency
+        Time taken to flush the caches during powerdown. This corresponds to:
+        ``(RT_INSTR_EXIT_CFLUSH - RT_INSTR_ENTER_CFLUSH)``.
-- 
2.39.5
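
As a reviewer-side illustration of the three metrics defined in the glossary, the sketch below converts raw timestamp differences into microseconds. It assumes the timestamps are raw generic counter ticks and uses the 50 MHz Juno frequency quoted in the methodology text. The helper names (``ticks_to_us``, ``psci_latencies_us``) and the sample values are hypothetical and are not part of TF-A or TFTF.

```python
# Hypothetical post-processing helper, not part of TF-A or TFTF.
# Assumption: each timestamp is a raw generic counter value (ticks).

COUNTER_FREQ_HZ = 50_000_000  # generic counter frequency on Juno, per the text


def ticks_to_us(ticks, freq_hz=COUNTER_FREQ_HZ):
    """Convert a generic counter tick delta to microseconds."""
    return ticks * 1_000_000 / freq_hz


def psci_latencies_us(ts, freq_hz=COUNTER_FREQ_HZ):
    """Compute the three glossary metrics from a dict of raw timestamps.

    `ts` maps instrumentation point names (as used in the glossary) to
    counter values captured for one CPU during one PSCI call.
    """
    return {
        "powerdown": ticks_to_us(
            ts["RT_INSTR_ENTER_HW_LOW_PWR"] - ts["RT_INSTR_ENTER_PSCI"], freq_hz
        ),
        "wakeup": ticks_to_us(
            ts["RT_INSTR_EXIT_PSCI"] - ts["RT_INSTR_EXIT_HW_LOW_PWR"], freq_hz
        ),
        "cache_flush": ticks_to_us(
            ts["RT_INSTR_EXIT_CFLUSH"] - ts["RT_INSTR_ENTER_CFLUSH"], freq_hz
        ),
    }


# Made-up sample values, chosen only to make the arithmetic easy to follow.
sample = {
    "RT_INSTR_ENTER_PSCI": 1_000,
    "RT_INSTR_ENTER_CFLUSH": 1_500,
    "RT_INSTR_EXIT_CFLUSH": 2_000,
    "RT_INSTR_ENTER_HW_LOW_PWR": 6_000,
    "RT_INSTR_EXIT_HW_LOW_PWR": 56_000,
    "RT_INSTR_EXIT_PSCI": 58_500,
}

if __name__ == "__main__":
    for name, value in psci_latencies_us(sample).items():
        print(f"{name}: {value:.1f} us")
```

At 50 MHz one tick is 20 ns, so the 5000-tick powerdown delta in the sample corresponds to 100 us. On another platform the counter frequency would have to be supplied through ``freq_hz``.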