--- /dev/null
+PSCI Performance Measurement
+============================
+
+TF-A provides two instrumentation tools for performing analysis of the PSCI
+implementation:
+
+* PSCI STAT
+* Runtime Instrumentation
+
+This page explains how they may be enabled and used to perform all varieties of
+analysis.
+
+Performance Measurement Framework
+---------------------------------
+
+The Performance Measurement Framework `PMF`_ is a framework that provides
+mechanisms for collecting and retrieving timestamps at runtime from the
+Performance Measurement Unit (`PMU`_). The PMU is a generalized abstraction for
+accessing CPU hardware registers used to measure hardware events. This means,
+for instance, that the PMU might be used to place instrumentation points at
+logical locations in code for tracing purposes.
+
+TF-A utilises the PMF as a backend for the two instrumentation services it
+provides--PSCI Statistics and Runtime Instrumentation. The PMF is used by
+these services to facilitate collection and retrieval of timestamps. For
+instance, the PSCI Statistics service registers the PMF service
+``psci_svc`` to track its residency statistics.
+
+This is reserved a unique ID, name, and space in memory by the PMF. The
+framework provides a convenient interface for PSCI Statistics to retrieve
+values from ``psci_svc`` at runtime. Alternatively, the service may be
+configured such that the PMF dumps those values to the console. A platform may
+choose to expose SMCs that allow retrieval of these timestamps from the
+service.
+
+This feature is enabled with the Boolean flag ``ENABLE_PMF``.
+
+PSCI Statistics
+---------------
+
+PSCI Statistics is a runtime service that provides residency statistics for
+power states used by the platform. The service tracks residency time and
+entry count. Residency time is the total time spent in a particular power
+state by a PE. The entry count is the number of times the PE has entered
+the power state. PSCI Statistics implements the optional functions
+``PSCI_STAT_RESIDENCY`` and ``PSCI_STAT_COUNT`` from the `PSCI`_
+specification.
+
+
+.. c:macro:: PSCI_STAT_RESIDENCY
+
+ :param target_cpu: Contains copy of affinity fields in the MPIDR register
+ for identifying the target core (See section 5.1.4 of `PSCI`_
+ specifications for more details).
+ :param power_state: identifier for a specific local
+ state. Generally, this parameter takes the same form as the power_state
+ parameter described for CPU_SUSPEND in section 5.4.2.
+
+ :returns: Time spent in ``power_state``, in microseconds, by ``target_cpu``
+ and the highest level expressed in ``power_state``.
+
+
+.. c:macro:: PSCI_STAT_COUNT
+
+ :param target_cpu: follows the same format as ``PSCI_STAT_RESIDENCY``.
+ :param power_state: follows the same format as ``PSCI_STAT_RESIDENCY``.
+
+ :returns: Number of times the state expressed in ``power_state`` has been
+ used by ``target_cpu`` and the highest level expressed in
+ ``power_state``.
+
+The implementation provides residency statistics only for low power states,
+and does this regardless of the entry mechanism into those states. The
+statistics it collects are set to 0 during shutdown or reset.
+
+PSCI Statistics is enabled with the Boolean build flag
+``ENABLE_PSCI_STAT``. All Arm platforms utilise the PMF unless another
+collection backend is provided (``ENABLE_PMF`` is implicitly enabled).
+
+Runtime Instrumentation
+-----------------------
+
+The Runtime Instrumentation Service is an instrumentation tool that wraps
+around the PMF to provide timestamp data. Although the service is not
+restricted to PSCI, it is used primarily in TF-A to quantify the total time
+spent in the PSCI implementation. The tool can be used to instrument other
+components in TF-A as well. It is enabled with the Boolean flag
+``ENABLE_RUNTIME_INSTRUMENTATION``, and as with PSCI STAT, requires PMF to
+be enabled.
+
+In PSCI, this service provides instrumentation points in the
+following code paths:
+
+* Entry into the PSCI SMC handler
+* Exit from the PSCI SMC handler
+* Entry to low power state
+* Exit from low power state
+* Entry into cache maintenance operations in PSCI
+* Exit from cache maintenance operations in PSCI
+
+The service captures the cycle count, which allows for the time spent in the
+implementation to be calculated, given the frequency counter.
+
+PSCI SMC Handler Instrumentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The timestamp during entry into the handler is captured as early as possible
+during the runtime exception, prior to entry into the handler itself. All
+timestamps are stored in memory for later retrieval. The exit timestamp is
+captured after normal return from the PSCI SMC handler, or, if a low power state
+was requested, it is captured in the warm boot path.
+
+*Copyright (c) 2023, Arm Limited. All rights reserved.*
+
+.. _PMF: ../design/firmware-design.html#performance-measurement-framework
+.. _PMU: performance-monitoring-unit.html
+.. _PSCI: https://developer.arm.com/documentation/den0022/latest/
--- /dev/null
+Runtime Instrumentation Methodology
+===================================
+
+This document outlines steps for undertaking performance measurements of key
+operations in the Trusted Firmware-A Power State Coordination Interface (PSCI)
+implementation, using the in-built Performance Measurement Framework (PMF) and
+runtime instrumentation timestamps.
+
+Framework
+~~~~~~~~~
+
+The tests are based on the ``runtime-instrumentation`` test suite provided by
+the Trusted Firmware Test Framework (TFTF). The release build of this framework
+was used because the results in the debug build became skewed; the console
+output prevented some of the tests from executing in parallel.
+
+The tests consist of both parallel and sequential tests, which are broadly
+described as follows:
+
+- **Parallel Tests** This type of test powers on all the non-lead CPUs and
+ brings them and the lead CPU to a common synchronization point. The lead CPU
+ then initiates the test on all CPUs in parallel.
+
+- **Sequential Tests** This type of test powers on each non-lead CPU in
+ sequence. The lead CPU initiates the test on a non-lead CPU then waits for the
+ test to complete before proceeding to the next non-lead CPU. The lead CPU then
+ executes the test on itself.
+
+Note there is very little variance observed in the values given (~1us), although
+the values for each CPU are sometimes interchanged, depending on the order in
+which locks are acquired. Also, there is very little variance observed between
+executing the tests sequentially in a single boot or rebooting between tests.
+
+Given that runtime instrumentation using PMF is invasive, there is a small
+(unquantified) overhead on the results. PMF uses the generic counter for
+timestamps, which runs at 50MHz on Juno.
+
+Metrics
+~~~~~~~
+
+.. glossary::
+
+ Powerdown Latency
+ Time taken from entering the TF PSCI implementation to the point the hardware
+ enters the low power state (WFI). Referring to the TF runtime instrumentation points, this
+ corresponds to: ``(RT_INSTR_ENTER_HW_LOW_PWR - RT_INSTR_ENTER_PSCI)``.
+
+ Wakeup Latency
+ Time taken from the point the hardware exits the low power state to exiting
+ the TF PSCI implementation. This corresponds to: ``(RT_INSTR_EXIT_PSCI -
+ RT_INSTR_EXIT_HW_LOW_PWR)``.
+
+ Cache Flush Latency
+ Time taken to flush the caches during powerdown. This corresponds to:
+ ``(RT_INSTR_EXIT_CFLUSH - RT_INSTR_ENTER_CFLUSH)``.