The availability metric is defined as the "Percentage availability of the CDR platform over time".
Using the picture as a reference, the availability of the platform refers to the entire CDR solution stack, not just the various API endpoints (Banking, Common, Infosec, and Admin).
How are the Big Four and other data holders measuring this availability?
A number of approaches can be used, including:
Sending "synthetic requests" to these API endpoints and treating a 200 OK response as evidence of availability.
Pros: Ideal, as it measures the full response to a valid request.
Cons: This approach is not easy to realize. The Data Holder (DH) would need its own Accredited Data Recipient (ADR) registered in the ecosystem, a partner ADR for this activity, or some smart bypass for requests from its synthetic client.
Sending API requests as a non-existent ADR (i.e. one without valid consent or access tokens). When an endpoint returns an error such as invalid token use or mTLS failure, the test infers that the APIs are up and running.
Pros: This is simple to implement.
Cons: It confirms only that the API endpoint is running and can respond with an error to a particular invalid request.
Measuring the availability of the CDR Platform from within the solution. For example, the DH might instrument all the solution components using an Application Performance Management (APM) tool.
Pros: Simple to implement. Most applications are already monitored using one or more tools.
Cons: This validates that the solution components are running. However, edge-related issues would not be visible (e.g. some network issues preventing requests from reaching the API gateway exposing the APIs).
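The first two approaches differ mainly in which response counts as "available". The sketch below is one possible way to express that rule and turn a batch of probe results into a percentage. It is illustrative only: the `ProbeResult` type, the endpoint label, and the choice of 401 and 403 as the expected rejection statuses are assumptions, not something defined by the standards or used by any particular DH. A probe that receives no HTTP response at all is treated as unavailable here, which a DH may or may not want for handshake-level mTLS failures.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of one probe; status is None when no HTTP response arrived."""
    endpoint: str            # hypothetical label for the probed endpoint
    status: Optional[int]    # HTTP status code, or None on timeout/connection failure


# Assumption: a request with an invalid token is rejected with one of these.
EXPECTED_REJECTIONS = frozenset({401, 403})


def is_up(result: ProbeResult, synthetic: bool) -> bool:
    """Decide whether a single probe counts as 'available'.

    synthetic=True  : valid synthetic request, only a 200 OK counts.
    synthetic=False : invalid-credential request, an expected rejection counts.
    """
    if result.status is None:
        return False
    if synthetic:
        return result.status == 200
    return result.status in EXPECTED_REJECTIONS


def availability_percent(results: Iterable[ProbeResult], synthetic: bool) -> float:
    """Percentage of probes that were counted as available."""
    results = list(results)
    if not results:
        raise ValueError("no probe results to evaluate")
    up = sum(is_up(r, synthetic) for r in results)
    return 100.0 * up / len(results)
```

Note that under the second approach a 200 OK is not counted as available, because a request without valid credentials should never succeed. Whether to treat that case as an outage or as a separate security alert is a design choice left open here.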
How have the various DHs approached this?
Can the DSB or ACCC provide clarification on what approaches are used and which of the approaches above are compliant or non-compliant?
The DSB cannot comment on how each DH chooses to implement their monitoring. It is reasonable to expect that some form of external API monitoring could be part of the solution. Internal monitoring could also be used.
Availability represents the whole CDR solution, so it applies to all APIs and services defined in the standards.
It is likely that external polling would be only part of the solution for DHs monitoring availability. Each DH must decide, based on its architecture, how far into the application stack internal monitoring is required.
Unavailability may also be identified and reported based on internal incident management processes.
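Where unavailability is taken from incident records, a percentage can be derived from the outage intervals over a reporting period. The following is a minimal sketch under stated assumptions: the function names and the interval representation are hypothetical, overlapping incidents are counted once, and outages are clipped to the reporting period. Which incidents should count toward the metric is not decided by this code.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Interval = Tuple[datetime, datetime]  # (outage start, outage end)


def merged_downtime(outages: List[Interval], start: datetime, end: datetime) -> timedelta:
    """Total outage time inside [start, end), counting overlapping outages once."""
    # Keep only outages that touch the period, clipped to its boundaries.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in outages if s < end and e > start
    )
    total = timedelta(0)
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Gap before this outage: close the previous merged interval.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlaps or touches the current interval: extend it.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def availability_from_outages(outages: List[Interval], start: datetime, end: datetime) -> float:
    """Percentage of the reporting period not covered by any outage."""
    period = end - start
    if period <= timedelta(0):
        raise ValueError("reporting period must have positive length")
    down = merged_downtime(outages, start, end)
    return 100.0 * (1 - down / period)
```

In practice the same calculation could take its intervals from external polling, internal monitoring, incident tickets, or a combination of these, which is consistent with the point that each DH decides how far into the stack monitoring is required.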
This article is based on CDS Standards Maintenance issue 339 Query: Measuring Availability Metrics.