Hong Kong Healthcare Artificial Intelligence Society

GMLP Principles 8–10: Testing, Transparency & Monitoring

Validate AI devices under clinically relevant conditions, provide clear information to users, and monitor deployed models for drift and retraining risks.

Real-world monitoring dashboard for deployed AI medical devices

Principle 8: Testing under clinically relevant conditions

Testing must demonstrate device performance under clinically relevant conditions, using methodologically and statistically sound test plans that are executed on data independent of the training dataset.

Considerations include:

  • Intended patient population and relevant subgroups
  • Clinical environment and human–AI team use
  • Measurement inputs
  • Potential confounding factors

For Hong Kong professionals, this means asking whether validation reflects your patient mix, equipment, and workflow — not only published aggregate metrics from other countries.
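As a rough illustration of checking subgroup performance on a local validation set, the sketch below computes sensitivity per subgroup with an approximate 95% Wilson score interval. It is a minimal sketch only: the function names, the record layout of (subgroup, truth, prediction) and the subgroup labels are assumptions made for this example, not part of the IMDRF guidance or of any vendor's tooling.

```python
import math
from collections import defaultdict


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% when z=1.96)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the interval is uninformative
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def sensitivity_by_subgroup(records):
    """Sensitivity per subgroup from (subgroup, truth, prediction) records.

    truth and prediction are 0 or 1. The record layout is a hypothetical
    format chosen for this illustration.
    """
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for subgroup, truth, prediction in records:
        if truth == 1:
            positives[subgroup] += 1
            if prediction == 1:
                true_pos[subgroup] += 1
    # Only subgroups with at least one positive case appear in the result.
    return {
        group: {
            "n_positive": positives[group],
            "sensitivity": true_pos[group] / positives[group],
            "ci95": wilson_interval(true_pos[group], positives[group]),
        }
        for group in positives
    }
```

With 8 of 10 positive cases detected in a subgroup, the point estimate is 0.80 but the interval runs from roughly 0.49 to 0.94. A small local subgroup can therefore look acceptable on the headline figure while leaving real uncertainty about performance.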

Principle 9: Clear, essential information for users

Users — healthcare professionals, patients, or caregivers — must receive clear, contextually relevant information appropriate to their role. This includes:

  • Intended use / indications for use
  • Benefits and risks
  • Model performance for appropriate subgroups
  • Study methodology and characteristics of training and test data
  • Acceptable inputs and known limitations
  • User interface interpretation and clinical workflow integration
  • To the extent possible, the basis for model output

Users should also be informed of the scope and timing of modifications and updates, and provided a means to communicate product concerns to the manufacturer.

Your checklist before go-live

  • Have you read the instructions for use (IFU) and limitations?
  • Do you know which patient groups were under-represented in development?
  • Is there a local escalation path if the tool behaves unexpectedly?

Principle 10: Monitor deployed models and manage retraining

Deployed models need ongoing, risk-based monitoring in real-world use so that safety and performance are maintained or improved.

When models are retrained after deployment, appropriate controls must manage risks of:

  • Overfitting
  • Unintended bias
  • Performance degradation (e.g. dataset drift)
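One possible form of such a control is a release gate that compares a retrained model with the deployed one on the same locked test set, subgroup by subgroup. The sketch below is an assumption-laden illustration: the function name `retraining_gate`, the dictionary layout and the 0.02 margin are hypothetical and would need to be set by the manufacturer's and organisation's own risk assessment.

```python
def retraining_gate(deployed, candidate, margin=0.02):
    """Return (passed, failures) for a retrained model.

    deployed and candidate map subgroup name -> metric (for example
    sensitivity) measured on the same locked test set. The margin is a
    hypothetical tolerance, not a value taken from any guidance document.
    """
    failures = {}
    for group, baseline in deployed.items():
        new_value = candidate.get(group)
        # A missing subgroup or a drop beyond the margin blocks release.
        if new_value is None or new_value < baseline - margin:
            failures[group] = (baseline, new_value)
    return (len(failures) == 0, failures)
```

A candidate that improves the overall metric but drops by more than the margin in one subgroup fails the gate, which is the unintended-bias case listed above.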

Shared responsibility

Manufacturers must build monitoring capability; healthcare organisations must define local governance — who reviews alerts, how drift is detected in your population, and when to pause use pending investigation.
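To make "how drift is detected" concrete, one commonly used option is the Population Stability Index (PSI), which compares the distribution of a model input or output score at validation time with the distribution seen locally. The sketch below is illustrative only: the function names, the bin edges and the 0.10 and 0.25 thresholds are assumptions (the thresholds are a widely quoted rule of thumb, not an IMDRF requirement), and a real governance process would choose its own.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples of one variable.

    edges are the bin boundaries; values below the first edge and at or
    above the last edge fall into the two outer bins.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges the value has reached or passed.
            counts[sum(1 for e in edges if v >= e)] += 1
        # eps avoids log(0) when a bin is empty.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def drift_action(psi_value, review_at=0.10, pause_at=0.25):
    """Map a PSI value to a hypothetical local governance action."""
    if psi_value >= pause_at:
        return "pause_and_investigate"
    if psi_value >= review_at:
        return "review"
    return "ok"
```

In practice the "expected" sample might be the model scores from the validation study and the "actual" sample the scores from the most recent month of local use, with the resulting action routed to whoever your organisation has named to review alerts.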

Responsible AI literacy means treating deployment as the start of safety oversight, not the end.

Source: IMDRF — Good Machine Learning Practice for Medical Device Development: Guiding Principles (January 2025)
