GMLP Principles 8–10: Testing, Transparency & Monitoring

Validate AI devices under clinically relevant conditions, provide clear information to users, and monitor deployed models for drift and retraining risks.

Principle 8: Testing under clinically relevant conditions

Testing must demonstrate device performance under clinically relevant conditions, using methodologically and statistically sound test plans that are executed on data independent of the training dataset.

Considerations include:

Intended patient population and relevant subgroups
Clinical environment and human–AI team use
Measurement inputs
Potential confounding factors

For Hong Kong professionals, this means asking whether the validation reflects your own patient mix, equipment, and workflow, rather than relying only on published aggregate metrics from other countries.
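One way to check whether validation reflects a local patient mix is to break a headline metric down by subgroup and attach a confidence interval to each estimate. The sketch below is illustrative only and is not part of the IMDRF guidance. It assumes test records are available as (subgroup, ground truth, prediction) triples, uses sensitivity as the example metric, and uses the Wilson score interval as one common choice of interval. The function names are hypothetical.

```python
import math
from collections import defaultdict


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% when z=1.96)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the proportion is unconstrained
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def sensitivity_by_subgroup(records):
    """Compute sensitivity per subgroup from an independent test set.

    records: iterable of (subgroup, truth, prediction), where truth and
    prediction are booleans (True = condition present / flagged).
    Returns {subgroup: (sensitivity, ci_low, ci_high, n_positive_cases)}.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for subgroup, truth, prediction in records:
        if truth:  # sensitivity only considers cases where the condition is present
            positives[subgroup] += 1
            if prediction:
                true_positives[subgroup] += 1

    results = {}
    for group, n in positives.items():
        low, high = wilson_interval(true_positives[group], n)
        results[group] = (true_positives[group] / n, low, high, n)
    return results
```

A wide interval for a small subgroup signals that the evidence for that group is thin, even when the point estimate looks acceptable.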

Principle 9: Clear, essential information for users

Users — healthcare professionals, patients, or caregivers — must receive clear, contextually relevant information appropriate to their role. This includes:

Intended use / indications for use
Benefits and risks
Model performance for appropriate subgroups
Study methodology and characteristics of training and test data
Acceptable inputs and known limitations
User interface interpretation and clinical workflow integration
The basis for the model's output, to the extent possible

Users should also be informed of the scope and timing of modifications and updates, and be given a means to communicate product concerns to the manufacturer.

Your checklist before go-live

Have you read the instructions for use (IFU) and limitations?
Do you know which patient groups were under-represented in development?
Is there a local escalation path if the tool behaves unexpectedly?

Principle 10: Monitor deployed models and manage retraining

Deployed models need ongoing, risk-based monitoring in real-world use so that safety and performance are maintained or improved.

When models are retrained after deployment, appropriate controls must manage risks of:

Overfitting
Unintended bias
Performance degradation (e.g. dataset drift)
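One possible control is an acceptance gate that a retrained model must pass before it replaces the deployed one. The sketch below is an assumption about how such a gate might look, not a procedure taken from the IMDRF document. It assumes both models have been scored on the same locked test set that neither saw during training (which guards against rewarding overfitting), and that the metric is reported per subgroup (which surfaces unintended bias). The function name, the subgroup labels, and the 0.02 tolerance are hypothetical.

```python
def retraining_gate(baseline, candidate, max_drop=0.02):
    """Decide whether a retrained model may replace the deployed one.

    baseline, candidate: dicts mapping subgroup name -> metric value
    (for example sensitivity), both measured on the same locked test set.
    max_drop: largest tolerated decrease in any subgroup (hypothetical default).
    Returns (accepted, failures), where failures lists the reasons for rejection.
    """
    failures = []
    for group, base_value in baseline.items():
        if group not in candidate:
            # A subgroup that was never evaluated cannot be assumed safe.
            failures.append(f"{group}: not evaluated for the candidate model")
        elif base_value - candidate[group] > max_drop:
            failures.append(
                f"{group}: metric fell from {base_value:.3f} to {candidate[group]:.3f}"
            )
    return (len(failures) == 0, failures)


# Usage: an overall gain can hide a regression in one subgroup.
deployed = {"overall": 0.91, "age_65_plus": 0.88}
retrained = {"overall": 0.93, "age_65_plus": 0.80}
accepted, reasons = retraining_gate(deployed, retrained)
```

In this usage example the candidate improves overall but is rejected because the older subgroup regresses, which is the pattern an aggregate-only comparison would miss.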

Shared responsibility

Manufacturers must build monitoring capability; healthcare organisations must define local governance — who reviews alerts, how drift is detected in your population, and when to pause use pending investigation.
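As an illustration of how drift might be detected in a local population, the sketch below computes the Population Stability Index (PSI) for a single numeric input, comparing a reference sample (for example, the validation cohort) with recent local cases. PSI is one commonly used drift statistic and is swapped in here as an example; the IMDRF document does not prescribe it. The function names are hypothetical, and the 0.10 and 0.25 cut-offs are a widely quoted rule of thumb that local governance would need to set for itself.

```python
import bisect
import math


def population_stability_index(reference, current, bin_edges, eps=1e-6):
    """PSI between two samples of one numeric input, using fixed bin edges.

    reference: values from the reference period (e.g. the validation cohort).
    current: values from recent real-world use.
    bin_edges: cut points; n edges define n + 1 bins.
    eps: floor applied to empty bins so the logarithm stays defined.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    edges = sorted(bin_edges)

    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref = proportions(reference)
    cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_action(psi, review_at=0.10, pause_at=0.25):
    """Map a PSI value to a governance action (thresholds are assumptions)."""
    if psi >= pause_at:
        return "pause and investigate"
    if psi >= review_at:
        return "review"
    return "continue"
```

Who receives the "review" and "pause and investigate" signals, and how quickly they must act, is the governance question that the code cannot answer.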

Responsible AI literacy means treating deployment as the start of safety oversight, not the end.

Source: IMDRF — Good Machine Learning Practice for Medical Device Development: Guiding Principles (January 2025)
