Healthcare | Medical Imaging | CASE_08

Detecting manufacturing defects in medical devices with 99.2% sensitivity

A US Class II medical device manufacturer of single-use surgical instrumentation (2023 engagement) had a 1.8% defect escape rate that caused an average of two recalls per year at ~$4M each. Manual inspection was capped at ~400 units/hour per inspector, a hard throughput bottleneck. We trained a multi-head CNN on ~80,000 labelled images from the QA team: a shared ResNet-50 backbone with a separate detection head per defect class (surface scratches, dimensional deviation, component misalignment, coating failure). The system runs on edge hardware at the production line at ~0.3 s per unit. Results: escape rate down from 1.8% to 0.2%, inspection throughput up 6× to ~2,400 units/hour, and zero recalls in the 14 months after deployment.

Detection sensitivity: 99.2%

Escape rate: −89%

Inspection throughput: 6×

The Challenge

A 1.8% defect escape rate meant defective devices were passing manual inspection and reaching customers, causing an average of two recalls per year at $4M+ each. Manual inspection was capped at 400 units/hour per inspector, a throughput bottleneck at current production volume. The defect types were diverse (surface scratches, dimensional deviations, component misalignment, and coating failures), and each required different detection logic.

Our Approach

We used a multi-head CNN architecture with a separate detection head for each defect class, trained on 80,000 labelled images provided by the QA team. A critical design decision was to treat the defect types as separate problems sharing a ResNet-50 backbone, rather than as a single multi-class classifier. The system runs on edge hardware at the production line, processing images in real time at 0.3 seconds per unit. When a defect confidence score exceeds the threshold, the unit is automatically diverted to the reject bin.
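To make the divert rule concrete, here is a minimal sketch of how per-head scores could drive the reject decision. It is not the deployed code. The threshold values, the use of a sigmoid on each head's raw output, and the names `THRESHOLDS`, `sigmoid`, and `should_divert` are all assumptions for illustration. The case study states only that scores above a threshold trigger a divert.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical per-class thresholds; the real values are not given in the case study.
THRESHOLDS: Dict[str, float] = {
    "surface_scratch": 0.5,
    "dimensional_deviation": 0.5,
    "component_misalignment": 0.5,
    "coating_failure": 0.5,
}


def sigmoid(x: float) -> float:
    """Map a raw head output (logit) to a confidence score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def should_divert(head_logits: Dict[str, float],
                  thresholds: Dict[str, float] = THRESHOLDS) -> Tuple[bool, List[str]]:
    """Return (divert?, defect classes whose confidence exceeded the threshold).

    Each head is scored independently (assumption), so one unit can be
    flagged for several defect classes at once.
    """
    fired = [name for name, logit in head_logits.items()
             if sigmoid(logit) > thresholds[name]]
    return bool(fired), fired


# Example unit: two heads report high confidence, two report low.
unit = {
    "surface_scratch": -3.0,
    "dimensional_deviation": 2.0,
    "component_misalignment": -4.0,
    "coating_failure": 1.0,
}
divert, reasons = should_divert(unit)
print(divert, reasons)
```

Scoring each head independently, rather than forcing the classes to compete in a single softmax, is one plausible reading of "separate problems sharing a backbone". It also allows a separate threshold per defect class.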

Outcome

Detection sensitivity: 99.2%. Defect escape rate reduced from 1.8% to 0.2%. Inspection throughput: 2,400 units/hour (6× manual). Zero recalls in 14 months post-deployment. The QA team now audits the AI's reject bin rather than inspecting every unit.
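The headline figures follow directly from the numbers above, as this short arithmetic check shows. The helper name `relative_change` is illustrative. The interval calculation assumes a single inspection station running at the stated line rate.

```python
def relative_change(before: float, after: float) -> float:
    """Signed fractional change from `before` to `after`."""
    return (after - before) / before


# Escape rate: 1.8% before, 0.2% after (figures from the case study).
escape_change = relative_change(1.8, 0.2)

# Throughput: 400 units/hour per manual inspector vs 2,400 units/hour.
throughput_ratio = 2400 / 400

# Time available per unit at 2,400 units/hour (assumes a single station).
seconds_between_units = 3600 / 2400

print(f"Escape rate change: {escape_change:.0%}")
print(f"Throughput ratio: {throughput_ratio:.0f}x")
print(f"Interval between units: {seconds_between_units:.1f} s")
```

Under that single-station assumption, a unit arrives every 1.5 seconds, so the 0.3 seconds of processing per unit fits within the interval.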

What We Learned

Treat diverse defect types as separate problems, not a single multi-class classifier.

Edge deployment requires hardware-software co-design starting in Discovery, not as an afterthought.

Sensitivity matters more than accuracy for safety-critical inspection: a false negative is much worse than a false positive.
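The last lesson can be illustrated with a small worked example. The confusion-matrix counts below are invented for illustration and are not project data. They show how a system with higher accuracy can still let far more defects escape when defects are rare.

```python
from typing import NamedTuple


class Confusion(NamedTuple):
    """Confusion-matrix counts for a defect detector (positive = defective)."""
    tp: int  # defective units correctly rejected
    fn: int  # defective units missed (escapes)
    fp: int  # good units wrongly rejected
    tn: int  # good units correctly passed

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total


# Hypothetical batch of 10,000 units containing 100 defective ones.
# System A never raises a false alarm but misses half the defects.
system_a = Confusion(tp=50, fn=50, fp=0, tn=9900)
# System B rejects 200 good units but misses only one defect.
system_b = Confusion(tp=99, fn=1, fp=200, tn=9700)

for name, c in (("A", system_a), ("B", system_b)):
    print(f"System {name}: accuracy={c.accuracy:.4f}, "
          f"sensitivity={c.sensitivity:.2f}, escapes={c.fn}")
```

In this invented batch, System A scores higher on accuracy (0.9950 vs 0.9799) yet lets 50 defective units through, against one for System B. False rejects cost a manual re-check of the reject bin. Escapes are what lead to recalls.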

Stages Engaged

AI Readiness Audit

Discovery & Blueprint

Concept Validation

Production Build

Total Duration

7 months

Artifacts Delivered

PRD

Computer Vision Architecture

Edge Deployment Spec

WBS

Regulatory Validation Protocol
