Abstract
Widespread clinical deployment of computer-aided diagnosis (CAD) systems is hindered by the difficulty of integrating them with existing hospital IT infrastructure. VisionCAD is a vision-based radiological assistance framework that circumvents this barrier by capturing medical images directly from displays with a camera. An automated pipeline detects, restores, and analyzes the on-screen medical images, transforming camera-captured visual data into diagnostic-quality images suitable for automated analysis and report generation. Validated across diverse medical imaging datasets, VisionCAD achieves diagnostic performance comparable to that of conventional CAD systems operating on the original digital images: F1-score degradation is typically less than 2% across classification tasks, and natural language generation metrics for automated reports remain within 1% of those obtained on the original images. Because it requires only a camera and standard computing resources, VisionCAD offers an accessible route to AI-assisted diagnosis without modifications to existing clinical infrastructure.