Authors: Angelov, S. A., Lazarova, M. K.
Title: Using LLM for Image Correction Proposals
Keywords: image processing, multimodal LLMs, printing quality control, vector databases, vision transformers
Abstract: The paper proposes an automated pipeline for printing quality control that uses multimodal large language models (LLMs) and vector databases to detect and mitigate defects such as misalignment, graininess, and offsetting in printed images. The pipeline integrates the CLIP-ViT model for feature extraction, ChromaDB for efficient embedding storage and retrieval, and LLaVA for generating actionable recommendations based on visual inputs and statistical metrics, including the Structural Similarity Index (SSIM), histogram difference, and Mean Squared Error (MSE). Technical challenges, such as memory constraints on Apple Silicon devices and floating-point image processing, are addressed to ensure scalability. Experimental validation using synthetic 512 × 512 images demonstrates the pipeline's efficacy, with recommendations accurately corresponding to the induced defects.
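As an illustration of two of the statistical metrics named in the abstract, the following minimal sketch computes MSE and a histogram difference for floating-point grayscale images. It is not the authors' implementation: the abstract does not define "histogram difference", so the L1 distance between normalized histograms is assumed here, and the function names, the 16-bin default, and the [0, 1] pixel range are hypothetical choices. SSIM is omitted because it needs windowed statistics that are better taken from an image-processing library.

```python
from typing import List, Sequence

# Assumption: a grayscale image is a grid of floats in [0.0, 1.0].
Image = Sequence[Sequence[float]]


def mse(reference: Image, printed: Image) -> float:
    """Mean squared error over all pixel pairs (0.0 means identical)."""
    ref = [p for row in reference for p in row]
    prt = [p for row in printed for p in row]
    if not ref or len(ref) != len(prt):
        raise ValueError("images must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(ref, prt)) / len(ref)


def normalized_histogram(image: Image, bins: int = 16) -> List[float]:
    """Fraction of pixels in each of `bins` equal-width bins over [0, 1]."""
    counts = [0] * bins
    total = 0
    for row in image:
        for p in row:
            # Clamp the index so out-of-range values and p == 1.0 stay in bounds.
            idx = max(0, min(int(p * bins), bins - 1))
            counts[idx] += 1
            total += 1
    if total == 0:
        raise ValueError("image must be non-empty")
    return [c / total for c in counts]


def histogram_difference(reference: Image, printed: Image, bins: int = 16) -> float:
    """Assumed definition: L1 distance between normalized histograms.

    Ranges from 0.0 (same intensity distribution) to 2.0 (disjoint).
    """
    h_ref = normalized_histogram(reference, bins)
    h_prt = normalized_histogram(printed, bins)
    return sum(abs(a - b) for a, b in zip(h_ref, h_prt))


# Tiny 2 x 2 example: one pixel of the printed image is darker than intended.
reference = [[0.0, 0.5], [0.5, 1.0]]
printed = [[0.0, 0.5], [0.5, 0.5]]
print(mse(reference, printed))                           # 0.0625
print(histogram_difference(reference, printed, bins=4))  # 0.5
```

In a pipeline like the one described, values such as these would presumably be passed to the language model together with the images, so that a recommendation can refer to a measured deviation rather than to visual impression alone.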
Citations:
1. Komarski D., Vassilev V., Nikolov S., Dimitrova R., Dimitrov S., Data-Driven Process FMEA for Flexible Manufacturing Systems: Framework and Industrial Case Study, 2026, Applied Sciences (Switzerland), vol. 16, issue 8, DOI 10.3390/app16083760, eISSN 2076-3417 - 2026 - in publications indexed in Scopus
Type: publication in an international forum, publication in a refereed edition, indexed in Scopus