dots.ocr

Jan 19 2026

GitHub: https://github.com/rednote-hilab/dots.ocr

⭐ Stars: 6759

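Summary: A multilingual document layout parsing tool built on a single vision-language model. It recognizes and analyzes the structure and content of documents in many languages.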

Tech stack: gradio, transformers, modelscope, flash-attn, accelerate, PyMuPDF, openai, huggingface_hub

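Architecture: The project follows a modular design made up of a core model, utility functions, and a demo module. It builds its web interface with gradio, relies on transformers and modelscope to handle the AI model, and uses PyMuPDF to process PDF files. Together these form a complete document-parsing application.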
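The data flow implied by this architecture (PDF in, page images out, model applied per page) could be sketched roughly as below. This is a minimal illustration, not the project's actual code: `LayoutBlock`, `render_pdf_pages`, `parse_pages`, and the `model_fn` callable are all hypothetical names, and the block fields are assumed. Only the PyMuPDF calls (`fitz.open`, `page.get_pixmap`, `pix.tobytes`) are real library API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LayoutBlock:
    """Hypothetical container for one detected layout element."""
    category: str                      # e.g. "text", "table", "title" (assumed labels)
    bbox: Tuple[int, int, int, int]    # x0, y0, x1, y1 in pixels
    text: str


def render_pdf_pages(pdf_path: str, dpi: int = 200) -> List[bytes]:
    """Rasterize every page of a PDF to PNG bytes using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the rest runs without it

    images = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            pix = page.get_pixmap(dpi=dpi)
            images.append(pix.tobytes("png"))
    return images


def parse_pages(
    page_images: List[bytes],
    model_fn: Callable[[bytes], List[LayoutBlock]],
) -> List[List[LayoutBlock]]:
    """Apply a layout model to each page image and collect the blocks.

    `model_fn` stands in for the vision-language model call; how the
    real project invokes its model is not described in this summary.
    """
    return [model_fn(image) for image in page_images]
```

Keeping the model behind a plain callable means the rendering step and the parsing loop can be exercised with a stub, and a web front end (for example a gradio app) would only need to call `render_pdf_pages` followed by `parse_pages`.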