Turkish OCR on Mobile and Scanned Document Images

dc.contributor.author: Karasu, Kurtulus
dc.contributor.author: Bastan, Muhammet
dc.date.accessioned: 2025-10-24T18:10:20Z
dc.date.available: 2025-10-24T18:10:20Z
dc.date.issued: 2015
dc.department: Malatya Turgut Özal Üniversitesi
dc.description: 23rd Signal Processing and Communications Applications Conference (SIU) -- MAY 16-19, 2015 -- Inonu Univ, Malatya, TURKEY
dc.description.abstract: Optical character recognition (OCR) systems are widely used to convert documents into digital form. Many commercial and open source OCR systems are available, but no benchmark exists for Turkish OCR. In this work, we first prepared two publicly available datasets for Turkish OCR, consisting of scanned document images and mobile camera-captured document images. We then evaluated the Turkish OCR performance of three popular open source OCR systems (Tesseract, CuneiForm, GOCR) on these datasets. Tesseract outperformed the other two on both datasets.
dc.description.sponsorship: Dept Comp Engn & Elect & Elect Engn, Elect & Elect Engn, Bilkent Univ
dc.identifier.endpage: 2077
dc.identifier.isbn: 978-1-4673-7386-9
dc.identifier.issn: 2165-0608
dc.identifier.startpage: 2074
dc.identifier.uri: https://hdl.handle.net/20.500.12899/4096
dc.identifier.wos: WOS:000380500900499
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.language.iso: tr
dc.publisher: IEEE
dc.relation.ispartof: 2015 23rd Signal Processing and Communications Applications Conference (SIU)
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_20251023
dc.subject: Turkish OCR; mobile device; scanner; dataset; benchmark; Tesseract
dc.title: Turkish OCR on Mobile and Scanned Document Images
dc.type: Conference Object
