ADOPTION OF OPEN-SOURCE OPTICAL CHARACTER RECOGNITION (OCR) IN DIGITIZATION AND DATABASE MANAGEMENT OF STUDENT PROJECTS IN FEDERAL POLYTECHNIC OKO, ANAMBRA STATE, NIGERIA

Authors

  • Obodoeze Franklin Chukwuma, Department of Library and Information Science, Federal Polytechnic Oko, Anambra State

Keywords:

Open-source OCR, Tesseract, digitization, digital repository, database management, projects

Abstract

This study examined the adoption of open-source Optical Character Recognition (OCR) and relational database management for the digitization of student projects and theses at the Dr. Alex Ekwueme Library, Federal Polytechnic, Oko. The main objective was to convert physical documents into a searchable digital repository to improve preservation, accessibility, and retrieval efficiency. A total of 391 student projects were selected across various departments using stratified purposive sampling. The study employed a mixed-methods approach involving needs assessment, scanning, image preprocessing with OpenCV, Tesseract OCR application, MySQL/PostgreSQL database development, staff training, and system evaluation. Key findings revealed a 99.5% successful OCR conversion rate with an average Character Error Rate (CER) of 4.1%. The developed digital repository recorded an average retrieval time of 0.48 seconds and achieved 87.6% user satisfaction, with a 68% reduction in document access time. Library staff also demonstrated high competence in managing the system. The study concludes that open-source OCR and database technologies offer a cost-effective and sustainable solution for digitizing academic collections in Nigerian polytechnics.
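
The abstract reports an average Character Error Rate (CER) but does not state how it was computed. As a minimal sketch, assuming the conventional definition (the character-level edit distance between the OCR output and a manually verified reference text, divided by the length of the reference), the metric could be calculated as follows. The function names `levenshtein` and `character_error_rate` are hypothetical and are not taken from the study.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the first i-1 reference
    # characters and the first j hypothesis characters.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """CER = edit distance / number of characters in the reference.

    Assumed definition; the study may have normalized differently.
    """
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, ocr_output) / len(reference)
```

Under this definition, a CER of 4.1% would correspond to roughly 41 character edits per 1,000 reference characters. Averaging per-document values, as opposed to pooling all characters across documents, is a further choice the abstract does not specify.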

Published

23-05-2026

Section

Articles