Mr Stefan Pletschacher
School of Science, Engineering & Environment
Current positions
Lecturer
Biography
Stefan Pletschacher is a Lecturer in Computer Science at the University of Salford and a member of the Pattern Recognition and Image Analysis (PRImA) Research Lab. In the past he has held positions as Research Fellow at the University of Salford and Research Assistant at the Institute for Print and Media Technology at Chemnitz University of Technology. He is a Fellow of The Higher Education Academy (FHEA). Alongside his academic career, he has worked as a freelance software developer and as a consultant for digitisation projects.
Stefan has been a developer, technical advisor, and work package leader in large-scale international projects related to media production, digitisation, and Optical Character Recognition (OCR), such as SELEAC, IMPACT, SUCCEED, Europeana Newspapers, and eMOP. As technical lead he has overseen the development of numerous open source and commercial software projects implemented by the PRImA Research Lab (e.g. Aletheia, a comprehensive ground truth production and document analysis system, as well as specialist software tools related to performance evaluation for OCR workflows).
He is the author of the PAGE (Page Analysis and Ground-Truth Elements) format framework, which is widely used in ground truth production, large-scale datasets, performance evaluation of OCR and Document Image Analysis (DIA) software, OCR training data creation, and showcase production. In January 2016 he was elected to the Editorial Board of ALTO, the technical metadata standard for Optical Character Recognition maintained by the Library of Congress.
Stefan is co-organiser of the long-standing series of competitions on the state of the art in document layout analysis, hosted in conjunction with the International Conference on Document Analysis and Recognition (ICDAR). For many years he has served on programme committees and as a peer reviewer for conferences and journals in the field of DIA. He has been co-organiser of conferences such as Document Engineering (DocEng), Document Analysis Systems (DAS), and Historical Document Imaging and Processing (HIP), and is the author of numerous scientific publications.
Areas of Research
My research activities are related to the automation of digitisation processes, including document recognition and encoding as well as methods for supporting new and emerging output devices. A main focus is on approaches for processing documents that are not suitable for traditional OCR methods.
- Document Image Analysis
- Image Segmentation
- Feature Extraction
- Block Classification
- Vectorization Methods (raster to vector conversion)
- Pattern Recognition, especially OCR, ICR, OMR
- Machine Learning (decision trees, neural networks, fuzzy classifiers, support vector machines, ...)
- Clustering Methods
- Document Encoding
- XML/XSL Technologies
- Automatic Layout Generation
- Single Source Publishing
- New and Emerging Output Devices
- Formats for Mobile Devices
- Automated Digitization Processes
- Digital Preservation of Cultural Heritage