Revolucionar el acceso al patrimonio librario: los sistemas de HTR entre humanidades digitales y ciencia de la información

Stefano Bazzaco

Affiliation: Università di Verona
Source: Philologia hispalensis, ISSN-e 2253-8321, ISSN 1132-0265, Vol. 38, No. 2, 2024 (issue devoted to: Estudios literarios), pp. 59-77
Language: Spanish
DOI: 10.12795/PH.2024.v38.i02.03
Parallel titles:
- Revolutionizing Access to Library Heritage: HTR Systems between Digital Humanities and Information Science

Dialnet Métricas: 1 citation

Abstract
  This paper surveys recent developments in the automatic transcription of early printed books and manuscripts with HTR (Handwritten Text Recognition) systems, focusing primarily on the recent creation of general (mixed) HTR models. It first explains the main characteristics of the most widely used tools and the workflow for generating text recognition models. It then presents a significant sample of the models currently available, with attention to the production process, the criteria adopted, and the evaluation of results, drawing on the experience gained by the Progetto Mambrino research group at the University of Verona. Finally, it outlines directions for future research on the creation and dissemination of these resources, emphasizing the need for greater synergy among academia, computing experts, and memory institutions.
