Inclusión de la lengua Wayuunaiki en el reconocimiento de comandos de voz del robot social Pepper empleando la metodología de transformación de modelos

Rojas, Armando Mateus; Amaya, Sindy Paola

dc.contributor.author	Rojas, Armando Mateus
dc.contributor.author	Amaya, Sindy Paola
dc.contributor.cvlac	https://scienti.colciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0000680630
dc.contributor.cvlac	https://scienti.colciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0000796425
dc.contributor.googlescholar	https://scholar.google.es/citations?hl=es&pli=1&user=1az5o_IAAAAJ
dc.contributor.googlescholar	https://scholar.google.es/citations?hl=es&authuser=2&user=Gg2sofAAAAAJ
dc.contributor.orcid	https://orcid.org/0000-0002-2399-4859
dc.contributor.orcid	https://orcid.org/0000-0002-1714-1593
dc.date.accessioned	2020-04-20T17:17:13Z
dc.date.available	2020-04-20T17:17:13Z
dc.date.issued	2019-08
dc.description	La presente propuesta busca dotar al robot social Pepper, con el que cuenta la Universidad Santo Tomás, de la capacidad de reconocimiento para comandos de voz en lengua Wayuunaiki. De esta forma, se permitirá la utilización de funciones de robótica social más avanzadas como la asistencia y el servicio. Así mismo, a través de los resultados y productos esperados se pretende disminuir la brecha tecnológica en la comunidad wayuu proveyendo nuevas capacidades al proyecto Kailumá de la Universidad Santo Tomás. Las herramientas de reconocimiento de voz disponibles están basadas en tecnologías que requieren de una gran cantidad de datos previos junto con un proceso de entrenamiento de dichas herramientas. Esto hace que el soporte de idiomas por parte de estas herramientas se dé para idiomas como el Español e Inglés pero no para lenguas con pocos hablantes, como es el caso de las lenguas indígenas colombianas. Por lo anterior, se propone una solución a esta problemática mediante la utilización de cadenas de transformación de modelos. Para esto, se modelan los comandos de voz en idioma Wayuunaiki que serán la entrada de la cadena de transformación; así, se requiere configurar, entrenar y modelar una herramienta de reconocimiento de voz (speech recognition) que servirá como salida de la cadena de transformación. La generación de dichos modelos surge del análisis del conjunto de comandos de voz para robótica social predefinidos para un entorno de servicio doméstico. La cadena de transformación será implementada como un nodo de ROS para ser habilitada en el robot Pepper	spa
dc.description.abstract	This proposal seeks to give the Pepper social robot owned by Universidad Santo Tomás the ability to recognize voice commands in the Wayuunaiki language. This will enable more advanced social robotics functions such as assistance and service. The expected results and products are also intended to narrow the technological gap in the Wayuu community by adding new capabilities to the Kailumá project of Universidad Santo Tomás. Available speech recognition tools rely on technologies that require large amounts of existing data and a training process. As a result, these tools support languages such as Spanish and English but not languages with few speakers, as is the case for Colombia's indigenous languages. To address this problem, the proposal uses model transformation chains. Voice commands in Wayuunaiki are modeled as the input of the transformation chain; a speech recognition tool is then configured, trained, and modeled to serve as the output of the chain. These models are derived from an analysis of a predefined set of social robotics voice commands for a domestic service environment. The transformation chain will be implemented as a ROS node to be deployed on the Pepper robot.	spa
dc.description.domain	http://unidadinvestigacion.usta.edu.co	spa
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/11634/22652
dc.publisher.branch	CRAI-USTA Bogotá	spa
dc.relation.references	Constitución Política de Colombia, Artículo 7. (1991).	spa
dc.relation.references	Mansen, Captain. Lenguas Indígenas de Colombia. Instituto Caro y Cuervo. (2000)	spa
dc.relation.references	S. Roy, A. K. Maiti, I. Ghosh, I. Chatterjee, and K. Ghosh, A new assistive technology in Android platform to aid vocabulary knowledge acquirement in Indian Sign Language for better reading comprehension in L2 and mathematical ability, 2019 6th International Conference on Signal Processing and Integrated Networks (SPIN), March 2019, pp. 408-413.	spa
dc.relation.references	How to Create a Great Experience with Pepper. SoftBank Robotics. (2017).	spa
dc.relation.references	Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng, Deep Speech: Scaling up end-to-end speech recognition, (2014).	spa
dc.relation.references	R. Bolhassan, J. Cranefield, and D. Dorner, Indigenous knowledge sharing in Sarawak: A system-level view and its implications for the cultural heritage sector, 2014 47th Hawaii International Conference on System Sciences, Jan 2014, pp. 3378-3388.	spa
dc.relation.references	Sebastian Schneider, Michael Goerlich, and Franz Kummert, A framework for designing socially assistive robot interactions, Cognitive Systems Research 43 (2017), 301-312.	spa
dc.relation.references	Y. Shi, J. Bai, P. Xue, and D. Shi, Fusion feature extraction based on auditory and energy for noise-robust speech recognition, IEEE Access 7 (2019), 81911-81922	spa
dc.relation.references	T. Barman and N. Deb, State of the art review of speech recognition using genetic algorithm, 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), Sep. 2017, pp. 2944-2946.	spa
dc.relation.references	El Buscador. Universidad Santo Tomás. Oct 2018	spa
dc.relation.references	Sebastian Weigelt and Walter Tichy, Poster: Pronat: An agent-based system design for programming in spoken natural language, 05 2015, pp. 819-820.	spa
dc.relation.references	R. Mead, Semio: Developing a cloud-based platform for multimodal conversational ai in social robotics, 2017 IEEE International Conference on Consumer Electronics (ICCE), Jan 2017, pp. 291-292.	spa
dc.relation.references	D. Zhang, L. Wu, S. Li, Q. Zhu, and G. Zhou, Multi-modal language analysis with hierarchical interaction-level and selection-level attentions, 2019 IEEE International Conference on Multimedia and Expo (ICME), July 2019, pp. 724-729.	spa
dc.relation.references	Tumisho Mokgonyane, Tshephisho Sefara, Thipe Modipa, Mercy Mogale, Madimetja Manamela, and Phuti Manamela, Automatic speaker recognition system based on machine learning algorithms, 01 2019.	spa
dc.relation.references	Jian-Hua Tao, Jian Huang, Ya Li, Zheng Lian, and Ming-Yue Niu, Semi-supervised ladder networks for speech emotion recognition, International Journal of Automation and Computing	spa
dc.relation.references	Y. Suh, Y. Kim, H. Lim, J. Goo, Y. Jung, Y. Choi, H. Kim, D. Choi, and Y. Lee, Development of distant multi-channel speech and noise databases for speech recognition by in-door conversational robots, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA), Nov 2017, pp. 1-4	spa
dc.relation.references	Jianxin Peng, Lei Zhao, and Yanmei Jiang, Investigation of word recognition for the elderly in speech and noise spatial separation, Applied Acoustics 153 (2019), 48-52.	spa
dc.relation.references	David R. Appleton, What do we mean by a statistical model?, Statistics in Medicine 14 (1995), no. 2, 185-197.	spa
dc.relation.references	J. L. Mayorga, C. Dominguez-Bonilla, A. Gutierrez, F. Jimenez, and H. Chamorro, Development of real-time control emulator in FPGA using HiLeS methodology, IECON 2015 - 41st Annual Conference of the IEEE Industrial Electronics Society, Nov 2015, pp. 004076-004081.	spa
dc.relation.references	P. Serai, P. Wang, and E. Fosler-Lussier, Improving speech recognition error prediction for modern and off-the-shelf speech recognizers, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019, pp. 7255-7259.	spa
dc.relation.references	L. Li, D. Wang, Y. Chen, Y. Shi, Z. Tang, and T. F. Zheng, Deep factorization for speech signal, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2018, pp. 5094-5098.	spa
dc.relation.references	T. Athanaselis, S. Bakamidis, G. Giannopoulos, I. Dologlou, and E. Fotinea, Robust speech recognition in the presence of noise using medical data, 2008 IEEE International Workshop on Imaging Systems and Techniques, Sep. 2008, pp. 349-352.	spa
dc.relation.references	D. Iguaran Fernandez, A. Molina, O. Quintero and O. Bedoya, Design and implementation of an web api for the automatic translation Colombia's language pairs: Spanish-wayuunaiki case, 05 2013, pp. 1-9	spa
dc.relation.references	Hölldobler, B. Rumpe, and I. Weisemöller, Systematically deriving domain-specific transformation languages, 2015 ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS), Sep. 2015, pp. 136-145.	spa
dc.relation.references	E. Kasano, S. Muramatsu, A. Matsufuji, E. Sato-Shimokawara, and T. Yamaguchi, Estimation of speaker's confidence in conversation using speech information and head motion, 2019 16th International Conference on Ubiquitous Robots (UR), June 2019, pp. 294-298.	spa
dc.relation.references	J. Loewen and Kinshuk, The need for technological innovations for indigenous knowledge transfer in culturally inclusive education, 2012 IEEE 12th International Conference on Advanced Learning Technologies, July 2012, pp. 577-578.	spa
dc.relation.references	M. Masinde and P. N. Thothela, Itiki plus: A mobile based application for integrating indigenous knowledge and scientific agro-climate decision support for Africa's small-scale farmers, 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT), March 2019, pp. 303-309.	spa
dc.relation.references	T. B. Mokgonyane, T. J. Sefara, T. I. Modipa, M. M. Mogale, M. J. Manamela, and P. J. Manamela, Automatic speaker recognition system based on machine learning algorithms, 2019 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA), Jan 2019, pp. 141-146	spa
dc.relation.references	M. Pleva, J. Juhar, S. Ondas, C. R. Hudson, C. L. Bethel, and D. W. Carruth, Novice user experiences with a voice-enabled human-robot interaction tool, 2019 29th International Conference Radioelektronika (RADIOELEKTRONIKA), April 2019, pp. 1-5	spa
dc.rights	Atribución-NoComercial-SinDerivadas 2.5 Colombia
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.subject.keyword	Speech recognition	spa
dc.subject.keyword	Indigenous languages	spa
dc.subject.keyword	Model transformation chains	spa
dc.subject.keyword	Social robotics	spa
dc.subject.proposal	Reconocimiento de voz	spa
dc.subject.proposal	Speech recognition	spa
dc.subject.proposal	Lenguas indígenas	spa
dc.subject.proposal	Cadenas de transformación de modelos	spa
dc.subject.proposal	Robótica social	spa
dc.title	Inclusión de la lengua Wayuunaiki en el reconocimiento de comandos de voz del robot social Pepper empleando la metodología de transformación de modelos	spa
dc.type.category	Formación de Recurso Humano para la CTeI: Proyecto ejecutado con investigadores en empresas, industrias y Estado	spa