Generación de comportamientos de enjambre en robots móviles a través del uso del aprendizaje por refuerzo.

Quesada Moncayo, Wilson Orlando

dc.contributor.advisor	Calderón Chávez, Juan Manuel
dc.contributor.author	Quesada Moncayo, Wilson Orlando
dc.date.accessioned	2019-02-01T14:13:46Z
dc.date.available	2019-02-01T14:13:46Z
dc.date.issued	2019-01-31
dc.description	This work uses reinforcement learning techniques (Q-Learning) to train a group of robots to generate swarm behaviors. Two candidate solutions with different approaches are presented. In the first, the robot's states are defined by the distance to its two nearest neighbors. In the second, an attraction radius and a repulsion radius are defined, and the states are set by the number of neighbors inside each radius, counted separately in each of the robot's four local quadrants. For each solution, the robot's actions are defined and a reward-and-punishment policy is proposed. Each robot connects with its neighbors once it has reached an appropriate distance. Graph theory is used to measure the connectivity of the swarm and to determine whether the graph formed by the swarm at the end of the simulation is connected. The work assumes that communication between each agent and its neighbors is already solved. Several tests are run in Matlab for each solution, varying the number of robots in the swarm. Finally, the second solution is tested in V-rep using virtual quadcopter robots. The document is structured as follows: Chapters 1 and 2 define the problem and the justification. Chapters 3 and 4 review related work on swarm robotics and define the project objectives. Chapter 5 presents the theoretical concepts used in the project. Chapters 6 and 7 cover the methodological design, project management, activity schedule, and budget. Chapter 8 describes the preliminary work, based on fuzzy logic, carried out before this project. Chapters 9 and 10 present the project design, the proposed solutions, and the tests and results for both solutions. Finally, Chapter 11 presents the conclusions.	eng
dc.description.abstract	In this work we use reinforcement learning techniques (Q-Learning) to train a group of robots to generate swarming behaviors. Two possible solutions with different approaches are presented. In the first proposed solution, the states of the robot are established according to the distance to its two closest neighbors. In the second proposed solution, a radius of attraction and a radius of repulsion are defined, and the states are established according to the number of neighbors within each radius in each of the four local quadrants of the robot. For each proposed solution, the actions of the robot are defined and a policy of rewards and punishments is proposed. Each robot connects with its neighbors once it has reached a prudent distance. Graph theory is used to measure the connectivity of the swarm and to determine whether the graph formed by the swarm at the end of the simulation is connected. It is assumed that the communication of each agent with its neighbors is already resolved. Several tests are carried out in Matlab for each of the proposed solutions, varying the number of robots in the swarm. Finally, the second proposed solution is tested in V-rep using virtual quadcopters. This document is structured as follows: Chapters 1 and 2 define the problem and the justification. Chapters 3 and 4 contain a review of work related to swarm robotics and define the objectives of the project. Chapter 5 presents the theoretical concepts needed in the development of this project. Chapters 6 and 7 show the methodological design, project management, activity schedule, and budget for the project. Chapter 8 shows the earlier work, based on fuzzy logic, that preceded this project. Chapters 9 and 10 show the design of the project, the approach of the proposed solutions, and the tests and results of the two solutions. Finally, the conclusions are given in Chapter 11.	eng
dc.description.degreelevel	Pregrado	spa
dc.description.degreename	Ingeniero Electrónico	spa
dc.format.mimetype	application/pdf
dc.identifier.citation	Quesada Moncayo, W. O. (2019). Generación de comportamientos de enjambre en robots móviles a través del uso del aprendizaje por refuerzo.	spa
dc.identifier.instname	instname:Universidad Santo Tomás	spa
dc.identifier.reponame	reponame:Repositorio Institucional Universidad Santo Tomás	spa
dc.identifier.repourl	repourl:https://repository.usta.edu.co	spa
dc.identifier.uri	http://hdl.handle.net/11634/15223
dc.language.iso	spa
dc.publisher	Universidad Santo Tomás	spa
dc.publisher.branch	CRAI-USTA Bogotá	spa
dc.publisher.faculty	Facultad de Ingeniería Electrónica	spa
dc.publisher.program	Pregrado Ingeniería Electrónica	spa
dc.relation.references	[1] Centre for research on the epidemiology of disasters CRED, “The human cost of natural disasters: A global perspective”, 2015.	spa
dc.relation.references	[2] Yoon, H., Shiftehfar, R., Cho, S., Spencer, B. F., Nelson, M. E., & Agha, G. A. (2016). “Victim Localization and Assessment System for Emergency Responders”. Journal of Computing in Civil Engineering, 30(2), [04015011].	spa
dc.relation.references	[3] Rodríguez, Saith, Eyberth Rojas, Katherín Pérez, Carlos Quintero, Oswaldo Pena, Andrés Reyes, and Juan Calderón. "STOx’s 2013 Team Description Paper." (2013).	spa
dc.relation.references	[4] Quintero, Carlos, Saith Rodríguez, Katherín Pérez, Jorge López, Eyberth Rojas, and Juan Calderón. "Learning soccer drills for the small size league of robocup." In Robot Soccer World Cup, pp. 395-406. Springer, Cham, 2014.	spa
dc.relation.references	[5] Rodríguez, Saith, Eyberth Rojas, Katherín Pérez, Jorge López, Carlos Quintero, and Juan Calderón. "Fast path planning algorithm for the robocup small size league." In Robot Soccer World Cup, pp. 407-418. Springer, Cham, 2014.	spa
dc.relation.references	[6] Rodríguez, Saith, Eyberth Rojas, Katherín Pérez, Jorge López, Carlos Quintero, Juan Manuel Calderón, and Oswaldo Pena. "STOx’s 2015 Extended Team Description Paper." Joao Pessoa, Brazil (2014).	spa
dc.relation.references	[7] Cardona, Gustavo A., Wilfrido Moreno, Alfredo Weitzenfeld, and Juan M. Calderon. "Reduction of impact force in falling robots using variable stiffness." In SoutheastCon, 2016, pp. 1-6. IEEE, 2016.	spa
dc.relation.references	[8] Elibol, Ercan, Juan Calderon, Martin Llofriu, Carlos Quintero, Wilfrido Moreno, and Alfredo Weitzenfeld. "Power usage reduction of humanoid standing process using q-learning." In Robot Soccer World Cup, pp. 251-263. Springer, Cham, 2015.	spa
dc.relation.references	[9] Calderón, Juan M., Wilfrido Moreno, and Alfredo Weitzenfeld. "Fuzzy variable stiffness in landing phase for jumping robot." In Innovations in bio-inspired computing and applications, pp. 511-522. Springer, Cham, 2016.	spa
dc.relation.references	[10] Elibol, Ercan, Juan Calderon, Martin Llofriu, Wilfrido Moreno, and Alfredo Weitzenfeld. "Analyzing and Reducing Energy Usage in a Humanoid Robot During Standing Up and Sitting Down Tasks." International Journal of Humanoid Robotics 13, no. 04 (2016): 1650014.	spa
dc.relation.references	[11] Calderon, Juan, Gustavo A. Cardona, Martin Llofriu, Muhaimen Shamsi, Fallon Williams, Wilfrido Moreno, and Alfredo Weitzenfeld. "Impact Force Reduction Using Variable Stiffness with an Optimal Approach for Falling Robots." In Robot World Cup, pp. 404-415. Springer, Cham, 2016.	spa
dc.relation.references	[12] Calderon, Juan M., Eyberth R. Rojas, Saith Rodriguez, Heyson R. Baez, and Jorge A. Lopez. "A Robot soccer team as a strategy to develop educational iniciatives." In Latin American and Caribbean Conference for Engineering and Technology, Panama City, Panama. 2012.	spa
dc.relation.references	[13] Baez, Heyson, Katherin Perez, Eyberth Rojas, Saith Rodriguez, Jorge Lopez, Carlos Quintero, and Juan Manuel Calderon. "Application of an educational strategy based on a soccer robotic platform." In Advanced Robotics (ICAR), 2013 16th International Conference on, pp. 1-6. IEEE, 2013.	spa
dc.relation.references	[14] M.J. Mataric, “Reinforcement Learning in the Multi-Robot Domain”, Autonomous Robots 4, 73–83 (1997)	spa
dc.relation.references	[15] A Roadmap for US Robotics: From Internet to Robotics. 2016 Edition. Available online. URL: http://jacobsschool.ucsd.edu/contextualrobotics/docs/rm3-final-rs.pdf	spa
dc.relation.references	[16] Y. Zennir, “Apprentissage par renforcement et système distribués: application a l'apprentissage de la marche d'un robot hexapode”, Ph.D. Thesis, Institut National Des Sciences Appliquées De Lyon, 2004. Available online. URL: http://theses.insa-lyon.fr/publication/2004ISAL0034/these.pdf	spa
dc.relation.references	[17] J. León, “Simulación De Enjambres De Robots En Labores De Exploración Para Detección De Posibles Víctimas”, Master's thesis in Electronic Engineering, Universidad Santo Tomás, Bogotá D.C., 2017.	spa
dc.relation.references	[18] M. Brambilla, E. Ferrante, M. Birattari, and M. Dorigo, “Swarm robotics: a review from the swarm engineering perspective”, Swarm Intelligence, vol. 7, no. 1, pp. 1–41, 2013.	spa
dc.relation.references	[19] León, Jose, Gustavo A. Cardona, Andres Botello, and Juan M. Calderón. "Robot swarms theory applicable to seek and rescue operation." In International Conference on Intelligent Systems Design and Applications, pp. 1061-1070. Springer, Cham, 2016.	spa
dc.relation.references	[20] León, José, Gustavo A. Cardona, Luis G. Jaimes, Juan M. Calderón, and Pablo Ospina Rodriguez. "Rendezvous Consensus Algorithm Applied to the Location of Possible Victims in Disaster Zones." In International Conference on Artificial Intelligence and Soft Computing, pp. 700-710. Springer, Cham, 2018.	spa
dc.relation.references	[21] Yanguas-Rojas, David, Gustavo A. Cardona, Juan Ramirez-Rugeles, and Eduardo Mojica-Nava. "Victims search, identification, and evacuation with heterogeneous robot networks for search and rescue." In Automatic Control (CCAC), 2017 IEEE 3rd Colombian Conference on, pp. 1-6. IEEE, 2017.	spa
dc.relation.references	[22] S. Zhiguo, T. Jun, Z. Qiao, Z. Xiaomeng, W. Junming, "The Improved Q-Learning Algorithm based on Pheromone Mechanism for Swarm Robot System", IEEE 32nd Chinese Control Conference (CCC), pp. 6033-6038, 2013.	spa
dc.relation.references	[23] A. Šošić, A.M. Zoubir, H. Koeppl, “Reinforcement learning in a continuum of agents”, Swarm Intelligence, vol. 12, no. 1, pp. 23–51, 2018.	spa
dc.relation.references	[24] W.O. Quesada, J.I. Rodríguez, J.C. Murillo, G.A. Cardona, D.Y. Rojas, L.G. Jaimes, J.M. Calderón, “Leader-Follower Formation for UAV Robot Swarm Based on Fuzzy Logic Theory”, Artificial Intelligence and Soft Computing. ICAISC 2018. Lecture Notes in Computer Science, vol 10842. Springer, Cham	spa
dc.relation.references	[25] P.J. Denning, “Computer Science: The Discipline”, 1999. Available online. URL: http://denninginstitute.com/pjd/PUBS/ENC/cs99.pdf	spa
dc.relation.references	[26] F.S. Caparrini, “Introducción al aprendizaje automático”, Article, Dpto. de Ciencias de la Computación e Inteligencia Artificial, Universidad de Sevilla. Retrieved online in May 2018 from http://www.cs.us.es/~fsancho/?e=75	spa
dc.relation.references	[27] R.S. Sutton, A.G. Barto, “Reinforcement Learning: An Introduction”, Near-final draft, May 27, 2018. Available online. URL: http://incompleteideas.net/book/the-book-2nd.html	spa
dc.relation.references	[28] DL4J, A Beginner’s Guide to Deep Reinforcement Learning. Retrieved online in May 2018 from: https://deeplearning4j.org/deepreinforcementlearning	spa
dc.relation.references	[29] Analytics Vidhya, Simple Beginner’s guide to Reinforcement Learning & its implementation. Retrieved online in May 2018 from: https://www.analyticsvidhya.com/blog/2017/01/introduction-to-reinforcement-learning-implementation/	spa
dc.relation.references	[30] Intel AI, Guest Post (Part I): Demystifying Deep Reinforcement Learning. Retrieved online in May 2018 from: https://ai.intel.com/demystifying-deep-reinforcement-learning/	spa
dc.relation.references	[31] G. Beni, “From Swarm Intelligence to Swarm Robotics”, International Workshop on Swarm Robotics, SR 2004: Swarm Robotics pp 1-9.	spa
dc.relation.references	[32] F.L. Lewis, “Cooperative Control of Multi-Agent Systems - Optimal and Adaptive Design Approaches”, Communications and Control Engineering.	spa
dc.relation.references	[33] https://link.springer.com/chapter/10.1007/978-3-319-91262-2_65	spa
dc.relation.references	[34] https://www.scimagojr.com/journalsearch.php?q=25674&tip=sid&clean=0	spa
dc.rights	Atribución-NoComercial-SinDerivadas 2.5 Colombia
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.rights.coar	http://purl.org/coar/access_right/c_abf2
dc.rights.local	Abierto (Texto Completo)	spa
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.subject.keyword	Machine learning	spa
dc.subject.keyword	Multi-agent systems	spa
dc.subject.keyword	Q-Learning	spa
dc.subject.keyword	Reinforcement learning	spa
dc.subject.keyword	Swarm intelligence	spa
dc.subject.keyword	Swarm robotics	spa
dc.subject.lemb	Inteligencia artificial	spa
dc.subject.lemb	Inteligencia de enjambre	spa
dc.subject.lemb	Robótica	spa
dc.subject.proposal	Aprendizaje automático	spa
dc.subject.proposal	Aprendizaje por refuerzo	spa
dc.subject.proposal	Inteligencia de enjambre	spa
dc.subject.proposal	Q-Learning	spa
dc.subject.proposal	Robótica de enjambre	spa
dc.subject.proposal	Sistemas multiagente	spa
dc.title	Generación de comportamientos de enjambre en robots móviles a través del uso del aprendizaje por refuerzo.	spa
dc.type	bachelor thesis
dc.type.category	Formación de Recurso Humano para la CTeI: Trabajo de grado de pregrado	spa
dc.type.coar	http://purl.org/coar/resource_type/c_7a1f
dc.type.coarversion	http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.drive	info:eu-repo/semantics/bachelorThesis
dc.type.local	Tesis de pregrado	spa
dc.type.version	info:eu-repo/semantics/acceptedVersion
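
The second solution described in the abstract (neighbor counts per local quadrant inside a repulsion radius and an attraction radius, learned with Q-Learning, with a final graph-connectivity check) can be illustrated with a minimal Python sketch. This is not the thesis implementation, which was tested in Matlab and V-rep. The radii, the count cap, the learning parameters, the function names, and the choice of breadth-first search for the connectivity test are all assumptions made for illustration. The sketch also assumes that the robot's local frame is aligned with the world axes and that the attraction zone excludes the repulsion zone.

```python
import math
from collections import deque

# Every constant and name here is a hypothetical choice for illustration,
# not a value or identifier taken from the thesis.
R_REPULSION = 1.0   # assumed repulsion radius
R_ATTRACTION = 3.0  # assumed attraction radius
MAX_COUNT = 2       # cap on neighbor counts to keep the state space small


def quadrant(dx, dy):
    """Return the quadrant index (0..3) of a relative offset (dx, dy)."""
    if dx >= 0 and dy >= 0:
        return 0
    if dx < 0 and dy >= 0:
        return 1
    if dx < 0 and dy < 0:
        return 2
    return 3


def encode_state(me, others):
    """Count neighbors per quadrant in each zone and return a hashable state."""
    rep = [0, 0, 0, 0]
    att = [0, 0, 0, 0]
    for ox, oy in others:
        dx, dy = ox - me[0], oy - me[1]
        d = math.hypot(dx, dy)
        q = quadrant(dx, dy)
        if d <= R_REPULSION:
            rep[q] = min(rep[q] + 1, MAX_COUNT)
        elif d <= R_ATTRACTION:
            att[q] = min(att[q] + 1, MAX_COUNT)
    return tuple(rep), tuple(att)


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


def is_connected(positions, link_range):
    """Breadth-first search over the proximity graph.

    Two robots are linked when their distance is at most link_range.
    Returns True if every robot is reachable from robot 0.
    """
    n = len(positions)
    if n == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        xi, yi = positions[i]
        for j in range(n):
            if j in seen:
                continue
            xj, yj = positions[j]
            if math.hypot(xi - xj, yi - yj) <= link_range:
                seen.add(j)
                queue.append(j)
    return len(seen) == n
```

In a training loop, each robot would call `encode_state` on its own position and those of the other robots, pick an action, receive a reward from the reward-and-punishment policy, and call `q_update`. `is_connected` would be evaluated once on the final positions. A different connectivity measure, for example one based on the graph Laplacian, could replace the breadth-first search.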