Aprendizaje por refuerzo para manipulación de objetos con actuador robótico

Camacho Poveda, Edgar Camilo; Pérez Gordillo, Fabian Eduardo

dc.contributor.author	Camacho Poveda, Edgar Camilo
dc.contributor.author	Pérez Gordillo, Fabian Eduardo
dc.contributor.cvlac	http://scienti.colciencias.gov.co:8081/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0001630084
dc.contributor.cvlac	http://scienti.colciencias.gov.co:8081/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0001516111
dc.contributor.googlescholar	https://scholar.google.com/citations?user=tJG988kAAAAJ&hl=es
dc.contributor.googlescholar	https://scholar.google.com/citations?user=vncSAb0AAAAJ&hl=es
dc.contributor.orcid	https://orcid.org/0000-0002-6084-2512
dc.contributor.orcid	https://orcid.org/0000-0002-2746-8733
dc.date.accessioned	2020-04-20T17:14:54Z
dc.date.available	2020-04-20T17:14:54Z
dc.date.issued	2019-08
dc.description	Este proyecto implementará un algoritmo de aprendizaje por refuerzo para la manipulación de objetos por parte de un brazo robótico, enfocado en las tareas que deberá realizar un robot de servicio doméstico. Partirá de la implementación de la plataforma robótica, la cual tendrá los grados de libertad y el actuador específico para la manipulación de objetos encontrados en situaciones cotidianas. A continuación, se planteará el algoritmo de aprendizaje por refuerzo, partiendo del espacio de estados que mejor describa el ambiente y del espacio de acciones que permita realizar las tareas de forma adecuada. Asimismo, se estudiarán métodos que evalúen el estado a partir de la observación directa del ambiente, como las redes neuronales profundas. El entrenamiento se realizará en su mayoría en un entorno de simulación, con el fin de no poner en riesgo la integridad física del robot en las primeras etapas de aprendizaje. Al final del proceso, si se considera necesario, se refinará con el robot real el entrenamiento obtenido. Posteriormente, se evaluará cuantitativamente el resultado obtenido por el agente en las diferentes tareas aprendidas. Finalmente, se iniciará la integración de este brazo robótico en un prototipo de robot social doméstico construido en su totalidad por la Universidad Santo Tomás, con el fin de participar en certámenes de robótica social y doméstica.	spa
dc.description.abstract	This project will implement a reinforcement learning algorithm for object manipulation by a robotic arm, focused on the tasks that a domestic service robot must perform. It will start with the implementation of the robotic platform, which will have the degrees of freedom and the specific actuator needed to manipulate objects found in everyday situations. Next, the reinforcement learning algorithm will be formulated, starting from the state space that best describes the environment and the action space that allows the tasks to be carried out properly. Methods that evaluate the state from direct observation of the environment, such as deep neural networks, will also be studied. Training will be carried out mostly in a simulation environment, so as not to jeopardize the physical integrity of the robot in the early stages of learning. At the end of the process, if deemed necessary, the training obtained will be refined on the real robot. The results obtained by the agent in the different learned tasks will then be evaluated quantitatively. Finally, work will begin on integrating this robotic arm into a prototype of a domestic social robot built entirely by the Universidad Santo Tomás, in order to participate in social and domestic robotics competitions.	spa
dc.description.domain	http://unidadinvestigacion.usta.edu.co	spa
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/11634/22648
dc.publisher.branch	CRAI-USTA Bogotá	spa
dc.relation.references	SPARC. The Partnership for Robotics in Europe, Robotics 2020 Multi-Annual Roadmap For Robotics in Europe, Europe: – Horizon 2020, 201	spa
dc.relation.references	Ministerio de Salud y Protección Social, «Envejecimiento demográfico. Colombia 1951-2020: dinámica demográfica y estructuras poblacionales,» Bogotá DC, 2013	spa
dc.relation.references	G. Nargund, «Declining birth rate in Developed Countries: A radical policy re-think is required,» Facts Views Vis Obgyn, vol. 1, p. 191–193, 2009.	spa
dc.relation.references	Revista Dinero, «Cada vez nacen menos bebés en Colombia, ¿hacia dónde vamos?,» 22 11 2018. [En línea]. Available: https://www.dinero.com/edicionimpresa/pais/articulo/tasa-de-natalidad-en-colombia-disminuye-cada-vez-mas/264428. [Último acceso: 28 07 2019]	spa
dc.relation.references	University of California San Diego, Carnegie Mellon University, others, A Roadmap for US Robotics. From Internet to Robotics. 2016 Edition, USA, 2016.	spa
dc.relation.references	E. B. a. B. M. Tony Kuo, «Designing a robotic assistant for healthcare applications,» Robotics in Healthcare Applications, 2018.	spa
dc.relation.references	R. S. Sutton y A. G. Barto, Reinforcement Learning: An Introduction, London, England: The MIT Press, 2018.	spa
dc.relation.references	C. Watkins, «Learning From Delayed Rewards,» King's College, Cambridge, UK, 1989.	spa
dc.relation.references	H. Nguyen y H. M. La, «Review of Deep Reinforcement Learning for Robot Manipulation,» 2019 Third IEEE International Conference on Robotic Computing (IRC), 2019.	spa
dc.relation.references	S. Kelly y M. I. Heywood, «Emergent Solutions to High-Dimensional Multitask Reinforcement Learning,» Evolutionary Computation, vol. 26, nº 3, pp. 347-380, 2018.	spa
dc.relation.references	V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra y M. Riedmiller, «Playing Atari with Deep Reinforcement Learning,» de NIPS Deep Learning Workshop 2013, Lake Tahoe, EE. UU., 2013.	spa
dc.relation.references	M. Tokic, «Adaptive ε-Greedy Exploration in Reinforcement Learning Based on Value Differences,» Springer Berlin Heidelberg, Berlin, Heidelberg, 2010	spa
dc.relation.references	J. Schulman, S. Levine, P. Moritz, M. I. Jordan y P. Abbeel, «Trust Region Policy Optimization,» de Intern. Conf. on Machine Learning, 2015.	spa
dc.relation.references	J. Schulman, F. Wolski, P. Dhariwal, A. Radford y O. Klimov, «Proximal policy optimization algorithms,» arXiv, vol. abs/1707.06347, 2017.	spa
dc.relation.references	T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver y D. Wierstra, «Continuous control with deep reinforcement learning,» arXiv, nº arXiv:1509.02971, 2015.	spa
dc.relation.references	T. Haarnoja, A. Zhou, P. Abbeel y S. Levine, «Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor,» arXiv, nº arXiv:1801.01290.	spa
dc.relation.references	S. Sukhbaatar, Z. Lin, I. Kostrikov, G. Synnaeve, A. Szlam y R. Fergus, «Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play,» de Sixth International Conference on Learning Representations, Vancouver, Canadá, 2018.	spa
dc.relation.references	M. Plappert, R. Houthooft, P. Dhariwal, S. Sidor, R. Y. Chen, X. Chen, T. Asfour, P. Abbeel y M. Andrychowicz, «Parameter Space Noise for Exploration,» de Sixth International Conference on Learning Representations, Vancouver, Canadá, 2018.	spa
dc.relation.references	J. J. Sweafford y F. Fahimi, «MODEL-FREE ONLINE REINFORCEMENT LEARNING OF A ROBOTIC MANIPULATOR,» MECHATRONIC SYSTEMS AND CONTROL, vol. 47, nº 3, pp. 136-143, 2019.	spa
dc.relation.references	M. Breyer, F. Furrer, T. Novkovic, R. Siegwart y J. Nieto, «Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning,» IEEE ROBOTICS AND AUTOMATION LETTERS, vol. 4, nº 2, pp. 1549-1556, 2019.	spa
dc.relation.references	S. Krishnan, A. Garg, R. Liaw, B. Thananjeyan, L. Miller, F. T. Pokorny y K. Goldberg, «SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards,» INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, vol. 38, nº 2-3, pp. 126-145, 2019.	spa
dc.relation.references	T. G. Thuruthel, E. Falotico, F. Renda y C. Laschi, «Model-Based Reinforcement Learning for Closed-Loop Dynamic Control of Soft Robotic Manipulators,» IEEE TRANSACTIONS ON ROBOTICS, vol. 35, nº 1, pp. 124-134, 2019.	spa
dc.relation.references	Y. Tsurumine, Y. Cui, E. Uchibe y T. Matsubara, «Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation,» ROBOTICS AND AUTONOMOUS SYSTEMS, vol. 112, pp. 72-83, 2018.	spa
dc.relation.references	F. Amadio, A. Colome y C. Torras, «Exploiting Symmetries in Reinforcement Learning of Bimanual Robotic Tasks,» IEEE ROBOTICS AND AUTOMATION LETTERS, vol. 4, nº 2, pp. 1838-1845, 2019.	spa
dc.relation.references	A. Iriondo, E. Lazkano, L. Susperregi, J. Urain y A. Fernandez, «Pick and Place Operations in Logistics Using a Mobile Manipulator Controlled with Deep Reinforcement Learning,» APPLIED SCIENCES-BASEL, vol. 9, nº 2, p. 348, 2019.	spa
dc.relation.references	M. Matamoros, C. Rascon, S. Wachsmuth, A. Moriarty, J. Kummert, J. Hart, S. Pfeiffer, M. van der Brugh y M. St-Pierre, RoboCup@Home, Rules & Regulations, 2019	spa
dc.relation.references	H. Li, R. Cai, N. Liu, X. Lin y Y. Wang, «Deep reinforcement learning: Algorithm, applications, and ultra-low-power implementation,» NANO COMMUNICATION NETWORKS, vol. 16, pp. 81-90, 2018.	spa
dc.rights	Atribución-NoComercial-SinDerivadas 2.5 Colombia
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.subject.keyword	Reinforcement learning	spa
dc.subject.keyword	Machine learning	spa
dc.subject.keyword	Social robotics	spa
dc.subject.keyword	Object manipulation	spa
dc.subject.proposal	Aprendizaje por refuerzo	spa
dc.subject.proposal	Aprendizaje de máquina	spa
dc.subject.proposal	Robótica social	spa
dc.subject.proposal	Manipulación de objetos	spa
dc.title	Aprendizaje por refuerzo para manipulación de objetos con actuador robótico	spa
dc.type.category	Formación de Recurso Humano para la CTeI: Proyecto ejecutado con investigadores en empresas, industrias y Estado	spa