Aprendizaje de movimientos en robot humanoide a partir de inferencia de objetivos

dc.contributor.advisor: Camacho Poveda, Edgar Camilo
dc.contributor.advisor: Higuera Arias, Carolina
dc.contributor.author: Suarez Huertas, Yeison Estiven
dc.contributor.cvlac: https://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0001630084
dc.contributor.cvlac: http://scienti.colciencias.gov.co:8081/cvlac/visualizador/generarCurriculoCv.do?cod_rh=
dc.contributor.googlescholar: https://scholar.google.es/citations?user=tJG988kAAAAJ&hl=es
dc.contributor.googlescholar: https://scholar.google.es/citations?user=ZaxycbsAAAAJ&hl=es
dc.contributor.orcid: https://orcid.org/0000-0002-6084-2512
dc.contributor.orcid: https://orcid.org/0000-0001-5141-0817
dc.coverage.campus: CRAI-USTA Bogotá
dc.date.accessioned: 2020-09-26T00:10:16Z
dc.date.available: 2020-09-26T00:10:16Z
dc.date.issued: 2020-08-16
dc.description: This document presents the application of inverse reinforcement learning (IRL) on a humanoid robot known as Poppy Torso, in order to perform upper-limb movements. Inverse reinforcement learning is based on learning from the demonstrations (trajectories) of an expert. To obtain a final utility as close as possible to the utility achieved by the expert along its trajectory, reinforcement learning (RL) is first implemented with a fully specified reward within the designed environment, which fulfilled the objective of generating movements from a random point to a set point. In simulation, the robot reaches its target in most cases (97.5% over 1000 trials), both with reinforcement learning and with inverse reinforcement learning.
dc.description.abstract: This document presents the application of inverse reinforcement learning (IRL) on a humanoid robot known as Poppy Torso, in order to perform upper-limb movements. IRL is based on learning from the demonstrations (trajectories) of an expert. To obtain a final utility as close as possible to the utility achieved by the expert in its task, reinforcement learning (RL) is first implemented with a fully specified reward within the designed environment, which fulfilled the objective of generating movements from a random point to a set point. In simulation, the robot reaches its target in most cases (97.5% over 1000 trials), both with reinforcement learning and with inverse reinforcement learning.
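The RL stage described in the abstract (a fully specified, hand-crafted reward guiding movements from a random starting point to a set point) can be illustrated with a minimal tabular Q-learning sketch. This is only an illustration under assumptions: the 1-D grid environment, reward values, and hyperparameters below are invented for the example and are not the thesis environment or code (the actual work controls the Poppy Torso's upper limbs in simulation).

```python
import random

# Toy environment, assumed for illustration: positions on a 1-D grid; the
# agent starts at a random cell and must reach a fixed goal cell ("set point").
N_STATES = 10
GOAL = 7
ACTIONS = (-1, +1)            # move left / move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

# Q-table: one row per state, one column per action.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Apply an action; the reward is fully established by hand."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else -0.01   # goal bonus, small step cost
    return nxt, reward, nxt == GOAL

def greedy(state):
    """Pick the action with the highest Q-value for this state."""
    return max((0, 1), key=lambda a: Q[state][a])

def train(episodes=1000):
    for _ in range(episodes):
        s = random.randrange(N_STATES)       # random starting point
        for _ in range(50):
            a = random.randrange(2) if random.random() < EPS else greedy(s)
            nxt, r, done = step(s, a)
            # Standard Q-learning update toward the bootstrapped target.
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[nxt]) - Q[s][a])
            s = nxt
            if done:
                break

def reaches_goal(start):
    """Roll out the learned greedy policy and check it reaches the goal."""
    s = start
    for _ in range(N_STATES):
        if s == GOAL:
            return True
        s, _, _ = step(s, greedy(s))
    return s == GOAL
```

In the IRL stage the thesis describes, this hand-crafted `reward` would not be given; instead it would be inferred from expert trajectories (as in the cited apprenticeship-learning approach of Abbeel and Ng), and a policy trained against the inferred reward.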
dc.description.degreelevel: Pregrado
dc.description.degreename: Ingeniero Electrónico
dc.description.domain: http://unidadinvestigacion.usta.edu.co
dc.format.mimetype: application/pdf
dc.identifier.citation: Suarez Huertas, Y. E. (2020). Aprendizaje de movimientos en robot humanoide a partir de inferencia de objetivos [undergraduate thesis, Universidad Santo Tomás]. Repositorio Institucional - Universidad Santo Tomás
dc.identifier.instname: instname:Universidad Santo Tomás
dc.identifier.reponame: reponame:Repositorio Institucional Universidad Santo Tomás
dc.identifier.repourl: repourl:https://repository.usta.edu.co
dc.identifier.uri: http://hdl.handle.net/11634/30072
dc.language.iso: spa
dc.publisher: Universidad Santo Tomás
dc.publisher.faculty: Facultad de Ingeniería Electrónica
dc.publisher.program: Pregrado Ingeniería Electrónica
dc.relation.references: R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. The MIT Press, 2nd ed., 2018.
dc.relation.references: Y. Suarez, C. Higuera, and E. C. Camacho, "Inverse reinforcement learning application for discrete and continuous environments," in AETA 2019 - Recent Advances in Electrical Engineering and Related Sciences: Theory and Application, Springer International Publishing, 2020.
dc.relation.references: M. Lapeyre, P. Rouanet, and J. Grizou, "The Poppy project." https://www.poppy-project.org/, 2012.
dc.relation.references: P. Manceron, "IKPy library." https://github.com/Phylliade/ikpy, 2018.
dc.relation.references: P. Abbeel and A. Y. Ng, "Apprenticeship learning via inverse reinforcement learning," in Proceedings of the Twenty-first International Conference on Machine Learning, ICML '04, New York, NY, USA, 2004.
dc.relation.references: H. van Hasselt, A. Guez, and D. Silver, "Deep reinforcement learning with double Q-learning." https://arxiv.org/pdf/1509.06461.pdf, 2015.
dc.relation.references: V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, et al., "Playing Atari with deep reinforcement learning." https://arxiv.org/pdf/1312.5602.pdf, 2013.
dc.relation.references: N. Ratliff, J. A. Bagnell, and S. S. Srinivasa, "Imitation learning for locomotion and manipulation," in 2007 7th IEEE-RAS International Conference on Humanoid Robots, Nov. 2007.
dc.relation.references: A. J. Ijspeert, J. Nakanishi, and S. Schaal, "Movement imitation with nonlinear dynamical systems in humanoid robots," in Proceedings 2002 IEEE International Conference on Robotics and Automation, vol. 2, pp. 1398–1403, May 2002.
dc.relation.references: K. Mülling, J. Kober, O. Kroemer, and J. Peters, "Learning to select and generalize striking movements in robot table tennis," The International Journal of Robotics Research, vol. 32, no. 3, pp. 263–279, 2013.
dc.relation.references: B. D. Argall, S. Chernova, M. Veloso, and B. Browning, "A survey of robot learning from demonstration," Robotics and Autonomous Systems, vol. 57, no. 5, p. 469, 2009.
dc.relation.references: S. Calinon, F. Guenter, and A. Billard, "On learning, representing, and generalizing a task in a humanoid robot," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 37, pp. 286–298, Apr. 2007.
dc.relation.references: P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, "Learning and generalization of motor skills by learning from demonstration," in 2009 IEEE International Conference on Robotics and Automation, May 2009.
dc.relation.references: T. Zhang, Z. McCarthy, O. Jow, D. Lee, K. Goldberg, and P. Abbeel, "Deep imitation learning for complex manipulation tasks from virtual reality teleoperation," CoRR, vol. abs/1710.04615, 2017.
dc.relation.references: B. C. Stadie, P. Abbeel, and I. Sutskever, "Third-person imitation learning," ArXiv e-prints, Mar. 2017.
dc.relation.references: P. Sermanet, K. Xu, and S. Levine, "Unsupervised perceptual rewards for imitation learning," CoRR, vol. abs/1612.06699, 2017.
dc.relation.references: P. Sermanet, C. Lynch, Y. Chebotar, J. Hsu, E. Jang, S. Schaal, and S. Levine, "Time-contrastive networks: Self-supervised learning from video," 2018.
dc.relation.references: B. Piot, M. Geist, and O. Pietquin, "Bridging the gap between imitation learning and inverse reinforcement learning," vol. 28, pp. 1814–1826, Aug. 2017.
dc.relation.references: D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, pp. 484–489, Jan. 2016.
dc.relation.references: T. Hester, M. Vecerik, O. Pietquin, M. Lanctot, T. Schaul, B. Piot, D. Horgan, J. Quan, A. Sendonaris, I. Osband, G. Dulac-Arnold, J. Agapiou, J. Z. Leibo, and A. Gruslys, "Deep Q-learning from demonstrations," Feb. 2018.
dc.relation.references: G. Masuyama and K. Umeda, "Apprenticeship learning based on inconsistent demonstrations," in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
dc.relation.references: T. Schaul, J. Quan, I. Antonoglou, and D. Silver, "Prioritized experience replay." https://arxiv.org/pdf/1511.05952.pdf, 2016.
dc.rights: Atribución-NoComercial 2.5 Colombia
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.coar: http://purl.org/coar/access_right/c_abf2
dc.rights.local: Abierto (Texto Completo)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/2.5/co/
dc.subject.keyword: Reinforcement learning
dc.subject.keyword: Inverse reinforcement learning
dc.subject.keyword: Computational neural networks
dc.subject.keyword: Machine learning
dc.subject.keyword: Python
dc.subject.keyword: Poppy Torso
dc.subject.keyword: Inference of objectives
dc.subject.keyword: Humanoid robot
dc.subject.lemb: Redes neuronales computacionales
dc.subject.lemb: Aprendizaje de máquina
dc.subject.lemb: Inferencia de objetivos
dc.subject.lemb: Humanoides
dc.subject.proposal: Aprendizaje por refuerzo
dc.subject.proposal: Aprendizaje por refuerzo inverso
dc.subject.proposal: Python
dc.subject.proposal: Poppy Torso
dc.subject.proposal: Inferencia de objetivos
dc.subject.proposal: Robot Humanoide
dc.titleAprendizaje de movimientos en robot humanoide a partir de inferencia de objetivosspa
dc.type: bachelor thesis
dc.type.category: Formación de Recurso Humano para la CTeI: Trabajo de grado de Pregrado
dc.type.coar: http://purl.org/coar/resource_type/c_7a1f
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.drive: info:eu-repo/semantics/bachelorThesis
dc.type.local: Tesis de pregrado
dc.type.version: info:eu-repo/semantics/acceptedVersion

Files

Original bundle (showing 1 - 3 of 3)

Name: 2020yeisonsuarez.pdf
Size: 6.09 MB
Format: Adobe Portable Document Format
Description: Undergraduate thesis (trabajo de grado)

Name: cartaderechosdeautor.pdf
Size: 3.69 MB
Format: Adobe Portable Document Format
Description: Copyright letter

Name: cartadeaprobación.pdf
Size: 128.71 KB
Format: Adobe Portable Document Format
Description: Faculty approval letter

License bundle (showing 1 - 1 of 1)

Name: license.txt
Size: 807 B
Format: Item-specific license agreed upon to submission