Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo

dc.contributor.advisor: Calderon Chavez, Juan Manuel
dc.contributor.author: Salazar Villareal, Carlos Enrique
dc.contributor.corporatename: Universidad Santo Tomás
dc.contributor.cvlac: https://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0000380938
dc.contributor.orcid: https://orcid.org/0000-0002-4471-3980
dc.contributor.orcid: https://orcid.org/0000-0002-3454-3711
dc.date.accessioned: 2023-09-26T00:26:45Z
dc.date.available: 2023-09-26T00:26:45Z
dc.date.issued: 2023-09-25
dc.description: This degree project presents the design and implementation of a learning policy for neural networks that builds on and refines existing techniques, training the network with the experiences of an external agent (the expert) and using a convolutional neural network architecture as reference. It details how the problem environment, the architecture, and the learning paradigm were selected. Multiple tests are run to confirm performance, with training carried out under different training policies. Finally, the performance of each trained policy is evaluated against a baseline exploration/exploitation policy. The final deliverables are the reference expert dataset and a repository containing the programs developed for the project together with the implementation of the policy.
dc.description.abstract: This degree project presents the approach to and implementation of a learning policy for neural networks. The policy builds on and refines existing techniques and is trained on the experience of an external agent (the expert), taking a convolutional neural network architecture as reference. The selection of the problem environment, the architecture, and the paradigm is described in detail. Multiple tests are performed to confirm performance, with training carried out under different training policies. Finally, the performance of each trained policy is evaluated against a baseline exploration/exploitation policy. The final products are the reference expert dataset and a repository containing the programs written during development together with the implementation of the policy.
dc.description.degreelevel: Pregrado
dc.description.degreename: Ingeniero Electronico
dc.format.mimetype: application/pdf
dc.identifier.citation: Salazar Villareal, C. E. (2023). Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo. [Trabajo de Grado, Universidad Santo Tomás]. Repositorio Institucional.
dc.identifier.instname: instname:Universidad Santo Tomás
dc.identifier.reponame: reponame:Repositorio Institucional Universidad Santo Tomás
dc.identifier.repourl: repourl:https://repository.usta.edu.co
dc.identifier.uri: http://hdl.handle.net/11634/52386
dc.language.iso: spa
dc.publisher: Universidad Santo Tomás
dc.publisher.branch: CRAI-USTA Bogotá
dc.publisher.faculty: Facultad de Ingeniería Electrónica
dc.publisher.program: Pregrado Ingeniería Electrónica
dc.rights: Atribución-NoComercial-SinDerivadas 2.5 Colombia
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.coar: http://purl.org/coar/access_right/c_abf2
dc.rights.local: Abierto (Texto Completo)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.subject.keyword: Artificial Intelligence
dc.subject.keyword: Learning Policies
dc.subject.keyword: Imitation Learning
dc.subject.keyword: Reinforcement Learning
dc.subject.keyword: Expert Learning
dc.subject.lemb: Ingeniería Electrónica
dc.subject.lemb: Teología
dc.subject.lemb: Tendencia de la Investigación
dc.subject.proposal: Inteligencia Artificial
dc.subject.proposal: Políticas de Aprendizaje
dc.subject.proposal: Aprendizaje por Imitación
dc.subject.proposal: Aprendizaje por Refuerzo
dc.subject.proposal: Aprendizaje por Experto
dc.title: Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo
dc.type.coar: http://purl.org/coar/resource_type/c_7a1f
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.drive: info:eu-repo/semantics/bachelorThesis
dc.type.local: Trabajo de grado
dc.type.version: info:eu-repo/semantics/acceptedVersion

Files

Original bundle

Showing 1 - 3 of 3
Name: 2023carlossalazar.pdf
Size: 5.34 MB
Format: Adobe Portable Document Format
Description: Degree project (Trabajo de Grado)

Name: Carta Aprovacion Facultad.pdf
Size: 155.4 KB
Format: Adobe Portable Document Format
Description: Faculty approval letter

Name: Carta Derechos de Autor.pdf
Size: 272.01 KB
Format: Adobe Portable Document Format
Description: Copyright letter

License bundle

Showing 1 - 1 of 1
Name: license.txt
Size: 807 B
Format: Item-specific license agreed upon to submission