Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo
| dc.contributor.advisor | Calderon Chavez, Juan Manuel | |
| dc.contributor.author | Salazar Villareal, Carlos Enrique | |
| dc.contributor.corporatename | Universidad Santo Tomás | spa |
| dc.contributor.cvlac | https://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0000380938 | |
| dc.contributor.orcid | https://orcid.org/0000-0002-4471-3980 | |
| dc.contributor.orcid | https://orcid.org/0000-0002-3454-3711 | |
| dc.date.accessioned | 2023-09-26T00:26:45Z | |
| dc.date.available | 2023-09-26T00:26:45Z | |
| dc.date.issued | 2023-09-25 | |
| dc.description | Este proyecto de grado presenta el planteamiento e implementación de una política de aprendizaje para redes neuronales que se basa en técnicas ya existentes, las refina y se entrena mediante las experiencias de un agente externo (experto), tomando como referencia una arquitectura de red neuronal convolucional. Se detalla el proceso de selección del ambiente del problema, la arquitectura y el paradigma seleccionado. Así mismo, se realizan múltiples pruebas para confirmar el desempeño, entrenando las redes con distintas políticas de entrenamiento. Finalmente se evalúa el rendimiento de las distintas políticas entrenadas con respecto a una política base de exploración/explotación. Como producto final se presentan el dataset experto de referencia y un repositorio con los programas realizados para el desarrollo junto a la implementación de la política. | spa |
| dc.description.abstract | This degree project presents the design and implementation of a learning policy for neural networks that builds on and refines existing techniques and is trained on the experience of an external agent (expert), using a convolutional neural network architecture as reference. It details the selection of the problem environment, the architecture, and the learning paradigm. Multiple tests are run to confirm performance, with the networks trained under different training policies. Finally, the performance of the trained policies is evaluated against a baseline exploration/exploitation policy. The final products are the reference expert dataset and a repository containing the programs developed for the project together with the implementation of the policy. | spa |
| dc.description.degreelevel | Pregrado | spa |
| dc.description.degreename | Ingeniero Electrónico | spa |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | Salazar Villareal, C. E. (2023). Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo. [Trabajo de Grado, Universidad Santo Tomás]. Repositorio Institucional. | spa |
| dc.identifier.instname | instname:Universidad Santo Tomás | spa |
| dc.identifier.reponame | reponame:Repositorio Institucional Universidad Santo Tomás | spa |
| dc.identifier.repourl | repourl:https://repository.usta.edu.co | spa |
| dc.identifier.uri | http://hdl.handle.net/11634/52386 | |
| dc.language.iso | spa | |
| dc.publisher | Universidad Santo Tomás | spa |
| dc.publisher.branch | CRAI-USTA Bogotá | spa |
| dc.publisher.faculty | Facultad de Ingeniería Electrónica | spa |
| dc.publisher.program | Pregrado Ingeniería Electrónica | spa |
| dc.relation.references | A. Rimassa, J. Luciano, C. Zurita, J. Paul, V. Bautista y S. Francisco, «Reconocimiento de tumores y patologías cerebrales mediante inteligencia artificial.,» 2022. dirección: http://www.dspace.uce.edu.ec/handle/25000/27244. | spa |
| dc.relation.references | S. Ben-David, E. Kushilevitz e Y. Mansour, «Online Learning versus Offline Learning,» Machine Learning, vol. 29, mayo de 1997. doi: 10.1023/A:1007465907571. | spa |
| dc.relation.references | Y. Lecun, L. Bottou, Y. Bengio y P. Haffner, «Gradient-based learning applied to document recognition,» Proceedings of the IEEE, vol. 86, n.o 11, págs. 2278-2324, 1998. doi: 10.1109/5.726791. | spa |
| dc.relation.references | E. Strubell, A. Ganesh y A. McCallum, Energy and Policy Considerations for Deep Learning in NLP, 2019. doi: 10.48550/ARXIV.1906.02243. dirección: https://arxiv.org/abs/1906.02243. | spa |
| dc.relation.references | S. Chaudhury, D. Kimura, T. Inoue y R. Tachibana, Model-based imitation learning from state trajectories, 2018. dirección: https://openreview.net/forum?id=S1GDXzb0b. | spa |
| dc.relation.references | K. Judah, A. Fern, P. Tadepalli y R. Goetschalckx, «Imitation Learning with Demonstrations and Shaping Rewards,» Proceedings of the AAAI Conference on Artificial Intelligence, vol. 28, n.o 1, jun. de 2014. doi: 10.1609/aaai.v28i1.9024. dirección: https://ojs.aaai.org/index.php/AAAI/article/view/9024. | spa |
| dc.relation.references | X. Guo, S. Chang, M. Yu, M. Liu y G. Tesauro, Faster Reinforcement Learning with Expert State Sequences, 2018. dirección: https://openreview.net/forum?id=BJ7d0fW0b. | spa |
| dc.relation.references | I. Radosavovic, X. Wang, L. Pinto y J. Malik, «State-Only Imitation Learning for Dexterous Manipulation,» en 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, págs. 7865-7871. doi: 10.1109/IROS51168.2021.9636557. | spa |
| dc.relation.references | F. Torabi, G. Warnell y P. Stone, Recent Advances in Imitation Learning from Observation, 2019. arXiv: 1905.13566 [cs.RO]. | spa |
| dc.relation.references | Y. Yue y H. M. Le, IMITATION LEARNING TUTORIAL, 2018. dirección: https://sites.google.com/view/icml2018-imitation-learning/. | spa |
| dc.relation.references | S. Singh, What is Imitation Learning? 2019. dirección: https://deeplearninguniversity.com/what-is-imitation-learning/. | spa |
| dc.relation.references | M. Schaarschmidt, A. Kuhnle, B. Ellis, K. Fricke, F. Gessert y E. Yoneki, LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations, 2018. doi: 10.48550/ARXIV.1808.07903. dirección: https://arxiv.org/abs/1808.07903. | spa |
| dc.relation.references | R. Marcus, P. Negi, H. Mao, N. Tatbul, M. Alizadeh y T. Kraska, «Bao: Making Learned Query Optimization Practical,» en Proceedings of the 2021 International Conference on Management of Data, ép. SIGMOD ’21, Virtual Event, China: Association for Computing Machinery, 2021, págs. 1275-1288, isbn: 9781450383431. doi: 10.1145/3448016.3452838. dirección: https://doi.org/10.1145/3448016.3452838. | spa |
| dc.relation.references | R. Marcus, P. Negi, H. Mao et al., «Neo,» Proceedings of the VLDB Endowment, vol. 12, n.o 11, págs. 1705-1718, jul. de 2019. doi: 10.14778/3342263.3342644. dirección: https://doi.org/10.14778%2F3342263.3342644. | spa |
| dc.relation.references | M. F. Argerich, J. Fürst y B. Cheng, «Tutor4RL: Guiding Reinforcement Learning with External Knowledge.,» en AAAI Spring Symposium: Combining Machine Learning with Knowledge Engineering (1), 2020. | spa |
| dc.relation.references | O. Rivlin, Reinforcement Learning Using a Single Demonstration, 2019. dirección: https://towardsdatascience.com/reinforcement-learning-using-a-single-demonstration-7889fe5e9f41. | spa |
| dc.relation.references | X. Guo, S. Chang, M. Yu, G. Tesauro y M. Campbell, Hybrid Reinforcement Learning with Expert State Sequences, 2019. doi: 10.48550/ARXIV.1903.04110. dirección: https://arxiv.org/abs/1903.04110. | spa |
| dc.relation.references | R. Zhang, F. Torabi, L. Guan, D. H. Ballard y P. Stone, Leveraging Human Guidance for Deep Reinforcement Learning Tasks, 2019. arXiv: 1909.09906 [cs.AI] | spa |
| dc.relation.references | A. Aflakian, A. Rastegharpanah y R. Stolkin, «Boosting Performance of Visual Servoing Using Deep Reinforcement Learning From Multiple Demonstrations,» IEEE Access, vol. 11, págs. 26 512-26 520, 2023. doi: 10.1109/ACCESS.2023.3256724 | spa |
| dc.relation.references | A. M. Metelli, M. Pirotta y M. Restelli, «Compatible Reward Inverse Reinforcement Learning,» en The Thirty-first Annual Conference on Neural Information Processing Systems - NIPS 2017, Long Beach, United States, dic. de 2017. dirección: https://hal.inria.fr/hal-01653328. | spa |
| dc.relation.references | R. M. J. V. M., «Efectividad, eficacia y eficiencia en equipos de trabajo,» Espacios, 2017. | spa |
| dc.relation.references | E. Elibol, J. Calderon, M. Llofriu, C. Quintero, W. Moreno y A. Weitzenfeld, «Power usage reduction of humanoid standing process using q-learning,» en RoboCup 2015: Robot World Cup XIX 19, Springer, 2015, págs. 251-263. | spa |
| dc.relation.references | G. Cardona, C. Bravo, W. Quesada et al., «Autonomous navigation for exploration of unknown environments and collision avoidance in mobile robots using reinforcement learning,» en 2019 SoutheastCon, IEEE, 2019, págs. 1-7. | spa |
| dc.relation.references | L. J. P. Reyes, N. B. Oviedo, E. C. Camacho y J. M. Calderon, «Adaptable Recommendation System for Outfit Selection with Deep Learning Approach,» IFAC-PapersOnLine, vol. 54, n.o 13, págs. 605-610, 2021. | spa |
| dc.relation.references | J. A. Cárdenas, U. E. Carrero, E. C. Camacho y J. M. Calderón, «Optimal PID ø axis Control for UAV Quadrotor based on Multi-Objective PSO,» IFAC-PapersOnLine, vol. 55, n.o 14, págs. 101-106, 2022. | spa |
| dc.relation.references | J. A. Cardenas, U. E. Carrero, E. C. Camacho y J. M. Calderon, «Intelligent Position Controller for Unmanned Aerial Vehicles (UAV) Based on Supervised Deep Learning,» Machines, vol. 11, n.o 6, pág. 606, 2023. | spa |
| dc.relation.references | P. Larrañaga, I. Inza y A. Moujahid, Tema 8. Redes Neuronales. | spa |
| dc.relation.references | J. Schmidhuber, «Deep Learning in Neural Networks: An Overview,» CoRR, vol. abs/1404.7828, 2014. arXiv: 1404.7828. dirección: http://arxiv.org/abs/1404.7828. | spa |
| dc.relation.references | A. Gleave, M. Taufeeque, J. Rocamonde et al., imitation: Clean Imitation Learning Implementations, 2022. arXiv: 2211.11972 [cs.LG]. | spa |
| dc.relation.references | B. Zheng, S. Verma, J. Zhou, I. Tsang y F. Chen, Imitation Learning: Progress, Taxonomies and Challenges, 2022. arXiv: 2106.12177 [cs.LG]. | spa |
| dc.relation.references | A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez y V. Koltun, CARLA: An Open Urban Driving Simulator, 2017. arXiv: 1711.03938 [cs.LG]. | spa |
| dc.relation.references | W. O. Quesada, J. I. Rodriguez, J. C. Murillo et al., «Leader-follower formation for UAV robot swarm based on fuzzy logic theory,» en Artificial Intelligence and Soft Computing: 17th International Conference, ICAISC 2018, Zakopane, Poland, June 3-7, 2018, Proceedings, Part II 17, Springer, 2018, págs. 740-751. | spa |
| dc.relation.references | D. Paez, J. P. Romero, B. Noriega, G. A. Cardona y J. M. Calderon, «Distributed particle swarm optimization for multi-robot system in search and rescue operations,» IFAC-PapersOnLine, vol. 54, n.o 4, págs. 1-6, 2021. | spa |
| dc.relation.references | J. León, G. A. Cardona, A. Botello y J. M. Calderón, «Robot swarms theory applicable to seek and rescue operation,» en Intelligent Systems Design and Applications: 16th International Conference on Intelligent Systems Design and Applications (ISDA 2016) held in Porto, Portugal, December 16-18, 2016, Springer, 2017, págs. 1061-1070. | spa |
| dc.relation.references | G. A. Cardona y J. M. Calderon, «Robot swarm navigation and victim detection using rendezvous consensus in search and rescue operations,» Applied Sciences, vol. 9, n.o 8, pág. 1702, 2019. | spa |
| dc.relation.references | N. Gómez, N. Peña, S. Rincón, S. Amaya y J. Calderon, «Leader-follower behavior in multi-agent systems for search and rescue based on pso approach,» en SoutheastCon 2022, IEEE, 2022, págs. 413-420 | spa |
| dc.relation.references | B. Pallares, T. Rozo, E. C. Camacho, J. G. Guarnizo, J. M. Calderon et al., «Design and construction of a cost-oriented mobile robot for domestic assistance,» IFAC-PapersOnLine, vol. 54, n.o 13, págs. 293-298, 2021. | spa |
| dc.relation.references | L. G. Jaimes y J. M. Calderon, «An UAV-based incentive mechanism for Crowdsensing with budget constraints,» en 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), IEEE, 2020, págs. 1-6. | spa |
| dc.relation.references | G. A. Cardona, J. Ramirez-Rugeles, E. Mojica-Nava y J. M. Calderon, «Visual victim detection and quadrotor-swarm coordination control in search and rescue environment,» International Journal of Electrical and Computer Engineering, vol. 11, n.o 3, pág. 2079, 2021. | spa |
| dc.relation.references | G. Cardona, M. Arevalo-Castiblanco, D. Tellez-Castro, J. Calderon y E. Mojica-Nava, «Robust Adaptive Synchronization of Interconnected Heterogeneous Quadrotors Transporting a Cable-Suspended Load,» en 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2021, págs. 31-37. | spa |
| dc.relation.references | G. Cardona, D. Tellez-Castro, J. Calderon y E. Mojica-Nava, «Adaptive Multi-Quadrotor Control for Cooperative Transportation of a Cable-Suspended Load,» en 2021 European Control Conference (ECC), IEEE, 2021, págs. 696-701 | spa |
| dc.relation.references | E. Elibol, J. Calderon, M. Llofriu, W. Moreno y A. Weitzenfeld, «Analyzing and reducing energy usage in a humanoid robot during standing up and sitting down tasks,» International Journal of Humanoid Robotics, vol. 13, n.o 04, pág. 1 650 014, 2016. | spa |
| dc.rights | Atribución-NoComercial-SinDerivadas 2.5 Colombia | |
| dc.rights.accessrights | info:eu-repo/semantics/openAccess | |
| dc.rights.coar | http://purl.org/coar/access_right/c_abf2 | spa |
| dc.rights.local | Abierto (Texto Completo) | spa |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/2.5/co/ | |
| dc.subject.keyword | Artificial Intelligence | spa |
| dc.subject.keyword | Learning Policies | spa |
| dc.subject.keyword | Imitation Learning | spa |
| dc.subject.keyword | Reinforcement Learning | spa |
| dc.subject.keyword | Expert Learning | spa |
| dc.subject.lemb | Ingeniería Electrónica | spa |
| dc.subject.lemb | Teología | spa |
| dc.subject.lemb | Tendencia de la Investigación | spa |
| dc.subject.proposal | Inteligencia Artificial | spa |
| dc.subject.proposal | Políticas de Aprendizaje | spa |
| dc.subject.proposal | Aprendizaje por Imitación | spa |
| dc.subject.proposal | Aprendizaje por Refuerzo | spa |
| dc.subject.proposal | Aprendizaje por Experto | spa |
| dc.title | Reducción de Tiempos de Entrenamiento de Algoritmos de Aprendizaje de Máquina a través de Tutoría por parte de un Experto Externo | spa |
| dc.type.coar | http://purl.org/coar/resource_type/c_7a1f | |
| dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa | |
| dc.type.drive | info:eu-repo/semantics/bachelorThesis | |
| dc.type.local | Trabajo de grado | spa |
| dc.type.version | info:eu-repo/semantics/acceptedVersion |
Files
Original bundle
- Name: 2023carlossalazar.pdf
- Size: 5.34 MB
- Format: Adobe Portable Document Format
- Description: Trabajo de Grado

- Name: Carta Aprovacion Facultad.pdf
- Size: 155.4 KB
- Format: Adobe Portable Document Format
- Description: Carta Aprovacion Facultad

- Name: Carta Derechos de Autor.pdf
- Size: 272.01 KB
- Format: Adobe Portable Document Format
- Description: Carta Derechos de Autor

License bundle
- Name: license.txt
- Size: 807 B
- Format: Item-specific license agreed upon to submission