Preparing a survey on SOTA self-improvement techniques for LLMs.
Developed a self-improvement pipeline utilizing RLAIF and RLHF techniques.
Verified the efficiency of the fine-tuning pipeline using smaller language models.
Contributed to improving reasoning capabilities using the RLHF pipeline.
Configured table-top robots and implemented code for kinesthetic demonstration-based data collection.
Contributed to the creation of a dataset comprising 300K scenarios involving household tasks performed by both fixed and mobile manipulators.
Engineered a multi-modal planner by integrating a Q/A model, a captioning model, and an LLM to enhance planning capabilities.
Achieved superior performance over the text-only planner by integrating object locations and their contextual relationships within scenes.
Research Assistant
University of Alberta Edmonton, AB, Canada
May 2021 - Apr 2023
Designed and developed a benchmark for dynamically changing robotic environments using MuJoCo.
Investigated the effectiveness of state-of-the-art model-free reinforcement learning algorithms in responding to dynamic changes.
Enhanced the speed of adaptation by preserving or injecting prior knowledge into the policy.
Expedited training across multiple seeds through distributed computing, optimizing resource utilization on local servers.
Devised and developed a methodology for globally explaining policy gradient algorithms in robotic domains.
Accurately highlighted individual contributions of each robot component to decision-making using this methodology.
Devised and developed an approach for real-time diagnostics of malfunctions and dynamic adaptations.
Accelerated training via cloud computing and optimized resource utilization on local servers.
Research Assistant
Amirkabir University of Technology Tehran, Iran
Jan 2020 - Jan 2021
Developed an automated learning agent for daily historical data collection, pattern extraction, and trading strategy recommendations.
Developed a robust methodology based on deep RL that consistently outperformed rule-based and technical analysis-based approaches.
Demonstrated strong robustness against market trends and volatility.
Explored and optimized hyper-parameter configurations for deep RL algorithms, including neural network architectures (such as RNNs, CNNs, MLPs).
Research Intern
KTH Royal Institute of Technology Stockholm, Sweden
Jun 2019 - Nov 2019
Achieved 99.7% accuracy by developing and training a GCN on a subset of the Visual Genome dataset.
Implemented an efficient data storage and training pipeline to handle the extensive Visual Genome graph dataset for GCN training.
Generated high-fidelity and contrastive explanations for GCNs in a graph classification task.
Improved and assessed the interpretability and transparency of model decisions in a scene-graph description task.
Research Intern
IPM Institute for Research in Fundamental Sciences Tehran, Iran
Jun 2019 - Nov 2019
Applied neural network compression techniques that exploit sparsity and weight/input similarity, reducing memory accesses by 15.3× and forward-propagation computation by 11.3×.
Successfully extended the compression method to well-known ImageNet architectures including AlexNet, VGG, Inception, and ResNet, highlighting its versatility and effectiveness.
Falahati, H., Peyro, M., Amini, H., Taghian, M., Sadrosadati, M., Lotfi-Kamran, P. and Sarbazi-Azad, H., 2021. Data-aware compression of neural networks. IEEE Computer Architecture Letters, 20(2), pp. 94-97.