About Mehran

  • Strong AI and Machine Learning expertise
    • 6+ years of experience in machine learning, data analysis, deep learning, reinforcement learning, NLP, and generative models.
  • Proven track record
    • Contributed to large-scale projects and published research in top journals.
  • Strong Software Engineering Skills
    • 7+ years of experience developing scalable algorithms and software systems, troubleshooting, integration, testing, and deployment.
  • Large Language Models
    • Experienced in customizing large language models (LLMs) for specific tasks and aligning them with human values.
  • Machine Learning Development
    • Proficient in machine learning algorithm development, data management, cloud and distributed computing, and deployment.
  • Soft Skills
    • Strong communication skills, self-motivated with a creative approach to problem-solving, and demonstrated leadership in achieving milestones.

Expertise

Education

M.Sc. Computer Science GPA: 3.90 / 4.00

University of Alberta, Edmonton, AB, Canada

B.Sc. Computer Engineering GPA: 3.99 / 4.00

Amirkabir University of Technology, Tehran, Iran

Skills

Programming
  • Python
  • Java
  • C / C++
  • CUDA
  • OpenMP
  • Assembly Programming
Machine Learning
  • PyTorch
  • TensorFlow
  • JAX
Data Analysis
  • Pandas
  • Matplotlib
  • scikit-learn
  • SciPy
Web
  • JavaScript
  • HTML
  • CSS
  • Node.js
  • Vue.js
Operating Systems
  • Linux
  • macOS
  • Windows
Other Tools
  • Docker
  • LaTeX
  • Git
  • Microsoft Office

Experiences

Associate Machine Learning Researcher

Huawei Technologies Canada Co., Ltd., Edmonton, AB, Canada

  • Preparing a survey on SOTA self-improvement techniques for LLMs.
  • Developed a self-improvement pipeline utilizing RLAIF and RLHF techniques.
  • Verified the efficiency of the fine-tuning pipeline using smaller language models.
  • Contributed to improving reasoning capabilities using the RLHF pipeline.
  • Configured table-top robots and implemented code for kinesthetic demonstration-based data collection.
  • Contributed to the creation of a dataset comprising 300K scenarios involving household tasks performed by both fixed and mobile manipulators.
  • Engineered a multi-modal planner by integrating a Q/A model, a captioning model, and an LLM to enhance planning capabilities.
  • Achieved superior performance over the text-only planner by integrating object locations and their contextual relationships within scenes.

Research Assistant

University of Alberta, Edmonton, AB, Canada

  • Designed and developed a benchmark for dynamically changing robotic environments using MuJoCo.
  • Investigated the effectiveness of state-of-the-art model-free reinforcement learning algorithms in responding to dynamic changes.
  • Enhanced the speed of adaptation by preserving or injecting prior knowledge into the policy.
  • Expedited training across multiple seeds through distributed computing, optimizing resource utilization on local servers.
  • Devised and developed a methodology for globally explaining policy gradient algorithms in robotic domains.
  • Accurately highlighted individual contributions of each robot component to decision-making using this methodology.
  • Devised and developed an approach for real-time diagnostics of malfunctions and dynamic adaptations.
  • Accelerated training via cloud computing and optimized resource utilization on local servers.

Research Assistant

Amirkabir University of Technology, Tehran, Iran

  • Developed an automated learning agent for daily historical data collection, pattern extraction, and trading strategy recommendations.
  • Devised a robust methodology based on deep RL, consistently outperforming rule-based and technical analysis-based approaches.
  • Demonstrated strong robustness against market trends and volatility.
  • Explored and optimized hyper-parameter configurations for deep RL algorithms, including neural network architectures (such as RNNs, CNNs, and MLPs).

Research Intern

KTH Royal Institute of Technology, Stockholm, Sweden

  • Achieved 99.7% accuracy by developing and training a GCN on a subset of the Visual Genome dataset.
  • Implemented an efficient data storage and training pipeline to handle the extensive Visual Genome graph dataset for GCN training.
  • Generated high-fidelity and contrastive explanations for GCNs in a graph classification task.
  • Improved and assessed the interpretability and transparency of model decisions in a scene-graph description task.

Research Intern

IPM Institute for Research in Fundamental Sciences, Tehran, Iran

  • Applied neural network compression techniques, achieving a reduction in memory access by 15.3× and forward-propagation computation by 11.3× through sparsity and weight/input similarity.
  • Successfully extended the compression method to well-known ImageNet architectures including AlexNet, VGG, Inception, and ResNet, highlighting its versatility and effectiveness.

Publications

  1. Taghian, M., Asadi, A. and Safabakhsh, R., 2022. Learning financial asset-specific trading rules via deep reinforcement learning. Expert Systems with Applications, 195, p.116523.
  2. Taghian, M., Asadi, A. and Safabakhsh, R., 2021. A reinforcement learning based encoder-decoder framework for learning stock trading rules. arXiv preprint arXiv:2101.03867.
  3. Falahati, H., Peyro, M., Amini, H., Taghian, M., Sadrosadati, M., Lotfi-Kamran, P. and Sarbazi-Azad, H., 2021. Data-Aware compression of neural networks. IEEE Computer Architecture Letters, 20(2), pp.94-97.
  4. Taghian Jazi, M., 2022. Representation Analysis of Deep Reinforcement Learning algorithms in Robotic Environments.
  5. Jazi, M.T., Miwa, S., Mitsuka, Y., Günther, J. and Zaiane, O., 2022. Explainability of deep reinforcement learning algorithms in robotic domains by using Layer-wise Relevance Propagation.