Hello! 👋

I'm Aman Sharma

Machine Learning Engineer

About Me

I’m a Machine Learning Engineer with 2+ years of experience specializing in transformers, LLMs, and generative AI. At Huawei Canada, I work on optimizing large-scale language models for efficiency and real-world applications. I hold a Master’s in Data Science and AI from the University of Waterloo and a B.Eng. in Computer Engineering from Thapar University. I’m passionate about building practical AI solutions that turn cutting-edge research into impact.

Skills

Python · PyTorch · TensorFlow · Hugging Face Transformers · LoRA / PEFT · SQL · Data Structures & Algorithms · scikit-learn · NumPy · Pandas · Matplotlib / Seaborn · Git

Education

Master of Data Science and Artificial Intelligence

University of Waterloo

Sept 2023 - Apr 2025
  • Graduated with a GPA of 3.8/4.0
  • Relevant Coursework: Machine Learning, Deep Learning, Computer Vision, Big Data Analytics, Stats for Data Science, Exploratory Data Analysis

Bachelor of Engineering in Computer Engineering

Thapar University

2018 - 2022
  • Graduated with a GPA of 9.43/10 (top 5% of class)

Experience

Research Engineer - NLP/LLM

Huawei Canada

May 2024 - Present
  • Conducted layer-wise analysis of transformer architectures (embeddings, attention patterns) across model scales to inform architectural optimization
  • Designed and evaluated architectural modifications to large-scale models, including LLaMA, Qwen, and Mixture of Experts (MoE), achieving 40% faster inference and 34% lower memory usage
  • Integrated fused CUDA/Triton kernels into LLM training pipelines, delivering ~30% faster training throughput and optimizing GPU utilization for large-scale experiments
  • Leveraged PyTorch and Hugging Face Transformers for pre-training, fine-tuning (LoRA, PEFT, instruction tuning), and inference of large-scale language models (1B–40B parameters) in multi-node environments (see the sketch below)
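
A minimal sketch of the LoRA fine-tuning workflow referenced above, using Hugging Face Transformers with the PEFT library; the base checkpoint (facebook/opt-1.3b), target modules, and hyperparameters are illustrative placeholders, not the actual configuration used at Huawei:

from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base = "facebook/opt-1.3b"  # placeholder checkpoint for illustration only
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects trainable low-rank adapters into the attention projections;
# the base weights stay frozen, so only a small fraction of parameters train.
lora_cfg = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # OPT attention projection names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# Training then uses the standard Trainer loop; `train_dataset` is assumed to be
# an already-tokenized instruction-tuning dataset.
# trainer = Trainer(
#     model=model,
#     args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
#                            learning_rate=2e-4, num_train_epochs=1),
#     train_dataset=train_dataset,
# )
# trainer.train()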

Software Engineer

Samsung R&D India

Jul 2022 - Aug 2023
  • Collaborated on developing the Push-to-Talk framework (FW), removing the need to hold the voice button while using voice features on the TV
  • Integrated non-zero ducking into Text-to-Speech, so the TV lowers rather than mutes background volume during TTS playback
  • Led a team that built the framework integrating Speech-to-Text into the Netflix app, enhancing user experience
  • Contributed to the Voice FW team's support for concurrent multi-voice assistants and improved TTS and STT performance by 20% by analyzing and resolving 50+ issues

Machine Learning Intern

Samsung R&D India

Jan 2022 - Jun 2022
  • Spearheaded end-to-end development of Textless NLP integration for the Bixby voice assistant on Samsung TVs, achieving a 25% reduction in voice-command processing latency
  • Architected and implemented a novel audio-to-pseudo-unit encoder, eliminating traditional speech-to-text conversion bottleneck and streamlining voice command processing pipeline
  • Designed and trained a custom BERT-based classification model that maps voice commands directly to system functions, improving voice assistant response accuracy and efficiency (see the sketch below)
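
A minimal sketch of the kind of BERT-based intent classifier described above; the checkpoint, label set, and example command are illustrative placeholders, not Samsung's actual model or intents:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder intent labels standing in for TV system functions.
LABELS = ["volume_up", "volume_down", "open_app", "power_off"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

def classify_command(text: str) -> str:
    """Map a voice-command transcript to a system-function label."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# With a freshly initialized classification head the prediction is arbitrary;
# in practice the model is fine-tuned on labeled (command, function) pairs first.
print(classify_command("turn the volume up"))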

Projects