Daksh Patel — ML Research & AI Systems

01 — About

The Narrative

My work sits at the intersection of high-stakes industrial engineering and practical AI implementation. At AWS, I lead Scenario Modeling for global cost allocation, where the margin for error is zero.

This experience, combined with my time as a Teaching Assistant for Machine Learning at USC, has driven me to focus on how we can build AI systems that are reliable, performant, and grounded in data.

My goal is to develop AI systems that are observable, verifiable, and scalable — systems that don't just predict, but provide deterministic engineering guarantees.

The Engineering Philosophy

Why can we model millions of AWS billing scenarios with deterministic precision, yet still struggle to build AI systems that genuinely align with strict engineering constraints? I aim to bridge this gap.

FEAT — 01

Data & Systems Architecture

Designing high-fidelity frameworks to process complex data at scale, with verifiable logic and deterministic guarantees.

FEAT — 02

Machine Learning Optimization

Stripping away algorithmic bloat to find the mathematical signal — what the model must retain vs. what it can safely discard for performance.

FEAT — 03

Observable AI Systems

Designing AI that operates transparently, not just as black boxes. Building observable, verifiable ML pipelines.

FEAT — 04

Applied Computer Vision

Applied deep learning for medical image classification at USC — where diagnostic precision and system reliability are life-critical.

02 — Experience

Industry Work

Jun 2025 – Present

Amazon Web Services

Seattle, WA · FinTech

Data Engineer — FinTech, Cost Allocation

Designing high-fidelity frameworks to simulate complex financial outcomes at cloud scale, focusing on system robustness and verifiable logic in mission-critical billing infrastructure.

Built an automated scenario modeling engine using Python & SQL for high-fidelity "what-if" simulations, reducing latency by 85% and enabling proactive impact analysis.
Architected migration of 20TB+ financial workloads to ATLAS with 100% integrity; automated validation frameworks resolved 99% of data discrepancies for security auditing.
Refactored PySpark ETL pipelines processing billions of records daily, improving compute efficiency by 30% with zero production downtime.
Modernized legacy logic into scalable SQL/Spark architectures, improving end-to-end data delivery speeds by 25%.

Oct 2024 – Jun 2025

Northern Lights Post Inc.

Los Angeles, CA

Machine Learning / Data Engineer

Developed multimodal content ranking systems bridging production-scale ML with research-grade retrieval precision at 9M+ image, 10M+ video scale.

Developed a viral content ranking algorithm using XClip (video) and ResNet-50 (image) embeddings for popularity prediction.
Integrated FAISS-based similarity search for efficient embedding-based retrieval, reducing computational overhead significantly.
Automated metadata augmentation pipelines improving recommendation model precision and reducing manual intervention.

Jan 2024 – May 2025

University of Southern California

Los Angeles, CA

Graduate Teaching Assistant — DSCI 552

Course Producer under Prof. Mohammad Reza Rajati. Mentored 750+ students across three semesters in ML fundamentals and investigated Deep Learning applications for medical imaging.

Earned TA role by ranking Top 5 in the course cohort.
Provided personalized guidance to 750+ students across three semesters of Machine Learning for Data Science.

Nov 2023 – May 2025

Keck School of Medicine, USC

Los Angeles, CA · Radiomics Lab

Research Assistant — Oncology Imaging

Applied deep learning to medical image classification for cancer detection in a life-critical diagnostic setting — where precision is non-negotiable.

Optimized diagnosis pipelines with PyTorch, integrating foundation models (SAM, SAM2, MedSAM) for segmentation and comparing them with nnU-Net benchmarks.
Applied quantization to reduce model size and improve computational efficiency by 15%.
Developed novel data preprocessing to mitigate class imbalance, boosting weighted accuracy by 10%.

May – Sep 2024

Kintsugi Global, Inc.

Los Angeles, CA

AI/ML Engineer — NLP & Voice Systems

Custom-trained LLMs using PyTorch for an anime-character chatbot; enhanced user engagement by 25%.
Engineered a TTS system using NVIDIA's Tacotron 2 implementation; applied quantization and distillation to reduce latency for real-time voice generation.
Integrated AWS services (EC2, Lambda, S3) for scalable, low-latency inference deployment.

Jan – Sep 2023

Siksti Technologies (Slikk)

Bengaluru, India

Machine Learning Engineer

Leveraged ML to identify sales trends, contributing to a 10% revenue increase through data-driven decisions.
Reduced catalog processing time by 80% via automated backend scripts using Django and PostgreSQL.
Reduced API response time by 66% through caching, query optimization, and load balancing.

03 — Portfolio

Research & Engineering

Systems Engineering · AWS

Scenario Modeling Engine — AWS FinTech

Automated high-fidelity "what-if" simulation framework for cloud-scale financial cost allocation. Deterministic billing simulation across millions of scenarios with zero tolerance for error. 85% latency reduction; 20TB+ migration with 100% integrity.

Python · SQL · PySpark · ATLAS

Proprietary System

Medical AI · Published

Deep Learning for Osseous Metastatic Cancer Detection

Developed and evaluated DL models to automate lesion detection and segmentation in CT scans at Keck School of Medicine. Integrated foundation models (SAM, SAM2, MedSAM) with nnU-Net benchmarking for oncology imaging.

PyTorch · SAM2 · nnU-Net · Medical Imaging

View Publication →

Multimodal · Retrieval

FAISS-Accelerated Content Ranking System

Viral content ranking engine using XClip (video) + ResNet-50 (image) embeddings with FAISS-based similarity search. Optimized embedding-based indexing for sub-linear retrieval at 9M+ image, 10M+ video scale.

FAISS · XClip · ResNet-50 · Embeddings

Proprietary

NLP · Architecture

Transformer from Scratch

Decoder-only Transformer built entirely from scratch with PyTorch and PyTorch Lightning. Implements Multi-Head Self-Attention, Feed-Forward Networks, Positional Encoding, and Layer Normalization — no library abstractions.

PyTorch · Transformers · Attention

View on GitHub →

NLP · Translation

TransLingo: Neural Machine Translation

Seq2Seq translation with attention mechanisms for German→English. Focuses on representation alignment between source and target language spaces through encoder-decoder architecture.

PyTorch · Seq2Seq · Attention

View on GitHub →

Remote Sensing · Published

Predicting Agricultural Yield via Remote Sensing

ML model for Punjab crop yield prediction using 17 years of multi-source geospatial data (weather, soil, and NDVI satellite data). Combines statistical methods and ML algorithms with real-world validation. Published at ICoISS 2023.

Remote Sensing · NDVI · Statistical ML

View Publication →

04 — Research Output

Selected Publications

European Journal of Radiology AI · 2025

Deep learning-based detection and segmentation of osseous metastatic prostate cancer lesions on computed tomography

Automated lesion detection and segmentation in CT scans using deep learning, demonstrating that AI-driven automation can improve metastatic lesion detection accuracy and support clinical decision-making in oncology.

doi:10.1016/j.ejrai.2025.100005 ↗

ICoISS · June 2023

Predicting Agricultural Yield by Integrating Remote Sensing Data and Machine Learning Technology

Multi-source geospatial yield prediction for Punjab farmlands using 17 years of weather, soil, and NDVI satellite data with statistical and real-world validation.

doi:10.1007/978-981-99-1726-6_6 ↗

Procedia Computer Science · January 2022

Blockchain-based Food Supply Chain — A Double Blockchain Framework

Double-blockchain framework to enhance transparency, traceability, and trust in the agricultural supply chain with immutable audit trails and provenance verification.

doi:10.1016/j.procs.2022.12.034 ↗

Bridging the Gap Between
Scale and Reliability.

Currently exploring roles in
Machine Learning and Data Engineering.
