Open to work

Apaar
Saroj

AI Engineer · Agent Builder · Data Scientist

MS in Information Technology at Arizona State (4.0 GPA). Currently building LangGraph agents and LLM pipelines. Strong foundation in Python, ML, and production data systems from 3+ years at EXL and Team Computers.

View Projects Get in Touch

3+Years Industry

4.0GPA at ASU

AIFirst Focus

Apaar Saroj

AI Engineer · Open to Work

LangGraph / Agent SystemsProficient

LLM Pipelines & Tool UseProficient

ML & Statistical ModelingAdvanced

Python & SQLExpert

● Available AI Engineer PythonLangGraphAnthropic API

About

Analytics roots.
AI trajectory.

Recently finished an MS in Information Technology at Arizona State with a 4.0 GPA, and currently targeting AI Engineer roles focused on building agents, LLM pipelines, and end-to-end AI products. The shift from classical ML toward agentic systems is deliberate, and the projects here reflect it.

The two most recent projects are both shipped AI systems: a Personal Trainer Agent built with LangGraph, ChromaDB, and PubMed-sourced RAG with persistent SQLite memory, and a News Summarizer built as a LangGraph state graph with a self-healing quality loop and LLM-as-Judge evaluation. Before grad school, three years of production data work at EXL and Team Computers built the engineering foundation that sits underneath those systems.

Strongest in Python, LangGraph, the Anthropic API, and classical ML. Happy to talk about AI Engineer roles, interesting agent problems, or anything in the LLM space.

🤖
Agentic AIShipped: Personal Trainer Agent (LangGraph + ChromaDB RAG + SQLite memory) and News Summarizer (LangGraph + LLM-as-Judge eval)

🎓
Arizona State UniversityMS Information Technology · 4.0 GPA · May 2026

⚡
Production Data Background3+ years across data engineering, predictive modeling, and BI at EXL and Team Computers

📍
Open to US rolesRemote, hybrid, or on-site

Projects

Work that ships.

Personal Project · Agentic AIFEATURED

Personal Trainer AI Agent

LangGraph agent with a RAG layer built on PubMed research and curated exercise data. Claude reasons over member profiles, workout history, and a ChromaDB fitness knowledge base before giving any advice. Persistent memory via SQLite checkpointing means the agent remembers across sessions, not just within one.

PythonLangGraphLangChainChromaDBRAGPubMed APISQLite

Graph

chatbot node + ToolNode, conditional edge on tool_calls, SqliteSaver checkpointer

RAG knowledge base

PubMed abstracts on 8 fitness topics + curated exercise markdown, chunked and embedded in ChromaDB

5 tools

log_workout · get_workout_history · generate_workout_plan · get_user_profile · search_fitness_knowledge

Memory

SQLite checkpointer persists conversation state across sessions keyed by member PT-XXXX ID

ASU · Graduate CapstoneFEATURED

Healthcare System Efficiency Evaluation Framework

End-to-end ML pipeline scoring healthcare financing efficiency across 52 countries using WHO, OECD, and World Bank data. Shannon entropy derives objective indicator weights, fixed-effects regression and SHAP-explained Random Forests model the drivers, deployed as a live Streamlit dashboard with a policy simulator.

Pythonscikit-learnSHAPK-MeansRandom ForestStreamlitDockerPanel Regression

● Live Demo

Healthcare Efficiency dashboard screenshot

Scale

52 countries · 22 years · 4 international data sources

Key Finding

US spends 16.5% of GDP but ranks #43/52 in efficiency, and the model isolates why

Deliverable

6-tab Streamlit dashboard + Docker deployment + live policy simulator

Personal Project · Gen-AILangGraph

AI News Summarizer

LangGraph state graph that fetches news via NewsAPI, ranks articles by relevance using Claude as judge, summarizes with Claude Haiku, and auto-evaluates with LLM-as-Judge and ROUGE-L. Low-quality results trigger automatic query refinement and a retry loop, up to 3 iterations.

Key Design

5-node state graph · self-healing feedback loop · LLM-as-Judge · ROUGE-L scoring

PythonLangGraphLangChainClaude HaikuROUGE-LLLM-as-JudgeStreamlit

● Live Demo

ASU · NLP ExplorationResearch

Cross-Domain Tone Classification

Studied whether BERT models trained on incompatible label systems can still map to a shared target using a small calibration set. One model used tone labels, the other emotion labels. The finding: domain alignment matters more than dataset size.

Key Insight

Smaller domain-aligned dataset (2.6k) outperformed the larger misaligned one (8k): 54.6% vs 46.5%

BERTPyTorchHuggingFaceNLPTransfer Learning

Experience

Where I've built.

Sep 2022 – Jul 2024

Business Analyst

EXL · Gurugram, India

Embedded with the data team for a Big Six UK energy provider, building and running 20+ daily data engineering scripts that processed over 100K financial transactions across cash matching, unallocated transactions, and balance reconciliation for a regulated portfolio of around 10 million households.
Built 6+ Power BI dashboards for regulatory compliance, cash flow tracking, and final credits reporting, used directly by finance leadership for strategic planning and audit readiness.
Automated recurring financial reports in SQL and Python, cutting turnaround on time-sensitive queries from 24 hours to under 1, and took ownership of processes handed over from the onshore team.

Jul 2021 – Sep 2022

Developer, BI & Analytics

Team Computers · Gurugram, India

Built Python predictive models on wind turbine sensor data to identify downtime drivers, with real-time Tableau dashboards for performance monitoring and energy forecasting.
Integrated weather forecast data into the analytics pipeline, building energy production models that improved resource planning and operational efficiency by 15%.
Reduced unplanned turbine downtime by 20% and maintenance costs by 15% through predictive maintenance models.

Tech stack.

AI & Agents

Anthropic API / Tool Use Proficient

LangGraph Proficient

LangChain / LCEL Proficient

Prompt Engineering Advanced

LLM-as-Judge / Eval Proficient

BERT / Transformers Proficient

ML & Data

Python (pandas, NumPy) Expert

SQL Expert

scikit-learn / XGBoost Advanced

SHAP / Explainability Advanced

Statistical Modeling Advanced

ETL & Data Pipelines Advanced

Tools & Infra

Streamlit Advanced

Docker Proficient

Git / GitHub Proficient

Tableau / Power BI Advanced

AWS Foundational

ChromaDB / Vector Stores Proficient

Credentials.

MS Information Technology

Arizona State University

2024 – May 2026Tempe, AZ

★ 4.0 GPA

B.Tech Computer Science

Kurukshetra University

2015 – 2019Haryana, India

Certifications

AWS Academy ML Foundations

Amazon Web Services

Prompt Engineering for Developers

DeepLearning.AI

What leaders say.

"A pivotal role in developing complex analytics processes and delivering critical MI reports that reduced client costs."

Manvi Gupta

Sr. AVP · EXL

"Unparalleled professionalism and data-driven insight that guided our organization toward optimal outcomes."

Ajeet Singh Kaintura

Manager, Transformation & Solutioning · EXL

Model	Training Data	Accuracy	Macro F1
Model B, Tone Analysis	2,684 samples	54.58%	0.5337
Model C, GoEmotions	8,044 samples	46.50%	0.4635

ApaarSaroj

Analytics roots.AI trajectory.