
Data Scientist · ML Engineer · Agentic AI

VIGNESHWARI JAYAPRAKASH

  • Agentic AI systems — LangGraph + RAG reasoning over real financial & operational data
  • GenAI platform cutting intelligence retrieval time by 70% at NM State Government
  • Fraud detection saving $1M+/yr across 1M+ daily transactions — earned CFO recognition
  • Supply chain demand forecasting on 46M+ records — MAPE < 8%
  • 5+ years production ML · MS Data Science ASU · 4.0 GPA
5+ Years Experience
$1M+ Cost Savings Delivered
46M+ Records Forecasted
4.0 GPA at ASU

About

My Journey

I began my career in 2013 at Infosys, progressing from Software Engineer to Data Scientist and Technology Analyst over more than five years, building production ML models and data systems for BNSF Railway, one of North America's largest freight networks.

After relocating to the United States, I invested 1,200+ hours in self-directed upskilling in Generative AI, LLMs, RAG architectures, and cloud-native ML, building hands-on expertise in PyTorch, AWS, and Docker before enrolling at ASU.

I'm completing an M.S. in Data Science (Computing & Decision Analytics) at Arizona State University with a perfect 4.0 GPA. I was also a Gold Medalist during my undergraduate studies at Anna University.

Most recently, I served as a Data Scientist Intern at the NM Department of IT — architecting a first-of-its-kind GenAI conversational analytics platform using LangGraph + RAG, and deploying ensemble anomaly detection models achieving 85% threat detection accuracy at statewide scale.

Beyond the code...

Staying Active

Gym sessions and outdoor activities.

Being Creative

Painting, sewing, and intricate craftwork.

Recharging

Books that expand perspective beyond the tech world.

Capabilities

Technical Skills

Agentic AI & LLM Engineering

End-to-end agentic system design — multi-step reasoning pipelines, RAG over financial and operational data, vector stores, semantic search, structured outputs, and production LLM deployment in regulated environments.

LangGraph · LangChain · RAG · ChromaDB · FAISS · GPT-4 · Llama 3 · Prompt Engineering · Pydantic · FastAPI

ML & Forecasting

Classification, regression, anomaly detection, time-series forecasting, demand planning, and ensemble methods in high-volume production environments.

XGBoost · ARIMA · PyTorch · TensorFlow · Scikit-learn · SHAP

MLOps & Engineering

End-to-end ML lifecycle: experiment tracking, model versioning, automated pipelines, and cloud-native deployment at scale.

MLflow · Airflow · Docker · Kubernetes · AWS · CI/CD

Data Engineering

Distributed ETL frameworks, real-time pipelines, SQL optimization, and data quality monitoring, including pipelines handling 5TB+ of daily ingestion at 99.9% uptime.

Python · SQL · PySpark · Kafka · Hadoop · Pandas

Analytics & BI

Statistical modeling, hypothesis testing, executive dashboards, and data-driven storytelling for cross-functional stakeholders.

Tableau · Power BI · Streamlit · A/B Testing · EDA

Cloud & Infrastructure

Cloud-native ML deployment, serverless architectures, container orchestration, and CI/CD pipelines across the AWS ecosystem.

AWS S3 · SageMaker · Lambda · Kubernetes · Git

Career

Professional Journey

New Mexico Dept. of Information Technology

Data Scientist Intern

Jun 2025 – Aug 2025 · State Government · Albuquerque, NM

Latest Role

GenAI Conversational Analytics Platform

Architected a GenAI platform using LangGraph + RAG on AWS Lambda & S3 processing 10K+ cybersecurity incidents, reducing intelligence retrieval time by 70% (hours → seconds).

Ensemble Anomaly Detection

Deployed XGBoost + Isolation Forest achieving 85% threat detection accuracy with 60% fewer false positives — integrated via FastAPI microservices.

Real-Time ETL & Data Quality

Built Airflow-orchestrated ETL pipelines with automated validation, improving data quality by 40% and cutting failure rate by 30%.

Infosys Ltd · BNSF Railway

Technology Analyst — Data & Analytics

Jan 2017 – Jan 2019 · Freight Network · North America

CFO Recognition

ML-Powered Fraud Detection

Built XGBoost + Random Forest fraud detection processing 1M+ daily transactions, delivering $1M+ annual savings and earning a formal CFO commendation.

SHAP-Based Model Explainability

Engineered a SHAP explainability layer for high-stakes financial decisions, reducing stakeholder review cycles by 30% and ensuring regulatory transparency.

A/B Testing & Model Iteration

Applied causal inference & A/B testing across 5+ model iterations, improving F1 score by 18% and reducing false negatives in production.

Infosys Ltd · BNSF Railway

Data Scientist

Jun 2015 – Jan 2017 · Predictive Maintenance & Scale

Predictive Maintenance for Locomotives

Designed time-series predictive maintenance models for 500+ locomotives, reducing unplanned downtime by 25% and saving an estimated $400K annually.

Distributed ETL at Scale

Orchestrated PySpark + Hadoop ETL handling 5TB+ daily ingestion at 99.9% uptime, powering Tableau & Power BI dashboards for 10+ stakeholders.

MLflow & ML Lifecycle

Implemented MLflow experiment tracking and versioning, cutting deployment time by 35% and standardizing ML lifecycle across a 6-person team.

Infosys Ltd · BNSF Railway

Software Engineer

Oct 2013 – May 2015 · Data Engineering & Pipelines

Production Data Pipelines

Engineered ML data pipelines processing 500K+ daily records, maintaining 99%+ data completeness across sprint cycles.

Automated Data Validation

Built automated validation and quality monitoring frameworks, reducing data defect rates by 45% across 3 production pipelines.

Independent Researcher

Self-Directed Research — Generative AI & Data Science

Jan 2019 – May 2024 · United States

1,200+ hrs

Structured Upskilling in GenAI & Cloud ML

Completed 1,200+ hours in GenAI, LLMs, RAG architectures, and cloud-native ML — gaining hands-on expertise in PyTorch, AWS, and Docker ahead of M.S. enrollment at ASU.

Certifications & Personal Projects

Earned the Machine Learning Specialization, Deep Learning Specialization, and Generative AI with AWS certifications. Built multiple projects across computer vision, NLP, and RAG systems.

Portfolio

Featured Projects

SentinelRAG
18,700+ entities · local inference
In Progress
Python · LangChain · ChromaDB · Ollama · FastAPI · Streamlit

SentinelRAG — Sanctions Compliance Screening

Automated sanctions screening system that checks entities against a U.S. government sanctions list of 18,700+ entries using keyword, semantic, and knowledge graph search. All data is processed locally; the system is deployed via FastAPI with a bulk-screening dashboard and a tamper-proof audit log.

99% Recall · 100% Precision · 18,700+ entities · Zero hallucinations · Privacy-first (local)
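
The page does not say how the audit log is protected, so the following is only a minimal sketch of one common approach, a hash chain, in which each entry stores the hash of the previous one. The function names (`append_entry`, `verify`), the record fields, and the entity names are hypothetical and are not taken from SentinelRAG. A chain like this makes edits detectable rather than impossible.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    """SHA-256 over the canonical JSON of the record plus the previous hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    })


def verify(log):
    """Return True only if every entry still matches its stored hash and link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical screening decisions (illustrative only)
audit_log = []
append_entry(audit_log, {"entity": "Example Trading Co", "decision": "clear"})
append_entry(audit_log, {"entity": "Sample Holdings Ltd", "decision": "flagged"})
print(verify(audit_log))  # True

# Editing an earlier record breaks the chain
audit_log[0]["record"]["decision"] = "flagged"
print(verify(audit_log))  # False
```

In a real deployment the latest hash would also need to be stored somewhere the log writer cannot modify, otherwise the whole chain could simply be recomputed.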
Supply Chain AI
46M+ records · M5 dataset
In Progress
Python · XGBoost · ARIMA · LangChain · ChromaDB · FastAPI · MLflow · Docker

Agentic Supply Chain Demand Orchestrator

End-to-end ML forecasting + GenAI reasoning system for service parts demand planning. Built on the M5 Forecasting Dataset (46M+ real Walmart records), remapped to a semiconductor equipment supply chain schema covering new product introduction (NPI) launches, reliability signals, field operations, and service campaigns. A GenAI reasoning layer (LangChain + RAG) lets stakeholders query forecast outputs in plain English.

46M+ records · MAPE < 8% · 35% faster lookup · NPI, Reliability, Campaigns · M5 Gold Standard
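
For readers unfamiliar with the headline metric, here is a minimal sketch of how MAPE (mean absolute percentage error) is typically computed. The `mape` helper and the demand numbers are illustrative assumptions, not project code or project data. Periods with zero actual demand are skipped because the percentage error is undefined there.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical weekly demand for a single part (illustrative numbers)
actual = [100, 80, 120, 90]
forecast = [95, 84, 114, 99]
print(f"MAPE = {mape(actual, forecast):.2f}%")  # MAPE = 6.25%
```

A result below 8 on this scale would correspond to the "MAPE < 8%" figure quoted above.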
Railway Track Inspection
PyTorch · YOLO11 · OpenCV · Docker

AI-Powered Railway Track Inspection Pipeline

Computer vision pipeline that processes 500+ hours of footage to detect safety-critical defects, with a real-time Streamlit dashboard offering defect-type filtering and exportable reports.

92% accuracy · 40% less manual inspection
Wildfire Detection
PyTorch · Vision Transformers · XAI

Multimodal Wildfire Detection Network

Deep learning pipeline fusing RGB & thermal imagery via Vision Transformers with attention heatmap explainability.

98.6% accuracy · +20% precision
Fraud Detection
XGBoost · SHAP · scikit-learn

Transaction Risk Scoring & Fraud Detection

Time-aware fraud pipeline on 300K+ real-time transactions with leakage-safe temporal modeling and SHAP interpretability.

MCC: 0.50 · 300K+ transactions
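
As a rough illustration of what "leakage-safe temporal modeling" usually means, the sketch below splits transactions by time so that the model is never trained on anything later than what it is tested on. The `temporal_split` helper, the field names, and the sample rows are assumptions made for illustration; the project's actual pipeline may differ.

```python
from datetime import datetime


def temporal_split(rows, cutoff, time_key="timestamp"):
    """Train on rows strictly before the cutoff, test on rows at or after it."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    train = [r for r in ordered if r[time_key] < cutoff]
    test = [r for r in ordered if r[time_key] >= cutoff]
    return train, test


# Hypothetical transactions (illustrative only)
transactions = [
    {"timestamp": datetime(2024, 1, 3), "amount": 40.0, "is_fraud": 0},
    {"timestamp": datetime(2024, 1, 1), "amount": 12.5, "is_fraud": 0},
    {"timestamp": datetime(2024, 1, 9), "amount": 310.0, "is_fraud": 1},
    {"timestamp": datetime(2024, 1, 6), "amount": 75.0, "is_fraud": 0},
]

train, test = temporal_split(transactions, cutoff=datetime(2024, 1, 5))

# Every training row precedes every test row, so no future data leaks in
assert max(r["timestamp"] for r in train) < min(r["timestamp"] for r in test)
print(len(train), "train rows,", len(test), "test rows")  # 2 train rows, 2 test rows
```

A random shuffle split, by contrast, would let the model see transactions from after the test period, which tends to inflate offline metrics.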

Education

Academic Background

M.S. in Data Science

Arizona State University

Computing & Decision Analytics

Aug 2024 – May 2026 (Expected)

GPA: 4.0 / 4.0

B.Tech in Information Technology

Anna University

Information Technology

2009 – 2013

GPA: 3.6 / 4.0 · Gold Medalist

Certifications & Awards

Machine Learning Specialization — DeepLearning.AI
Deep Learning Specialization — DeepLearning.AI
Generative AI with AWS — Amazon Web Services
CIO Recognition — NM DOIT (RAG chatbot cutting triage by 70% & Power BI dashboard)
CFO Recognition — BNSF Railway ($1M+ savings)
Insta Award — Infosys (ML Engineering Impact)
Gold Medalist — Anna University