Hi, I'm Pranav Dhawan

MS Data Science @ George Washington University | AI Engineer

ABOUT ME

I’m a Machine Learning Engineer and Data Scientist currently pursuing my MS in Data Science at George Washington University. My experience spans computer vision, NLP, and analytics with a focus on turning messy data into reliable, scalable solutions.

I'm drawn to problems that require navigating ambiguity, optimizing under constraints, and building systems people actually use. That can mean refactoring models to improve accuracy and speed, engineering features from messy data, or deploying scalable pipelines.

SKILLS & TECHNOLOGIES

LANGUAGES & FRAMEWORKS

PROGRAMMING LANGUAGES

Python
R

LIBRARIES & FRAMEWORKS

Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn
PyTorch
TensorFlow
HuggingFace

DATA & CLOUD

DATA MANAGEMENT

SQL
MySQL
AWS
Google Cloud Platform

VISUALIZATION

Power BI
Tableau
Streamlit

FEATURED PROJECTS

Multimodal Financial Time Series Forecasting

Designing a GNN-based multimodal architecture for stock prediction on the FinMultiTime dataset. It fuses an LSTM (time series), FinBERT (NLP sentiment), and graph networks that model inter-sector dependencies, combined through attention-based late fusion. The target is sub-2% MAPE on held-out test data.

Python, GNN, LSTM, FinBERT, Attention
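To illustrate what attention-based late fusion means here, the following is a minimal sketch, not the project's actual model. It assumes each modality has already been encoded into a fixed-length vector; the modality names, the toy embeddings, and the fixed query vector are all hypothetical (in a trained model the query would be learned). A small MAPE helper is included because that is the metric the project targets.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_late_fusion(embeddings, query):
    """Fuse per-modality embeddings with scaled dot-product attention.

    embeddings: dict of modality name -> vector (all the same length)
    query: vector of the same length (learned in a real model, fixed here)
    Returns (fused_vector, weights_by_modality).
    """
    names = list(embeddings)
    dim = len(query)
    # Score each modality by its scaled dot product with the query.
    scores = [
        sum(q * x for q, x in zip(query, embeddings[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical toy embeddings standing in for the LSTM, FinBERT, and GNN outputs.
toy = {
    "price_lstm": [0.2, 0.1, 0.4],
    "news_finbert": [0.5, 0.3, 0.1],
    "sector_gnn": [0.1, 0.6, 0.2],
}
fused, weights = attention_late_fusion(toy, query=[1.0, 0.0, 0.0])
```

Late fusion keeps each encoder independent until the final step, so the attention weights also give a rough view of which modality the prediction relied on most.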

Edge-Based PII Detection & Censoring System

Achieved 98.1% F1-score and 97.9% recall across 54 PII entity types by building and benchmarking 4 transformer architectures, selecting DeBERTa as the production model, and deploying it via ONNX for on-device inference in under 100 ms. Built an end-to-end Streamlit demo with Tesseract OCR and SHAP/LIME explainability.

DeBERTa, ONNX, Streamlit, Tesseract OCR, SHAP/LIME

Stock Screen

An AI-powered terminal-style dashboard that unifies stock analysis, real-time news sentiment, and SEC filing insights into one interface. It aggregates and scores financial news, summarizes regulatory filings with AI, and visualizes trends through interactive charts. Designed for investors, it streamlines decision-making with data-driven insights. Upcoming predictive forecasting will combine price patterns, news sentiment, stock correlations, SEC filings, and time-series data to produce directional predictions.

GCP Cloud Run, Docker, Python, FinBERT

Serverless ETL — Weather Dashboard

Built a fully serverless ETL pipeline on AWS using Lambda and EventBridge to ingest and transform real-time weather data from the OpenWeather API on a scheduled cadence, with results surfaced through an interactive dashboard.

AWS Lambda, EventBridge, OpenWeather API, ETL
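The transform step of a pipeline like this can be sketched as below. This is an illustrative example rather than the project's code: it assumes payloads shaped like the OpenWeather current-weather response (temperature in Kelvin by default), and the `payloads` event key and output field names are hypothetical. In a deployed pipeline the function would be invoked by an EventBridge schedule and would fetch the data itself.

```python
import json
from datetime import datetime, timezone


def transform(raw):
    """Flatten one OpenWeather-style current-weather payload into a row."""
    return {
        "city": raw["name"],
        # "dt" is a Unix timestamp in seconds (UTC).
        "observed_at": datetime.fromtimestamp(raw["dt"], tz=timezone.utc).isoformat(),
        # Convert Kelvin to Celsius.
        "temp_c": round(raw["main"]["temp"] - 273.15, 2),
        "humidity_pct": raw["main"]["humidity"],
        "conditions": raw["weather"][0]["description"],
    }


def lambda_handler(event, context):
    """Standard Lambda entry point; reads payloads from a hypothetical event key."""
    rows = [transform(payload) for payload in event.get("payloads", [])]
    return {"statusCode": 200, "body": json.dumps(rows)}
```

Keeping `transform` as a pure function separate from the handler makes it easy to unit test without any AWS resources.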

PROFESSIONAL EXPERIENCE

Expected May 2026
Current

Master of Science, Data Science

George Washington University | Washington, DC

GPA: 3.74/4.0

Relevant Coursework:
Machine Learning, Deep Learning, NLP, Data Mining, Cloud Computing, Time Series Analysis
February 2024 - August 2024

Machine Learning Engineer

Lumina Datamatics

Fine-tuned computer vision models to detect and extract complex equations from 10,000+ unstructured documents, achieving high precision in technical expression recognition and eliminating the need for manual post-processing.
Reduced manual correction overhead by 16% and cut inference latency by 0.3 ms per page by replacing LayoutParser with a custom YOLO-based document layout pipeline.
Reduced legal document search query time from minutes to under 5 seconds by architecting a hybrid RAG system with optimized vector embeddings and FAISS indexing, improving search relevance for counsel teams.
Technologies Used:
Python, YOLO, Computer Vision, RAG, FAISS
July 2023 - September 2023

Machine Learning Intern

HCL Technologies | Noida, India

Identified the top 5 drivers of workforce performance with 87% prediction accuracy by building Logistic Regression and Random Forest models trained on activity data from 500+ employees, enabling targeted HR interventions.
Engineered 12+ data-driven KPIs from raw employee activity logs (screen time, app usage) using SQL and Python, surfacing patterns that differentiated high-performance clusters from the broader population.
Designed interactive Tableau dashboards to visualize productivity trends, enabling management to make data-driven workforce planning decisions.
Technologies Used:
Python, SQL, scikit-learn, Tableau
May 2023 - July 2023

Summer Intern

Ernst & Young | Gurgaon, India

Consolidated Sales and HR KPI reporting into 4 Power BI dashboards covering revenue trends and attrition rates, cutting cross-functional reporting turnaround and supporting strategic planning for senior leadership.
Achieved 100% reporting accuracy across monthly business reviews by automating ETL workflows for 5+ data sources using Alteryx, eliminating manual data cleaning and transformation.
Technologies Used:
Power BI, Alteryx, ETL, Data Visualization
May 2024
Graduated

B.Tech in Computer Science and Engineering

Manipal University Jaipur | Rajasthan, India

GPA: 8.51/10.0

Relevant Coursework:
Algorithms & Data Structures, Database Management Systems
Aug 2021 – Sept 2021

Full Stack Intern

Learnovate Ecommerce | Remote

Developed and maintained full-stack features for an e-commerce platform, contributing to both front-end UI components and back-end API integrations.
Technologies Used:
HTML, CSS, JavaScript, REST APIs
Jun 2021 – Aug 2021

Front-End Intern

Education 4 ol | Remote

Built responsive front-end interfaces for an ed-tech platform, improving user experience and accessibility for online learning content.
Technologies Used:
HTML, CSS, JavaScript

GET IN TOUCH

Let's work together!

I'm always interested in new opportunities in machine learning, data science, and AI. Feel free to reach out!