Nadia Osei-Bonsu
Data Scientist
nadia.oseibonsu@email.com · +1 (512) 874-3301 · Austin, TX · linkedin.com/in/nadiaoseibonsu · github.com/noseibonsu
Summary
Data scientist with 4 years of experience translating complex datasets into models and insights that drive measurable product and business decisions. Deep experience across the full ML lifecycle, from feature engineering and experimentation to production deployment and monitoring. Fluent in Python, SQL, and cloud-native ML tooling.
Skills
Languages: Python, R, SQL, Bash · ML & Stats: scikit-learn, XGBoost, LightGBM, PyTorch, statsmodels · Data & Pipelines: Pandas, Spark, Airflow, dbt, Snowflake · MLOps: MLflow, SageMaker, Docker, FastAPI · Visualization: Tableau, Plotly, Seaborn, Looker
Experience
Data Scientist II
Helix Analytics · Austin, TX | Mar 2022 – Present
- Built churn prediction model using LightGBM on 14M customer records, reducing monthly churn by 19% and retaining $3.8M ARR.
- Designed and ran 40+ A/B experiments across pricing and onboarding flows, directly informing 6 product decisions that lifted revenue by 11%.
- Deployed real-time recommendation engine via FastAPI on SageMaker, serving 2M daily predictions at under 30ms p95 latency.
- Automated feature pipeline with Airflow and dbt, cutting model retraining time from 11 hours to 43 minutes.
Data Scientist
Prism Data Co. · Dallas, TX | Jun 2020 – Feb 2022
- Developed NLP classification model on 800K support tickets, automating routing with 91% accuracy and saving 18 hours/week of manual triage.
- Built time-series demand forecasting pipeline using Prophet and XGBoost, reducing inventory overstock costs by $620K/yr.
- Partnered with engineering to instrument event tracking for 3 core product flows, establishing the baseline data layer for all downstream experiments.
Projects
OpenCredit: Credit Risk Scoring Model | Python · XGBoost · SHAP · AWS
- Trained gradient-boosted model on 2.1M loan records, achieving AUC of 0.94 and outperforming a baseline logistic regression by 17 points of AUC.
- Integrated SHAP explainability layer, making model decisions interpretable for compliance review across 5 regulatory categories.
PulseCheck: Real-Time Sentiment Dashboard | Python · Transformers · Plotly · FastAPI
- Fine-tuned DistilBERT on domain-specific corpus of 120K labeled reviews, reaching 93.4% F1 on held-out test set.
- Deployed live dashboard ingesting 10K social mentions/hour with sub-second refresh and zero infrastructure cost using serverless stack.
Education
M.S. in Statistics & Machine Learning · Carnegie Mellon University | May 2020
Thesis: Bayesian Approaches to Sequential A/B Testing Under Non-Stationarity · GPA: 3.9 / 4.0
Certifications & Awards
AWS Certified Machine Learning – Specialty | Amazon Web Services · 2022
TensorFlow Developer Certificate | Google · 2021
1st Place, DataHack Global Hackathon | Kaggle · 2022