Skills
|
A/B Testing, Machine Learning, Causal Inference, ETL, Analytics / Data Engineering, Agentic AI, Generative AI
Python (scikit-learn, numpy, pandas, xgboost, shap, fastapi), R (tidyverse, caret, glmnet), Git (GitHub)
Data Visualisation & Dashboarding: matplotlib, ggplot, Tableau, Shiny, Mode, Streamlit, Looker
Hadoop, Spark (PySpark), SQL (Snowflake / Hive / Presto), Docker, Airflow, Prefect, Jenkins, AWS, GCP, dbt
|
Work
|
Block, Inc. / Cash App
Oakland, CA
Senior Data Scientist (L6), Experimentation
2022 - 2025
-
Architected metrics, data models and statistical analysis for a platform running 200+ experiments quarterly.
-
Created a Git-based metrics repository that grew to over 800 metrics used in experiment measurement, working with data scientists companywide to provide guidance on metric design, definitions and data governance. Established standards ensuring data quality, transparency, reusability and reproducibility.
-
Orchestrated all metrics and experiment data pipelines to ensure reliability, making appropriate changes for performance and scalability as data processing volume grew to ~5TB / day, while managing compute costs.
-
Collaborated with Eng to build Exposium, a comprehensive internal data product that standardized calculation of experiment results and reporting, combined with robust monitoring and alerting.
-
Built APIs using FastAPI to query Snowflake, run inference against statistical / ML models, and aggregate data.
-
Obviated the need for standalone dashboards, saving more than 10 person-hours per experiment.
-
Increased experiment sensitivity and statistical power by 30+% using ML and causal inference techniques.
-
Used LLMs to generate text summaries of results based on raw metric movements, making experiment analytics self-serve for all experimenters and shortening decision-making time by up to 2 days.
-
Gave a bird's-eye view of the experiment program, enabling us to correct deficiencies and deviations from experiment best practices, such as peeking, as well as to run meta-analyses.
Robinhood
Menlo Park, CA
Senior Data Scientist, Experimentation
2021 - 2022
-
Worked with engineers and other data scientists on our internal experimentation platform Kaizen to:
-
Manage the 1000+ metrics in the metrics repo to ensure that high-quality, actionable metrics are used, and complete a migration to metrics that are defined fully in code and entirely self-serve.
-
Apply ML to estimate heterogeneous treatment effects (HTE) so that experimenters can understand how different segments of users respond to product changes, paving the way for personalized experiences.
-
Started and supervised Data Science Experiments oncall to help with experiment design and Kaizen usage.
NextRoll
San Francisco, CA
Data Scientist, Ad Performance
2020 - 2021
-
Oversaw ad performance (CPM, CPC, budget fulfillment, etc.) for major external clients, e.g. Yelp, Rakuten.
-
Proactively monitored and debugged issues relating to campaign creation, optimization and spend pacing, leading to improved relationships and renewal of $500K+ platform fees in total.
-
Owned all reporting for NextRoll Platform Services. Created and maintained the most widely used dashboards for this new team, orchestrated the ETL in Airflow and defined new aggregated data cubes.
Verizon Media Group / Yahoo! Inc.
Sunnyvale, CA
Research Engineer - Data Science
2015 - 2019
-
Worked on improving various aspects of our internal experimentation platform Evaluate:
-
Implemented a bucket size calculator to suggest appropriate control/test bucket sizes at experiment setup, ensuring that each A/B test has sufficient statistical power to detect changes in key metrics.
-
Led the development of Ready-to-use A/A buckets to minimize pre-existing differences between buckets, thereby allowing experiments to start immediately. Patented in the US and presented in a talk at IEEE Big Data 2017.
-
Member of Experimentation Council overseeing A/B tests companywide, from planning to decision review.
-
Carried out a thorough Yahoo! Finance user segmentation study, facilitating my subsequent work in:
-
Daily dashboards giving product managers greater visibility into trends in key metrics for all segments.
-
Lookalike modeling using xgboost to acquire new users, driving incremental new installs worth $700K+ in LTV.
|