Experimenter, content creator, data person💡📊💬

Also, read docstrings carefully!

My work at Saturn Cloud involves using and learning a lot of different technologies, particularly around scalable and accelerated data science in Python. This can involve benchmarking different tools for specific workloads to show why a customer would want to use Saturn Cloud’s platform over another platform. If I’ve learned anything through this, it’s that benchmarks are hard! This post covers one correction I had to make recently and some reflection on benchmarks in general.

The correction

I recently found a bug in one of these benchmarks I wrote in July comparing the performance of a machine learning grid search across a…

Part 2: Reflecting and giving advice

Photo by Japheth Mast on Unsplash

This is Part 2 of a series of posts. If you want to see the data behind which jobs I applied to and how I progressed through interviews, check out Part 1 here.

This article…

Part 1: Analyzing the Data

Photo by Sylwia Bartyzel on Unsplash

I recently started a new job as a Senior Data Scientist, and thought it would be a fun exercise to analyze the data about my job search. And to make it even more fun, why not share it with strangers on the internet?

I compiled data about each application and include my analysis below. The data includes companies and job titles I applied to, when I applied, when I heard back (if I did), when interviews occurred, and how each application ended. Some light analysis of this data shows trends in my job search habits, as well as some details…

Lightning-fast model training with RAPIDS

Photo by bady abbas on Unsplash

Disclaimer: I’m a Senior Data Scientist at Saturn Cloud — we make enterprise data science fast and easy with Python, Dask, and RAPIDS.

Prefer to watch? Check out a video walkthrough here.

Random forest is a machine learning algorithm trusted by many data scientists for its robustness, accuracy, and scalability. The algorithm trains many decision trees through bootstrap aggregation, then makes predictions by aggregating the outputs of the trees in the forest. Because of its ensemble nature, random forest can be implemented in distributed computing settings. …
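To make the bootstrap aggregation idea concrete, here is a minimal sketch in plain Python. It is not the RAPIDS implementation or anything from the post itself: the helper names (fit_stump, fit_forest, predict_forest) and the toy data are made up for illustration, each "tree" is only a one-split stump, and the per-split feature subsampling that a real random forest uses is left out.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision tree (a stump) that minimizes training errors."""
    majority = Counter(y).most_common(1)[0][0]
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # The right side is empty when t is the largest value; fall back
            # to the overall majority label in that case.
            right_label = Counter(right).most_common(1)[0][0] if right else majority
            errors = sum(v != left_label for v in left)
            errors += sum(v != right_label for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Bagging: train each stump on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        # Draw n row indices with replacement (the bootstrap step).
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Aggregate the stumps' outputs by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): one feature, two well-separated classes.
X = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
low_prediction = predict_forest(forest, [-1.0])
high_prediction = predict_forest(forest, [20.0])
```

Each pass through the loop in fit_forest depends only on its own bootstrap sample, so the trees could be trained on separate workers and gathered afterwards, which is what makes this kind of ensemble a natural fit for distributed settings.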
