My work at Saturn Cloud involves using and learning a lot of different technologies, particularly around scalable and accelerated data science in Python. This can involve benchmarking different tools for specific workloads to show why a customer would want to use Saturn Cloud’s platform over another platform. If I’ve learned anything through this, it’s that benchmarks are hard! This post covers one correction I had to make recently and some reflection on benchmarks in general.
I recently found a bug in one of these benchmarks, which I wrote in July to compare the performance of a machine learning grid search across a…
This is Part 2 of a series of posts. If you want to see the data behind which jobs I applied to and how I progressed through interviews, check out Part 1 here.
There seems to be a lot of content on the internet about how to land your first data science job, but not much about how to move up from a junior or mid-level position. Six months ago I started a new job as a Senior Data Scientist, and I thought others might find it useful to read the story of how I got the job.
This article…
I recently started a new job as a Senior Data Scientist, and thought it would be a fun exercise to analyze the data about my job search. And to make it even more fun, why not share it with strangers on the internet?
I compiled data about each application and include my analysis below. The data includes companies and job titles I applied to, when I applied, when I heard back (if I did), when interviews occurred, and how each application ended. Some light analysis of this data shows trends in my job search habits, as well as some details…
Disclaimer: I’m a Senior Data Scientist at Saturn Cloud — we make enterprise data science fast and easy with Python, Dask, and RAPIDS.
Prefer to watch? Check out a video walkthrough here.
Random forest is a machine learning algorithm trusted by many data scientists for its robustness, accuracy, and scalability. The algorithm trains many decision trees through bootstrap aggregation (each tree is fit on a sample of the training data drawn with replacement), then makes predictions by aggregating the outputs of the trees in the forest. Because the trees in the ensemble are trained independently of one another, random forest lends itself to distributed computing settings. …
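To make bootstrap aggregation concrete, here is a minimal sketch in plain Python. It is a simplification, not how any particular library implements random forest: each "tree" is a one-split decision stump, the per-split feature subsampling that random forests normally use is left out, and the function names (`fit_stump`, `fit_forest`, `predict_forest`) and the toy data are made up for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 'tree': one feature, one threshold, chosen by exhaustive search."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for left_label, right_label in ((0, 1), (1, 0)):
                preds = [left_label if row[feature] <= threshold else right_label
                         for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, left_label, right_label)
    return best[1:]  # drop the error count, keep the rule


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features, binary labels.
X = [[0.1, 1.0], [0.2, 0.8], [0.3, 0.9], [0.4, 0.7],
     [0.6, 0.2], [0.7, 0.3], [0.8, 0.1], [0.9, 0.4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
print(predict_forest(forest, [0.05, 0.9]), predict_forest(forest, [0.95, 0.2]))
```

Each iteration of the loop in `fit_forest` depends only on its own bootstrap sample, which is why the training step can be farmed out to separate workers in a distributed setting.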