In this post, I’ll train a model to predict pneumonia from chest X-rays. This is my first time using SageMaker. Let’s see how it goes!

A) Load the Data

We’ll install the necessary libraries and import those that are already locally available.

!pip3 install tqdm
!pip3 install s3fs
!pip3 install seaborn==0.9.0
!pip3 install imbalanced-learn
%matplotlib inline
import numpy as np
import pandas as pd
import seaborn as sns
import os
import io
import shutil
import json
import random
import boto3
import s3fs
import sagemaker
from PIL import Image
from itertools import islice
import requests
from io import BytesIO
from imblearn.pipeline import Pipeline…

1) Introduction

As I wrap up Module 3 of Flatiron School’s Data Science bootcamp, I will be tackling a DrivenData competition, Pump It Up: Data Mining the Water Table.

Follow along below, or take a look at the Jupyter notebook and repo on GitHub.

The competition provides a dataset of water points in Tanzania and their associated characteristics. It is our job to predict, using the supplied training labels, whether a pump is functional, non-functional, or functional but in need of repair.

Below, I’ll build a model to predict water point status for the testing dataset. …


1) Introduction

As I wrap up Module 2 of Flatiron School’s Data Science bootcamp, I will be running a multiple linear regression on a subset of the King County Housing Sale Price dataset. I’ve used the King County Realtor Glossary to interpret the feature names in the dataset. Follow along below, or take a look at the Jupyter notebook.

To guide my analysis, I began by asking three questions that were of interest to me as I read through the King County Realtor Glossary:

1) How does classification as a low-grade* property affect sale price?

2) What is…
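
To make the first question concrete, here is a minimal sketch of how a low-grade indicator could enter a regression. The prices are made up purely for illustration, and `fit_simple_ols`, `is_low_grade`, and `sale_price` are hypothetical names, not code from the notebook. The full analysis uses multiple predictors; with a single 0/1 predictor, as here, the fitted slope is simply the difference in mean sale price between the two groups.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up toy data (an assumption for illustration only):
# 1 = low-grade property, 0 = otherwise; prices in dollars.
is_low_grade = [0, 0, 0, 1, 1]
sale_price = [500_000, 450_000, 550_000, 300_000, 340_000]

intercept, slope = fit_simple_ols(is_low_grade, sale_price)

# The intercept is the mean price of the non-low-grade group,
# and the slope is the low-grade mean minus that baseline.
print(f"baseline mean price: {intercept:,.0f}")
print(f"low-grade effect:    {slope:,.0f}")
```

Adding more predictors changes the interpretation: the coefficient then describes the low-grade effect with the other features held fixed.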


As part of my data science bootcamp with the Flatiron School, I’m playing the role of a data science consultant to a tech company looking to branch out into the movie industry. They’ve asked me to help them understand the biz before they take the plunge.

I’ve chosen three questions to inform how they choose their scripts and prioritize new projects:

1) Do multi-genre movies earn more profit on average than single-genre movies?

2) Does profit vary by runtime and audience type (domestic/foreign)?

3) How do the 10 highest profit movies compare in terms of their relationship between rating and…
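
As a rough sketch of how the first question could be approached, the snippet below groups movies by whether they list one genre or several and compares average profit. The records, the field names (`genres`, `gross`, `budget`), and the helper `average_profit_by_genre_count` are all assumptions for illustration; they are not taken from the project's data.

```python
from statistics import mean

# Made-up toy records (illustration only); amounts in millions of dollars.
movies = [
    {"genres": "Action,Comedy", "gross": 120.0, "budget": 50.0},
    {"genres": "Drama", "gross": 40.0, "budget": 30.0},
    {"genres": "Horror,Thriller", "gross": 90.0, "budget": 20.0},
    {"genres": "Documentary", "gross": 12.0, "budget": 2.0},
]


def average_profit_by_genre_count(records):
    """Average profit (gross minus budget) for single- vs multi-genre movies."""
    groups = {"single": [], "multi": []}
    for rec in records:
        profit = rec["gross"] - rec["budget"]
        # A comma in the genre string marks a movie with more than one genre.
        key = "multi" if "," in rec["genres"] else "single"
        groups[key].append(profit)
    # Skip any group that ended up empty so mean() never sees an empty list.
    return {key: mean(vals) for key, vals in groups.items() if vals}


averages = average_profit_by_genre_count(movies)
print(averages)
```

A difference in group averages is only a starting point; a real comparison would also need to account for sample size and spread within each group.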

Caitlin Snyder
