I've had concerns about how much information I'm able to retain with the stop and go pace of tech learning created by working full-time as a nurse. I got a big confidence boost today while completing my first project in the Codecademy Pro Data Analyst Career Path. The Python Syntax: Medical Insurance Project creates the use case of a medical professional investigating how different factors such as age, BMI, and sex influence the cost of medical insurance for an individual. Extra bonus points for me since this example draws upon the medical field where I have a knowledge base. So let's break down how I worked through this project. (Disclaimer: I snagged the section headings from the Codecademy problem to make it easier to navigate for those who may be working on this project themselves.)
Setting up Factors
Step one was setting up the cost changing factors for use in the analysis. Values were provided for the needed variables, they simply needed assigned. It helps here to remember Python does not declare variables. A variable is created as soon as the equal sign is used to assign a value to it. Step one ended up looking like this:
# create the initial variables below
age = 28
sex = 0
bmi = 26.2
num_of_children = 3
smoker = 0
Note: the problem provides values of 0 or 1 for sex (0=female, 1=male) and smoker (0=no, 1=yes). Also, high five to Codecademy for mentioning the source of this data was not inclusive of data for non-binary individuals.
Working with the Formula
Once the factor related variables are assigned, it's time to setup the formula used to determine the cost of insurance based upon the named factors. Fortunately for me, this formula was provided and only needed assigned to the insurance_cost variable:
# Add insurance estimate formula below
insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
The next step was creating a print statement that would create some informative output from our formula results. This type of print statement requires some string concatenation to join our pieces together. Luckily, we can accomplish this in Python with just the + symbol and some quotation marks around our desired text so the interpreter knows our text is a string. We do need to take one more small step to get all of our pieces into the same type. We currently have some text strings and an integer we are trying to add together. Python first requires us to use the str() function to convert the integer to a string and allow it to be concatenated with the other strings. The process looks like this:
print("This person's insurance cost is " +str(insurance_cost) + " dollars.")
Our output would be: This person's insurance cost is 5469.0 dollars.
Looking at Age Factor
Ok, now the groundwork is done and we are ready to start manipulating factors to see how changes affect the cost of a client's insurance. We will start with changing our client's age by adding four years to the age variable. We will also copy our insurance_cost formula and assign it to a new variable called new_insurance_cost.
# Age Factor
age += 4
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
Remember: in Python the += operator adds a number to the current value of a variable and stores the resulting sum as the new variable value. In this example, we are taking the current value for age of 28, adding 4, and storing 32 as the new value for age. In order to determine how this change to age affects the cost of our client's insurance, we can create a formula to subtract the new cost from the original cost and store it in a variable called change_in_insurance_cost. We can also add a new print statement to concatenate the strings and communicate the change in insurance cost:
change_in_insurance_cost = new_insurance_cost - insurance_cost
print("The change in cost of insurance after increasing the age by 4 years is " +str(change_in_insurance_cost) + ".")
If you're playing along at home, you should now have two print statements on your screen:
This person's insurance cost is 5469.0 dollars
The change in cost of insurance after increasing the age by 4 years is 1000.0.
Looking at BMI Factor
Our next task is to take a look at how a change in BMI relates to change in insurance cost. If you are thinking this is a matter of following the same process as for changes to the client's age, you are on the right path! The process will be identical after we square away one little piece of housekeeping. Remember the change we made to the age variable? We have to update the age variable back to our original value so that we are only manipulating one variable at a time and can be sure any changes are related to a specific factor.
# BMI Factor
age = 28
bmi += 3.1
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print("The change in estimated insurance cost after increasing BMI by 3.1 is " + str(change_in_insurance_cost) + " dollars.")
We should now have three print statements:
This person's insurance cost is 5469.0 dollars
The change in cost of insurance after increasing the age by 4 years is 1000.0.
The change in estimated insurance cost after increasing BMI by 3.1 is 1147.0 dollars.
Looking at Male vs. Female Factor
The last factor this problem asks us to manipulate is the sex variable. We'll follow the same process we did for changing age and BMI. Our age variable was last assigned the original value of 28 so we don't need to make any changes there. We will need to reassign BMI the original value of 26.2 using the = assignment operator. Then it's just a matter of setting sex equal to 1 for male, re-running our equations, and creating an additional print statement.
# Male vs. Female Factor
bmi = 26.2
sex = 1
new_insurance_cost = 250 * age - 128 * sex + 370 * bmi + 425 * num_of_children + 24000 * smoker - 12500
change_in_insurance_cost = new_insurance_cost - insurance_cost
print("The change in estimated cost for being male instead of female is " +str(change_in_insurance_cost) + " dollars.")
Our list of print statements is now complete with all of our findings!
This person's insurance cost is 5469.0 dollars
The change in cost of insurance after increasing the age by 4 years is 1000.0.
The change in estimated insurance cost after increasing BMI by 3.1 is 1147.0 dollars.
The change in estimated cost for being male instead of female is -128.0 dollars.
A few very elementary conclusions we can draw from these findings are that growing older and gaining weight are factors that cause the estimated cost of insurance to increase. According to these findings, the expected insurance costs of a male are less than a female if all other factors are equal. Real world analysis would likely require more data points than we were provided here but, these conclusions seem reasonable within the constraints of this exercise.
There you have it. A walk-through of my solution to the Codecademy Pro Data Analyst Career Path--Python Syntax: Medical Insurance Project. Writing this article definitely helped me to cement this knowledge in my brain and I hope you will find it helpful in some way as well.