Correlation in data

Dataset: Our dataset was cobbled together from the monthly average ice cream production in 2011 and the average monthly temperature for the US.
File: correlation/weatherIceCream.csv
File: Correlation/scatters1.py
# Import Libraries
from bokeh.models import HoverTool
from bokeh.plotting import figure, show, output_file
import pandas as pd
# Read Data
df = pd.read_csv("correlation/weatherIceCream.csv", usecols=['Date','AVG temp C','Ice Cream production'])
hover = HoverTool(tooltips=[
    ("(Temp, Ice Cream Production)", "($x, $y)")
])
p = figure(x_range=(-10, 30), y_range=(35, 90), tools=[hover])
# Main chart definition
p.scatter(df['AVG temp C'], df['Ice Cream production'], size=10)
p.background_fill_color = "mintcream"
p.background_fill_alpha = 0.2
p.xaxis.axis_label = "Avg Temp C"
p.yaxis.axis_label = "Ice Cream Production (1000, Gallons)"
show(p)
Check the live chart or run it from the code sample on GitHub. You can hover over the data points, which makes it easier to understand the relationships.

Pearson's Correlation Coefficient (PCC), Pearson's r, or simply the correlation coefficient.

File: pearsonr.py
from scipy.stats import pearsonr
import pandas as pd
# Read Data
df = pd.read_csv("correlation/weatherIceCream.csv", usecols=['Date','AVG temp C','Ice Cream production'])
# pearson in the hauz! pearsonr returns the coefficient r and a two-sided p-value
print(pearsonr(df['AVG temp C'], df['Ice Cream production']))
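If you want to see what a function like pearsonr is doing under the hood, here is a minimal sketch of the textbook formula: the summed products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The function name pearson_r and the sample numbers are made up for illustration; they are not the values in weatherIceCream.csv, and this version does not compute a p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of every point from its own mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: how much x and y move together
    co_dev = sum(a * b for a, b in zip(dx, dy))
    # Denominator: how much each one moves on its own
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one of the inputs is constant")
    return co_dev / math.sqrt(ss_x * ss_y)


# Hypothetical data, for illustration only
temps = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
production = [50.0, 54.0, 61.0, 66.0, 72.0, 75.0]
r = pearson_r(temps, production)
print(round(r, 3))
```

A value close to 1 means the points sit near a rising straight line, a value close to -1 means a falling one, and a value around 0 means there is no linear relationship to speak of.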

Causation and other traps


Conclusion

About the Author:

AI, Software Developer, Designer: www.k3no.com
