I made this quick and dirty dashboard out of some SAMHSA datasets on drug use.
(For you data tech stack nerds out there: I mainly used Tableau, as you can probably tell, and Excel, with a bit of SQL and Python thrown in. I probably could have just used Tableau and Excel, but I like making things harder for myself.)
There is a veritable goldmine of data on the SAMHSA website for those of us data nerds who are also interested in what ails society.
I have a lot of questions here, and I have already started some data analysis to answer them. Results to be posted “shortly.” (Schedule permitting.) There are some datasets that don’t include marijuana, which I think is appropriate for a lot of reasons.
For example, if I am not mistaken, all 10 of the states with the highest drug use have legal recreational cannabis. However, because SAMHSA is a federal entity, and marijuana is still federally illegal, marijuana is counted in the “all illegal drugs” datasets.
So I think it’s safe to venture a guess that rates of state-legal cannabis use are “tainting” this particular dataset. It doesn’t take a complex machine learning algorithm to figure that out.
Vermont being #1 is an immediate red (or green?) flag for that.
While Vermont does have a severe opioid problem (what state doesn’t?), the Green Mountain State’s most famous company is Ben & Jerry’s, people. That counts for some kind of data point.
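Here is a minimal sketch of how that hunch could be checked in Python: rank the states once on the “all illegal drugs” rate and once on a rate that leaves marijuana out, then see who moves. Everything in it is an assumption for illustration. The state labels and percentages are made-up placeholders, not SAMHSA figures, and `rank_states` is just a hypothetical helper name.

```python
# Placeholder numbers only: these are NOT real SAMHSA figures.
# Each entry maps a state label to a pair of rates (percent):
#   (any illicit drug use, illicit drug use excluding marijuana)
rates = {
    "State A": (22.0, 3.1),
    "State B": (18.5, 3.9),
    "State C": (15.0, 4.2),
    "State D": (12.5, 3.5),
    "State E": (10.0, 2.8),
}


def rank_states(rates, column):
    """Return state labels sorted from highest to lowest rate in one column."""
    return sorted(rates, key=lambda state: rates[state][column], reverse=True)


rank_with_marijuana = rank_states(rates, 0)
rank_without_marijuana = rank_states(rates, 1)

# Positive shift: the state climbs the ranking once marijuana is excluded.
# Negative shift: the state drops, hinting that cannabis drove its position.
rank_shift = {
    state: rank_with_marijuana.index(state) - rank_without_marijuana.index(state)
    for state in rates
}

for state in rank_with_marijuana:
    print(f"{state}: shift of {rank_shift[state]:+d} places without marijuana")
```

If a state sitting at #1 tumbles several places once marijuana is excluded, that would support the idea that legal cannabis is what put it at the top. With the real datasets, the placeholder dictionary would be swapped for the actual state-level rates.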
I’m going to put up some other analyses that cross-reference the other datasets, and some others containing comparisons of age groups. I even have some correlations (!) done.
Like a real data scientist!