ax <- list(
zeroline = FALSE,
showline = FALSE,
showticklabels = FALSE,
showgrid = FALSE
)
final_dataset %>%
group_by(disease, funding, disease_type) %>%
summarize(avg_funding = mean(funding, na.rm = TRUE)) %>%
plot_ly(x = ~ disease, y = ~ funding, color = ~disease_type, type = "scatter") %>%
layout(title = "Average NIH Funding (millions USD) For Chronic vs Acute Diseases", xaxis = ax)
#Pricing by disease status (boxplot)
final_dataset %>%
filter(!is.na(disease) & !is.na(avg_price)) %>%
mutate(disease_type = fct_reorder(disease_type, avg_price)) %>%
group_by(disease_type) %>%
plot_ly(y = ~avg_price, type = "box", color = ~disease_type, colors = "Set3")
acute = final_dataset %>%
filter(!is.na(disease) & !is.na(avg_price)) %>%
filter(disease_type == "acute")
chronic = final_dataset %>%
filter(!is.na(disease) & !is.na(avg_price)) %>%
filter(disease_type == "chronic")
wilcox_results_price = wilcox.test(acute$avg_price, chronic$avg_price, alternative = "two.sided", mu = 0, paired = FALSE)
t_test_results = t.test(acute$avg_price, chronic$avg_price, var.equal = TRUE, paired = FALSE)
The Wilcoxon Rank-Sum Test comparing median average price of the disease types rejects the null hypothesis that the median average prices of drugs that treat acute and chronic diseases are equal. Therefore we conclude that at the 0.05 significance level, the median average price of acute funding is statistically different from the median funding of chronic diseases.
For curiosity’s sake, let’s say we assumed that the two samples are normally distributed, independent and have equal variances. Therefore, we would conduct the two sample independent t-test. It’s important to note that the validity of this test may be compromised because the funding variable violates the normality assumption. This test’s conclusion aligns with the Wilcoxon Rank-Sum Test.
In conclusion, we can conclude that the mean/median average price of drugs that treat acute diseases is statistically different from the mean/median average price of drugs that treat chronic diseases.
Drugs that treat acute conditions have a mean “average price” of $173.43 per dose and chronic diseases have a mean “average price” of $125.39 per dose.
Based on the boxplots below, we see that acute conditions have a wider interquartile range compared to chronic conditions: there is greater variability in acute condition funding compared to chronic. 25% of acute conditions have a funding of $18 million or less. 25% of acute conditions receive funding of more than $313 million or more. The median funding for acute conditions is $50 million per year. Acute funding is right skewed.
Comparatively, chronic conditions have a lower variability in funding, but have more outliers. 25% of chronic conditions have a funding of $224 million or less. 25% of chronic conditions receive funding of more than $272.5 million or more. The median funding for chronic conditions is $253 million per year. Chronic funding has an approximately normal distribution in funding.
#acute vs. chronic (boxplot)
#Funding by disease status
final_dataset %>%
filter(!is.na(disease) & !is.na(funding)) %>%
mutate(disease_type = fct_reorder(disease_type, funding)) %>%
group_by(disease_type) %>%
plot_ly(y = ~funding, type = "box", color = ~disease_type, colors = "Set3") %>%
layout(title = "Distribtuion NIH Funding (millions USD) for Acute vs Chronic Diseases")
The Wilcoxon Rank-Sum Test comparing median funding of the disease types rejects the null hypothesis that the median fundings of acute and chronic diseases are equal. Therefore we conclude that at the 0.05 significance level, the median funding of acute funding is statistically different from the median funding of chronic diseases.
The two sample independent t-testconclusion aligns with the Wilcoxon Rank-Sum Test.
In conclusion, we can conclude that the mean/median of chronic funding of $323.68 million/year is statistically different from the mean/median of acute funding of $184.51 million/year.