PACKAGE PRICING AT MISSION HOSPITAL CASE SOLUTION
SECTION-A Isha Walia – 025 Ishu Bhardwaj – 026 Karan Kakkar - 030 Vinit Durshetti - 061 Gaurav Kushwaha – 371
Checking the distribution of the response variable (total cost to hospital)
The distribution of log(TOTAL.COST.TO.HOSPITAL) is closer to normal than that of the raw cost, so the response variable “total cost to hospital” is transformed to log(TOTAL.COST.TO.HOSPITAL). The natural log is used throughout.
Exploratory analysis of log(TOTAL.COST.TO.HOSPITAL) with respect to the predictors AGE, GENDER and MARITAL STATUS
log(TOTAL.COST.TO.HOSPITAL) appears to increase with AGE, and it is higher for male patients and for married patients.
Fitting Linear Regression Model
1. Develop a suitable simple linear regression model to check if there is any relationship between “Total Cost to Hospital” and “AGE”. For the fitted model, interpret the regression coefficient corresponding to “AGE”.
A1. Fitting a linear regression model for log(TOTAL.COST.TO.HOSPITAL) with AGE as the predictor gives:
Residuals:
     Min       1Q   Median       3Q      Max
-1.51748 -0.24402 -0.00536  0.25388  1.39912
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.814724   0.043326 272.693  < 2e-16 ***
AGE          0.008565   0.001118   7.662 4.21e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.455 on 246 degrees of freedom
Multiple R-squared: 0.1927, Adjusted R-squared: 0.1894
F-statistic: 58.7 on 1 and 246 DF, p-value: 4.212e-13
The AGE coefficient is positive and highly significant, although AGE alone explains only about 19% of the variation in log cost. Because the response is the natural log of cost, each additional year of age multiplies the expected cost by exp(0.008565) ≈ 1.0086, that is, the cost to the hospital increases by about 0.86% of the previous cost. Thus the increase in cost is geometric.
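The percentage reading of a log-linear slope can be checked with a few lines of arithmetic. The sketch below uses Python only for the arithmetic (the original analysis was run in R), takes the AGE estimate from the output above, and assumes the response is the natural log of cost. The function name `pct_change_per_unit` is illustrative and not part of the original analysis.

```python
import math

AGE_COEF = 0.008565  # AGE estimate from the regression output above

def pct_change_per_unit(coef, units=1.0):
    """Percent change in cost when the predictor rises by `units`,
    for a model fitted on the natural log of cost."""
    return (math.exp(coef * units) - 1.0) * 100.0

one_year = pct_change_per_unit(AGE_COEF)       # effect of one extra year
ten_years = pct_change_per_unit(AGE_COEF, 10)  # ten years compound geometrically

print(f"1 year: {one_year:.2f}%  10 years: {ten_years:.2f}%")
```

The ten-year figure is slightly more than ten times the one-year figure, which is the geometric growth described above.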
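2. At the time of admission, suppose a patient’s age is 50 years. Based on the fitted model in (1), what will be the minimum cost of treatment for this patient at 95% confidence level?
A2. Predicting at AGE = 50 with a 95% interval gives, on the log scale:
       fit      lwr      upr
1 12.24298 11.34373 13.14223
The limits lie about 0.90 either side of the fit. That is the width of a prediction interval for an individual patient (roughly 1.97 times the residual standard error of 0.455) rather than of a confidence interval for the mean.
On back transformation (exponentiating each value):
       fit      lwr      upr
1 207519.8 84434.41 510034.3
So, at the 95% level, the minimum cost to the hospital for a patient of age 50 is about INR 84,434.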
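The back transformation can be reproduced from the reported numbers alone. This is a minimal Python sketch, not the original R code: it rebuilds the log-scale fit from the model (1) coefficients and exponentiates the reported interval limits, assuming a natural-log response.

```python
import math

INTERCEPT, AGE_COEF = 11.814724, 0.008565  # estimates from model (1)
LOG_LWR, LOG_UPR = 11.34373, 13.14223      # reported 95% limits on the log scale

age = 50
log_fit = INTERCEPT + AGE_COEF * age       # predicted log cost at age 50

# Exponentiating returns each value to the INR scale
fit, lwr, upr = (math.exp(v) for v in (log_fit, LOG_LWR, LOG_UPR))

print(f"log fit = {log_fit:.5f}")
print(f"fit = {fit:,.0f}  lwr = {lwr:,.0f}  upr = {upr:,.0f}")
```

Because exp() is monotonic, the interval limits carry over directly, but the back-transformed fit is the median of the cost distribution rather than its mean.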
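3. Suppose Mission Hospital is planning to introduce a package price for the treatment and has decided to charge INR 250,000 for patients of age 50 years. What is the probability that the treatment cost will exceed the package price? Do you think that the Mission Hospital should revise the package price?
A3. Under the fitted model, log cost for a 50-year-old patient is approximately normal with mean 12.24298 and standard deviation 0.455 (the residual standard error), and log(250,000) = 12.42922. The value 0.6588341 is the probability that log cost falls below 12.42922, that is, P(cost ≤ 250,000), because the threshold lies above the predicted mean. Hence
P(cost > 250,000) = 1 - 0.6588341 = 0.3411659
That is, there is about a 34% probability that the cost will be more than INR 250,000. The price at which the probability of exceeding is 50% and the probability of the cost being less than the price is 50% is the back-transformed fit, about INR 207,520. At INR 250,000 the package is priced above that break-even point, so on this criterion the hospital does not need to raise the price, although it would still bear extra cost in roughly one case in three.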
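The tail probability can be recomputed from the reported fit and residual standard error. The Python sketch below assumes log cost at age 50 is normal with mean 12.24298 and standard deviation 0.455, and it ignores the small extra uncertainty in the fitted line. The helper `normal_cdf` is written out with `math.erf` so that nothing outside the standard library is needed. Both tails are printed so that the direction of the inequality is explicit.

```python
import math

LOG_FIT = 12.24298   # predicted log cost at age 50 (from A2)
RESID_SE = 0.455     # residual standard error of model (1)
PRICE = 250_000      # proposed package price in INR

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

p_below = normal_cdf(math.log(PRICE), LOG_FIT, RESID_SE)  # cost <= price
p_exceed = 1.0 - p_below                                  # cost > price
break_even = math.exp(LOG_FIT)                            # price with 50% exceedance

print(f"P(cost <= {PRICE:,}) = {p_below:.4f}")
print(f"P(cost >  {PRICE:,}) = {p_exceed:.4f}")
print(f"Price with a 50% chance of being exceeded: {break_even:,.0f}")
```

Any price above the break-even value has an exceedance probability below 50%, and any price below it has one above 50%.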
4. Build a simple linear regression model between “Total Cost to Hospital” and “GENDER”. Interpret the results.
A4. Fitting log(TOTAL.COST.TO.HOSPITAL) on GENDER (baseline: female) gives:
Residuals:
     Min       1Q   Median       3Q      Max
-1.31142 -0.28273 -0.08258  0.26109  1.57082
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.93436    0.05503 216.865  < 2e-16 ***
GENDERM      0.19082    0.06726   2.837  0.00493 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4983 on 246 degrees of freedom
Multiple R-squared: 0.03168, Adjusted R-squared: 0.02774
F-statistic: 8.048 on 1 and 246 DF, p-value: 0.004934
GENDER is significant at the 1% level but explains only about 3% of the variation in log cost. Since exp(0.19082) ≈ 1.21, the model suggests that the cost to the hospital is about 21% higher for a male patient than for a female patient.
5. Build a simple linear regression model between “Total Cost to Hospital” and “MARITAL STATUS”. Interpret the results.
A5. Fitting log(TOTAL.COST.TO.HOSPITAL) on MARITAL STATUS (baseline: married) gives:
Residuals:
    Min      1Q  Median      3Q     Max
-1.3608 -0.2360 -0.0334  0.2396  1.4042
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)             12.29182    0.04466 275.229   <2e-16 ***
MARITAL.STATUSUNMARRIED -0.40697    0.05944  -6.847    6e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4641 on 246 degrees of freedom
Multiple R-squared: 0.1601, Adjusted R-squared: 0.1566
F-statistic: 46.88 on 1 and 246 DF, p-value: 5.998e-11
MARITAL STATUS is highly significant and explains about 16% of the variation in log cost. Since exp(0.40697) ≈ 1.50, we can infer that married patients cost the hospital about 1.5 times as much as unmarried patients (equivalently, unmarried patients cost about 33% less).
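For a dummy predictor in a log-cost model, the coefficient converts to a cost ratio between the two groups by exponentiation. The Python sketch below applies this to the GENDERM and MARITAL.STATUSUNMARRIED estimates reported in (4) and (5), assuming a natural-log response. The variable names are illustrative.

```python
import math

GENDERM_COEF = 0.19082      # male relative to female (baseline), model (4)
UNMARRIED_COEF = -0.40697   # unmarried relative to married (baseline), model (5)

male_vs_female = math.exp(GENDERM_COEF)            # cost ratio, male / female
unmarried_vs_married = math.exp(UNMARRIED_COEF)    # cost ratio, unmarried / married
married_vs_unmarried = math.exp(-UNMARRIED_COEF)   # same effect, baseline reversed

print(f"male / female:       {male_vs_female:.3f}")
print(f"unmarried / married: {unmarried_vs_married:.3f}")
print(f"married / unmarried: {married_vs_unmarried:.3f}")
```

Reversing the baseline flips the sign of the coefficient, so the two marital-status ratios are reciprocals of each other.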
6. Build a multiple linear regression model with “Total Cost to Hospital” as dependent variable, and “AGE”, “GENDER” and “MARITAL STATUS” as predictors. Compare the results with that of (4) and (5).
A6. Fitting log(TOTAL.COST.TO.HOSPITAL) on AGE, GENDER and MARITAL STATUS gives:
Residuals:
    Min      1Q  Median      3Q     Max
-1.5285 -0.2603 -0.0104  0.2470  1.3529
Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)             11.790187   0.151136  78.011  < 2e-16 ***
AGE                      0.007637   0.002555   2.989  0.00308 **
GENDERM                  0.104211   0.062490   1.668  0.09667 .
MARITAL.STATUSUNMARRIED -0.032630   0.132570  -0.246  0.80578
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4543 on 244 degrees of freedom
Multiple R-squared: 0.2019, Adjusted R-squared: 0.1921
F-statistic: 20.58 on 3 and 244 DF, p-value: 6.394e-12
On running the model with age, gender and marital status, we find that the coefficient of AGE has changed little (0.008565 to 0.007637), but the coefficients of GENDER and MARITAL STATUS have changed substantially. The GENDERM coefficient falls from 0.19082 in (4) to 0.104211, and the MARITAL.STATUSUNMARRIED coefficient shrinks from -0.40697 in (5) to -0.032630. Neither is significant at the 5% level once AGE is in the model.
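Mean cost to hospital (INR)
          Married     Unmarried
Male      2,62,198    1,62,737
Female    2,15,579    1,51,930
The table shows that married patients cost more than unmarried patients for both genders, but the gap is larger for males (INR 99,461) than for females (INR 63,649). Married males therefore cost the hospital more than separate gender and marital-status effects would predict. This suggests an interaction effect between gender and marital status, leading to the change in coefficients.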
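The interaction argument can be made concrete with a difference-in-differences on the four cell means. The Python sketch below uses only the means from the table above. It is a descriptive check on raw averages, not a formal test of an interaction term, and the helper names are illustrative.

```python
# Mean cost to hospital (INR), keyed by (gender, marital status)
mean_cost = {
    ("Male", "Married"): 262_198,
    ("Male", "Unmarried"): 162_737,
    ("Female", "Married"): 215_579,
    ("Female", "Unmarried"): 151_930,
}

def marriage_gap(gender):
    """Married minus unmarried mean cost within one gender."""
    return mean_cost[(gender, "Married")] - mean_cost[(gender, "Unmarried")]

def marriage_ratio(gender):
    """Married over unmarried mean cost within one gender."""
    return mean_cost[(gender, "Married")] / mean_cost[(gender, "Unmarried")]

gap_male, gap_female = marriage_gap("Male"), marriage_gap("Female")
diff_in_diff = gap_male - gap_female  # zero if the two effects were purely additive

print(f"gap (male) = {gap_male:,}  gap (female) = {gap_female:,}")
print(f"difference in differences = {diff_in_diff:,}")
print(f"ratio (male) = {marriage_ratio('Male'):.3f}  "
      f"ratio (female) = {marriage_ratio('Female'):.3f}")
```

The ratio is also larger for males than for females, so the pattern holds on the multiplicative scale that a log-cost model works on, not only in rupee differences.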
7. Build a multiple linear regression model with appropriate set of predictors. Identify the statistically significant predictors that the Mission Hospital can use in predicting “Total Cost to Hospital”. Comment on the performance of the fitted model. How does the fitted model help Mission Hospital to take managerial decisions?
A7.
Residuals:
     Min       1Q   Median       3Q      Max
-0.96533 -0.18093 -0.01659  0.19462  1.19165
Coefficients: (1 not defined because of singularities)
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)             10.3784828  0.4924860  21.074  < 2e-16 ***
AGE                      0.0085850  0.0030825   2.785 0.006015 **
GENDERM                  0.0410937  0.0716926   0.573 0.567339
MARITAL.STATUSUNMARRIED  0.0964430  0.1444585   0.668 0.505364
ACHD                     0.0606913  0.1454933   0.417 0.677148
CAD.DVD                  0.4675391  0.1300201   3.596 0.000433 ***
CAD.SVD                  0.3492459  0.3141862   1.112 0.268025
CAD.TVD                  0.3441462  0.1408546   2.443 0.015670 *
CAD.VSD                  0.3220618  0.4186867   0.769 0.442926
OS.ASD                   0.2303903  0.1517427   1.518 0.130964
other..heart             0.2947377  0.1152326   2.558 0.011488 *
other..respiratory       0.0736222  0.2061631   0.357 0.721494
other.general           -1.6289222  0.4634972  -3.514 0.000577 ***
other.nervous            0.6509382  0.4193210   1.552 0.122602
other.tertalogy          0.3684828  0.1693884   2.175 0.031108 *
PM.VSD                   0.2809374  0.2406915   1.167 0.244907
RHD                      0.5645466  0.1333216   4.234  3.9e-05 ***
BODY.WEIGHT              0.0022855  0.0037020   0.617 0.537890
BODY.HEIGHT              0.0005591  0.0016910   0.331 0.741381
HR.PULSE                 0.0050994  0.0019315   2.640 0.009129 **
BP..HIGH                -0.0021987  0.0023049  -0.954 0.341603
BP.LOW                  -0.0005388  0.0032198  -0.167 0.867311
RR                       0.0173013  0.0090719   1.907 0.058343 .
Diabetes1               -0.0931856  0.1643344  -0.567 0.571496
Diabetes2                0.2090071  0.1756235   1.190 0.235820
hypertension1           -0.0623585  0.1217057  -0.512 0.609116
hypertension2           -0.2203463  0.1496889  -1.472 0.143028
hypertension3            0.1137384  0.1999772   0.569 0.570339
other                   -0.0703775  0.1239298  -0.568 0.570932
HB                       0.0027892  0.0118002   0.236 0.813456
UREA                     0.0008210  0.0026521   0.310 0.757307
CREATININE               0.2667857  0.1271125   2.099 0.037444 *
AMBULANCE                0.1048268  0.3199244   0.328 0.743607
TRANSFERRED             -0.2662347  0.2261663  -1.177 0.240923
ALERT                           NA         NA      NA       NA
ELECTIVE                 0.0878894  0.3115261   0.282 0.778221
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3965 on 156 degrees of freedom
(57 observations deleted due to missingness)
Multiple R-squared: 0.5307, Adjusted R-squared: 0.4285
F-statistic: 5.19 on 34 and 156 DF, p-value: 5.174e-13
The predictors significant at the 5% level are AGE, several complaint codes (CAD.DVD, CAD.TVD, other..heart, other.general, other.tertalogy and RHD), the pulse of the patient at the time of admission (HR.PULSE) and the creatinine level of the patient (CREATININE). Respiratory rate (RR) is significant only at the 10% level, and ALERT could not be estimated because of singularity. The R-squared of the fitted model is only 53% while the adjusted R-squared is 42.8%. So although the model as a whole is significant (F-test p-value 5.174e-13), it is not a very good predictor of the cost to the hospital. Using the above model, the managers could identify the factors that lead to higher costs and hence build packages priced so that the actual cost is unlikely to exceed the package price.
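The gap between R-squared and adjusted R-squared follows directly from the number of predictors relative to the sample size. The Python sketch below recomputes the adjusted R-squared and the F-statistic from the reported R-squared and degrees of freedom, using the standard formulas for a linear model with an intercept. Inputs are rounded as printed in the output, so the results agree only to about three significant figures.

```python
R2 = 0.5307      # Multiple R-squared of the full model
P = 34           # estimated predictors (numerator DF of the F-statistic)
RESID_DF = 156   # residual degrees of freedom

n = RESID_DF + P + 1  # observations actually used after dropping missing rows

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 = 1.0 - (1.0 - R2) * (n - 1) / RESID_DF

# Overall F-statistic: explained variance per predictor over residual variance
f_stat = (R2 / P) / ((1.0 - R2) / RESID_DF)

print(f"n = {n}")
print(f"adjusted R-squared = {adj_r2:.4f}")
print(f"F-statistic = {f_stat:.2f}")
```

With 34 predictors and 191 usable observations, roughly ten percentage points of the raw R-squared are attributable to model size, which is why the adjusted figure is the better guide to predictive performance here.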
APPENDIX: R CODE