"P(X | \\hat{f}_{\\beta}) = \\prod_{\\alpha = 1}^{n} P(X_{\\alpha}|\\hat{f}_{\\beta}(X)), \\alpha = 1,\\ldots,n\n",
The notebook uses the P(X | ...) notation, which I would interpret as the conditional probability of the data. However, linear models are typically used for continuous response data, where P(X_i = x | ...) is zero. Instead, one would use densities, i.e. a lowercase p or f.
Furthermore, the use of a product implies that the observations are independent of each other. Hence the statement a little further down:
OLS: - assumes that the errors have a mean of zero, constant variance and are independent of eachother (no correlation in error).
is incomplete, because the same independence was assumed for the ML approach.
Altogether, I find the post a little confusing. As far as I know, for a Gaussian response distribution with known $\sigma$, the OLS and ML estimates should be identical. I could not fully work out the exact data-generating mechanism in the example because there is a lot of code, but for a simple normal sample $X_1, \ldots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$ explicit solutions should be available. As a suggestion: maybe state the data-generating mechanism more clearly in math notation.
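To illustrate the point for the simple normal case, here is a minimal sketch. It is not the notebook's code: the function names, seed, sample size and parameter values are my own assumptions. It compares the least-squares estimate of $\mu$ (the sample mean) with a numerical minimizer of the Gaussian negative log-likelihood for known $\sigma$.

```python
import math
import random
import statistics


def neg_log_likelihood(mu, xs, sigma):
    """Negative Gaussian log-likelihood of an iid sample, sigma treated as known."""
    n = len(xs)
    const = n * math.log(sigma * math.sqrt(2.0 * math.pi))
    return const + sum((x - mu) ** 2 for x in xs) / (2.0 * sigma ** 2)


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


# Hypothetical data-generating mechanism: X_1, ..., X_n iid N(mu, sigma^2).
random.seed(0)
mu_true, sigma = 1.5, 2.0
xs = [random.gauss(mu_true, sigma) for _ in range(200)]

# Least squares: the minimizer of sum((x - mu)^2) is the sample mean.
mu_ls = statistics.fmean(xs)

# Maximum likelihood: minimize the negative log-likelihood numerically.
mu_ml = golden_section_minimize(
    lambda m: neg_log_likelihood(m, xs, sigma), min(xs), max(xs)
)

print("least squares:", mu_ls)
print("max likelihood:", mu_ml)
print("absolute difference:", abs(mu_ls - mu_ml))
```

The two estimates should agree up to the tolerance of the numerical search, because the negative log-likelihood differs from the sum of squared residuals only by an additive constant and the positive factor $1/(2\sigma^2)$.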
DataScienceInteractivePython/Interactive_Model_Fitting.ipynb
Line 56 in adf0515