Optimizing the Truncated Regression Model

### Improving the truncated regression model (abbreviated as TRM) by fixing boundary violations

This research has a single goal: to fix boundary violations that arise when the truncated regression model is applied. For instance, a party's vote share is naturally confined to the range from 0% to 100%. If the model yields a predicted value below 0% or above 100%, the result is a boundary violation, which must never occur. Applying constrained optimization techniques from nonlinear programming, I develop a revised TRM that solves the boundary violation problem.

### Don't confuse truncated regression with censored regression

Truncated regression differs from censored regression in two respects. First, according to Takeshi Amemiya (1984: 3):

"Tobit models refer to regression models in which the range of the dependent variable is constrained in some way. In economics, such a model was first suggested in a pioneering work by Tobin (1958). He analyzed household expenditure on durable goods using a regression model which specifically took account of the fact that the expenditure (the dependent variable of his regression model) cannot be negative. Tobin called his model the model of *limited dependent variables*. It and its various generalizations are known popularly among economists as *Tobit models*, a phrase coined by Goldberger (1964) because of similarities to *probit models*. These models are also known as *censored* or *truncated* regression models. The model is called *truncated* if the observations outside a specified range are totally lost and *censored* if one can at least observe the exogenous variables."

In other words, under the truncated regression model we discard both the DV and the IV values for any observation whose DV value is inadmissible. Under the censored regression model, by contrast, we keep these observations and replace each inadmissible value with the nearest censoring point. Therefore, *the two models assume different data-generating mechanisms.*
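The distinction between the two mechanisms can be made concrete with a short sketch (the vote-share values are hypothetical, and NumPy is used purely for illustration):

```python
import numpy as np

# Hypothetical vote shares; two observations fall outside [0, 100].
y = np.array([-5.0, 12.0, 47.0, 88.0, 103.0])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
a, b = 0.0, 100.0

# Truncation: discard both the DV and the IV for inadmissible rows.
keep = (y >= a) & (y <= b)
y_trunc, x_trunc = y[keep], x[keep]

# Censoring: keep every row, but clamp the DV to the nearest limit.
y_cens = np.clip(y, a, b)
```

Truncation leaves three observations; censoring keeps all five, with the out-of-range values moved to 0 and 100.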

Second, because the two models have different data-generating mechanisms, they also have different
likelihood functions. Let the dependent variable have a lower limit $a$ and an upper limit $b$. The likelihood
function for truncated regression is

\begin{equation*}

L_{TRM}=\prod\limits_{i=1}^{n}{\left\{ \frac{\exp \left( \frac{-{{\left( {{y}_{i}}-\boldsymbol{{{x}_{i}}\beta}
\right)}^{2}}}{2{{\sigma }^{2}}} \right)}{\int_{a}^{b}{\exp \left( \frac{-{{\left( y-\boldsymbol{{{x}_{i}}\beta}
\right)}^{2}}}{2{{\sigma }^{2}}} \right)dy}} \right\}}.

\end{equation*}
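This likelihood can be evaluated numerically; the normalizing constants of the normal kernel cancel between numerator and denominator, so each term equals the truncated-normal density. A minimal sketch (the function name and the use of SciPy are my own choices, not part of the original model):

```python
import numpy as np
from scipy.stats import norm

def truncated_loglik(beta, sigma, y, X, a, b):
    """Log-likelihood of a linear model with a DV truncated to [a, b].

    Each term is the normal density of y_i divided by the probability
    the untruncated normal assigns to [a, b] -- equivalent to the
    kernel ratio in the equation above, since the constants cancel.
    """
    mu = X @ beta
    log_num = norm.logpdf(y, loc=mu, scale=sigma)
    log_den = np.log(norm.cdf(b, loc=mu, scale=sigma)
                     - norm.cdf(a, loc=mu, scale=sigma))
    return np.sum(log_num - log_den)
```

Maximizing this function over $\beta$ and $\sigma$ gives the truncated-regression estimates.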

On the other hand, the likelihood function for censored regression is

\begin{equation*}

L_{CRM}=\prod\limits_{i=1}^{n}\Phi {{\left( \frac{a-\boldsymbol{{{x}_{i}}\beta} }{\sigma}\right)}^{{{d}_{l}}}}
{{\left(1-\Phi \left( \frac{b-\boldsymbol{{{x}_{i}}\beta} }{\sigma } \right) \right)}^{{{d}_{u}}}}
{{\left( \frac{1}{\sigma }\phi \left( \frac{y_{i}-\boldsymbol{{{x}_{i}}\beta} }{\sigma }
\right) \right)}^{1-{{d}_{l}}-{{d}_{u}}}},

\end{equation*}

where the indicator variables are defined as

\begin{equation*}
{d}_{l}=
\begin{cases}
1 &\text{if $y_{i}\le a$}\\
0 &\text{otherwise,}
\end{cases}
\qquad {d}_{u}=
\begin{cases}
1 &\text{if $y_{i}\ge b$}\\
0 &\text{otherwise.}
\end{cases}
\end{equation*}
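The censored likelihood above, a two-limit Tobit, can be sketched in the same way (again, the function name and the SciPy calls are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

def censored_loglik(beta, sigma, y, X, a, b):
    """Two-limit Tobit (censored regression) log-likelihood.

    Observations at the lower limit contribute Phi((a - x_i beta)/sigma),
    those at the upper limit contribute 1 - Phi((b - x_i beta)/sigma),
    and interior observations contribute the usual normal density.
    """
    mu = X @ beta
    d_l = y <= a                      # censored at the lower limit
    d_u = y >= b                      # censored at the upper limit
    mid = ~(d_l | d_u)                # uncensored, interior values
    return (np.sum(norm.logcdf((a - mu[d_l]) / sigma))
            + np.sum(norm.logsf((b - mu[d_u]) / sigma))   # log(1 - Phi)
            + np.sum(norm.logpdf(y[mid], loc=mu[mid], scale=sigma)))
```

Comparing the two functions makes the difference in data-generating assumptions explicit: the truncated likelihood renormalizes the density over $[a, b]$, while the censored likelihood assigns probability mass to the limits themselves.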

Clearly, truncated regression and censored regression are two different models.

### Don't confuse truncated regression with regression on a latent variable

Some might argue that we can simply assume a latent untruncated normal distribution
and add a link function to transform all predicted DV values into admissible values, so that
the truncated regression model becomes unnecessary. There is nothing wrong with this argument, but it simply
assumes another data-generating mechanism and therefore adopts a different likelihood function.
What it proposes is thus beyond the scope of my research, because my work intends to
improve the current truncated regression model without assuming an extra data-generating mechanism such
as censoring or selection bias. We should not compare apples and oranges. ^{i}
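For concreteness, the latent-variable alternative described above might use a logistic link to squash an unbounded linear predictor into the admissible range. The sketch below is a hypothetical illustration of that alternative data-generating assumption, not of the truncated regression model:

```python
import numpy as np

def inverse_logit_predict(beta, X, lower=0.0, upper=100.0):
    """Map an unbounded linear predictor into (lower, upper).

    This implements the latent-variable strategy discussed above --
    a logistic link on a latent normal regression -- which rests on a
    different data-generating assumption than truncated regression.
    """
    eta = X @ beta                    # latent, unbounded prediction
    p = 1.0 / (1.0 + np.exp(-eta))    # logistic squash into (0, 1)
    return lower + (upper - lower) * p
```

Every prediction produced this way lies strictly inside the bounds, but the model being estimated is no longer the truncated regression model.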

### Don't confuse truncated regression with regression under a selection mechanism

The Heckman model (Heckman, 1979) has been popular in recent years in the political science literature. However, it is designed to handle the selection bias problem that arises when $y$ (DV) and $x$ (IV) are selected based on the value of $z$ (a selection variable), where $z$ is subject to a censoring mechanism. Specifically, the Heckman model assumes that the error terms of the selection and outcome regressions follow a bivariate normal distribution with a correlation coefficient (Sigelman and Zeng, 1999, 177). The empirical truncation of the outcome dependent variable depends on the selection mechanism specified in the Heckman model. Clearly, the censored and sample-selected models do not directly assume a univariate truncated-normal dependent variable, and both add an additional assumption to the data-generating process that might not be true. Therefore, truncated regression and the Heckman model are completely different.

^{i} In contemporary statistical science, likelihood theory is a crucial paradigm of inference
for data analysis (Royall, 1997, xiii). It provides a unifying approach to statistical modeling for both
frequentists and Bayesians through the criterion of maximum likelihood (Azzalini, 1996). The rapid development
of political methodology in the last two decades has also witnessed the establishment of the likelihood paradigm
in the scientific study of politics (King, 1998). As a model of inference, the fundamental assumption
of likelihood theory is the likelihood principle, which states that "all evidence, which is obtained from an
experiment, about an unknown quantity $\theta$, is contained in the likelihood function of $\theta$ for the given
function." (Berger and Wolpert, 1984, vii) In other words, because the likelihood function is defined by the
probability density (or mass) function, we must make a distributional assumption about the dependent variable
to derive a likelihood function. The plausibility of such a distributional assumption is therefore vital to the
validity of statistical inference, and the assumption about the data-generating mechanism largely determines
which distributional assumption should be applied.
(This footnote originally appears in another article of mine, "Solving Problems in the Panel Regression Model for
Truncated Dependent Variables: A Constrained Optimization Method.")