Combined Parameter-State Estimation Algorithms

In many cases, we do not know the model of the signal exactly; the model may have unknown parameters that cannot be determined a priori.  They can be incorporated into the filtering problem by increasing the dimension of the signal, and this can be an effective strategy when used with REST (and its automated refining grid property).  However, in conjunction with other methods, it is cumbersome and slow.  In the sequel, we refer to this technique as parameter inclusion filtering.
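As a rough illustration of parameter inclusion (not the REST algorithm itself), a bootstrap particle filter can carry an unknown parameter as an extra, static state component.  The model below — a one-dimensional mean-reverting signal with unknown rate theta, observed in additive Gaussian noise — and all numerical choices (jitter size, prior range) are our own illustrative assumptions, not taken from the methods discussed here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D model: X_{k+1} = X_k + theta * (-X_k) * dt + sigma * dW,
# observed as Y_k = X_k + measurement noise.  theta is unknown.
dt, sigma, obs_std = 0.1, 0.3, 0.2
theta_true = 1.5

def step(x, theta):
    return x + theta * (-x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)

# Simulate a path and noisy observations.
T = 200
x_true = np.zeros(T)
x_true[0] = 1.0
for k in range(1, T):
    x_true[k] = step(x_true[k - 1:k], theta_true)[0]
y = x_true + obs_std * rng.standard_normal(T)

# Augmented particles: each carries (state, theta); theta has no dynamics.
N = 2000
px = rng.normal(1.0, 0.5, N)
ptheta = rng.uniform(0.0, 3.0, N)            # prior over the unknown parameter

for k in range(1, T):
    px = step(px, ptheta)                    # propagate state; theta is static
    w = np.exp(-0.5 * ((y[k] - px) / obs_std) ** 2)
    w /= w.sum()
    idx = rng.choice(N, N, p=w)              # multinomial resampling
    px, ptheta = px[idx], ptheta[idx]
    ptheta += 0.01 * rng.standard_normal(N)  # small jitter against degeneracy

theta_hat = ptheta.mean()
print(round(theta_hat, 2))
```

The jitter step hints at why this approach is cumbersome with particle methods: without it, resampling collapses the parameter cloud onto a few values, and with it, the parameter is no longer truly static.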

A known strategy for estimating parameters within filtering problems is to apply the Expectation-Maximization (EM) algorithm between observations to estimate the parameters, as in Dembo and Zeitouni (1986).  However, this relies on iterative convergence between observations with no fixed upper bound on the computations required, it is only known to apply to a specific class of models, and a general consistency result for the parameter estimates has not been proven.

Another interesting parameter estimation strategy involving the EM algorithm is the Hidden Markov Model (HMM) approach (see e.g. Elliott (2002)).  Here, in a very limited setting, it is possible to derive equations and an expanded filter that solve the combined maximum-likelihood parameter estimation and filtering problem, and the parameter estimates are proven to converge to at least a local maximum of the likelihood.  Neither of these methods on its own yields a satisfactory solution to our problems, but we are investigating the possibility of approximating the signal with a finite-dimensional Markov chain and applying the HMM approach.  Of course, this involves more approximations and must be done on a case-by-case basis to ensure that the desired parameters appear properly in the approximation.

The final known strategy that we are investigating is the Generalized Method of Moments (GMM) studied by Hansen (1982).  This approach is intriguing because the parameter estimates can be constructed without the filtering algorithm, so there is no cross dependence between the parameter estimation and the filtering, simplifying convergence proofs.  Still, this method is also not completely general (for example, we require invertibility of the sensor function h) and requires many problem-dependent calculations or the convergence of an iteration scheme between observations.  Hence, alternative strategies are still actively sought.
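To make the HMM/EM approach concrete, the sketch below runs a textbook Baum-Welch pass on a two-state HMM with Gaussian emissions — a generic stand-in, not the filter-based recursions of Elliott (2002).  The transition matrix is assumed known and only the emission means are re-estimated, purely to keep the example short; every numerical choice here is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-state HMM with Gaussian emissions; estimate the means by EM
# (Baum-Welch).  The transition matrix A is treated as known.
A = np.array([[0.9, 0.1], [0.2, 0.8]])
true_means, emit_std = np.array([-1.0, 1.0]), 0.5

# Simulate a state sequence and observations.
T = 400
s = np.zeros(T, dtype=int)
for k in range(1, T):
    s[k] = rng.choice(2, p=A[s[k - 1]])
y = true_means[s] + emit_std * rng.standard_normal(T)

mu = np.array([-0.2, 0.2])                   # crude initial guess
for _ in range(50):                          # EM iterations
    # E-step: forward-backward smoothing probabilities (normalized).
    b = np.exp(-0.5 * ((y[:, None] - mu) / emit_std) ** 2)
    alpha = np.zeros((T, 2))
    beta = np.ones((T, 2))
    alpha[0] = 0.5 * b[0]
    alpha[0] /= alpha[0].sum()
    for k in range(1, T):
        alpha[k] = (alpha[k - 1] @ A) * b[k]
        alpha[k] /= alpha[k].sum()
    for k in range(T - 2, -1, -1):
        beta[k] = A @ (b[k + 1] * beta[k + 1])
        beta[k] /= beta[k].sum()
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: re-estimate the emission means from the smoothed weights.
    mu = (gamma * y[:, None]).sum(axis=0) / gamma.sum(axis=0)

print(np.round(np.sort(mu), 2))
```

The iterative E/M loop between observation batches is exactly the computational burden noted above: there is no fixed bound on the iterations needed, and the estimates are only guaranteed to approach a local maximum of the likelihood.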

Hubert Chan, Michael Kouritzin, and Hongwei Long have developed a combined least-squares parameter estimation and filtering algorithm that works very well in practice for a variety of signals evolving over a compact space.  The signal has to be enlarged to include derivatives of the original signal, but this is almost expected by analogy to the HMM and GMM methods.  The filter for this enlarged signal contains enough information to produce proper least-squares estimates.  The algorithm is designed to minimize (in the unknown parameters) the weighted mean square error between the new observation and the predicted observation based on the model and the past observations.  Hence, our method can be thought of as an extension of L. Ljung's scheme to partially observable systems.

Currently, we allow parameters in both the drift and diffusion terms of the signal but assume that the observation noise is additive and independent of the signal.  The implementation therefore involves predicting the signal one step in advance, which is already calculated in filtering algorithms just prior to the Bayes' rule update.  However, to make the whole algorithm recursive, we must substitute the distribution calculated in terms of all prior parameter estimates in place of the desired filtering distribution, which would be calculated assuming that the current estimate is the true parameter value.  This substitution makes the mathematical proof of combined long-time convergence of the parameter-filter estimator pair extremely challenging.  Still, if the parameter estimates are close to the true value and not changing dramatically, and the filter forgets earlier estimates, then one can argue that the near-future estimates should remain close to the true value.  Hence, there is reason to be optimistic, but new mathematical methods definitely need to be employed to construct a proof.  We discuss our ongoing work below.
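The recursive structure — predict the next observation, update the parameter by a least-squares correction, then perform the Bayes' rule update with the current estimate substituted for the truth — can be caricatured as follows.  This is a crude prediction-error sketch in the spirit of Ljung's scheme, NOT the enlarged-signal algorithm described above; the model (a constant unknown drift theta observed in additive independent noise), the particle filter, and all gains are illustrative assumptions of ours:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical model: X_{k+1} = X_k + theta * dt + sigma * dW, observed as
# Y_k = X_k + additive noise independent of the signal.  theta is unknown.
dt, sigma, obs_std, theta_true = 0.1, 0.3, 0.2, 1.5

T = 300
x = np.cumsum(theta_true * dt + sigma * np.sqrt(dt) * rng.standard_normal(T))
y = x + obs_std * rng.standard_normal(T)

N = 1000
px = rng.normal(0.0, 0.5, N)              # particle cloud for the state
theta, R = 0.5, 1e-6                      # parameter guess, information term

for k in range(T):
    # One-step prediction of the next observation; for this model the
    # derivative of the predicted observation w.r.t. theta is simply dt.
    p = px + theta * dt + sigma * np.sqrt(dt) * rng.standard_normal(N)
    yhat, dyhat = p.mean(), dt
    err = y[k] - yhat
    # Recursive least-squares correction of the parameter estimate.
    R += dyhat * dyhat
    theta += (dyhat / R) * err
    # Bayes' rule update, substituting the current estimate for the truth.
    w = np.exp(-0.5 * ((y[k] - p) / obs_std) ** 2)
    w /= w.sum()
    px = p[rng.choice(N, N, p=w)]

print(round(theta, 2))
```

Note how the filter at step k is built from particles propagated under all the earlier, possibly wrong, parameter estimates — precisely the substitution that makes a joint convergence proof difficult.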