An improved unbiased particle filter

Ajay Jasra; Mohamed Maama; Hernando Ombao

doi:10.1515/mcma-2023-2024

Artikel

An improved unbiased particle filter

Ajay Jasra , Mohamed Maama und Hernando Ombao

Veröffentlicht/Copyright: 15. Dezember 2023

Veröffentlicht von

Veröffentlichen auch Sie bei De Gruyter Brill

Manuskript einreichen Informationen für Autor*innen Erkunden Sie dieses Fachgebiet

Aus der Zeitschrift Monte Carlo Methods and Applications Band 30 Heft 2

Abstract

In this paper, we consider the filtering of partially observed multi-dimensional diffusion processes that are observed regularly at discrete times. We assume that, for numerical reasons, one has to time-discretize the diffusion process, which typically leads to filtering that is subject to discretization bias. The approach in [A. Jasra, K. J. H. Law and F. Yu, Unbiased filtering of a class of partially observed diffusions, Adv. Appl. Probab. 54 (2022), 3, 661–687] establishes that, when only having access to the time discretized diffusion, it is possible to remove the discretization bias with an estimator of finite variance. We improve on this method by introducing a modified estimator based on the recent work [A. Jasra, M. Maama and H. Ombao, Antithetic multilevel particle filters, preprint (2023), https://arxiv.org/abs/2301.12371]. We show that this new estimator is unbiased and has finite variance. Moreover, we conjecture and verify in numerical simulations that substantial gains are obtained. That is, for a given mean square error (MSE) and a particular class of multi-dimensional diffusion, the cost to achieve the said MSE falls.

Keywords: Unbiased estimation; particle filters; diffusion processes; filtering

MSC 2010: 65C05; 65C60; 60G05; 60G35; 91G60

Funding statement: All authors were supported by KAUST baseline funding.

A Proofs

To prove our results, we use two major assumptions 1 and 2. These are given below.

A.1 Some notation

Let ( V , V ) be a measurable space. For φ : V → R , we write B b ⁢ ( V ) as the collection of bounded measurable functions. For φ ∈ B b ⁢ ( V ) , we write the supremum norm ∥ φ ∥ ∞ = sup x ∈ V | φ ⁢ ( x ) | . For a measure 𝜇 on ( V , V ) and a function φ ∈ B b ⁢ ( V ) , the notation μ ⁢ ( φ ) = ∫ V φ ⁢ ( x ) ⁢ μ ⁢ ( d ⁢ x ) is used. For A ∈ V , the Dirac measure is written as δ A ⁢ ( d ⁢ x ) . If K : V × V → [ 0 , ∞ ) is a non-negative operator and 𝜇 is a measure, we use the notation μ ⁢ K ⁢ ( d ⁢ y ) = ∫ V μ ⁢ ( d ⁢ x ) ⁢ K ⁢ ( x , d ⁢ y ) , and for φ ∈ B b ⁢ ( V ) , K ⁢ ( φ ) ⁢ ( x ) = ∫ V φ ⁢ ( y ) ⁢ K ⁢ ( x , d ⁢ y ) . We denote (throughout) 𝐶 as a generic finite constant whose value may change upon each appearance and whose dependencies (on model and simulation parameters) are clear from the statements associated to them.

We write X 2 = R d × d , and 𝔼 denotes the expectation with respect to the law of our simulated algorithm. The assumptions are as follows.

- For each ( i , j ) ∈ { 1 , … , d } , α i ∈ B b ⁢ ( X ) , β i ⁢ j ∈ B b ⁢ ( X ) .
- α ∈ C 2 ⁢ ( X , X ) , β ∈ C 2 ⁢ ( X , X 2 ) .
- β ⁢ ( x ) ⁢ β ⁢ ( x ) ⊤ is uniformly positive definite.
- There exists a C < + ∞ such that, for any ( x , i , j , k , m ) ∈ X × { 1 , … , d } 4 ,
  max ⁡ { | ∂ α i ∂ x m ⁢ ( x ) | , | ∂ β i ⁢ j ∂ x m ⁢ ( x ) | , | ∂ h i ⁢ j ⁢ k ∂ x m ⁢ ( x ) | , | ∂ 2 α i ∂ x k ⁢ ∂ x m ⁢ ( x ) | , | ∂ 2 β i ⁢ j ∂ x k ⁢ ∂ x m ⁢ ( x ) | } ≤ C .
- For each k ∈ N , g k ∈ B b ⁢ ( X ) ∩ C 2 ⁢ ( X , R ) .
- For each k ∈ N , there exists a 0 < C < + ∞ such that, for any x ∈ X , g k ⁢ ( x ) ≥ C .
- For each k ∈ N , there exists a 0 < C < + ∞ such that, for any ( x , j , m ) ∈ X × { 1 , … , d } 2 ,
  max ⁡ { | ∂ g k ∂ x j ⁢ ( x ) | , | ∂ 2 g k ∂ x j ⁢ ∂ x m ⁢ ( x ) | } ≤ C .

A.2 Technical results

The following result is essentially [15, Proposition A.1] and can be proved in the same manner.

Lemma A.1

Assume 1–2. For any k ∈ N , there exists a C < + ∞ such that, for any

( p , N p , L ¯ , φ ) ∈ N 0 × N × N 0 × B b ⁢ ( X ) ,

we have

E ⁢ [ ( π ̂ k N p , L ¯ ⁢ ( φ ) − π k L ¯ ⁢ ( φ ) ) 2 ] ≤ C ⁢ ∥ φ ∥ ∞ 2 N p ⁢ ( 1 + p 2 N p ) .

Below, we denote by η k s the discretized predictor at time k ∈ N , at level s ∈ N 0 , that is, for any φ ∈ B b ⁢ ( X ) ,

η k s ⁢ ( φ ) = π k − 1 l ⁢ ( P l ⁢ ( φ ) )

with the convention that π − 1 s ⁢ ( d ⁢ x ) = δ { x 0 } ⁢ ( d ⁢ x ) .

Lemma A.2

Assume 1–2. For any ( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) , there exists a C < + ∞ such that, for any ( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,

E ⁢ [ ( [ 1 2 ⁢ η k N 0 : p , l + 1 2 ⁢ η k N 0 : p , l , a − η k N 0 : p , l − 1 ] ⁢ ( φ ) − [ η k l − η k l − 1 ] ⁢ ( φ ) ) 2 ] ≤ C ⁢ ( Δ l N p + p 2 ⁢ Δ l 1 2 − ε N p 2 ) .

Proof

This follows by combining the proof of [15, Proposition A.1] with [16, Lemmata C.9 and C.11] along with some simple calculations which are omitted for brevity. ∎

Remark A.1

One can prove, in a similar manner to Lemma A.2, the following results. For the first result, one needs [16, Lemmata C.8 and C.10] and for the second [16, Remarks C.4 and C.5].

Assume 1–2. For any ( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) , there exists a C < + ∞ such that, for any
( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,
we have
E ⁢ [ ( [ η k N 0 : p , l − η k N 0 : p , l − 1 ] ⁢ ( φ ) − [ η k l − η k l − 1 ] ⁢ ( φ ) ) 2 ] ≤ C ⁢ ( Δ l 1 2 N p + p 2 ⁢ Δ l 1 2 − ε N p 2 ) .
Assume 1–2. For any ( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) , there exists a C < + ∞ such that, for any
( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,
we have
E ⁢ [ ( [ η k N 0 : p , l − η k N 0 : p , l , a ] ⁢ ( φ ) ) 2 ] ≤ C ⁢ ( Δ l 1 2 N p + p 2 ⁢ Δ l 1 2 − ε N p 2 ) .

Lemma A.3

Assume 1–2. For any ( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) , there exists a C < + ∞ such that, for any ( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,

E ⁢ [ | [ η k N 0 : p , l − η k N 0 : p , l − 1 ] ⁢ ( φ ) − [ η k l − η k l − 1 ] ⁢ ( φ ) | 4 ] 1 4 ≤ C ⁢ 2 p 2 ⁢ Δ l 1 4 ⁢ ( 1 2 − ε ) N p .

Proof

This follows by using the definition of η k N 0 : p , s with Minkowski’s inequality ( p + 1 ) times and then using [16, Lemma C.8] along with some simple calculations; the proof is omitted. ∎

Remark A.2

Using a similar approach to the proof of Lemma A.3 except using [16, Remark C.4] in place of [16, Lemma C.8], one can easily deduce the following result. Assume 1–2. For any

( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) ,

there exists a C < + ∞ such that, for any ( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,

E ⁢ [ | [ η k N 0 : p , l − η k N 0 : p , l , a ] ⁢ ( φ ) | 4 ] 1 4 ≤ C ⁢ 2 p 2 ⁢ Δ l 1 4 ⁢ ( 1 2 − ε ) N p .

Lemma A.4

Assume 1–2. For any ( k , φ ) ∈ N × B b ⁢ ( X ) ∩ C b 2 ⁢ ( X , R ) , there exists a C < + ∞ such that, for any ( p , N p , l , ε ) ∈ N 0 × N 2 × ( 0 , 1 2 ) ,

E ⁢ [ ( [ π k l − π k l − 1 ] ̂ N p ⁢ ( φ ) − [ π k l − π k l − 1 ] ⁢ ( φ ) ) 2 ] ≤ C ⁢ ( Δ l N p + p 2 ⁢ Δ l 1 2 − ε N p 2 + 2 2 ⁢ p ⁢ Δ l 1 2 − ε N p 4 ) .

Proof

The result follows by combining Lemmata A.2, A.3 and Remarks A.1, A.2 with [16, Lemma C.4] as we will now detail. By using [16, Lemma C.4] along with the C 2 -inequality four times, we have the decomposition

E ⁢ [ ( [ π k l − π k l − 1 ] ̂ N p ⁢ ( φ ) − [ π k l − π k l − 1 ] ⁢ ( φ ) ) 2 ] ≤ C ⁢ ∑ j = 1 4 T j ,

where

T 1 = E ⁢ [ ( 1 η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ [ 1 2 ⁢ η k N 0 : p , l + 1 2 ⁢ η k N 0 : p , l , a − η k N 0 : p , l − 1 ] ⁢ ( g k ⁢ φ ) − 1 η k l − 1 ⁢ ( g k ) ⁢ [ 1 2 ⁢ η k l + 1 2 ⁢ η k l − η k l − 1 ] ⁢ ( g k ⁢ φ ) ) 2 ] ,

T 2 = E ⁢ [ ( 1 η k N 0 : p , l ⁢ ( g k ) ⁢ η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ 1 2 ⁢ { [ η k N 0 : p , l − η k − 1 N 0 : p , l , a ] ⁢ ( g k ⁢ φ ) } ⁢ { [ η k N 0 : p , l − 1 − η k N 0 : p , l ] ⁢ ( g k ) } ) 2 ] ,

T 3 = E ⁢ [ ( 1 2 ⁢ η k N 0 : p , l , a ⁢ ( g k ⁢ φ ) η k N 0 : p , l , a ⁢ ( g k ) ⁢ η k N 0 : p , l ⁢ ( g k ) ⁢ η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ { [ η k N 0 : p , l , a − η k N 0 : p , l ] ⁢ ( g k ) } ⁢ { [ η k N 0 : p , l − 1 − η k N 0 : p , l ] ⁢ ( g k ) } ) 2 ] ,

T 4 = E [ ( η k N 0 : p , l , a ⁢ ( g k ⁢ φ ) η k N 0 : p , l , a ⁢ ( g k ) ⁢ η k N 0 : p , l − 1 ⁢ ( g k ) [ 1 2 η k N 0 : p , l + 1 2 η k N 0 : p , l , a − η k N 0 : p , l − 1 ] ( g k ) − η k l ⁢ ( g k ⁢ φ ) η k l ⁢ ( g k ) ⁢ η k l − 1 ⁢ ( g k ) × [ 1 2 η k l + 1 2 η k l − η k l − 1 ] ( g k ) ) 2 ] .

In a similar manner, T 1 and T 4 can be treated, so we only consider T 1 ; this is the same for T 2 and T 3 ; hence we only deal with T 2 . Therefore, we bound only T 1 , T 2 and conclude the proof from there.

For T 1 , we have that T 1 ≤ C ⁢ ( T 5 + T 6 ) , where

T 5 = E ⁢ [ ( 1 η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ ( [ 1 2 ⁢ η k N 0 : p , l + 1 2 ⁢ η k N 0 : p , l , a − η k N 0 : p , l − 1 ] ⁢ ( g k ⁢ φ ) − [ 1 2 ⁢ η k l + 1 2 ⁢ η k l − η k l − 1 ] ⁢ ( g k ⁢ φ ) ) ) 2 ] , T 6 = E ⁢ [ ( ( 1 η k N 0 : p , l − 1 ⁢ ( g k ) − 1 η k l − 1 ⁢ ( g k ) ) ⁢ [ 1 2 ⁢ η k l + 1 2 ⁢ η k l − η k l − 1 ] ⁢ ( g k ⁢ φ ) ) 2 ] .

Then T 5 can be controlled using the lower bound on g k and Lemma A.2, and T 6 can be bounded by using the lower bound on g k , [15, Proposition A.1] and [11, Lemma D.2]. Putting these two results together gives

T 1 ≤ C ⁢ ( Δ l 1 2 N p + p 2 ⁢ Δ l 1 2 − ε N p 2 ) .

For T 2 , we have that T 1 ≤ C ⁢ ( T 7 + T 8 ) , where

T 7 = E ⁢ [ ( 1 η k N 0 : p , l ⁢ ( g k ) ⁢ η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ 1 2 ⁢ { [ η k N 0 : p , l − η k − 1 N 0 : p , l , a ] ⁢ ( g k ⁢ φ ) } ⁢ { [ η k N 0 : p , l − 1 − η k N 0 : p , l ] ⁢ ( g k ) − [ η k l − 1 − η k l ] ⁢ ( g k ) } ) 2 ] , T 8 = E ⁢ [ ( 1 η k N 0 : p , l ⁢ ( g k ) ⁢ η k N 0 : p , l − 1 ⁢ ( g k ) ⁢ 1 2 ⁢ { [ η k N 0 : p , l − η k − 1 N 0 : p , l , a ] ⁢ ( g k ⁢ φ ) } ⁢ { [ η k l − 1 − η k l ] ⁢ ( g k ) } ) 2 ] .

For T 7 , we can use the lower bound on g k , Cauchy–Schwarz, Lemma A.3 and Remark A.2. For T 8 , we can use the lower bound on g k , Remark A.1 and [11, Lemma D.2]. Therefore, one can deduce that

T 2 ≤ C ⁢ ( Δ l N p + p 2 ⁢ Δ l 1 2 − ε N p 2 + 2 2 ⁢ p ⁢ Δ l 1 2 − ε N p 4 ) ,

and from here, one can conclude. ∎

References

[1] A. Beskos and G. O. Roberts, Exact simulation of diffusions, Ann. Appl. Probab. 15 (2005), no. 4, 2422–2444. 10.1214/105051605000000485Suche in Google Scholar

[2] J. Blanchet and F. Zhang, Exact simulation for multivariate Itô diffusions, Adv. in Appl. Probab. 52 (2020), no. 4, 1003–1034. 10.1017/apr.2020.39Suche in Google Scholar

[3] O. Cappé, E. Moulines and T. Rydén, Inference in Hidden Markov Models, Springer Ser. Statist., Springer, New York, 2005. 10.1007/0-387-28982-8Suche in Google Scholar

[4] P. Del Moral, Mean Field Simulation for Monte Carlo Integration, Monogr. Statist. Appl. Probab. 126, CRC Press, Boca Raton, 2013. 10.1201/b14924Suche in Google Scholar

[5] P. Del Moral, J. Jacod and P. Protter, The Monte-Carlo method for filtering with discrete-time observations, Probab. Theory Related Fields 120 (2001), no. 3, 346–368. 10.1007/PL00008786Suche in Google Scholar

[6] P. Fearnhead, O. Papaspiliopoulos and G. O. Roberts, Particle filters for partially observed diffusions, J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008), no. 4, 755–777. 10.1111/j.1467-9868.2008.00661.xSuche in Google Scholar

[7] M. B. Giles, Multilevel Monte Carlo path simulation, Oper. Res. 56 (2008), no. 3, 607–617. 10.1287/opre.1070.0496Suche in Google Scholar

[8] M. B. Giles, Multilevel Monte Carlo methods, Acta Numer. 24 (2015), 259–328. 10.1017/S096249291500001XSuche in Google Scholar

[9] M. B. Giles and L. Szpruch, Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation, Ann. Appl. Probab. 24 (2014), no. 4, 1585–1620. 10.1214/13-AAP957Suche in Google Scholar

[10] S. Heinrich, Multilevel Monte Carlo methods, Large-Scale Scientific Computing, Springer, Berlin (2001), 58–67. 10.1007/3-540-45346-6_5Suche in Google Scholar

[11] A. Jasra, K. Kamatani, K. J. H. Law and Y. Zhou, Multilevel particle filters, SIAM J. Numer. Anal. 55 (2017), no. 6, 3068–3096. 10.1137/17M1111553Suche in Google Scholar

[12] A. Jasra, K. Kamatani, P. P. Osei and Y. Zhou, Multilevel particle filters: Normalizing constant estimation, Stat. Comput. 28 (2018), no. 1, 47–60. 10.1007/s11222-016-9715-5Suche in Google Scholar

[13] A. Jasra, K. Law and C. Suciu, Advanced multilevel Monte Carlo methods, Int. Stat. Rev. 88 (2020), no. 3, 548–579. 10.1111/insr.12365Suche in Google Scholar

[14] A. Jasra, K. J. H. Law and P. P. Osei, Multilevel particle filters for Lévy-driven stochastic differential equations, Stat. Comput. 29 (2019), no. 4, 775–789. 10.1007/s11222-018-9837-zSuche in Google Scholar

[15] A. Jasra, K. J. H. Law and F. Yu, Unbiased filtering of a class of partially observed diffusions, Adv. Appl. Probab. 54 (2022), no. 3, 661–687. 10.1017/apr.2021.50Suche in Google Scholar

[16] A. Jasra, M. Maama and H. Ombao, Antithetic multilevel particle filters, preprint (2023), https://arxiv.org/abs/2301.12371. 10.1017/apr.2024.12Suche in Google Scholar

[17] A. Jasra, F. Yu and J. Heng, Multilevel particle filters for the non-linear filtering problem in continuous time, Stat. Comput. 30 (2020), no. 5, 1381–1402. 10.1007/s11222-020-09951-9Suche in Google Scholar

[18] D. McLeish, A general method for debiasing a Monte Carlo estimator, Monte Carlo Methods Appl. 17 (2011), no. 4, 301–315. 10.1515/mcma.2011.013Suche in Google Scholar

[19] C.-H. Rhee and P. W. Glynn, Unbiased estimation with square root convergence for SDE models, Oper. Res. 63 (2015), no. 5, 1026–1043. 10.1287/opre.2015.1404Suche in Google Scholar

[20] M. Vihola, Unbiased estimators and multilevel Monte Carlo, Oper. Res. 66 (2018), no. 2, 448–462. 10.1287/opre.2017.1670Suche in Google Scholar

Received: 2023-03-04

Revised: 2023-11-17

Accepted: 2023-11-24

Published Online: 2023-12-15

Published in Print: 2024-06-01

Sie haben derzeit keinen Zugang zu diesem Inhalt.

Artikel in diesem Heft

https://doi.org/10.1515/mcma-2023-2024

Schlagwörter für diesen Artikel

Unbiased estimation; particle filters; diffusion processes; filtering