A new proof of geometric convergence for the adaptive generalized weighted analog sampling (GWAS) method

Rong Kong and Jerome Spanier

Published/Copyright: June 30, 2016

Abstract

Generalized Weighted Analog Sampling is a variance-reducing method for solving radiative transport problems that makes use of a biased (though asymptotically unbiased) estimator. The introduction of bias provides a mechanism for combining the best features of unbiased estimators while avoiding their limitations. In this paper we present a new proof that adaptive GWAS estimation based on combining the variance-reducing power of importance sampling with the sampling simplicity of correlated sampling yields geometrically convergent estimates of radiative transport solutions. The new proof establishes a stronger and more general theory of geometric convergence for GWAS.

MSC 2010: 65C05; 82D75

Funding statement: The second author gratefully acknowledges partial support from NIH NIBIB Laser Microbeam and Medical Program (LAMMP, P41EB015890) and from NIH R25GM103818.

A Preliminary estimates

The following formula from probability theory can be found in [12] and is used in many of our proofs.

Lemma 1

For any random variables X and Y,

EX[X]=EY[EX[X|Y]],VX[X]=EY[VX[X|Y]]+VY[EX[X|Y]].
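Both identities can be verified numerically. The sketch below (ours, not part of the paper) draws pairs (X, Y) with Y a Bernoulli variable and checks the laws of total expectation and total variance on the empirical distribution, where they hold exactly up to floating-point round-off:

```python
import random

random.seed(0)

# Draw (X, Y) pairs: Y is a coin flip and X is Gaussian with a Y-dependent mean.
n = 10_000
samples = []
for _ in range(n):
    y = random.random() < 0.3                  # Y ~ Bernoulli(0.3)
    x = random.gauss(2.0 if y else -1.0, 1.0)  # X given Y
    samples.append((x, y))

def mean(v):
    return sum(v) / len(v)

def var(v):  # population variance
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

xs = [x for x, _ in samples]
groups = {b: [x for x, y in samples if y == b] for b in (True, False)}
p = len(groups[True]) / n                      # empirical P(Y = 1)
m1, m0 = mean(groups[True]), mean(groups[False])

# E[X] = E_Y[E[X|Y]]
lhs_mean = mean(xs)
rhs_mean = p * m1 + (1 - p) * m0

# V[X] = E_Y[V[X|Y]] + V_Y[E[X|Y]]
within = p * var(groups[True]) + (1 - p) * var(groups[False])
between = var([m1 if y else m0 for _, y in samples])
lhs_var, rhs_var = var(xs), within + between

print(abs(lhs_mean - rhs_mean), abs(lhs_var - rhs_var))  # both negligible
```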

The following set-theoretic result also arises in estimating probabilities:

Proposition 2

Assume that A and B are two subsets of a probability space and, for two small positive numbers ε1 and ε2, satisfy

𝒫(A)≥1-ε1,𝒫(B)≥1-ε2.

Then

𝒫(A∩B)≥1-(ε1+ε2).
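Although the proof is omitted in the text, Proposition 2 is an immediate consequence of the union bound applied to the complements of A and B:

```latex
\mathcal{P}(A\cap B) \;=\; 1-\mathcal{P}(A^{c}\cup B^{c})
\;\ge\; 1-\bigl(\mathcal{P}(A^{c})+\mathcal{P}(B^{c})\bigr)
\;\ge\; 1-(\varepsilon_{1}+\varepsilon_{2}).
```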

Proof of Lemma 1.

We will use the following formulas, for any random variables X and Y:

(A.1)EX[X]=EY[EX[X|Y]],VX[X]=EY[VX[X|Y]]+VY[EX[X|Y]].

For E[h(P)], by first conditioning on ηP and then on ξ1, we have

E[h(P)]=EηP[E[h(P)|ηP]]
=E[h(P)|ηP=1]p(P)+E[h(P)|ηP=0](1-p(P))
=p^(P)+(1-p(P))Eξ1[E[h(P)|ηP=0,ξ1]]
=p^(P)+(1-p(P))ΓE[h(P)|ηP=0,ξ1=P1]p(P,P1)(1-p(P))dP1
=p^(P)+ΓK^(P,P1)E[h(P1)]𝑑P1,

where we have used

E[h(P)|ηP=0,ξ1=P1]=K^(P,P1)p(P,P1)E[h(P1)].

As for E[g], taking averages of both sides of equation (3.7) and conditioning on ξ0, we obtain

E[g]=E[h(ξ0)S^(ξ0)p1(ξ0)]=Eξ0[E[h(ξ0)S^(ξ0)p1(ξ0)|ξ0]]=ΓE[h(P)S^(P)]dP=ΓE[h(P)]S^(P)dP,

which is (5.2).

To calculate the variance of h(P), by conditioning on ηP, we obtain

Vh[h(P)]=Vh[h(P)|ηP=1]p(P)+Vh[h(P)|ηP=0](1-p(P))+EηP[Eh[h(P)|ηP]]2-{EηP[Eh[h(P)|ηP]]}2.

The first term on the right-hand side is equal to zero because, under the condition ηP=1, h(P) is deterministic, while the last term equals (Eh[h(P)])2 owing to (A.1). Applying (A.1) again to the second term by conditioning (h(P)|ηP=0) on ξ1, we obtain

Vh[h(P)]=Eξ1[Vh[h(P)|ηP=0,ξ1]](1-p(P))+Vξ1[Eh[h(P)|ηP=0,ξ1]](1-p(P))+[Eh[h(P)|ηP=1]]2p(P)+[Eh[h(P)|ηP=0]]2(1-p(P))-(Eh[h(P)])2=ΓVh[h(P)|ηP=0,ξ1=Q]p(P,Q)dQ+Eξ1[Eh[h(P)|ηP=0,ξ1]]2(1-p(P))-{Eξ1[Eh[h(P)|ηP=0,ξ1]]}2(1-p(P))+(p^(P)p(P))2p(P)+[Eh[h(P)|ηP=0]]2(1-p(P))-(Eh[h(P)])2,

where it is easily verified that Eh[h(P)|ηP=1]=p^(P)/p(P). According to (A.1), the third term and the fifth term cancel out. We then have

Vh[h(P)]=ΓVh[h(Q)](K^(P,Q)p(P,Q))2p(P,Q)𝑑Q+Γ(Eh[h(Q)])2(K^(P,Q)p(P,Q))2p(P,Q)𝑑Q+(p^(P)p(P))2p(P)-(Eh[h(P)])2,

or

Vh[h(P)]+(Eh[h(P)])2=Γ(Vh[h(Q)]+(Eh[h(Q)])2)K^2(P,Q)p(P,Q)𝑑Q+p^2(P)p(P),

which is (5.3). To calculate V[g], we have

V[g]=V[h(ξ0)S^(ξ0)p1(ξ0)]
=Eξ0[V[h(ξ0)S^(ξ0)p1(ξ0)|ξ0]]+Vξ0[E[h(ξ0)S^(ξ0)p1(ξ0)|ξ0]]
=ΓV[h(P)S^(P)]dPp1(P)+Eξ0[E[h(ξ0)S^(ξ0)p1(ξ0)|ξ0]]2-(E[g])2
=ΓV[h(P)](S^(P))2p1(P)𝑑P+Γ(E[h(P)])2(S^(P))2p1(P)𝑑P-(E[g])2,

which is (5.4). The proof of Lemma 1 is completed. ∎

Proof of Lemma 2.

From (2.11) and (5.1), we obtain

E[h(P)]=S(P)+ΓK(P,P1)Φ~(P1)E[h(P1)]𝑑P1ΓK(P,Q)Φ~(Q)𝑑Q+S(P).

We then have

E[h(P)]≥S(P)+minPΓΦ~(P)ΓK(P,P1)E[h(P1)]𝑑P1maxPΓΦ~(P)ΓK(P,Q)𝑑Q+S(P).

Using the first inequality of (5.5) and also applying Proposition 2, we obtain

𝒫{E[h(P)]≥δSMΦ~maxPΓΓK(P,Q)𝑑Q+δS}>1-ε1,

which means

E[h(P)]≥δSMΦ~maxPΓΓK(P,Q)𝑑Q+δS

as both sides of the inequality are deterministic. Estimate (5.6) is proved.

Now, we derive an upper bound of Vh[h(P)]. Formula (5.3) can be written as

(Vh[h(P)]+(Eh[h(P)])2)(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2
=Γ(Vh[h(Q)]+(Eh[h(Q)])2(Φ~(Q))2)K2(P,Q)p(P,Q)𝑑Q+S2(P)p(P),

or

(Vh[h(P)]+(Eh[h(P)])2)(Φ~(P))2
=Γ(Vh[h(Q)]+(Eh[h(Q)])2(Φ~(Q))2)K2(P,Q)p(P,Q)𝑑Q
(A.2)+S2(P)p(P)+(Vh[h(P)]+(Eh[h(P)])2)((Φ~(P))2-(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2).

Since p(P,Q) satisfies (3.11), viewing (A.2) as an equation for (Vh[h(P)]+(Eh[h(P)])2)(Φ~(P))2, we can estimate it as follows:

(Vh[h(P)]+(Eh[h(P)])2)(Φ~(P))2
≤11-κpmaxPΓ((Vh[h(P)]+(Eh[h(P)])2)|(Φ~(P))2-(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2|)
+11-κpmaxPΓS2(P)p(P),

or

Vh[h(P)]+(Eh[h(P)])2
≤11-κp1(Φ~(P))2maxPΓ((Vh[h(P)]+(Eh[h(P)])2)|(Φ~(P))2-(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2|)
+11-κp1(Φ~(P))2maxPΓS2(P)p(P).

Taking the maximum value of both sides, we obtain

Vh[h(P)]+(Eh[h(P)])2
≤11-κp1minPΓ(Φ~(P))2maxPΓ(Vh[h(P)]+(Eh[h(P)])2)maxPΓ|(Φ~(P))2-(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2|
(A.3)+11-κp1minPΓ(Φ~(P))2maxPΓS2(P)p(P).

In (A.3), notice that

|(Φ~(P))2-(ΓK(P,Q)Φ~(Q)𝑑Q+S(P))2|
=|(Φ~(P)+ΓK(P,Q)Φ~(Q)𝑑Q+S(P))(Φ~(P)-ΓK(P,Q)Φ~(Q)𝑑Q-S(P))|
≤(maxPΓ|Φ~(P)|(1+κ)+MS)maxPΓ|Φ~(P)|maxPΓ|Φ~(P)-ΓK(P,Q)Φ~(Q)𝑑Q-S(P)||Φ~(P)|,

where κ is defined by (2.4). Therefore, from (A.3), we have

𝒫{Vh[h(P)]+(Eh[h(P)])2≤(MΦ~(1+κ)+MS)MΦ~(1-κp)δΦ~2maxPΓ|Φ~(P)-ΓK(P,Q)Φ~(Q)𝑑Q-S(P)||Φ~(P)|
maxPΓ(Vh[h(P)]+(Eh[h(P)])2)+MS2(1-κp)δΦ~2δP}>1-ε1.

Now using the second inequality of (5.5) and Proposition 2, we have

𝒫{Vh[h(P)]+(Eh[h(P)])2≤αmaxPΓ(Vh[h(P)]+(Eh[h(P)])2)+MS2(1-κp)δΦ~2δP}>1-ε1-ε2,

or

𝒫{Vh[h(P)]+(Eh[h(P)])2≤MS2(1-α)(1-κp)δΦ~2δP}>1-ε1-ε2,

which means

Vh[h(P)]+(Eh[h(P)])2≤MS2(1-α)(1-κp)δΦ~2δP,

because both sides of the inequality are deterministic and ε1+ε2<1.

The proof of Lemma 2 is completed. ∎

Proof of Corollary 3.

According to Chebyshev’s inequality, for any W and ε3>0, we have

𝒫{|Eh[h(P)]-1Ww=1Whw(P)|<√(Vh[h(P)]/(Wε3))}≥1-ε3,

which means

𝒫{1Ww=1Whw(P)>Eh[h(P)]-√(Vh[h(P)]/(Wε3))}≥1-ε3.

Using (5.6), we have

𝒫{1Ww=1Whw(P)>δh-√(MV/(Wε3))}≥1-ε3.

Now we just choose

Wh=4MV/(ε3δh2)

to complete the proof of Corollary 3. ∎
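The role of this choice of Wh is that it makes the Chebyshev deviation √(MV/(Wε3)) exactly δh/2, so the sample mean stays above δh/2 with probability at least 1-ε3. A small numerical sketch (the constants MV, δh and ε3 are illustrative placeholders, not values from the paper):

```python
import math

# Placeholder constants: a variance bound M_V, a mean lower bound delta_h,
# and a Chebyshev failure tolerance eps3 (all illustrative).
M_V, delta_h, eps3 = 2.0, 0.5, 0.05

# Corollary 3's choice of the sample size:
W_h = 4 * M_V / (eps3 * delta_h ** 2)

# The resulting Chebyshev half-width sqrt(M_V / (W_h * eps3)) is delta_h / 2,
# so delta_h - half_width = delta_h / 2 remains a positive lower bound.
half_width = math.sqrt(M_V / (W_h * eps3))
print(W_h, half_width, delta_h - half_width)
```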

Proof of Lemma 4.

From the definition (3.7) of ζ and equation (3.9) for d𝒩~/d𝒩, we obtain

E[|X|]=E[|ζ-(ΓΦ(P)S(P)𝑑P)d𝒩~/d𝒩|]
=k=1Λk|ζ-(ΓΦ(P)S(P)𝑑P)d𝒩~/d𝒩|𝑑ν
=k=1Λk|S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
-(ΓΦ(P)S(P)dP)S^(P1)K^(P1,P2)K^(Pk-1,Pk)p^(Pk)|dP1dPk
=k=1Λk|S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
-Φ~(P1)S(P1)ΓΦ(Q)S(Q)𝑑QΓΦ~(Q)S(Q)𝑑QK(P1,P2)Φ~(P2)ΓK(P1,Q)Φ~(Q)𝑑Q+S(P1)
K(Pk-1,Pk)Φ~(Pk)ΓK(Pk-1,Q)Φ~(Q)𝑑Q+S(Pk-1)S(Pk)ΓK(Pk,Q)Φ~(Q)𝑑Q+S(Pk)|dP1dPk.

Noticing (4.5) and (4.7), the definition of D(P) (recall that we suppress the superscript in this section), we can simplify the factors after the minus sign in the integral by

(A.4)Φ~(Pi)ΓK(Pi,Q)Φ~(Q)𝑑Q+S(Pi)=1ΓK(Pi,Q)Φ~(Q)𝑑Q+S(Pi)Φ~(Pi)=11+D(Pi).

Therefore,

E[|X|]=k=1ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)|1-Φ~(P1)ΓΦ(Q)S(Q)𝑑QΓΦ~(Q)S(Q)𝑑QΦ~(P2)ΓK(P1,Q)Φ~(Q)𝑑Q+S(P1)
Φ~(Pk)ΓK(Pk-1,Q)Φ~(Q)𝑑Q+S(Pk-1)1ΓK(Pk,Q)Φ~(Q)𝑑Q+S(Pk)|dP1dPk
=1ΓΦ~(Q)S(Q)𝑑Qk=1ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
|ΓΦ~(Q)S(Q)𝑑Q-ΓΦ(Q)S(Q)𝑑Q(1+D(P1))(1+D(Pk))|dP1dPk
=1ΓΦ~(Q)S(Q)𝑑Qk=1ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
|ΓΦ~(Q)S(Q)𝑑Q-ΓΦ~(Q)S(Q)𝑑Q(1+D(P1))(1+D(Pk))-Γ(Φ(Q)-Φ~(Q))S(Q)𝑑Q(1+D(P1))(1+D(Pk))|dP1dPk,

which can be estimated by

E[|X|]≤1ΓΦ~(Q)S(Q)𝑑Qk=1ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
(ΓΦ~(Q)S(Q)𝑑Q|1-1(1+D(P1))(1+D(Pk))|+||Γ(Φ(Q)-Φ~(Q))S(Q)dQ|(1+D(P1))(1+D(Pk))|)dP1dPk.

Obviously, if |D(P)|<1, then

(A.5)1(1+D(P1))(1+D(Pk))≤1(1-maxPΓ|D(P)|)k,

and, therefore,

E[|X|]≤1ΓΦ~(Q)S(Q)𝑑Qk=1ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
(A.6)(ΓΦ~(Q)S(Q)𝑑Q|1-1(1-maxPΓ|D(P)|)k|+|Γ(Φ(Q)-Φ~(Q))S(Q)dQ|(1-maxPΓ|D(P)|)k)dP1dPk.

On the other hand, using (2.4),

|ΛkS(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)𝑑P1𝑑Pk|
≤maxPkΓS(Pk)Γ|S(P1)|𝑑P1(maxP1ΓΓK(P1,P2)𝑑P2)(maxPkΓΓK(Pk-1,Pk)𝑑Pk)
(A.7)=κk-1maxPΓS(P)Γ|S(P)|𝑑P.

Applying (A.7) to (A.6), we obtain

E[|X|]≤maxPΓS(P)Γ|S(P)|dP(k=1(κk-1(1-maxPΓ|D(P)|)k-κk-1)
+|Γ(Φ(Q)-Φ~(Q))S(Q)dQ|ΓΦ~(Q)S(Q)𝑑Qk=1κk-1(1-maxPΓ|D(P)|)k)
=maxPΓS(P)Γ|S(P)|𝑑P1-maxPΓ|D(P)|-κ(maxPΓ|D(P)|1-κ+|Γ(Φ(Q)-Φ~(Q))S(Q)dQ|(ΓΦ~(Q)S(Q)𝑑Q)).
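The last step sums two geometric series. With d = maxPΓ|D(P)| and the kernel bound κ from (2.4), the identities used are Σk≥1 κ^(k-1)/(1-d)^k = 1/(1-d-κ) and Σk≥1 κ^(k-1)(1/(1-d)^k - 1) = d/((1-d-κ)(1-κ)), both valid when κ < 1-d. A numerical check (ours, with illustrative values of κ and d):

```python
# Illustrative values satisfying kappa < 1 - d.
kappa, d = 0.6, 0.1

# Partial sums with enough terms for the geometric tails to be negligible.
s1 = sum(kappa ** (k - 1) / (1 - d) ** k for k in range(1, 2000))
s2 = sum(kappa ** (k - 1) * (1 / (1 - d) ** k - 1) for k in range(1, 2000))

closed1 = 1 / (1 - d - kappa)                  # sum of kappa^(k-1)/(1-d)^k
closed2 = d / ((1 - d - kappa) * (1 - kappa))  # sum of kappa^(k-1)(1/(1-d)^k - 1)

print(abs(s1 - closed1), abs(s2 - closed2))    # both negligible
```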

Using condition (5.8) and Proposition 2, we obtain

𝒫{E[|X|]≤βmaxPΓS(P)Γ|S(P)|𝑑P(1-β)κ(maxPΓ|D(P)|1-κ+|Γ(Φ(Q)-Φ~(Q))S(Q)dQ|(ΓΦ~(Q)S(Q)𝑑Q))}>1-ε1.

The proof of Lemma 4 is completed. ∎

Proof of Theorem 6.

Noticing the definition (3.7) of ζ and the equation (3.9) satisfied by dν~/dν, we obtain

E[X2]=E[(ζ-(ΓΦ(P)S(P)𝑑P)dν~/dν)2]
=k=1Λk(ζ-(ΓΦ(P)S(P)𝑑P)dν~/dν)2𝑑ν
=k=1Λk(S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
-(ΓΦ(P)S(P)dP)S^(P1)K^(P1,P2)K^(Pk-1,Pk)p^(Pk))2
dP1dPkp1(P1)p(P1,P2)p(Pk-1,Pk)p(Pk).

Using (2.11), we obtain

E[X2]=k=1Λk(S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk)
-Φ~(P1)S(P1)ΓΦ(Q)S(Q)𝑑QΓΦ~(Q)S(Q)𝑑QK(P1,P2)Φ~(P2)ΓK(P1,Q)Φ~(Q)𝑑Q+S(P1)
K(Pk-1,Pk)Φ~(Pk)ΓK(Pk-1,Q)Φ~(Q)𝑑Q+S(Pk-1)S(Pk)ΓK(Pk,Q)Φ~(Q)𝑑Q+S(Pk))2
dP1dPkp1(P1)p(P1,P2)p(Pk-1,Pk)p(Pk)
=1(ΓΦ~(Q)S(Q)𝑑Q)2k=1Λk(S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk))2
(ΓΦ~(Q)S(Q)dQ-ΓΦ(Q)S(Q)dQΦ~(P1)ΓK(P1,Q)Φ~(Q)𝑑Q+S(P1)
Φ~(Pk-1)ΓK(Pk-1,Q)Φ~(Q)𝑑Q+S(Pk-1)Φ~(Pk)ΓK(Pk,Q)Φ~(Q)𝑑Q+S(Pk))2
dP1dPkp1(P1)p(P1,P2)p(Pk-1,Pk)p(Pk).

Recalling the notation D(P) defined in (4.7) (superscript again suppressed), or proceeding exactly as in Lemma 4, equation (A.4), we obtain

E[X2]=1(ΓΦ~(Q)S(Q)𝑑Q)2k=1Λk(S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk))2p1(P1)p(P1,P2)p(Pk-1,Pk)p(Pk)
(ΓΦ~(Q)S(Q)𝑑Q-ΓΦ(Q)S(Q)𝑑Q11+D(P1)11+D(Pk-1)11+D(Pk))2dP1dPk
=1(ΓΦ~(Q)S(Q)𝑑Q)2k=1Λk(S(P1)K(P1,P2)K(Pk-1,Pk)S(Pk))2p1(P1)p(P1,P2)p(Pk-1,Pk)p(Pk)
(ΓΦ~(Q)S(Q)dQ(1-11+D(P1)11+D(Pk-1)11+D(Pk))
+Γ(Φ~(Q)-Φ(Q))S(Q)𝑑Q(1+D(P1))(1+D(Pk)))2dP1dPk.

Using (A.5) in the proof of Lemma 4, we obtain

E[X2]≤k=1(Γ(S(P1))2p1(P1)𝑑P1)(maxPΓΓ(K(P,Q))2p(P,Q)𝑑Q)k-1maxPΓ(S(P))2p(P)
(A.8)(1-1(1-maxPΓ|D(P)|)k+Γ(Φ~(Q)-Φ(Q))S(Q)𝑑QΓΦ~(Q)S(Q)𝑑Q1(1-maxPΓ|D(P)|)k)2.

Since (a+b)2≤2(a2+b2) for any a and b (because (a+b)2=2(a2+b2)-(a-b)2), we obtain

(1-1(1-maxPΓ|D(P)|)k+Γ(Φ~(Q)-Φ(Q))S(Q)𝑑QΓΦ~(Q)S(Q)𝑑Q1(1-maxPΓ|D(P)|)k)2
(A.9)≤2(1(1-maxPΓ|D(P)|)k-1)2+2(Γ(Φ~(Q)-Φ(Q))S(Q)𝑑QΓΦ~(Q)S(Q)𝑑Q1(1-maxPΓ|D(P)|)k)2.

Recalling the definition (3.11) of κp and using (A.9) in (A.8), we obtain

E[X2]≤2(Γ(S(P1))2p1(P1)𝑑P1)(maxPΓ(S(P))2p(P))k=1κpk-1(1(1-maxPΓ|D(P)|)k-1)2
+2(Γ(S(P1))2p1(P1)𝑑P1)(maxPΓ(S(P))2p(P))
k=1κpk-1(Γ(Φ~(Q)-Φ(Q))S(Q)𝑑QΓΦ~(Q)S(Q)𝑑Q1(1-maxPΓ|D(P)|)k)2
≤2(Γ(S(P1))2p1(P1)𝑑P1)(maxPΓ(S(P))2p(P))(1-maxPΓ|D(P)|+κp)maxPΓ|D(P)|2((1-maxPΓ|D(P)|)2-κp)(1-κp)(1-maxPΓ|D(P)|-κp)
+2(Γ(S(P1))2p1(P1)𝑑P1)(maxPΓ(S(P))2p(P))(1-maxPΓ|D(P)|)2-κp(Γ(Φ~(Q)-Φ(Q))S(Q)𝑑QΓΦ~(Q)S(Q)𝑑Q)2.
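The series summed in the last step again reduce to geometric series, now with ratio κp/(1-d)² where d = maxPΓ|D(P)|. The closed form behind the first term, Σk≥1 κp^(k-1)(1/(1-d)^k - 1)² = (1-d+κp)d²/(((1-d)²-κp)(1-κp)(1-d-κp)), valid when κp < (1-d)², can be checked numerically (illustrative values, ours):

```python
# Illustrative values satisfying kp < (1 - d)**2.
kp, d = 0.5, 0.1

s = sum(kp ** (k - 1) * (1 / (1 - d) ** k - 1) ** 2 for k in range(1, 3000))
closed = (1 - d + kp) * d ** 2 / (
    ((1 - d) ** 2 - kp) * (1 - kp) * (1 - d - kp)
)

print(abs(s - closed))  # negligible
```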

Using (5.10) together with Proposition 2, we obtain (5.11).

The proof is completed. ∎

B Estimation of the bias

Proof of Theorem 1.

Recalling the definition of τW(P), (4.4), we obtain

(B.1)Z(P)≡|E[τW(P)]-Φ(P)|=|E[w=1Wωw(P)w=1Whw(P)]-Φ(P)|=|E[1Ww=1W(ωw(P)-Φ(P)hw(P))1Ww=1Whw(P)]|.

Note that hw(P) and ωw(P) are the w-th samples of the random variables h(P) (=dν~P/dνP) and ω(P), respectively.

Now, from Corollary 3, for ε>0, there must be a positive number δh (defined by (5.7)) such that

(B.2)𝒫{1Ww=1Whw(P)≥12δh}≥1-ε.

Therefore, from (B.1),

𝒫{Z(P)≤E[w=1W|ωw(P)-Φ(P)hw(P)|]Wδh}≥1-ε,

or

(B.3)𝒫{Z(P)≤w=1WE[|ωw(P)-Φ(P)hw(P)|]Wδh}≥1-ε.

According to (4.9), the definition of X(P) (superscript suppressed as indicated before), (B.3) is actually

𝒫{Z(P)≤w=1WE[|Xw(P)|]Wδh}≥1-ε,

where the subscript w indicates that Xw(P) is the w-th sample of X(P). Now, appealing to Lemma 4, we obtain the second inequality of (6.1).

As for the random variable Z, we can prove it similarly. Notice

Z≡|E[τW]-I|=|E[w=1Wζww=1Wgw]-I|=|E[1Ww=1W(ζw-(ΓΦ(Q)S(Q)𝑑Q)gw)1Ww=1Wgw]|.

Then using (B.2), we obtain the first inequality of (6.1).

This completes the proof of Theorem 1. ∎

C Estimation of second moments

Proof of Theorem 1.

We first estimate V(P). Note that this is not the variance of the random variable τW because, in general,

E[τW(P)]≠Φ(P),

i.e., τW(P) is not unbiased. However, the estimation of V(P) will help us prove the geometric convergence of the solution through the random variable τW(P).

We have

(C.1)V(P)=E[τW(P)-Φ(P)]2=E[w=1Wωw(P)w=1Whw(P)-Φ(P)]2=E[1Ww=1W(ωw(P)-Φ(P)hw(P))1Ww=1Whw(P)]2.

Now, from Corollary 3, for ε1>0, there must be a positive number δh and an integer Wh>0 such that, when W≥Wh,

𝒫{1Ww=1Whw(P)≥12δh}≥1-ε1.

Therefore, from (C.1),

𝒫{V(P)≤4E[w=1W(ωw(P)-Φ(P)hw(P))]2W2δh2}≥1-ε1,

or

𝒫{V(P)≤4W2δh2w=1WE[(ωw(P)-Φ(P)hw(P))2]
(C.2)+8W2δh2i<jWE[(ωi(P)-Φ(P)hi(P))(ωj(P)-Φ(P)hj(P))]}≥1-ε1.

Because of the independence of the random walks for different i and j, the second sum in the braces of (C.2) vanishes. We then have

𝒫{V(P)≤4W2δh2w=1WE[(ωw(P)-Φ(P)hw(P))2]}≥1-ε1,

or

(C.3)𝒫{V(P)≤4Wδh2E[(ω(P)-Φ(P)h(P))2]}≥1-ε1.

According to condition (7.3), we can apply Lemma 6, specifically inequality (5.12):

(C.4)𝒫{maxPΓE[X2(P)]≤c~1maxPΓ|D(P)|2+c~2maxPΓ|Φ~(P)-Φ(P)Φ~(P)|2}>1-ε2.

Now combining (C.3) and (C.4), and using Proposition 2, we obtain

(C.5)𝒫{V(P)≤4c~1Wδh2maxPΓ|D(P)|2+4c~2Wδh2maxPΓ|Φ~(P)-Φ(P)Φ~(P)|2}≥1-ε1-ε2,

which is the second inequality of (7.4) (after redefining the constants c~1 and c~2).

As for V, we note

V=E[τW-I]2=E[w=1Wζww=1Wgw-I]2=E[1Ww=1W(ζw-(ΓΦ(Q)S(Q)𝑑Q)gw)w=1Wgw]2.

Similarly, we can obtain the first inequality of (7.4).

In order to derive (7.5), we notice that

maxPΓ|D(P)|=maxPΓ|ΓK(P,Q)(Φ~(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~(P))Φ~(P)|
≤(1+κ)(maxPΓΦ~(P)+maxPΓΦ(P))minPΓΦ~(P)
(C.6)≤(1+κ)(maxPΓΦ~(P)+MΦ)minPΓΦ~(P)

and

(C.7)maxPΓ|Φ~(P)-Φ(P)Φ~(P)|≤(maxPΓΦ~(P)+MΦ)minPΓΦ~(P).

Combining (C.5), (C.6) and (C.7), we obtain

𝒫{V(P)≤4c~1Wδh2((1+κ)(maxPΓΦ~(P)+MΦ)minPΓΦ~(P))2+4c~2Wδh2((maxPΓΦ~(P)+MΦ)minPΓΦ~(P))2}≥1-ε1-ε2.

Now using (7.2) and Proposition 2, we obtain

𝒫{V(P)≤C1W}≥1-2ε1-ε2,

where

C1=4(c~1(1+κ)2+c~2)(MΦ~+MΦ)2/(δh2δΦ~2).

Noticing the second inequality of (7.1), since V(P) is deterministic, we obtain

V(P)≤C1W,

which is the first inequality of (7.5). The second inequality can be obtained similarly.

This completes the proof. ∎

D General geometric convergence

Proof of Theorem 1.

We will prove by induction that, for any m≥1,

(D.1){𝒫{δΦ~≤Φ~m-1(P)≤MΦ~}≥1-ε5,𝒫{maxPΓ|Dm-1(P)|≤1-κpγ}≥1-2ε5,𝒫{maxPΓ|Φ(P)-Φ~m(P)|<λmaxPΓ|Φ~m-1(P)-Φ(P)|}≥1-ε.

We have added the first two inequalities to the list because they are needed to prove the third.

For m=1, the first two inequalities are trivial, as conditions (8.1) and (8.2) imply them. To prove the third one, we appeal to Theorem 1, the second inequality of (7.4),

(D.2)𝒫{V1(P)≤c~1WmaxPΓ|D0(P)|2+c~2WmaxPΓ|Φ~0(P)-Φ(P)Φ~0(P)|2}≥1-3ε5,

where we have taken ε1=ε5 and ε2=2ε5.

We need to estimate maxPΓ|D0(P)|2 and maxPΓ|Φ~0(P)-Φ(P)Φ~0(P)|2 by maxPΓ|Φ~0(P)-Φ(P)|2. We have

maxPΓ|D0(P)|2=maxPΓ|ΓK(P,Q)(Φ~0(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~0(P))Φ~0(P)|2
(D.3)≤(1+κ)2(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2

and

(D.4)maxPΓ|Φ~0(P)-Φ(P)Φ~0(P)|2≤1(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2.

From (D.2), (D.3) and (D.4), we obtain

𝒫{V1(P)≤c~1(1+κ)2+c~2W(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2}≥1-3ε5.

Applying the first inequality (D.1) and Proposition 2, we obtain

(D.5)𝒫{V1(P)≤C1WmaxPΓ|Φ~0(P)-Φ(P)|2}≥1-4ε5,

where C1=(c~1(1+κ)2+c~2)/δΦ~2. Next, according to Chebyshev’s inequality,

𝒫{|Φ(P)-τW1(P)|<√(V1(P)/ε5)}≥1-ε5,

or after we replace τW1(P) by Φ~1(P),

(D.6)𝒫{|Φ(P)-Φ~1(P)|<√(V1(P)/ε5)}≥1-ε5.

Combining (D.5) with (D.6) and noticing Proposition 2, we obtain

𝒫{|Φ(P)-Φ~1(P)|<√(C1/(Wε5))maxPΓ|Φ~0(P)-Φ(P)|}≥1-ε.

Thus (D.1) is proved for m=1 once we pick a W4 such that, when W≥W4,

(D.7)√(C1/(Wε5))≤λ.
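Reading (D.7) as √(C1/(Wε5)) ≤ λ, the smallest admissible stage size is W4 = ⌈C1/(ε5λ2)⌉; once λ < 1 is fixed, the per-stage error bound contracts geometrically in the stage index m. A sketch with placeholder constants (C1, ε5 and λ are illustrative, not values from the paper):

```python
import math

# Placeholder constants: the bound C1 from (D.5), the Chebyshev tolerance
# eps5, and the target contraction factor lam < 1 (all illustrative).
C1, eps5, lam = 2.0, 0.05, 0.5

# Smallest W with sqrt(C1 / (W * eps5)) <= lam:
W4 = math.ceil(C1 / (eps5 * lam ** 2))

# With W >= W4, each stage multiplies the error bound by lam, so after m
# stages the bound is lam**m times the initial error: geometric convergence.
error_bounds = [lam ** m for m in range(6)]
print(W4, error_bounds)
```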

We now prove (D.1) for the general case. Again, according to Chebyshev’s inequality,

𝒫{|Φ(P)-τWm-1(P)|<√(Vm-1(P)/ε5)}≥1-ε5,

or after replacing τWm-1(P) by Φ~m-1(P),

(D.8)𝒫{|Φ(P)-Φ~m-1(P)|<√(Vm-1(P)/ε5)}≥1-ε5,

or

(D.9)𝒫{Φ(P)-√(Vm-1(P)/ε5)<Φ~m-1(P)<Φ(P)+√(Vm-1(P)/ε5)}≥1-ε5.

According to Theorem 1,

(D.10)Vm-1(P)≤C2W,

where C2 does not depend on the stage m. Substituting (D.10) into (D.9) produces

𝒫{minPΓΦ(P)-√(C2/(Wε5))<Φ~m-1(P)<maxPΓΦ(P)+√(C2/(Wε5))}≥1-ε5,

or by (2.6),

𝒫{δΦ-√(C2/(Wε5))<Φ~m-1(P)<MΦ+√(C2/(Wε5))}≥1-ε5.

Noticing (4.2), we can find a W2 such that

δΦ-√(C2/(W2ε5))≥δΦ~,
MΦ+√(C2/(W2ε5))≤MΦ~.

We then have

(D.11)𝒫{δΦ~<Φ~m-1(P)<MΦ~}≥1-ε5.

In order to prove the second inequality of (D.1), we notice

maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
(D.12)≤1+κminPΓΦ~m-1(P)maxPΓ|Φ(P)-Φ~m-1(P)|.

Combining (D.11) with (D.12) and applying Proposition 2 (about the lower bound),

𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
(D.13)≤1+κδΦ~maxPΓ|Φ(P)-Φ~m-1(P)|}>1-ε5.

Combining (D.8) with (D.13) and applying Proposition 2, we obtain

𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|≤1+κδΦ~√(Vm-1(P)/ε5)}>1-2ε5.

Applying (D.10), we obtain

(D.14)𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|≤1+κδΦ~√(C2/(Wε5))}>1-2ε5.

We can then pick a W3 such that

(D.15)1+κδΦ~√(C2/(W3ε5))≤1-κpγ.

Combining (D.14) with (D.15), when W≥W3,

(D.16)𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|≤1-κpγ}≥1-2ε5,

which is the second inequality of (D.1).

The third inequality of (D.1) follows by repeating the steps from (D.2) through (D.7) with the superscript 1 replaced by m and 0 replaced by m-1. In particular, the same W4 determined by (D.7) suffices: when W≥W4,

(D.17)𝒫{maxPΓ|Φ(P)-Φ~m(P)|<λmaxPΓ|Φ~m-1(P)-Φ(P)|}≥1-ε.

Thus, when W≥W0=max{W1,W2,W3,W4}, (D.11), (D.16) and (D.17) all hold.

The proof of Theorem 1 is completed. ∎

E Geometric convergence for expansion in basis functions

Proof of Theorem 1.

Using the expressions of the true solution Φ(P) by (9.1) and the m-th stage approximation Φ~m(P) by (9.3), we have

|Φ(P)-Φ~m(P)|=|i=0aifi(P)-i=0Na~imfi(P)|
=|i=0N(ai-a~im)fi(P)+i=N+1aifi(P)|
≤i=0N|ai-a~im||fi(P)|+rN
(E.1)≤Mfi=0N|ai-a~im|+rN.

Therefore, estimating |Φ(P)-Φ~m(P)| becomes estimating |ai-a~im|.
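The splitting in (E.1) is the standard truncation argument: the first N+1 coefficient errors are controlled by the adaptive stages, while the discarded tail contributes at most rN. A toy numerical illustration of the bound (ours; the cosine basis, the coefficients and the perturbations are placeholders):

```python
import math

N = 4
M_f = 1.0                                    # |cos(i x)| <= 1 for every i
a = [1.0 / (i + 1) ** 2 for i in range(20)]                # "true" coefficients
a_tilde = [a[i] + 0.01 * (-1) ** i for i in range(N + 1)]  # perturbed estimates

r_N = sum(abs(a[i]) for i in range(N + 1, 20))  # tail remainder (finite here)

x = 0.7
phi = sum(a[i] * math.cos(i * x) for i in range(20))                  # Phi(x)
phi_tilde = sum(a_tilde[i] * math.cos(i * x) for i in range(N + 1))   # Phi~(x)

bound = M_f * sum(abs(a[i] - a_tilde[i]) for i in range(N + 1)) + r_N
print(abs(phi - phi_tilde), bound)  # the error never exceeds the bound
```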

The proof is similar to that of Theorem 1. We will prove by induction that, for any m≥1,

(E.2){𝒫{δΦ~≤Φ~m-1(P)≤MΦ~}≥1-ε5,𝒫{maxPΓ|Dm-1(P)|≤1-κpγ}≥1-2ε5,𝒫{maxPΓ|Φ(P)-Φ~m(P)|<λmaxPΓ|Φ~m-1(P)-Φ(P)|+rN}≥1-ε.

We have added the first two inequalities to the list because they are needed to prove the third.

For m=1, the first two inequalities are trivial, as conditions (9.4) and (9.5) imply them. To prove the third one, we appeal to Theorem 1, the first inequality of (7.4),

(E.3)𝒫{V1≤c1WmaxPΓ|D0(P)|2+c2W(Γ(Φ~0(Q)-Φ(Q))S(Q)𝑑QΓΦ~0(Q)S(Q)𝑑Q)2}≥1-3ε5,

where we have taken ε1=ε5 and ε2=2ε5.

We need to estimate maxPΓ|D0(P)|2 and maxPΓ|Φ~0(P)-Φ(P)Φ~0(P)|2 by maxPΓ|Φ~0(P)-Φ(P)|2. We have

maxPΓ|D0(P)|2=maxPΓ|ΓK(P,Q)(Φ~0(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~0(P))Φ~0(P)|2
(E.4)≤(1+κ)2(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2

and

(E.5)maxPΓ|Φ~0(P)-Φ(P)Φ~0(P)|2≤1(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2.

From (E.3), (E.4) and (E.5), we obtain

(E.6)𝒫{V1≤c~1(1+κ)2+c~2W(minPΓΦ~0(P))2maxPΓ|Φ~0(P)-Φ(P)|2}≥1-3ε5.

Applying the first inequality (E.2) and Proposition 2, we obtain

(E.7)𝒫{V1≤C1WmaxPΓ|Φ~0(P)-Φ(P)|2}≥1-4ε5,

where C1=(c~1(1+κ)2+c~2)/δΦ~2. Next, according to Chebyshev’s inequality,

𝒫{|ai1-τW1(P)|<V1(P)ε5}≥1-ε5(N+1),

or after we replace τW1 by a~i1,

(E.8)𝒫{|ai1-a~i1|<V1ε5}≥1-ε5(N+1).

Combining (E.1) with (E.7) and (E.8) and noticing Proposition 2, we obtain

𝒫{|Φ(P)-Φ~1(P)|<MfC1(N+1)Wε5maxPΓ|Φ~0(P)-Φ(P)|+rN}≥1-ε.

Thus (E.2) is proved for m=1 once we pick a W4 such that, when W≥W4,

(E.9)MfC1(N+1)Wε5≤λ.

We now prove (E.2) for the general case. Again, according to Chebyshev’s inequality,

𝒫{|aim-1-τWm-1|<Vm-1ε5}≥1-ε5(N+1),

or after replacing τWm-1 by a~im-1,

𝒫{|aim-1-a~im-1|<Vm-1ε5}≥1-ε5(N+1),

which leads to

(E.10)𝒫{|Φ(P)-Φ~m-1(P)|<MfVm-1(N+1)ε5+rN}≥1-ε5,

or using Proposition 2,

(E.11)𝒫{Φ(P)-MfVm-1(N+1)ε5-rN<Φ~m-1(P)<Φ(P)+MfVm-1(N+1)ε5+rN}≥1-ε5.

According to Theorem 1,

(E.12)Vm-1≤C2W,

where C2 does not depend on the stage m. Substituting (E.12) into (E.11) produces

𝒫{minPΓΦ(P)-MfC2(N+1)Wε5-rN<Φ~m-1(P)<maxPΓΦ(P)+MfC2(N+1)Wε5+rN}≥1-ε5,

or by (2.6),

𝒫{δΦ-MfC2(N+1)Wε5-rN<Φ~m-1(P)<MΦ+MfC2(N+1)Wε5+rN}≥1-ε5.

Noticing (4.2), we can find a W2 such that

δΦ-MfC2(N+1)W2ε5-rN≥δΦ~,
MΦ+MfC2(N+1)W2ε5+rN≤MΦ~.

We then have

(E.13)𝒫{δΦ~<Φ~m-1(P)<MΦ~}≥1-ε5.

In order to prove the second inequality of (E.2), we notice

maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
(E.14)≤1+κminPΓΦ~m-1(P)maxPΓ|Φ(P)-Φ~m-1(P)|.

Combining (E.13) with (E.14) and applying Proposition 2 (about the lower bound),

𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
(E.15)≤1+κδΦ~maxPΓ|Φ(P)-Φ~m-1(P)|}>1-ε5.

Combining (E.10) with (E.15) and applying Proposition 2, we obtain

𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
≤1+κδΦ~MfVm-1(N+1)ε5+1+κδΦ~rN}>1-2ε5.

Applying (E.12), we obtain

𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|
(E.16)≤1+κδΦ~MfC2(N+1)Wε5+1+κδΦ~rN}>1-2ε5.

We can pick a sufficiently large N to make rN sufficiently small and then pick a W3 such that

(E.17)1+κδΦ~MfC2(N+1)W3ε5+1+κδΦ~rN≤1-κpγ.

Combining (E.16) with (E.17), when W≥W3,

(E.18)𝒫{maxPΓ|ΓK(P,Q)(Φ~m-1(Q)-Φ(Q))𝑑Q+(Φ(P)-Φ~m-1(P))Φ~m-1(P)|≤1-κpγ}≥1-2ε5,

which is the second inequality of (E.2).

The third inequality of (E.2) follows by repeating the steps from (E.6) through (E.9) with the superscript 1 replaced by m and 0 replaced by m-1. In particular, the same W4 determined by (E.9) suffices: when W≥W4,

(E.19)𝒫{maxPΓ|Φ(P)-Φ~m(P)|<λmaxPΓ|Φ~m-1(P)-Φ(P)|+rN}≥1-ε.

Thus, when W≥W0=max{W1,W2,W3,W4}, (E.13), (E.18) and (E.19) all hold.

The proof of Theorem 1 is completed. ∎

References

[1] Booth T., Exponential convergence for Monte Carlo particle transport, Trans. Amer. Nuclear Soc. 50 (1985), 267–268.

[2] Booth T., Zero-variance solutions for linear Monte Carlo, Nuclear Sci. Eng. 102 (1989), 332–340. 10.13182/NSE89-A23646

[3] Booth T., Exponential convergence on a continuous Monte Carlo transport problem, Nuclear Sci. Eng. 127 (1997), 338–345. 10.13182/NSE97-A1939

[4] Case K. M. and Zweifel P. W., Linear Transport Theory, Addison-Wesley, Reading, 1967.

[5] Kong R. and Spanier J., Error analysis of sequential Monte Carlo methods for transport problems, Monte Carlo and Quasi-Monte Carlo Methods 1998 (Claremont 1998), Springer, Berlin (2000), 252–272. 10.1007/978-3-642-59657-5_17

[6] Kong R. and Spanier J., Sequential correlated sampling methods for some transport problems, Monte Carlo and Quasi-Monte Carlo Methods 1998 (Claremont 1998), Springer, Berlin (2000), 238–251. 10.1007/978-3-642-59657-5_16

[7] Kong R. and Spanier J., Residual versus error in transport problems, Monte Carlo and Quasi-Monte Carlo Methods 2000 (Hong Kong 2000), Springer, Berlin (2002), 306–317. 10.1007/978-3-642-56046-0_20

[8] Kong R. and Spanier J., A new proof of geometric convergence for general transport problems based on sequential correlated sampling methods, J. Comput. Phys. 227 (2008), 9762–9777. 10.1016/j.jcp.2008.07.016

[9] Kong R. and Spanier J., Geometric convergence of adaptive Monte Carlo algorithms for radiative transport problems based on importance sampling methods, Nuclear Sci. Eng. 168 (2011), 197–225. 10.13182/NSE10-29

[10] Lai Y. and Spanier J., Adaptive importance sampling algorithms for transport problems, Monte Carlo and Quasi-Monte Carlo Methods 1998 (Claremont 1998), Springer, Berlin (2000), 273–283. 10.1007/978-3-642-59657-5_18

[11] Powell M. J. D. and Swann J., Weighted uniform sampling – a Monte Carlo technique for reducing variance, J. Inst. Math. Appl. 2 (1966), 228–236. 10.1093/imamat/2.3.228

[12] Ross S., A First Course in Probability, 5th ed., Prentice-Hall, Upper Saddle River, 1998.

[13] Spanier J. and Gelbard E. M., Monte Carlo Principles and Neutron Transport Problems, Addison-Wesley, Reading, 1969.

[14] Spanier J. and Kong R., A new adaptive method for geometric convergence, Monte Carlo and Quasi-Monte Carlo Methods 2002 (Singapore 2002), Springer, Berlin (2004), 439–449. 10.1007/978-3-642-18743-8_27

[15] Silverman B. W., Density Estimation for Statistics and Data Analysis, Chapman and Hall, London, 1986.

[16] Advanced Monte Carlo Methods, CRIAMS report LANL-03-001 to Los Alamos National Laboratory, February 2003.

Received: 2016-2-5
Accepted: 2016-6-15
Published Online: 2016-6-30
Published in Print: 2016-9-1

© 2016 by De Gruyter
