
Probabilistic system analysis – Part III

Limit theorems

Chebyshev’s inequality

$$\sigma^2=\int_{-\infty}^{\infty}(x-\mu)^2f_X(x)\,dx\ \ge\ \int_{-\infty}^{\mu-c}(x-\mu)^2f_X(x)\,dx+\int_{\mu+c}^{\infty}(x-\mu)^2f_X(x)\,dx\ \ge\ c^2\int_{-\infty}^{\mu-c}f_X(x)\,dx+c^2\int_{\mu+c}^{\infty}f_X(x)\,dx=c^2P(|X-\mu|\ge c)$$

$$P(|X-\mu|\ge c)\le\frac{\sigma^2}{c^2},\qquad P(|X-\mu|\ge k\sigma)\le\frac{1}{k^2}$$
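As a quick sanity check (not part of the original notes), we can verify the bound numerically; the exponential(1) distribution here is an arbitrary choice with $\mu=\sigma=1$:

```python
import random

# Monte Carlo check of Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2
# for an exponential(1) random variable (mu = sigma = 1), chosen arbitrarily.
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
mu = sigma = 1.0

tails = {}
for k in (2, 3, 4):
    tails[k] = sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)
    print(f"k={k}: empirical tail {tails[k]:.4f} <= bound {1 / k**2:.4f}")
```

The bound is loose for this distribution (the true tails decay exponentially), which is typical: Chebyshev trades tightness for generality.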

Convergence

$$M_n=\frac{X_1+\cdots+X_n}{n}$$

$$E[M_n]=\frac{E[X_1]+\cdots+E[X_n]}{n}=\frac{n\mu}{n}=\mu$$

$$\operatorname{var}(M_n)=\frac{\operatorname{var}(X_1)+\cdots+\operatorname{var}(X_n)}{n^2}=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\to 0$$

$$P(|M_n-\mu|\ge\varepsilon)\le\frac{\operatorname{var}(M_n)}{\varepsilon^2}=\frac{\sigma^2}{n\varepsilon^2}\to 0$$
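A small simulation illustrates this concentration; fair-coin flips ($\mu=0.5$, $\sigma^2=0.25$) are a made-up choice for the demonstration:

```python
import random

# Weak law of large numbers: the sample mean M_n of n fair-coin flips
# concentrates around mu = 0.5; var(M_n) = sigma^2 / n = 0.25 / n shrinks.
random.seed(1)

def sample_mean(n):
    return sum(random.random() < 0.5 for _ in range(n)) / n

emp_var = {}
for n in (100, 10_000):
    means = [sample_mean(n) for _ in range(200)]
    avg = sum(means) / len(means)
    emp_var[n] = sum((m - avg) ** 2 for m in means) / len(means)
    print(f"n={n}: empirical var(M_n)={emp_var[n]:.2e}, theory={0.25 / n:.2e}")
```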

Example

Bernoulli process: $\sigma_x^2=p(1-p)\le\frac14$

95% confidence of an error smaller than 1%: $P(|M_n-\mu|\ge 0.01)\le 0.05$

$$P(|M_n-\mu|\ge 0.01)\le\frac{\sigma_{M_n}^2}{0.01^2}=\frac{\sigma_x^2}{n\,(0.01)^2}\le\frac{1}{4n(0.01)^2}\le 0.05\quad\Longrightarrow\quad n\ge 50{,}000$$
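Solving the last inequality for $n$ can be done exactly; using `Fraction` avoids any floating-point rounding at the boundary:

```python
from fractions import Fraction
import math

# Smallest n with 1 / (4 * n * eps^2) <= alpha, i.e. n >= 1 / (4 * alpha * eps^2).
eps = Fraction(1, 100)    # 1% error tolerance
alpha = Fraction(5, 100)  # 5% tail probability
n_min = math.ceil(1 / (4 * alpha * eps**2))
print(n_min)  # 50000
```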

Scaling of Mn

$$S_n=X_1+\cdots+X_n,\qquad \operatorname{var}(S_n)=n\sigma^2$$

$$M_n=\frac{S_n}{n},\qquad \operatorname{var}(M_n)=\frac{\sigma^2}{n}$$

$$\frac{S_n}{\sqrt n}:\qquad \operatorname{var}\!\left(\frac{S_n}{\sqrt n}\right)=\frac{\operatorname{var}(S_n)}{n}=\sigma^2$$

Central limit theorem

Standardized $S_n=X_1+\cdots+X_n$:

$$Z_n=\frac{S_n-E[S_n]}{\sigma_{S_n}}=\frac{S_n-nE[X]}{\sqrt n\,\sigma}\ \longrightarrow\ N(0,1)$$

$$P(Z_n\le c)\to P(Z\le c)=\Phi(c)$$

The CLT states that the CDF of $Z_n$ converges to the standard normal CDF; it says nothing about convergence of the PDF or PMF.
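A quick Monte Carlo check of this convergence, using a sum of fair Bernoulli trials as an arbitrary example ($\mu=0.5$, $\sigma=0.5$):

```python
import math, random

# Monte Carlo check of the CLT: the CDF of Z_n = (S_n - n*mu)/(sqrt(n)*sigma)
# approaches Phi(c).  S_n is a sum of n fair Bernoulli trials, so
# Z_n = (S_n - n/2) / (sqrt(n)/2).
random.seed(2)

def phi(c):  # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(c / math.sqrt(2)))

n, reps, c = 100, 10_000, 1.0
hits = 0
for _ in range(reps):
    s = sum(random.random() < 0.5 for _ in range(n))
    z = (s - n * 0.5) / (math.sqrt(n) * 0.5)
    hits += z <= c
print(f"P(Z_n <= 1) ~ {hits / reps:.3f}, Phi(1) = {phi(c):.3f}")
```

The residual gap comes from the discreteness of $S_n$, which is exactly what the 1/2 correction below addresses.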

Example: 

$n=36$, $p=0.5$; find $P(S_n\le 21)$.

Exact answer:

$$\sum_{k=0}^{21}\binom{36}{k}\left(\frac12\right)^{36}=0.8785$$

Method 1: Central limit theorem

$$Z_n=\frac{S_n-nE[X]}{\sqrt n\,\sigma}=\frac{S_n-np}{\sqrt{np(1-p)}}$$

$$E[S_n]=np=36\cdot 0.5=18,\qquad \sigma_{S_n}^2=np(1-p)=9$$

$$P(S_n\le 21)\approx P\!\left(\frac{S_n-18}{3}\le\frac{21-18}{3}\right)=P(Z_n\le 1)\approx\Phi(1)=0.8413$$

Method 2: 1/2 correction for binomial approximation

$$P(S_n\le 21)=P(S_n<22)\qquad(S_n\text{ is an integer})$$

$$\approx P(S_n\le 21.5)\approx P\!\left(Z_n\le\frac{21.5-18}{3}\right)=P(Z_n\le 1.17)\approx 0.879$$

This answer is closer to the exact value of 0.8785.
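The three answers can be computed side by side to see the improvement from the correction:

```python
import math

# P(S_n <= 21) for n = 36, p = 0.5: exact binomial sum, plain CLT,
# and CLT with the 1/2 correction.
def phi(c):  # standard normal CDF
    return 0.5 * (1 + math.erf(c / math.sqrt(2)))

n, p = 36, 0.5
mean, sd = n * p, math.sqrt(n * p * (1 - p))  # 18 and 3

exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(22))
clt = phi((21 - mean) / sd)          # Phi(1)
corrected = phi((21.5 - mean) / sd)  # Phi(7/6)
print(f"exact={exact:.4f}, CLT={clt:.4f}, with correction={corrected:.4f}")
```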

De Moivre-Laplace CLT

When 1/2 correction is used, CLT can also approximate the binomial pmf, not just the binomial CDF.

$$P(S_n=19)=P(18.5\le S_n\le 19.5)\approx P(0.17\le Z_n\le 0.5)=\Phi(0.5)-\Phi(0.17)=0.124$$

Exact answer: $P(S_n=19)=\binom{36}{19}\left(\frac12\right)^{36}=0.125$
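The pmf approximation can be checked the same way:

```python
import math

# De Moivre-Laplace pmf approximation versus the exact binomial pmf
# for P(S_n = 19), n = 36, p = 0.5.
def phi(c):  # standard normal CDF
    return 0.5 * (1 + math.erf(c / math.sqrt(2)))

n, p, k = 36, 0.5, 19
mean, sd = n * p, math.sqrt(n * p * (1 - p))
approx = phi((k + 0.5 - mean) / sd) - phi((k - 0.5 - mean) / sd)
exact = math.comb(n, k) * p**k * (1 - p)**(n - k)
print(f"approx={approx:.3f}, exact={exact:.3f}")
```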

 

The number of Poisson arrivals during a unit interval equals the sum of $n$ independent Poisson arrival counts over $n$ intervals of length $1/n$: $X=X_1+\cdots+X_n$, with $E[X]=1$. As $n\to\infty$, the distribution of each $X_i$ changes (its mean and variance shrink), so the CLT cannot be applied here. For the Binomial$(n,p)$:

$$p\text{ fixed},\ n\to\infty:\ \text{normal}\qquad\qquad np\text{ fixed},\ n\to\infty:\ \text{Poisson}$$

Bayesian statistical inference

Types of Inference models/approaches

$$X=aS+W$$

#1. Model building: know “signal” S, observe X, infer a

#2. Inferring unknown variables: know a, observe X, infer S

#3. Hypothesis testing: the unknown takes one of a few possible values; aim at a small probability of an incorrect decision

#4. Estimation: aim at a small estimation error

Here $\theta$ is an unknown parameter. The difference between classical and Bayesian statistics is that $\theta$ is a constant in classical statistics, but a random variable in the Bayesian approach.

$$p_{\Theta|X}(\theta|x)=\frac{p_\Theta(\theta)\,f_{X|\Theta}(x|\theta)}{f_X(x)}\quad\text{(hypothesis testing)}\qquad\qquad f_{\Theta|X}(\theta|x)=\frac{f_\Theta(\theta)\,f_{X|\Theta}(x|\theta)}{f_X(x)}\quad\text{(estimation)}$$

Maximum a posteriori probability (MAP): choose the single value of $\theta$ that maximizes the posterior probability. It is often used in hypothesis testing, but it may be misleading when compared with the conditional expectation.

Least mean squares estimation (LMS): find $c$ to minimize $E[(\Theta-c)^2]$; the minimizer is $c=E[\Theta]$, and the optimal mean squared error is $\operatorname{var}(\Theta)=E[(\Theta-E[\Theta])^2]$, summarized as

$$\text{minimize } E[(\Theta-c)^2]\ \Longrightarrow\ c=E[\Theta],\qquad E[(\Theta-E[\Theta])^2]=\operatorname{var}(\Theta)$$

$$E[(\Theta-c)^2]=E[\Theta^2]-2cE[\Theta]+c^2,\qquad \frac{d}{dc}E[(\Theta-c)^2]=-2E[\Theta]+2c=0$$
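A grid search over $c$ confirms the minimizer; the discrete distribution for $\Theta$ below is made up for illustration:

```python
# Grid-search check that c = E[Theta] minimizes E[(Theta - c)^2],
# using a small made-up discrete distribution for Theta.
vals = [0.0, 1.0, 3.0]
probs = [0.2, 0.5, 0.3]
e_theta = sum(v * p for v, p in zip(vals, probs))  # E[Theta] = 1.4

def mse(c):
    return sum(p * (v - c) ** 2 for v, p in zip(vals, probs))

best = min(range(0, 301), key=lambda c: mse(c / 100)) / 100
print(f"E[Theta] = {e_theta:.2f}, grid minimizer = {best:.2f}")
```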

LMS estimation of two random variables

$$\text{minimize } E[(\Theta-c)^2\mid X=x]\ \Longrightarrow\ c=E[\Theta\mid X=x]$$

$$\operatorname{var}(\Theta\mid X=x)=E\big[(\Theta-E[\Theta\mid X=x])^2\mid X=x\big]\le E\big[(\Theta-g(x))^2\mid X=x\big]$$

$$E\big[(\Theta-E[\Theta\mid X])^2\mid X\big]\le E\big[(\Theta-g(X))^2\mid X\big]$$

$$E\big[(\Theta-E[\Theta\mid X])^2\big]\le E\big[(\Theta-g(X))^2\big]$$

LMS estimation with several measurements: $E[\Theta\mid X_1,\dots,X_n]$ is hard to compute and involves multi-dimensional integrals, etc.

Estimator: $\hat\Theta=E[\Theta\mid X]$, with estimation error $\tilde\Theta=\hat\Theta-\Theta$

$$E[\tilde\Theta\mid X]=E[\hat\Theta-\Theta\mid X]=E[\hat\Theta\mid X]-E[\Theta\mid X]=\hat\Theta-\hat\Theta=0$$

$$E[\tilde\Theta\,h(X)\mid X]=h(X)\,E[\tilde\Theta\mid X]=0\ \Longrightarrow\ E[\tilde\Theta\,h(X)]=0$$

$$\operatorname{cov}(\tilde\Theta,\hat\Theta)=E\big[(\tilde\Theta-E[\tilde\Theta])(\hat\Theta-E[\hat\Theta])\big]=E\big[\tilde\Theta(\hat\Theta-E[\hat\Theta])\big]=0$$

$$\Theta=\hat\Theta-\tilde\Theta\ \Longrightarrow\ \operatorname{var}(\Theta)=\operatorname{var}(\hat\Theta)+\operatorname{var}(\tilde\Theta)$$

Linear LMS

Consider estimators of $\Theta$ of the form $\hat\Theta=aX+b$ and minimize $E[(\Theta-aX-b)^2]$:

$$\hat\Theta_L=E[\Theta]+\frac{\operatorname{cov}(X,\Theta)}{\operatorname{var}(X)}\,(X-E[X])$$

$$E[(\hat\Theta_L-\Theta)^2]=(1-\rho^2)\,\sigma_\Theta^2$$
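A simulation verifies this MSE formula in a made-up setting: $\Theta\sim N(0,1)$ and $X=\Theta+W$ with $W\sim N(0,1)$ independent, so $\operatorname{cov}(X,\Theta)=1$, $\operatorname{var}(X)=2$, $\hat\Theta_L=X/2$, and $\rho^2=1/2$ gives a theoretical MSE of $0.5$:

```python
import random

# Linear LMS sketch under made-up assumptions: Theta ~ N(0,1),
# X = Theta + W with W ~ N(0,1) independent.  Then Theta_L = X/2
# and the theoretical MSE is (1 - rho^2) * sigma_Theta^2 = 0.5.
random.seed(3)
reps = 50_000
se = 0.0
for _ in range(reps):
    theta = random.gauss(0, 1)
    x = theta + random.gauss(0, 1)
    theta_l = x / 2  # E[Theta] + cov(X,Theta)/var(X) * (x - E[X])
    se += (theta_l - theta) ** 2
mse = se / reps
print(f"empirical MSE = {mse:.3f} (theory 0.5)")
```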

Consider estimators of the form $\hat\Theta=a_1X_1+\cdots+a_nX_n+b$ and minimize $E[(a_1X_1+\cdots+a_nX_n+b-\Theta)^2]$. Setting the derivatives to zero gives a linear system in $b$ and the $a_i$; only means, variances, and covariances matter.

Cleanest linear LMS example

$X_i=\Theta+W_i$, where $\Theta,W_1,\dots,W_n$ are independent, $\Theta\sim(\mu,\sigma_0^2)$ and $W_i\sim(0,\sigma_i^2)$; then

$$\hat\Theta_L=\frac{\mu/\sigma_0^2+\sum_{i=1}^n X_i/\sigma_i^2}{\sum_{i=0}^n 1/\sigma_i^2}$$
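The formula is a precision-weighted average of the prior mean and the observations. A tiny worked instance (all numbers made up for illustration):

```python
# Precision-weighted linear LMS estimate with measurements X_i = Theta + W_i.
# Made-up numbers: prior Theta ~ (mu = 0, sigma_0^2 = 1), two observations.
mu, var0 = 0.0, 1.0
xs = [1.2, 0.8]          # observed values
noise_vars = [0.5, 2.0]  # sigma_i^2 for each measurement

num = mu / var0 + sum(x / v for x, v in zip(xs, noise_vars))
den = 1 / var0 + sum(1 / v for v in noise_vars)
theta_l = num / den
print(f"Theta_L = {theta_l:.4f}")  # (0 + 2.4 + 0.4) / (1 + 2 + 0.5) = 0.8
```

Note that the precise measurement ($\sigma_1^2=0.5$) pulls the estimate much harder than the noisy one.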

Classical statistical inference

Problem types

#1: hypothesis testing, $H_0:\theta=1/2$ versus $H_1:\theta=3/4$

#2: composite hypotheses, $H_0:\theta=1/2$ versus $H_1:\theta\ne 1/2$

#3: estimation, design an estimator $\hat\Theta$ to keep the estimation error $\hat\Theta-\theta$ small

Maximum likelihood estimation: pick θ that makes data most likely

$$\hat\theta_{ML}=\arg\max_\theta\, p_X(x;\theta)$$

Example: $X_i$ i.i.d. exponential with unknown rate $\theta$:

$$\max_\theta \prod_{i=1}^n \theta e^{-\theta x_i}\ \Longleftrightarrow\ \max_\theta\Big(n\log\theta-\theta\sum_{i=1}^n x_i\Big)\ \Longrightarrow\ \frac{n}{\theta}-\sum_{i=1}^n x_i=0\ \Longrightarrow\ \hat\theta_{ML}=\frac{n}{x_1+\cdots+x_n}$$
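The resulting estimator is just the reciprocal of the sample mean. A simulation sanity check with a made-up true rate $\theta=2$:

```python
import random

# ML for i.i.d. exponential samples: theta_ML = n / (x_1 + ... + x_n).
# Simulation sanity check with a made-up true rate theta = 2.
random.seed(4)
theta_true = 2.0
xs = [random.expovariate(theta_true) for _ in range(100_000)]
theta_ml = len(xs) / sum(xs)
print(f"theta_ML = {theta_ml:.3f} (true rate {theta_true})")
```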

Desirable properties of estimators

(1) unbiased: $E[\hat\Theta_n]=\theta$

(2) consistent: $\hat\Theta_n\to\theta$ in probability

(3) small mean squared error (MSE): $E[(\hat\Theta-\theta)^2]=\operatorname{var}(\hat\Theta-\theta)+\big(E[\hat\Theta-\theta]\big)^2=\operatorname{var}(\hat\Theta)+(\text{bias})^2$
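The bias-variance decomposition in (3) can be checked empirically; the deliberately biased estimator $S_n/(n+1)$ (a shrunk sample mean, chosen here just for illustration) makes both terms nonzero:

```python
import random

# Bias-variance check: estimate theta = E[X] with the deliberately biased
# estimator S_n / (n + 1).  The decomposition MSE = var + bias^2 should hold.
random.seed(5)
theta, n, reps = 1.0, 10, 20_000
ests = []
for _ in range(reps):
    s = sum(random.gauss(theta, 1.0) for _ in range(n))
    ests.append(s / (n + 1))
mean_est = sum(ests) / reps
mse = sum((e - theta) ** 2 for e in ests) / reps
var = sum((e - mean_est) ** 2 for e in ests) / reps
bias = mean_est - theta
print(f"MSE = {mse:.4f}, var + bias^2 = {var + bias**2:.4f}")
```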

Confidence intervals (CIs)

A $1-\alpha$ confidence interval is a random interval $[\hat\Theta_n^-,\hat\Theta_n^+]$ such that

$$P(\hat\Theta_n^-\le\theta\le\hat\Theta_n^+)\ge 1-\alpha$$

CI in estimation of the mean, $\hat\Theta_n=(X_1+\cdots+X_n)/n$:

$$\Phi(1.96)=1-0.05/2=0.975$$

$$P\!\left(\frac{|\hat\Theta_n-\theta|}{\sigma/\sqrt n}\le 1.96\right)\approx 0.95$$

$$P\!\left(\hat\Theta_n-\frac{1.96\,\sigma}{\sqrt n}\le\theta\le\hat\Theta_n+\frac{1.96\,\sigma}{\sqrt n}\right)\approx 0.95$$

more generally

$$\Phi(z)=1-\alpha/2\ \Longrightarrow\ P\!\left(\hat\Theta_n-\frac{z\sigma}{\sqrt n}\le\theta\le\hat\Theta_n+\frac{z\sigma}{\sqrt n}\right)\approx 1-\alpha$$

Since $\sigma$ is usually unknown, we must estimate it:

Option 1: use an upper bound on $\sigma$; for a Bernoulli random variable, $\sigma=\sqrt{p(1-p)}\le 1/2$

Option 2: use an ad hoc estimate of $\sigma$; for a Bernoulli random variable with parameter $\theta$, $\hat\sigma=\sqrt{\hat\Theta(1-\hat\Theta)}$

Option 3: generic estimation of the variance

$$\sigma^2=E[(X_i-\theta)^2],\qquad \hat\sigma_n^2=\frac1n\sum_{i=1}^n(X_i-\theta)^2\to\sigma^2,\qquad \hat S_n^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\hat\Theta_n)^2\to\sigma^2$$
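Putting option 3 together with the CI formula; the $N(5,2^2)$ data below are simulated purely for illustration:

```python
import math, random

# 95% CI for the mean using option 3's variance estimate S_n^2
# (divide by n - 1).  Data are simulated from N(5, 2^2) — made-up numbers.
random.seed(6)
xs = [random.gauss(5.0, 2.0) for _ in range(400)]
n = len(xs)
mean = sum(xs) / n
s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
half = 1.96 * math.sqrt(s2 / n)
print(f"95% CI = [{mean - half:.3f}, {mean + half:.3f}]")
```

With $n=400$ and $\sigma=2$ the half-width should be close to $1.96\cdot 2/\sqrt{400}\approx 0.196$.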
