Essay - Vrije Universiteit Amsterdam

Organisatie
Essay
College 1:
Introductie
Monte Carlo Simulatie
Discrete Event Simulatie
Java
A.A.N. Ridder
Department EOR
Vrije Universiteit Amsterdam
Homepage:
http://personal.vu.nl/a.a.n.ridder/essay/default.htm
I
Twee keer college (6 en 8 januari);
I
Groepjes van twee;
I
Simulatieprogramma van een wachtrijmodel in Java;
I
Cases;
I
Verslag inleveren eind vd maand;
I
Later mondelinge toelichting;
6 Januari 2015
c Ad Ridder (VU)
Essay – January 2015
1 / 83
Informatie
c Ad Ridder (VU)
Essay – January 2015
2 / 83
Enige DES boeken
Wachttijdtheorie
Hoofdstuk 5 in
Henk C. Tijms
A First Course in Stochastic Models
Wiley, 2003
Jerry Banks, John Carson, Barry Nelson and David Nicol.
Discrete-Event System Simulation, Fourth edition, Pearson 2005.
Website (Internet)
http://personal.vu.nl/a.a.n.ridder/
essay/
´
Christos Cassandras and Stephane
Lafortune. Introduction to Discrete
Event Systems, Second edition, Springer 2008.
Louis Birta and Gilbert Arbez. Modelling and Simulation. Exploring
Dynamic System behaviour, Springer 2007.
Averill M. Law and W. David Kelton. Simulation Modeling and Analysis,
Third edition, McGraw-Hill 2000.
Website (Blackboard)
https://bb.vu.nl/
c Ad Ridder (VU)
Essay – January 2015
Lawrence Leemis and Steve Park. Discrete-Event Simulation: A First
Course, 2004.
3 / 83
c Ad Ridder (VU)
Essay – January 2015
4 / 83
Onderwerpen
Voorkennis en -kunde
1. Monte Carlo simulatie
2. Discrete-event simulatie
I
Kansrekening (SLLN; CLT)
3. Voorbeeld van een discrete-event simulatie
I
Markov-ketens (DTMC; CTMC)
4. Eindige en oneindige horizon simulatie
I
Statistiek (BTI)
5. Ingredienten van een simulatiestudie
I
Simulatie uit verdelingen (inverse transformatie; acceptatie-rejectie)
6. Kenmerken van een simulatieprogramma
I
Java programmeren (Linked List)
7. Wachtrijmodellen
8. Wachttijdtheorie
c Ad Ridder (VU)
Essay – January 2015
5 / 83
Studie van een Stochastisch Systeem
c Ad Ridder (VU)
Essay – January 2015
Methoden van studie
I
Veel bedrijfseconomische en financiele systemen bevatten onzekere
(stochastische) componenten;
I
Denk aan
I
Exact (analytisch): Markov-ketens, kansgenererende functies, Laplace
transformaties, complexe analyse.
Risico; levensduur; bederf;
I
Benaderingen: numerieke algoritmes.
Volatiliteit; koersverloop;
I
Simulatie: last resort. Meestal als discrete event simulation (DES).
I
Transporttijden; operatietijden; wachttijden;
I
Congestie; ongelukken;
I
I
I
Onderzoek en studie van het systeem nodig om die onzekerheden te
quantificeren;
I
Bv, een ziekenhuis moet weten wat de gemiddelde ‘ligtijd’ is om te
beoordelen of het aantal bedden voldoende is.
c Ad Ridder (VU)
6 / 83
Essay – January 2015
7 / 83
c Ad Ridder (VU)
Essay – January 2015
8 / 83
Simulation Model
Monte Carlo Simulatie
I
The basis of simulation is the static simulation model.
I
Also called Monte Carlo simulation.
I
A probability model is determined by random variables (or process)
X = (X1 , X2 , . . .) (input variables).
I
The output of the model is the variable Y = H(X) for some function H
(output variable).
I
The goal is to compute the performance measure
Z
` = [Y] = [H(X)] =
H(x)f (x) dx,
E
E
X
where f (x) is the pdf of the random vector X;
c Ad Ridder (VU)
Essay – January 2015
9 / 83
Probability Theory
c Ad Ridder (VU)
Essay – January 2015
10 / 83
The Monte Carlo Method
SLLN
D
When X1 , X2 , . . . are iid replications of X ∼ f then
N
1 X
a.s.
H(Xi ) →
N i=1
E[H(X)]
(N → ∞).
I
Consider finitely many iid replications X1 , X2 , . . . , XN ;
I
This is called a random sample with sample size N;
I
The mean of the associated H(Xi ) variables is a random variable; i.e., a
function on the sample space; called sample mean:
I
I
Make use of the last property;
I
By executing the simulation program, you generate observations (or
data; realizations; values; ...) x1 , x2 , . . . , xN of the random sample;
I
Compute the average of the data:
`=
N
1 X
H(xi );
N i=1
I
Then (SLLN): ` ≈ `: we say that ` is an estimate of `;
N
1 X
`b =
H(Xi ).
N i=1
I
Question: how good is the estimate?
I
Actually; it is custom in the simulation literature to just consider and
analyse the estimators;
E`b = `; i.e., `b is an unbiased estimator of the performance measure `;
I
Thus; how good is the sample mean estimator?
Properties:
I
I
For large N, according to SLLN, `b is not so random; i.e., it is almost a
constant function; i.e., `b ≈ ` for almost all samples.
c Ad Ridder (VU)
Essay – January 2015
11 / 83
c Ad Ridder (VU)
Essay – January 2015
12 / 83
Statistics
Using the CLT
Let σ 2 =
Var[H(X)] be the variance of the output variable;
CLT
D
When X1 , X2 , . . . are iid replications of X ∼ f , and `b is the sample mean based
on N samples; then
√
D
N `b − ` → N(0, σ 2 )
D
D
Interpretation: `b ≈ ` + N(0, σ 2 /N) ∼ N(`, σ 2 /N);
f`b(y)
`q = is q-th quantile of
b
distribution of `:
I
√
b ≈ σ/ N;
First: the standard error of the estimator is SE[`]
I
Suppose you estimated ` by ` based on a simulated sample of size N;
I
You might consider ` as a random realisation of of the estimator `b ∼ f`b(y).
I
Thus (see graph previous slide),
√
|` − `| < 1.96 σ/ N ⇔ 0.025 <
I
P(`b ≤ `q ) = q
D
P `b ≤ `
< 0.975
In other words, when you would generate many independent estimates
(each based on sample N), about 95% of these would be within 1.96
standard errors of the true value.
Note:
`0.025
` = `0.5
c Ad Ridder (VU)
`0.975
√
`0.975 ≈ `+1.96 σ/ N
y
Essay – January 2015
13 / 83
Random Estimation Interval
I
c Ad Ridder (VU)
Essay – January 2015
14 / 83
Confidence Interval
Generally, let α ∈ (0, 1) be your uncertainty parameter such as α = 0.05
or α = 0.1.
2
Var[Y] of the output
I
Let ` be your estimate, and suppose you know σ =
variable.
I
Let zq be the q-th quantile of the standard normal distribution; i.e.,
Φ(zq ) = (N(0, 1) ≤ zq ) = q. We saw
√ |`b − `| ≤ z1−α/2 σ/ N ≈ 1 − α
| {z }
P
I
The analysis above is the basis for reporting simulation output;
I
Your estimate is ` computed as the average of N iid output data obtained
by simulating N times the output variable Y = H(X);
I
You get a realisation of the random estimation interval:
b ` + z1−α/2 SE[`]
b
I(`) = ` − z1−α/2 SE[`],
P
I
You report this interval as a 100(1 − α)% confidence interval.
I
Interpretation: when you would generate many independent estimates
(each based on sample N), and when you would calculate the
associated confidence intervals, then about 100(1 − α)% of these
confidence intervals would cover the true value.
b
SE[`]
I
Rewrite: the unknown parameter ` satisfies
b `b + z1−α/2 SE[`])
b
P ` ∈ (|`b− z1−α/2 SE[`],
{z
}
random estimation interval
c Ad Ridder (VU)
Essay – January 2015
≈1−α
b
I(`)
15 / 83
c Ad Ridder (VU)
Essay – January 2015
16 / 83
Unknown Standard Error
Properties
I
Almost always the standard error of the estimator is unknown;
I
Equivalently, the variance σ 2 =
variable is unknown;
I
Thus variance can be estimated by the sample variance
Properties of the sample variance.
Var[Y] = Var[H(X)] of the output
(i). Unbiased estimator:
E[S2 ] = σ2 ;
P
lim
N
2
1 X
S =
H(Xi ) − `b .
N − 1 i=1
2
I
N→∞
(4.6)
N
N 1 X 2
H (Xi ) − `b2 .
N − 1 N i=1
the (Student) t distribution with N − 1 degrees of freedom.
(iv). Asymptotic normality:
√
D
N `b − ` /S → N(0, 1);
Essay – January 2015
17 / 83
Normal vs Student t
I
Essay – January 2015
18 / 83
E
Summary. Suppose we wish to compute a performance measure ` = [Y]
where the output variable Y = H(X) can be calculated as a function H of a
sequence of random input variables X. Suppose σ 2 = ar[Y] also unknown.
According to property (iii), we should use the Student t distribution in
stead of the normal distribution when constructing confidence intervals:
√
√ `b − tN−1,1−α/2 S/ N, `b + tN−1,1−α/2 S/ N ,
V
Monte Carlo algorithm
1. Repeat for i = 1, . . . , N:
(i). Generate Xi independently of previous runs;
(ii). Compute output Yi = H(Xi ).
According to property (iv), there is not much difference between the
critical points for large sample size N:
α
t9,1−α/2
t49,1−α/2
t99,1−α/2
t249,1−α/2
t999,1−α/2
z1−α/2
c Ad Ridder (VU)
c Ad Ridder (VU)
The Monte-Carlo Algorithm
where tN−1,q is the q-th quantile for the t-distribution with N − 1 degrees
of freedom.
I
∀ > 0
`b − ` D
`b − ` D
√ ∼ N(0, 1) ⇒
√ ∼ tN−1 ,
σ/ N
S/ N
b by its estimate
Hence, replace in your computations SE[`]
√
c = S/ N.
SE
c Ad Ridder (VU)
P(|S2 − σ2 | > ) = 0
(iii).
Mathematically equivalent:
S2 =
I
(ii). Strongly consistent estimator: S2 → σ 2 ; i.e.,
0.15
1.574
1.462
1.451
1.443
1.441
1.440
0.1
1.833
1.677
1.660
1.651
1.646
1.645
0.05
2.262
2.010
1.984
1.970
1.962
1.960
Essay – January 2015
2. Compute sample mean estimator for `:
0.025
2.685
2.312
2.276
2.255
2.245
2.241
N
1 X
`b =
Yi ;
N i=1
3. Compute sample variance estimator for σ 2 :
S2 =
N
1 X
b 2;
(Yi − `)
N − 1 i=1
4. Report estimate, confidence interval, and/or (estimated) standard error.
19 / 83
c Ad Ridder (VU)
Essay – January 2015
20 / 83
What is DES?
Discrete-event simulation is a computer simulation model/program of a
stochastic system that evolves dynamically in time via state information which
changes at discrete time epochs.
The components of a DES are
Discrete Event Simulation
c Ad Ridder (VU)
Essay – January 2015
21 / 83
A queueing network example
I
Entities
I
Time variable (or simulation clock)
I
System state variables
I
Events and Event list (or calendar)
I
Global variables
I
Statistics collectors (or counter variables)
c Ad Ridder (VU)
Essay – January 2015
22 / 83
Modeling aspects
As a reference model we consider a queueing network model.
c Ad Ridder (VU)
Essay – January 2015
23 / 83
I
The queueing model consists of two stations (‘queues’) Q1 and Q2 .
I
At Qi jobs arrive according to a Poisson(λi ) process.
I
There are three servers at Q1 and two servers at Q2 .
I
The service times of the servers at Qi have distribution function Gi (·).
I
After service completion at Q1 the jobs enter Q2 .
I
After service completion at Q2 a job leaves the system with probability p
or re-enters (‘feeds back’ to) Q1 with probability 1 − p.
I
Both stations have infinite waiting spaces.
I
Waiting jobs are served in order of arrival (FCFS).
c Ad Ridder (VU)
Essay – January 2015
24 / 83
Performance measures
Transient vs Steady-State
Typically we are interested in
I
waiting times or system times at the two stations;
I
waiting lines or system lines at two stations;
I
time spend in the system (sojourn time);
I
throughput (per station or from the whole system);
I
utilization;
I
...
We have to specify
I
whether we wish to estimate these performance measures for a finite
period; for instance the queueing system operates daily, opening empty
at 8.00 hr in the morning, closing at 18.00 hr in the evening.
I
or whether we wish to estimate steady-state averages; then we assume
that the system operates for an infinite time.
Let’s assume that
c Ad Ridder (VU)
Essay – January 2015
25 / 83
Entities
Essay – January 2015
26 / 83
System state
I
Actually, a system is defined to be a collection of entities, e.g., people or
cars or machines, that act and interact together.
I
Without entities, nothing would happen.
I
Entities have attributes, usually given as data values.
I
In our example the entities are (attributes between brackets)
I
c Ad Ridder (VU)
I
I
I
The system state is a collection of variables necessary to describe a
system at a particular time.
The set of all states is denoted by X ; a specific state by x ∈ X ; and the
(random) state at time t by X(t) ∈ X .
In our example the state comprises
I
jobs or customers (arrival time);
I
the number of jobs (x1 , x2 ) present at the two stations;
I
servers (idle/busy).
I
two vectors a1 , a2 of the arrival times of these waiting jobs;
I
two vectors b1 , b2 specifying the status of the servers (idle/busy).
Most often in a simulation study, we do not bother too much with entities.
c Ad Ridder (VU)
Essay – January 2015
27 / 83
c Ad Ridder (VU)
Essay – January 2015
28 / 83
Events
I
I
I
Time or clock variable
An event is an instantaneous occurrence that may change the state or
trigger a state transition.
The set of all possible events is denoted by E ; a specific event by e; the
events active in state x ∈ X by E(x) ⊂ E .
I
Events occur at some point in time.
I
For this we need a variable representing the current time of the
simulation.
I
This is also called the simulation clock or system time that measures the
elapsed simulation time.
I
Clearly we need to specify its unit size (second or minute or ...); then we
set it to zero at the start of a simulation and update it every time an event
occurs.
In our example events are
I
the arrival of a new job (at Q1 or at Q2 );
I
a service completion at one of the five servers.
c Ad Ridder (VU)
Essay – January 2015
29 / 83
Event list or calendar
Essay – January 2015
30 / 83
A Set of Alarm Clocks
I
The calendar for the simulation is a list of the events that are currently
scheduled to occur.
I
There is only one event list and it consists of the scheduled event times,
sorted by the earliest scheduled time first.
I
The event list at time t is denoted by Lev (t).
I
In our example the event list comprises
I
the next arrival time of a customer at Q1 ;
I
the next arrival time of a customer at Q2 ;
I
for any busy server the next time he is ready completing the job.
c Ad Ridder (VU)
c Ad Ridder (VU)
Essay – January 2015
I
Another view of the event list is that it is a collection of alarm clocks, one
for each scheduled event. The clocks are preset at different (random)
times in the future. For instance at some t there are 3 events scheduled:
(
)
Lev (t) =
arrival Q1
31 / 83
c Ad Ridder (VU)
arrival Q2
service
completion Q2
Essay – January 2015
32 / 83
The Event-Scheduling Approach
A walk through DES
In our queueing example we let time be measured in seconds, starting at
08:00. This is system time 0.
I
The discrete-event simulation runs as follows.
I
Suppose current time is tsim and state x ∈ X .
I
Each active event ei ∈ E(x) has an associated scheduled alarm time ti .
I
We denote the type-time pair of events by [‘A1’, <time>] for the arrival
event at time <time> at Q1 (and similarly for arrival event at Q2 );
I
The simulation clock is advanced forward to the earliest time on the event list
(when the first alarm goes off).
I
The alarm belongs to a particular event that triggers several activities in the
system.
I
I
These activities make that there will be changes in the system state and in
the event list; these need to be updated.
and by [‘D1’, 1, <time>] for departure events due to service completion
at server 1 of Q1 ; similarly for the other servers, and the other queue.
I
We start tsim = 0 with an empty system and idle servers:
I
Then the simulation clock is advanced again, etc.
x = {x1 = 0, x2 = 0, a1 = [], a2 = [], b1 = (0, 0, 0), b2 = (0, 0)}.
I
There are two active events: arrivals at the queues; the associated event
times are drawn by our random number generator, resulting. Thus
Lev = {[‘A2’, 0.817], [‘A1’, 1.284]}.
c Ad Ridder (VU)
Essay – January 2015
33 / 83
The 1-st event
c Ad Ridder (VU)
Essay – January 2015
The 2-nd event
I
The simulation time is advanced to the earliest event time 0.817
belonging to the event of an arriving job at Q2 ;
I
The simulation time is advanced to the earliest event time 1.284
belonging to the event of an arriving job at Q1 .
I
Immediately, this job enters service with server 1 of Q2 ;
I
This job enters service with server 1 of Q1 .
I
We draw a sample of the service time S2 , say 1.693, to schedule a new
event ‘D2’ with departure time 0.817 + 1.693 = 2.510;
I
We draw a sample of the service time S1 , say 2.613, to schedule a new
event ‘D1’ with departure time 1.284 + 2.613 = 3.987;
I
Also we realise a new sample of the interarrival time A2 , say 1.226, and
update the scheduled time of event ‘A2’ to 0.817 + 1.226 = 2.043;
I
We realise a new sample of the interarrival time A1 , say 0.577, and
update the scheduled time of event ‘A1’ to 1.284 + 0.577 = 1.861;
Thus the new situation is:
Thus the new situation is:
tsim = 0.817
tsim = 1.284
x = {x1 = 0, x2 = 1, a1 = [], a2 = [], b1 = (0, 0, 0), b2 = (1, 0)}
x = {x1 = 1, x2 = 1, a1 = [], a2 = [], b1 = (1, 0, 0), b2 = (1, 0)}
Lev = {[‘A1’, 1.284], [‘A2’, 2.043], [‘D2’, 1, 2.510]}
c Ad Ridder (VU)
34 / 83
Essay – January 2015
Lev = {[‘A1’, 1.861], [‘A2’, 2.043], [‘D2’, 1, 2.510], [‘D1’, 1, 3.987]}
35 / 83
c Ad Ridder (VU)
Essay – January 2015
36 / 83
The 3-rd event
The k-th event
Suppose that currently the situation is
I
The simulation time is advanced to the earliest event time 1.861
belonging to the event of an arriving job at Q1 .
I
This job enters service with server 2 of Q1 .
I
We draw a sample of the service time S1 , say 0.811, to schedule a new
event ‘D1’ with departure time 1.861 + 0.811 = 2.752;
I
We realise a new sample of the interarrival time A1 , say 1.428, and
update the scheduled time of event ‘A1’ to 1.861 + 1.428 = 3.289;
tsim = 249.31
x = {x1 = 4, x2 = 8, a1 = (240.82), a2 = (238.71, 240.61, 241.01, 244.55, 246.91, 248.88),
b1 = (1, 1, 1), b2 = (1, 1)}
Lev = {[‘D2’, 2, 250.28], [‘D2’, 1, 251.43], [‘A1’, 254.38], [‘D1’, 2, 255.36],
[‘A2’, 256.42], [‘D1’, 1, 260.91], [‘D1’, 3, 263.93]}
Thus the new situation is:
I
The next event is service completion at Q2 at time 250.28.
I
There are jobs waiting at Q2 to occupy the empty seat. We draw a
sample of the service time S2 , say 0.83, and update the scheduled time
of event ‘D2’, 2 to 250.28 + 0.83 = 251.11.
I
For the job leaving Q2 we flip a coin (Bernoulli random variable) to decide
a feed back; suppose the outcome is to loop back for entering Q1 again.
I
The looped job finds all servers busy at Q1 , and thus joins the queue.
tsim = 1.861
x = {x1 = 2, x2 = 1, a1 = [], a2 = [], b1 = (1, 1, 0), b2 = (1, 0)}
Lev = {[‘A2’, 2.043], [‘D2’, 1, 2.510], [‘D1’, 2, 2.752], [‘A1’, 3.289], [‘D1’, 1, 3.987]}
c Ad Ridder (VU)
Essay – January 2015
37 / 83
The k-th event (cont’d)
c Ad Ridder (VU)
Essay – January 2015
Trace
Next situation:
I
This detailed walk through a system simulation is called a trace.
tsim = 250.28
I
In a trace a print is made of all the (important) simulation components
after each event.
I
It serves two objectives.
x = {x1 = 5, x2 = 7, a1 = (240.82, 250.28), a2 = (240.61, 241.01, 244.55, 246.91, 248.88),
b1 = (1, 1, 1), b2 = (1, 1)}
(i). In developing a simulation program it helps you to think about the events that
may happen in the system, the activities they trigger and their logic.
Lev = {[‘D2’, 2, 251.11], [‘D2’, 1, 251.43], [‘A1’, 254.38], [‘D1’, 2, 255.36],
[‘A2’, 256.42], [‘D1’, 1, 260.91], [‘D1’, 3, 263.93]}
c Ad Ridder (VU)
38 / 83
Essay – January 2015
(ii). After having written a computer program of the simulation, it provides a
check whether the program is correct.
39 / 83
c Ad Ridder (VU)
Essay – January 2015
40 / 83
Simulation run
Programming a Run
I
A simulation run of the system goes from event to event at discrete time
epochs determined by realisations of the appropriate random variables.
I
A trace covers usually a short time interval or a small number of events.
I
I
The actual execution of the computer program of the simulation is longer
but should end at some time or after some number of events: stopping
criterion.
When you write a computer program of the simulation it is convenient to
modularise your program into several subprograms or routines to clarify
the logic and the interactions.
I
Roughly a simulation run looks in pseudo-code as follows.
I
One such a simulation is called a simulation run which corresponds to
the concept of sample path of a stochastic process.
I
Later we will see that in most simulations studies you will simulate
(many) more simulation runs in order to obtain more reliable estimates.
c Ad Ridder (VU)
Essay – January 2015
initialise;
REPEAT
[e,t] = next_event_type_and_time;
switch (e)
case ’arrival’: ...
case ’departure’: ...
case ’. . .’: ...
UNTIL stopping_criterion;
41 / 83
Programming a Run (cont’d)
Essay – January 2015
42 / 83
The Goal of DES
In the subroutines for the different events you program code for updating
state, event list, and statistical counters. Similar as what you would do in the
trace. For instance:
The goal of a DES-study is to estimate performance measures; for instance :
subroutine execute_arrival_at_Q1_event(t)
begin
update_all_relevant_counter_variables();
j = find_free_server();
if (server(j)==‘idle’)
server(j) = ‘busy’;
s = generate_servicetime();
add_to_eventlist(‘D1’,j,t+s);
else
add_customer_to_queue(j,t);
end
a = generate_interarrivaltime();
add_to_eventlist(‘A1’,t+a);
end
c Ad Ridder (VU)
c Ad Ridder (VU)
Essay – January 2015
I
average waiting times (per queue or overall);
I
mean waiting line (per queue or overall);
I
average time spend in the system (sojourn time);
I
throughput (per queue or overall);
I
utilization;
I
...
We need to say whether we want these quantities measured (averaged)
during a short period (finite-horizon simulation), or in the long-run
(steady-state simulation).
43 / 83
c Ad Ridder (VU)
Essay – January 2015
44 / 83
Procedure
I
A run starts at time 0, at a specific specified state;
I
A run ends at some stopping criterion, e.g.
(i). A time horizon T;
(ii). A number of events Ne ;
Finite-Horizon
(iii). A number of arrivals Na ;
(iv). Etc.
Simulation
c Ad Ridder (VU)
Essay – January 2015
45 / 83
Illustration
I
Execute N runs, independent and (probabilistically) identical.
I
In each run we keep track of certain variables (counter variables) that at
the end of the run determine its outcomes or outputs.
I
The performance measures are estimated by averaging these N
outcomes.
I
The counter variables are initialised at the start of a simulation run, and
because nothing changes in between two consecutive events, it suffices
to update them after an event occurs.
c Ad Ridder (VU)
Essay – January 2015
46 / 83
Example
state
0
T
t
run 1: compute
output
Y1 .
I
Consider Q1 in the tandem network.
I
Suppose the goal is to estimate the performance measure “mean
average waiting time per customer from 08:00-18:00”.
I
This means:
state
0
T
t
run 2: compute
output
Y2 .
c Ad Ridder (VU)
let K be the (random) number of customers that has been served during
these ten hours at Q1 ;
I
let W1 , . . . , WK be their corresponding (random) waiting times;
P
define the output Y = (1/K) K
i=1 Wi ;
I
state
0
I
T
Essay – January 2015
t
run 3: compute
output
Y3 .
I
47 / 83
then we wish to compute
c Ad Ridder (VU)
E[Y].
Essay – January 2015
48 / 83
Example (cont’d)
I
Estimating the Average Queue Length
During a run in the simulation program you keep track of two variables:
1. nserv1 = the current total number customers who went into service at Q1 ;
I
How to compute (or estimate) the mean length at Q1 during the interval
08:00-18:00?
Whenever a new customer enters service at Q1 , these two variables are
updated.
I
At the end of the run you have a realisation y = wtotal1 /nserv1 of the output
Y.
I
Let Q(t) be the queue length at time t, 0 ≤ t ≤ T, where t = 0 represents
08:00 and T represents 18:00; measured in some time-units;
RT
Goal: [Y] for output Y = T1 0 Q(t) dt;
2. wtotal1 = the current total waiting time of these nserv1 customers.
I
I
I
I
Repeat N times and do your statistics.
I
For such finite-horizon simulation the statistics is similar as for static
stochastic simulation (see the part on the Monte Carlo simulation).
c Ad Ridder (VU)
Essay – January 2015
49 / 83
E
Interpretation of the integral: area below graph of the function Q(t).
c Ad Ridder (VU)
Essay – January 2015
50 / 83
Illustration
Q(t)
Steady-State
0 T1 T2
TR T
t
Simulation
Let 0 = T0 < T1 < T2 < · · · < TR < T ≤ TR+1 be the consecutive event times.
Y=
R
1X
1
Tr − Tr−1 Q(Tr−1 ) +
T − TR Q(TR )
T r=1
T
Note: number of events R is random.
c Ad Ridder (VU)
Essay – January 2015
51 / 83
c Ad Ridder (VU)
Essay – January 2015
52 / 83
Steady-State Definition
Looking to the Future
I
Suppose that you describe the system state at time t by X(t);
I
You observe the system at time 0 (‘now’);
I
The system is acting already a long time (from time −∞);
I
You might be interested in
(i). What is the CDF F0 (x) of X(0)?
(ii). What is the expectation
E[X(0)]?
I
You might reason as follows;
I
The system starts at time 0;
I
The system will act forever in the future;
I
The system will be in steady-state at time +∞;
I
Now we are interested in
(i). The CDF F∞ (x) of X(∞);
(ii). The expectation
H X(∞) .
(iii). More generally,
suppose
that H(x) is a function acting on the state variable,
what is
H X(0) ?
E
I
F0 is the steady-state distribution of the system;
I
The expectations are steady-state performance measures.
c Ad Ridder (VU)
Essay – January 2015
E
I
53 / 83
Limit Distribution
Question: what do we mean by random variable X(∞)?
c Ad Ridder (VU)
Essay – January 2015
54 / 83
Convergence in Mean
P(X(t) ≤ x) is a CDF for all t ≥ 0;
I
Assume that Ft (x) =
I
Equivalently, X(t) is a proper random variable;
Definition
D
Assume that X(t) → X(∞). If
E[X(∞)] = t→∞
lim E[X(t)]
Definition
Assume
we say that the sequence of random variables {X(t), t ≥ 0} converges in
mean to X(∞); and we denote
(a). The limit
F∞ (x) = lim Ft (x)
t→∞
exists for all x ∈
R (we say pointwise);
L
X(t) →1 X(∞)
(b). F∞ (·) is a CDF (recall the definition of a CDF!)
Then we say that the sequence of random variables {X(t), t ≥ 0} converges
weakly or in distribution to a (proper!) random variable X(∞); and we denote
Generally you need to check certain sufficient conditions to ensure
connvergence in distribution and/or in mean (see the counterexamples next
slides).
D
X(t) → X(∞)
c Ad Ridder (VU)
Essay – January 2015
55 / 83
c Ad Ridder (VU)
Essay – January 2015
56 / 83
Counterexample I
Counterexample II
I
I
P(X(t) = 0) = (t − 1)/t; P(X(t) = t) = 1/t.
Define random variable X(t) for t ≥ 0 by
P(X(t) = 0) = 0.5; P(X(t) = t) = 0.5.
I
I
I
Hence,


0,
Ft (x) = (t − 1)/t,


1,
Hence,


0,
Ft (x) = 0.5,


1,
I
Define random variable X(t) for t ≥ 1 by
x < 0;
0 ≤ x < t;
x ≥ t.
I
Easy to see that F∞ (x) = limt→∞ Ft (x) exists (pointwise), with
(
0,
x < 0;
F∞ (x) =
1,
x ≥ 0.
I
Conclude X(t) → X(∞), where
I
However
Easy to see that F∞ (x) = limt→∞ Ft (x) exists (pointwise), with
(
0,
x < 0;
F∞ (x) =
0.5,
x ≥ 0.
No convergence in distribution.
D
lim
t→∞
c Ad Ridder (VU)
Essay – January 2015
57 / 83
Steady-State Simulation
I
I
I
P(X(∞) = 0) = 1;
E[X(t)] = t→∞
lim 1 = 1 6= 0 = E[X(∞)].
c Ad Ridder (VU)
Essay – January 2015
58 / 83
A Steady-State Simulation Run
I
D
Suppose that in your system both (i) X(t) → X(∞) and (ii)
L
H X(t) →1 H X(∞) ;
D
NB: you may assume that also H X(t) → H X(∞)
Hence,
(i). you start simulating a run from an arbitrary initial state, for instance the
empty state;
(ii). continue long enough, say until τ0 , at which the system is about in
steady-state;
The performance measure of interest is the steady-state expectation
`=
H X(∞) ;
I
Note that for all t ≥ τ0 it holds that X(t) ≈ X(∞) and
How would you estimate ` by simulation?
I
Thus define as output of the run
Z
1 τ0 +τ
Y=
H( X(t) dt
τ τ0
I
NB: similarly for discrete processes in which you take sums.
(iii). continue simulating until τ0 + τ ;
E
I
x < 0;
0 ≤ x < t;
x ≥ t.
Notice:
I
I
D
From (i): X(∞) ≈ X(τ0 ) for large τ0 ;
From (ii): ` =
H X(∞) ≈
H X(τ0 ) for large τ0 ;
c Ad Ridder (VU)
E
E
Essay – January 2015
59 / 83
D
c Ad Ridder (VU)
Essay – January 2015
EH
X(t)
≈ `;
60 / 83
Replication-Deletion
Illustration
state
I
The output satisfies
Z
h 1 Z τ0 +τ
i
1 τ0 +τ
H( X(t) dt =
[Y] =
τ τ0
τ τ0
E
I
I
E
EH
X(t)
τ
run 2: compute
output
Y2 from state
in
t process
[τ0 , τ0 + τ ].
dt ≈ `;
τ0
0
Repeating N times independently, you get an (approximately) unbiased
estimator of ` by
N
1 X
`b =
Yi
N i=1
warm-up
observation
state
The rest (SLLN; CLT; confidence interval; accuracy; ...) is the same as in
static Monte Carlo simulations.
τ0
0
warm-up
c Ad Ridder (VU)
τ
run 1: compute
output
Y1 from state
in
t process
[τ0 , τ0 + τ ].
Essay – January 2015
61 / 83
Discussion
observation
c Ad Ridder (VU)
Essay – January 2015
More Good News
I
Suppose the performance measure is a long-run fraction or average
quantity; e.g.
I
A simple method for estimating a steady-state expectation;
I
the long-run fraction of time the server is available;
I
The initial long period of a run until steady-state is called warm-up
period, and is not used for computing output;
I
the long-run average cost per unit of time;
I
the long-run fraction of time the machine is down;
The output Y of a run is computed from the observation period;
I
the long-run average number of particles per unit of time;
I
Pro: iid observations which makes the statistics easy;
I
etc.
I
Con: inefficient because each run needs the warm-up period.
I
I
These can be modelled by
lim
t→∞
c Ad Ridder (VU)
62 / 83
Essay – January 2015
63 / 83
c Ad Ridder (VU)
1
t
t
Z
H X(s) ds.
0
Essay – January 2015
64 / 83
Almost Sure Limits
I
The Difficulty of Steady-State Simulation
If the system state process {X(t), t ≥ 0} is regenerative these long-run
average limits converge almost surely to the steady-state expectation
(this is known as an ergodic theorem):
Z
1 t
lim
H X(s) ds =
H X(∞)
a.s.
t→∞ t 0
|
{z
}
random variable
E
I
See your textbook on Markov chains;
I
Corollary: you
estimate
long-run averages/fractions similarly as
estimating
H X(∞) .
I
How large should τ0 (length of warm-up period) be before starting the
observations?
I
This is the initial transient problem;
I
There are statistical techniques, but in practice you just simulate and
experiment;
E
c Ad Ridder (VU)
Essay – January 2015
65 / 83
Steady-State Simulation Strategies
c Ad Ridder (VU)
66 / 83
C. The Batch Means Method
I
I
The replication-deletion method, described in the slides above, is just
one of many methods for steady-state simulation.
I
The most important ones are
Remedy for the inefficiency of the replication-deletion method;
(i). simulate one long run of length T;
(ii). first a warm-up period of length τ0 ;
(iii). then split the remaining time T − τ0 in N sections (called batches) of length
τ = (T − τ0 )/N.
A. Stationary process (single run)
B. Replication-deletion method (multiple runs)
I
Compute the sample mean of the i-th batch (i = 1, . . . , N)
Z
1 τ
Yi =
H X(τ0 + (i − 1)τ + t) dt
τ 0
I
Final estimator is the average of these batch means
C. Batch means method (single run)
D. Regenerative method (multiple cycles)
I
A. and D. are more involved obtaining variance estimates for the
construction of the confidence intervals; not in this course.
I
C. is an efficient alternative for B.
c Ad Ridder (VU)
Essay – January 2015
Essay – January 2015
N
1 X
`b =
Yi
N i=1
67 / 83
c Ad Ridder (VU)
Essay – January 2015
68 / 83
Illustration
Discussion
state
I
Typically, number of batches N is small.
I
The batch length (or size) τ should be large, so that
1. the correlations between the batch means Yi , i = 1, 2, . . . are neglegible;
2. each batch mean Yi is approximately Gaussian.
t
0
I
K
warm-up
c Ad Ridder (VU)
batch 1
length τ
batch 2
length τ
Essay – January 2015
batch 3
length τ
69 / 83
Confidence Interval
I
I
c Ad Ridder (VU)
Essay – January 2015
70 / 83
Ingredienten van een simulatiestudie
The sample variance to estimate v2 :
1. Random number generator
N
2
1 X
S2 =
Yi − `b .
N − 1 i=1
I
Thus the Yi ’s may be considered as independent Gaussian, with mean `
and variance v2 ;
2. Trekken uit kansverdelingen
3. Fitten van parameters
Since N is small:
√ `b − ` D
N
→ tN−1 (t → ∞),
S
where tN−1 is the Student-t variable with N − 1 df.
√ √
CI: `b − tN−1,1−α/2 S/ N, `b + tN−1,1−α/2 S/ N .
c Ad Ridder (VU)
Essay – January 2015
4. Programma schrijven en uitvoeren
5. Output statistiek
6. Verificatie
71 / 83
c Ad Ridder (VU)
Essay – January 2015
72 / 83
1. Random Number Generator (RNG)
RNG in Java
Een RNG produceert een rij getallen u1 , u2 , . . . tussen 0 en 1 die beschouwd
mogen worden als realiseringen van een rij IID uniform verdeelde
stochastische variabelen U1 , U2 , . . ..
I
Gebruik de klasse Random:
long seed = 123456; // kies je eigen seed
Random generator = new Random(seed);
Een RNG wordt als een algoritme ge¨ımplementeerd dat
I
Dan steeds u = generator.nextDouble() geeft een
(pseudo-)random getal tussen 0 (inclusief!) en 1 (exclusief).
I
Alternatief: Mersenne Twister. Een snelle en uitstekende RNG. Code
beschikbaar op Internet.
(i). snel moet zijn;
(ii). reproduceerbaar;
(iii). uitstekende statistische eigenschappen;
(iv). machine-onafhankelijk.
c Ad Ridder (VU)
Essay – January 2015
73 / 83
2. Simuleren uit verdelingen
I
Inverse-transformatie: als X verdelingsfunctie F(·) heeft, en U is de
D
F(x) = 1 − e−λx = 1 − u ⇔ x = − ln(u)/λ.
Acceptatie-rejectie: als X kansdichtheid f (·) heeft, Y kansdichtheid g(·),
en er is een eindige constante c zodat f (x) ≤ cg(x) voor alle x ∈ ; dan
R
D
α α−1 −λx
λ x
e
Γ(α)
Calculus: optimaal is µ = α/λ en c =
c Ad Ridder (VU)
I
Gegeven een geparametrizeerde kansverdelingstype, bv Gamma(α, λ),
Coxian2 (p, µ1 , µ2 ), etc van een stochastische variabele X.
I
Gegeven voldoende momenten m1 =
I
Vind de parameters zodat de momenten kloppen.
Meestal eerste twee momenten.
I
Ipv tweede moment, beter genormaliseerde variantie; die heet
(kwadratische) variatiecoefficient
c2 [X] =
en g(x) = µe−µx .
I
λ
Γ(α)
α α−1
e
Essay – January 2015
E[X], m2 = E[X2 ], . . .
I
is X = Y|(U ≤ f (Y)/(cg(Y))). Algoritme: trek y uit g(·), u uit U(0, 1); als
u ≤ f (y)/(cg(y)) neem x = y, anders herhaal. Bv (voor α > 1)
f (x) =
74 / 83
Probleem:
D
uniforme s.v. op [0, 1]; dan is X = F −1 (U) = F −1 (1 − U). Algoritme: los
F(x) = u (of F(x) = 1 − u) op naar x. Bv
I
Essay – January 2015
3. Fitten van parameters
Algemeen: inverse-transformatie methode of de acceptatie-rejectie methode.
Specifiek: convolutie, compositie, ...
I
c Ad Ridder (VU)
Var[X] = m2 − 1.
m21
m21
Zie de notitie des notes.
.
75 / 83
c Ad Ridder (VU)
Essay – January 2015
76 / 83
6. Verificatie
Verificatie is het proces dat nagaat of je simulatieprogramma correct is.
Enige tips:
I
Programmeer in modules en check deze; bijvoorbeeld een module om
trekkingen uit een Lognormale verdeling te simuleren met verwachting 5
en variantie 20. Ga experimenteel na of 100000 trekkingen inderdaad
deze gemiddeldes heeft.
I
Druk een trace af (eventlist, toestand, statistische tellers na iedere event)
en controleer ‘met de hand’.
I
Controleer bekende eigenschappen zoals Little.
I
Simuleer vereenvoudigde versie van het systeem waarvoor analytische
uitdrukkingen bekend zijn.
I
Neem extreme waarden van parameters.
c Ad Ridder (VU)
Essay – January 2015
DES datastructuren
77 / 83
Dynamische vectoren
Essay – January 2015
78 / 83
Linked List
I
In wachtrijsimulaties heb je te maken met dynamische
vectoren/matrices: geplande events; klanten in de rij (of rijen); klanten in
het systeem; etc.
I
Afhankelijk van model/probleem/simulatieprogramma heb je behalve de
lengte (ofwel ‘aantallen’) ook nodig informatie per element. Bv, de
aankomsttijden van klanten die staan te wachten.
I
Dynamisch wil zeggen dat de lengte varieert gedurende de simulatie.
Implementatie als ‘arrays’ is dan bijzonder inefficient (vrijwel onmogelijk).
c Ad Ridder (VU)
c Ad Ridder (VU)
Essay – January 2015
79 / 83
I
De datastructuur Linked List (gelinkte lijst) is de oplossing.
I
Uit Wikipedia: Een gelinkte lijst bestaat uit een reeks van
´ of twee
data-elementen, records, elk met een aantal datavelden en e´ en
verwijzingen naar een volgende en/of vorig data-element. Het is niet
mogelijk om meteen een willekeurig data-element direct te benaderen in
tegenstelling tot bijvoorbeeld een array: om een element op de N-de
positie te benaderen, moet eerst de hele lijst tot positie N doorlopen
worden.
c Ad Ridder (VU)
Essay – January 2015
80 / 83
Operaties
Zelf programmeren
Het is toegestaan om op elke plek in een gelinkte lijst data-elementen in te
voegen of te verwijderen door de verwijzing van het voorgaande element
naar respectievelijk het nieuw toe te voegen, dan wel het tweede volgend
element te laten wijzen.
Mijn advies: voor wie het kan zelf programmeren:
I class Knoop (of Event of Klant of ...):
I
I
bevat velden voor data, een wijzer naar vorige en/of een wijzer naar
volgende;
I
bevat methode(n) voor constructie
class Lijst (of EventList of Wachtrij of ...):
I
bevat velden voor lengte (vd lijst), en voor knopen current/first/last;
I
bevat methoden voor constructie, invoegen, verwijderen, zoeken, enz.
Een wachtrijsimulatie bevat meestal meer lijsten!
c Ad Ridder (VU)
Essay – January 2015
81 / 83
Gebruik van Java’s klassen
I
Java kent een klasse LInkedList van objecten. Je moet zelf klasse object
invoeren (bv Event, Knoop, etc, zie vorige slide).
I
LinkedList bevat al methoden voor invoegen en verwijderen, en een
aantal anderen. Opzoeken bij documentatie van Java
http://docs.oracle.com/javase/1.4.2/docs/api/java/
util/LinkedList.html
I
Zie ook mijn voorbeeldprogramma LLproef.java op
http://personal.vu.nl/a.a.n.ridder/essay/src.html
c Ad Ridder (VU)
Essay – January 2015
83 / 83
c Ad Ridder (VU)
Essay – January 2015
82 / 83