Uncertainty Quantification - P10 / P50 / P90*
When there are significant
uncertainties in our heterogeneous reservoir descriptions, as there almost
always are, there is no such thing as a meaningful P10, P50, P90, or Px
case description. There are only meaningful P10, P50, or Px results that must be
determined from probabilistic analysis. Any valid question in
reservoir modeling, regardless of the model used, must be asked to some
number of cases, representing many combinations of the
uncertainties, in order to obtain a probabilistic distribution of the
answer. Absolute predictions require a statistically significant
set of cases, but optimizations may require only a small number (see
SensorPx Example 3). Individual scenarios have a near-zero probability of
occurrence, and any desired number of Px cases of oil recovery, for
example, can be found or constructed. When the uncertainties have
large effects on the results, no single case can answer any question or
represent any probability of description or behavior, and is virtually
meaningless in itself, beyond the Px probabilistic result that it reproduces
and that was used to choose it amongst the considered scenarios.
In probabilistic analysis, one
might run 1000 realizations of equally-probable combinations of the uncertain variables.
The estimated P90 oil recovery is the recovery that is exceeded in 90% of the cases (there is a
90% chance the recovery will be greater than the exceedance P90 value)**.
"Cumulative" probabilities are the opposite of exceedance probabilities:
                            low     median  high
Exceedance probabilities:   P90     P50     P10
Cumulative probabilities:   P10     P50     P90
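The relationship between the two conventions can be sketched in a few lines of Python. This is only an illustration: the recovery values are synthetic (drawn from an arbitrary uniform range), the function names are hypothetical, and NumPy's default linear interpolation between ranked cases is an assumption rather than anything prescribed here.

```python
import numpy as np

def cumulative_px(values, x):
    """Cumulative Px: the value that x% of the cases fall below."""
    return float(np.percentile(values, x))

def exceedance_px(values, x):
    """Exceedance Px: the value that x% of the cases exceed.

    This is the same number as the cumulative P(100 - x).
    """
    return float(np.percentile(values, 100.0 - x))

# Synthetic oil recoveries (%) standing in for 1000 equally probable
# realizations; a real study would read these from simulator output.
rng = np.random.default_rng(seed=0)
recovery = rng.uniform(14.0, 26.0, size=1000)

for x in (10, 50, 90):
    print(f"x={x}: exceedance P{x} = {exceedance_px(recovery, x):.2f}, "
          f"cumulative P{x} = {cumulative_px(recovery, x):.2f}")

# Fraction of cases whose recovery exceeds the exceedance P90 value.
print(np.mean(recovery > exceedance_px(recovery, 90)))
```

The last line should print a fraction close to 0.9, which is the defining property of the exceedance P90.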
A given case might give
a P90 oil recovery and a P10 gas recovery. In general there will be no
realization that gives P90 results for more than one variable. Depending on how
much effect the uncertain variables have on the results, there may be no
such thing as "what happens in the neighborhood of the P10". There is such
a thing only if the effect is small. Two P90 cases (of a given variable) can easily be absolutely
and completely different. Multiple P10 cases tend to exhibit less
difference in description and behavior. The differences in multiple cases
giving a result of the same exceedance probability Px decrease with
decreasing value of x.
Consider a case
that gives (optimistic) P10 oil recovery that has very many wells. From
that case one might easily construct two P90 oil recovery (pessimistic)
cases that each operate a completely different subset of the wells in the
P10 oil recovery case.
That same extreme
difference of representative cases naturally results from stochastic (or
probabilistic)
representation of the most common uncertain variables, porosity and permeability.
Coming up with a Px case requires
a reverse model calculation, no matter what model is used, to determine the
inputs corresponding to a set of Px scalar outputs, as a function of time. That is a far, far
more complex task than the forward model. Those companies and
processes insisting on single or a few representative probabilistic cases
(rather than results) are not correctly using reservoir models.
Providing a
number is never a problem, as an answer to a valid question. The answer to
any question is always "the most probable answer is …", or "the mean and
standard deviation are …" or "this is the estimated probabilistic
distribution of the prediction or benefit", or "the answer cannot be determined by
reservoir modeling". Only probabilistic outcomes, or results, can be
reliably determined for partially unknown systems, in reserves or prospect
evaluation, and even in history matching and optimization. The huge problem
arises in reservoir modeling when representative probabilistic cases are
required or perceived to be needed or meaningful, because in general no such
meaningful cases exist, and it is extremely unlikely that one can identify
any meaningful individual "realisations that are near the P90/50/10", as some
suggest. In the typical case we might run thousands of realizations in an
uncertainty space characterized by infinite numbers of real possibilities,
but only billions or trillions or very many more restricted possibilities
(the number of restricted possibilities is M**n where M is the number of
discrete values allowed and n is the number of variables). For example, if
there are 1000 variables and 10 allowed values for each, then the number of
restricted possibilities is 10**1000. If 1000 realizations are run, and
then another 1000, then the probability is near zero that any found Px case
in the first 1000 runs will be similar to any found in the second set,
perhaps for all x but at least for x >> 0.
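The arithmetic behind that statement is easy to check. The short Python sketch below uses the numbers from the example above (M = 10, n = 1000, 1000 runs); the variable names are only illustrative.

```python
import math

M = 10        # discrete values allowed per variable
n = 1000      # number of uncertain variables
runs = 1000   # realizations actually simulated

# Python integers have arbitrary precision, so M**n can be formed exactly.
combinations = M ** n
print(len(str(combinations)) - 1)   # order of magnitude of M**n

# The sampled fraction runs / M**n is far too small for a float,
# so work with its base-10 logarithm instead.
log10_fraction = math.log10(runs) - n * math.log10(M)
print(log10_fraction)
```

With these inputs the sampled fraction of the restricted space is about 10**-997, which is why two independent sets of runs are not expected to contain similar cases.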
The same
principles apply using any model with uncertain inputs that is used to
obtain a probabilistic solution, which includes the use of analogs and
experience. For example, unless you have production data, initial well rate
and decline curve parameters are always uncertain. Some number of analogs and amounts of experience and data allow engineers to
compare wells with ranked analog results and estimate probabilistic results
from them. The uncertainty decreases with increasing applicable
analogs, experience, and data. The applicable analogs are simply the realizations
of the uncertainties. Local similarities in field geology and
production behavior are learned from experience and significantly reduce
uncertainty.
Simulation can
further reduce uncertainty by relating production and injection to input
descriptive properties of the reservoir(s), well(s), facilities, and fluids,
and initial conditions and boundary conditions versus time. But much
of that input data is potentially uncertain, especially in optimizations,
when we are attempting to determine optimal values of the input controls
like well locations and completions and constraints (process).
Simulation can be used to numerically quantify the uncertainty in predicted
production and injection, and to probabilistically optimize it, as a
function of uncertainty in input descriptive variables and options in
control variables. Simulation is needed when it becomes impossible to
otherwise predict the effects of heterogeneous reservoir/well descriptive
variables and/or controls on production and recovery.
SPE1 (Ref. 1) makes
a good example. It is a well-known 10x10x3 black-oil model of gas
injection and oil recovery, with the single gas injector completed in
(1,1,1) and the single producer in (10,10,3). Layer thicknesses are 20, 30,
and 50 ft, respectively. Change the bhp of the gas injection well from
10000 to 9000 psi (to avoid effects of the negative compressibility error in
the specified pvt data). Also complete both the injector and producer
in all 3 layers, and change Kz to be equal to .1*Kx. The Sensor
datafile is spe1pbase.dat.
Assume that the
base case is our "best guess" case, which is defined by the most
likely values of all uncertain inputs, and that only the areal layer permeabilities
K1 K2 K3 are uncertain. Base case values are 500, 50, and
200 md respectively. Assume that the base case layer perms estimated
from analogs are not in error by more than a factor of 2 in either
direction, i.e. 250 < K1 < 1000, 25 < K2 < 100, and 100 < K3 < 400,
and that the probability distribution is uniform (usually permeability will
follow a log-normal distribution).
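As a rough illustration of what one set of realizations of the three unknowns looks like, the Python sketch below draws K1, K2, K3 from the uniform ranges stated above. It is not how Sensor generates its random inputs; the function names are hypothetical, and the log-uniform alternative is included only as an assumption for comparison, since it is not used in this example.

```python
import numpy as np

# Base-case layer permeabilities (md) from the text.
BASE_PERM = np.array([500.0, 50.0, 200.0])
LOW, HIGH = BASE_PERM / 2.0, BASE_PERM * 2.0   # factor-of-2 bounds

def sample_uniform(rng, n_cases):
    """Uniform in [base/2, 2*base] for each layer, as assumed in the text."""
    return rng.uniform(LOW, HIGH, size=(n_cases, 3))

def sample_log_uniform(rng, n_cases):
    """Alternative (an assumption, not used in the text): uniform in log K."""
    log_k = rng.uniform(np.log(LOW), np.log(HIGH), size=(n_cases, 3))
    return np.exp(log_k)

rng = np.random.default_rng(seed=0)
k_uniform = sample_uniform(rng, 10_000)
k_log = sample_log_uniform(rng, 10_000)

# Each row is one equally probable realization (K1, K2, K3).
print(k_uniform[:3])
print(np.median(k_uniform, axis=0))
print(np.median(k_log, axis=0))
```

Note that a uniform distribution on these ranges is centered on 1.25 times the base value (625, 62.5, and 250 md), whereas a distribution that is uniform in log K is centered on the base values themselves.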
Results,
verifiable with any simulator (they should be the same or very close, for
the given individual cases):
The base case
gives cumulative oil recovery of 20.02%.
The Sensor data
file spe1p.dat specifies
the above permeability ranges as uniform random distributions.
Each execution of spe1p.dat gives a different and equally probable
realization of the 3 unknowns.
The Makespx datafile
spe1p.mspx specifies that
SensorPx will compute P10, P50, and P90 results from the output of 10,000
runs of spe1p.dat that will be made in 8 simultaneous processes of 1250
serial runs each. It was found that sets of 1000, 3000, and 5000 runs
were not sufficient to provide accurate and reproducible probabilistic
results, but 10,000 runs yields P10, P50, and P90 phase rates and
cumulatives that are reproducible within 1%, for the given uncertainties, in
about 6 minutes on our machine. The large number of runs required to
obtain a statistically significant set, for only 3 total variables, is due
to the fact that the variables have very large effects on results over their
given ranges. In another example using different assumptions of
uncertainty (link to SensorPx Example 1 is at bottom of this page), 10,000
runs were found to be sufficient to quantify uncertainty in results for 900
uncertain variables that are randomly populated according to their input
probability distributions.
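One way to test whether a set of runs is large enough is to repeat the whole set with different random draws and compare the resulting Px values. The Python sketch below shows the idea only. The function `toy_recovery` is a made-up stand-in for a simulator run and does not represent the SPE1 response, so the numbers it prints say nothing about how many Sensor runs are needed.

```python
import numpy as np

LOW = np.array([250.0, 25.0, 100.0])     # md, lower bounds from the text
HIGH = np.array([1000.0, 100.0, 400.0])  # md, upper bounds from the text

def toy_recovery(k):
    """Hypothetical response surface; NOT the SPE1 simulator result."""
    k1, k2, k3 = k[:, 0], k[:, 1], k[:, 2]
    return (20.0 - 5.0 * np.log2(k1 / 500.0)
            + 1.0 * np.log2(k2 / 50.0)
            + 5.0 * np.log2(k3 / 200.0))

def px_spread(n_runs, seed_a, seed_b):
    """Largest relative difference in exceedance P90/P50/P10 between two
    independent sets of n_runs realizations."""
    results = []
    for seed in (seed_a, seed_b):
        rng = np.random.default_rng(seed)
        k = rng.uniform(LOW, HIGH, size=(n_runs, 3))
        # Cumulative percentiles 10, 50, 90 are exceedance P90, P50, P10.
        results.append(np.percentile(toy_recovery(k), [10, 50, 90]))
    a, b = results
    return float(np.max(np.abs(a - b) / np.abs(b)))

for n_runs in (100, 1000, 10_000):
    print(n_runs, px_spread(n_runs, seed_a=1, seed_b=2))
```

The set is considered sufficient when the spread falls below the tolerance required, for example the 1% reproducibility quoted above.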
The calculated
Field Px,y,z values are given in the SensorPx output file spe1p.log.
Results at end of run are:
FINAL FIELD RESULTS
VARIABLE   TIME            MIN             PZ              PY              PX              MAX
                           (CASE)          (CASE)          (CASE)          (CASE)          (CASE)
--------------------------------------------------------------------------------------------------------
QOIL       0.3650000E+04   0.1648364E+04   0.3586624E+04   0.8808262E+04   0.2000000E+05   0.2106339E+05
                           2669            5593            5373            4780            2529
QWAT       0.3650000E+04   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00
                           0               0               0               0               0
QGAS       0.3650000E+04   0.3814416E+05   0.1226484E+06   0.1516267E+06   0.1977881E+06   0.2954274E+06
                           3091            8075            2262            7568            1276
QWI        0.3650000E+04   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00
                           0               0               0               0               0
QGI        0.3650000E+04   0.8418205E+05   0.1000000E+06   0.1000000E+06   0.1000000E+06   0.1000460E+06
                           3091            3125            5333            4631            2886
WCUT       0.3650000E+04   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00
                           0               0               0               0               0
GOR        0.3650000E+04   0.1907208E+04   0.6884886E+04   0.1898299E+05   0.3733397E+05   0.7145449E+05
                           3091            4980            4635            3583            2669
CUMOIL     0.3650000E+04   0.3931248E+05   0.4819892E+05   0.6493193E+05   0.7300000E+05   0.7305285E+05
                           5833            5860            3101            1888            3051
CUMWAT     0.3650000E+04   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00
                           0               0               0               0               0
CUMGAS     0.3650000E+04   0.9493449E+05   0.2111026E+06   0.4339683E+06   0.5052956E+06   0.5260374E+06
                           3091            4082            9117            3805            949
CUMWI      0.3650000E+04   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00   0.0000000E+00
                           0               0               0               0               0
CUMGI      0.3650000E+04   0.3511303E+06   0.3650000E+06   0.3650000E+06   0.3650000E+06   0.3650000E+06
                           3734            2399            5181            8195            2486
OILREC     0.3650000E+04   0.1382300E+02   0.1694764E+02   0.2283128E+02   0.2566817E+02   0.2568675E+02
                           5833            5860            3101            1888            3051
GASREC     0.3650000E+04  -0.7122196E+02  -0.4260883E+02   0.1909491E+02   0.3884295E+02   0.4458563E+02
                           5832            4082            9117            3805            949
PAVGHC     0.3650000E+04   0.1693110E+04   0.1883883E+04   0.2642994E+04   0.6125840E+04   0.8657533E+04
                           949             2484            6907            6222            3091
PAVG       0.3650000E+04   0.1693103E+04   0.1883872E+04   0.2642970E+04   0.6125770E+04   0.8657509E+04
                           949             2484            6907            6222            3091
The P50 oil
recovery (PY OILREC) is given as 22.83%. Note that each Px, Py, and Pz
result in the entire table is from a different case (the case number giving
each result is indicated below the result). Exceptions are pairs of
variables that can be essentially the same or are strongly related, like PAVG and PAVGHC, and OILREC
and CUMOIL. For example, of all runs, case number 3091 in the above table
has the lowest cumulative gas production and final producing GOR, and the
highest final average pressure (all related to very low gas production).
But in general there is no such thing as a Px, Py, or Pz case.
The case that was found to give that 22.83% P50 oil recovery is shown in the
results table to be case number 3101 (case3101.out),
which has K1,K2,K3 values of 908.13, 45.01, and 343.31 md, respectively.
We can easily
modify the base case (K1,K2,K3 = 500,50,200, oil recovery = 20.02%) by
manually changing layer permeabilities in very different ways that will give
that same P50 22.83% oil recovery (within 0.005%). We can choose to either decrease the top layer perm from 500
to 360 md, or we can increase the bottom layer perm from 200 to 283.25 md,
which both have the same effect of reducing override of the injected gas due
to gravity, and of increasing oil production to a recovery of 22.83%. There
are a very large number of combinations of the 3 variables that will give
22.83% oil recovery.
In terms of what
is allowed to vary in the base case, any conclusions of probable description
or behavior that may be inferred from any chosen Px case, or from
differences between any two chosen Px and Py cases, are virtually guaranteed
to be wrong!
To efficiently
compute probabilistic results from data uncertainties, the number of
uncertain variables, generally equal in actual number to many times the numbers of gridblocks
and wells, must be minimized. We must strive
to build and evaluate, as fast as we can, as many as possible of the fastest,
coarsest, and least detailed models that are sufficient, rather than the most detailed.
Detailed modeling is valid only at the fine scale, for subsequent upscaling
to field-scale problems that can be practically solved. In our
discrete upscaled numerical models, any surface or feature is justifiably
represented by nothing more than large coarse-block average permeability and
porosity and rock type distributions. Conforming the grid to detail or
surfaces is counter-productive. All issues of behavior must be
investigated with respect to the probabilistic results given by a
large set of possible realizations. That includes history matching and
optimization. We can determine the uncertainty in our predictions only
by basing them on many equally probable history matches or scenarios.
In optimization, questions that cannot be represented by changes in the
data sets (history matches) cannot be answered by reservoir modeling.
The question of whether or not Option A is better than Option B is
represented in all the realizations by some change in the data.
The probability that A is better than B is given by the fraction of
case A runs giving better results. The mean and standard deviation of
the difference in probable benefit, given by some defined function of the
probabilistic results, like NPV, or some simple value/cost function of
production/injection, to minimize the number of variables, is easily
computed.
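A minimal Python sketch of that comparison follows. It assumes the runs are paired, meaning that entry i of each list comes from the same realization run once with Option A and once with Option B. The benefit values are invented for illustration and the function name is hypothetical.

```python
import numpy as np

def compare_options(benefit_a, benefit_b):
    """Compare two options over paired realizations.

    benefit_a[i] and benefit_b[i] are assumed to come from the same
    realization of the uncertain inputs, differing only in the option.
    """
    diff = np.asarray(benefit_a, dtype=float) - np.asarray(benefit_b, dtype=float)
    return {
        "prob_a_better": float(np.mean(diff > 0.0)),  # fraction of runs where A wins
        "mean_diff": float(np.mean(diff)),            # mean benefit of A over B
        "std_diff": float(np.std(diff, ddof=1)),      # sample standard deviation
    }

# Invented benefit values (e.g. NPV in arbitrary units) for 5 realizations.
npv_a = [12.0, 9.5, 14.1, 8.0, 11.2]
npv_b = [10.0, 10.5, 13.0, 7.5, 11.0]

print(compare_options(npv_a, npv_b))
```

In this invented data set Option A is better in 4 of the 5 realizations, so the estimated probability that A is better is 0.8. A real comparison would use a statistically significant number of realizations.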
A simple
component of probabilistic forecasting and optimization workflows for Sensor
is provided by SensorPx. SensorPx computes Px, Py, and Pz values for all
production/injection results variables (Well,
Platform, Region, Superregion, and Field, or Field-only) from any given set
of binary fort.61 plot file results produced by a set of Sensor runs having
equally probable descriptions and/or initial, boundary, and operating
conditions. The Px,y,z results are output in files
casename.px, casename.py, and casename.pz, in that same standard Sensor
fort.61 plot file format for simple viewing, plotting, and computing in
compatible post-processors and workflows.
See
SensorPx Example 1
* Originally taken from comments by Brian Coats in discussions of "Should we vary expected well performance for P90-P50-P10 cases in prospect
evaluation?" on SPE's former Reserves and
Economics Technical Interest Group discussion forum, and of
"Deterministic (incremental) method" on SPE's
Reservoir Technical Interest Group discussion forum (SPE login is required).
** These Px values that are exceeded in x% of
the cases are defined as "exceedance" probabilities. Exceedance
P10 is a high, optimistic estimate, and exceedance P90 is pessimistic. "Cumulative"
probabilities are the opposite - the cumulative Px oil recovery is that
which is not achieved (recovery is less than the Px value) in x% of the cases,
i.e. there is an x% chance that recovery will be less than the cumulative Px
value.
Cumulative P10 is a low, pessimistic estimate, and cumulative P90 is a high
estimate. Both exceedance and cumulative probabilities are commonly used.
Exceedance Pxi: The probability is at
least x% that the output variable i will be greater than its Pxi value
(P90 is a low estimate, P50 is median, and P10 is a high estimate of
variable i).
Cumulative Pxi: The probability is at
most x% that the output variable i will be less than its Pxi value (P10 is a
low estimate, P50 is median, and P90 is a high estimate of variable i).
The terms "at least" and "at most" appear in
the above definitions because Pxi and Pyi values can be the same. For
example, if the runs having the highest 30% cumulative oil production are
all oil-rate-limited at the same specified maximum oil rates at all times,
then their cumulative oil productions are the same, their exceedance Pxi
values for x = 0 to 30 are the same, and their cumulative Pxi for x=70 to
100 are the same. In that case, if we compute P10, P50, and P90
values, we can say that the probability is at least 10% that cumulative oil
production will be greater than the exceedance P10 value (the actual
probability is 30%, since P10i=P30i). We can also say that the
probability is at most 90% that cumulative oil production will be less than
the cumulative P90 value (the actual probability is 70% since P70i=P90i).
If exceedance P(x-1)i > Pxi > P(x+1)i, or
cumulative P(x-1)i < Pxi < P(x+1)i, then the "at least" and "at most" terms
can be removed from the Pxi definitions, and accuracy in the estimated value
of x is + or - 1% (assuming statistically significant results).
Usually, if enough cases are run and if significant uncertainty exists, Pxi
results will be continuously variable in x, and the terms "at least" and "at
most" do not apply. Runs for real cases in which the wells remain
rate-limited at all times are very rare.
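The rate-limited example above can be reproduced with a small Python sketch. The data are invented: 100 runs, of which the highest 30 share one capped cumulative oil value. The ranking rule used here (exceedance Px taken as the k-th largest result, with k the smallest integer covering x% of the runs) is one simple convention assumed for illustration, not necessarily the rule SensorPx applies.

```python
import math

def exceedance_px(values, x):
    """Exceedance Px by ranking: the k-th largest value, where k is the
    smallest whole number of runs making up at least x% of the set."""
    ordered = sorted(values, reverse=True)
    k = max(1, math.ceil(x * len(ordered) / 100.0))
    return ordered[k - 1]

# Invented cumulative oil results: 70 runs below the cap, and the
# highest 30 runs all limited to the same capped value.
uncapped = [40000.0 + 400.0 * i for i in range(70)]
capped = [73000.0] * 30
cum_oil = uncapped + capped

p10 = exceedance_px(cum_oil, 10)
p30 = exceedance_px(cum_oil, 30)
print(p10, p30)

# Fraction of runs that reach the P10 value.
fraction_at_p10 = sum(v >= p10 for v in cum_oil) / len(cum_oil)
print(fraction_at_p10)
```

Because P10 and P30 coincide, 30% of the runs reach the P10 value rather than 10%, which is why the definitions are worded with "at least" and "at most".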
1. Odeh, A.S., "Comparison of Solutions to a Three-Dimensional Black-Oil
Reservoir Simulation Problem", JPT (Jan. 1981) Vol. 33, pp. 13-25.