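# P(X <= 20) for X ~ Geometric(p = 0.05): at most 20 failures before the first success
pgeom(q=20,prob=0.05)
[1] 0.6594384
# P(T <= 21) for T ~ Exponential(rate = 0.05), for comparison
pexp(q=21,rate=0.05)
[1] 0.6500623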
The Geometric distribution describes an experiment with two possible outcomes, success or failure, which we repeat until we achieve our first success, accumulating some number of failures along the way (or, equivalently, which we repeat through some number of successes until our first failure).
Different textbooks will use one of two conventions: either (i) the Geometric distribution counts the number of failures before the first success, or (ii) the Geometric distribution counts the number of total trials including both the failures and the last trial ending in success. You should always be sure to stay consistent and not to mix these two cases in your work or when searching for reference material!
We will use the first definition on this page, counting only the failures, which is the convention used by R. So if the very first trial ends in success, then we would write \(X=0\) failures.
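To make the two conventions concrete, here is a minimal sketch in R. The value `p = 0.25` and the variable names are illustrative assumptions, not taken from the text above. It evaluates the failures-only PMF with `dgeom()` and the total-trials PMF by hand, showing that they give the same probabilities shifted by one.

```r
p <- 0.25                            # illustrative success probability (an assumption)

# Convention (i), used by R: X = number of failures before the first success.
k <- 0:3
failures_pmf <- dgeom(k, prob = p)   # P(X = k) = p * (1 - p)^k

# Convention (ii): Y = X + 1 = total number of trials, including the final success.
y <- k + 1
trials_pmf <- p * (1 - p)^(y - 1)    # P(Y = y) = p * (1 - p)^(y - 1)

# Same probabilities, indexed one apart.
data.frame(failures = k, trials = y, prob_i = failures_pmf, prob_ii = trials_pmf)
```

When a reference uses the total-trials convention, subtracting one from its outcome before calling `dgeom()` or `pgeom()` should bring it in line with R.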
Let \(B_1, \ldots, B_n\) be a series of \(n\) independent and identically distributed Bernoulli variables with parameter \(p\). Let \(X\) denote the index of the last failure before the first success (with \(X = 0\) if the very first trial succeeds), i.e.
\[B_{X+1} = 1; \;B_i = 0 \; \forall i \le X\]
Then \(X \sim \mathrm{Geometric}(p)\).
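The construction above can be checked by simulation. The sketch below is only an illustration under stated assumptions: the helper `failures_before_success()`, the seed, the value `p = 0.3`, and the number of replications are all hypothetical choices. It draws Bernoulli trials one at a time until the first success and compares the observed frequencies of \(X\) with `dgeom()`.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
p <- 0.3                             # illustrative success probability (an assumption)

# Hypothetical helper: run Bernoulli(p) trials until the first success
# and return the number of failures observed before it.
failures_before_success <- function(p) {
  failures <- 0
  while (rbinom(1, size = 1, prob = p) == 0) {
    failures <- failures + 1
  }
  failures
}

x <- replicate(10000, failures_before_success(p))

# Empirical frequencies of X = 0, 1, 2, 3 next to the Geometric(p) PMF.
empirical   <- sapply(0:3, function(k) mean(x == k))
theoretical <- dgeom(0:3, prob = p)
round(rbind(empirical, theoretical), 3)
```

The two rows should agree up to simulation noise, and the agreement should tighten as the number of replications grows.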
Two key premises of the Geometric distribution are that (i) \(n\), the number of trials, is allowed to vary until it reaches a natural stopping point,[1] and (ii) the probability of success never changes over time or in response to the previous trials.
\[\begin{array}{ll} \text{Support:} & \mathbb{Z}_{\ge 0}=\{0,1,2,\ldots\} \\ \text{Parameter(s):} & p,\text{ the probability of success }(p \in (0,1]) \\ \text{PMF:} & P(X=k) = p(1 - p)^k \\ \text{CDF:} & F_X(x) = \left\{\begin{array}{cl} 0, & \quad x \lt 0 \\ 1 - (1-p)^{\lfloor x \rfloor + 1}, & \quad x \ge 0 \end{array}\right. \\ \text{Mean:} & \mathbb{E}[X] = \frac{1-p}{p} \\ \text{Variance:} & \mathbb{V}[X] = \frac{1-p}{p^2} \\ \end{array}\]
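As a quick sanity check, the PMF, CDF, mean, and variance formulas in the summary can be compared against R's built-in functions. This is a sketch with an illustrative `p = 0.2` (an assumption). The mean and variance are approximated by summing over a long but finite stretch of the support, so they match the closed forms only up to a negligible truncation error.

```r
p <- 0.2                             # illustrative success probability (an assumption)
k <- 0:5

# PMF: p * (1 - p)^k should match dgeom().
pmf_formula <- p * (1 - p)^k
all.equal(pmf_formula, dgeom(k, prob = p))

# CDF: 1 - (1 - p)^(floor(x) + 1) should match pgeom(), including non-integer x.
x <- c(0, 1.5, 4, 7.9)
cdf_formula <- 1 - (1 - p)^(floor(x) + 1)
all.equal(cdf_formula, pgeom(x, prob = p))

# Mean and variance, approximated by summing over a truncated support.
support <- 0:1000
mean_numeric <- sum(support * dgeom(support, prob = p))
var_numeric  <- sum((support - mean_numeric)^2 * dgeom(support, prob = p))
c(numeric = mean_numeric, formula = (1 - p) / p)
c(numeric = var_numeric,  formula = (1 - p) / p^2)
```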
#| standalone: true
#| viewerHeight: 650
library(shiny)
library(bslib)

# UI: the PMF plot above a slider that controls the success probability p
ui <- page_fluid(
  tags$head(tags$style(HTML("body {overflow-x: hidden;}"))),
  title = "Geometric distribution PMF",
  fluidRow(plotOutput("distPlot")),
  fluidRow(sliderInput("p", "Probability (p)", min=0.01, max=0.99, step=0.01, value=0.5)))

# Server: redraw P(X = x) for x = 0, ..., 20 whenever the slider moves
server <- function(input, output) {
  output$distPlot <- renderPlot({
    plot(x=0:20, y=dgeom(x=0:20, prob=input$p), main=NULL,
         xlab='x (Prior failures)', ylab='Probability', type='h', lwd=3)})
}

shinyApp(ui = ui, server = server)
[1] Unlike, say, the Binomial distribution, in which the number of trials is fixed ahead of time.