35 Words About Uncertainty Every AI-Savvy Leader Must Know
Source: https://medium.com/towards-artificial-intelligence/ai-uncertainty-4ac6810899ac?source=rss------artificial_intelligence-5

Uncertainty
uncertainty: a situation involving imperfect or unknown information
probability: a numerical description of how likely an event is to occur, or of how likely it is that a proposition is true
possible world: one possible outcome in a given situation, e.g., getting a ‘1’ when rolling a die; notated with the letter:
ω
set of all possible worlds: all possible worlds taken together, whose probabilities sum to one; e.g., getting a ‘1, 2, 3, 4, 5 or 6’ when rolling a die; notated with the letter:
Ω
probability of a possible world: how likely a particular world ω is to occur, notated as:
P(ω)
range of probabilities: ‘0’ means an event is certain not to happen, whereas ‘1’ means an event is absolutely certain to happen, notated as:
0 ≤ P(ω) ≤ 1
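To make these first definitions concrete, here is a minimal Python sketch of a fair die’s possible worlds (the variable names are my own, not from the article):

```python
# A fair die as a set of possible worlds: each face is one world ω,
# and the probabilities over the full set Ω sum to 1.
omega_set = {1, 2, 3, 4, 5, 6}                 # Ω: all possible worlds
p = {w: 1 / 6 for w in omega_set}              # P(ω) for each world ω

assert all(0 <= pw <= 1 for pw in p.values())  # 0 ≤ P(ω) ≤ 1
assert abs(sum(p.values()) - 1.0) < 1e-9       # probabilities over Ω sum to 1
print(p[3])                                    # P(rolling a ‘3’) ≈ 0.1667
```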
unconditional probability: the degree of belief in a proposition in the absence of any other evidence
conditional probability: the degree of belief in a proposition given some evidence that has already been revealed; the probability of ‘rain today’ given ‘rain yesterday’:
P(a|b) (probability of a given b)
P(rain today|rain yesterday)
P(a|b) = P(a ∧ b) / P(b)
P(a ∧ b) = P(b) P(a|b)
P(a ∧ b) = P(a) P(b|a)
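A minimal sketch of the conditional-probability formula, using the rain example above with made-up numbers:

```python
# Conditional probability from a joint probability; the two input
# values are illustrative assumptions, not real weather data.
p_rain_yesterday = 0.3                    # P(b)
p_rain_both_days = 0.12                   # P(a ∧ b)

# P(a|b) = P(a ∧ b) / P(b)
p_rain_today_given_yesterday = p_rain_both_days / p_rain_yesterday
print(p_rain_today_given_yesterday)       # ≈ 0.4

# Rearranged product rule: P(a ∧ b) = P(b) P(a|b)
assert abs(p_rain_yesterday * p_rain_today_given_yesterday
           - p_rain_both_days) < 1e-9
```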
random variable: a variable in probability theory with a domain of possible values it can take on, for example:
Weather
{sun, cloud, rain, wind, snow}
probability distribution: a mathematical function that provides the probabilities of occurrence of different possible outcomes, for example:
P(Flight = on time) = 0.6
P(Flight = delayed) = 0.3
P(Flight = cancelled) = 0.1
or:
P(Flight) = ⟨0.6, 0.3, 0.1⟩
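The Flight distribution above, expressed as a small Python sketch:

```python
# A probability distribution over the random variable Flight,
# using the numbers given above.
flight = {"on time": 0.6, "delayed": 0.3, "cancelled": 0.1}

assert abs(sum(flight.values()) - 1.0) < 1e-9  # a distribution must sum to 1
print(flight["delayed"])                       # P(Flight = delayed) = 0.3

# Vector form P(Flight) = ⟨0.6, 0.3, 0.1⟩
p_flight = [flight[v] for v in ("on time", "delayed", "cancelled")]
print(p_flight)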
independence: knowing that one event occurs does not affect the probability of the other event
P(a ∧ b) = P(a)P(b|a) or
P(a ∧ b) = P(a)P(b)
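A quick sketch of independence using two fair dice, an assumed example in the spirit of the die rolls above:

```python
# Two fair dice rolled together: knowing the first roll tells us
# nothing about the second, so the joint probability factorizes.
import itertools

worlds = list(itertools.product(range(1, 7), range(1, 7)))   # all 36 pairs
p_a = sum(1 for a, b in worlds if a == 6) / len(worlds)      # P(first = 6)
p_b = sum(1 for a, b in worlds if b == 6) / len(worlds)      # P(second = 6)
p_ab = sum(1 for a, b in worlds if a == 6 and b == 6) / len(worlds)

# Independence: P(a ∧ b) = P(a) P(b)
assert abs(p_ab - p_a * p_b) < 1e-9
print(p_ab)  # 1/36 ≈ 0.0278
```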
Bayes’ rule: (or Bayes’ theorem) one of probability theory’s most important rules, describing the probability of an event based on prior knowledge of conditions that might be related to it:
P(b|a) = [P(b) P(a|b)] / P(a)
Thus, knowing…
P(cloudy morning | rainy afternoon)
… we can calculate:
P(rainy afternoon | cloudy morning)
P(rain|clouds) = [ P(clouds|rain)P(rain) ] / P(clouds)
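A sketch of Bayes’ rule for the clouds-and-rain example; the three input probabilities are illustrative assumptions, not data:

```python
# Bayes’ rule: turn P(clouds|rain) into P(rain|clouds).
p_rain = 0.1                 # prior P(rain), assumed
p_clouds = 0.4               # P(clouds), assumed
p_clouds_given_rain = 0.8    # P(clouds|rain), assumed

# P(rain|clouds) = P(clouds|rain) P(rain) / P(clouds)
p_rain_given_clouds = p_clouds_given_rain * p_rain / p_clouds
print(p_rain_given_clouds)   # ≈ 0.2
```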
joint probability: the likelihood that two events will happen at the same time
P(a, b) = P(a) P(b|a)
probability rules: a set of algebraic manipulations useful for calculating different probabilities, including negation, inclusion-exclusion, marginalization, and conditioning (all four are illustrated in the sketch after these entries)
negation: a handy probability rule to figure out the probability of an event not happening, for example:
P(¬cloud) = 1 − P(cloud)
inclusion-exclusion: another probability rule, which excludes double-counts to calculate the probability of event a or b:
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
marginalization: a very useful probability rule for recovering P(a) by summing its joint probabilities over the outcomes of b (explained in much more detail by Jonny Brooks-Bartlett):
P(a) = P(a, b) + P(a, ¬b)
conditioning: our final probability rule, used when we have two events (a and b) but, instead of their joint probabilities, we only have access to their conditional probabilities:
P(a) = P(a|b)P(b) + P(a|¬b)P(¬b)
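All four rules in one short Python sketch, with made-up but mutually consistent numbers for two events a (clouds) and b (rain):

```python
# The four probability rules above, checked numerically.
p_a, p_b = 0.4, 0.1            # P(a), P(b), assumed
p_b_given_a = 0.2              # P(b|a), assumed
p_ab = p_a * p_b_given_a       # P(a ∧ b) = 0.08, by the product rule

# negation: P(¬a) = 1 − P(a)
print(1 - p_a)                              # 0.6

# inclusion-exclusion: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
print(p_a + p_b - p_ab)                     # ≈ 0.42

# marginalization: P(a) = P(a, b) + P(a, ¬b)
p_a_not_b = p_a - p_ab                      # P(a ∧ ¬b)
print(p_ab + p_a_not_b)                     # ≈ 0.4, recovering P(a)

# conditioning: P(a) = P(a|b)P(b) + P(a|¬b)P(¬b)
p_a_given_b = p_ab / p_b
p_a_given_not_b = p_a_not_b / (1 - p_b)
print(p_a_given_b * p_b + p_a_given_not_b * (1 - p_b))  # ≈ 0.4 again
```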
Bayesian network: a data structure that represents the dependencies among random variables
inference: the process of using data analysis to deduce properties of an underlying distribution of probability
query: the variable for which to compute the distribution
evidence variable: an observed variable for event e
hidden variable: a non-evidence, non-query variable
inference by enumeration: a process for solving inference queries given a joint distribution and conditional probabilities
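To make enumeration concrete, here is a sketch on a tiny Bayesian network of my own invention (Rain and Sprinkler independently influence WetGrass); the structure and all probabilities are illustrative assumptions:

```python
# Inference by enumeration: sum the full joint distribution over the
# hidden variable, then normalize over the query variable.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.05}

def joint(rain, sprinkler, wet):
    # Full joint probability from the network's conditional tables
    p_wet = P_WET_TRUE[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1 - p_wet)

# Query P(Rain | WetGrass=True); Sprinkler is the hidden variable.
scores = {r: sum(joint(r, s, True) for s in (True, False)) for r in (True, False)}
total = sum(scores.values())
print({r: round(p / total, 3) for r, p in scores.items()})  # ≈ {True: 0.645, False: 0.355}
```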
approximate inference: a systematic iterative method to estimate solutions, such as Monte Carlo simulation
sampling: a technique in which samples from a larger population are chosen using various probability methods
rejection sampling: (or acceptance-rejection method) a basic technique used to generate observations from a given distribution
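A sketch of rejection sampling on the same made-up network: sample everything from the priors and keep only the samples that agree with the evidence:

```python
# Rejection sampling: estimate P(Rain | WetGrass=True) by discarding
# samples that contradict the evidence.
import random

random.seed(0)
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.05}

kept = []
for _ in range(100_000):
    rain = random.random() < 0.2            # sample Rain from its prior
    sprinkler = random.random() < 0.1       # sample Sprinkler from its prior
    wet = random.random() < P_WET_TRUE[(rain, sprinkler)]
    if wet:                                  # reject samples where the evidence fails
        kept.append(rain)

print(sum(kept) / len(kept))                 # ≈ 0.645, matching enumeration
```

Because samples that contradict the evidence are discarded, rejection sampling grows wasteful when the evidence is rare; likelihood weighting, next, avoids that waste.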
likelihood weighting: a form of importance sampling where various variables are sampled in a predefined order and where evidence is used to update the weights
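A sketch of likelihood weighting on the same network: the evidence is fixed rather than sampled, and each sample carries a weight equal to the probability of that evidence given its parents:

```python
# Likelihood weighting: no sample is wasted; each is weighted by
# P(evidence | parents) instead.
import random

random.seed(0)
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.05}

weights = {True: 0.0, False: 0.0}
for _ in range(100_000):
    rain = random.random() < 0.2        # sampled from its prior
    sprinkler = random.random() < 0.1   # sampled from its prior
    # WetGrass=True is fixed as evidence; the sample's weight is
    # the probability of that evidence given its sampled parents
    weights[rain] += P_WET_TRUE[(rain, sprinkler)]

total = weights[True] + weights[False]
print(weights[True] / total)            # ≈ 0.645, matching the other methods
```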
Markov assumption: the assumption that the current state depends on only a finite fixed number of previous states
Markov chain: a sequence of random variables where the distribution of each variable follows the Markov assumption
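A sketch of a two-state weather Markov chain; the transition probabilities are illustrative assumptions:

```python
# A Markov chain: tomorrow's weather depends only on today's
# (the Markov assumption).
import random

random.seed(0)
TRANSITIONS = {"sun": {"sun": 0.8, "rain": 0.2},
               "rain": {"sun": 0.3, "rain": 0.7}}

state, chain = "sun", ["sun"]
for _ in range(9):
    state = random.choices(list(TRANSITIONS[state]),
                           weights=list(TRANSITIONS[state].values()))[0]
    chain.append(state)

print(chain)  # e.g. ['sun', 'sun', 'rain', ...]
```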
hidden Markov models: a Markov model for a system with hidden states that generate some observed event
sensor Markov assumption: the assumption that the evidence variable depends only on the corresponding state
filtering: a practical application of probability information: given observations from start until now, calculate a distribution for the current state (sketched in code after these four tasks)
prediction: a practical application of probability information: given observations from start until now, calculate a distribution for a future state
smoothing: a practical application of probability information: given observations from start until now, calculate a distribution for a past state
most likely explanation: a practical application of probability information: given observations from start until now, calculate the most likely sequence of states
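To tie the last few terms together, here is a sketch of filtering in a tiny hidden Markov model: hidden weather states generate umbrella observations (the classic umbrella example), and all probabilities are illustrative assumptions:

```python
# Filtering (the forward algorithm): track the distribution over the
# current hidden state given all observations so far.
TRANS = {"sun": {"sun": 0.8, "rain": 0.2},
         "rain": {"sun": 0.3, "rain": 0.7}}
# P(umbrella seen | hidden state): the sensor Markov assumption
EMIT = {"sun": 0.1, "rain": 0.9}

belief = {"sun": 0.5, "rain": 0.5}          # prior over the first state
for umbrella in [True, True, False]:        # observations so far
    # predict: push the belief through the transition model
    predicted = {s: sum(belief[p] * TRANS[p][s] for p in belief)
                 for s in belief}
    # update: weight by the observation likelihood, then normalize
    likelihood = {s: EMIT[s] if umbrella else 1 - EMIT[s] for s in belief}
    unnorm = {s: predicted[s] * likelihood[s] for s in belief}
    total = sum(unnorm.values())
    belief = {s: v / total for s, v in unnorm.items()}

print(belief)  # current-state distribution given all observations
```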