
Probability Distributions

A probability distribution describes how the values of a random variable are distributed.

The binomial distribution is a discrete probability distribution. It describes the outcome of n independent trials in an experiment. Each trial is assumed to have only two outcomes, labeled as success or failure. If the probability of a successful trial is p, then the probability of having x successful trials in an experiment is as follows.

$$f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

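The binomial distribution is a commonly used probability distribution for describing random phenomena. It takes its name from the fact that its probabilities match the terms of the binomial expansion. It corresponds to n repeated Bernoulli trials: each trial has only two possible, mutually exclusive outcomes, the trials are independent of one another, and the probability of the outcome of interest stays the same throughout the whole series. A series of trials with these properties is called a sequence of Bernoulli trials.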

A simple example: roll a die ten times. The number of times a 4 comes up follows a binomial distribution with n = 10 and p = 1/6.

We apply the function pbinom with x = 4, n = 12, p = 0.2. It returns the cumulative probability of having four or fewer successes.

    > pbinom(4, size=12, prob=0.2)
    [1] 0.92744
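As a quick cross-check, the cumulative probability from pbinom should equal the sum of the point probabilities from dbinom for x = 0 to 4. The sketch below also evaluates the die example (n = 10, p = 1/6) for exactly two 4s. The variable names are illustrative only.

```r
# Cumulative probability P(X <= 4) for n = 12 trials with p = 0.2
p_cum <- pbinom(4, size = 12, prob = 0.2)

# The same value as a sum of point probabilities P(X = 0), ..., P(X = 4)
p_sum <- sum(dbinom(0:4, size = 12, prob = 0.2))

# Die example: probability of rolling exactly two 4s in ten rolls
p_two_fours <- dbinom(2, size = 10, prob = 1/6)

print(c(p_cum, p_sum, p_two_fours))
```

dbinom gives the probability of exactly x successes, while pbinom accumulates those probabilities up to and including x.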

The Poisson distribution is the probability distribution of independent event occurrences in an interval. If λ is the mean number of occurrences per interval, then the probability of having x occurrences within a given interval is:

$$f(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$

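The Poisson distribution is well suited to describing the number of random events that occur per unit of time. Examples include the number of people arriving at a service facility within a given period, the number of calls received by a telephone exchange, the number of passengers waiting at a bus stop, the number of machine failures, and the number of natural disasters.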

If there are twelve cars crossing a bridge per minute on average, find the probability of having seventeen or more cars crossing the bridge in a particular minute.

We compute the upper tail probability of the Poisson distribution with the function ppois. With lower.tail=FALSE, ppois(16, ...) returns P(X > 16), which for a count is the probability of seventeen or more.

    > ppois(16, lambda=12, lower.tail=FALSE)   # find upper tail
    [1] 0.10129

If there are twelve cars crossing a bridge per minute on average, the probability of having seventeen or more cars crossing the bridge in a particular minute is 10.1%.
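Because the Poisson distribution is discrete, it matters whether the threshold itself is included: ppois with lower.tail = FALSE returns P(X > x), not P(X ≥ x). A brief sketch of the distinction, with illustrative variable names:

```r
lambda <- 12  # mean number of cars per minute

# P(X > 16), i.e. seventeen or more cars
p_more_than_16 <- ppois(16, lambda = lambda, lower.tail = FALSE)

# The same value obtained from the lower tail
p_check <- 1 - ppois(16, lambda = lambda)

# P(X >= 16) needs the threshold shifted down by one
p_16_or_more <- ppois(15, lambda = lambda, lower.tail = FALSE)

print(c(p_more_than_16, p_check, p_16_or_more))
```

The two upper tails differ by exactly P(X = 16), which dpois(16, lambda = 12) returns.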

Difference Between the Poisson and Binomial Distributions

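When n is large and p is small, the Poisson distribution can serve as an approximation to the binomial distribution, with λ = np. As a common rule of thumb, the Poisson formula can be used for the approximation when n ≥ 10 and p ≤ 0.1.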
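To see how close the approximation can be, the following sketch compares the two probability mass functions. The values n = 100 and p = 0.03 are arbitrary choices for illustration, not taken from the text.

```r
n <- 100   # number of trials (illustrative value)
p <- 0.03  # success probability (illustrative value)
x <- 0:10  # counts to compare

binom_probs <- dbinom(x, size = n, prob = p)
pois_probs  <- dpois(x, lambda = n * p)   # Poisson with lambda = np

# Largest absolute difference between the two probability mass functions
max_diff <- max(abs(binom_probs - pois_probs))

print(round(cbind(x, binom_probs, pois_probs), 4))
print(max_diff)
```

Trying a larger p with the same n should show the gap between the two columns widening.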

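The continuous uniform distribution is the probability distribution of random number selection from the continuous interval between a and b. Its density function is defined by the following.

$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$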

[Figure: graph of the continuous uniform distribution with a = 1, b = 3.]
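For the case a = 1, b = 3, the density and cumulative probability can be checked numerically with dunif and punif. This is a minimal sketch; the evaluation points are arbitrary.

```r
a <- 1
b <- 3

# Density is constant at 1 / (b - a) inside [a, b] and zero outside
dens_inside  <- dunif(2, min = a, max = b)
dens_outside <- dunif(4, min = a, max = b)

# P(X <= 1.5) grows linearly with x: (1.5 - a) / (b - a)
p_low <- punif(1.5, min = a, max = b)

print(c(dens_inside, dens_outside, p_low))
```

In an interactive session, curve(dunif(x, min = 1, max = 3), from = 0, to = 4) draws the corresponding density.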

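The exponential distribution describes the arrival time of a randomly recurring independent event sequence. If μ is the mean waiting time for the next event recurrence, its probability density function is:

$$f(x) = \begin{cases} \dfrac{1}{\mu} e^{-x/\mu} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$$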

[Figure: graph of the exponential distribution with μ = 1.]

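The exponential distribution is a continuous probability distribution. It can be used to model the time intervals between independent random events, such as the intervals between passengers arriving at an airport or between new entries appearing on Chinese Wikipedia.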

Suppose the mean checkout time of a supermarket cashier is three minutes. Find the probability of a customer checkout being completed by the cashier in less than two minutes.

The checkout processing rate is equal to one divided by the mean checkout completion time. Hence the processing rate is 1/3 checkouts per minute. We then apply the function pexp of the exponential distribution with rate=1/3.

    > pexp(2, rate=1/3)
    [1] 0.48658

The probability of finishing a checkout in under two minutes is about 48.7%.
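The value returned by pexp can be reproduced from the closed form of the exponential cumulative distribution function, 1 − exp(−rate · x). A short sketch with illustrative variable names:

```r
mean_time <- 3          # mean checkout time in minutes
rate <- 1 / mean_time   # checkouts per minute

# Built-in cumulative distribution function
p_builtin <- pexp(2, rate = rate)

# Closed form of the exponential CDF: 1 - exp(-rate * x)
p_manual <- 1 - exp(-rate * 2)

print(c(p_builtin, p_manual))
```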

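The normal distribution is defined by the following probability density function, where μ is the population mean and σ² is the variance.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$$

In particular, the normal distribution with μ = 0 and σ = 1 is called the standard normal distribution, and is denoted as N(0,1).

[Figure: graph of the standard normal distribution.]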

The normal distribution, also known as the Gaussian distribution, is a very important distribution, largely because of the central limit theorem.

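Central Limit Theorem

The central limit theorem states that the mean of n independent samples drawn from a population with mean μ and finite variance σ² is approximately normally distributed with mean μ and variance σ²/n when n is large.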
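A small simulation can make the central limit theorem concrete: means of samples drawn from a clearly non-normal distribution still cluster around the population mean with a spread close to σ/√n. The sketch below assumes an exponential population with mean 1 (so σ = 1). The sample size, repetition count and seed are arbitrary choices.

```r
set.seed(42)   # arbitrary seed, only for reproducibility

n <- 50        # size of each sample
reps <- 10000  # number of samples drawn

# Each row is one sample of size n from an exponential distribution with mean 1
samples <- matrix(rexp(n * reps, rate = 1), nrow = reps)
sample_means <- rowMeans(samples)

# The theorem suggests a mean close to 1 and a standard deviation close to 1 / sqrt(n)
print(mean(sample_means))
print(sd(sample_means))
print(1 / sqrt(n))
```

In an interactive session, hist(sample_means) should show a roughly bell-shaped histogram even though the underlying exponential distribution is strongly skewed.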

Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam?

We apply the function pnorm of the normal distribution with mean 72 and standard deviation 15.2. Since we are looking for the percentage of students scoring higher than 84, we are interested in the upper tail of the normal distribution.

    > pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)
    [1] 0.21492

The percentage of students scoring 84 or more in the exam is about 21.5%.
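The same probability can be obtained by first converting the score to a z-score and then using the standard normal distribution N(0,1). A minimal sketch, with illustrative variable names:

```r
score <- 84
mu <- 72
sigma <- 15.2

# Standardize: number of standard deviations the score lies above the mean
z <- (score - mu) / sigma

# Upper tail of the standard normal distribution at z
p_standard <- pnorm(z, lower.tail = FALSE)

# Upper tail computed directly with mean and sd
p_direct <- pnorm(score, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(z, p_standard, p_direct))
```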

[Figure: graph of the chi-squared distribution with 7 degrees of freedom.]

To find the 95th percentile of the chi-squared distribution with 7 degrees of freedom, we apply the quantile function qchisq against the decimal value 0.95.

    > qchisq(.95, df=7)        # 7 degrees of freedom
    [1] 14.067
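The text does not spell out the definition, but a chi-squared variable with m degrees of freedom is the sum of squares of m independent standard normal variables. The sketch below checks the 95th percentile against a simulation built that way. The seed and repetition count are arbitrary.

```r
set.seed(1)      # arbitrary seed, only for reproducibility

m <- 7           # degrees of freedom
reps <- 100000   # number of simulated values

# Each row holds m independent standard normal draws; sum their squares
z <- matrix(rnorm(m * reps), nrow = reps)
v <- rowSums(z^2)

q_exact <- qchisq(0.95, df = m)   # theoretical 95th percentile
q_sim <- quantile(v, 0.95)        # simulated 95th percentile

# pchisq is the inverse of qchisq, so this recovers 0.95
p_back <- pchisq(q_exact, df = m)

print(c(q_exact, unname(q_sim), p_back))
```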

[Figure: graph of the Student t distribution with 5 degrees of freedom.]

Find the 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of freedom.

    > qt(c(.025, .975), df=5)   # 5 degrees of freedom
    [1] -2.5706  2.5706
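Two properties are worth checking here: the t distribution is symmetric about zero, and its tails are heavier than those of the standard normal distribution, with the gap shrinking as the degrees of freedom grow. A short sketch; the choice of df = 1000 for the "large" case is arbitrary.

```r
# Percentiles of the t distribution with 5 degrees of freedom
q_t <- qt(c(0.025, 0.975), df = 5)

# Corresponding percentiles of the standard normal distribution
q_norm <- qnorm(c(0.025, 0.975))

# With many degrees of freedom the t quantile approaches the normal one
q_t_large <- qt(0.975, df = 1000)

print(q_t)
print(q_norm)
print(q_t_large)
```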

If V1 and V2 are two independent random variables having the chi-squared distribution with m1 and m2 degrees of freedom respectively, then the following quantity follows an F distribution with m1 numerator degrees of freedom and m2 denominator degrees of freedom, i.e., (m1, m2) degrees of freedom.

$$F = \frac{V_1 / m_1}{V_2 / m_2} \sim F_{(m_1, m_2)}$$

[Figure: graph of the F distribution with (5, 2) degrees of freedom.]

Find the 95th percentile of the F distribution with (5, 2) degrees of freedom.

    > qf(.95, df1=5, df2=2)
    [1] 19.296
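The construction of the F distribution as a ratio of scaled chi-squared variables can be checked by simulation: roughly 95% of simulated ratios should fall at or below the percentile returned by qf. This is only a sketch, with an arbitrary seed and repetition count.

```r
set.seed(7)      # arbitrary seed, only for reproducibility

m1 <- 5          # numerator degrees of freedom
m2 <- 2          # denominator degrees of freedom
reps <- 100000   # number of simulated ratios

# Independent chi-squared draws, each divided by its degrees of freedom
v1 <- rchisq(reps, df = m1)
v2 <- rchisq(reps, df = m2)
f_sim <- (v1 / m1) / (v2 / m2)

q_exact <- qf(0.95, df1 = m1, df2 = m2)

# Share of simulated ratios at or below the theoretical 95th percentile
frac_below <- mean(f_sim <= q_exact)

print(c(q_exact, frac_below))
```

The share below the percentile is compared rather than the simulated quantile itself, because the F distribution with only 2 denominator degrees of freedom has a very heavy right tail.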

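The chi-squared (χ²) distribution, the t distribution and the F distribution are collectively known as the three major sampling distributions, because they are all derived from the normal distribution.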

This article is excerpted from cnblogs (部落格園). Original publication date: 2012-02-16.