Hypothesis Testing in Data Analysis
With inferential statistics, we learned how to analyze sample data and make inferences about the population mean and other population parameters. However, we had no way to check whether the conclusions we drew about the population were supported by the evidence. That is where hypothesis testing comes into the picture.
You can find out more about Inferential Statistics and Central Limit Theorem in my previous articles.
Hypothesis
Using inferential, descriptive, and exploratory analysis, we performed some research on a sample of the population. We derived some insights from the sample and made claims about the entire population. These are only claims; they have not yet been verified. This type of claim or assumption is called a hypothesis.
Hypothesis Testing
There are formal procedures for checking a hypothesis against sample data. If the data support the hypothesis, we extend it to the whole population. This process is known as hypothesis testing. The goal is to decide whether there is enough evidence to support the hypothesis. As we have already seen with inferential statistics and the Central Limit Theorem (CLT), we work with sample data, and in hypothesis testing we use that data to evaluate our assumption about the population.
In Hypothesis Testing, we formulate two hypotheses:
Null Hypothesis (H₀): Status quo
Alternate Hypothesis (H₁): It challenges the status quo
Null Hypothesis (H₀)
The null hypothesis is the prevailing belief about a population. It states that there is no change or no difference in the situation or the claim. H₀ denotes the null hypothesis.
Alternate Hypothesis (H₁)
The alternate hypothesis is the claim that opposes the null hypothesis. H₁ denotes the alternate hypothesis.
For example, in a criminal trial, the jury has to decide whether the defendant is innocent or guilty. Here the null hypothesis is that the defendant is innocent, just as they were presumed to be before the charges. The alternate hypothesis is that the defendant is guilty, and the prosecutor tries to prove this.
The outcome of Hypothesis Testing
In hypothesis testing, we reject the null hypothesis if there is sufficient evidence to support the alternate hypothesis. If there is not sufficient evidence for the alternate hypothesis, we fail to reject the null hypothesis. These are the only two conclusions a test can reach. We should never say that we “accept” the null hypothesis, because a lack of evidence against it does not prove that it is true. Either we reject the null hypothesis or we fail to reject it, that’s it.
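The two possible outcomes can be sketched as a small helper. This is only an illustration: it assumes the evidence has already been summarized as a p-value and compared against a significance level (here called `alpha`, with a conventional default of 0.05). Neither quantity has been introduced in this article yet, and the function name `decide` is hypothetical.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test conclusion using only the two permitted wordings.

    Assumption: `p_value` measures how surprising the sample would be
    if the null hypothesis were true, and `alpha` is the chosen
    significance level. Both are illustrative inputs.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        # Enough evidence against H0 in favor of H1.
        return "reject H0"
    # Not enough evidence; note that this is NOT "accept H0".
    return "fail to reject H0"


print(decide(0.01))  # reject H0
print(decide(0.30))  # fail to reject H0
```

Note that the function has no branch that returns "accept H0", which mirrors the rule stated above.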
Example:
Suppose a company has 30,000 employees and claims that it takes the employees an average of 35 minutes to reach the office each day.
Here,
The Null Hypothesis (H₀): Average time for employees = 35 minutes
The Alternate Hypothesis (H₁): Average time for employees ≠ 35 minutes
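As a rough sketch of how this pair of hypotheses could be checked, the snippet below runs a two-sided z-test using only the Python standard library. Everything beyond the 35-minute claim is an assumption made for illustration: the sample is simulated (100 hypothetical commute times), the 0.05 significance level is a common convention rather than something stated in this article, and the normal approximation relies on the sample being large enough for the Central Limit Theorem to apply.

```python
import math
import random
import statistics

CLAIMED_MEAN = 35.0  # minutes, the company's claim (H0)
ALPHA = 0.05         # assumed significance level, not from the article


def two_sided_z_test(sample, claimed_mean):
    """Return (z, p_value) for H0: mean == claimed_mean vs H1: mean != claimed_mean.

    Uses the sample standard deviation and a normal approximation,
    which is reasonable only for fairly large samples.
    """
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    # Two-sided: count deviations in either direction as evidence against H0.
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample of 100 employees; the true mean (38) and spread (10)
# are made up so that there is something to detect.
rng = random.Random(0)
sample = [rng.gauss(38.0, 10.0) for _ in range(100)]

z, p_value = two_sided_z_test(sample, CLAIMED_MEAN)
conclusion = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}: {conclusion}")
```

The test is two-sided because H₁ uses the ≠ sign: a sample mean far above or far below 35 minutes counts as evidence against the claim.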
Formulating the null and alternate hypothesis
There is a common rule to formulate the null and alternate hypotheses from the claim statement.
The null hypothesis always has one of the following signs: =, ≤, or ≥
The alternate hypothesis always has th