Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data. Statistics is a highly interdisciplinary field: research in statistics finds applicability in virtually all scientific fields, and research questions in those fields motivate the development of new statistical methods and theory. In developing methods and studying the theory that underlies them, statisticians draw on a variety of mathematical and computational tools.

Two fundamental ideas in the field of statistics are uncertainty and variation. There are many situations that we encounter in science (or more generally in life) in which the outcome is uncertain. In some cases the uncertainty is because the outcome in question is not determined yet (e.g., we may not know whether it will rain tomorrow) while in other cases the uncertainty is because although the outcome has been determined already we are not aware of it (e.g., we may not know whether we passed a particular exam).

Probability is a mathematical language used to discuss uncertain events and probability plays a key role in statistics. Any measurement or data collection effort is subject to a number of sources of variation. By this we mean that if the same measurement were repeated, then the answer would likely change. Statisticians attempt to understand and control (where possible) the sources of variation in any situation.

“What Is Statistics?”. 2021. *stat.uci.edu*. https://www.stat.uci.edu/what-is-statistics/.
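The idea that a repeated measurement would likely give a different answer can be made concrete with a small simulation. This is only a sketch: the true value, the noise level, the Gaussian noise model, and the function name `repeated_measurements` are all assumptions chosen for illustration, not something taken from the source above.

```python
import random
import statistics


def repeated_measurements(true_value: float, noise_sd: float, n: int, seed: int = 0) -> list[float]:
    """Simulate n repeated measurements of one quantity, each with Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    return [rng.gauss(true_value, noise_sd) for _ in range(n)]


# Hypothetical setup: a quantity whose true value is 20.0, measured 10 times.
measurements = repeated_measurements(true_value=20.0, noise_sd=0.5, n=10)

# The mean summarizes the typical reading; the standard deviation
# quantifies how much the readings vary from one repetition to the next.
mean = statistics.mean(measurements)
spread = statistics.stdev(measurements)
print(f"mean = {mean:.2f}, standard deviation = {spread:.2f}")
```

Increasing `n` or reducing `noise_sd` is a rough analogue of what the passage calls controlling the sources of variation.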

## Probability vs Statistics

| Probability | Statistics |
| --- | --- |
| Probability is a measure of how likely an event is to occur. Because it is a quantified measure, it needs a mathematical foundation, which is known as probability theory. | Statistics is the discipline of collecting, organizing, analyzing, interpreting, and presenting data. Most statistical models are based on experiments and hypotheses, and probability is integrated into the theory to explain the scenarios better. |

*Probability vs Statistics*

**What is the difference between Probability and Statistics?**

- Probability and statistics can be considered two opposite processes, or rather two inverse processes.
- Using probability theory, the randomness or uncertainty of a system is measured by means of its random variables. As a result of the comprehensive model developed, the behaviour of the individual elements can be predicted. But in statistics, a small number of observations is used to predict the behaviour of a larger set whereas, in probability, limited observations are selected at random from the population (the larger set).
- More clearly, it can be stated that using probability theory the general results can be used to interpret individual events, and the properties of the population are used to determine the properties of a smaller set. The probability model provides the data regarding the population.
- In statistics, the general model is based on specific events, and the sample properties are used to infer the characteristics of the population. Also, the statistical model is based on the observations/ data.
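The two inverse directions described in the list above can be sketched with a coin-flip example. The binomial model, the numbers (7 heads in 10 flips), and the helper names `binomial_pmf` and `estimate_p` are assumptions made for this illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def estimate_p(successes: int, trials: int) -> float:
    """Sample proportion, which is the maximum-likelihood estimate of p."""
    return successes / trials


# Probability: the model is known (a fair coin), so we predict the data.
prob_7_heads = binomial_pmf(7, 10, 0.5)

# Statistics: the data are known (7 heads in 10 flips), so we infer the model.
p_hat = estimate_p(7, 10)

print(f"P(7 heads in 10 flips of a fair coin) = {prob_7_heads:.4f}")
print(f"Estimated chance of heads from the sample = {p_hat:.2f}")
```

With only 10 flips the estimate is quite uncertain, which is why a statistical analysis would normally report an interval around `p_hat` rather than the point estimate alone.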

“Difference Between Probability And Statistics | Compare The Difference Between Similar Terms”. 2012. *Compare The Difference Between Similar Terms*. https://www.differencebetween.com/difference-between-probability-and-vs-statistics/.

## Recommended

Kunin, Daniel. 2021. “Seeing Theory”. *seeing-theory.brown.edu*. https://seeing-theory.brown.edu/.

An interactive visual introduction to probability and statistics. It currently covers six chapters: basic probability, compound probability, probability distributions, frequentist inference, Bayesian inference, and regression analysis. Each chapter contains interactive exercises to help visualize and understand the information.

“Boundless Statistics | Simple Book Publishing”. 2022. *courses.lumenlearning.com*. https://courses.lumenlearning.com/boundless-statistics/.

## Additional Reading

“Crash Course – Statistics”. 2021. *thecrashcourse.com*. https://thecrashcourse.com/courses/statistics.

“How To Do Statistics – The Easy Way”. 2019. *Chi-Squared Innovations*. https://www.chi2innovations.com/blog/discover-stats-blog-series/how-to-use-statistics-to-plan-your-study/.

If you want to know how to do statistics, you’ll need to know how to navigate through statistics to create a plan-of-action for your study. Where do you start? Should you bury your head in the sand and go out and collect your data without having a strategy on how to analyse it? That’s probably not a good way to start – and yet that’s *exactly* where most people do start! And three years later, when their thesis or report is overdue, they take their data to a statistician and say ‘I just need a p-value for my paper’. Little do they know that they need a miracle to rescue their study! Fortunately, there are ways of creating a coherent plan-of-action for your next study, and all you need are a process, a strategy and a nice, big diagram of The Big Picture of statistics!

“Statistics 101: A No BS Introduction For Dummies”. 2021. *Medium*. https://medium.com/co-learning-lounge/introduction-to-probability-and-statistics-in-data-science-170e63773708.

“Statistics – The Last Dark Art”. 2016. *Chi-Squared Innovations*. https://www.chi2innovations.com/blog/discover-stats-blog-series/statistics-last-dark-art/#t-1591631358650.

Statistics is the study of the analysis, interpretation and presentation of data, and in particular of **measuring and controlling uncertainty**. It is the ‘uncertainty’ that makes statistics a distinct flavour of mathematics, and is the reason that statisticians argue a lot.

Taylor, Kylie. “The Difference Between Probability And Statistics”. 2023. *Medium*. https://kylie-taylor.medium.com/the-difference-between-probability-and-statistics-dd64ffe73964.

**Probability:** We *know* the source of our data and want to calculate an *outcome*.

**Statistics:** We know an *outcome* and want to *learn* the source of our data.

## Abuse of Statistics?

**What is your favorite example of abuse of statistics?**

*“More than 80% of Dentists recommend Colgate”*

Sounds like a winning claim, right? Makes us think Colgate is recommended by 80% of dentists, while the remaining 20% might have recommended some other brands.

However, each doctor surveyed could actually recommend several brands, not just one. That is, 80% of those doctors who recommended Colgate could also have recommended any other brand simultaneously. For all we know, there could be a brand X that might have been recommended by even more than 80% of doctors! What a convenient way to conclude such a survey, isn’t it?

This particular ad was banned by the UK govt, yet they did continue to use it in other parts of the world.

Prasad, Ganesh. “What Is Your Favorite Example Of Abuse Of Statistics?” 2022. *Quora*. https://www.quora.com/What-is-your-favorite-example-of-abuse-of-statistics.
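The effect described in the Colgate example, where each respondent may name several brands, can be sketched as follows. The ten responses and the brand names "Brand X" and "Brand Y" are entirely hypothetical. They are constructed only to show how the percentages behave and are not data from the actual survey.

```python
from collections import Counter

# Hypothetical responses: each dentist may recommend more than one brand.
responses = [
    {"Colgate", "Brand X"},
    {"Colgate", "Brand X", "Brand Y"},
    {"Colgate", "Brand X"},
    {"Colgate", "Brand X", "Brand Y"},
    {"Colgate", "Brand X"},
    {"Colgate", "Brand X", "Brand Y"},
    {"Colgate", "Brand X"},
    {"Colgate", "Brand Y"},
    {"Brand X", "Brand Y"},
    {"Brand X"},
]


def recommendation_shares(answers: list[set[str]]) -> dict[str, float]:
    """Fraction of respondents who named each brand at least once."""
    counts = Counter(brand for answer in answers for brand in answer)
    return {brand: count / len(answers) for brand, count in counts.items()}


shares = recommendation_shares(responses)
for brand, share in sorted(shares.items()):
    print(f"{brand}: {share:.0%}")

# Because answers overlap, the shares add up to far more than 100%,
# so "80% recommend Colgate" says nothing about the other brands.
print(f"Sum of shares: {sum(shares.values()):.0%}")
```

In this made-up data Colgate is named by 80% of respondents while Brand X is named by 90%, which is the situation the answer warns about.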

### Statistics Don’t Lie But Liars Use Statistics

One statistic that has always bothered me, like the Colgate statistic, is for Crest products:

*“4 out of 5 dentists recommend Crest.”*

- According to that statement, 20% of dentists do NOT recommend you use Crest.
- The statement doesn’t claim 80% of dentists think Crest is either the best or only one to use.
- How many dentists did they contact? What incentive did the dentists have for participating in the study? Were only dentists surveyed who were biased toward Crest products?
- “P&G (Procter & Gamble) sent the Crest product to 344 dentists who were asked to use the product for one week. **The dentists were paid $75 to participate in a survey.** 269 dentists participated in a phone survey where they were asked “Based on your experience using this oral rinse, which of the following statements best describes your most likely recommendation of this oral rinse to your patients?” According to the complaint, P&G arrived at the 4 out of 5 number by combining those who responded that they ‘would recommend’ the product with those who responded that they ‘would recommend only if their patients asked about it.’ Pfizer alleges that this hypothetical recommendation does not constitute proper substantiation that health professionals recommend the product in their actual practice.”

“Four Out Of Five Comparative Statements May Be Misleading”. 2006. *The Trademark Blog*. https://www.schwimmerlegal.com/2006/03/four-out-of-five-comparative-statements-may-be-misleading.html.
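The quoted complaint turns on how answer categories were combined. A rough sketch of that arithmetic is below. The counts are invented for a hypothetical group of 100 respondents, because the source does not report how the 269 actual responses were split across categories.

```python
# Hypothetical answer counts for 100 respondents (NOT the real survey data).
answers = {
    "would recommend": 45,
    "would recommend only if asked": 35,
    "would not recommend": 20,
}

total = sum(answers.values())

# Strict reading: only unconditional recommendations count.
strict_share = answers["would recommend"] / total

# Combined reading: conditional recommendations are folded in as well.
combined_share = (
    answers["would recommend"] + answers["would recommend only if asked"]
) / total

print(f"Unconditional recommendations: {strict_share:.0%}")
print(f"After combining categories:    {combined_share:.0%}")
```

Under these assumed counts the headline figure moves from 45% to 80% purely through the choice of which categories to merge, without any change in what the respondents said.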


“Question everything, learn something, and answer nothing.”

Euripides

## Videos

“Video Index – StatQuest!!!”. 2022. *StatQuest!!!*. https://statquest.org/video-index/.

There are two ways to see all of my videos and navigate them. Probably the best way is to use this Learney Flow Chart, which was created by my friends at Learney.me. What makes it so awesome is that you can easily pick the general topic you are interested in and then see all of the relevant videos and their dependencies. Alternatively, you can find everything right here, just not as well organized.

This page contains links to playlists and individual videos on Statistics, Statistical Tests, Machine Learning, Webinars and Live Streams, organized, roughly, by category. Generally speaking, the videos are organized from basic concepts to complicated concepts, so, in theory, you should be able to start at the top and work your way down and everything will make sense.

The featured image on this page is from the Stanford Online website.