Numbers. What do numbers mean to us? When we see numbers for expiry, reliability or recyclability on our products, they are a sign of proof and validation. When we see ratings on movies and shows, they tell us whether something is worth our time. We feel discomfort at the numbers on price tags and satisfaction at the numbers that signal quality. We fear the numbers in pollution statistics and are relieved by the numbers associated with economic growth. Some numbers don’t intimidate us while others do, mostly large numbers that take a few seconds to process and convert in our minds, and numbers that need to be calculated. There is a clear difference in emotion when we read 1L and when we read 100000.
A survey taken in the UK in 2003 said that only 47% of people met the minimum level of numeracy, which is the ability to deal with fractions, decimals and percentages. Even after the effort and emphasis the state put on this, the next survey in 2011 showed an increase of only 2%. This almost 50-50 situation clearly shows that there are two kinds of people: those who are comfortable with numbers and those who are not.
But numbers can be put to different uses: algebra, geometry, statistics and so on. Statistics receives mixed reactions from non-mathematicians and even from mathematicians. Yet statistics gives a feeling of belonging and is personal in what it tells us, because statistics are about us as a group and not as individuals. By definition it is ‘the science of dealing with data about the state and the community we live in.’ As Alan Smith said, we as social animals are fascinated by how we relate to our groups and peers. We need a hook. We need to attach ourselves to some sort of community, directly or indirectly. Some of these connections we already know, and some come as a surprise to us. The most powerful stats are the ones that surprise us.
Why are we surprised by data and statistics? People’s perception of reality and the reality itself differ by a large amount. Daniel Kahneman and his peers spent years studying the disconnect between what people perceive and what is real. They found that people are poor intuitive statisticians. But what causes this misperception? Lack of individual experience and the influence of the media are part of it. Geography is also an issue. It is not easy to ask someone for knowledge or data about an entire country or state. If we instead focus on a person’s own region or city, they might know. But even then people seem to hold misconceptions. People are even blind to their own blindness: they simply believe their notion, and most of the time they do not even doubt it.
Smith talked about how gamifying the data survey got a quarter of a million people playing within 48 hours. It even became a topic of discussion, a movement and, for some, an interest. I realized this made the numbers less intimidating. It changed quantity from a number into characters. It made the data easier for people to understand, and it managed to hook both kinds of people: those who like numbers and those who don’t. Numbers can surprise us all.
Hence, I agree with him when he says ‘statistics is the science of us and that is why we should be fascinated’. Statistics is something we shouldn’t be intimidated by; in fact, we should know how to apply it in our day-to-day life. In school all of us are made to focus more on algebra and calculus, but I believe that with statistics we can reduce the gap between our conceptions and reality.
I have personally always loved playing with numbers. Numbers can mean so many things, let you discover so many things, clarify so many things and guarantee others. There is a thrill in dealing with numbers. Manipulation of numbers can be spotted quite easily. We can make a million different predictions with the data in hand. But with all this we need the right way to collect data, the right data to collect and the right interpretation of it to get where we expected to be or to find what we were looking for.
For all the reasons mentioned above, it is understandable that people are skeptical about numbers. One must understand which numbers are reliable and which are not. There are various sources we can get data from, including government statistics and private statistics. One hack I have found is to get data from an apt tier of data source. The bottom tier is data from bureaus, research papers, official government data, etc. The second tier is derivative sources like websites, private company data, etc. The topmost, lowest-ranked tier is data from wikis, videos, etc. This is one of the tools I have found for getting apt data for apt situations.
Government statistics are for the public, not for a certain target audience. Private companies could manipulate data to support their values. But even then government numbers are distrusted and do not feel genuine. Why? There is a long chain of cause and effect behind this, but a few reasons could be relativity, taking averages, omitting extreme data to get uniform results, etc.
But we must also understand that some government data are important, and we need them to make sense of society and to understand crucial problems by moving beyond emotions. I understand the emotions behind being skeptical about collecting data where cultures, races, immigrants, discriminated populations and other controversial and sensitive topics are involved. We also experienced such a situation while studying the ‘race controversy behind bell curves’. But to take any action, if need be, we must understand the situation, and for that we need data. For any campaign, policy or bill to be passed in favor of any community, society, workforce, etc., we must understand the number of people who will benefit from it or be affected by it.
I also feel that because so many people are afraid of numbers and do not understand them, they doubt them. Hence, we need to learn to understand data and, more importantly, we need the skills to spot bad statistics rather than blindly accepting or blindly ignoring them. One way to make numbers more accurate is to have as many people as possible question them. Here are some questions we must ask while looking into data.
1. Can you see uncertainty?
Journalists are needed to report the facts and not to predict them. A lot of the time, to fill gaps in the data gathered, journalists put in their own interpretations and predictions. When this data is referred to later, further interpretations and predictions make their way in.
In recent times we have seen that polling has become very inaccurate. The reason is that our societies have become very diverse, making it difficult for pollsters to get a good representative sample of the population. Also, people might lie. But as soon as these numbers are turned into visual form, they tend to be trusted. A lot of data visualizations overstate certainty. “Charts numb our brains to criticism”. Numbers make you feel skeptical, but a chart gains trust.
Data journalist Mona Chalabi took real data sets and turned them into hand-drawn visualizations so that people can see how imprecise the data is and see that a human made it. In her visualizations, shaky lines convey imprecision and reflect that there is no exact number.
As we discussed earlier and saw in the examples by Mona Chalabi, averages can be misleading. An average hides the extreme results, and sometimes the extremes are dropped before the average is taken, so the spread of the data looks more uniform than it really is.
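To make the point about averages concrete, here is a minimal sketch in Python. The income figures are made up purely for illustration (they are not from any survey), and the helper name trimmed_mean is my own. It compares the plain mean, the median, and the mean taken after dropping the smallest and largest values.

```python
from statistics import mean, median

# Hypothetical monthly incomes (illustrative values only, not real survey data).
# Nine similar values and one extreme value.
incomes = [18, 20, 21, 22, 23, 24, 25, 26, 28, 400]


def trimmed_mean(values, drop=1):
    """Mean after dropping the `drop` smallest and `drop` largest values."""
    ordered = sorted(values)
    return mean(ordered[drop:len(ordered) - drop])


summary = {
    "mean": mean(incomes),                   # pulled upward by the single extreme value
    "median": median(incomes),               # the middle of the ordered data
    "trimmed_mean": trimmed_mean(incomes),   # extremes removed before averaging
    "range": (min(incomes), max(incomes)),   # what a single average does not show
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up data the mean is far above what nine of the ten people earn, while the trimmed mean looks tidy only because the extremes were removed first. Neither number alone tells the reader that the values run from 18 to 400.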
2. Can you see yourself in the data? People are frustrated and distrust government data because they can’t picture themselves in it. They do not relate to the situation, and their problems on the matter might not even have been mentioned, let alone described.
The point of asking where one fits in is to get as much context as possible. ‘The axis is everything. Once you change the scale you can change the story!’
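The effect of the axis can be shown with simple arithmetic. The sketch below is only an illustration: the two values and the function name drawn_height_ratio are hypothetical, and it assumes a plain bar chart in which each bar is drawn from the start of the axis up to its value.

```python
def drawn_height_ratio(first, second, axis_start=0.0):
    """How many times taller the second bar looks than the first
    when the value axis begins at `axis_start`."""
    if axis_start >= min(first, second):
        raise ValueError("axis_start must be below both values")
    return (second - axis_start) / (first - axis_start)


# Hypothetical values: a rise from 50 to 52.
before, after = 50.0, 52.0

full_axis = drawn_height_ratio(before, after, axis_start=0.0)
cut_axis = drawn_height_ratio(before, after, axis_start=48.0)

print(f"axis from 0:  second bar looks {full_axis:.2f} times as tall")
print(f"axis from 48: second bar looks {cut_axis:.2f} times as tall")
```

The data is identical in both cases. With the axis starting at 0 the two bars look almost the same height, but with the axis starting at 48 the second bar is drawn twice as tall, so the same change reads as a very different story.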
3. How was the data collected? This is tough because not everybody is acquainted with the methodologies, and there can be a gap in understanding the data. We can take simple steps and learn to recognize some situations in which one set of data is better than another. One is the tool I mentioned, the pyramid of sources. We can also keep in mind that government statistics are generally better than private statistics. Private companies don’t care whether the statistics are right; they just need the right numbers. Government statistics, on the other hand, are impartial, in theory at least.
But when it comes to validating data, it is the other way around. With private data you can look into the products or services, give feedback and mark the statistics wrong or right, but how do you question government statistics? Government statistics have a larger audience and also a larger sample population. To validate them, we would need to reach that large sample population, which is not possible.
Nevertheless, by applying the simple tools mentioned above, asking the right questions, making sure that the source of the data is reliable and checking whether the chart shows everything there is to see, we will be able to understand, reason about, interpret and validate data to the best of our abilities.