I think the article hit the nail on the head: data phobia.
It is a very tough sell to tell people that their intuitions are wrong, especially when certain practices and beliefs are long entrenched.
It is hard for me to describe to anyone the resistance to anything you present if you haven't been there. The expectations for accuracy are far beyond what can be realistically accomplished. It is humiliating and frustrating to see people who have no business analyzing your work comb over every last number and if ONE number is wrong, the whole plan tumbles.
In my experience, the communication issue is teaching sales, management, etc, that the point is not to play whack-a-mole and "fix" everything, and that the data is not meant to be used as a hammer (IMO), but that it is simply there to point the company in a direction, or at least, show where things could be improved and encourage good directions that already exist.
I believe that the expectations do not align with reality at this point. Everyone is looking for some mythical Fountain of XYZ, and it simply does not exist.
"Lies, Damn Lies, and Statistics" is so ingrained in our conscience that the expected reaction to Big Data is a knee-jerk mistrust to whatever is presented. It is a serious issue, and we that have to analyze data have to be cognizant of the fact that we are pushing back against a century (centuries?) of idiots who have used statistics to lie, justify false information, and push agendas.
This reaction can vary heavily depending on the traits of the executives, and unfortunately, the kind of personality that gets to the top in embedded corporate hierarchies tends to be people who are excellent at the image of productivity rather than true productivity. To these people, there is nothing more threatening than objective facts. They can manipulate and hide from reality with the subjective. The objective is a new world where they are exposed for the frauds that they are.
The U.S. Federal Government is packed with these types.
In the corporate world, this is because the purpose of enterprise data is less to drive decisionmaking, and more to justify decisions that the executives were going to make anyway, which will usually just be whatever has the greatest benefit to their own careers.
In the public world, though, it seems to me like big data has had pretty huge results. The most commonly cited example is probably Nate Silver predicting the 2012 US presidential election results correctly for all 50 states using big data sources and techniques--this degree of predictive statistical analysis was previously unheard of in politics.
Is there some kind of gene which predisposes people to throwing around buzzwords? How is it that human behaviour gave rise to "big data" and "the cloud"?
The "gene" you speak of is a desperation to prove relevance when a person has a sneaking suspicion in their heart that they don't add value.
Ever notice how the people who throw around the most buzz words are those who think of, and refer to themselves as "big picture" thinkers? They have conveniently tricked themselves into thinking that focusing on and mastering any true skill is a distraction from understanding the broader landscape in which they operate.
The example given of Big Data use is Netflix allegedly checking its user data to decide whether to buy House of Cards.
It's not evident that such Big Data was useful in the way described. Anyway, the data described could easily be collected - likely at the same cost and accuracy - in an old-fashioned focus group type of way.
Deciding to buy House of Cards required insight - outside of the data and statistics - to ask the right questions and come to a conclusion.
Has using 'Big Data' here led to more accurate or better-value results? Is this even really what 'Big Data' is? Absolutely not clear.
'Data phobia' might be an issue but is not the major issue. The issue is using phrases like 'Big Data' which don't actually mean anything: this phrase alone doesn't create value for business.
Big corporations that are known to have actual big data are also hardly known for being early adopters... It will take much more time than expected to see data-driven companies, and I actually think the finance industry will be the first to do real, sensible big data analysis.
It will also take something more than Hadoop and the like to do something real... Don't like to be that guy, but please, stop rewriting yet another query language and try to write efficient engines instead. ;)
I worked at a financial software startup from 2005-2007, and the financial industry was already using big data pervasively. Hell, LTCM failed in 1998 because they were doing big data analyses of small spreads and forgot about some of the holes in their analyses, namely what would happen if a "black swan" event moved currency prices out of their historical norms. Quant hedge funds like DE Shaw, Citadel, and RennTech have been around since the 80s.
Heck, back in high school one of the math competitions I won was sponsored by INFORMS (the Institute For Operations Research and the Management Sciences), and I asked my dad "What's operations research?" and he said that it's where people with math Ph.Ds go to make big bucks. Companies like FedEx, Safeway, and WalMart have relied on their massive amounts of operations data to do things like minimize transportation costs or ensure that they're stocking items with maximum demand for decades.
The thing is, everybody who gains a competitive advantage from big data has reason to keep that fact secret, so that their competitors don't start doing the same thing. What's changed now is that the media itself is facing competitors that use big data themselves to make themselves more relevant than traditional media, and so suddenly it's a Story Of Consequence. It's not that big data is the next big thing - it's that big data was the last big thing that you are only hearing about now, and suddenly a bunch of folks that had never heard of it are now trying to play catch-up.
For me this is the key:
"when you ask your data the right questions you can find hugely valuable insights"
Many people treat data as the Oracle of Delphi, hoping to just make an offering (investment) and wait for knowledge to be dropped. This idea that you can just sift through endless data and pull out insights is just plain wrong. First start with the decisions you can or have to make, then look at the data to see if it gives you insight into the best choice.
Fears and phobias never helped coping with reality.
The stream of data that we and our environments create(d) has been a reality since the moment we made it digital (vs. analog) and started processing it with computers. As soon as you are able to process the data, you are often just one switch / flag away from also storing it.
When telephone systems and backbones started to become digital on a broader scale in the 1990s, the surveillance and data collection that already existed at the time expanded massively because data became accessible and usable on a large scale. The first huge processing farms were built for the FBI, NSA etc., and Congress appropriated hundreds of millions for them. For the next version of those server farms, the U.S. Congress already provided dozens of billions 4-5 years ago. Follow the money and you can find out how long this has been going on.
With every step our world takes toward becoming more digital, these "virtual images" of us will become more complete and we will become more dependent on them. Ask yourself how often you are still using a paper map today vs 10 years ago; soon there will be a generation of people who will only navigate with digital maps and GPS, leaving a trace of their every movement - for them it will be the norm and they will not know another way. If you prefer a more positive image, think about people living with heart disease and how many of them will survive because of 24/7 surveillance.
One key element with all that data that is often overlooked is that once we depend on it within many areas of our lives, falsification of "data elements" or blocking access to "data services" has a substantial impact on the person herself. Think about finding a new job so that you can pay your rent - almost every step within that is already digital. And faking email conversations, phone interviews etc. today only requires limited resources for "someone" having mandatory access to the mail servers and internet infrastructure. Putting incriminating material on the computers of your business / political / life "opponents" might already be a drag and click activity for some of those. Falsified / sabotaged financial transactions have been reported by various political activists for years and have long been part of Hollywood folklore / films. In short - soon "some" will be able to completely change our real lives by "working the data" - if we are economically successful, whom we meet, what we think about others / products / politics, if we die from diseases or not...
It would be too easy to say that what happens with all the data about us, our activities and interactions is a matter of what society we are living in, if it's a true democracy with civil rights or an oppressive state. This per se is an illusion. It is denying how the majority of people are living their lives, they want to be part of a community, be safe, have no problems, enjoy incentives (or things you can buy). And it is denying how governments work, because the sole existence of such a feedback / control / surveillance / oppression mechanism is changing society itself and the way we are governed.
Think along the lines - what can be done will be done. And if powers of "some" in our societies continue to be expanded day-after-day - it certainly will.