Wednesday, April 2, 2014

Big Data, Human Matters


My article from an old Media & Tech blog (February 14th 2012)

Explosive advances in media and technology over the past decade alone have led us to the Age of Big Data. So much more information is at our disposal than ever before that it all seems to be morphing into new organisms, which scurry like mercury from a broken thermometer and spread like a viral video on YouTube.

It was my friend, Patrick, who sent me this article. Blake, who was among the recipients, responded by saying he thought about creating a fictitious profile on Facebook in order to game the system. The following is my e-mail response:

Many thanks, as always, to Patrick for sending us super interesting stuff!

For me, Big Data (in caps now; cf. Big Brother) is terribly exciting and daunting at the same time. I’m trying to get my head around it, frankly. But here’s a notion I hold to: Science, mathematics, technology etc. are, at their essence, human endeavors. I believe in their rigor and logic, and in their ability to extend our grasp of things well beyond our imagination. But these are all subject to the best and worst of our humanity. So gathering and analyzing Big Data knowledgeably, responsibly and cautiously serves our purpose well.


However, the researcher and/or statistician in us knows very well that the power of an analytic tool to find differences increases with the size of the data set. The problem is that some of these differences are ‘false positives’ (i.e., meaningless, even spurious). In other words, with a big enough data set, we can find almost anything we want to find. This was acknowledged in the NY Times article, and it’s obviously not a good thing.
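
To make the point about data set size concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own, not anything taken from the NY Times article: it assumes a simple two-group comparison (a two-sample z-test with equal group sizes and a standard deviation of 1), and the effect size of 0.01 is a made-up number chosen to stand for a difference too small to matter in practice.

```python
import math


def two_sided_p_value(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test when the observed
    difference in means equals `effect_size` standard deviations
    and each group holds `n_per_group` observations."""
    # Standard error of the difference between two means (sd = 1 assumed).
    standard_error = math.sqrt(2.0 / n_per_group)
    z = effect_size / standard_error
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical effect: one hundredth of a standard deviation.
tiny_effect = 0.01

for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(tiny_effect, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:>9,}: p = {p:.3g} ({verdict} at 0.05)")
```

The difference between the groups never changes; only the number of observations does. With a hundred or ten thousand people per group the difference is nowhere near the conventional 0.05 threshold, while with a million per group it clears the threshold easily, even though it is just as trivial as before.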

Good point, Blake. But I think that system has been gamed already. For example, I wonder how many of the 800+ million on Facebook are actual people, brands, or organizations. If its algorithms are truly smart, then they’ll make a well-calculated estimate of fake profiles and adjust down the total number of users accordingly. I don’t know if they do this or not. But in the meantime, the 800+ million figure we often hear is most accurately described as profiles, not unique or actual users.

Now, in light of these points, we can take analytic reports of Facebook posts, Twitter feeds etc. with lots of grains of salt.


Thank you for reading, and let me know what you think!

Ron Villejo, PhD
