Some Topics I am Excited About

I am enrolled in a course offered by Worldview at Stanford called “Behind and Beyond Big Data”. As part of this, I am learning a lot about how big data is being used in various interesting ways. In addition, a couple of other things that I saw on TV or heard about on social media have also captured my interest.

Tweets predicting rates of heart attacks. https://goo.gl/2iInWM

Predictive Modeling based on Facebook Likes

Michal Kosinski, David Stillwell, and Thore Graepel from the University of Cambridge and Microsoft designed an experiment to see if a person’s Facebook Likes are a predictor of that person’s “private traits and attributes”, such as age, intelligence and sexual orientation. They describe their research here. A very large sample of Facebook users contributed voluntarily to the research by participating in the myPersonality initiative. The researchers also manually inspected the volunteers’ Facebook profiles in some cases to infer additional information such as ethnic origin. It is a fascinating experiment. If you are interested in checking how well the system predicts your traits and attributes, try it out at Apply Magic Sauce.

I can tell you that my predictions were pathetic overall. I should say that I was very happy with just one prediction (it predicted my age to be half of my real age!), but was really upset with some others (such as my intelligence relative to the rest of the population). A model is only as good as the hypothesis being tested and the underlying data. Since the participants volunteered, there is a natural self-selection bias. In addition, their statement in the paper, “Visual inspection of profile pictures was used to assign ethnic origin to a randomly selected subsample of users (n = 7,000; 73% Caucasian; 14% African American; 13% others)”, may indicate why my score is as poor as it turned out to be! Still, it is an interesting experiment.

In the course, one of the authors cautions that such social science research can also be misused. An authoritarian regime can use such models to “predict” the political or sexual orientation of its citizens and then do whatever it wants with them! I hope the authors refine the model and explain its limitations carefully to everyone before it is used in an inappropriate fashion. On the other hand, authoritarian regimes may not care.

Predicting rates of Heart Attacks using Tweets

Another fascinating piece of research I read about analyzes tweets to understand levels of stress, which are predictors of heart attacks. By correlating the tone of the tweets with the locations associated with them, the authors have been able to predict the rates of heart attacks. As one of the authors said in an interview, “Obviously, the ones dying from heart attacks are not the ones tweeting, but where one lives is an indicator of stress which in turn is a predictor of heart attacks” (a couple of years ago, I would have challenged him on that one – I was about to tweet when I was about to be wheeled into a procedure and only a stern look from my wife stopped me from doing that!).

You can read more about this here. The description of the project taken from that page is: “They found that expressions of negative emotions such as anger, stress and fatigue in a county’s tweets were associated with higher heart disease risk. On the other hand, positive emotions like excitement and optimism were associated with lower risk.”

The authors were given access to some 10% of tweets from 2009 to 2010. There are questions about sampling and whether Twitter users represent the population of a community in any meaningful way. One of the authors explains the rationale for the research here: “So even if we both live in the most beautiful neighborhood in New York City, and I’m really, really angry and I’m on the road with you, you will get some of that anger.”

Everyone doing big data research cautions that most of these studies tend to be based on correlations rather than causation.

I am reminded of the excitement that Google Flu Trends generated initially and how it all turned out in the end. So, it is worth waiting to see what happens.

Treating Cancer using Polio Virus

I heard a fascinating story on 60 Minutes yesterday about how researchers from Duke University are experimenting with the use of the polio virus to fight brain tumors (glioblastoma). The key is a slightly re-engineered version of the polio virus in which the portion that causes paralysis is removed, but the other structural aspects necessary for it to attach itself to a cancer cell are preserved. Once injected, the polio virus invades the cancer cells in a way that allows the body’s immune system to recognize the tumor as a threat and attack and kill it. Though the phase 1 trial had many success stories, complications arising from the experiment also resulted in the deaths of some of the pioneer patients willing to sacrifice themselves for this type of research.

Based on the success of this trial, the FDA has granted this research “breakthrough status”, which means that the procedure will be available sooner than the usual lengthy process would allow. The researchers feel that this can also help in many other cancers.

 
