Social Substrates: People and the data they make (David Ayman Shamma)




UC Berkeley School of Information

Summary: Everything we do online leaves traces: our tweets, Facebook likes, and YouTube views. Currently, Big Data is all about sifting through cloud stores of these traces with little question as to why those traces exist. Big Data analyses are based on data that are already collected; they are not about asking what should be collected to answer important social and motivational questions. I ask: What motivates people to do what they do? And how can we build predictive models of what people do based on their contextualized and emerging interests, and not just their numerical data? Finding the reasons why people do what they do, and why they create the data trails in the first place, invites a new set of questions and demands a new set of methods. I present investigations into uncovering and understanding these motivations through three areas of inquiry: genre classification, topic prediction, and event detection. I propose changes to how we measure engagement, how we design system instrumentation, and how we design for data collection, aggregation, and summarization. These changes have immediate implications for how we understand human behavior online and build new experiences, and they bring ramifications for the next generation of large data solutions.