Workshop program

The Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media will be published with the journal Frontiers in Big Data.

June 11, 2019

Coffee Break Morning: 10:15AM - 10:45AM
Lunch Break: 12:30PM - 2:00PM
Coffee Break Afternoon: 3:45PM - 4:15PM


Lunch Talk (1:15PM - 1:45PM, Room H.004)

History and Future of Data Science at Reddit: a Product & Experimentation Perspective
Lin Huang (Director of Data Science & Engineering)



W1: Complex Systems Perspectives on Algorithmic Bias (Room H.003)
Kenny Joseph; Ancsa Hannák

W3: Data for the Wellbeing of Most Vulnerable (Room H.001)
Yelena Mejova; Daniela Paolotti; Kyriaki Kalimeri; Rumi Chunara

W4: Managing and Designing for Norms in Online Communities (Room H.202)
Sanjay Kairam; Eshwar Chandrasekharan; Joseph Seering; Stevie N. Chancellor

W5: Demographic Research with Web and Social Media Data (Room H.204)
Diego Alburez-Gutierrez; Sofia Gil; Emilio Zagheni

W6: Modeling and Mining Social-Media-Driven Complex Networks (Room H.206)
Roberto Interdonato; Sabrina Gaito; Alessandra Sala; Andrea Tagarelli



W7: Workshop on Critical Data Science (Room H.002)
Momin Malik; Katja Mayer; Hemank Lamba; Claudia Müller-Birn


EVENING WORKSHOP (8:00PM - approx. 10:00PM)

WX: The ICWSM Science Slam
Substanz Club, Ruppertstr. 28, Munich.

LUNCH TALK (1:15PM - 1:45PM, Room H.004)

History and Future of Data Science at Reddit: a Product & Experimentation Perspective
Lin Huang, Director of Data Science & Engineering

Success in data science requires mastery of statistical and ML techniques as well as a deep understanding of company priorities. The most rewarding yet challenging aspect is building a rhythm for delivering actionable outputs (i.e., data products) that directly transform business decisions. Here at Reddit, we have uniquely complex data assets, and it took us several rounds of iteration to customize a solution that maximizes our data potential. This talk focuses on our journey over the last 18 months to build an experimentation platform and to revamp the engineering workflow, prioritizing the interplay of offline ML modeling and online testing. Not every ML insight checks out in an experiment, but every "setback" is equally valuable, as it allows us to iterate faster and more purposefully. Culturally speaking, we encourage calculated risk-taking, since data products are most powerful when they point to new business directions (vs. validating existing ones). With a suite of concrete examples, we demonstrate how rhythmically outputting data products answers the common questions that almost all companies are curious about: Where do "good ideas" come from? How can they be generated at scale? Do they hold up IRL? Are they still valuable if they don't?