Chris Bail
Professor of Sociology & Public Policy
Teaching Assistant: Devin J. Cornell
PhD Student, Sociology Department
Office Hours: Wednesday 2-5pm (links will be posted on Slack beforehand)
The past decade has witnessed an explosion of data produced by websites such as Twitter, Facebook, Google, and Wikipedia, the mass digitization of administrative and historical records, and the rapid expansion of mobile technology into nearly every corner of our lives. A new wave of techniques for collecting, classifying, and analyzing these data holds enormous potential to address many of the most urgent questions in social science: How do diseases spread? What causes financial meltdowns? How did America become so politically polarized? This course surveys the nascent interdisciplinary field of computational social science, which combines insights from computer and information science, sociology and social network analysis, economics, political science, and public health in order to answer such questions.
This course requires no prior knowledge of computer programming or social science. Students will learn to ask social science questions, and learn how to answer them by collecting data from digital sources such as social media sites. Students will also acquire skills in social network analysis, automated text analysis, application programming interfaces, and the R programming language.
OVERVIEW OF CLASS FORMAT
Due to the COVID-19 pandemic, this class will be held entirely online. The format of the class and course requirements are designed to provide maximum flexibility to you as we all navigate these unprecedented times, but also to build in new opportunities to learn through deeper engagement with me.
Each week, there will be two pre-recorded lectures that average 10-30 minutes each. One of these lectures will be about a social science topic (e.g. political polarization or public health), and the other is designed to introduce you to the programming you need to learn to complete the lab assignment for that week. There are also required readings for each week that build upon the content in the lectures, and help introduce you to the material you need to learn for the labs. The lab assignments do not always line up with the content of the social science lectures; instead, the goal is to introduce both social science and coding gradually.
In addition to viewing the lectures each week, you will be required to sign up for a 15-minute one-on-one meeting with me (Professor Bail) via Zoom every other week to discuss a topic of your choice from either the readings or the lab. In the alternate weeks you will be assigned to a small-group discussion with your classmates (also via Zoom) that will always be held from 12:30-1:30PM. I will circulate across the Zoom breakout rooms to check in with each group at some point during this time. See the course schedule to determine whether each week involves a one-on-one meeting or a small-group meeting and please mark your calendars accordingly.
My decision to meet with you one-on-one and in small groups is guided by my extensive previous experience teaching on Zoom, which has taught me that it is a very ineffective format for large group conversations in Duke courses. Meeting one-on-one will also help me tailor the class to your needs, and help you succeed with all of the work for our class: weekly labs, and a final project that you will present during the last week of the semester.
We will draw heavily on two excellent books, both of which are freely available online thanks to the generosity of the authors. If you can afford to purchase a hard copy to show your appreciation for their hard work, please consider it.
- Bit by Bit: Social Research in the Digital Age. Matthew J Salganik (2017), Princeton University Press
- R for Data Science. Garrett Grolemund & Hadley Wickham, O’Reilly
R & RStudio
In this class, we will use R, a free programming language that I will teach you how to install during the first week of class. There are a variety of different ways to use R, but the most common way to do so is with the software RStudio, a free graphical user interface which you can run either on your laptop or via a web server. Though it is possible to run it on a tablet or Microsoft Surface, I recommend using a laptop or desktop if possible.
We will be using Slack to communicate with each other. Slack is a messaging platform that lets us share code, web links, and other things easily. It also allows me to make announcements to the class and share other things I think you might find interesting (e.g. internship announcements or articles that build upon our discussions with each other). Slack is also used by many different companies and organizations in and outside the tech world, so this might be a great chance to learn how to use it if you have not already.
The field of computational social science is growing so rapidly that none of the resources I give you will remain at the cutting edge for long. You will almost certainly encounter issues unique to the data we collect as part of our final research projects, incompatibilities between software packages, or problems specific to your computer. Stack Overflow is a website where computer programmers help each other solve such problems. Individuals ask questions, and others earn “reputation points” for solving their problems; these reputation points are awarded by the person who asks the question as well as other site users who vote upon the elegance and efficiency of each solution. For you, this reputation system means you can quickly identify the highest-quality solutions to your problems. Take a tour of the site here.
Many of the most important advances in computational social science appear first on Twitter or blogs. I therefore encourage you to open a Twitter account, if you don’t already have one, and follow the authors we read, or consider checking out the people I follow. Having a Twitter account will also come in handy for some of the exercises we do in class to collect data from Twitter. Of the many blogs that you might read, I recommend R Bloggers, which provides a concise overview of new functions in R and solutions to common problems faced by computational social scientists and those in other fields.
Watch Pre-Recorded Lecture Videos
Each week you are required to watch one lecture on a substantive topic and one “lab” lecture. The substantive topics cover a range of different issues from public health to discrimination and artificial intelligence; the labs teach you the concrete techniques to perform the types of analysis we read about during the class, and provide you with the tools necessary to complete your final project.
You are responsible for understanding the assigned readings each week. Make use of your fellow students, your TA, the Internet, a dictionary, and me to ensure that you understand the readings. Remember that this syllabus is a ‘living document.’ By this I mean I reserve the right to change the reading assignments in response to your feedback as well as my own sense of our group achievement. No changes will be made without at least two weeks of advance notice. Each week, we also provide you with a list of “optional readings” in case you want to go deeper into the material that is covered in the assigned readings.
Weekly Lab Assignments
By midnight on Saturday of each week, you will be required to complete and submit the assigned lab exercises described on the course schedule on this website. You must submit your lab assignments in a format called “R Markdown” (abbreviated .Rmd). A video that describes how to create files in this format is available on the Schedule for the second lab (the first lab assignment is ungraded).
Labs will comprise 50% of your final grade. Students are permitted to miss one lab assignment without penalty. Assignments will be graded as follows: 100% (Student writes code that successfully completes all tasks assigned); 90% (Student writes code that successfully completes all but one of the assigned tasks); 80% (Student writes code that successfully completes all but two of the assigned tasks); 70% (Student writes code that successfully completes all but three of the assigned tasks); 0% (Student does not write code that completes assignments).
Please submit all homework assignments to our TA (Devin Cornell) via direct message on Slack.
Weekly Meeting with Dr. Bail
Every other week, you will be required to meet with Dr. Bail for 15 minutes to discuss the readings or the lab assignments (your choice). As I mentioned above, the purpose of these meetings is so that I can tailor the class to you and give you the best online course experience possible. These meetings do not count towards your grade; however, I think they will be very useful in helping you to design and conduct the final project for our class, which counts for 50% of your grade. You can use our time together to ask questions about the assigned readings, or discuss possible topics for your final project, which should build upon one or more of the required readings on our schedule. You may cancel any of the one-on-one meetings with me at your discretion, especially if unforeseen circumstances arise because of the pandemic. However, I ask that you give me 24 hours’ notice if you plan to cancel our meeting.
To sign up for a weekly meeting time, please put your name in this Google Sheet.
Numerous studies arrive at the same conclusion: students learn more when they are actively engaged in activities in class (even if they sometimes think they learn less through such activities). The challenge, then, is for us all to think of how you can get more engaged, and in my experience the best way for you to do this is to try doing some research yourself. This may sound like a lofty goal, but my bet is that we will learn much more while failing to achieve an ambitious goal than if we do not try.
Your goal for your final project is to a) ask a research question relevant to one of the topics we cover in social science (e.g. misinformation, algorithms and discrimination, or public health); b) explain why this topic is important (to social science and/or the world); c) develop at least one hypothesis to answer this question; d) collect data that allows you to test this hypothesis; and e) describe whether or not your hypothesis was confirmed, and what implications this should have for people who want to do future research on your topic.
Whether you find support for your hypothesis will not affect your grade. Instead, you will be evaluated based upon a) the quality of the research question you ask and the hypotheses you develop; and b) the quality of the data collection and analysis. If your analysis does not support your hypothesis, and your hypothesis is a well-founded one, then I consider this to be an important finding. In fact, I have published research that does not confirm my initial hypotheses as well (for example, see here).
If you have questions about your grade for the final project at any time, we can discuss them during our one-on-one meetings. The earlier you begin thinking about and developing your project, the more quickly and efficiently I can give you feedback to help you achieve the grade you desire.
The final project will consist of two parts: a final presentation and a final paper.
Your final project presentation will be an opportunity to get feedback from Prof. Bail and your fellow students that can help you write a better paper. The format of the presentation is entirely up to you. Feel free to use Google Slides, PowerPoint, or anything else that suits you. Each presentation will total 10 minutes, including 8 minutes of presentation time and 2 minutes for questions and feedback. The presentation itself is ungraded, so only the final paper will be included in your final grade. Final presentations will happen on April 26 and 27, 1-3pm.
Your final project paper should be submitted as both an R Markdown (.Rmd) file and a knitted web page (.html) file by the deadline on Saturday May 1st at 5pm EDT. It should be at least 2,500 words and the knitted web page SHOULD NOT include code blocks. The written component should include the following components:
A) an introduction in which you ask the research question and explain why it is important; B) a section where you define key concepts in your study and present hypotheses to answer your research question; C) a detailed explanation of how you collected the data to test your hypotheses; D) a description of the analysis techniques you used to analyze your data; E) a detailed description of the results of your analysis with at least two figures, and an interpretation of what they mean for past and future research on your subject; F) a bibliography.
My goal is for you to produce something that you can show to future employers, graduate schools, or even just your friends and families to showcase what you have learned. You can see examples of final project papers from previous semesters of this course at the bottom of this course description website.
How Your Grade Will be Calculated
- Lab exercises 50%
- Final Project Paper 50%
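The grading rules above can be sketched as a short calculation. This is a minimal illustration in Python, not an official grade calculator: the function names are hypothetical, and the sketch assumes that "missing one lab without penalty" means the lowest graded lab score is dropped. The ungraded first lab should be left out of the list, and the syllabus does not say how a lab that misses more than three tasks is scored, so that case raises an error rather than guessing.

```python
# Illustrative sketch only; names and the drop-lowest rule are assumptions.

# Missed tasks -> percent for one lab, as listed under Weekly Lab Assignments.
LAB_SCALE = {0: 100, 1: 90, 2: 80, 3: 70}


def lab_score(missed_tasks, submitted=True):
    """Return the percent for one lab given how many assigned tasks were not completed."""
    if not submitted:
        return 0  # no code that completes the assignment
    if missed_tasks not in LAB_SCALE:
        raise ValueError("the syllabus does not specify a score for this case")
    return LAB_SCALE[missed_tasks]


def final_grade(lab_scores, paper_score):
    """Weight labs and the final paper 50/50, dropping the lowest lab (assumption)."""
    if len(lab_scores) > 1:
        kept = sorted(lab_scores)[1:]  # drop the single lowest graded lab
    else:
        kept = list(lab_scores)
    lab_average = sum(kept) / len(kept)
    return 0.5 * lab_average + 0.5 * paper_score
```

For example, a student with graded lab scores of 100, 90, 0, and 80 and a final paper score of 90 would have the 0 dropped, giving a lab average of 90 and a final grade of 90 under these assumptions.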
The Duke Compact recognizes our shared responsibility for our collective health and well-being. Please be reminded that by signing your name to this pledge, you have acknowledged that you understand the conditions for being on campus (if you are on campus this semester). These include complying with university, state, and local requirements and acting to protect yourself and those around you. For complete language and updated policies, please visit this link.
Academic Integrity/the DCS
All students, whether residing on campus or learning remotely, must adhere to the Duke Community Standard (DCS): Duke University is a community dedicated to scholarship, leadership, and service and to the principles of honesty, fairness, and accountability. Citizens of this community commit to reflect upon these principles in all academic and non-academic endeavors, and to protect and promote a culture of integrity. Plagiarism, cheating or other violations will be dealt with according to University policy. All student assignments will be processed by plagiarism detection software.
Mental Health and Wellness
We are living through unprecedented times that are creating tremendous challenges for everyone. If your mental health concerns and/or stressful events negatively affect your daily emotional state, academic performance, or ability to participate in your daily activities, many resources are available to you, including the ones listed below. I encourage all students to access these resources, particularly as we navigate the transition and emotions associated with this time.
DukeReach. Provides comprehensive outreach services to identify and support students in managing all aspects of wellbeing.
Counseling and Psychological Services (CAPS). CAPS services include individual, group, and couples counseling services, health coaching, psychiatric services, and workshops and discussions. They can be reached at (919) 660-1000.
Blue Devils Care. A convenient and cost-effective way for Duke students to receive 24/7 mental health support through TalkNow.
Managing daily stress and self-care are also important to well-being. Duke offers several resources for students to both seek assistance on coursework and improve overall wellness, some of which are listed below and described in more detail at this link.
• The Academic Resource Center: (919) 684-5917, theARC@duke.edu
• DuWell: (919) 681-8421, email@example.com, or https://studentaffairs.duke.edu/duwell
• WellTrack: https://app.welltrack.com/
In addition to accessibility issues experienced during the typical academic year, I recognize that remote learning may present additional challenges. Students may be experiencing unreliable wi-fi, lack of access to quiet study spaces, varied time-zones, or additional responsibilities while studying at home. If you are experiencing these or other difficulties, please contact me to discuss possible accommodations.
Technology Accommodations. Students with demonstrated high financial need who may have limited access to computers and stable internet may request assistance in the form of loaner laptops and WIFI hotspots. For new Spring 2021 technology assistance requests, please go here. Please note that supplies are limited. For updates, please visit this link.
Academic Accommodations. The Student Disability Access Office (SDAO) will continue to be available to ensure that students are able to engage with their courses and related assignments. Students should be in touch with the Student Disability Access Office to request or update accommodations under these circumstances. Zoom has the ability to provide live closed captioning. If you do not see this feature and would like to use it, please reach out to Duke OIT for assistance.
Accommodations for Remote Students. If you are unable to attend one of our group meetings, please contact me and we can discuss how to accommodate your needs during this very challenging time.
Because of our virtual format and weekly one-on-one meetings, I am not holding regular office hours. However, I am very happy to find time to meet with you on an ad hoc basis if you wish to go above and beyond what we discuss in our weekly meetings.
FINAL PROJECT EXAMPLES
The Gender Gap in Professional Networking
by Charlotte McCulloh