Pre-Midterm review

Digital trace data
An example case: Predicting elections from social media

SAMPLING DISTRIBUTION
• On average, $\hat{\theta}$ will center around, or cluster around, its population value $\theta$:
  • It forms a normal distribution
  • The mean of $\hat{\theta}$ = $\theta$
  • Central limit theorem
• Variability of $\hat{\theta}$ (= how close to or far from $\theta$ it is) decreases as the sample size increases
• The standard deviation (SD) of the sampling distribution is proportional to the SD of the population, and likewise to the SD of the sample

SAMPLING DISTRIBUTION IN PRACTICE
• You gather your sample and obtain a “sample-specific” estimate of $\theta$, which is $\hat{\theta}$
• This is your best possible guess of $\theta$ (you assume $\hat{\theta} = \theta$)
• Using the SD of the sample, you approximate the SD of your sampling distribution and calculate the range of possible values of $\theta$ for a given probability mass (e.g., 95% or 99%) (= confidence interval)

The case of predicting elections from social media
• A few conceptual and methodological issues:
• A machine-learning approach: social media data predicting previous election results (training & building prediction models) → predict a new election with the developed model, using out-of-sample data
  • Training: Data extraction → Preprocessing → Feature extraction → ML algorithm training → Prediction on previous data
  • Application: Data extraction → Preprocessing → Feature extraction → ML algorithm application → Prediction on new data
• Problems?

Artificial Intelligence

Three different perspectives of democracy

Data “creation” in datafication
• Sampling bias: when does it occur?
• Psychology of survey response (assumptions)
• Examples of “bad” questionnaires
• Advantages vs. disadvantages of digital trace / online data
  • Advantages:
    • Capacity to collect and analyze massive amounts of data
    • Nonreactive measures (avoiding response bias)
  • Disadvantages:
    • Issues with reflexivity (people change their behaviors when they recognize they are being observed)
    • Ethics and privacy
    • A lack of a robust model of collaboration with industry

The big data “mythology”

When to rely on social media data?
• Behaviors studied using social media: many are analogous to what would be observed offline

Pitfalls when relying on social media data
• Black-box proprietary sampling algorithms used by social media
• Similar to the critique of lab-based experiments:
  • Affordances of a platform may change the logic of people’s behavior

Campaigns and Big data
• Political campaigns: deliberate, self-conscious efforts on the part of elites to influence citizens
• Influencing prospective voters: political advertising & mass media (i.e., news or debate) appearances

Advantages and disadvantages of ads
• Ads can reach (almost) anyone campaigns like, given sufficient audience attention, and to the extent that they can afford such ads (i.e., money)
• But this does not necessarily mean that every prospective voter is uniformly influenced by such ads

“Free” media appearance
• Therefore, political actors have a strong incentive to “supplement” these paid media with “free” media: press conferences, talk-show campaigning
• Disadvantage (especially in news coverage): lack of message control & targeting

Micro-targeting: Combining two sources of data

In what ways have consumer and proprietary data shaped our understanding of political attitudes and behaviors?

Increasing social and cultural differences
• Seemingly apolitical domains (such as food, artistic or cultural preferences, consumer decisions, moral senses, etc.) can provide some “cues” about one’s political identity (“lifestyle politics”)
• Some evidence suggests that people do evaluate co-partisans more favorably than out-partisans in seemingly apolitical arenas
• These increasing social and cultural differences are (quite naturally) reflected in what people write and share on social media:

Data-driven campaigning in practice
• Data protection requirements
• Campaign finance laws
• Election contexts

Democratic consequences of micro-targeting
• In principle, MC could “strengthen” democracy by increasing political participation:
  • (1) Micro-targeting may amplify the effects of campaigns by reaching citizens who are difficult to reach:
  • (2) Micro-targeting increases the diversity of political campaigns and voters’ knowledge about certain issues:
  • (3) It helps voters manage information overload:

Threats for democracy from micro-targeting
• In practice, this means gathering as much data as possible about individual voters: data privacy issues
• A more serious issue concerns data breaches:
• Mobilization by micro-targeting also means suppressing voter turnout for their opponents
• Differential issue focus creates biased perceptions of the parties’ priorities
• Certain groups may be ignored:

Ethics of data and knowledge production
• “Emotional contagion” study
• “Taste, ties, and time” data
• Encore & Mark of criminal record examples

Social media and Market forces in news production

Economic theories of news production
• Hard news = high level of newsworthiness (current affairs), demanding immediate publication
• Soft news = does not need timely publication, low substantive informational value
• Consumers’ informational demand
• Producers’ informational “costs”

The “contestable” news market on social media
• No Entry/Exit barriers: No formal barriers to entry or exit from the market.
  • On social media, the platform itself provides the audience and distribution channels
• Symmetric Information/technology: There cannot be any specialized technology or knowledge available to the incumbent firms but not to new entrants.
  • On a given platform, technological affordances are the same
• No Sunk Costs: There cannot be any capital investments (in either physical or intellectual capital) that cannot be recouped.
• Results: more “hit-and-run” type competition & declining “quality” of news in general (“clickbait media”)

Attention-maximization by algorithmic curation
• Quantitative audience metric: quantitative information about users’ behaviors

The shift in journalistic routines
• Questions naturally arise about the ways in which people actually read and engage with these pieces, beyond the feel-good tally of how many visitors they attract.

The use of audience metrics

Attitude assessment: measuring attitudes
• Direct measurement:
  • Some people don’t have attitudes about topics, or don’t know their attitudes!

Psychology of attitude response (assumptions)
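
Worked example for the SAMPLING DISTRIBUTION slides
• A minimal Python sketch, not part of the original slides, of the ideas reviewed under SAMPLING DISTRIBUTION and SAMPLING DISTRIBUTION IN PRACTICE: estimates center on the population value, their variability shrinks as the sample size grows, and a single sample gives an estimate plus a confidence interval.
• Assumptions made only for illustration: the population of 10,000 voters with 52% support, the sample sizes, the number of repeated draws, the function names, and the normal-approximation multiplier z = 1.96 for a roughly 95% interval.

```python
import random
import statistics


def sampling_distribution(population, n, draws, rng):
    """Return the estimates (sample means) from `draws` random samples of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]


def confidence_interval(sample, z=1.96):
    """Approximate interval for the population mean; z=1.96 gives roughly 95%."""
    estimate = statistics.fmean(sample)  # the sample-specific estimate
    # The SD of the sample stands in for the unknown population SD.
    standard_error = statistics.stdev(sample) / len(sample) ** 0.5
    return estimate - z * standard_error, estimate + z * standard_error


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1 = supports candidate A, 0 = does not.
population = [1] * 5200 + [0] * 4800
theta = statistics.fmean(population)  # population value, 0.52 by construction

# Sampling distributions for a small and a large sample size.
small = sampling_distribution(population, n=50, draws=1000, rng=rng)
large = sampling_distribution(population, n=500, draws=1000, rng=rng)
sd_small = statistics.stdev(small)
sd_large = statistics.stdev(large)

# In practice you only have one sample: an estimate plus its interval.
one_sample = rng.sample(population, 500)
low, high = confidence_interval(one_sample)

# Share of repeated intervals (n = 200 each) that contain the population value.
intervals = [confidence_interval(rng.sample(population, 200)) for _ in range(400)]
coverage = sum(lo <= theta <= hi for lo, hi in intervals) / len(intervals)

print(f"population value:            {theta:.3f}")
print(f"mean of estimates (n=50):    {statistics.fmean(small):.3f}")
print(f"SD of estimates, n=50/n=500: {sd_small:.4f} / {sd_large:.4f}")
print(f"one-sample interval (n=500): [{low:.3f}, {high:.3f}]")
print(f"coverage of repeated intervals: {coverage:.2f}")
```

• Reading the output: the SD of the estimates for n = 500 should be clearly smaller than for n = 50, and the coverage should sit near the nominal 95%. Social media samples are usually not simple random samples, so this interval says nothing about sampling bias.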