MDI Data Workshop Series

2023 Fall

Text as Data: Measurement and Inference Issues with Text Data

Text as data has become a transformative approach to producing insights about human behavior and society. How do we use text as data? This workshop provides an overview of different applications of text as data. From constructing variables from text to employing large language models (LLMs) to scale variables, we will discuss the strengths and weaknesses of using text as data in the contexts of measurement, statistical modeling, and causal inference. This workshop serves as an introductory session for the other fall MDI Data Workshops, which will focus on specific machine learning and natural language processing (NLP) techniques. Basic familiarity with programming and statistical methods is expected. No NLP background is required.

[Slides]

Google Colab (GU login required): [Day 1] [Day 2]