Linda Chen '23
I have always been interested in potential trends and patterns within our voluminous corpus, so at the initial stage of my individual project, I ran a few keyword searches through the whole corpus using regular expressions and Python. I chose to look into the frequency of “cigarette” and “smoking” mostly because, during the project team’s initial exploration of The College News, we discovered a large number of tobacco advertisements. The keyword search for these two terms returned over 2800 results, a shockingly high number to my generation, given how well known tobacco’s negative impact on health is today. After an informal discussion with the project team, I realized that the frequency of smoking-related terms in our corpus might correlate with national policies concerning the tobacco industry during the run of The College News (1914-1968). To further analyze the data I gathered through keyword searching, I imported the CSV file containing the search results into pandas, a Python library for data analysis, as a data frame, and calculated the relative frequency of “smoking” and “cigarette” by dividing their yearly counts by the total word count of all issues published that year. Finally, I graphed my data using Altair, a Python library for interactive data visualization. I chose a bar graph with year on the x-axis and relative frequency on the y-axis because I hoped it would reveal patterns and trends in the fluctuation of cigarette-related content over time. As the graph shows, mentions of cigarettes and smoking declined steeply throughout the 1960s, with a notable nadir in 1964, potentially as a result of “the Surgeon General’s 1964 Report on Smoking and Health”, a study highlighting the negative effects of tobacco.
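The workflow described above (keyword search, yearly relative frequency, bar graph) can be sketched roughly as follows. This is a minimal illustration and not the project's actual script. The regular expression, the function names, the column names, and the toy issues are all assumptions made for the example. The counting step uses only the standard library so the sketch runs on its own, while the charting helper assumes pandas and Altair are installed.

```python
import re
from collections import Counter

# Assumed search pattern: the exact expression used in the project is not
# given, so this one matches "cigarette", "cigarettes", and "smoking".
PATTERN = re.compile(r"\b(?:cigarettes?|smoking)\b", re.IGNORECASE)
WORD = re.compile(r"\b\w+\b")


def yearly_relative_frequency(issues):
    """Return one row per year from an iterable of (year, text) pairs."""
    hits = Counter()
    words = Counter()
    for year, text in issues:
        hits[year] += len(PATTERN.findall(text))
        words[year] += len(WORD.findall(text))
    return [
        {
            "year": year,
            "hits": hits[year],
            "total_words": words[year],
            # Yearly keyword count divided by the year's total word count.
            "relative_frequency": hits[year] / words[year] if words[year] else 0.0,
        }
        for year in sorted(words)
    ]


def make_bar_chart(rows):
    """Build a bar chart of relative frequency by year (needs pandas, altair)."""
    import altair as alt
    import pandas as pd

    df = pd.DataFrame(rows)
    return alt.Chart(df).mark_bar().encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y("relative_frequency:Q", title="Relative frequency"),
    )


# Invented toy data standing in for the corpus, for illustration only.
toy_issues = [
    (1950, "Smoking is allowed in the lounge. Buy a cigarette today."),
    (1950, "Cigarettes for sale near campus."),
    (1964, "The report on smoking was discussed at length in class."),
]
rows = yearly_relative_frequency(toy_issues)
```

Dividing by the yearly word count, rather than plotting raw counts, keeps years with more or longer issues from looking artificially tobacco-heavy. Calling `make_bar_chart(rows)` in a notebook would display the chart.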
National Center for Chronic Disease Prevention and Health Promotion (US) Office on Smoking and Health. Fifty Years of Change 1964–2014. The Health Consequences of Smoking—50 Years of Progress: A Report of the Surgeon General. Centers for Disease Control and Prevention (US), 2014. https://www.ncbi.nlm.nih.gov/books/NBK294310/.