This post is part of a series of testimonies from data science students in the Data Science Immersive, a 12-week, full-time career accelerator in Washington, DC, offered by the education company General Assembly. Students share what they've learned, what they hoped to accomplish, and what they are doing next. The series also profiles the winners of a Shout! and General Assembly Data Science Immersive competition for best data journalism.
Last month, Shout! joined forces with General Assembly's Data Science program to "slice and dice" the data behind President Trump's first 100 days in office.
Q&A with the winner, Brian Austin of the Data Science Immersive Class of May 2017, and Flavius Mihaies, founder of Shout!
Shout! (SH!): Congratulations on your win! What prompted you to pick this topic? Was it something you had in mind prior to the competition?
Brian Austin (BA): I didn’t have any idea that this particular dataset existed until I started poking around based on the prompt from Shout! to write about the first 100 days of Donald Trump’s presidency. I was looking for ways to measure change during that time, and stumbled across the excellent US government analytics webpage. I saw that it kept track of metrics for a whole range of government web domains, and started wondering about the changes that you might expect over that time period.
SH!: What was the main challenge you encountered, whether editorial, data-related, or both, and how did you address it?
BA: The main challenge from a data analysis standpoint was that while the data was extremely current (up to the minute!), there was no historical data readily available to draw comparisons against. To overcome this challenge, I used the Wayback Machine, an internet archive that scrapes and stores millions of web pages a day. It had captured the trailing 30-day data on government websites as of January 29th and April 29th. April 29th, being the 100th day, was extremely useful, and while January 29th doesn’t give an exact picture of the pre-Trump era, I thought it was close enough to give an approximate idea.
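As a rough illustration of the kind of lookup Brian describes, the Internet Archive offers a public availability endpoint that returns the capture closest to a requested date. The sketch below is not Brian's actual code; the function names are hypothetical, and it assumes the response has the `archived_snapshots` / `closest` shape the endpoint documents.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def build_query(url, date):
    """Build the lookup URL for the capture of `url` closest to `date` (YYYYMMDD)."""
    return WAYBACK_API + "?" + urlencode({"url": url, "timestamp": date})


def closest_snapshot(response):
    """Return (timestamp, snapshot_url) from a parsed response, or None if no capture."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def fetch_closest(url, date):
    """Query the live endpoint (requires network access)."""
    with urlopen(build_query(url, date), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `fetch_closest("analytics.usa.gov", "20170429")` would ask for the capture nearest the 100th day. The archive may return a capture from a different date if none exists for the one requested, so the returned timestamp is worth checking.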
If I were to continue the project, I would like to pull the information from analytics.usa.gov on a regular basis and track 30-day changes over time. I would be especially interested in how my findings from this year compare with the seasonality of prior or following years.
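Tracking changes between two 30-day snapshots could be sketched as follows. This is a minimal illustration, not the analysis behind the winning piece: it assumes each snapshot has already been reduced to a mapping from domain to visit count, and the domains and numbers are made up.

```python
def percent_change(before, after):
    """Percent change in visits per domain, for domains present in both snapshots."""
    changes = {}
    for domain in before.keys() & after.keys():
        if before[domain] > 0:  # skip domains with no baseline traffic
            changes[domain] = 100.0 * (after[domain] - before[domain]) / before[domain]
    return changes


# Made-up numbers purely for illustration, not real traffic figures.
january = {"example-a.gov": 1000, "example-b.gov": 400, "example-c.gov": 50}
april = {"example-a.gov": 1500, "example-b.gov": 300}

# Print domains from largest drop to largest gain.
for domain, change in sorted(percent_change(january, april).items(), key=lambda kv: kv[1]):
    print(f"{domain}: {change:+.1f}%")
```

Domains that appear in only one snapshot are dropped here. A longer-running version might instead record them as new or discontinued sites.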
SH!: On a different note, data journalism sits at the intersection of two fields that did not talk to each other much until recently. Do you see this as a challenge? How do you picture the future of data journalism? Is it an attractive career for young data science graduates these days?
BA: As journalists have gotten more adept at extracting information from data, there has been a real proliferation of good analyses and even better storytelling. Journalists like Stephen Engelberg of ProPublica, Nate Cohn at the New York Times Upshot, and the team of data journalists at the Tampa Bay Times have done some incredible work collecting, visualizing, and communicating stories through data. One great example is the data the Tampa Bay Times collected as part of an investigation into police violence against black people in Florida.
I think the best thing about this is that it has pushed all journalists to expand the range of things that prompt their stories. Being good at scraping and parsing data means that you don’t have to wait for someone to just hand you the information they have. I think this shows up even in stories that aren’t, strictly speaking, “data” journalism. For instance, when Donald Trump’s administration stopped publishing White House visitors’ logs, POLITICO built their own database of publicly announced guests to the White House. Journalism that sees data as just another avenue of storytelling is one of the things that makes me very excited about its future in an increasingly data-literate world.
SH!: Where can we follow your current and next work?
BA: My next project is examining the way citizens interact with the Consumer Financial Protection Bureau, an administrative organization that polices lenders and other consumer finance companies. I’m taking a close look at the types of things people are talking about when they make a complaint about a company. I’m also looking at what changes when the bureau steps in and takes an enforcement action.
I’ll keep updating that project on my blog as I work on it, but one of my early findings fits in well with this project: on Donald Trump’s first day in office, more people made complaints about consumer finance than on any previous day.