Recently, a headline read: "Robber busted in S.F. Muni bus robberies while trying to transfer." You might think it came from the satirical newspaper The Onion, but it was a real headline from our local newspaper here in San Francisco. I'm not a criminologist, but being a curious data person, I wondered which bus stops have historically recorded the most crimes.
Here's what I found, and I'll show you the steps I took in Tableau Desktop, available at TechSoup, to make the following map.
San Francisco, like many cities embracing the open data movement, has a dedicated data site with all sorts of data that anyone can download and use for free. What may once have taken months or even years to request via the Freedom of Information Act, and often arrived in a format that is not machine-readable, such as a PDF file, now takes only seconds to download as a "comma-separated values" (CSV) file. Virtually all data analysis and visualization packages can read a CSV file.
In the category of "Public Safety," Crime Incidents from 1 Jan 2003 is the most requested item. As with many datasets, SF OpenData maps the data by default, which allows you to quickly view the data if you don't need to do further analysis or layering.
However, to view the above information with transit stop data, I needed to blend or overlay additional data. So I accessed the data using the "Export" function on the site.
From the same SF OpenData website, there's a link to the data from the San Francisco Municipal Transportation Agency (SFMTA), with various datasets like bus routes, schedules, and locations. The file I exported is a zip package that contains a CSV file with stop names and GPS coordinates.
With the two raw CSV files (crimes and transit stops) in hand, I could then build the map of crimes at SFMTA bus stops.
I used Tableau Desktop Public Edition, the free version, which saves exclusively to the cloud to encourage data sharing. It is otherwise functionally identical to the version Tableau donates to the nonprofit sector here at TechSoup. However, for data you can't or don't want to share publicly, you must use Tableau Desktop Professional (available through TechSoup) or a paid online service.
It was my first time using Tableau, so in the interest of time I also used R and RStudio, free open-source analysis packages, to clean and arrange the data. This involved removing and replacing empty values, rounding the GPS coordinates, and removing unneeded columns. Cleaning is often the most time-consuming step of data analysis, but it is also often the most crucial. Time invested in cleaning up the data at the beginning reduces the time spent later on.
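The cleanup above was done in R, and the original script is not shown. As a rough illustration only, the same three steps (dropping unneeded columns, filling empty values, rounding coordinates) could be sketched in Python as follows. The column names, the sample rows, the "UNKNOWN" placeholder, and the choice of four decimal places are all assumptions made for this example, not details taken from the SF OpenData files.

```python
import csv
import io

# Hypothetical sample standing in for the exported crime CSV.
# The column names are assumptions, not the real SF OpenData headers.
RAW = """category,lat,lon,notes
THEFT,37.774929,-122.419416,n/a
,37.784172,-122.401558,
ASSAULT,,,missing coordinates
"""

def clean_rows(rows, keep=("category", "lat", "lon"), decimals=4, fill="UNKNOWN"):
    """Keep only the wanted columns, fill empty values, round coordinates."""
    cleaned = []
    for row in rows:
        # Drop unneeded columns and replace empty values with a placeholder.
        out = {col: (row.get(col) or "").strip() or fill for col in keep}
        try:
            out["lat"] = round(float(out["lat"]), decimals)
            out["lon"] = round(float(out["lon"]), decimals)
        except ValueError:
            # Rows without usable coordinates cannot be mapped, so remove them.
            continue
        cleaned.append(out)
    return cleaned

cleaned = clean_rows(csv.DictReader(io.StringIO(RAW)))
for row in cleaned:
    print(row)
```

Rounding matters because the next step compares coordinates for equality; how many decimal places to keep is a judgment call that trades precision against the number of matches.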
I included all the crime data. For crime rows whose GPS coordinates matched an entry in the bus stop list, I added the "Stop Name" field from the bus stop CSV file. This combined data became the data source for the visualization exercise.
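The matching step described above amounts to a lookup on rounded coordinates that keeps every crime row, whether or not it matches a stop. A minimal Python sketch of that idea, assuming hypothetical field names ("lat", "lon", "stop_name") and made-up sample rows, might look like this:

```python
def coord_key(row, decimals=4):
    """Build a comparable (lat, lon) key from a row, rounded to `decimals`."""
    return (round(float(row["lat"]), decimals), round(float(row["lon"]), decimals))

def add_stop_names(crimes, stops, decimals=4):
    """Return every crime row, with a stop_name added where coordinates match."""
    stop_by_coord = {}
    for stop in stops:
        # If two stops round to the same coordinates, keep the first one seen.
        stop_by_coord.setdefault(coord_key(stop, decimals), stop["stop_name"])
    merged = []
    for crime in crimes:
        name = stop_by_coord.get(coord_key(crime, decimals), "")
        merged.append({**crime, "stop_name": name})
    return merged

# Hypothetical sample rows, for illustration only.
stops = [{"stop_name": "Stop A", "lat": "37.77491", "lon": "-122.41938"}]
crimes = [
    {"category": "THEFT", "lat": "37.77493", "lon": "-122.41942"},
    {"category": "ASSAULT", "lat": "37.80000", "lon": "-122.40000"},
]
merged = add_stop_names(crimes, stops)
for row in merged:
    print(row)
```

Crimes that do not match any stop keep an empty stop name, which mirrors the choice to include all the crime data rather than only the matched rows.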
In Tableau, I started by connecting to the combined file using the "Text File" option. Once it was loaded as a "Data Source," I saw a preview of the data in a table.
In the right pane, where the preview of the data is shown, I assigned a "Geographic Role" to the latitude and longitude fields so that Tableau knows to use them as coordinates in the map. See the screen shot below.
Once the data was loaded, I opened the worksheet view by clicking "Sheet 1" at the bottom, as you would in Excel. I then dragged and dropped "Dimension" elements from the left pane into the different boxes (the dimensions are the column names from the original data file).
Finally, I used this sheet in a "Dashboard" object, which you create from the top menu by selecting "Dashboard" and then "New Dashboard." With the Dashboard object, I can manipulate the display, for example by selecting certain crimes or drilling down to a specific year. The Dashboard object is also useful for sharing the map with an audience.
Based on the data, I found some crime hot spots in the period covered by the dataset. Unsurprisingly, theft is the most common category, since potential victims naturally congregate along the major thoroughfares of Market Street and Van Ness Avenue.
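The map itself was built in Tableau, but the tally behind a hot-spot view can be reproduced outside it as a cross-check. The sketch below counts incidents per stop and category; the field names and sample rows are hypothetical and do not reflect the actual results.

```python
from collections import Counter

def top_hot_spots(rows, n=3):
    """Count incidents per (stop, category), ignoring rows with no matched stop."""
    counts = Counter(
        (row["stop_name"], row["category"]) for row in rows if row["stop_name"]
    )
    return counts.most_common(n)

# Hypothetical combined rows, for illustration only.
rows = [
    {"stop_name": "Stop A", "category": "THEFT"},
    {"stop_name": "Stop A", "category": "THEFT"},
    {"stop_name": "Stop A", "category": "THEFT"},
    {"stop_name": "Stop B", "category": "THEFT"},
    {"stop_name": "Stop B", "category": "THEFT"},
    {"stop_name": "Stop B", "category": "ASSAULT"},
    {"stop_name": "", "category": "THEFT"},
]
hot_spots = top_hot_spots(rows)
for (stop, category), count in hot_spots:
    print(stop, category, count)
```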
Of course, there are some caveats.
So can we conclude much from this data, aside from the locations of criminal activity? Not really. But on its own website, SFMTA, also using Tableau, acknowledged that in the past fiscal year Muni-related crimes failed to meet SFMTA's goal of fewer than 3.1 crimes per 100,000 miles.
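For readers unfamiliar with a "per 100,000 miles" rate, it is the crime count scaled by the distance traveled. The short sketch below shows the arithmetic; the crime and mileage figures are invented for illustration and are not SFMTA numbers, and only the 3.1 threshold comes from the goal quoted above.

```python
GOAL_PER_100K_MILES = 3.1  # SFMTA's stated goal: fewer than this many crimes

def crimes_per_100k_miles(crimes, miles):
    """Scale a raw crime count to a rate per 100,000 miles traveled."""
    return crimes * 100_000 / miles

# Invented figures, for illustration only.
rate = crimes_per_100k_miles(crimes=40, miles=1_000_000)
meets_goal = rate < GOAL_PER_100K_MILES
print(f"{rate:.1f} crimes per 100,000 miles; goal met: {meets_goal}")
```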
Sensationalist headlines notwithstanding, and even with the stated caveats, one should definitely stay vigilant about criminals, especially thieves, when riding public transit.
Kevin Lo | Senior Program Manager, NetSquared.org | a part of TechSoup Global
This work is published under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.