San Francisco is among the cities that have made their data publicly available to everyone. This is very progressive (and still missing in many other parts of the world), and it shows that a high level of transparency breeds more trust in the work of public institutions, which serve all citizens. It allows anyone to feel that they are participating in the system rather than being excluded from it. If an area of city governance needs improvement, the information is there to help find people with the right expertise who are willing to be held publicly accountable for their actions. This is important because it has been repeatedly shown that when the institutions of a country are strong and work well together, the country as a whole progresses faster and its people can more easily unfold their full creative potential. When that is not the case, they waste energy fighting the bureaucracy around them.
It would be nice to see more cities follow this positive example. Merely sharing the information is not enough; everyone must feel that their efforts actively change the numbers, and that these go up and down depending on how we do collectively.
Here we will explore one of the datasets on crime and see whether we can find something interesting in it. You can find the dataset here. One might expect that there would be no crime in San Francisco if everything is shared so openly, but this is not the case. It requires a certain degree of faith to admit any form of vulnerability, knowing that it could scare tourists away or make multinationals unwilling to invest in the city. This could be one reason why other cities do not follow the example. But withholding the data and silently keeping major events secret is a disservice to everyone. One day it may become clear that our cities are gradually losing the infrastructure they once had, their production facilities and their identity, while neglecting education, healthcare and the people in need. If we continue to hide behind a facade, this will be inevitable.
How many records are in this dataset?
- 132083 rows x 13 columns
In which period was the data collected?
- Between 01.01.2016 and 24.11.2016 (likely still updated).
Can this dataset be held in memory at once?
- Yes, it takes ≈71.8 MB of memory to do so.
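The three answers above (size, date range, memory footprint) can be obtained with a few pandas calls. The following is only a minimal sketch on a tiny inline sample, not the code used for this article; the column names (`Category`, `Date`, `Time`, `PdDistrict`) and the `MM/DD/YYYY` date format are assumptions about the CSV export.

```python
import io
import pandas as pd

# Tiny stand-in for the real CSV export; column names and values are assumptions.
SAMPLE_CSV = """Category,Date,Time,PdDistrict
LARCENY/THEFT,01/01/2016,17:30,SOUTHERN
ASSAULT,01/02/2016,00:15,MISSION
VANDALISM,11/24/2016,09:05,NORTHERN
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Number of records and columns.
n_rows, n_cols = df.shape

# Collection period: parse the date strings and take the extremes.
dates = pd.to_datetime(df["Date"], format="%m/%d/%Y")
first_day, last_day = dates.min(), dates.max()

# Memory footprint; deep=True also counts the string contents.
mem_mb = df.memory_usage(deep=True).sum() / 1024**2

print(f"{n_rows} rows x {n_cols} columns")
print(f"Between {first_day:%d.%m.%Y} and {last_day:%d.%m.%Y}")
print(f"about {mem_mb:.4f} MB in memory")
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by the path to the downloaded CSV.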
Which was the most common incident category in San Francisco this year?
- Larceny/theft: 35251
- Other offenses: 17196
- Non-criminal: 15760
- Assault: 12061
- Vandalism: 7485
- Vehicle theft: 5680
- Warrants: 5203
- Burglary: 5080
- Suspicious occ: 5027
- Drug/narcotic: 3846
- Missing person: 3746
- Robbery: 2924
- Fraud: 2275
- Secondary codes: 1624
- Trespass: 1589
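A ranking like this is a plain frequency count over one column. As a hedged sketch (the column names `Category` and `PdDistrict` and the sample rows are assumptions, not the real data), `value_counts` returns the counts already sorted in descending order:

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "Category": ["LARCENY/THEFT", "ASSAULT", "LARCENY/THEFT",
                 "VANDALISM", "LARCENY/THEFT", "ASSAULT"],
    "PdDistrict": ["SOUTHERN", "MISSION", "SOUTHERN",
                   "NORTHERN", "CENTRAL", "SOUTHERN"],
})

# Count how often each category occurs and keep the 15 most frequent.
top_categories = df["Category"].value_counts().head(15)
print(top_categories)

# The same call on another column answers the per-district question.
top_districts = df["PdDistrict"].value_counts().head(10)
print(top_districts)
```

The same pattern applies to any other categorical column, such as the incident description, the address or the date.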
Which was the most common incident description?
- Grand theft from locked auto: 15398
- Aided case, mental disturbed: 4077
- Lost property: 3972
- Petty theft of property: 3815
- Battery: 3778
- Malicious mischief, vandalism: 3656
- Petty theft from locked auto: 3532
- Stolen automobile: 3202
- Drivers license, suspended or revoked: 3127
- Found property: 2765
In which districts was crime most common?
- Southern: 24981
- Northern: 17588
- Mission: 16984
- Central: 15414
- Bayview: 12615
- Ingleside: 10153
- Taraval: 9858
- Tenderloin: 8807
- Richmond: 7867
- Park: 7815
At what time of day did most incidents happen in the Southern district?
- 17:00-18:00: 1765
- 18:00-19:00: 1689
- 16:00-17:00: 1554
- 19:00-20:00: 1504
- 15:00-16:00: 1419
- 00:00-01:00: 1406
- 11:00-12:00: 1378
- 14:00-15:00: 1293
- 20:00-21:00: 1266
- 21:00-22:00: 1261
- 13:00-14:00: 1256
- 12:00-13:00: 1182
- 22:00-23:00: 1129
- 10:00-11:00: 1020
- 09:00-10:00: 902
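Hourly counts of this kind come from filtering to one district and grouping the incident times into one-hour bins. The sketch below assumes a `PdDistrict` column and a `Time` column holding `HH:MM` strings; both names, the uppercase district value and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has many more.
df = pd.DataFrame({
    "PdDistrict": ["SOUTHERN", "SOUTHERN", "SOUTHERN", "SOUTHERN", "MISSION"],
    "Time": ["17:30", "17:05", "18:10", "00:15", "17:45"],
})

# Keep only the incidents of one district.
southern = df[df["PdDistrict"] == "SOUTHERN"]

# Extract the hour of each incident and turn it into a label like "17:00-18:00".
hours = pd.to_datetime(southern["Time"], format="%H:%M").dt.hour
labels = hours.map(lambda h: f"{h:02d}:00-{(h + 1) % 24:02d}:00")

# Count incidents per one-hour bin, most frequent first.
by_hour = labels.value_counts()
print(by_hour)
```

Sorting the result by its index instead (`by_hour.sort_index()`) gives the chronological order that a distribution plot would need.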
Can we see the distribution by the hour?
At which addresses were incidents happening most frequently?
- 800 block of Bryant St: 3049
- 800 block of Market St: 1219
- 1000 block of Potrero Av: 572
- 900 block of Market St: 476
- 0 block of Unitednations Pz: 420
- 500 block of Johnfkennedy Dr: 418
- 600 block of Valencia St: 355
- 3200 block of 20th Av: 353
- 1100 block of Fillmore St: 346
- 300 block of Eddy St: 323
- 100 block of Ofarrell St: 312
- 16th St / Mission St: 311
- 0 block of 6th St: 308
- 800 block of Mission St: 297
- 700 block of Mission St: 296
Where is Bryant Street?
- Longitude: (-122.40455785073601, -122.402771389219)
- Latitude: (37.774431418760294, 37.775859961640798)
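Coordinate ranges like these can be read as a bounding box around all incidents recorded at one address. A minimal sketch, assuming the dataset stores longitude in a column named `X` and latitude in a column named `Y` (both names, the address string and the sample coordinates are assumptions):

```python
import pandas as pd

# Hypothetical sample rows; X is assumed to be longitude, Y latitude.
df = pd.DataFrame({
    "Address": ["800 Block of BRYANT ST", "800 Block of BRYANT ST",
                "800 Block of MARKET ST"],
    "X": [-122.4045, -122.4028, -122.4075],
    "Y": [37.7744, 37.7758, 37.7847],
})

# Select every incident reported at the address of interest.
block = df[df["Address"] == "800 Block of BRYANT ST"]

# The bounding box is the min/max of each coordinate.
lon_range = (block["X"].min(), block["X"].max())
lat_range = (block["Y"].min(), block["Y"].max())

print("Longitude:", lon_range)
print("Latitude:", lat_range)
```

For San Francisco, values near -122.4 are longitudes and values near 37.8 are latitudes, which is a quick sanity check on which column is which.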
On which days were the most incidents?
- 01/01/2016: 529
- 10/08/2016: 520
- 04/01/2016: 516
- 01/29/2016: 503
- 06/25/2016: 503
- 02/04/2016: 489
- 07/01/2016: 488
- 02/01/2016: 486
- 05/23/2016: 482
- 01/23/2016: 481
How were most cases resolved?
- None: 94083
- Arrest, booked: 35071
- Unfounded: 1416
- Juvenile booked: 939
- Exceptional clearance: 327
- Arrest, cited: 141
- Cleared-contact juvenile for more info: 54
- Not prosecuted: 20
- Psychopathic case: 14
- Located: 11
- Juvenile diverted: 2
- Juvenile cited: 2
- Complainant refuses to prosecute: 2
- Prosecuted by outside agency: 1
Among the unresolved cases, on which day of the week were most of them registered?
- Friday: 14860
- Saturday: 14104
- Monday: 13127
- Tuesday: 13037
- Thursday: 13029
- Wednesday: 12981
- Sunday: 12945
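This breakdown combines a filter with a frequency count: keep only the cases whose resolution is "None", then count them per weekday. The sketch below assumes columns named `Resolution` and `DayOfWeek` and an uppercase `"NONE"` marker for unresolved cases; these are assumptions, and the rows are made up.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "Resolution": ["NONE", "ARREST, BOOKED", "NONE", "NONE", "UNFOUNDED"],
    "DayOfWeek": ["Friday", "Friday", "Saturday", "Friday", "Monday"],
})

# Unresolved cases are those without any recorded resolution.
unresolved = df[df["Resolution"] == "NONE"]

# Count the unresolved cases per weekday, most frequent first.
by_weekday = unresolved["DayOfWeek"].value_counts()
print(by_weekday)
```

As a consistency check on the figures above, the weekday counts add up to 94083, the number of cases with resolution "None".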
Can we see a map of all cases?
As you can see, open datasets can help to address many interesting questions.