Friday, 29 November 2013

Updates on CrowdPee

As I mentioned in the previous post, the Zürich Open Data Hacknights were a good opportunity to get started on gathering data on the available toilets in Zürich. Together, David Stark and I built a website that hosts a questionnaire for each of our locations, and a Twitter bot to ask people to fill it out.

NearbySources, the website we made, is a system for crowdsourcing information about places from people who are there. Get in touch if you're interested in using it for a project of your own.

We presented our progress so far at the final Hacknight, using the slides below. Amazingly, when the projects were voted for at the end of the night, we came in second place! It felt great to see that so many people were interested in getting this useful information out there.

From conversations at the event and later, we've come up with the following to-do items:

  • Make the questionnaires searchable, so that data can be added - done!
  • Add a 'more info' text to the questionnaire, and link to it from the location-questionnaire pages - done!
  • Enable tweeting at the bot to receive a questionnaire
  • Improve the results display and export function
  • Include the coordinates of each location in the results

@CrowdPee has seen several retweets and favourites on Twitter (and, most importantly, some data on toilets!). Unfortunately, it's also been suspended once on suspicion of spamming. This can happen even to helpful bots, as this article on the @FeelBetterBot shows. (Thanks to Suzy Hamilton for that link.)

Here are the changes we've made, which will hopefully make the bot more helpful for everyone:
  • When it detects a geotagged tweet from within 10km of a location of interest, it will ask to follow the user who posted it. They are near enough to Zürich that they might visit one day.
  • When it detects a geotagged tweet from within 100m of a location of interest, it checks whether it already follows that user. If so, it sends them the link to the questionnaire about that location.
  • If a user follows the bot, it will send them a questionnaire for any other location they tweet from. If not, it only ever sends one tweet.
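
As a rough sketch, the rules above boil down to something like the following. This is not the bot's actual code: the function name, the Action values and the arguments are all made up for illustration, and I'm assuming "only ever sends one tweet" means one tweet in total to a user who doesn't follow the bot.

```python
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"
    FOLLOW = "follow"
    SEND_QUESTIONNAIRE = "send_questionnaire"


# Radii taken from the rules above, in metres.
FOLLOW_RADIUS_M = 10_000
ASK_RADIUS_M = 100


def decide(distance_m, bot_follows_user, user_follows_bot, tweets_sent_to_user):
    """Pick what to do with one geotagged tweet (hypothetical helper).

    distance_m is the distance from the tweet to the nearest location of interest.
    """
    if distance_m <= ASK_RADIUS_M and bot_follows_user:
        # Users who follow the bot back can be asked again at other locations;
        # everyone else gets at most one tweet (assumption: one in total).
        if user_follows_bot or tweets_sent_to_user == 0:
            return Action.SEND_QUESTIONNAIRE
        return Action.IGNORE
    if distance_m <= FOLLOW_RADIUS_M and not bot_follows_user:
        # Close enough to Zürich to be worth following, but no tweet yet.
        return Action.FOLLOW
    return Action.IGNORE
```

Keeping the decision in one small function with no Twitter calls in it makes it easy to try out the rules without actually tweeting at anyone.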

I hope that by only tweeting at people it follows, after they've had a chance to block it, the bot will look less spammy.

Some improvements to identifying nearby locations:
  • We originally filtered the public Twitter stream by location, to get only tweets from the area around Zürich:
    • stream.filter(locations=[8.41, 47.31, 8.62, 47.48])
  • Unfortunately, this filtering didn't always work, for reasons only Twitter knows. The bot followed a lot of people in the Czech Republic and France before we realised this! This is why it now double-checks that a tweet is within 10km of a location of interest.
  • The bot also seemed to be fixated on only a few locations. This was because we were asking for the origin of the bounding box of the 'place' of the tweet:
    • lng, lat =
  • It now asks for the actual coordinates of the tweet, and the problem is solved.
    • lng, lat = status.coordinates['coordinates']
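
For anyone wanting to do a similar double-check, here is a minimal sketch of a distance test using the haversine formula. It is not necessarily how the bot does it: the function names, the dictionary of locations and the example coordinates are assumptions for illustration. It expects longitude first and latitude second, the same order as in status.coordinates['coordinates'].

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in metres between two (lng, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_location(lng, lat, locations, max_distance_m=10_000):
    """Return (name, distance_m) of the closest location within range, or None.

    locations maps a name to a (lng, lat) tuple (hypothetical data structure).
    """
    best = None
    for name, (loc_lng, loc_lat) in locations.items():
        d = haversine_m(lng, lat, loc_lng, loc_lat)
        if d <= max_distance_m and (best is None or d < best[1]):
            best = (name, d)
    return best


# Illustrative only: an approximate position for Zürich main station.
locations = {"hauptbahnhof": (8.5402, 47.3782)}
```

With this, a tweet from Prague (roughly 14.42, 50.09) returns None and is ignored, even if the stream filter lets it through. Calling the same function with max_distance_m=100 covers the stricter check before sending a questionnaire.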

The next improvements will be to the results. They should be exportable in at least CSV and JSON format, and displayable on a map. I will also upload them to the database.
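
A first sketch of what the export could look like, using only Python's standard library. The field names and the shape of a result row are placeholders I've made up here, not the actual NearbySources data model.

```python
import csv
import io
import json

# Hypothetical column names; the real results may be structured differently.
FIELDS = ["location", "lng", "lat", "answer"]


def export_json(rows):
    """Serialise a list of result dicts as a JSON string."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def export_csv(rows, fieldnames=FIELDS):
    """Serialise a list of result dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Storing the coordinates with each row, as in the to-do list above, would also make it straightforward to put the results on a map.
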
Any comments or suggestions for CrowdPee are very welcome!