
Monday, September 4, 2017

On packaging and shipping tribes.


It all started with Facebook, the one social media platform I have a love-hate relationship with. I like that I find interesting news and opportunities there, but it always comes at a cost. Cat pictures are fun to watch but not very productive. The carefully curated best versions of my friends online are sometimes sad to watch. I recently realized that most of the interesting things are shared in groups rather than on personal profiles. I figured I needed to see headlines from the groups I care about most, to decide whether it's worth logging in or not, so I made a tool for it. Enter Tribe. With Tribe, all you do is enter the group ID and, optionally, a start date, end date and data format, and out come the posts in a properly formatted file.
pip install fb-tribe
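
To give a feel for that interface, here is a minimal sketch of how such a command line could be parsed. The flag names (`--start`, `--end`, `--format`), the `YYYY-MM-DD` date format and the `tribe` program name are assumptions for illustration, not necessarily what fb-tribe actually uses.

```python
import argparse
from datetime import datetime


def parse_date(value):
    """Turn a YYYY-MM-DD string into a date; argparse reports bad input."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def build_parser():
    """Build a parser for one required group ID and three optional settings."""
    parser = argparse.ArgumentParser(
        prog="tribe",  # hypothetical command name
        description="Export posts from a public Facebook group to a file.",
    )
    parser.add_argument("group_id", help="numeric ID of the group")
    parser.add_argument("--start", type=parse_date, default=None,
                        help="only include posts on or after this date")
    parser.add_argument("--end", type=parse_date, default=None,
                        help="only include posts on or before this date")
    parser.add_argument("--format", choices=["json", "csv"], default="json",
                        help="output file format")
    return parser
```

Calling `build_parser().parse_args(["12345", "--format", "csv"])` would yield a namespace with `group_id="12345"`, no date limits and `format="csv"`.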


json magic
csv data
Isn't the Facebook web interface much nicer than a CSV file, you ask? Yes, and I've considered a friendlier interface. In the meantime it will remain as minimalist as it stands. Of what use is data in a structured file? Thanks for asking; ask a data science friend the same question and watch their eyes glow. So basically you built a mini-Facebook for data scientists and minimalist-disguised hippies? Exactly! I built it in Python and, as useless as it may sound, it had its challenges.

First off was the scope. Ideally I wanted to be able to scrape all groups, but the Facebook Graph API only allows CLI apps to scrape open groups. Closed and secret groups require a user token, which can only be obtained through the more secure web and mobile apps. Bummer! I settled on an MVP for public groups only.
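
To make the token situation a bit more concrete, here is a rough sketch of how a request URL for a group's feed could be put together. It assumes the `/{group-id}/feed` edge with `since` and `until` query parameters and API version v2.10, as the Graph API was documented around the time of this post. `build_feed_url` and its parameters are hypothetical names, and the sketch only builds the URL without making any network request.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_feed_url(group_id, access_token, since=None, until=None,
                   version="v2.10"):
    """Build a feed URL for one group (sketch; the edge and version are assumptions)."""
    params = {"access_token": access_token}
    # Optional time window; omitted entirely when not requested.
    if since is not None:
        params["since"] = since
    if until is not None:
        params["until"] = until
    # urlencode percent-escapes characters such as "|" in the token.
    return f"{GRAPH_BASE}/{version}/{group_id}/feed?{urlencode(params)}"
```

For example, `build_feed_url("12345", "APP_ID|APP_SECRET", since="2017-08-01")` produces a URL for group 12345 with a placeholder token and a start date.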

Having not used Python for a while, I ran into interesting language problems, from the mundane, like importing modules in subpackages, to dealing with emojis and Chinese characters in the data. The topic of encoding probably deserves another blog post, but the rule of thumb is to always write to files with an explicit 'utf-8' encoding. Otherwise Python falls back on the platform's locale encoding, which on Windows is typically a 'charmap' code page that doesn't speak emoji very well.
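
The rule of thumb can be sketched as follows. This is a minimal example, not fb-tribe's actual code: `write_posts` and the field names `id`, `created_time` and `message` are assumptions made for illustration.

```python
import csv
import json

FIELDS = ["id", "created_time", "message"]  # hypothetical post fields


def write_posts(posts, path, fmt="json"):
    """Write a list of post dicts to `path` as UTF-8 encoded JSON or CSV."""
    if fmt == "json":
        # encoding="utf-8" keeps the result independent of the OS locale;
        # ensure_ascii=False writes emoji as-is instead of \u escapes.
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(posts, handle, ensure_ascii=False, indent=2)
    elif fmt == "csv":
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", encoding="utf-8", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS,
                                    extrasaction="ignore")
            writer.writeheader()
            writer.writerows(posts)
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
```

Reading the file back needs the same explicit `encoding="utf-8"`, otherwise the locale default sneaks in again on the way out.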

Equally fun was packaging and uploading my first package to PyPI because, you know, who doesn't like to just 'pip install [package]'? Trial and error and Jamie's blog post helped me get through most of the issues with package directory structure and with defining entry points and console scripts in setup.py. All was well until I couldn't upload the zip file due to some permission issue. After lots of tea and internet consumption, it dawned on me that another package called tribe already exists, which is why I changed the name to fb-tribe. The package was uploaded and all was well.
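
For anyone who hasn't met entry points before, a setup.py with a console script looks roughly like the sketch below. Only the distribution name fb-tribe comes from this post; the version number, the `tribe` package layout, the `tribe` command and the `tribe.cli:main` function are hypothetical.

```python
# setup.py (illustrative sketch; version, package and function names are assumptions)
from setuptools import setup, find_packages

setup(
    name="fb-tribe",           # distribution name on PyPI, as used in this post
    version="0.1.0",           # hypothetical version number
    description="Export posts from public Facebook groups to JSON or CSV",
    packages=find_packages(),  # finds the hypothetical `tribe` package and its subpackages
    entry_points={
        # Format: "command = package.module:function". On install, pip
        # generates a `tribe` executable that imports tribe.cli and calls main().
        "console_scripts": ["tribe = tribe.cli:main"],
    },
)
```

Note that the distribution name on PyPI and the import name of the package can differ, which is how a name clash on PyPI can be resolved without renaming the code itself.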

As you can tell, I had so much fun experimenting with the Facebook Graph API, Python and PyPI. If logging on to Facebook gives you anxiety, let's be friends; go ahead and 'pip install fb-tribe' from your command line. If you really love it, kindly give it a star (I like stars). If you find a bug or want to see a new feature, feel free to create an issue, and if you would like to contribute, feel free to open a pull request. Yes, the source code is public on GitHub, here.

So what's next?
Glad you asked. Perhaps some semantic analysis with NLP before presenting a post. Perhaps presenting it in a proper web interface, or an email service, or both; time will tell. In the meantime, I will be getting my Facebook updates the good old-fashioned way, like an offline magazine. Thanks for reading and have a great week ahead!

