
Thursday, September 21, 2017

A gentle introduction to CNNs [Part 1]

No, not the news channel.
Let's talk about Convolutional Neural Networks (CNNs), also known as ConvNets. You should probably care because they power a considerable portion of your apps, whether you realize it or not. Facebook uses them to auto-tag your photos; Google uses them not just for image tagging and street sign translation but also to create weird dreams. Pinterest, Instagram, Amazon, you name it: they all employ these networks in one way or another. We know CNNs are state of the art for computer vision because they have produced winners in the famous ImageNet challenge, among other challenges.

facebook auto-tag
Google 'weird' deep dream
Google Image translate




So, what exactly are CNNs?
To start with, they are neural networks (NNs). Modelled after our very own neuronal structure, neural networks are interconnected processing elements (neurons) which process information by responding to external inputs.

Biological neuron
Artificial Neuron



You can imagine a single neuron as a black box that takes in numerical inputs and produces an output by applying a linear function followed by a non-linear activation function. When many neurons are stacked in a column they form a layer, and layers are interconnected to form a neural network.
Shallow Neural Network of Fully Connected layers



Deep Neural Network



For detailed explanation of NNs, see this post.
As shown above, every neuron in a layer is connected to every neuron in the adjacent layers, which makes them Fully Connected (FC) layers.
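To make the black box a little less black, here is a minimal sketch of a single neuron and a fully connected layer in plain Python. The function names, weights and inputs are made up for illustration, and sigmoid is just one possible choice of activation.

```python
import math

def sigmoid(z):
    """Non-linear activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """One artificial neuron: a weighted sum (linear) followed by an activation (non-linear)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def fc_layer(inputs, weight_rows, biases, activation=sigmoid):
    """A fully connected layer: every neuron in the layer sees every input."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

# Hypothetical numbers: 3 inputs -> hidden layer of 2 neurons -> 1 output neuron
x = [0.5, -1.0, 2.0]
hidden = fc_layer(x, [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]], [0.0, 0.1])
output = fc_layer(hidden, [[1.0, -1.0]], [0.0])
```

Stacking more `fc_layer` calls between the input and the output is all it takes to go from the shallow network to the deep one.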

CNNs go a step further: instead of generic neurons, they are modelled after our very own visual cortex. They are composed not only of multiple layers but also of different kinds of layers: Input (IN), Convolutional (CONV), Non-Linear (RELU), Pooling (POOL), Normalization (optional), Fully Connected (FC) and Output (OUT) layers.

Of particular interest is the Convolutional layer, which performs the linear function, convolution. A convolution neuron is basically a filter that sweeps across the width and height of the input, computing the dot product between itself and the input elements in its receptive field to produce a 2D activation map. The dot product is computed along all three dimensions of the input: width, height and depth. For raw images the depth is 3, one channel each for the R, G and B pixel intensities. When multiple filters are stacked in a layer, each filter produces a different 2D activation map, and together the maps form a 3D output. The CONV layer therefore maps a 3D input to a 3D output, which may differ in dimensions.
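The sweeping filter described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than how any real library does it: stride 1, no padding, nested lists instead of tensors, hypothetical function names, and (like most deep learning libraries) the filter is not flipped, so strictly speaking it is a cross-correlation.

```python
def conv2d_single(image, kernel):
    """Slide one filter over the image and return a 2D activation map.

    image:  H x W x D nested lists
    kernel: FH x FW x D nested lists (same depth D as the image)
    Assumes stride 1 and no padding.
    """
    H, W, D = len(image), len(image[0]), len(image[0][0])
    FH, FW = len(kernel), len(kernel[0])
    out = []
    for i in range(H - FH + 1):
        row = []
        for j in range(W - FW + 1):
            # Dot product over the receptive field: height, width and depth
            total = 0.0
            for di in range(FH):
                for dj in range(FW):
                    for c in range(D):
                        total += image[i + di][j + dj][c] * kernel[di][dj][c]
            row.append(total)
        out.append(row)
    return out

def conv_layer(image, kernels):
    """Apply every filter; the output depth equals the number of filters."""
    return [conv2d_single(image, k) for k in kernels]

# Toy data: a 4x4 RGB image of ones and two 3x3x3 filters
image = [[[1.0] * 3 for _ in range(4)] for _ in range(4)]
ones_filter = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
zero_filter = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
maps = conv_layer(image, [ones_filter, zero_filter])
```

Here a 4x4x3 input and two 3x3x3 filters give two 2x2 activation maps, so the 3D output has a different shape from the 3D input.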

3D  Convolution


The Non-Linear layer applies a non-linear function such as tanh or sigmoid, but most commonly ReLU (Rectified Linear Unit). It changes the values but leaves the dimensions unchanged.
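As a quick sketch (the helper names are hypothetical), applying ReLU or tanh element by element to a small activation map looks like this:

```python
import math

def relu(x):
    """Rectified Linear Unit: negatives become zero, positives pass through."""
    return max(0.0, x)

def apply_elementwise(fn, activation_map):
    """Apply fn to every value; the shape of the map is preserved."""
    return [[fn(v) for v in row] for row in activation_map]

activation_map = [[-2.0, 0.5], [3.0, -0.1]]
rectified = apply_elementwise(relu, activation_map)      # [[0.0, 0.5], [3.0, 0.0]]
squashed = apply_elementwise(math.tanh, activation_map)  # every value now lies in (-1, 1)
```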

The Pooling layer reduces the spatial size of its input by selecting a single value to represent each relatively small region of the input. Common pooling operations include the maximum, the average and the L2-norm. This layer reduces the height and width of the input but does not affect the depth.
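Here is a small sketch of pooling on a single activation map, assuming a 2x2 window with stride 2 (a common choice, but an assumption here) and hypothetical function names. In a real layer the same operation is applied to every depth slice separately.

```python
def pool2d(activation_map, size=2, stride=2, op=max):
    """Summarise each size x size window of the map with a single value."""
    H, W = len(activation_map), len(activation_map[0])
    out = []
    for i in range(0, H - size + 1, stride):
        row = []
        for j in range(0, W - size + 1, stride):
            window = [activation_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(op(window))
        out.append(row)
    return out

def average(values):
    return sum(values) / len(values)

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 0, 5, 6],
        [1, 2, 7, 8]]
pooled_max = pool2d(fmap)              # [[4, 2], [2, 8]]
pooled_avg = pool2d(fmap, op=average)  # [[2.5, 1.0], [0.75, 6.5]]
```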
A series of CONV-RELU-POOL layers is normally stacked together to form a bipyramid-like structure in which the height and width decrease while the depth grows, as higher layers use more filters.
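To see the shrinking and deepening in numbers, the sketch below traces the shape of the data through three CONV-RELU-POOL blocks. The input size, filter counts, 3x3 filters with padding 1 and 2x2 pooling are all hypothetical choices for illustration; the formula is the usual output-size arithmetic for convolution and pooling.

```python
def conv_output_shape(h, w, filter_size, num_filters, stride=1, padding=0):
    """Height and width follow the sliding-window formula; depth = number of filters."""
    out_h = (h + 2 * padding - filter_size) // stride + 1
    out_w = (w + 2 * padding - filter_size) // stride + 1
    return out_h, out_w, num_filters

def pool_output_shape(h, w, d, size=2, stride=2):
    """Pooling shrinks height and width but leaves depth alone."""
    return (h - size) // stride + 1, (w - size) // stride + 1, d

shape = (32, 32, 3)  # hypothetical RGB input
trace = [shape]
for num_filters in (16, 32, 64):
    h, w, _ = shape
    h, w, d = conv_output_shape(h, w, filter_size=3, num_filters=num_filters, padding=1)
    # ReLU sits between CONV and POOL but does not change the shape
    shape = pool_output_shape(h, w, d)
    trace.append(shape)

# trace == [(32, 32, 3), (16, 16, 16), (8, 8, 32), (4, 4, 64)]
```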

bi-pyramid structure

Finally, the network concludes with FC layers, just like a traditional neural network.

full-stack


Despite their complexity, CNNs have a competitive advantage over traditional neural networks. This comes as a result of their peculiar features, some of which I have outlined below.


  1. Convolution filters can be thought of as FC neurons that share parameters across all filter-sized portions of the input. This is in contrast to FC neurons, where each input feature requires a separate parameter. Sharing means less memory is needed for parameter storage, and fewer parameters also reduce the chances of overfitting to the training data.
  2. CNNs pick up patterns that result from the order of input features. In FC layers this order doesn't matter, since every feature is treated independently and its location carries no meaning. CONV filters, on the other hand, have local receptive fields, so patterns that arise out of proximity, or lack thereof, are easily detected. For instance, CNNs can easily detect edge patterns, or that the eyes are close to the nose in faces.
  3. Spatial subsampling (pooling) gives a degree of insensitivity to size, position and slant variations. This is important because images are unstructured data, meaning a given pixel doesn't represent a fixed feature: a nose pixel in one selfie is most likely in a different location in another selfie. A good model stays invariant to these distortions. In CNNs this is achieved by the pooling layer, which places importance on the relative rather than the absolute position of a feature in an image.
  4. Sparse and locally connected neurons reduce the need for manual feature engineering. During training, the network determines which features are important by allocating appropriate parameters to different locations. Zero-valued parameters imply that a feature is not important.
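To put a rough number on the parameter sharing in point 1, the sketch below counts parameters for an FC layer and a CONV layer looking at the same input. The sizes (a 32x32 RGB image, 16 neurons or filters, 3x3 filters) are hypothetical, and each neuron or filter is assumed to have one bias term.

```python
def fc_param_count(in_h, in_w, in_d, num_neurons):
    """Each FC neuron has one weight per input value, plus a bias."""
    return (in_h * in_w * in_d + 1) * num_neurons

def conv_param_count(filter_size, in_d, num_filters):
    """Each filter has filter_size x filter_size x depth weights, plus a bias,
    and reuses them at every position of the input."""
    return (filter_size * filter_size * in_d + 1) * num_filters

fc_params = fc_param_count(32, 32, 3, num_neurons=16)  # 49168
conv_params = conv_param_count(3, 3, num_filters=16)   # 448
```

For this toy setup the CONV layer needs roughly a hundred times fewer parameters than the FC layer.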


CNNs are useful not just for images but also for time-series and speech data. They are a hot topic in research and academia alike, and this post has barely scratched the surface. Therefore, as the title suggests, there will be a part 2 follow-up, where we walk through the details of training CNNs from scratch. Thanks for reading!



Monday, September 4, 2017

On packaging and shipping tribes.


It all started with Facebook, the one social media platform I have a love-hate relationship with. I like that I find interesting news and opportunities there, but it always comes at a cost. Cat pictures are fun to watch but not very productive. The carefully curated best versions of my friends online are sometimes sad to watch. I recently realized that most of the interesting things are shared in groups rather than on personal profiles. I figured I needed to see headlines from the groups I care about most to decide whether it's worth logging in or not, so I made a tool for it: enter Tribe. With Tribe, all you do is enter the group id and, optionally, a start date, end date and data format, and out you get the posts in a properly formatted file.
pip install fb-tribe


json magic
csv data
Isn't the Facebook web interface much nicer than a CSV file, you ask? Yes, and I've considered a friendlier interface. In the meantime it will remain as minimalist as it stands. Of what use is data in a structured file? Thanks for asking; ask a data science friend the same question and watch their eyes glow. So basically you built a mini-Facebook for data scientists and minimalist-disguised hippies? Exactly! I built it in Python and, as useless as it may sound, it had its challenges.

First off was the scope. Ideally I wanted to be able to scrape all groups, but the Facebook Graph API only allows CLI apps to scrape open groups. Closed and secret groups require a user token, which can only be obtained through the more secure Web and Mobile apps. Bummer! I settled on an MVP for public groups only.

Having not used Python for a while, I tackled some interesting language problems, from the mundane problem of importing modules in subpackages to dealing with emojis and Chinese characters in the data. The topic of encoding probably deserves another blog post, but the rule of thumb is to always write to files with an explicit 'utf-8' encoding. Otherwise Python falls back to the platform's default, which on Windows is often a charmap codec such as cp1252 that doesn't speak emoji very well.
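As a minimal sketch of that rule of thumb (the sample posts and file name are made up, and this is not Tribe's actual code), writing and reading JSON with an explicit encoding looks like this:

```python
import json
import os
import tempfile

# Made-up posts containing an emoji (party popper) and Chinese characters ("ni hao")
posts = [
    {"message": "Great meetup today \U0001F389"},
    {"message": "\u4f60\u597d"},
]

path = os.path.join(tempfile.mkdtemp(), "posts.json")

# Pass encoding explicitly instead of relying on the platform default
with open(path, "w", encoding="utf-8") as f:
    # ensure_ascii=False keeps the characters readable in the file
    json.dump(posts, f, ensure_ascii=False)

with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

Reading the file back with the same encoding returns the posts unchanged, emoji and all.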

Equally fun was packaging and uploading my first package to PyPI, 'cause you know, who doesn't like to just 'pip install [package]'? Trial and error and Jamie's blog post helped me get through most of the issues with package directory structure and with defining entry points and console scripts in setup.py. All was well until I couldn't upload the zip file due to some permission issue. After lots of tea and internet consumption, it dawned on me that another package called tribe already exists, which is why I changed the name to fb-tribe. The package was uploaded and all was well.

As you can tell, I had so much fun experimenting with the Facebook Graph API, Python and PyPI. If logging on to Facebook gives you anxiety, let's be friends: go ahead and 'pip install fb-tribe' from your command line. If you really love it, kindly give it a star (I like stars). If you find a bug or want to see a new feature, feel free to create an issue, and if you would like to contribute, feel free to open a pull request. Yes, the source code is public on GitHub, here.

So what's next?
Glad you asked. Perhaps some semantic analysis with NLP before presenting a post. Perhaps presenting it in a proper web interface, or an email service, or both; time will tell. In the meantime, I will be getting my Facebook updates the good old-fashioned way, like an offline magazine. Thanks for reading and have a great week ahead!