How can CNNs be used for NLP problems?

mat_lo7

Convolutional Neural Networks (CNNs) are widely used in computer vision, but I recently learned that they can also be used for various natural language processing tasks, for example text classification. When I think of image classification and text classification, I cannot understand how the same type of network can be used for such different tasks. What is similar between text and an image? Could anybody please explain how CNNs are used in NLP?

Answers

Mike1191

Text and images are actually different only for humans. Computers work with all types of information in numeric form. As you probably know, in computer vision an image is represented as a matrix of numbers. A 50x50 image has 2500 pixels, and if it is a color image, each pixel has 3 values, one for each of the red, green, and blue channels (RGB).

When you use a CNN on an image, you specify a convolution that takes several neighboring pixels and processes them together, then slides to the next block of pixels, and so on. For example, if you choose a 2x2 convolution, the CNN analyzes a block of 4 pixels at each step. The aim is to generate features that are then processed by the usual fully connected layers of the network. Roughly speaking, by acting this way a CNN can detect features such as edges, colors, shapes, objects, etc. In short, the purpose of convolution is to produce high-level features from raw pixels, which the traditional fully connected layers at the end of the CNN then use.

For text, each word can also be represented numerically, as a vector of numbers. How exactly this transformation is performed is not important here (it is called text vectorization; you can read more about it in NLP articles and tutorials). The main point is that by applying a convolution to text, you look at a few neighboring words at a time and try to detect some higher-level feature from them. These high-level features are harder to imagine for text than the image examples I mentioned above, but you can think of them as small patterns in the text: mini-topics, simple actions, descriptions, etc. Hope you catch the idea.
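
If it helps to see this concretely, here is a minimal sketch of such a text-classification CNN in PyTorch. This is my own illustration, not code from the answer above; all names and sizes (vocab_size, kernel_size, num_classes, etc.) are just assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, num_filters=64,
                 kernel_size=3, num_classes=2):
        super().__init__()
        # Each word id is mapped to a dense vector (the "text vectorization" step).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The convolution slides over `kernel_size` neighboring words at a time,
        # producing `num_filters` learned features per window.
        self.conv = nn.Conv1d(in_channels=embed_dim, out_channels=num_filters,
                              kernel_size=kernel_size)
        # A fully connected layer turns the pooled features into class scores.
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        x = self.embedding(token_ids)               # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)                       # (batch, embed_dim, seq_len)
        x = F.relu(self.conv(x))                    # (batch, num_filters, seq_len - kernel_size + 1)
        x = F.max_pool1d(x, x.size(2)).squeeze(2)   # max over time -> (batch, num_filters)
        return self.fc(x)                           # (batch, num_classes)

# Example: a batch of 4 "sentences" of 20 word ids each.
model = TextCNN()
dummy_batch = torch.randint(0, 10000, (4, 20))
print(model(dummy_batch).shape)  # torch.Size([4, 2])
```

The max-pooling over time keeps, for each filter, its strongest response anywhere in the sentence, which is one common way to turn a variable-length word sequence into a fixed-size feature vector for the final fully connected layer.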

mat_lo7

This is interesting. Are the convolutions for images and for text the same, or are there differences between them?

Mike1191

The core idea is the same, but the exact form of the convolution differs slightly because of the different nature of image data and text data. The most noticeable difference is that convolutions for NLP problems are usually 1-dimensional rather than the 2-dimensional convolutions used for images: a text is a sequence of words, while an image is a 2-dimensional grid of pixel values (plus a channel dimension for color).
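
To make the shape difference concrete, here is a small sketch comparing the two in PyTorch (again just an illustration, with assumed sizes):

```python
import torch
import torch.nn as nn

# Image: a batch of RGB images, shape (batch, channels, height, width).
image = torch.randn(1, 3, 50, 50)
conv2d = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=2)
print(conv2d(image).shape)     # torch.Size([1, 16, 49, 49]) -- slides over the 2D pixel grid

# Text: a batch of embedded sentences, shape (batch, embed_dim, seq_len).
sentence = torch.randn(1, 100, 20)
conv1d = nn.Conv1d(in_channels=100, out_channels=16, kernel_size=3)
print(conv1d(sentence).shape)  # torch.Size([1, 16, 18]) -- slides along the word sequence only
```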
