Classification based on the taxonomy
I need to classify companies according to a predefined taxonomy that has 3 levels. How can I do this?
Please give us more details about the problem. What are the input features, how many classes does each level have, how much training data do you have, etc.?
I have just 1000 labelled companies, and that is all my training data. The input to the model is the company description (text). On the first level I have 8 categories. On each subsequent level, every category from the previous level has up to 8 subcategories.
I think you do not have enough data to achieve good quality. My advice is to start by classifying level 1 only. But 1000 training examples are probably not enough even for that level. Try to collect more data.
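To make the data concern concrete, here is a rough back-of-the-envelope calculation using only the figures from the question. It assumes the worst case, where every category really has 8 children and the labelled companies are spread evenly, which is unlikely to hold exactly in practice.

```python
# Upper bound on class counts, using the figures from the question:
# 8 categories at level 1 and up to 8 children per category below that.
branching = 8
levels = 3
n_labelled = 1000

# Number of classes at each level if every category has the full 8 children.
classes_per_level = [branching ** depth for depth in range(1, levels + 1)]

# Average labelled examples per class, assuming an even spread (an idealisation).
examples_per_class = [n_labelled / n for n in classes_per_level]

for depth, (n_classes, per_class) in enumerate(
    zip(classes_per_level, examples_per_class), start=1
):
    print(f"level {depth}: up to {n_classes} classes, "
          f"about {per_class:.1f} examples per class")
```

Under these assumptions level 1 gets about 125 examples per class, while level 3 gets about 2, which is why starting with level 1 alone is the more realistic target.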
OK, I will think about this. But what are the approaches to this task? How can I classify a company down to level 3?
There is an approach called hierarchical classification; search for it on Google. It has several variants and modifications.
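As an illustration, one common variant of hierarchical classification is the top-down scheme: train one small classifier per parent node, then walk the taxonomy from the root. The sketch below is only a minimal example under stated assumptions. `TinyNaiveBayes`, `train_top_down`, `predict_top_down`, and the toy companies and labels are all hypothetical names and data made up for this example, and the bag-of-words Naive Bayes is a stand-in for whatever text classifier you would actually use.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Minimal bag-of-words multinomial Naive Bayes (stand-in for a real text model)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.label_counts.values())

        def log_score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # Laplace smoothing
            prior = math.log(self.label_counts[label] / total)
            return prior + sum(math.log((counts[w] + 1) / denom) for w in words)

        return max(self.label_counts, key=log_score)


def train_top_down(texts, paths):
    """Train one classifier per parent node; each path is a tuple like (l1, l2, l3)."""
    grouped = defaultdict(lambda: ([], []))
    for text, path in zip(texts, paths):
        for depth, label in enumerate(path):
            # The parent node is the path prefix; the root is the empty tuple.
            node_texts, node_labels = grouped[path[:depth]]
            node_texts.append(text)
            node_labels.append(label)
    return {parent: TinyNaiveBayes().fit(t, l) for parent, (t, l) in grouped.items()}


def predict_top_down(models, text, depth=3):
    """Walk from the root, letting each node's classifier pick among its own children."""
    path = ()
    while len(path) < depth and path in models:
        path += (models[path].predict(text),)
    return path


# Toy data, invented for illustration only.
texts = [
    "online bank offering consumer loans",
    "retail bank with branch network",
    "car insurance for drivers",
    "cloud software for accounting teams",
    "mobile game studio",
]
paths = [
    ("finance", "banking", "digital"),
    ("finance", "banking", "retail"),
    ("finance", "insurance", "motor"),
    ("tech", "software", "b2b"),
    ("tech", "games", "mobile"),
]

models = train_top_down(texts, paths)
prediction = predict_top_down(models, "bank with consumer loans")
print(prediction)
```

Because each classifier only chooses among the children of an already predicted parent, the result is always a valid path in the taxonomy. The trade-off is that a mistake at level 1 cannot be corrected further down.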
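You can also try to classify companies at each level directly (an independent model for each level). But you definitely do not have enough data for this approach. It can also produce nasty errors, such as mixing classes from different branches, where the class predicted at a lower level does not belong to the class predicted at the level above.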
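With independent per-level models, nothing forces the three predictions to form a valid path through the taxonomy. A simple check like the following can at least detect such cases. The `PARENT` mapping and its labels are a hypothetical toy taxonomy, not the one from the question.

```python
# Hypothetical taxonomy: each label maps to its parent (None for level-1 labels).
PARENT = {
    "finance": None,
    "tech": None,
    "banking": "finance",
    "insurance": "finance",
    "software": "tech",
    "retail banking": "banking",
    "motor insurance": "insurance",
    "b2b software": "software",
}

_MISSING = object()  # sentinel so that unknown labels never match any parent


def is_consistent(path):
    """Return True if every label in (l1, l2, l3) is a child of the label before it."""
    expected_parent = None
    for label in path:
        if PARENT.get(label, _MISSING) is not expected_parent and \
                PARENT.get(label, _MISSING) != expected_parent:
            return False
        expected_parent = label
    return True


# Predictions as they might come from three independent models.
good = ("finance", "banking", "retail banking")
bad = ("tech", "banking", "retail banking")  # level 2 is not a child of level 1

print(is_consistent(good), is_consistent(bad))
```

Inconsistent predictions could then be rejected, or resolved by trusting the most confident level. Which policy is appropriate depends on your data and is not something this sketch decides.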
What about neural networks? Can they help here?
Neural networks require even more data than traditional approaches. With just 1000 training examples, training a neural network makes little sense. In theory, you could build a single neural network that predicts the labels for all 3 levels simultaneously. But, again, the central issue here is the lack of training data.
Thanks for all the information!