It depends what you need. You may have simple or multiple classification, which means you may have one/several classes per predicted sample. I may say, for the beginning, try to have one class per sample, that would get a better result. If you can have several labels per entity, just start by building one binary model per label.