You could use Pytorch or Tenserflow, thay have functions to do all what you need.
In my opinion, if your network architecture is big use Pytorh , otherwise use Tenseflow.