This part of the code is meant to work with a numeric data set, as in the original example that compares iris flower data from the iris.txt file. That is why you get the NumberFormatException: you are passing in a string of words, which cannot be converted to a number. To work with words, use the Word2Vec model from the Deeplearning4j library. An example of comparing words, with source code, is described in "DL4J NLP Word2Vec Java".
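To illustrate where the exception comes from, here is a minimal sketch using only the standard library. It assumes the data-set reader converts each comma-separated field with something like `Double.parseDouble`. The class name `ParseDemo`, the helper `isNumeric`, and the sample rows are hypothetical and not taken from the original code.

```java
public class ParseDemo {
    // Returns true if the field can be converted to a number,
    // false if the conversion throws NumberFormatException.
    static boolean isNumeric(String field) {
        try {
            Double.parseDouble(field);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative rows: one numeric row in the style of iris.txt, one row of words.
        String numericRow = "5.1,3.5,1.4,0.2";
        String textRow = "hello,world";

        for (String row : new String[] {numericRow, textRow}) {
            for (String field : row.split(",")) {
                String result = isNumeric(field) ? "parsed as a number" : "NumberFormatException";
                System.out.println(field + " -> " + result);
            }
        }
    }
}
```

Every field of the numeric row parses, while every field of the text row fails. This is the same failure you see when text is fed to code that expects iris-style numeric input.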