machine learning - Generating a vector from text data for KMeans using Spark
    I am new to Spark and machine learning. I am trying to cluster data like the following using KMeans:

        1::hi how
        2::i fine, how

    In this data the separator is `::`, and the actual text to cluster is in the second column. After reading the official Spark pages and numerous articles I have written the following code, but I am not able to generate the vector to provide as input to the `KMeans.train` step.

        import org.apache.spark.SparkConf
        import org.apache.spark.SparkContext
        import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
        import org.apache.spark.mllib.linalg.Vectors

        val sc = new SparkContext("local", "test")

        // Needed for the implicit RDD-to-DataFrame conversion (toDF)
        val sqlContext = new org.apache.spark.sql.SQLContext(sc)
        import sqlContext.implicits._

        import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}

        // Keep only the text column (the second field after "::")
        val rawData = sc.textFile("data/mllib/km.txt").map(line => line.split("::")(1))

        val sentenceData = rawData.toDF("sentence")

        // Split each sentence into words
        val tokenizer = new Tokenizer().setInputCol("sentence").setOutputCol("words")

        val wordsData = tokenizer...
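    One possible way to get from the text column to something `KMeans.train` accepts is sketched below. Note that it swaps in the RDD-based `org.apache.spark.mllib.feature.HashingTF` and `IDF` in place of the `org.apache.spark.ml.feature` classes imported above, because those return an `RDD[Vector]` directly. The object name `TextKMeansSketch`, the helper `tokens`, and the values 1000 (hash buckets), 2 (clusters) and 20 (iterations) are assumptions chosen for illustration, not part of the original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

object TextKMeansSketch {

  // Hypothetical helper: take the text after the first "::" and split it
  // into lowercase words. Returns None for lines without a separator.
  def tokens(line: String): Option[Seq[String]] =
    line.split("::", 2) match {
      case Array(_, text) =>
        Some(text.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq)
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("test")
    val sc = new SparkContext(conf)

    // One Seq[String] of words per input line; malformed lines are dropped.
    val documents: RDD[Seq[String]] =
      sc.textFile("data/mllib/km.txt").flatMap(line => tokens(line).toList)

    // Hash each word into one of 1000 buckets (assumed size) to get
    // term-frequency vectors.
    val tf: RDD[Vector] = new HashingTF(1000).transform(documents)
    tf.cache() // IDF makes two passes: one to fit, one to transform

    // Rescale term frequencies by inverse document frequency.
    val tfidf: RDD[Vector] = new IDF().fit(tf).transform(tf)

    // k = 2 and maxIterations = 20 are placeholder values.
    val model = KMeans.train(tfidf, 2, 20)

    // Print the cluster index assigned to each input line.
    model.predict(tfidf).collect().foreach(println)

    sc.stop()
  }
}
```

    If you would rather stay with the `ml.feature` pipeline, the `features` column of the resulting DataFrame would have to be extracted into an `RDD[Vector]` before calling `KMeans.train`. How to do that depends on the Spark version, so it is left out of this sketch.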