Machine learning - Can a model be created in Spark batch and used in Spark Streaming?
Can I create a model in Spark batch and use it in Spark Streaming for real-time processing?
I have seen various examples on the Apache Spark site where both training and prediction are built on the same type of processing (linear regression).
Of course, yes. In the Spark community this is called offline training with online predictions. Many training algorithms in Spark let you save the model to a file system such as HDFS or S3. The same model can then be loaded by a streaming application, where you call the model's predict method to make predictions.
See the section on Streaming + MLlib in this link.
For example, if you want to train a DecisionTree offline and make predictions online...
In the batch application:
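import org.apache.spark.mllib.tree.DecisionTree

// Train the classifier, then save it so another application can load it
val model = DecisionTree.trainClassifier(trainingData, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins)
model.save(sc, "target/tmp/mydecisiontreeclassificationmodel")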
In the streaming application:
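import org.apache.spark.mllib.tree.model.DecisionTreeModel

// Load the saved model and score new data
val sameModel = DecisionTreeModel.load(sc, "target/tmp/mydecisiontreeclassificationmodel")
sameModel.predict(newData)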
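The snippets above leave out where newData comes from on the streaming side. Here is a minimal end-to-end sketch of how the two halves might fit together in one local program. Everything beyond the trainClassifier, save, load and predict calls is an assumption made for illustration: the toy training points, the parameter values (2 classes, "gini", depth 3, 32 bins), the temporary model directory, the object name OfflineTrainOnlinePredict, and the use of queueStream as a stand-in for a real source such as Kafka or a socket.

```scala
import java.nio.file.Files
import scala.collection.mutable

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.model.DecisionTreeModel
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical demo object; in practice the batch and streaming parts
// would normally be two separate applications sharing only the model path.
object OfflineTrainOnlinePredict {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("OfflineTrainOnlinePredict")
    val sc = new SparkContext(conf)

    // ---- Batch side: train on a toy data set (assumed) and save the model ----
    val trainingData: RDD[LabeledPoint] = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 0.1)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.0)),
      LabeledPoint(1.0, Vectors.dense(1.0, 0.9)),
      LabeledPoint(1.0, Vectors.dense(0.8, 1.0))
    ))
    // Assumed parameters: 2 classes, no categorical features, gini, depth 3, 32 bins
    val model = DecisionTree.trainClassifier(
      trainingData, 2, Map[Int, Int](), "gini", 3, 32)

    // save() requires a path that does not exist yet, so use a fresh subdirectory
    val modelPath =
      Files.createTempDirectory("dt-demo").resolve("model").toString
    model.save(sc, modelPath)

    // ---- Streaming side: load the saved model and score each incoming record ----
    val ssc = new StreamingContext(sc, Seconds(1))
    val sameModel = DecisionTreeModel.load(sc, modelPath)

    // queueStream stands in for a real source; each queued RDD becomes one batch
    val queue = mutable.Queue[RDD[Vector]](
      sc.parallelize(Seq(Vectors.dense(0.1, 0.0), Vectors.dense(0.9, 1.0))))
    val newData = ssc.queueStream(queue)

    // Pair every feature vector with the class predicted by the loaded model
    newData.map(features => (features, sameModel.predict(features))).print()

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // run for a few seconds, then shut down
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

The model is loaded once on the driver and captured in the map closure, so it is shipped to the executors together with the tasks. For a large model, broadcasting it explicitly may be worth considering.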