Spark SQL: How to consume JSON data from a REST service as a DataFrame
I need to read JSON data from a web service that provides REST interfaces, so that I can query the data from my Spark SQL code for analysis. I am already able to read JSON stored in the blob store and use it.
I am wondering what the best way is to read the data from a REST service and use it like any other DataFrame.
BTW, I am using Spark 1.6 on a Linux cluster on HDInsight, if that helps. I would appreciate it if someone could share code snippets for this, as I am still new to the Spark environment.
On Spark 1.6:
If you are on Python, use the requests library to get the information and then create an RDD from it. There must be a similar library for Scala (relevant thread). Then do:
    json_str = '{"executorcores": 2, "kind": "pyspark", "drivermemory": 1000}'
    rdd = sc.parallelize([json_str])
    # jsonRDD is deprecated since Spark 1.4; sqlContext.read.json(rdd) is the current form
    json_df = sqlContext.jsonRDD(rdd)
    json_df
The code for Scala:
    val anotherPeopleRDD = sc.parallelize(
      """{"name":"yin","address":{"city":"columbus","state":"ohio"}}""" :: Nil)
    val anotherPeople = sqlContext.read.json(anotherPeopleRDD)
This is from: http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
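Putting the Python steps together, here is a minimal sketch of fetching the JSON on the driver and turning it into a DataFrame. It assumes the service returns either a single JSON object or a JSON array of objects, and that the response is small enough to fit in driver memory. The helper names fetch_json_records and rest_to_dataframe are made up for illustration, and the standard library's urlopen stands in for requests so the snippet has no extra dependencies.

```python
import json
from urllib.request import urlopen  # stdlib stand-in for the requests library


def fetch_json_records(url, opener=urlopen):
    """Fetch a JSON payload and return one JSON string per record.

    `opener` is a hypothetical hook so the HTTP call can be swapped out.
    """
    with opener(url) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    # Assumption: the payload is a single object or an array of objects.
    records = payload if isinstance(payload, list) else [payload]
    return [json.dumps(record) for record in records]


def rest_to_dataframe(sc, sqlContext, url):
    """Build a DataFrame from a REST response using the Spark 1.6 API."""
    # The fetch runs on the driver; only the parsed records are distributed.
    rdd = sc.parallelize(fetch_json_records(url))
    return sqlContext.read.json(rdd)
```

With an existing sc and sqlContext, usage would look something like df = rest_to_dataframe(sc, sqlContext, "http://example.com/api/items") (a placeholder URL), followed by df.registerTempTable("items") to query it from Spark SQL like any other DataFrame.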