I have a column of JSON strings and I'd like to convert them to structs, similar to the transformation sqlContext.read.json() performs when it initially reads a file.
Alternatively, is there a way to nest DataFrames as well?
Spark does not support nesting of DataFrames (or Datasets or RDDs).
You can, however, break the problem down into two separate steps.
First, you need to parse the JSON and build a case class consisting entirely of types Spark supports. This part of the problem has nothing to do with Spark, so let's assume you've coded it as:
def buildMyCaseClass(json: String): MyCaseClass = { ... }
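For illustration only, here is a minimal sketch of what such a parser might look like, assuming a hypothetical MyCaseClass with two fields and using json4s (which ships with Spark) to do the parsing. The class and field names are assumptions, not part of the original answer:

import org.json4s._
import org.json4s.jackson.JsonMethods.parse

// Hypothetical schema for the JSON payload; adjust to your actual data.
case class MyCaseClass(name: String, value: Int)

def buildMyCaseClass(json: String): MyCaseClass = {
  implicit val formats: Formats = DefaultFormats
  // parse() turns the string into a JValue; extract maps it onto the case class.
  parse(json).extract[MyCaseClass]
}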
Then, you need to transform the DataFrame so that the string column becomes a struct column. The easiest way to do that is via a UDF.
import org.apache.spark.sql.functions.udf

val builderUdf = udf(buildMyCaseClass _)
df.withColumn("mycol", builderUdf('mycol))
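To make that concrete, here is a small end-to-end usage sketch under the same assumptions (the hypothetical MyCaseClass above and a sample DataFrame named df); after the withColumn call, printSchema should show mycol as a struct:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// A sample DataFrame with one JSON string column.
val df = Seq("""{"name":"a","value":1}""", """{"name":"b","value":2}""").toDF("mycol")

val parsed = df.withColumn("mycol", builderUdf('mycol))
parsed.printSchema()
// root
//  |-- mycol: struct (nullable = true)
//  |    |-- name: string (nullable = true)
//  |    |-- value: integer (nullable = false)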