Automatically And Elegantly Flatten DataFrame In Spark SQL
Answer: The short answer is that there's no "accepted" way to do this, but you can do it very elegantly with a recursive function that generates your `select(...)` statement by walking through the `DataFrame.schema`.

The recursive function should return an `Array[Column]`. Every time the function hits a `StructType`, it calls itself and appends the returned `Array[Column]` to its own `Array[Column]`. Something like:

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.functions.col

// Walks the schema recursively, producing one Column per leaf field.
def flattenSchema(schema: StructType, prefix: String = null): Array[Column] = {
  schema.fields.flatMap(f => {
    // Qualify the field name with its parent path, e.g. "address.city".
    val colName = if (prefix == null) f.name else (prefix + "." + f.name)
    f.dataType match {
      case st: StructType => flattenSchema(st, colName) // recurse into nested structs
      case _              => Array(col(colName))        // leaf field: select it directly
    }
  })
}
```

You would then use it like this:

```scala
df.select(flattenSchema(df.schema): _*)
```
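For illustration, here is a minimal, self-contained sketch of the function in action. The `Person`/`Address` case classes and the local Spark session are assumptions made up for the demo, not part of the original answer:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical local session, just for the demo.
val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Hypothetical nested schema: a struct column inside the row.
case class Address(city: String, country: String)
case class Person(name: String, address: Address)

val df = spark.createDataFrame(Seq(Person("Alice", Address("Oslo", "NO"))))

df.printSchema()
// root
//  |-- name: string (nullable = true)
//  |-- address: struct (nullable = true)
//  |    |-- city: string (nullable = true)
//  |    |-- country: string (nullable = true)

val flat = df.select(flattenSchema(df.schema): _*)

flat.printSchema()
// root
//  |-- name: string (nullable = true)
//  |-- city: string (nullable = true)
//  |-- country: string (nullable = true)
```

One caveat worth knowing: the flattened columns keep only their leaf field names, so two sibling structs containing identically named fields would produce duplicate column names. If that is a concern, one option is to alias each column inside `flattenSchema`, e.g. `col(colName).as(colName.replace(".", "_"))`.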