I'm loading many versions of files into a Spark DataFrame. Some of the files hold columns A,B and some hold A,B,C or A,C.
If I run this command:
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.sql("SELECT A,B,C FROM table")
After loading several files I can get the error "column not exist" when I have loaded only files that do not hold column C. How can I set this value to null instead of getting an error?
Thanks!
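One direction that might work, as a rough sketch only: build the SELECT text from the columns the table actually has, and substitute a typed NULL for any wanted column that is missing. The helper name `build_select` and its `null_type` parameter are hypothetical (not part of PySpark), and the sketch assumes the missing column can be treated as a string and that column names are matched case-insensitively, which is Spark SQL's default setting as far as I know.

```python
def build_select(available, wanted, table, null_type="STRING"):
    """Build a SELECT that uses NULL for wanted columns the table lacks.

    Hypothetical helper: `available` is the list of columns the table
    really has, `wanted` is the list of columns the query should return.
    """
    # Assumption: column names are compared case-insensitively.
    present = {name.lower() for name in available}
    parts = []
    for col in wanted:
        if col.lower() in present:
            parts.append(col)
        else:
            # Missing column: emit a typed NULL under the expected name.
            parts.append("CAST(NULL AS {}) AS {}".format(null_type, col))
    return "SELECT {} FROM {}".format(", ".join(parts), table)


# Example with only A and B loaded, so C is replaced by NULL.
query = build_select(["A", "B"], ["A", "B", "C"], "table")
print(query)
```

With a live context, the available columns could be read from the registered table and the result passed to the same call as in the question, e.g. `sqlContext.sql(build_select(sqlContext.table("table").columns, ["A", "B", "C"], "table"))`.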