JAVA
[java] Spark Java tutorial
닉의네임
2022. 6. 5. 19:01
https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html
Generic Load/Save Functions - Spark 3.2.1 Documentation
Reading a JSON file
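// `spark` is an existing SparkSession; Dataset and Row come from org.apache.spark.sql.
Dataset<Row> peopleDF =
    spark.read().format("json").load("examples/src/main/resources/people.json");
// Keep only the name and age columns and save them in Parquet format.
peopleDF.select("name", "age").write().format("parquet").save("namesAndAges.parquet");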
Reading a CSV file
// Semicolon-separated file whose first line is a header; column types are inferred from the data.
Dataset<Row> peopleDFCsv = spark.read().format("csv")
    .option("sep", ";")
    .option("inferSchema", "true")
    .option("header", "true")
    .load("examples/src/main/resources/people.csv");
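To make the `sep` and `header` options above concrete, here is a minimal plain-Java sketch (no Spark involved) that splits semicolon-separated text in the same spirit: the first line supplies the column names and every following line becomes one row. The class name `CsvOptionsSketch`, the method `parse`, and the sample text are hypothetical and only for illustration. Spark's actual CSV reader also handles quoting, escaping, and type inference, none of which this sketch attempts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: illustrates what sep=";" and header=true mean.
// It is NOT Spark's CSV parser (no quoting, escaping, or type inference).
public class CsvOptionsSketch {

    // Splits `text` into rows; the first line supplies the column names.
    public static List<Map<String, String>> parse(String text, String sep) {
        String[] lines = text.trim().split("\\R");
        // Quote the separator so characters like "|" are not treated as regex syntax.
        String sepRegex = Pattern.quote(sep);
        String[] header = lines[0].split(sepRegex, -1);

        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(sepRegex, -1);
            // LinkedHashMap keeps the columns in header order.
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                // Missing trailing cells are stored as null.
                row.put(header[c], c < cells.length ? cells[c] : null);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample data in the same shape as a semicolon-separated people file.
        String sample = "name;age;job\nJorge;30;Developer\nBob;32;Developer\n";
        for (Map<String, String> row : parse(sample, ";")) {
            System.out.println(row);
        }
    }
}
```

Every value stays a `String` here. In Spark, `inferSchema` set to `true` is what would turn a column such as `age` into a numeric type; without it, the CSV columns are read as strings.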
Writing an ORC file
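// `usersDF` is assumed to be a Dataset<Row> loaded earlier, with `name` and `favorite_color` columns.
// Creates a bloom filter on favorite_color and uses dictionary encoding only for that column.
usersDF.write().format("orc")
    .option("orc.bloom.filter.columns", "favorite_color")
    .option("orc.dictionary.key.threshold", "1.0")
    .option("orc.column.encoding.direct", "name")
    .save("users_with_options.orc");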