Running PySpark code in Jupyter Notebook

  • Prerequisite: PySpark is already installed on Windows
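Before opening a session, it can help to confirm that the notebook kernel actually sees the installation. The following is a minimal sketch, not part of the original post: the helper name `check_pyspark_prereqs` is hypothetical, and the choice of `JAVA_HOME` and `SPARK_HOME` as the variables worth inspecting is an assumption (a pip-installed PySpark may work without `SPARK_HOME` being set).

```python
import importlib.util
import os
import shutil


def check_pyspark_prereqs():
    """Report what this Python kernel can see (hypothetical helper, for illustration)."""
    return {
        # True if the pyspark package is importable by this interpreter
        "pyspark_importable": importlib.util.find_spec("pyspark") is not None,
        # True if a `java` executable can be found on PATH
        "java_on_path": shutil.which("java") is not None,
        # Values of commonly used environment variables, or None if unset
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        "SPARK_HOME": os.environ.get("SPARK_HOME"),
    }


# Print one line per check so problems are easy to spot in a notebook cell
for name, value in check_pyspark_prereqs().items():
    print(f"{name}: {value}")
```

If `pyspark_importable` is False, the notebook is probably running on a different Python environment than the one where PySpark was installed.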

Setting up the connection

  • Write the PySpark code in Jupyter Notebook

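from pyspark.sql import SparkSession
# Environment setup: start (or reuse) a local Spark session with Hive support enabled
spark = SparkSession.builder.master("local").appName("test").enableHiveSupport().getOrCreate()
sc = spark.sparkContext
# Test whether the setup works: build a one-element RDD and fetch it back
rdd = sc.parallelize([("hello", 1)])
rdd.collect()  # should return [('hello', 1)]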
           
