Create a Spark session
import findspark
findspark.init()
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
Create a SparkContext
sc = spark.sparkContext
Create rdd0 from a NumPy array
import numpy as np
data = np.arange(10)  # example data; any NumPy array works
rdd0 = sc.parallelize(data)
Create rdd1-rdd4 by chaining transformations (rdd2 below is an illustrative example; rdd3 and rdd4 follow the same pattern)
rdd1 = rdd0.map(lambda x: x ** 2)
rdd2 = rdd1.filter(lambda x: x % 2 == 0)
Execute the RDD with an action (transformations are lazy; nothing runs until an action such as collect() is called)
rdd1.collect()
Cache the RDD so later actions reuse the computed partitions instead of recomputing them
rdd1.cache()
Retrieve a portion of the data (take() is an action, not a transformation; it returns the first n elements)
rdd2.take(3)