Spark Functions: filter 2015-08-21 21:06

filter

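Filters the elements of an RDD and returns a new RDD containing only the elements for which the given predicate returns true.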

scala> val a = sc.parallelize(1 to 10, 3)
a: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[29] at parallelize at <console>:21

scala> val b = a.filter(_ % 2 == 0)
b: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[30] at filter at <console>:23

scala> b.collect
res13: Array[Int] = Array(2, 4, 6, 8, 10)
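
The session above relies on the `sc` that spark-shell predefines. Below is a minimal standalone sketch of the same kind of call, assuming Spark is on the classpath and running in local mode. The object name `FilterSketch`, the predicate `isOdd`, and the `local[2]` master are illustrative choices, not part of the original post. It also prints the partition count, since filter keeps the partitions of its parent RDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FilterSketch {
  // Hypothetical predicate for illustration: keep odd numbers.
  val isOdd: Int => Boolean = n => n % 2 != 0

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 2 worker threads, no cluster needed.
    val conf = new SparkConf().setAppName("filter-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(1 to 10, 3)   // same input as above: 1..10 in 3 partitions
      val odds = a.filter(isOdd)           // transformation only; nothing runs yet

      println(odds.partitions.length)          // prints 3
      println(odds.collect().mkString(", "))   // prints 1, 3, 5, 7, 9
    } finally {
      sc.stop()
    }
  }
}
```

Note that filter is a lazy transformation: the predicate is only evaluated once an action such as collect is called.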

Tags: #Spark    Post on Spark-API