Spark Streaming实例(1.2.1) 2015-03-19 15:30

说明

以下代码都是在spark-shell中执行的。

启动Socket

在10.255.1.6上启动Socket监听9999端口。待spark连接上新端口后,输入文本,则Spark将准时间统计一行中的单词数。

nc -kl 9999

spark-shell中运行

spark-shell:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

//每10秒提交一次
val ssc = new StreamingContext(sc, Seconds(10))

val lines = ssc.socketTextStream("10.255.1.6", 9999)
val words = lines.flatMap(_.split(" "))
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
wordCounts.print()
ssc.start()
ssc.awaitTermination()

结果:

-------------------------------------------
 Time: 1426751040000 ms
 -------------------------------------------
 (hou,12)
 (cheyo,16)
 (hello,4)
 (world,4)

spark-submit中运行

  • 代码
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object NetworkWordCount {
  def main(args: Array[String]) {
    if (args.length < 2) {
      System.err.println("Usage: NetworkWordCount <hostname> <port>")
      System.exit(1)
    }

    val sparkConf = new SparkConf().setAppName("NetworkWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(10))

    val lines = ssc.socketTextStream(args(0), args(1).toInt)
    val words = lines.flatMap(_.split(" "))
    val pairs = words.map(word => (word, 1))
    val wordCounts = pairs.reduceByKey(_ + _)
    wordCounts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
  • sbt打包

ncwordcount.sbt:

name := "Network Word Count Project"

version := "1.0"

scalaVersion := "2.11.5"

libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.2.1"

打包:

sbt package
  • 运行
spark-submit \
  --class "NetworkWordCount" \
  --master local[2] \
  target/scala-2.11/network-word-count-project_2.11-1.0.jar 10.255.1.6 9999

注意:本地模式运行时,应至少指定2个线程。1个接收数据,1个处理数据。

Tags: #Spark #SparkStreaming    Post on Streaming