我一直在玩实验性的Akka Streams API,我有一个用例,我想看看如何实现。对于我的用例,我有一个 StreamTcp
基于 Flow
正在通过将输入连接流绑定到我的服务器套接字来提供。我拥有的流程基于 ByteString
进入它的数据。进入的数据将在其中具有分隔符,这意味着我应该将分隔符之前的所有内容视为一条消息,并将所有内容作为下一条消息处理到下一个分隔符之后。所以玩一个更简单的例子,不使用套接字和静态文本,这就是我提出的:
import akka.actor.ActorSystem
import akka.stream.{ FlowMaterializer, MaterializerSettings }
import akka.stream.scaladsl.Flow
import scala.util.{ Failure, Success }
import akka.util.ByteString
object BasicTransformation {
def main(args: Array[String]): Unit = {
implicit val system = ActorSystem("Sys")
val data = ByteString("Lorem Ipsum is simply.Dummy text of the printing.And typesetting industry.")
Flow(data).
splitWhen(c => c == '.').
foreach{producer =>
Flow(producer).
filter(c => c != '.').
fold(new StringBuilder)((sb, c) => sb.append(c.toChar)).
map(_.toString).
filter(!_.isEmpty).
foreach(println(_)).
consume(FlowMaterializer(MaterializerSettings()))
}.
onComplete(FlowMaterializer(MaterializerSettings())) {
case any =>
system.shutdown
}
}
}
关于的主要功能 Flow
我发现完成我的目标是 splitWhen
然后产生额外的子流,每个消息一个 .
分隔符。然后,我用另一个步骤流程处理每个子流程,最后在最后打印单个消息。
这一切似乎有点冗长,以实现我认为是一个非常简单和常见的用例。所以我的问题是,是否有更清晰,更简洁的方法来做这个或者这是以分隔符分割流的正确和首选方式?
看起来最近改进了API以包含 akka.stream.scaladsl.Framing。该文档还包含一个 例 如何使用它。关于你的具体问题:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Framing, Source}
import akka.util.ByteString
import com.typesafe.config.ConfigFactory
object TcpDelimiterBasedMessaging extends App {
object chunks {
val first = ByteString("Lorem Ipsum is simply.Dummy text of the printing.And typesetting industry.")
val second = ByteString("More text.delimited by.a period.")
}
implicit val system = ActorSystem("delimiter-based-messaging", ConfigFactory.defaultReference())
implicit val dispatcher = system.dispatcher
implicit val materializer = ActorMaterializer()
Source(chunks.first :: chunks.second :: Nil)
.via(Framing.delimiter(ByteString("."), Int.MaxValue))
.map(_.utf8String)
.runForeach(println)
.onComplete(_ => system.terminate())
}
产生以下输出:
Lorem Ipsum is simply
Dummy text of the printing
And typesetting industry
More text
delimited by
a period
看起来最近改进了API以包含 akka.stream.scaladsl.Framing。该文档还包含一个 例 如何使用它。关于你的具体问题:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Framing, Source}
import akka.util.ByteString
import com.typesafe.config.ConfigFactory
object TcpDelimiterBasedMessaging extends App {
object chunks {
val first = ByteString("Lorem Ipsum is simply.Dummy text of the printing.And typesetting industry.")
val second = ByteString("More text.delimited by.a period.")
}
implicit val system = ActorSystem("delimiter-based-messaging", ConfigFactory.defaultReference())
implicit val dispatcher = system.dispatcher
implicit val materializer = ActorMaterializer()
Source(chunks.first :: chunks.second :: Nil)
.via(Framing.delimiter(ByteString("."), Int.MaxValue))
.map(_.utf8String)
.runForeach(println)
.onComplete(_ => system.terminate())
}
产生以下输出:
Lorem Ipsum is simply
Dummy text of the printing
And typesetting industry
More text
delimited by
a period
现在,在akka-streams文档中的Streams Cookbook中发布了类似的示例代码 从ByteStrings流中解析行。
在Akka用户组发布同样的问题之后,我得到了Endre Varga和Viktor Klang的一些建议(https://groups.google.com/forum/#!topic/akka-user/YsnwIAjQ3EE)。我最终得出了恩德雷的建议 Transformer
然后使用 transform
关于的方法 Flow
。我上一个示例的略微修改版本包含在下面:
import akka.actor.ActorSystem
import akka.stream.{ FlowMaterializer, MaterializerSettings }
import akka.stream.scaladsl.Flow
import scala.util.{ Failure, Success }
import akka.util.ByteString
import akka.stream.Transformer
import akka.util.ByteStringBuilder
object BasicTransformation {
def main(args: Array[String]): Unit = {
implicit val system = ActorSystem("Sys")
implicit val mater = FlowMaterializer(MaterializerSettings())
val data = List(
ByteString("Lorem Ipsum is"),
ByteString(" simply.Dummy text of.The prin"),
ByteString("ting.And typesetting industry.")
)
Flow(data).transform(new PeriodDelimitedTransformer).foreach(println(_))
}
}
随着定义 PeriodDelimitedTransformer
如下:
class PeriodDelimitedTransformer extends Transformer[ByteString,String]{
val buffer = new ByteStringBuilder
def onNext(msg:ByteString) = {
val msgString = msg.utf8String
val delimIndex = msgString.indexOf('.')
if (delimIndex == -1){
buffer.append(msg)
List.empty
}
else{
val parts = msgString.split("\\.")
val endsWithDelim = msgString.endsWith(".")
buffer.putBytes(parts.head.getBytes())
val currentPiece = buffer.result.utf8String
val otherPieces = parts.tail.dropRight(1).toList
buffer.clear
val lastPart =
if (endsWithDelim){
List(parts.last)
}
else{
buffer.putBytes(parts.last.getBytes())
List.empty
}
val result = currentPiece :: otherPieces ::: lastPart
result
}
}
}
因此,我之前的解决方案的一些复杂性已经卷入其中 Transformer
,但这似乎是最好的方法。在我的初始解决方案中,流最终被分成多个子流,这不是我想要的。
我认为安德烈的使用 Framing
是你问题的最佳解决方案,但我遇到了类似的问题 Framing
太有限了我用了 statefulMapConcat
相反,它允许您使用您喜欢的任何规则对输入ByteString进行分组。以下是您的问题的代码,以防它帮助任何人:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Source}
import akka.util.ByteString
object BasicTransformation extends App {
implicit val system = ActorSystem("Sys")
implicit val materializer = ActorMaterializer()
implicit val dispatcher = system.dispatcher
val data = ByteString("Lorem Ipsum is simply.Dummy text of the printing.And typesetting industry.")
val grouping = Flow[Byte].statefulMapConcat { () =>
var bytes = ByteString()
byt =>
if (byt == '.') {
val string = bytes.utf8String
bytes = ByteString()
List(string)
} else {
bytes :+= byt
Nil
}
}
Source(data).via(grouping).runForeach(println).onComplete(_ => system.terminate())
}
哪个产生:
Lorem Ipsum is simply
Dummy text of the printing
And typesetting industry