# Beam
## Questions
- how to use the flink runner on KDA?
- can you use beam-sql on KDA?
- Flink vs Apex?
- https://stackoverflow.com/questions/45861918/apache-apex-vs-apache-flink
- Apex is fundamentally focused on YARN (i.e. a Hadoop-based streaming engine)
- Consuming Beam app outputs in Akka gracefully?
- Akka Streams?
- Kafka Streams
- KStream: immutable (possibly unbounded?) log
- KTable: mutable materialized view
- Where does RxJava fit in?
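The KStream/KTable distinction noted above can be sketched in plain Python. This is not the Kafka Streams API (which is Java). The `Record` alias and the `materialize` function are hypothetical names used only for illustration. It assumes the usual convention that a record with a null value (a "tombstone") deletes its key from the table.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None is treated as a
# tombstone, i.e. "delete this key" (an assumption mirroring KTable semantics).
Record = Tuple[str, Optional[int]]


def materialize(log: Iterable[Record]) -> Dict[str, int]:
    """Fold an append-only log (KStream-like) into a latest-value-per-key
    view (KTable-like)."""
    table: Dict[str, int] = {}
    for key, value in log:
        if value is None:
            table.pop(key, None)  # tombstone: drop the key if present
        else:
            table[key] = value    # upsert: later records overwrite earlier ones
    return table


# The log itself is never mutated; only the derived view changes over time.
log: List[Record] = [("alice", 1), ("bob", 5), ("alice", 3), ("bob", None)]
print(materialize(log))  # {'alice': 3}
```

The log keeps every event, while the view keeps only the current state per key, which is why the same topic can be read either way.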
## Beam Overview
- A unified api (i.e. unifies batch and streaming)
- Specifies a programming model (via an api) exposed in multiple languages
- Doesn't specify an execution engine (operates with many execution engines via specialized runners e.g. the flink runner or the spark runner).
- Like Flink, inspired by MillWheel and Dataflow papers
- How Beam runs on top of Flink
http://flink.apache.org/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html
- Beam (core) + a beam runner impl translates the beam program into one compatible with the execution engine.
- Beam: Apache Flink runner
https://beam.apache.org/documentation/runners/flink/
- Reasons for Beam on Flink
- Beam unifies batch and streaming
- Beam smoothly supports multiple programming languages + native ecosystem libs (e.g. NumPy, pandas, TensorFlow)
- Leverages the best of the Flink engine (e.g. Flink's exactly-once semantics and management tools)
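As a rough illustration of "one program, many runners", here is a minimal sketch of a word count using the Beam Python SDK. The pipeline code never names an execution engine; the runner comes from the options passed in. The helper names (`split_words`, `format_count`, `run`) and the sample input are made up for this example. It assumes the `apache-beam` package is installed. Targeting Flink typically needs extra setup beyond the flag; see the runner page linked above.

```python
from typing import List, Optional, Tuple


def split_words(line: str) -> List[str]:
    """Lower-case a line and split it on whitespace."""
    return line.lower().split()


def format_count(pair: Tuple[str, int]) -> str:
    """Render a (word, count) pair as 'word: count'."""
    word, count = pair
    return f"{word}: {count}"


def run(argv: Optional[List[str]] = None) -> None:
    # Imported here so the pure helpers above can be used and tested
    # without Beam installed (a choice made for this sketch only).
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # The runner is selected by options such as --runner=DirectRunner;
    # the pipeline definition below stays the same.
    options = PipelineOptions(argv)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(["Beam runs on Flink", "Beam runs on Spark"])
            | "Split" >> beam.FlatMap(split_words)
            | "Count" >> beam.combiners.Count.PerElement()
            | "Format" >> beam.Map(format_count)
            | "Print" >> beam.Map(print)
        )
```

Calling `run(["--runner=DirectRunner"])` executes the pipeline locally. In principle only that flag (plus runner-specific options) changes when pointing the same code at Flink with `--runner=FlinkRunner`.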
## Beam Runner compatibility
- https://beam.apache.org/documentation/runners/capability-matrix/
# Apps with queryable state
- Managing Streaming And Queryable State In Spark, Akka Streams, Kafka Streams, And Flink
https://www.lightbend.com/blog/managing-streaming-and-queryable-state-in-spark-akka-streams-kafka-streams-flink
## Thoughts
- Make the pipeline smart
- similar to pushing intelligence to the edge