
Java Flink example: tumbling processing-time window aggregation with TumblingProcessingTimeWindows


The overall approach is:

  • Build the data source
  • Write the window aggregation code

1. Building the data source

First define the record type: create a file named MyData2.java containing the MyData2 class.

package create_data;

import java.util.Arrays;

public class MyData2 {
    public int keyId;
    public long timestamp;
    public int num;
    public double[] valueList;

    public MyData2() {
    }

    public MyData2(int keyId, long timestamp, int num, double[] valueList) {
        this.keyId = keyId;
        this.timestamp = timestamp;
        this.num = num;
        this.valueList = valueList;
    }

    public int getKeyId() {
        return keyId;
    }

    public void setKeyId(int keyId) {
        this.keyId = keyId;
    }

    public long getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(long timestamp) {
        this.timestamp = timestamp;
    }

    public double[] getValueList() {
        return valueList;
    }

    public void setValueList(double[] valueList) {
        this.valueList = valueList;
    }

    public int getNum() {
        return num;
    }

    public void setNum(int num) {
        this.num = num;
    }

    @Override
    public String toString() {
        return "MyData{" +
                "keyId=" + keyId +
                ", timestamp=" + timestamp +
                ", num=" + num +
                ", valueList= " + Arrays.toString(valueList) +
                '}';
    }
}

Next we need a class that drives data generation. Create MyDataSource2.java with the following content:

package create_data;

import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.util.Random;

public class MyDataSource2 implements SourceFunction<MyData2> {
    // Flag that controls data generation; volatile because cancel() is called from another thread
    private volatile boolean isRunning = true;
    private final Random random = new Random(0);

    @Override
    public void run(SourceContext<MyData2> ctx) throws Exception {
        while (isRunning) {
            // Emit a record with a random key in [0, 3), the current time, a count of 1, and one random value
            ctx.collect(new MyData2(random.nextInt(3), System.currentTimeMillis(), 1, new double[]{random.nextDouble()}));

            Thread.sleep(1000L); // generate one record per second
        }
    }

    @Override
    public void cancel() {
        isRunning = false;
    }
}
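
The source loop above is stopped by a flag that cancel() flips from a different thread. The following is a minimal plain-Java sketch of that pattern without any Flink dependency. The class name FlagLoopDemo and the Consumer that stands in for SourceContext are assumptions made for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

public class FlagLoopDemo {
    // volatile so that the write made by the cancelling thread is visible to the loop thread
    private volatile boolean isRunning = true;
    private final Random random = new Random(0);

    // Emits one random key in [0, 3) per iteration until cancel() is called.
    // The Consumer is a hypothetical stand-in for Flink's SourceContext.
    public void run(Consumer<Integer> out, long sleepMillis) throws InterruptedException {
        while (isRunning) {
            out.accept(random.nextInt(3));
            Thread.sleep(sleepMillis);
        }
    }

    public void cancel() {
        isRunning = false;
    }

    public static void main(String[] args) throws Exception {
        FlagLoopDemo demo = new FlagLoopDemo();
        List<Integer> emitted = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(() -> {
            try {
                demo.run(emitted::add, 10L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        Thread.sleep(100L);   // let the loop emit a few records
        demo.cancel();        // request a stop from the main thread
        worker.join(1000L);
        System.out.println("emitted " + emitted.size() + " keys, worker alive: " + worker.isAlive());
    }
}
```

Without volatile, the loop thread is not guaranteed to observe the update made by cancel(), which is why the flag is declared that way in this sketch.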

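2. Window aggregation class

Finally, create FullWindowLearn2.java, which builds the windowed aggregation job. Despite the class name, reduce aggregates incrementally as records arrive instead of buffering the whole window.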

package windows_learn;

import create_data.MyData2;
import create_data.MyDataSource2;
import org.apache.commons.lang3.ArrayUtils;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class FullWindowLearn2 {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(3);

        DataStreamSource<MyData2> sourceStream = env.addSource(new MyDataSource2());

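        // Key the stream by keyId, then aggregate each key over 5-second tumbling processing-time windows.
        // The string-based keyBy is deprecated in newer Flink 1.x releases; a KeySelector is the typed alternative.
        SingleOutputStreamOperator<MyData2> outStream = sourceStream
                .keyBy("keyId")
                .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))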
                .reduce(new ReduceFunction<MyData2>() {
                    @Override
                    public MyData2 reduce(MyData2 value1, MyData2 value2) throws Exception {
                        return new MyData2(value1.keyId, value2.timestamp, value1.getNum() + value2.getNum(),
                                ArrayUtils.addAll(value1.valueList, value2.valueList));
                    }
                });
        outStream.print();
        env.execute();
    }
}
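
To make the combining rule in the ReduceFunction easier to follow, here is a small plain-Java sketch that applies the same rule to the records of one key in one window. The Rec class is a simplified, hypothetical stand-in for MyData2, the sample values are made up, and the array concatenation uses the JDK instead of ArrayUtils.addAll.

```java
import java.util.Arrays;
import java.util.List;

public class ReduceSketch {
    // Simplified stand-in for MyData2 (hypothetical, for illustration only).
    static final class Rec {
        final int keyId;
        final long timestamp;
        final int num;
        final double[] valueList;

        Rec(int keyId, long timestamp, int num, double[] valueList) {
            this.keyId = keyId;
            this.timestamp = timestamp;
            this.num = num;
            this.valueList = valueList;
        }
    }

    // Same rule as the job's ReduceFunction: keep the key, take the later
    // record's timestamp, add the counts, and concatenate the value arrays.
    static Rec reduce(Rec a, Rec b) {
        double[] merged = Arrays.copyOf(a.valueList, a.valueList.length + b.valueList.length);
        System.arraycopy(b.valueList, 0, merged, a.valueList.length, b.valueList.length);
        return new Rec(a.keyId, b.timestamp, a.num + b.num, merged);
    }

    // Folds the records of one key and one window from left to right.
    static Rec reduceAll(List<Rec> records) {
        Rec acc = records.get(0);
        for (int i = 1; i < records.size(); i++) {
            acc = reduce(acc, records.get(i));
        }
        return acc;
    }

    public static void main(String[] args) {
        // Made-up sample records for key 2 within a single window.
        List<Rec> window = Arrays.asList(
                new Rec(2, 1000L, 1, new double[]{0.1}),
                new Rec(2, 2000L, 1, new double[]{0.2}),
                new Rec(2, 3000L, 1, new double[]{0.3}));
        Rec r = reduceAll(window);
        // prints: 2 3000 3 [0.1, 0.2, 0.3]
        System.out.println(r.keyId + " " + r.timestamp + " " + r.num + " " + Arrays.toString(r.valueList));
    }
}
```

When a key has only one record in a window, there is nothing to combine, so that record is emitted unchanged with num=1.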

Running the job produces output like the following:

3> MyData{keyId=0, timestamp=1634698715287, num=1, valueList= [0.8314409887870612]}
3> MyData{keyId=2, timestamp=1634698719302, num=4, valueList= [0.6374174253501083, 0.11700660880722513, 0.3332183994766498, 0.6130357680446138]}
3> MyData{keyId=2, timestamp=1634698723310, num=3, valueList= [0.8791825178724801, 0.17597680203548016, 0.7051747444754559]}
3> MyData{keyId=1, timestamp=1634698724310, num=1, valueList= [0.5467397571984656]}
3> MyData{keyId=0, timestamp=1634698722308, num=1, valueList= [0.12889715087377673]}
3> MyData{keyId=2, timestamp=1634698729327, num=3, valueList= [0.5629496738983792, 0.6251463634655593, 0.8676786682939737]}
3> MyData{keyId=0, timestamp=1634698728324, num=2, valueList= [0.01492708588111824, 0.990722785714783]}
3> MyData{keyId=0, timestamp=1634698733340, num=3, valueList= [0.7331520701949938, 0.5266994346048661, 0.9846741428068255]}
3> MyData{keyId=2, timestamp=1634698734342, num=1, valueList= [0.0830623982249149]}
3> MyData{keyId=1, timestamp=1634698731334, num=1, valueList= [0.012806651575719585]}
3> MyData{keyId=2, timestamp=1634698739353, num=2, valueList= [0.30687115672762866, 0.6895039878550204]}
3> MyData{keyId=1, timestamp=1634698737351, num=1, valueList= [0.3591653475606117]}
3> MyData{keyId=0, timestamp=1634698738351, num=2, valueList= [0.7150310138504744, 0.004485602182885184]}
3> MyData{keyId=0, timestamp=1634698743367, num=3, valueList= [0.3387696535357536, 0.8657458802140383, 0.04494430391472559]}
3> MyData{keyId=1, timestamp=1634698744371, num=2, valueList= [0.9323680992655007, 0.21757041220968598]}
3> MyData{keyId=0, timestamp=1634698748381, num=4, valueList= [0.08278636648764448, 0.6922930069529333, 0.9481847392423067, 0.2112353749298962]}
3> MyData{keyId=2, timestamp=1634698749384, num=1, valueList= [0.3952070466478651]}

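Each output line is the aggregate for one key within one 5-second window, so the roughly five records generated per window are split across the keys that occurred in it. In this run the num values within each window add up to 5. Because the source does not emit at exactly one record per second and the windows follow the processing-time clock, the total per window is not guaranteed to be exactly 5 and may occasionally be 4 or 6, but the overall pattern holds.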
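
The grouping in the output can be checked with a short plain-Java sketch that maps each timestamp to the start of its 5-second window. It assumes windows aligned to the epoch with no offset, and it uses the timestamp field set in the source as an approximation of the processing time at which the window operator saw the record. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WindowAssignSketch {
    // Start of the tumbling window that contains ts, assuming windows are
    // aligned to the epoch with no offset and ts is non-negative.
    static long windowStart(long ts, long sizeMillis) {
        return ts - (ts % sizeMillis);
    }

    public static void main(String[] args) {
        long size = 5000L;
        // Timestamps and num values copied from the first five result lines above.
        long[] ts = {1634698715287L, 1634698719302L, 1634698723310L, 1634698724310L, 1634698722308L};
        int[] num = {1, 4, 3, 1, 1};

        // Sum the per-key counts that fall into the same window.
        Map<Long, Integer> totalPerWindow = new LinkedHashMap<>();
        for (int i = 0; i < ts.length; i++) {
            totalPerWindow.merge(windowStart(ts[i], size), num[i], Integer::sum);
        }

        // prints:
        // [1634698715000, 1634698720000) -> 5
        // [1634698720000, 1634698725000) -> 5
        totalPerWindow.forEach((start, total) ->
                System.out.println("[" + start + ", " + (start + size) + ") -> " + total));
    }
}
```

The first two result lines fall into one window and the next three into the following one, and in both cases the per-key counts add up to the five records generated during those five seconds.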
