第三章:Aggregate

在上一篇文章,讲了spring-data-mongodb常用的增删改查(CRUD)操作,但是平时我们除了这些简单的操作外还需要进行一些复杂统计。本章就介绍mongodb 强大的聚合操作“Aggregate”。

一、Aggregate

MongoDB中聚合(aggregate)主要用于处理数据(诸如统计平均值,求和等),并返回计算后的数据结果。有点类似sql语句中的 count(*)。

mongodb语法

db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
复制代码

mongoTemplate中定义的相关方法:

<O> AggregationResults<O> aggregate(Aggregation aggregation, Class<?> inputType, Class<O> outputType)
<O> AggregationResults<O> aggregate(Aggregation aggregation, String collectionName, Class<O> outputType)
<O> AggregationResults<O> aggregate(TypedAggregation<?> aggregation, Class<O> outputType)
<O> AggregationResults<O> aggregate(TypedAggregation<?> aggregation, String inputCollectionName, Class<O> outputType)
<T> ExecutableAggregationOperation.ExecutableAggregation<T> aggregateAndReturn(Class<T> domainType)
<O> CloseableIterator<O> aggregateStream(Aggregation aggregation, Class<?> inputType, Class<O> outputType)
<O> CloseableIterator<O> aggregateStream(Aggregation aggregation, String collectionName, Class<O> outputType)
<O> CloseableIterator<O> aggregateStream(TypedAggregation<?> aggregation, Class<O> outputType)
<O> CloseableIterator<O> aggregateStream(TypedAggregation<?> aggregation, String inputCollectionName, Class<O> outputType)

二、Example

这部分结合需求来阐述:

  • 比如数据库中存储的数据如下:
  • Q1:"统计各个年级人数" 这个比较简单,我们只需要按照年级分组然后进行sum就能得到结果

    mongo:

    db.pt.aggregate([ { "$group" : { "_id" : "$grade_name" , "总人数" : { "$sum" : 1}}}] )
    复制代码

    java:

    Aggregation aggregation1 = Aggregation.newAggregation(Aggregation.group("grade_name").count().as("总人数"));
    AggregationResults<BasicDBObject> outputTypeCount1 =
            mongoTemplate.aggregate(aggregation1, "pt", BasicDBObject.class);
    for (Iterator<BasicDBObject> iterator = outputTypeCount1.iterator(); iterator.hasNext(); ) {
        DBObject obj = iterator.next();
        System.out.println(JSON.toJSONString(obj));
    复制代码

    结果:

    mongo-vue中:

    console中:

    Q2:"统计某个年级某一项测试在某个范围的人数" 这个也不难,只需要匹配 年级+测试项目+项目分数 between 分数1 and 分数2 然后根据年级分组统计

    mongo:

    db.pt.aggregate(
            [ { "$match" : { "grade_name" : "一年级"}} , { "$unwind" : "$items"} , { "$match" : { "items.item_name" : "BMI" , "items.score" : { "$gt" : 60 , "$lt" : 70}}} , { "$group" : { "_id" : "$grade_name" , "一年级BMI正常人数" : { "$sum" : 1}}}]
    复制代码

    java:

    Aggregation aggregation4 =
            Aggregation.newAggregation(
                    Aggregation.unwind("items"),
                    Aggregation.match(Criteria.where("items.item_name").is("BMI").and("items.score").gt(60).lt(70)),
                    Aggregation.group("grade_name").count().as("BMI正常人数"));
    AggregationResults<BasicDBObject> outputTypeCount4 =
            mongoTemplate.aggregate(aggregation4, "pt", BasicDBObject.class);
    for (Iterator<BasicDBObject> iterator = outputTypeCount4.iterator(); iterator.hasNext(); ) {
        DBObject obj = iterator.next();
        System.out.println(JSON.toJSONString(obj));
    复制代码

    结果:

    mongo-vue中:

    console中:

    Q3:"统计某个年级某一项测试不在某个范围的人数" 这个和Q2的区别在于他是not between and,这我们需要使用or来出来,java中对应的是orOperator;

    mongo:

    db.pt.aggregate(
            [ { "$match" : { "grade_name" : "一年级"}} , { "$unwind" : "$items"} , { "$match" : { "items.item_name" : "BMI" , "$or" : [ { "items.score" : { "$lte" : 60}} , { "items.score" : { "$gte" : 70}}]}} , { "$group" : { "_id" : "$grade_name" , "BMI不正常人数" : { "$sum" : 1}}}]
    复制代码

    java:

    Aggregation aggregation3 =
            Aggregation.newAggregation(
                    Aggregation.match(Criteria.where("grade_name").is("一年级")),
                    Aggregation.unwind("items"),
                    Aggregation.match(Criteria.where("items.item_name").is("BMI").orOperator(
                            Criteria.where("items.score").lte(60),
                            Criteria.where("items.score").gte(70))),
                    Aggregation.group("grade_name").count().as("BMI不正常人数"));
    AggregationResults<BasicDBObject> outputTypeCount3 =
            mongoTemplate.aggregate(aggregation3, "pt", BasicDBObject.class);
    for (Iterator<BasicDBObject> iterator = outputTypeCount3.iterator(); iterator.hasNext(); ) {
        DBObject obj = iterator.next();
        System.out.println(JSON.toJSONString(obj));
    复制代码

    这里有个问题需要注意:当定义多个Criteria (criteria1,criteria2,criteria3,criteria4)然后使用orOperator拼接的时候结果是不正确的,我实验然后错误的写法有两种:

    1. criteria1.orOperator(criteria2) //criteria1 criteria2 是并列条件满足任一即可的【这个写法是铁定错误的!!!】
    2. new Criteria().orOperator(criteria1,criteria2)//criteria1 criteria2 是并列条件满足任一即可的【这个写法百度说是正确写法,不知道为什么结果和上面的是一样的】
    复制代码

    结果:

    mongo-vue中:

    console中:

    mongo:

    db.pt.aggregate(
            [ { "$match" : { "grade_name" : "一年级"}} , { "$unwind" : "$items"} , { "$group" : { "_id" : "$items.item_name" , "平均分" : { "$avg" : "$items.score"} , "最小值" : { "$min" : "$items.score"} , "最大值" : { "$max" : "$items.score"}}}]
    复制代码

    java:

    Aggregation aggregation5 =
            Aggregation.newAggregation(
                    Aggregation.match(Criteria.where("grade_name").is("一年级")),
                    Aggregation.unwind("items"),
                    Aggregation.group("$items.item_name").avg("$items.score").as("平均分").min("$items.score").as
                            ("最小值").max("$items.score").as("最大值"));
    AggregationResults<BasicDBObject> outputTypeCount5 =
            mongoTemplate.aggregate(aggregation5, "pt", BasicDBObject.class);
    for (Iterator<BasicDBObject> iterator = outputTypeCount5.iterator(); iterator.hasNext(); ) {
        DBObject obj = iterator.next();
        System.out.println(JSON.toJSONString(obj));
    复制代码

    使用原生语句的写法:

    //展开数组
    DBObject queryUnwind=new BasicDBObject("$unwind","$items");
    //分组统计
    DBObject groupObject=new BasicDBObject("_id",new BasicDBObject("item_name", "$items.item_name"));
    groupObject.put("min", new BasicDBObject("$min","$items.score"));
    groupObject.put("max", new BasicDBObject("$max","$items.score"));
    groupObject.put("avg", new BasicDBObject("$avg","$items.score"));
    DBObject  queryGroup=new BasicDBObject("$group",groupObject);
    AggregationOutput output=mongoTemplate.getCollection("pt").aggregate(queryUnwind,queryGroup);
    for (Iterator<DBObject> iterator = output.results().iterator(); iterator.hasNext();) {
        DBObject obj =iterator.next();
        System.out.println(obj.toString());
    复制代码

    结果:

    mongo-vue中:

    console中:

    Q5 .“统计各个年级每一个项目得分的最大值,最小值,平均值” 这个问题需要我们按照年级和项目名分组 计算出每一个的最大值,最小值,平均值然后把每个年级的push到一起

    这里我直接使用类似原生语句的写法,不在单独写出mongo语句怎么写。

    java:

    /* 创建 $unwind 操作, 用于切分数组*/
    DBObject unwind = new BasicDBObject("$unwind", "$items");
    /* Group操作*/
    DBObject groupFields = new BasicDBObject("_id",new BasicDBObject("grade_name", "$grade_name").append("item_name", "$items.item_name"));
    groupFields.put("min_score", new BasicDBObject("$min","$items.score"));
    groupFields.put("max_score", new BasicDBObject("$max","$items.score"));
    groupFields.put("avg_score", new BasicDBObject("$avg","$items.score"));
    DBObject group = new BasicDBObject("$group", groupFields);
    /* Reshape Group Result*/
    DBObject projectFields = new BasicDBObject();
    projectFields.put("grade_name", "$_id.grade_name");
    DBObject subProjects=new BasicDBObject("item_name","$_id.item_name");
    subProjects.put("min","$min_score");
    subProjects.put("max","$max_score");
    subProjects.put("avg","$avg_score");
    projectFields.put("item_info",subProjects );
    DBObject project = new BasicDBObject("$project", projectFields);
    /* 将结果push到一起*/
    DBObject groupAgainFields = new BasicDBObject("_id", "$grade_name");
    groupAgainFields.put("item_info", new BasicDBObject("$push", "$item_info"));
    DBObject reshapeGroup = new BasicDBObject("$group", groupAgainFields);
    /* 查看Group结果 */
    AggregationOutput output1 = mongoTemplate.getCollection("pt").aggregate(unwind, group, project, reshapeGroup);
    for (Iterator<DBObject> iterator = output1.results().iterator(); iterator.hasNext();) {
        DBObject obj =iterator.next();
        System.out.println(obj.toString());
    复制代码

    结果: console中: