LCOV - differential code coverage report
Current view: top level - src/backend/executor - nodeAgg.c (source / functions) Coverage Total Hit LBC UIC UBC GBC GIC GNC CBC EUB ECB DUB DCB
Current: Differential Code Coverage HEAD vs 15 Lines: 94.8 % 1507 1428 18 47 14 26 864 50 488 38 900 1 22
Current Date: 2023-04-08 15:15:32 Functions: 98.2 % 57 56 1 56 1 56
Baseline: 15
Baseline Date: 2023-04-08 15:09:40
Legend: Lines: hit not hit

           TLA  Line data    Source code
       1                 : /*-------------------------------------------------------------------------
       2                 :  *
       3                 :  * nodeAgg.c
       4                 :  *    Routines to handle aggregate nodes.
       5                 :  *
       6                 :  *    ExecAgg normally evaluates each aggregate in the following steps:
       7                 :  *
       8                 :  *       transvalue = initcond
       9                 :  *       foreach input_tuple do
      10                 :  *          transvalue = transfunc(transvalue, input_value(s))
      11                 :  *       result = finalfunc(transvalue, direct_argument(s))
      12                 :  *
      13                 :  *    If a finalfunc is not supplied then the result is just the ending
      14                 :  *    value of transvalue.
      15                 :  *
      16                 :  *    Other behaviors can be selected by the "aggsplit" mode, which exists
      17                 :  *    to support partial aggregation.  It is possible to:
      18                 :  *    * Skip running the finalfunc, so that the output is always the
      19                 :  *    final transvalue state.
      20                 :  *    * Substitute the combinefunc for the transfunc, so that transvalue
      21                 :  *    states (propagated up from a child partial-aggregation step) are merged
      22                 :  *    rather than processing raw input rows.  (The statements below about
      23                 :  *    the transfunc apply equally to the combinefunc, when it's selected.)
      24                 :  *    * Apply the serializefunc to the output values (this only makes sense
      25                 :  *    when skipping the finalfunc, since the serializefunc works on the
      26                 :  *    transvalue data type).
      27                 :  *    * Apply the deserializefunc to the input values (this only makes sense
      28                 :  *    when using the combinefunc, for similar reasons).
      29                 :  *    It is the planner's responsibility to connect up Agg nodes using these
      30                 :  *    alternate behaviors in a way that makes sense, with partial aggregation
      31                 :  *    results being fed to nodes that expect them.
      32                 :  *
      33                 :  *    If a normal aggregate call specifies DISTINCT or ORDER BY, we sort the
      34                 :  *    input tuples and eliminate duplicates (if required) before performing
      35                 :  *    the above-depicted process.  (However, we don't do that for ordered-set
      36                 :  *    aggregates; their "ORDER BY" inputs are ordinary aggregate arguments
      37                 :  *    so far as this module is concerned.)  Note that partial aggregation
      38                 :  *    is not supported in these cases, since we couldn't ensure global
      39                 :  *    ordering or distinctness of the inputs.
      40                 :  *
      41                 :  *    If transfunc is marked "strict" in pg_proc and initcond is NULL,
      42                 :  *    then the first non-NULL input_value is assigned directly to transvalue,
      43                 :  *    and transfunc isn't applied until the second non-NULL input_value.
      44                 :  *    The agg's first input type and transtype must be the same in this case!
      45                 :  *
      46                 :  *    If transfunc is marked "strict" then NULL input_values are skipped,
      47                 :  *    keeping the previous transvalue.  If transfunc is not strict then it
      48                 :  *    is called for every input tuple and must deal with NULL initcond
      49                 :  *    or NULL input_values for itself.
      50                 :  *
      51                 :  *    If finalfunc is marked "strict" then it is not called when the
      52                 :  *    ending transvalue is NULL, instead a NULL result is created
      53                 :  *    automatically (this is just the usual handling of strict functions,
      54                 :  *    of course).  A non-strict finalfunc can make its own choice of
      55                 :  *    what to return for a NULL ending transvalue.
      56                 :  *
      57                 :  *    Ordered-set aggregates are treated specially in one other way: we
      58                 :  *    evaluate any "direct" arguments and pass them to the finalfunc along
      59                 :  *    with the transition value.
      60                 :  *
      61                 :  *    A finalfunc can have additional arguments beyond the transvalue and
      62                 :  *    any "direct" arguments, corresponding to the input arguments of the
      63                 :  *    aggregate.  These are always just passed as NULL.  Such arguments may be
      64                 :  *    needed to allow resolution of a polymorphic aggregate's result type.
      65                 :  *
      66                 :  *    We compute aggregate input expressions and run the transition functions
      67                 :  *    in a temporary econtext (aggstate->tmpcontext).  This is reset at least
      68                 :  *    once per input tuple, so when the transvalue datatype is
      69                 :  *    pass-by-reference, we have to be careful to copy it into a longer-lived
      70                 :  *    memory context, and free the prior value to avoid memory leakage.  We
      71                 :  *    store transvalues in another set of econtexts, aggstate->aggcontexts
      72                 :  *    (one per grouping set, see below), which are also used for the hashtable
      73                 :  *    structures in AGG_HASHED mode.  These econtexts are rescanned, not just
      74                 :  *    reset, at group boundaries so that aggregate transition functions can
      75                 :  *    register shutdown callbacks via AggRegisterCallback.
      76                 :  *
      77                 :  *    The node's regular econtext (aggstate->ss.ps.ps_ExprContext) is used to
      78                 :  *    run finalize functions and compute the output tuple; this context can be
      79                 :  *    reset once per output tuple.
      80                 :  *
      81                 :  *    The executor's AggState node is passed as the fmgr "context" value in
      82                 :  *    all transfunc and finalfunc calls.  It is not recommended that the
      83                 :  *    transition functions look at the AggState node directly, but they can
      84                 :  *    use AggCheckCallContext() to verify that they are being called by
      85                 :  *    nodeAgg.c (and not as ordinary SQL functions).  The main reason a
      86                 :  *    transition function might want to know this is so that it can avoid
      87                 :  *    palloc'ing a fixed-size pass-by-ref transition value on every call:
      88                 :  *    it can instead just scribble on and return its left input.  Ordinarily
      89                 :  *    it is completely forbidden for functions to modify pass-by-ref inputs,
      90                 :  *    but in the aggregate case we know the left input is either the initial
      91                 :  *    transition value or a previous function result, and in either case its
      92                 :  *    value need not be preserved.  See int8inc() for an example.  Notice that
      93                 :  *    the EEOP_AGG_PLAIN_TRANS step is coded to avoid a data copy step when
      94                 :  *    the previous transition value pointer is returned.  It is also possible
      95                 :  *    to avoid repeated data copying when the transition value is an expanded
      96                 :  *    object: to do that, the transition function must take care to return
      97                 :  *    an expanded object that is in a child context of the memory context
      98                 :  *    returned by AggCheckCallContext().  Also, some transition functions want
      99                 :  *    to store working state in addition to the nominal transition value; they
     100                 :  *    can use the memory context returned by AggCheckCallContext() to do that.
     101                 :  *
     102                 :  *    Note: AggCheckCallContext() is available as of PostgreSQL 9.0.  The
     103                 :  *    AggState is available as context in earlier releases (back to 8.1),
     104                 :  *    but direct examination of the node is needed to use it before 9.0.
     105                 :  *
     106                 :  *    As of 9.4, aggregate transition functions can also use AggGetAggref()
     107                 :  *    to get hold of the Aggref expression node for their aggregate call.
     108                 :  *    This is mainly intended for ordered-set aggregates, which are not
     109                 :  *    supported as window functions.  (A regular aggregate function would
     110                 :  *    need some fallback logic to use this, since there's no Aggref node
     111                 :  *    for a window function.)
     112                 :  *
     113                 :  *    Grouping sets:
     114                 :  *
     115                 :  *    A list of grouping sets which is structurally equivalent to a ROLLUP
     116                 :  *    clause (e.g. (a,b,c), (a,b), (a)) can be processed in a single pass over
     117                 :  *    ordered data.  We do this by keeping a separate set of transition values
     118                 :  *    for each grouping set being concurrently processed; for each input tuple
     119                 :  *    we update them all, and on group boundaries we reset those states
     120                 :  *    (starting at the front of the list) whose grouping values have changed
     121                 :  *    (the list of grouping sets is ordered from most specific to least
     122                 :  *    specific).
     123                 :  *
     124                 :  *    Where more complex grouping sets are used, we break them down into
     125                 :  *    "phases", where each phase has a different sort order (except phase 0
     126                 :  *    which is reserved for hashing).  During each phase but the last, the
     127                 :  *    input tuples are additionally stored in a tuplesort which is keyed to the
     128                 :  *    next phase's sort order; during each phase but the first, the input
     129                 :  *    tuples are drawn from the previously sorted data.  (The sorting of the
     130                 :  *    data for the first phase is handled by the planner, as it might be
     131                 :  *    satisfied by underlying nodes.)
     132                 :  *
     133                 :  *    Hashing can be mixed with sorted grouping.  To do this, we have an
     134                 :  *    AGG_MIXED strategy that populates the hashtables during the first sorted
     135                 :  *    phase, and switches to reading them out after completing all sort phases.
     136                 :  *    We can also support AGG_HASHED with multiple hash tables and no sorting
     137                 :  *    at all.
     138                 :  *
     139                 :  *    From the perspective of aggregate transition and final functions, the
     140                 :  *    only issue regarding grouping sets is this: a single call site (flinfo)
     141                 :  *    of an aggregate function may be used for updating several different
     142                 :  *    transition values in turn. So the function must not cache in the flinfo
     143                 :  *    anything which logically belongs as part of the transition value (most
     144                 :  *    importantly, the memory context in which the transition value exists).
     145                 :  *    The support API functions (AggCheckCallContext, AggRegisterCallback) are
     146                 :  *    sensitive to the grouping set for which the aggregate function is
     147                 :  *    currently being called.
     148                 :  *
     149                 :  *    Plan structure:
     150                 :  *
     151                 :  *    What we get from the planner is actually one "real" Agg node which is
     152                 :  *    part of the plan tree proper, but which optionally has an additional list
     153                 :  *    of Agg nodes hung off the side via the "chain" field.  This is because an
     154                 :  *    Agg node happens to be a convenient representation of all the data we
     155                 :  *    need for grouping sets.
     156                 :  *
     157                 :  *    For many purposes, we treat the "real" node as if it were just the first
     158                 :  *    node in the chain.  The chain must be ordered such that hashed entries
     159                 :  *    come before sorted/plain entries; the real node is marked AGG_MIXED if
     160                 :  *    there are both types present (in which case the real node describes one
     161                 :  *    of the hashed groupings, other AGG_HASHED nodes may optionally follow in
     162                 :  *    the chain, followed in turn by AGG_SORTED or (one) AGG_PLAIN node).  If
     163                 :  *    the real node is marked AGG_HASHED or AGG_SORTED, then all the chained
     164                 :  *    nodes must be of the same type; if it is AGG_PLAIN, there can be no
     165                 :  *    chained nodes.
     166                 :  *
     167                 :  *    We collect all hashed nodes into a single "phase", numbered 0, and create
     168                 :  *    a sorted phase (numbered 1..n) for each AGG_SORTED or AGG_PLAIN node.
     169                 :  *    Phase 0 is allocated even if there are no hashes, but remains unused in
     170                 :  *    that case.
     171                 :  *
     172                 :  *    AGG_HASHED nodes actually refer to only a single grouping set each,
     173                 :  *    because for each hashed grouping we need a separate grpColIdx and
     174                 :  *    numGroups estimate.  AGG_SORTED nodes represent a "rollup", a list of
     175                 :  *    grouping sets that share a sort order.  Each AGG_SORTED node other than
     176                 :  *    the first one has an associated Sort node which describes the sort order
     177                 :  *    to be used; the first sorted node takes its input from the outer subtree,
     178                 :  *    which the planner has already arranged to provide ordered data.
     179                 :  *
     180                 :  *    Memory and ExprContext usage:
     181                 :  *
     182                 :  *    Because we're accumulating aggregate values across input rows, we need to
     183                 :  *    use more memory contexts than just simple input/output tuple contexts.
     184                 :  *    In fact, for a rollup, we need a separate context for each grouping set
     185                 :  *    so that we can reset the inner (finer-grained) aggregates on their group
     186                 :  *    boundaries while continuing to accumulate values for outer
     187                 :  *    (coarser-grained) groupings.  On top of this, we might be simultaneously
     188                 :  *    populating hashtables; however, we only need one context for all the
     189                 :  *    hashtables.
     190                 :  *
     191                 :  *    So we create an array, aggcontexts, with an ExprContext for each grouping
     192                 :  *    set in the largest rollup that we're going to process, and use the
     193                 :  *    per-tuple memory context of those ExprContexts to store the aggregate
     194                 :  *    transition values.  hashcontext is the single context created to support
     195                 :  *    all hash tables.
     196                 :  *
     197                 :  *    Spilling To Disk
     198                 :  *
     199                 :  *    When performing hash aggregation, if the hash table memory exceeds the
     200                 :  *    limit (see hash_agg_check_limits()), we enter "spill mode". In spill
     201                 :  *    mode, we advance the transition states only for groups already in the
     202                 :  *    hash table. For tuples that would need to create a new hash table
     203                 :  *    entries (and initialize new transition states), we instead spill them to
     204                 :  *    disk to be processed later. The tuples are spilled in a partitioned
     205                 :  *    manner, so that subsequent batches are smaller and less likely to exceed
     206                 :  *    hash_mem (if a batch does exceed hash_mem, it must be spilled
     207                 :  *    recursively).
     208                 :  *
     209                 :  *    Spilled data is written to logical tapes. These provide better control
     210                 :  *    over memory usage, disk space, and the number of files than if we were
     211                 :  *    to use a BufFile for each spill.  We don't know the number of tapes needed
     212                 :  *    at the start of the algorithm (because it can recurse), so a tape set is
     213                 :  *    allocated at the beginning, and individual tapes are created as needed.
     214                 :  *    As a particular tape is read, logtape.c recycles its disk space. When a
     215                 :  *    tape is read to completion, it is destroyed entirely.
     216                 :  *
     217                 :  *    Tapes' buffers can take up substantial memory when many tapes are open at
     218                 :  *    once. We only need one tape open at a time in read mode (using a buffer
     219                 :  *    that's a multiple of BLCKSZ); but we need one tape open in write mode (each
     220                 :  *    requiring a buffer of size BLCKSZ) for each partition.
     221                 :  *
     222                 :  *    Note that it's possible for transition states to start small but then
     223                 :  *    grow very large; for instance in the case of ARRAY_AGG. In such cases,
     224                 :  *    it's still possible to significantly exceed hash_mem. We try to avoid
     225                 :  *    this situation by estimating what will fit in the available memory, and
     226                 :  *    imposing a limit on the number of groups separately from the amount of
     227                 :  *    memory consumed.
     228                 :  *
     229                 :  *    Transition / Combine function invocation:
     230                 :  *
     231                 :  *    For performance reasons transition functions, including combine
     232                 :  *    functions, aren't invoked one-by-one from nodeAgg.c after computing
     233                 :  *    arguments using the expression evaluation engine. Instead
     234                 :  *    ExecBuildAggTrans() builds one large expression that does both argument
     235                 :  *    evaluation and transition function invocation. That avoids performance
     236                 :  *    issues due to repeated uses of expression evaluation, complications due
     237                 :  *    to filter expressions having to be evaluated early, and allows to JIT
     238                 :  *    the entire expression into one native function.
     239                 :  *
     240                 :  * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
     241                 :  * Portions Copyright (c) 1994, Regents of the University of California
     242                 :  *
     243                 :  * IDENTIFICATION
     244                 :  *    src/backend/executor/nodeAgg.c
     245                 :  *
     246                 :  *-------------------------------------------------------------------------
     247                 :  */
     248                 : 
     249                 : #include "postgres.h"
     250                 : 
     251                 : #include "access/htup_details.h"
     252                 : #include "access/parallel.h"
     253                 : #include "catalog/objectaccess.h"
     254                 : #include "catalog/pg_aggregate.h"
     255                 : #include "catalog/pg_proc.h"
     256                 : #include "catalog/pg_type.h"
     257                 : #include "common/hashfn.h"
     258                 : #include "executor/execExpr.h"
     259                 : #include "executor/executor.h"
     260                 : #include "executor/nodeAgg.h"
     261                 : #include "lib/hyperloglog.h"
     262                 : #include "miscadmin.h"
     263                 : #include "nodes/makefuncs.h"
     264                 : #include "nodes/nodeFuncs.h"
     265                 : #include "optimizer/optimizer.h"
     266                 : #include "parser/parse_agg.h"
     267                 : #include "parser/parse_coerce.h"
     268                 : #include "utils/acl.h"
     269                 : #include "utils/builtins.h"
     270                 : #include "utils/datum.h"
     271                 : #include "utils/dynahash.h"
     272                 : #include "utils/expandeddatum.h"
     273                 : #include "utils/logtape.h"
     274                 : #include "utils/lsyscache.h"
     275                 : #include "utils/memutils.h"
     276                 : #include "utils/syscache.h"
     277                 : #include "utils/tuplesort.h"
     278                 : 
     279                 : /*
     280                 :  * Control how many partitions are created when spilling HashAgg to
     281                 :  * disk.
     282                 :  *
     283                 :  * HASHAGG_PARTITION_FACTOR is multiplied by the estimated number of
     284                 :  * partitions needed such that each partition will fit in memory. The factor
     285                 :  * is set higher than one because there's not a high cost to having a few too
     286                 :  * many partitions, and it makes it less likely that a partition will need to
     287                 :  * be spilled recursively. Another benefit of having more, smaller partitions
     288                 :  * is that small hash tables may perform better than large ones due to memory
     289                 :  * caching effects.
     290                 :  *
     291                 :  * We also specify a min and max number of partitions per spill. Too few might
     292                 :  * mean a lot of wasted I/O from repeated spilling of the same tuples. Too
     293                 :  * many will result in lots of memory wasted buffering the spill files (which
     294                 :  * could instead be spent on a larger hash table).
     295                 :  */
     296                 : #define HASHAGG_PARTITION_FACTOR 1.50
     297                 : #define HASHAGG_MIN_PARTITIONS 4
     298                 : #define HASHAGG_MAX_PARTITIONS 1024
     299                 : 
     300                 : /*
     301                 :  * For reading from tapes, the buffer size must be a multiple of
     302                 :  * BLCKSZ. Larger values help when reading from multiple tapes concurrently,
     303                 :  * but that doesn't happen in HashAgg, so we simply use BLCKSZ. Writing to a
     304                 :  * tape always uses a buffer of size BLCKSZ.
     305                 :  */
     306                 : #define HASHAGG_READ_BUFFER_SIZE BLCKSZ
     307                 : #define HASHAGG_WRITE_BUFFER_SIZE BLCKSZ
     308                 : 
     309                 : /*
     310                 :  * HyperLogLog is used for estimating the cardinality of the spilled tuples in
     311                 :  * a given partition. 5 bits corresponds to a size of about 32 bytes and a
     312                 :  * worst-case error of around 18%. That's effective enough to choose a
     313                 :  * reasonable number of partitions when recursing.
     314                 :  */
     315                 : #define HASHAGG_HLL_BIT_WIDTH 5
     316                 : 
     317                 : /*
     318                 :  * Estimate chunk overhead as a constant 16 bytes. XXX: should this be
     319                 :  * improved?
     320                 :  */
     321                 : #define CHUNKHDRSZ 16
     322                 : 
     323                 : /*
     324                 :  * Represents partitioned spill data for a single hashtable. Contains the
     325                 :  * necessary information to route tuples to the correct partition, and to
     326                 :  * transform the spilled data into new batches.
     327                 :  *
     328                 :  * The high bits are used for partition selection (when recursing, we ignore
     329                 :  * the bits that have already been used for partition selection at an earlier
     330                 :  * level).
     331                 :  */
     332                 : typedef struct HashAggSpill
     333                 : {
     334                 :     int         npartitions;    /* number of partitions */
     335                 :     LogicalTape **partitions;   /* spill partition tapes */
     336                 :     int64      *ntuples;        /* number of tuples in each partition */
     337                 :     uint32      mask;           /* mask to find partition from hash value */
     338                 :     int         shift;          /* after masking, shift by this amount */
     339                 :     hyperLogLogState *hll_card; /* cardinality estimate for contents */
     340                 : } HashAggSpill;
     341                 : 
     342                 : /*
     343                 :  * Represents work to be done for one pass of hash aggregation (with only one
     344                 :  * grouping set).
     345                 :  *
     346                 :  * Also tracks the bits of the hash already used for partition selection by
     347                 :  * earlier iterations, so that this batch can use new bits. If all bits have
     348                 :  * already been used, no partitioning will be done (any spilled data will go
     349                 :  * to a single output tape).
     350                 :  */
     351                 : typedef struct HashAggBatch
     352                 : {
     353                 :     int         setno;          /* grouping set */
     354                 :     int         used_bits;      /* number of bits of hash already used */
     355                 :     LogicalTape *input_tape;    /* input partition tape */
     356                 :     int64       input_tuples;   /* number of tuples in this batch */
     357                 :     double      input_card;     /* estimated group cardinality */
     358                 : } HashAggBatch;
     359                 : 
     360                 : /* used to find referenced colnos */
     361                 : typedef struct FindColsContext
     362                 : {
     363                 :     bool        is_aggref;      /* is under an aggref */
     364                 :     Bitmapset  *aggregated;     /* column references under an aggref */
     365                 :     Bitmapset  *unaggregated;   /* other column references */
     366                 : } FindColsContext;
     367                 : 
     368                 : static void select_current_set(AggState *aggstate, int setno, bool is_hash);
     369                 : static void initialize_phase(AggState *aggstate, int newphase);
     370                 : static TupleTableSlot *fetch_input_tuple(AggState *aggstate);
     371                 : static void initialize_aggregates(AggState *aggstate,
     372                 :                                   AggStatePerGroup *pergroups,
     373                 :                                   int numReset);
     374                 : static void advance_transition_function(AggState *aggstate,
     375                 :                                         AggStatePerTrans pertrans,
     376                 :                                         AggStatePerGroup pergroupstate);
     377                 : static void advance_aggregates(AggState *aggstate);
     378                 : static void process_ordered_aggregate_single(AggState *aggstate,
     379                 :                                              AggStatePerTrans pertrans,
     380                 :                                              AggStatePerGroup pergroupstate);
     381                 : static void process_ordered_aggregate_multi(AggState *aggstate,
     382                 :                                             AggStatePerTrans pertrans,
     383                 :                                             AggStatePerGroup pergroupstate);
     384                 : static void finalize_aggregate(AggState *aggstate,
     385                 :                                AggStatePerAgg peragg,
     386                 :                                AggStatePerGroup pergroupstate,
     387                 :                                Datum *resultVal, bool *resultIsNull);
     388                 : static void finalize_partialaggregate(AggState *aggstate,
     389                 :                                       AggStatePerAgg peragg,
     390                 :                                       AggStatePerGroup pergroupstate,
     391                 :                                       Datum *resultVal, bool *resultIsNull);
     392                 : static inline void prepare_hash_slot(AggStatePerHash perhash,
     393                 :                                      TupleTableSlot *inputslot,
     394                 :                                      TupleTableSlot *hashslot);
     395                 : static void prepare_projection_slot(AggState *aggstate,
     396                 :                                     TupleTableSlot *slot,
     397                 :                                     int currentSet);
     398                 : static void finalize_aggregates(AggState *aggstate,
     399                 :                                 AggStatePerAgg peraggs,
     400                 :                                 AggStatePerGroup pergroup);
     401                 : static TupleTableSlot *project_aggregates(AggState *aggstate);
     402                 : static void find_cols(AggState *aggstate, Bitmapset **aggregated,
     403                 :                       Bitmapset **unaggregated);
     404                 : static bool find_cols_walker(Node *node, FindColsContext *context);
     405                 : static void build_hash_tables(AggState *aggstate);
     406                 : static void build_hash_table(AggState *aggstate, int setno, long nbuckets);
     407                 : static void hashagg_recompile_expressions(AggState *aggstate, bool minslot,
     408                 :                                           bool nullcheck);
     409                 : static long hash_choose_num_buckets(double hashentrysize,
     410                 :                                     long ngroups, Size memory);
     411                 : static int  hash_choose_num_partitions(double input_groups,
     412                 :                                        double hashentrysize,
     413                 :                                        int used_bits,
     414                 :                                        int *log2_npartitions);
     415                 : static void initialize_hash_entry(AggState *aggstate,
     416                 :                                   TupleHashTable hashtable,
     417                 :                                   TupleHashEntry entry);
     418                 : static void lookup_hash_entries(AggState *aggstate);
     419                 : static TupleTableSlot *agg_retrieve_direct(AggState *aggstate);
     420                 : static void agg_fill_hash_table(AggState *aggstate);
     421                 : static bool agg_refill_hash_table(AggState *aggstate);
     422                 : static TupleTableSlot *agg_retrieve_hash_table(AggState *aggstate);
     423                 : static TupleTableSlot *agg_retrieve_hash_table_in_memory(AggState *aggstate);
     424                 : static void hash_agg_check_limits(AggState *aggstate);
     425                 : static void hash_agg_enter_spill_mode(AggState *aggstate);
     426                 : static void hash_agg_update_metrics(AggState *aggstate, bool from_tape,
     427                 :                                     int npartitions);
     428                 : static void hashagg_finish_initial_spills(AggState *aggstate);
     429                 : static void hashagg_reset_spill_state(AggState *aggstate);
     430                 : static HashAggBatch *hashagg_batch_new(LogicalTape *input_tape, int setno,
     431                 :                                        int64 input_tuples, double input_card,
     432                 :                                        int used_bits);
     433                 : static MinimalTuple hashagg_batch_read(HashAggBatch *batch, uint32 *hashp);
     434                 : static void hashagg_spill_init(HashAggSpill *spill, LogicalTapeSet *tapeset,
     435                 :                                int used_bits, double input_groups,
     436                 :                                double hashentrysize);
     437                 : static Size hashagg_spill_tuple(AggState *aggstate, HashAggSpill *spill,
     438                 :                                 TupleTableSlot *inputslot, uint32 hash);
     439                 : static void hashagg_spill_finish(AggState *aggstate, HashAggSpill *spill,
     440                 :                                  int setno);
     441                 : static Datum GetAggInitVal(Datum textInitVal, Oid transtype);
     442                 : static void build_pertrans_for_aggref(AggStatePerTrans pertrans,
     443                 :                                       AggState *aggstate, EState *estate,
     444                 :                                       Aggref *aggref, Oid transfn_oid,
     445                 :                                       Oid aggtranstype, Oid aggserialfn,
     446                 :                                       Oid aggdeserialfn, Datum initValue,
     447                 :                                       bool initValueIsNull, Oid *inputTypes,
     448                 :                                       int numArguments);
     449                 : 
     450                 : 
     451                 : /*
     452                 :  * Select the current grouping set; affects current_set and
     453                 :  * curaggcontext.
     454                 :  */
     455 ECB             : static void
     456 GIC     3257730 : select_current_set(AggState *aggstate, int setno, bool is_hash)
     457                 : {
     458                 :     /*
     459                 :      * When changing this, also adapt ExecAggPlainTransByVal() and
     460                 :      * ExecAggPlainTransByRef().
     461 ECB             :      */
     462 CBC     3257730 :     if (is_hash)
     463 GIC     2884038 :         aggstate->curaggcontext = aggstate->hashcontext;
     464 ECB             :     else
     465 GIC      373692 :         aggstate->curaggcontext = aggstate->aggcontexts[setno];
     466 ECB             : 
     467 CBC     3257730 :     aggstate->current_set = setno;
     468 GIC     3257730 : }
     469                 : 
     470                 : /*
     471                 :  * Switch to phase "newphase", which must either be 0 or 1 (to reset) or
     472                 :  * current_phase + 1. Juggle the tuplesorts accordingly.
     473                 :  *
     474                 :  * Phase 0 is for hashing, which we currently handle last in the AGG_MIXED
     475                 :  * case, so when entering phase 0, all we need to do is drop open sorts.
     476                 :  */
     477 ECB             : static void
     478 GIC       65265 : initialize_phase(AggState *aggstate, int newphase)
     479 ECB             : {
     480 GIC       65265 :     Assert(newphase <= 1 || newphase == aggstate->current_phase + 1);
     481                 : 
     482                 :     /*
     483                 :      * Whatever the previous state, we're now done with whatever input
     484                 :      * tuplesort was in use.
     485 ECB             :      */
     486 GIC       65265 :     if (aggstate->sort_in)
     487 ECB             :     {
     488 CBC          21 :         tuplesort_end(aggstate->sort_in);
     489 GIC          21 :         aggstate->sort_in = NULL;
     490                 :     }
     491 ECB             : 
     492 GIC       65265 :     if (newphase <= 1)
     493                 :     {
     494                 :         /*
     495                 :          * Discard any existing output tuplesort.
     496 ECB             :          */
     497 GIC       65172 :         if (aggstate->sort_out)
     498 ECB             :         {
     499 CBC           3 :             tuplesort_end(aggstate->sort_out);
     500 GIC           3 :             aggstate->sort_out = NULL;
     501                 :         }
     502                 :     }
     503                 :     else
     504                 :     {
     505                 :         /*
     506                 :          * The old output tuplesort becomes the new input one, and this is the
     507                 :          * right time to actually sort it.
     508 ECB             :          */
     509 CBC          93 :         aggstate->sort_in = aggstate->sort_out;
     510              93 :         aggstate->sort_out = NULL;
     511              93 :         Assert(aggstate->sort_in);
     512 GIC          93 :         tuplesort_performsort(aggstate->sort_in);
     513                 :     }
     514                 : 
     515                 :     /*
     516                 :      * If this isn't the last phase, we need to sort appropriately for the
     517                 :      * next phase in sequence.
     518 ECB             :      */
     519 GIC       65265 :     if (newphase > 0 && newphase < aggstate->numphases - 1)
     520 ECB             :     {
     521 CBC         117 :         Sort       *sortnode = aggstate->phases[newphase + 1].sortnode;
     522             117 :         PlanState  *outerNode = outerPlanState(aggstate);
     523 GIC         117 :         TupleDesc   tupDesc = ExecGetResultType(outerNode);
     524 ECB             : 
     525 GIC         117 :         aggstate->sort_out = tuplesort_begin_heap(tupDesc,
     526                 :                                                   sortnode->numCols,
     527                 :                                                   sortnode->sortColIdx,
     528                 :                                                   sortnode->sortOperators,
     529                 :                                                   sortnode->collations,
     530                 :                                                   sortnode->nullsFirst,
     531                 :                                                   work_mem,
     532                 :                                                   NULL, TUPLESORT_NONE);
     533                 :     }
     534 ECB             : 
     535 CBC       65265 :     aggstate->current_phase = newphase;
     536           65265 :     aggstate->phase = &aggstate->phases[newphase];
     537 GIC       65265 : }
     538                 : 
     539                 : /*
     540                 :  * Fetch a tuple from either the outer plan (for phase 1) or from the sorter
     541                 :  * populated by the previous phase.  Copy it to the sorter for the next phase
     542                 :  * if any.
     543                 :  *
     544                 :  * Callers cannot rely on memory for tuple in returned slot remaining valid
     545                 :  * past any subsequently fetched tuple.
     546                 :  */
     547 ECB             : static TupleTableSlot *
     548 GIC    12654740 : fetch_input_tuple(AggState *aggstate)
     549                 : {
     550                 :     TupleTableSlot *slot;
     551 ECB             : 
     552 GIC    12654740 :     if (aggstate->sort_in)
     553                 :     {
     554 ECB             :         /* make sure we check for interrupts in either path through here */
     555 CBC       87441 :         CHECK_FOR_INTERRUPTS();
     556 GIC       87441 :         if (!tuplesort_gettupleslot(aggstate->sort_in, true, false,
     557 ECB             :                                     aggstate->sort_slot, NULL))
     558 CBC          93 :             return NULL;
     559 GIC       87348 :         slot = aggstate->sort_slot;
     560                 :     }
     561 ECB             :     else
     562 GIC    12567299 :         slot = ExecProcNode(outerPlanState(aggstate));
     563 ECB             : 
     564 CBC    12654638 :     if (!TupIsNull(slot) && aggstate->sort_out)
     565 GIC       87348 :         tuplesort_puttupleslot(aggstate->sort_out, slot);
     566 ECB             : 
     567 GIC    12654638 :     return slot;
     568                 : }
     569                 : 
     570                 : /*
     571                 :  * (Re)Initialize an individual aggregate.
     572                 :  *
     573                 :  * This function handles only one grouping set, already set in
     574                 :  * aggstate->current_set.
     575                 :  *
     576                 :  * When called, CurrentMemoryContext should be the per-query context.
     577                 :  */
     578 ECB             : static void
     579 GIC      530517 : initialize_aggregate(AggState *aggstate, AggStatePerTrans pertrans,
     580                 :                      AggStatePerGroup pergroupstate)
     581                 : {
     582                 :     /*
     583                 :      * Start a fresh sort operation for each DISTINCT/ORDER BY aggregate.
     584 ECB             :      */
     585 GNC      530517 :     if (pertrans->aggsortrequired)
     586                 :     {
     587                 :         /*
     588                 :          * In case of rescan, maybe there could be an uncompleted sort
     589                 :          * operation?  Clean it up if so.
     590 ECB             :          */
     591 GBC       26811 :         if (pertrans->sortstates[aggstate->current_set])
     592 UIC           0 :             tuplesort_end(pertrans->sortstates[aggstate->current_set]);
     593                 : 
     594                 : 
     595                 :         /*
     596                 :          * We use a plain Datum sorter when there's a single input column;
     597                 :          * otherwise sort the full tuple.  (See comments for
     598                 :          * process_ordered_aggregate_single.)
     599 ECB             :          */
     600 GIC       26811 :         if (pertrans->numInputs == 1)
     601 ECB             :         {
     602 GIC       26775 :             Form_pg_attribute attr = TupleDescAttr(pertrans->sortdesc, 0);
     603 ECB             : 
     604 CBC       26775 :             pertrans->sortstates[aggstate->current_set] =
     605           26775 :                 tuplesort_begin_datum(attr->atttypid,
     606           26775 :                                       pertrans->sortOperators[0],
     607           26775 :                                       pertrans->sortCollations[0],
     608 GIC       26775 :                                       pertrans->sortNullsFirst[0],
     609                 :                                       work_mem, NULL, TUPLESORT_NONE);
     610                 :         }
     611 ECB             :         else
     612 CBC          36 :             pertrans->sortstates[aggstate->current_set] =
     613 GIC          36 :                 tuplesort_begin_heap(pertrans->sortdesc,
     614                 :                                      pertrans->numSortCols,
     615                 :                                      pertrans->sortColIdx,
     616                 :                                      pertrans->sortOperators,
     617                 :                                      pertrans->sortCollations,
     618                 :                                      pertrans->sortNullsFirst,
     619                 :                                      work_mem, NULL, TUPLESORT_NONE);
     620                 :     }
     621                 : 
     622                 :     /*
     623                 :      * (Re)set transValue to the initial value.
     624                 :      *
     625                 :      * Note that when the initial value is pass-by-ref, we must copy it (into
     626                 :      * the aggcontext) since we will pfree the transValue later.
     627 ECB             :      */
     628 CBC      530517 :     if (pertrans->initValueIsNull)
     629 GIC      274940 :         pergroupstate->transValue = pertrans->initValue;
     630                 :     else
     631                 :     {
     632                 :         MemoryContext oldContext;
     633 ECB             : 
     634 CBC      255577 :         oldContext = MemoryContextSwitchTo(aggstate->curaggcontext->ecxt_per_tuple_memory);
     635          511154 :         pergroupstate->transValue = datumCopy(pertrans->initValue,
     636          255577 :                                               pertrans->transtypeByVal,
     637          255577 :                                               pertrans->transtypeLen);
     638 GIC      255577 :         MemoryContextSwitchTo(oldContext);
     639 ECB             :     }
     640 GIC      530517 :     pergroupstate->transValueIsNull = pertrans->initValueIsNull;
     641                 : 
     642                 :     /*
     643                 :      * If the initial value for the transition state doesn't exist in the
     644                 :      * pg_aggregate table then we will let the first non-NULL value returned
     645                 :      * from the outer procNode become the initial value. (This is useful for
     646                 :      * aggregates like max() and min().) The noTransValue flag signals that we
     647                 :      * still need to do this.
     648 ECB             :      */
     649 CBC      530517 :     pergroupstate->noTransValue = pertrans->initValueIsNull;
     650 GIC      530517 : }
     651                 : 
     652                 : /*
     653                 :  * Initialize all aggregate transition states for a new group of input values.
     654                 :  *
     655                 :  * If there are multiple grouping sets, we initialize only the first numReset
     656                 :  * of them (the grouping sets are ordered so that the most specific one, which
     657                 :  * is reset most often, is first). As a convenience, if numReset is 0, we
     658                 :  * reinitialize all sets.
     659                 :  *
     660                 :  * NB: This cannot be used for hash aggregates, as for those the grouping set
     661                 :  * number has to be specified from further up.
     662                 :  *
     663                 :  * When called, CurrentMemoryContext should be the per-query context.
     664                 :  */
     665 ECB             : static void
     666 GIC      171113 : initialize_aggregates(AggState *aggstate,
     667                 :                       AggStatePerGroup *pergroups,
     668                 :                       int numReset)
     669                 : {
     670 ECB             :     int         transno;
     671 CBC      171113 :     int         numGroupingSets = Max(aggstate->phase->numsets, 1);
     672          171113 :     int         setno = 0;
     673          171113 :     int         numTrans = aggstate->numtrans;
     674 GIC      171113 :     AggStatePerTrans transstates = aggstate->pertrans;
     675 ECB             : 
     676 GBC      171113 :     if (numReset == 0)
     677 UIC           0 :         numReset = numGroupingSets;
     678 ECB             : 
     679 GIC      349304 :     for (setno = 0; setno < numReset; setno++)
     680 ECB             :     {
     681 GIC      178191 :         AggStatePerGroup pergroup = pergroups[setno];
     682 ECB             : 
     683 GIC      178191 :         select_current_set(aggstate, setno, false);
     684 ECB             : 
     685 GIC      518715 :         for (transno = 0; transno < numTrans; transno++)
     686 ECB             :         {
     687 CBC      340524 :             AggStatePerTrans pertrans = &transstates[transno];
     688 GIC      340524 :             AggStatePerGroup pergroupstate = &pergroup[transno];
     689 ECB             : 
     690 GIC      340524 :             initialize_aggregate(aggstate, pertrans, pergroupstate);
     691                 :         }
     692 ECB             :     }
     693 GIC      171113 : }
     694                 : 
     695                 : /*
     696                 :  * Given new input value(s), advance the transition function of one aggregate
     697                 :  * state within one grouping set only (already set in aggstate->current_set)
     698                 :  *
     699                 :  * The new values (and null flags) have been preloaded into argument positions
     700                 :  * 1 and up in pertrans->transfn_fcinfo, so that we needn't copy them again to
     701                 :  * pass to the transition function.  We also expect that the static fields of
     702                 :  * the fcinfo are already initialized; that was done by ExecInitAgg().
     703                 :  *
     704                 :  * It doesn't matter which memory context this is called in.
     705                 :  */
     706 ECB             : static void
     707 GIC      352944 : advance_transition_function(AggState *aggstate,
     708                 :                             AggStatePerTrans pertrans,
     709                 :                             AggStatePerGroup pergroupstate)
     710 ECB             : {
     711 GIC      352944 :     FunctionCallInfo fcinfo = pertrans->transfn_fcinfo;
     712                 :     MemoryContext oldContext;
     713                 :     Datum       newVal;
     714 ECB             : 
     715 GIC      352944 :     if (pertrans->transfn.fn_strict)
     716                 :     {
     717                 :         /*
     718                 :          * For a strict transfn, nothing happens when there's a NULL input; we
     719                 :          * just keep the prior transValue.
     720 ECB             :          */
     721 GIC      112500 :         int         numTransInputs = pertrans->numTransInputs;
     722                 :         int         i;
     723 ECB             : 
     724 GIC      225000 :         for (i = 1; i <= numTransInputs; i++)
     725 ECB             :         {
     726 GBC      112500 :             if (fcinfo->args[i].isnull)
     727 UIC           0 :                 return;
     728 ECB             :         }
     729 GIC      112500 :         if (pergroupstate->noTransValue)
     730                 :         {
     731                 :             /*
     732                 :              * transValue has not been initialized. This is the first non-NULL
     733                 :              * input value. We use it as the initial value for transValue. (We
     734                 :              * already checked that the agg's input type is binary-compatible
     735                 :              * with its transtype, so straight copy here is OK.)
     736                 :              *
     737                 :              * We must copy the datum into aggcontext if it is pass-by-ref. We
     738                 :              * do not need to pfree the old transValue, since it's NULL.
     739 EUB             :              */
     740 UBC           0 :             oldContext = MemoryContextSwitchTo(aggstate->curaggcontext->ecxt_per_tuple_memory);
     741               0 :             pergroupstate->transValue = datumCopy(fcinfo->args[1].value,
     742               0 :                                                   pertrans->transtypeByVal,
     743               0 :                                                   pertrans->transtypeLen);
     744               0 :             pergroupstate->transValueIsNull = false;
     745               0 :             pergroupstate->noTransValue = false;
     746               0 :             MemoryContextSwitchTo(oldContext);
     747 UIC           0 :             return;
     748 ECB             :         }
     749 GIC      112500 :         if (pergroupstate->transValueIsNull)
     750                 :         {
     751                 :             /*
     752                 :              * Don't call a strict function with NULL inputs.  Note it is
     753                 :              * possible to get here despite the above tests, if the transfn is
     754                 :              * strict *and* returned a NULL on a prior cycle. If that happens
     755                 :              * we will propagate the NULL all the way to the end.
     756 EUB             :              */
     757 UIC           0 :             return;
     758                 :         }
     759                 :     }
     760                 : 
     761 ECB             :     /* We run the transition functions in per-input-tuple memory context */
     762 GIC      352944 :     oldContext = MemoryContextSwitchTo(aggstate->tmpcontext->ecxt_per_tuple_memory);
     763                 : 
     764 ECB             :     /* set up aggstate->curpertrans for AggGetAggref() */
     765 GIC      352944 :     aggstate->curpertrans = pertrans;
     766                 : 
     767                 :     /*
     768                 :      * OK to call the transition function
     769 ECB             :      */
     770 CBC      352944 :     fcinfo->args[0].value = pergroupstate->transValue;
     771          352944 :     fcinfo->args[0].isnull = pergroupstate->transValueIsNull;
     772 GIC      352944 :     fcinfo->isnull = false;      /* just in case transfn doesn't set it */
     773 ECB             : 
     774 GIC      352944 :     newVal = FunctionCallInvoke(fcinfo);
     775 ECB             : 
     776 GIC      352944 :     aggstate->curpertrans = NULL;
     777                 : 
     778                 :     /*
     779                 :      * If pass-by-ref datatype, must copy the new value into aggcontext and
     780                 :      * free the prior transValue.  But if transfn returned a pointer to its
     781                 :      * first input, we don't need to do anything.  Also, if transfn returned a
     782                 :      * pointer to a R/W expanded object that is already a child of the
     783                 :      * aggcontext, assume we can adopt that value without copying it.
     784                 :      *
     785                 :      * It's safe to compare newVal with pergroup->transValue without regard
     786                 :      * for either being NULL, because ExecAggTransReparent() takes care to set
     787                 :      * transValue to 0 when NULL. Otherwise we could end up accidentally not
     788                 :      * reparenting, when the transValue has the same numerical value as
     789                 :      * newValue, despite being NULL.  This is a somewhat hot path, making it
     790                 :      * undesirable to instead solve this with another branch for the common
     791                 :      * case of the transition function returning its (modified) input
     792                 :      * argument.
     793 ECB             :      */
     794 GBC      352944 :     if (!pertrans->transtypeByVal &&
     795 UBC           0 :         DatumGetPointer(newVal) != DatumGetPointer(pergroupstate->transValue))
     796               0 :         newVal = ExecAggTransReparent(aggstate, pertrans,
     797 UIC           0 :                                       newVal, fcinfo->isnull,
     798 EUB             :                                       pergroupstate->transValue,
     799 UIC           0 :                                       pergroupstate->transValueIsNull);
     800 ECB             : 
     801 CBC      352944 :     pergroupstate->transValue = newVal;
     802 GIC      352944 :     pergroupstate->transValueIsNull = fcinfo->isnull;
     803 ECB             : 
     804 GIC      352944 :     MemoryContextSwitchTo(oldContext);
     805                 : }
     806                 : 
     807                 : /*
     808                 :  * Advance each aggregate transition state for one input tuple.  The input
     809                 :  * tuple has been stored in tmpcontext->ecxt_outertuple, so that it is
     810                 :  * accessible to ExecEvalExpr.
     811                 :  *
     812                 :  * We have two sets of transition states to handle: one for sorted aggregation
     813                 :  * and one for hashed; we do them both here, to avoid multiple evaluation of
     814                 :  * the inputs.
     815                 :  *
     816                 :  * When called, CurrentMemoryContext should be the per-query context.
     817                 :  */
     818 ECB             : static void
     819 GIC    12674216 : advance_aggregates(AggState *aggstate)
     820                 : {
     821                 :     bool        dummynull;
     822 ECB             : 
     823 GIC    12674216 :     ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
     824                 :                               aggstate->tmpcontext,
     825 ECB             :                               &dummynull);
     826 GIC    12674183 : }
     827                 : 
     828                 : /*
     829                 :  * Run the transition function for a DISTINCT or ORDER BY aggregate
     830                 :  * with only one input.  This is called after we have completed
     831                 :  * entering all the input values into the sort object.  We complete the
     832                 :  * sort, read out the values in sorted order, and run the transition
     833                 :  * function on each value (applying DISTINCT if appropriate).
     834                 :  *
     835                 :  * Note that the strictness of the transition function was checked when
     836                 :  * entering the values into the sort, so we don't check it again here;
     837                 :  * we just apply standard SQL DISTINCT logic.
     838                 :  *
     839                 :  * The one-input case is handled separately from the multi-input case
     840                 :  * for performance reasons: for single by-value inputs, such as the
     841                 :  * common case of count(distinct id), the tuplesort_getdatum code path
     842                 :  * is around 300% faster.  (The speedup for by-reference types is less
     843                 :  * but still noticeable.)
     844                 :  *
     845                 :  * This function handles only one grouping set (already set in
     846                 :  * aggstate->current_set).
     847                 :  *
     848                 :  * When called, CurrentMemoryContext should be the per-query context.
     849                 :  */
     850 ECB             : static void
     851 GIC       26775 : process_ordered_aggregate_single(AggState *aggstate,
     852                 :                                  AggStatePerTrans pertrans,
     853                 :                                  AggStatePerGroup pergroupstate)
     854 ECB             : {
     855 CBC       26775 :     Datum       oldVal = (Datum) 0;
     856           26775 :     bool        oldIsNull = true;
     857           26775 :     bool        haveOldVal = false;
     858 GIC       26775 :     MemoryContext workcontext = aggstate->tmpcontext->ecxt_per_tuple_memory;
     859 ECB             :     MemoryContext oldContext;
     860 CBC       26775 :     bool        isDistinct = (pertrans->numDistinctCols > 0);
     861           26775 :     Datum       newAbbrevVal = (Datum) 0;
     862           26775 :     Datum       oldAbbrevVal = (Datum) 0;
     863 GIC       26775 :     FunctionCallInfo fcinfo = pertrans->transfn_fcinfo;
     864                 :     Datum      *newVal;
     865                 :     bool       *isNull;
     866 ECB             : 
     867 GIC       26775 :     Assert(pertrans->numDistinctCols < 2);
     868 ECB             : 
     869 GIC       26775 :     tuplesort_performsort(pertrans->sortstates[aggstate->current_set]);
     870                 : 
     871 ECB             :     /* Load the column into argument 1 (arg 0 will be transition value) */
     872 CBC       26775 :     newVal = &fcinfo->args[1].value;
     873 GIC       26775 :     isNull = &fcinfo->args[1].isnull;
     874                 : 
     875                 :     /*
     876                 :      * Note: if input type is pass-by-ref, the datums returned by the sort are
     877                 :      * freshly palloc'd in the per-query context, so we must be careful to
     878                 :      * pfree them when they are no longer needed.
     879                 :      */
     880 ECB             : 
     881 GIC      438861 :     while (tuplesort_getdatum(pertrans->sortstates[aggstate->current_set],
     882                 :                               true, false, newVal, isNull, &newAbbrevVal))
     883                 :     {
     884                 :         /*
     885                 :          * Clear and select the working context for evaluation of the equality
     886                 :          * function and transition function.
     887 ECB             :          */
     888 CBC      412086 :         MemoryContextReset(workcontext);
     889 GIC      412086 :         oldContext = MemoryContextSwitchTo(workcontext);
     890                 : 
     891                 :         /*
     892                 :          * If DISTINCT mode, and not distinct from prior, skip it.
     893 ECB             :          */
     894 CBC      412086 :         if (isDistinct &&
     895 GBC      145158 :             haveOldVal &&
     896 LBC           0 :             ((oldIsNull && *isNull) ||
     897 CBC      145158 :              (!oldIsNull && !*isNull &&
     898          279480 :               oldAbbrevVal == newAbbrevVal &&
     899 GIC      134322 :               DatumGetBool(FunctionCall2Coll(&pertrans->equalfnOne,
     900                 :                                              pertrans->aggCollation,
     901                 :                                              oldVal, *newVal)))))
     902 ECB             :         {
     903 GNC       59232 :             MemoryContextSwitchTo(oldContext);
     904           59232 :             continue;
     905                 :         }
     906 ECB             :         else
     907                 :         {
     908 CBC      352854 :             advance_transition_function(aggstate, pertrans, pergroupstate);
     909                 : 
     910 GNC      352854 :             MemoryContextSwitchTo(oldContext);
     911                 : 
     912                 :             /*
     913                 :              * Forget the old value, if any, and remember the new one for
     914                 :              * subsequent equality checks.
     915                 :              */
     916          352854 :             if (!pertrans->inputtypeByVal)
     917                 :             {
     918          262644 :                 if (!oldIsNull)
     919          262554 :                     pfree(DatumGetPointer(oldVal));
     920          262644 :                 if (!*isNull)
     921          262614 :                     oldVal = datumCopy(*newVal, pertrans->inputtypeByVal,
     922          262614 :                                        pertrans->inputtypeLen);
     923                 :             }
     924                 :             else
     925           90210 :                 oldVal = *newVal;
     926 CBC      352854 :             oldAbbrevVal = newAbbrevVal;
     927 GIC      352854 :             oldIsNull = *isNull;
     928 CBC      352854 :             haveOldVal = true;
     929 ECB             :         }
     930                 :     }
     931                 : 
     932 GIC       26775 :     if (!oldIsNull && !pertrans->inputtypeByVal)
     933 CBC          60 :         pfree(DatumGetPointer(oldVal));
     934 ECB             : 
     935 CBC       26775 :     tuplesort_end(pertrans->sortstates[aggstate->current_set]);
     936           26775 :     pertrans->sortstates[aggstate->current_set] = NULL;
     937 GIC       26775 : }
     938                 : 
     939                 : /*
     940 ECB             :  * Run the transition function for a DISTINCT or ORDER BY aggregate
     941                 :  * with more than one input.  This is called after we have completed
     942                 :  * entering all the input values into the sort object.  We complete the
     943                 :  * sort, read out the values in sorted order, and run the transition
     944                 :  * function on each value (applying DISTINCT if appropriate).
     945                 :  *
     946                 :  * This function handles only one grouping set (already set in
     947                 :  * aggstate->current_set).
     948                 :  *
     949                 :  * When called, CurrentMemoryContext should be the per-query context.
     950                 :  */
     951                 : static void
     952 GIC          36 : process_ordered_aggregate_multi(AggState *aggstate,
     953                 :                                 AggStatePerTrans pertrans,
     954                 :                                 AggStatePerGroup pergroupstate)
     955                 : {
     956              36 :     ExprContext *tmpcontext = aggstate->tmpcontext;
     957              36 :     FunctionCallInfo fcinfo = pertrans->transfn_fcinfo;
     958              36 :     TupleTableSlot *slot1 = pertrans->sortslot;
     959              36 :     TupleTableSlot *slot2 = pertrans->uniqslot;
     960 CBC          36 :     int         numTransInputs = pertrans->numTransInputs;
     961 GIC          36 :     int         numDistinctCols = pertrans->numDistinctCols;
     962              36 :     Datum       newAbbrevVal = (Datum) 0;
     963              36 :     Datum       oldAbbrevVal = (Datum) 0;
     964 CBC          36 :     bool        haveOldValue = false;
     965              36 :     TupleTableSlot *save = aggstate->tmpcontext->ecxt_outertuple;
     966 ECB             :     int         i;
     967                 : 
     968 CBC          36 :     tuplesort_performsort(pertrans->sortstates[aggstate->current_set]);
     969 ECB             : 
     970 CBC          36 :     ExecClearTuple(slot1);
     971              36 :     if (slot2)
     972 LBC           0 :         ExecClearTuple(slot2);
     973 ECB             : 
     974 GIC         126 :     while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set],
     975                 :                                   true, true, slot1, &newAbbrevVal))
     976 ECB             :     {
     977 GIC          90 :         CHECK_FOR_INTERRUPTS();
     978 ECB             : 
     979 CBC          90 :         tmpcontext->ecxt_outertuple = slot1;
     980 GBC          90 :         tmpcontext->ecxt_innertuple = slot2;
     981                 : 
     982 CBC          90 :         if (numDistinctCols == 0 ||
     983 UIC           0 :             !haveOldValue ||
     984               0 :             newAbbrevVal != oldAbbrevVal ||
     985 LBC           0 :             !ExecQual(pertrans->equalfnMulti, tmpcontext))
     986                 :         {
     987 ECB             :             /*
     988                 :              * Extract the first numTransInputs columns as datums to pass to
     989                 :              * the transfn.
     990                 :              */
     991 GBC          90 :             slot_getsomeattrs(slot1, numTransInputs);
     992 EUB             : 
     993                 :             /* Load values into fcinfo */
     994                 :             /* Start from 1, since the 0th arg will be the transition value */
     995 GIC         270 :             for (i = 0; i < numTransInputs; i++)
     996                 :             {
     997             180 :                 fcinfo->args[i + 1].value = slot1->tts_values[i];
     998             180 :                 fcinfo->args[i + 1].isnull = slot1->tts_isnull[i];
     999 ECB             :             }
    1000                 : 
    1001 GIC          90 :             advance_transition_function(aggstate, pertrans, pergroupstate);
    1002                 : 
    1003 CBC          90 :             if (numDistinctCols > 0)
    1004                 :             {
    1005 ECB             :                 /* swap the slot pointers to retain the current tuple */
    1006 LBC           0 :                 TupleTableSlot *tmpslot = slot2;
    1007                 : 
    1008 UIC           0 :                 slot2 = slot1;
    1009 LBC           0 :                 slot1 = tmpslot;
    1010                 :                 /* avoid ExecQual() calls by reusing abbreviated keys */
    1011               0 :                 oldAbbrevVal = newAbbrevVal;
    1012 UIC           0 :                 haveOldValue = true;
    1013                 :             }
    1014 EUB             :         }
    1015                 : 
    1016                 :         /* Reset context each time */
    1017 GBC          90 :         ResetExprContext(tmpcontext);
    1018                 : 
    1019              90 :         ExecClearTuple(slot1);
    1020 EUB             :     }
    1021                 : 
    1022 GIC          36 :     if (slot2)
    1023 UIC           0 :         ExecClearTuple(slot2);
    1024                 : 
    1025 CBC          36 :     tuplesort_end(pertrans->sortstates[aggstate->current_set]);
    1026 GIC          36 :     pertrans->sortstates[aggstate->current_set] = NULL;
    1027 ECB             : 
    1028                 :     /* restore previous slot, potentially in use for grouping sets */
    1029 GIC          36 :     tmpcontext->ecxt_outertuple = save;
    1030 CBC          36 : }
    1031 EUB             : 
    1032                 : /*
    1033 ECB             :  * Compute the final value of one aggregate.
    1034                 :  *
    1035                 :  * This function handles only one grouping set (already set in
    1036                 :  * aggstate->current_set).
    1037                 :  *
    1038                 :  * The finalfn will be run, and the result delivered, in the
    1039                 :  * output-tuple context; caller's CurrentMemoryContext does not matter.
    1040                 :  * (But note that in some cases, such as when there is no finalfn, the
    1041                 :  * result might be a pointer to or into the agg's transition value.)
    1042                 :  *
    1043                 :  * The finalfn uses the state as set in the transno. This also might be
    1044                 :  * being used by another aggregate function, so it's important that we do
    1045                 :  * nothing destructive here.
    1046                 :  */
    1047                 : static void
    1048 GIC      525579 : finalize_aggregate(AggState *aggstate,
    1049                 :                    AggStatePerAgg peragg,
    1050                 :                    AggStatePerGroup pergroupstate,
    1051                 :                    Datum *resultVal, bool *resultIsNull)
    1052                 : {
    1053          525579 :     LOCAL_FCINFO(fcinfo, FUNC_MAX_ARGS);
    1054          525579 :     bool        anynull = false;
    1055                 :     MemoryContext oldContext;
    1056                 :     int         i;
    1057                 :     ListCell   *lc;
    1058 CBC      525579 :     AggStatePerTrans pertrans = &aggstate->pertrans[peragg->transno];
    1059                 : 
    1060 GIC      525579 :     oldContext = MemoryContextSwitchTo(aggstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
    1061                 : 
    1062                 :     /*
    1063 ECB             :      * Evaluate any direct arguments.  We do this even if there's no finalfn
    1064                 :      * (which is unlikely anyway), so that side-effects happen as expected.
    1065                 :      * The direct arguments go into arg positions 1 and up, leaving position 0
    1066                 :      * for the transition state value.
    1067                 :      */
    1068 CBC      525579 :     i = 1;
    1069 GIC      526066 :     foreach(lc, peragg->aggdirectargs)
    1070 ECB             :     {
    1071 GIC         487 :         ExprState  *expr = (ExprState *) lfirst(lc);
    1072                 : 
    1073             487 :         fcinfo->args[i].value = ExecEvalExpr(expr,
    1074                 :                                              aggstate->ss.ps.ps_ExprContext,
    1075                 :                                              &fcinfo->args[i].isnull);
    1076             487 :         anynull |= fcinfo->args[i].isnull;
    1077             487 :         i++;
    1078 ECB             :     }
    1079                 : 
    1080                 :     /*
    1081                 :      * Apply the agg's finalfn if one is provided, else return transValue.
    1082                 :      */
    1083 CBC      525579 :     if (OidIsValid(peragg->finalfn_oid))
    1084                 :     {
    1085 GIC      147156 :         int         numFinalArgs = peragg->numFinalArgs;
    1086 ECB             : 
    1087                 :         /* set up aggstate->curperagg for AggGetAggref() */
    1088 GIC      147156 :         aggstate->curperagg = peragg;
    1089                 : 
    1090          147156 :         InitFunctionCallInfoData(*fcinfo, &peragg->finalfn,
    1091                 :                                  numFinalArgs,
    1092                 :                                  pertrans->aggCollation,
    1093 ECB             :                                  (void *) aggstate, NULL);
    1094                 : 
    1095                 :         /* Fill in the transition state value */
    1096 GIC      147156 :         fcinfo->args[0].value =
    1097          147156 :             MakeExpandedObjectReadOnly(pergroupstate->transValue,
    1098 ECB             :                                        pergroupstate->transValueIsNull,
    1099                 :                                        pertrans->transtypeLen);
    1100 CBC      147156 :         fcinfo->args[0].isnull = pergroupstate->transValueIsNull;
    1101 GIC      147156 :         anynull |= pergroupstate->transValueIsNull;
    1102                 : 
    1103                 :         /* Fill any remaining argument positions with nulls */
    1104          201938 :         for (; i < numFinalArgs; i++)
    1105                 :         {
    1106 CBC       54782 :             fcinfo->args[i].value = (Datum) 0;
    1107           54782 :             fcinfo->args[i].isnull = true;
    1108 GIC       54782 :             anynull = true;
    1109                 :         }
    1110 ECB             : 
    1111 CBC      147156 :         if (fcinfo->flinfo->fn_strict && anynull)
    1112                 :         {
    1113                 :             /* don't call a strict function with NULL inputs */
    1114 LBC           0 :             *resultVal = (Datum) 0;
    1115 UIC           0 :             *resultIsNull = true;
    1116 ECB             :         }
    1117                 :         else
    1118                 :         {
    1119 GIC      147156 :             *resultVal = FunctionCallInvoke(fcinfo);
    1120          147150 :             *resultIsNull = fcinfo->isnull;
    1121 ECB             :         }
    1122 GIC      147150 :         aggstate->curperagg = NULL;
    1123                 :     }
    1124 EUB             :     else
    1125                 :     {
    1126 GNC      378423 :         *resultVal =
    1127          378423 :             MakeExpandedObjectReadOnly(pergroupstate->transValue,
    1128                 :                                        pergroupstate->transValueIsNull,
    1129                 :                                        pertrans->transtypeLen);
    1130 GIC      378423 :         *resultIsNull = pergroupstate->transValueIsNull;
    1131 ECB             :     }
    1132                 : 
    1133 GIC      525573 :     MemoryContextSwitchTo(oldContext);
    1134          525573 : }
    1135 ECB             : 
    1136                 : /*
    1137                 :  * Compute the output value of one partial aggregate.
    1138                 :  *
    1139                 :  * The serialization function will be run, and the result delivered, in the
    1140                 :  * output-tuple context; caller's CurrentMemoryContext does not matter.
    1141                 :  */
    1142                 : static void
    1143 GIC        6722 : finalize_partialaggregate(AggState *aggstate,
    1144                 :                           AggStatePerAgg peragg,
    1145 ECB             :                           AggStatePerGroup pergroupstate,
    1146                 :                           Datum *resultVal, bool *resultIsNull)
    1147                 : {
    1148 GIC        6722 :     AggStatePerTrans pertrans = &aggstate->pertrans[peragg->transno];
    1149                 :     MemoryContext oldContext;
    1150 ECB             : 
    1151 GIC        6722 :     oldContext = MemoryContextSwitchTo(aggstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory);
    1152                 : 
    1153 ECB             :     /*
    1154                 :      * serialfn_oid will be set if we must serialize the transvalue before
    1155                 :      * returning it
    1156                 :      */
    1157 GIC        6722 :     if (OidIsValid(pertrans->serialfn_oid))
    1158                 :     {
    1159 ECB             :         /* Don't call a strict serialization function with NULL input. */
    1160 GIC         293 :         if (pertrans->serialfn.fn_strict && pergroupstate->transValueIsNull)
    1161                 :         {
    1162 CBC          46 :             *resultVal = (Datum) 0;
    1163 GIC          46 :             *resultIsNull = true;
    1164 ECB             :         }
    1165                 :         else
    1166                 :         {
    1167 GIC         247 :             FunctionCallInfo fcinfo = pertrans->serialfn_fcinfo;
    1168                 : 
    1169 CBC         247 :             fcinfo->args[0].value =
    1170 GIC         247 :                 MakeExpandedObjectReadOnly(pergroupstate->transValue,
    1171 ECB             :                                            pergroupstate->transValueIsNull,
    1172                 :                                            pertrans->transtypeLen);
    1173 GIC         247 :             fcinfo->args[0].isnull = pergroupstate->transValueIsNull;
    1174             247 :             fcinfo->isnull = false;
    1175 ECB             : 
    1176 CBC         247 :             *resultVal = FunctionCallInvoke(fcinfo);
    1177 GIC         247 :             *resultIsNull = fcinfo->isnull;
    1178 ECB             :         }
    1179                 :     }
    1180                 :     else
    1181                 :     {
    1182 GNC        6429 :         *resultVal =
    1183            6429 :             MakeExpandedObjectReadOnly(pergroupstate->transValue,
    1184                 :                                        pergroupstate->transValueIsNull,
    1185                 :                                        pertrans->transtypeLen);
    1186 CBC        6429 :         *resultIsNull = pergroupstate->transValueIsNull;
    1187 ECB             :     }
    1188                 : 
    1189 GIC        6722 :     MemoryContextSwitchTo(oldContext);
    1190            6722 : }
    1191                 : 
    1192                 : /*
    1193 ECB             :  * Extract the attributes that make up the grouping key into the
    1194                 :  * hashslot. This is necessary to compute the hash or perform a lookup.
    1195                 :  */
    1196                 : static inline void
    1197 GIC     3098917 : prepare_hash_slot(AggStatePerHash perhash,
    1198                 :                   TupleTableSlot *inputslot,
    1199                 :                   TupleTableSlot *hashslot)
    1200 ECB             : {
    1201                 :     int         i;
    1202                 : 
    1203                 :     /* transfer just the needed columns into hashslot */
    1204 GIC     3098917 :     slot_getsomeattrs(inputslot, perhash->largestGrpColIdx);
    1205 CBC     3098917 :     ExecClearTuple(hashslot);
    1206                 : 
    1207         7645511 :     for (i = 0; i < perhash->numhashGrpCols; i++)
    1208 ECB             :     {
    1209 GIC     4546594 :         int         varNumber = perhash->hashGrpColIdxInput[i] - 1;
    1210 ECB             : 
    1211 CBC     4546594 :         hashslot->tts_values[i] = inputslot->tts_values[varNumber];
    1212 GIC     4546594 :         hashslot->tts_isnull[i] = inputslot->tts_isnull[varNumber];
    1213                 :     }
    1214         3098917 :     ExecStoreVirtualTuple(hashslot);
    1215         3098917 : }
    1216                 : 
    1217                 : /*
    1218                 :  * Prepare to finalize and project based on the specified representative tuple
    1219                 :  * slot and grouping set.
    1220                 :  *
    1221                 :  * In the specified tuple slot, force to null all attributes that should be
    1222                 :  * read as null in the context of the current grouping set.  Also stash the
    1223                 :  * current group bitmap where GroupingExpr can get at it.
    1224                 :  *
    1225                 :  * This relies on three conditions:
    1226                 :  *
    1227                 :  * 1) Nothing is ever going to try and extract the whole tuple from this slot,
    1228                 :  * only reference it in evaluations, which will only access individual
    1229                 :  * attributes.
    1230                 :  *
    1231                 :  * 2) No system columns are going to need to be nulled. (If a system column is
    1232                 :  * referenced in a group clause, it is actually projected in the outer plan
    1233                 :  * tlist.)
    1234                 :  *
    1235                 :  * 3) Within a given phase, we never need to recover the value of an attribute
    1236                 :  * once it has been set to null.
    1237                 :  *
    1238 ECB             :  * Poking into the slot this way is a bit ugly, but the consensus is that the
    1239                 :  * alternative was worse.
    1240                 :  */
    1241                 : static void
    1242 CBC      447440 : prepare_projection_slot(AggState *aggstate, TupleTableSlot *slot, int currentSet)
    1243                 : {
    1244          447440 :     if (aggstate->phase->grouped_cols)
    1245                 :     {
    1246          283937 :         Bitmapset  *grouped_cols = aggstate->phase->grouped_cols[currentSet];
    1247                 : 
    1248 GIC      283937 :         aggstate->grouped_cols = grouped_cols;
    1249                 : 
    1250          283937 :         if (TTS_EMPTY(slot))
    1251                 :         {
    1252                 :             /*
    1253 ECB             :              * Force all values to be NULL if working on an empty input tuple
    1254                 :              * (i.e. an empty grouping set for which no input rows were
    1255                 :              * supplied).
    1256                 :              */
    1257 GIC          24 :             ExecStoreAllNullTuple(slot);
    1258                 :         }
    1259          283913 :         else if (aggstate->all_grouped_cols)
    1260 ECB             :         {
    1261                 :             ListCell   *lc;
    1262                 : 
    1263                 :             /* all_grouped_cols is arranged in desc order */
    1264 CBC      283886 :             slot_getsomeattrs(slot, linitial_int(aggstate->all_grouped_cols));
    1265                 : 
    1266          774053 :             foreach(lc, aggstate->all_grouped_cols)
    1267 ECB             :             {
    1268 GIC      490167 :                 int         attnum = lfirst_int(lc);
    1269                 : 
    1270          490167 :                 if (!bms_is_member(attnum, grouped_cols))
    1271 CBC       28673 :                     slot->tts_isnull[attnum - 1] = true;
    1272                 :             }
    1273                 :         }
    1274                 :     }
    1275 GIC      447440 : }
    1276                 : 
    1277                 : /*
    1278                 :  * Compute the final value of all aggregates for one group.
    1279                 :  *
    1280                 :  * This function handles only one grouping set at a time, which the caller must
    1281                 :  * have selected.  It's also the caller's responsibility to adjust the supplied
    1282                 :  * pergroup parameter to point to the current set's transvalues.
    1283 ECB             :  *
    1284                 :  * Results are stored in the output econtext aggvalues/aggnulls.
    1285                 :  */
    1286                 : static void
    1287 CBC      447440 : finalize_aggregates(AggState *aggstate,
    1288 ECB             :                     AggStatePerAgg peraggs,
    1289                 :                     AggStatePerGroup pergroup)
    1290                 : {
    1291 GIC      447440 :     ExprContext *econtext = aggstate->ss.ps.ps_ExprContext;
    1292          447440 :     Datum      *aggvalues = econtext->ecxt_aggvalues;
    1293          447440 :     bool       *aggnulls = econtext->ecxt_aggnulls;
    1294                 :     int         aggno;
    1295 ECB             : 
    1296                 :     /*
    1297                 :      * If there were any DISTINCT and/or ORDER BY aggregates, sort their
    1298                 :      * inputs and run the transition functions.
    1299                 :      */
    1300 GNC      979612 :     for (int transno = 0; transno < aggstate->numtrans; transno++)
    1301                 :     {
    1302 CBC      532172 :         AggStatePerTrans pertrans = &aggstate->pertrans[transno];
    1303                 :         AggStatePerGroup pergroupstate;
    1304 ECB             : 
    1305 GIC      532172 :         pergroupstate = &pergroup[transno];
    1306                 : 
    1307 GNC      532172 :         if (pertrans->aggsortrequired)
    1308 ECB             :         {
    1309 GIC       26811 :             Assert(aggstate->aggstrategy != AGG_HASHED &&
    1310                 :                    aggstate->aggstrategy != AGG_MIXED);
    1311                 : 
    1312 CBC       26811 :             if (pertrans->numInputs == 1)
    1313 GIC       26775 :                 process_ordered_aggregate_single(aggstate,
    1314                 :                                                  pertrans,
    1315                 :                                                  pergroupstate);
    1316 ECB             :             else
    1317 GIC          36 :                 process_ordered_aggregate_multi(aggstate,
    1318 ECB             :                                                 pertrans,
    1319                 :                                                 pergroupstate);
    1320                 :         }
    1321 GNC      505361 :         else if (pertrans->numDistinctCols > 0 && pertrans->haslast)
    1322                 :         {
    1323            9168 :             pertrans->haslast = false;
    1324                 : 
    1325            9168 :             if (pertrans->numDistinctCols == 1)
    1326                 :             {
    1327            9126 :                 if (!pertrans->inputtypeByVal && !pertrans->lastisnull)
    1328             125 :                     pfree(DatumGetPointer(pertrans->lastdatum));
    1329                 : 
    1330            9126 :                 pertrans->lastisnull = false;
    1331            9126 :                 pertrans->lastdatum = (Datum) 0;
    1332                 :             }
    1333                 :             else
    1334              42 :                 ExecClearTuple(pertrans->uniqslot);
    1335                 :         }
    1336                 :     }
    1337 ECB             : 
    1338                 :     /*
    1339                 :      * Run the final functions.
    1340                 :      */
    1341 CBC      979735 :     for (aggno = 0; aggno < aggstate->numaggs; aggno++)
    1342                 :     {
    1343 GIC      532301 :         AggStatePerAgg peragg = &peraggs[aggno];
    1344 CBC      532301 :         int         transno = peragg->transno;
    1345                 :         AggStatePerGroup pergroupstate;
    1346                 : 
    1347 GIC      532301 :         pergroupstate = &pergroup[transno];
    1348                 : 
    1349          532301 :         if (DO_AGGSPLIT_SKIPFINAL(aggstate->aggsplit))
    1350            6722 :             finalize_partialaggregate(aggstate, peragg, pergroupstate,
    1351 CBC        6722 :                                       &aggvalues[aggno], &aggnulls[aggno]);
    1352                 :         else
    1353          525579 :             finalize_aggregate(aggstate, peragg, pergroupstate,
    1354          525579 :                                &aggvalues[aggno], &aggnulls[aggno]);
    1355                 :     }
    1356 GIC      447434 : }
    1357 ECB             : 
    1358                 : /*
    1359                 :  * Project the result of a group (whose aggs have already been calculated by
    1360                 :  * finalize_aggregates). Returns the result slot, or NULL if no row is
    1361                 :  * projected (suppressed by qual).
    1362                 :  */
    1363                 : static TupleTableSlot *
    1364 CBC      447434 : project_aggregates(AggState *aggstate)
    1365                 : {
    1366          447434 :     ExprContext *econtext = aggstate->ss.ps.ps_ExprContext;
    1367                 : 
    1368                 :     /*
    1369                 :      * Check the qual (HAVING clause); if the group does not match, ignore it.
    1370                 :      */
    1371 GIC      447434 :     if (ExecQual(aggstate->ss.ps.qual, econtext))
    1372                 :     {
    1373                 :         /*
    1374 ECB             :          * Form and return projection tuple using the aggregate results and
    1375                 :          * the representative input tuple.
    1376                 :          */
    1377 GIC      403289 :         return ExecProject(aggstate->ss.ps.ps_ProjInfo);
    1378                 :     }
    1379                 :     else
    1380           44145 :         InstrCountFiltered1(aggstate, 1);
    1381 ECB             : 
    1382 GIC       44145 :     return NULL;
    1383                 : }
    1384                 : 
    1385                 : /*
    1386                 :  * Find input-tuple columns that are needed, dividing them into
    1387 ECB             :  * aggregated and unaggregated sets.
    1388                 :  */
    1389                 : static void
    1390 CBC        3990 : find_cols(AggState *aggstate, Bitmapset **aggregated, Bitmapset **unaggregated)
    1391                 : {
    1392            3990 :     Agg        *agg = (Agg *) aggstate->ss.ps.plan;
    1393                 :     FindColsContext context;
    1394                 : 
    1395 GIC        3990 :     context.is_aggref = false;
    1396            3990 :     context.aggregated = NULL;
    1397            3990 :     context.unaggregated = NULL;
    1398                 : 
    1399                 :     /* Examine tlist and quals */
    1400 CBC        3990 :     (void) find_cols_walker((Node *) agg->plan.targetlist, &context);
    1401 GIC        3990 :     (void) find_cols_walker((Node *) agg->plan.qual, &context);
    1402 ECB             : 
    1403                 :     /* In some cases, grouping columns will not appear in the tlist */
    1404 GIC       11763 :     for (int i = 0; i < agg->numCols; i++)
    1405 CBC        7773 :         context.unaggregated = bms_add_member(context.unaggregated,
    1406            7773 :                                               agg->grpColIdx[i]);
    1407 ECB             : 
    1408 GIC        3990 :     *aggregated = context.aggregated;
    1409            3990 :     *unaggregated = context.unaggregated;
    1410 CBC        3990 : }
    1411 ECB             : 
    1412                 : static bool
    1413 GIC       41605 : find_cols_walker(Node *node, FindColsContext *context)
    1414 ECB             : {
    1415 CBC       41605 :     if (node == NULL)
    1416            7021 :         return false;
    1417 GIC       34584 :     if (IsA(node, Var))
    1418 ECB             :     {
    1419 CBC       10513 :         Var        *var = (Var *) node;
    1420 ECB             : 
    1421                 :         /* setrefs.c should have set the varno to OUTER_VAR */
    1422 GIC       10513 :         Assert(var->varno == OUTER_VAR);
    1423 CBC       10513 :         Assert(var->varlevelsup == 0);
    1424 GIC       10513 :         if (context->is_aggref)
    1425 CBC        2206 :             context->aggregated = bms_add_member(context->aggregated,
    1426            2206 :                                                  var->varattno);
    1427 ECB             :         else
    1428 GIC        8307 :             context->unaggregated = bms_add_member(context->unaggregated,
    1429 CBC        8307 :                                                    var->varattno);
    1430 GIC       10513 :         return false;
    1431                 :     }
    1432 CBC       24071 :     if (IsA(node, Aggref))
    1433 ECB             :     {
    1434 CBC        3277 :         Assert(!context->is_aggref);
    1435            3277 :         context->is_aggref = true;
    1436            3277 :         expression_tree_walker(node, find_cols_walker, (void *) context);
    1437 GIC        3277 :         context->is_aggref = false;
    1438 CBC        3277 :         return false;
    1439 ECB             :     }
    1440 CBC       20794 :     return expression_tree_walker(node, find_cols_walker,
    1441                 :                                   (void *) context);
    1442 ECB             : }
    1443                 : 
    1444                 : /*
    1445                 :  * (Re-)initialize the hash table(s) to empty.
    1446                 :  *
    1447                 :  * To implement hashed aggregation, we need a hashtable that stores a
    1448                 :  * representative tuple and an array of AggStatePerGroup structs for each
    1449                 :  * distinct set of GROUP BY column values.  We compute the hash key from the
    1450                 :  * GROUP BY columns.  The per-group data is allocated in lookup_hash_entry(),
    1451                 :  * for each entry.
    1452                 :  *
    1453                 :  * We have a separate hashtable and associated perhash data structure for each
    1454                 :  * grouping set for which we're doing hashing.
    1455                 :  *
    1456                 :  * The contents of the hash tables always live in the hashcontext's per-tuple
    1457                 :  * memory context (there is only one of these for all tables together, since
    1458                 :  * they are all reset at the same time).
    1459                 :  */
    1460                 : static void
    1461 GIC       41380 : build_hash_tables(AggState *aggstate)
    1462                 : {
    1463                 :     int         setno;
    1464                 : 
    1465           82902 :     for (setno = 0; setno < aggstate->num_hashes; ++setno)
    1466                 :     {
    1467           41522 :         AggStatePerHash perhash = &aggstate->perhash[setno];
    1468                 :         long        nbuckets;
    1469                 :         Size        memory;
    1470                 : 
    1471 CBC       41522 :         if (perhash->hashtable != NULL)
    1472                 :         {
    1473 GIC       37944 :             ResetTupleHashTable(perhash->hashtable);
    1474           37944 :             continue;
    1475 ECB             :         }
    1476                 : 
    1477 CBC        3578 :         Assert(perhash->aggnode->numGroups > 0);
    1478                 : 
    1479 GIC        3578 :         memory = aggstate->hash_mem_limit / aggstate->num_hashes;
    1480                 : 
    1481 ECB             :         /* choose reasonable number of buckets per hashtable */
    1482 GIC        3578 :         nbuckets = hash_choose_num_buckets(aggstate->hashentrysize,
    1483 CBC        3578 :                                            perhash->aggnode->numGroups,
    1484 ECB             :                                            memory);
    1485                 : 
    1486 GIC        3578 :         build_hash_table(aggstate, setno, nbuckets);
    1487 ECB             :     }
    1488                 : 
    1489 CBC       41380 :     aggstate->hash_ngroups_current = 0;
    1490 GIC       41380 : }
    1491                 : 
    1492 ECB             : /*
    1493                 :  * Build a single hashtable for this grouping set.
    1494                 :  */
    1495                 : static void
    1496 CBC        3578 : build_hash_table(AggState *aggstate, int setno, long nbuckets)
    1497                 : {
    1498 GIC        3578 :     AggStatePerHash perhash = &aggstate->perhash[setno];
    1499 CBC        3578 :     MemoryContext metacxt = aggstate->hash_metacxt;
    1500            3578 :     MemoryContext hashcxt = aggstate->hashcontext->ecxt_per_tuple_memory;
    1501 GIC        3578 :     MemoryContext tmpcxt = aggstate->tmpcontext->ecxt_per_tuple_memory;
    1502                 :     Size        additionalsize;
    1503                 : 
    1504            3578 :     Assert(aggstate->aggstrategy == AGG_HASHED ||
    1505                 :            aggstate->aggstrategy == AGG_MIXED);
    1506 ECB             : 
    1507                 :     /*
    1508                 :      * Used to make sure initial hash table allocation does not exceed
    1509                 :      * hash_mem. Note that the estimate does not include space for
    1510                 :      * pass-by-reference transition data values, nor for the representative
    1511                 :      * tuple of each group.
    1512                 :      */
    1513 GIC        3578 :     additionalsize = aggstate->numtrans * sizeof(AggStatePerGroupData);
    1514 ECB             : 
    1515 GIC        7156 :     perhash->hashtable = BuildTupleHashTableExt(&aggstate->ss.ps,
    1516            3578 :                                                 perhash->hashslot->tts_tupleDescriptor,
    1517                 :                                                 perhash->numCols,
    1518                 :                                                 perhash->hashGrpColIdxHash,
    1519            3578 :                                                 perhash->eqfuncoids,
    1520                 :                                                 perhash->hashfunctions,
    1521            3578 :                                                 perhash->aggnode->grpCollations,
    1522                 :                                                 nbuckets,
    1523 ECB             :                                                 additionalsize,
    1524                 :                                                 metacxt,
    1525                 :                                                 hashcxt,
    1526                 :                                                 tmpcxt,
    1527 GIC        3578 :                                                 DO_AGGSPLIT_SKIPFINAL(aggstate->aggsplit));
    1528            3578 : }
    1529 ECB             : 
    1530                 : /*
    1531                 :  * Compute columns that actually need to be stored in hashtable entries.  The
    1532                 :  * incoming tuples from the child plan node will contain grouping columns,
    1533                 :  * other columns referenced in our targetlist and qual, columns used to
    1534                 :  * compute the aggregate functions, and perhaps just junk columns we don't use
    1535                 :  * at all.  Only columns of the first two types need to be stored in the
    1536                 :  * hashtable, and getting rid of the others can make the table entries
    1537                 :  * significantly smaller.  The hashtable only contains the relevant columns,
    1538                 :  * and is packed/unpacked in lookup_hash_entry() / agg_retrieve_hash_table()
    1539                 :  * into the format of the normal input descriptor.
    1540                 :  *
    1541                 :  * Additional columns, in addition to the columns grouped by, come from two
    1542                 :  * sources: Firstly functionally dependent columns that we don't need to group
    1543                 :  * by themselves, and secondly ctids for row-marks.
    1544                 :  *
    1545                 :  * To eliminate duplicates, we build a bitmapset of the needed columns, and
    1546                 :  * then build an array of the columns included in the hashtable. We might
    1547                 :  * still have duplicates if the passed-in grpColIdx has them, which can happen
    1548                 :  * in edge cases from semijoins/distinct; these can't always be removed,
    1549                 :  * because it's not certain that the duplicate cols will be using the same
    1550                 :  * hash function.
    1551                 :  *
    1552                 :  * Note that the array is preserved over ExecReScanAgg, so we allocate it in
    1553                 :  * the per-query context (unlike the hash table itself).
    1554                 :  */
    1555                 : static void
    1556 GIC        3990 : find_hash_columns(AggState *aggstate)
    1557                 : {
    1558                 :     Bitmapset  *base_colnos;
    1559                 :     Bitmapset  *aggregated_colnos;
    1560            3990 :     TupleDesc   scanDesc = aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
    1561            3990 :     List       *outerTlist = outerPlanState(aggstate)->plan->targetlist;
    1562            3990 :     int         numHashes = aggstate->num_hashes;
    1563            3990 :     EState     *estate = aggstate->ss.ps.state;
    1564                 :     int         j;
    1565                 : 
    1566 ECB             :     /* Find Vars that will be needed in tlist and qual */
    1567 GIC        3990 :     find_cols(aggstate, &aggregated_colnos, &base_colnos);
    1568            3990 :     aggstate->colnos_needed = bms_union(base_colnos, aggregated_colnos);
    1569            3990 :     aggstate->max_colno_needed = 0;
    1570 CBC        3990 :     aggstate->all_cols_needed = true;
    1571 ECB             : 
    1572 CBC       16547 :     for (int i = 0; i < scanDesc->natts; i++)
    1573 ECB             :     {
    1574 GIC       12557 :         int         colno = i + 1;
    1575                 : 
    1576           12557 :         if (bms_is_member(colno, aggstate->colnos_needed))
    1577 CBC        9387 :             aggstate->max_colno_needed = colno;
    1578 ECB             :         else
    1579 CBC        3170 :             aggstate->all_cols_needed = false;
    1580 ECB             :     }
    1581                 : 
    1582 CBC        8177 :     for (j = 0; j < numHashes; ++j)
    1583                 :     {
    1584            4187 :         AggStatePerHash perhash = &aggstate->perhash[j];
    1585 GIC        4187 :         Bitmapset  *colnos = bms_copy(base_colnos);
    1586 CBC        4187 :         AttrNumber *grpColIdx = perhash->aggnode->grpColIdx;
    1587            4187 :         List       *hashTlist = NIL;
    1588                 :         TupleDesc   hashDesc;
    1589 ECB             :         int         maxCols;
    1590                 :         int         i;
    1591                 : 
    1592 CBC        4187 :         perhash->largestGrpColIdx = 0;
    1593                 : 
    1594 ECB             :         /*
    1595                 :          * If we're doing grouping sets, then some Vars might be referenced in
    1596                 :          * tlist/qual for the benefit of other grouping sets, but not needed
    1597                 :          * when hashing; i.e. prepare_projection_slot will null them out, so
    1598                 :          * there'd be no point storing them.  Use prepare_projection_slot's
    1599                 :          * logic to determine which.
    1600                 :          */
    1601 GIC        4187 :         if (aggstate->phases[0].grouped_cols)
    1602 ECB             :         {
    1603 GIC        4187 :             Bitmapset  *grouped_cols = aggstate->phases[0].grouped_cols[j];
    1604                 :             ListCell   *lc;
    1605                 : 
    1606           12712 :             foreach(lc, aggstate->all_grouped_cols)
    1607                 :             {
    1608            8525 :                 int         attnum = lfirst_int(lc);
    1609                 : 
    1610            8525 :                 if (!bms_is_member(attnum, grouped_cols))
    1611 CBC         504 :                     colnos = bms_del_member(colnos, attnum);
    1612                 :             }
    1613 ECB             :         }
    1614                 : 
    1615                 :         /*
    1616                 :          * Compute maximum number of input columns accounting for possible
    1617                 :          * duplications in the grpColIdx array, which can happen in some edge
    1618                 :          * cases where HashAggregate was generated as part of a semijoin or a
    1619                 :          * DISTINCT.
    1620                 :          */
    1621 CBC        4187 :         maxCols = bms_num_members(colnos) + perhash->numCols;
    1622                 : 
    1623 GIC        4187 :         perhash->hashGrpColIdxInput =
    1624            4187 :             palloc(maxCols * sizeof(AttrNumber));
    1625            4187 :         perhash->hashGrpColIdxHash =
    1626            4187 :             palloc(perhash->numCols * sizeof(AttrNumber));
    1627                 : 
    1628                 :         /* Add all the grouping columns to colnos */
    1629           12211 :         for (i = 0; i < perhash->numCols; i++)
    1630            8024 :             colnos = bms_add_member(colnos, grpColIdx[i]);
    1631 ECB             : 
    1632                 :         /*
    1633                 :          * First build mapping for columns directly hashed. These are the
    1634                 :          * first, because they'll be accessed when computing hash values and
    1635                 :          * comparing tuples for exact matches. We also build simple mapping
    1636                 :          * for execGrouping, so it knows where to find the to-be-hashed /
    1637                 :          * compared columns in the input.
    1638                 :          */
    1639 CBC       12211 :         for (i = 0; i < perhash->numCols; i++)
    1640 ECB             :         {
    1641 GIC        8024 :             perhash->hashGrpColIdxInput[i] = grpColIdx[i];
    1642            8024 :             perhash->hashGrpColIdxHash[i] = i + 1;
    1643            8024 :             perhash->numhashGrpCols++;
    1644                 :             /* delete already mapped columns */
    1645 GNC        8024 :             colnos = bms_del_member(colnos, grpColIdx[i]);
    1646                 :         }
    1647                 : 
    1648                 :         /* and add the remaining columns */
    1649            4187 :         i = -1;
    1650            4499 :         while ((i = bms_next_member(colnos, i)) >= 0)
    1651                 :         {
    1652 CBC         312 :             perhash->hashGrpColIdxInput[perhash->numhashGrpCols] = i;
    1653             312 :             perhash->numhashGrpCols++;
    1654 ECB             :         }
    1655                 : 
    1656                 :         /* and build a tuple descriptor for the hashtable */
    1657 GIC       12523 :         for (i = 0; i < perhash->numhashGrpCols; i++)
    1658                 :         {
    1659            8336 :             int         varNumber = perhash->hashGrpColIdxInput[i] - 1;
    1660 ECB             : 
    1661 CBC        8336 :             hashTlist = lappend(hashTlist, list_nth(outerTlist, varNumber));
    1662 GIC        8336 :             perhash->largestGrpColIdx =
    1663 CBC        8336 :                 Max(varNumber + 1, perhash->largestGrpColIdx);
    1664 ECB             :         }
    1665                 : 
    1666 GIC        4187 :         hashDesc = ExecTypeFromTL(hashTlist);
    1667                 : 
    1668 CBC        4187 :         execTuplesHashPrepare(perhash->numCols,
    1669 GIC        4187 :                               perhash->aggnode->grpOperators,
    1670 ECB             :                               &perhash->eqfuncoids,
    1671                 :                               &perhash->hashfunctions);
    1672 CBC        4187 :         perhash->hashslot =
    1673            4187 :             ExecAllocTableSlot(&estate->es_tupleTable, hashDesc,
    1674 ECB             :                                &TTSOpsMinimalTuple);
    1675                 : 
    1676 GIC        4187 :         list_free(hashTlist);
    1677 CBC        4187 :         bms_free(colnos);
    1678                 :     }
    1679 ECB             : 
    1680 CBC        3990 :     bms_free(base_colnos);
    1681 GIC        3990 : }
    1682                 : 
    1683 ECB             : /*
    1684                 :  * Estimate per-hash-table-entry overhead.
    1685                 :  */
    1686                 : Size
    1687 CBC       14891 : hash_agg_entry_size(int numTrans, Size tupleWidth, Size transitionSpace)
    1688 ECB             : {
    1689                 :     Size        tupleChunkSize;
    1690                 :     Size        pergroupChunkSize;
    1691                 :     Size        transitionChunkSize;
    1692 CBC       14891 :     Size        tupleSize = (MAXALIGN(SizeofMinimalTupleHeader) +
    1693                 :                              tupleWidth);
    1694 GIC       14891 :     Size        pergroupSize = numTrans * sizeof(AggStatePerGroupData);
    1695                 : 
    1696           14891 :     tupleChunkSize = CHUNKHDRSZ + tupleSize;
    1697                 : 
    1698 CBC       14891 :     if (pergroupSize > 0)
    1699 GIC        5095 :         pergroupChunkSize = CHUNKHDRSZ + pergroupSize;
    1700                 :     else
    1701            9796 :         pergroupChunkSize = 0;
    1702                 : 
    1703 CBC       14891 :     if (transitionSpace > 0)
    1704 GIC        2375 :         transitionChunkSize = CHUNKHDRSZ + transitionSpace;
    1705 ECB             :     else
    1706 GIC       12516 :         transitionChunkSize = 0;
    1707 ECB             : 
    1708                 :     return
    1709                 :         sizeof(TupleHashEntryData) +
    1710 CBC       14891 :         tupleChunkSize +
    1711 GIC       14891 :         pergroupChunkSize +
    1712 ECB             :         transitionChunkSize;
    1713                 : }
    1714                 : 
    1715                 : /*
    1716                 :  * hashagg_recompile_expressions()
    1717                 :  *
    1718                 :  * Identifies the right phase, compiles the right expression given the
    1719                 :  * arguments, and then sets phase->evalfunc to that expression.
    1720                 :  *
    1721                 :  * Different versions of the compiled expression are needed depending on
    1722                 :  * whether hash aggregation has spilled or not, and whether it's reading from
    1723                 :  * the outer plan or a tape. Before spilling to disk, the expression reads
    1724                 :  * from the outer plan and does not need to perform a NULL check. After
    1725                 :  * HashAgg begins to spill, new groups will not be created in the hash table,
    1726                 :  * and the AggStatePerGroup array may be NULL; therefore we need to add a null
    1727                 :  * pointer check to the expression. Then, when reading spilled data from a
    1728                 :  * tape, we change the outer slot type to be a fixed minimal tuple slot.
    1729                 :  *
    1730                 :  * It would be wasteful to recompile every time, so cache the compiled
    1731                 :  * expressions in the AggStatePerPhase, and reuse when appropriate.
    1732                 :  */
    1733                 : static void
    1734 GIC       64656 : hashagg_recompile_expressions(AggState *aggstate, bool minslot, bool nullcheck)
    1735                 : {
    1736                 :     AggStatePerPhase phase;
    1737           64656 :     int         i = minslot ? 1 : 0;
    1738           64656 :     int         j = nullcheck ? 1 : 0;
    1739                 : 
    1740           64656 :     Assert(aggstate->aggstrategy == AGG_HASHED ||
    1741                 :            aggstate->aggstrategy == AGG_MIXED);
    1742                 : 
    1743           64656 :     if (aggstate->aggstrategy == AGG_HASHED)
    1744           38370 :         phase = &aggstate->phases[0];
    1745 ECB             :     else                        /* AGG_MIXED */
    1746 GIC       26286 :         phase = &aggstate->phases[1];
    1747                 : 
    1748 CBC       64656 :     if (phase->evaltrans_cache[i][j] == NULL)
    1749 ECB             :     {
    1750 GIC          48 :         const TupleTableSlotOps *outerops = aggstate->ss.ps.outerops;
    1751 CBC          48 :         bool        outerfixed = aggstate->ss.ps.outeropsfixed;
    1752 GIC          48 :         bool        dohash = true;
    1753              48 :         bool        dosort = false;
    1754 ECB             : 
    1755                 :         /*
    1756                 :          * If minslot is true, that means we are processing a spilled batch
    1757                 :          * (inside agg_refill_hash_table()), and we must not advance the
    1758                 :          * sorted grouping sets.
    1759                 :          */
    1760 GIC          48 :         if (aggstate->aggstrategy == AGG_MIXED && !minslot)
    1761 CBC           6 :             dosort = true;
    1762 ECB             : 
    1763                 :         /* temporarily change the outerops while compiling the expression */
    1764 CBC          48 :         if (minslot)
    1765                 :         {
    1766 GIC          24 :             aggstate->ss.ps.outerops = &TTSOpsMinimalTuple;
    1767              24 :             aggstate->ss.ps.outeropsfixed = true;
    1768                 :         }
    1769                 : 
    1770              48 :         phase->evaltrans_cache[i][j] = ExecBuildAggTrans(aggstate, phase,
    1771 ECB             :                                                          dosort, dohash,
    1772                 :                                                          nullcheck);
    1773                 : 
    1774                 :         /* change back */
    1775 CBC          48 :         aggstate->ss.ps.outerops = outerops;
    1776 GIC          48 :         aggstate->ss.ps.outeropsfixed = outerfixed;
    1777 ECB             :     }
    1778                 : 
    1779 GIC       64656 :     phase->evaltrans = phase->evaltrans_cache[i][j];
    1780           64656 : }
    1781 ECB             : 
    1782                 : /*
    1783                 :  * Set limits that trigger spilling to avoid exceeding hash_mem. Consider the
    1784                 :  * number of partitions we expect to create (if we do spill).
    1785                 :  *
    1786                 :  * There are two limits: a memory limit, and also an ngroups limit. The
    1787                 :  * ngroups limit becomes important when we expect transition values to grow
    1788                 :  * substantially larger than the initial value.
    1789                 :  */
    1790                 : void
    1791 CBC       27408 : hash_agg_set_limits(double hashentrysize, double input_groups, int used_bits,
    1792                 :                     Size *mem_limit, uint64 *ngroups_limit,
    1793                 :                     int *num_partitions)
    1794                 : {
    1795                 :     int         npartitions;
    1796                 :     Size        partition_mem;
    1797 GIC       27408 :     Size        hash_mem_limit = get_hash_memory_limit();
    1798                 : 
    1799                 :     /* if not expected to spill, use all of hash_mem */
    1800           27408 :     if (input_groups * hashentrysize <= hash_mem_limit)
    1801                 :     {
    1802 CBC       26193 :         if (num_partitions != NULL)
    1803 GIC       13857 :             *num_partitions = 0;
    1804           26193 :         *mem_limit = hash_mem_limit;
    1805           26193 :         *ngroups_limit = hash_mem_limit / hashentrysize;
    1806           26193 :         return;
    1807                 :     }
    1808 ECB             : 
    1809                 :     /*
    1810                 :      * Calculate expected memory requirements for spilling, which is the size
    1811                 :      * of the buffers needed for all the tapes that need to be open at once.
    1812                 :      * Then, subtract that from the memory available for holding hash tables.
    1813                 :      */
    1814 CBC        1215 :     npartitions = hash_choose_num_partitions(input_groups,
    1815 ECB             :                                              hashentrysize,
    1816                 :                                              used_bits,
    1817                 :                                              NULL);
    1818 GIC        1215 :     if (num_partitions != NULL)
    1819              45 :         *num_partitions = npartitions;
    1820                 : 
    1821            1215 :     partition_mem =
    1822            1215 :         HASHAGG_READ_BUFFER_SIZE +
    1823                 :         HASHAGG_WRITE_BUFFER_SIZE * npartitions;
    1824                 : 
    1825 ECB             :     /*
    1826                 :      * Don't set the limit below 3/4 of hash_mem. In that case, we are at the
    1827                 :      * minimum number of partitions, so we aren't going to dramatically exceed
    1828                 :      * work mem anyway.
    1829                 :      */
    1830 CBC        1215 :     if (hash_mem_limit > 4 * partition_mem)
    1831 UIC           0 :         *mem_limit = hash_mem_limit - partition_mem;
    1832 ECB             :     else
    1833 CBC        1215 :         *mem_limit = hash_mem_limit * 0.75;
    1834                 : 
    1835 GIC        1215 :     if (*mem_limit > hashentrysize)
    1836            1215 :         *ngroups_limit = *mem_limit / hashentrysize;
    1837                 :     else
    1838 UIC           0 :         *ngroups_limit = 1;
    1839                 : }
    1840                 : 
    1841 ECB             : /*
    1842 EUB             :  * hash_agg_check_limits
    1843                 :  *
    1844 ECB             :  * After adding a new group to the hash table, check whether we need to enter
    1845                 :  * spill mode. Allocations may happen without adding new groups (for instance,
    1846                 :  * if the transition state size grows), so this check is imperfect.
    1847                 :  */
    1848                 : static void
    1849 GBC      268361 : hash_agg_check_limits(AggState *aggstate)
    1850                 : {
    1851 GIC      268361 :     uint64      ngroups = aggstate->hash_ngroups_current;
    1852          268361 :     Size        meta_mem = MemoryContextMemAllocated(aggstate->hash_metacxt,
    1853                 :                                                      true);
    1854          268361 :     Size        hashkey_mem = MemoryContextMemAllocated(aggstate->hashcontext->ecxt_per_tuple_memory,
    1855                 :                                                         true);
    1856                 : 
    1857                 :     /*
    1858                 :      * Don't spill unless there's at least one group in the hash table so we
    1859                 :      * can be sure to make progress even in edge cases.
    1860 ECB             :      */
    1861 GIC      268361 :     if (aggstate->hash_ngroups_current > 0 &&
    1862 CBC      268361 :         (meta_mem + hashkey_mem > aggstate->hash_mem_limit ||
    1863          255143 :          ngroups > aggstate->hash_ngroups_limit))
    1864                 :     {
    1865           13236 :         hash_agg_enter_spill_mode(aggstate);
    1866                 :     }
    1867 GIC      268361 : }
    1868                 : 
    1869                 : /*
    1870                 :  * Enter "spill mode", meaning that no new groups are added to any of the hash
    1871                 :  * tables. Tuples that would create a new group are instead spilled, and
    1872 ECB             :  * processed later.
    1873                 :  */
    1874                 : static void
    1875 GIC       13236 : hash_agg_enter_spill_mode(AggState *aggstate)
    1876 ECB             : {
    1877 GIC       13236 :     aggstate->hash_spill_mode = true;
    1878 CBC       13236 :     hashagg_recompile_expressions(aggstate, aggstate->table_filled, true);
    1879                 : 
    1880 GIC       13236 :     if (!aggstate->hash_ever_spilled)
    1881                 :     {
    1882              33 :         Assert(aggstate->hash_tapeset == NULL);
    1883              33 :         Assert(aggstate->hash_spills == NULL);
    1884                 : 
    1885              33 :         aggstate->hash_ever_spilled = true;
    1886 ECB             : 
    1887 GIC          33 :         aggstate->hash_tapeset = LogicalTapeSetCreate(true, NULL, -1);
    1888 ECB             : 
    1889 CBC          33 :         aggstate->hash_spills = palloc(sizeof(HashAggSpill) * aggstate->num_hashes);
    1890                 : 
    1891              96 :         for (int setno = 0; setno < aggstate->num_hashes; setno++)
    1892                 :         {
    1893              63 :             AggStatePerHash perhash = &aggstate->perhash[setno];
    1894              63 :             HashAggSpill *spill = &aggstate->hash_spills[setno];
    1895                 : 
    1896              63 :             hashagg_spill_init(spill, aggstate->hash_tapeset, 0,
    1897 GIC          63 :                                perhash->aggnode->numGroups,
    1898 ECB             :                                aggstate->hashentrysize);
    1899                 :         }
    1900                 :     }
    1901 GIC       13236 : }
    1902 ECB             : 
    1903                 : /*
    1904                 :  * Update metrics after filling the hash table.
    1905                 :  *
    1906                 :  * If reading from the outer plan, from_tape should be false; if reading from
    1907                 :  * another tape, from_tape should be true.
    1908                 :  */
    1909                 : static void
    1910 GIC       54765 : hash_agg_update_metrics(AggState *aggstate, bool from_tape, int npartitions)
    1911                 : {
    1912 ECB             :     Size        meta_mem;
    1913                 :     Size        hashkey_mem;
    1914                 :     Size        buffer_mem;
    1915                 :     Size        total_mem;
    1916                 : 
    1917 GIC       54765 :     if (aggstate->aggstrategy != AGG_MIXED &&
    1918           41571 :         aggstate->aggstrategy != AGG_HASHED)
    1919 UIC           0 :         return;
    1920                 : 
    1921 ECB             :     /* memory for the hash table itself */
    1922 GIC       54765 :     meta_mem = MemoryContextMemAllocated(aggstate->hash_metacxt, true);
    1923                 : 
    1924                 :     /* memory for the group keys and transition states */
    1925           54765 :     hashkey_mem = MemoryContextMemAllocated(aggstate->hashcontext->ecxt_per_tuple_memory, true);
    1926                 : 
    1927                 :     /* memory for read/write tape buffers, if spilled */
    1928 CBC       54765 :     buffer_mem = npartitions * HASHAGG_WRITE_BUFFER_SIZE;
    1929           54765 :     if (from_tape)
    1930 GBC       13506 :         buffer_mem += HASHAGG_READ_BUFFER_SIZE;
    1931                 : 
    1932                 :     /* update peak mem */
    1933 CBC       54765 :     total_mem = meta_mem + hashkey_mem + buffer_mem;
    1934 GIC       54765 :     if (total_mem > aggstate->hash_mem_peak)
    1935            3363 :         aggstate->hash_mem_peak = total_mem;
    1936 ECB             : 
    1937                 :     /* update disk usage */
    1938 GIC       54765 :     if (aggstate->hash_tapeset != NULL)
    1939 ECB             :     {
    1940 CBC       13539 :         uint64      disk_used = LogicalTapeSetBlocks(aggstate->hash_tapeset) * (BLCKSZ / 1024);
    1941 ECB             : 
    1942 GIC       13539 :         if (aggstate->hash_disk_used < disk_used)
    1943              30 :             aggstate->hash_disk_used = disk_used;
    1944 ECB             :     }
    1945                 : 
    1946                 :     /* update hashentrysize estimate based on contents */
    1947 GIC       54765 :     if (aggstate->hash_ngroups_current > 0)
    1948                 :     {
    1949 CBC       53531 :         aggstate->hashentrysize =
    1950 GIC       53531 :             sizeof(TupleHashEntryData) +
    1951 CBC       53531 :             (hashkey_mem / (double) aggstate->hash_ngroups_current);
    1952                 :     }
    1953 ECB             : }
    1954                 : 
    1955                 : /*
    1956                 :  * Choose a reasonable number of buckets for the initial hash table size.
    1957                 :  */
    1958                 : static long
    1959 GIC        3578 : hash_choose_num_buckets(double hashentrysize, long ngroups, Size memory)
    1960 ECB             : {
    1961                 :     long        max_nbuckets;
    1962 CBC        3578 :     long        nbuckets = ngroups;
    1963                 : 
    1964 GIC        3578 :     max_nbuckets = memory / hashentrysize;
    1965                 : 
    1966                 :     /*
    1967                 :      * Underestimating is better than overestimating. Too many buckets crowd
    1968                 :      * out space for group keys and transition state values.
    1969                 :      */
    1970 CBC        3578 :     max_nbuckets >>= 1;
    1971                 : 
    1972 GIC        3578 :     if (nbuckets > max_nbuckets)
    1973 CBC          36 :         nbuckets = max_nbuckets;
    1974                 : 
    1975            3578 :     return Max(nbuckets, 1);
    1976                 : }
    1977                 : 
    1978                 : /*
    1979                 :  * Determine the number of partitions to create when spilling, which will
    1980                 :  * always be a power of two. If log2_npartitions is non-NULL, set
    1981 ECB             :  * *log2_npartitions to the log2() of the number of partitions.
    1982                 :  */
    1983                 : static int
    1984 CBC        7533 : hash_choose_num_partitions(double input_groups, double hashentrysize,
    1985                 :                            int used_bits, int *log2_npartitions)
    1986 ECB             : {
    1987 GIC        7533 :     Size        hash_mem_limit = get_hash_memory_limit();
    1988                 :     double      partition_limit;
    1989                 :     double      mem_wanted;
    1990                 :     double      dpartitions;
    1991                 :     int         npartitions;
    1992                 :     int         partition_bits;
    1993                 : 
    1994                 :     /*
    1995 ECB             :      * Avoid creating so many partitions that the memory requirements of the
    1996                 :      * open partition files are greater than 1/4 of hash_mem.
    1997                 :      */
    1998 CBC        7533 :     partition_limit =
    1999 GIC        7533 :         (hash_mem_limit * 0.25 - HASHAGG_READ_BUFFER_SIZE) /
    2000                 :         HASHAGG_WRITE_BUFFER_SIZE;
    2001                 : 
    2002            7533 :     mem_wanted = HASHAGG_PARTITION_FACTOR * input_groups * hashentrysize;
    2003                 : 
    2004                 :     /* make enough partitions so that each one is likely to fit in memory */
    2005            7533 :     dpartitions = 1 + (mem_wanted / hash_mem_limit);
    2006                 : 
    2007            7533 :     if (dpartitions > partition_limit)
    2008            7512 :         dpartitions = partition_limit;
    2009 ECB             : 
    2010 CBC        7533 :     if (dpartitions < HASHAGG_MIN_PARTITIONS)
    2011 GIC        7533 :         dpartitions = HASHAGG_MIN_PARTITIONS;
    2012            7533 :     if (dpartitions > HASHAGG_MAX_PARTITIONS)
    2013 LBC           0 :         dpartitions = HASHAGG_MAX_PARTITIONS;
    2014                 : 
    2015                 :     /* HASHAGG_MAX_PARTITIONS limit makes this safe */
    2016 CBC        7533 :     npartitions = (int) dpartitions;
    2017                 : 
    2018 ECB             :     /* ceil(log2(npartitions)) */
    2019 CBC        7533 :     partition_bits = my_log2(npartitions);
    2020                 : 
    2021 ECB             :     /* make sure that we don't exhaust the hash bits */
    2022 CBC        7533 :     if (partition_bits + used_bits >= 32)
    2023 LBC           0 :         partition_bits = 32 - used_bits;
    2024 EUB             : 
    2025 GIC        7533 :     if (log2_npartitions != NULL)
    2026            6318 :         *log2_npartitions = partition_bits;
    2027 ECB             : 
    2028                 :     /* number of partitions will be a power of two */
    2029 GIC        7533 :     npartitions = 1 << partition_bits;
    2030 ECB             : 
    2031 GIC        7533 :     return npartitions;
    2032                 : }
    2033 ECB             : 
    2034 EUB             : /*
    2035                 :  * Initialize a freshly-created TupleHashEntry.
    2036 ECB             :  */
    2037                 : static void
    2038 GIC      268361 : initialize_hash_entry(AggState *aggstate, TupleHashTable hashtable,
    2039                 :                       TupleHashEntry entry)
    2040 ECB             : {
    2041                 :     AggStatePerGroup pergroup;
    2042                 :     int         transno;
    2043                 : 
    2044 GIC      268361 :     aggstate->hash_ngroups_current++;
    2045          268361 :     hash_agg_check_limits(aggstate);
    2046                 : 
    2047                 :     /* no need to allocate or initialize per-group state */
    2048          268361 :     if (aggstate->numtrans == 0)
    2049 CBC      151687 :         return;
    2050                 : 
    2051                 :     pergroup = (AggStatePerGroup)
    2052 GIC      116674 :         MemoryContextAlloc(hashtable->tablecxt,
    2053          116674 :                            sizeof(AggStatePerGroupData) * aggstate->numtrans);
    2054                 : 
    2055 CBC      116674 :     entry->additional = pergroup;
    2056 ECB             : 
    2057                 :     /*
    2058                 :      * Initialize aggregates for new tuple group, lookup_hash_entries()
    2059                 :      * already has selected the relevant grouping set.
    2060                 :      */
    2061 GIC      306667 :     for (transno = 0; transno < aggstate->numtrans; transno++)
    2062                 :     {
    2063 CBC      189993 :         AggStatePerTrans pertrans = &aggstate->pertrans[transno];
    2064          189993 :         AggStatePerGroup pergroupstate = &pergroup[transno];
    2065                 : 
    2066          189993 :         initialize_aggregate(aggstate, pertrans, pergroupstate);
    2067                 :     }
    2068                 : }
    2069                 : 
    2070                 : /*
    2071                 :  * Look up hash entries for the current tuple in all hashed grouping sets.
    2072 ECB             :  *
    2073                 :  * Be aware that lookup_hash_entry can reset the tmpcontext.
    2074                 :  *
    2075                 :  * Some entries may be left NULL if we are in "spill mode". The same tuple
    2076                 :  * will belong to different groups for each grouping set, so may match a group
    2077                 :  * already in memory for one set and match a group not in memory for another
    2078                 :  * set. When in "spill mode", the tuple will be spilled for each grouping set
    2079                 :  * where it doesn't match a group in memory.
    2080                 :  *
    2081                 :  * NB: It's possible to spill the same tuple for several different grouping
    2082                 :  * sets. This may seem wasteful, but it's actually a trade-off: if we spill
    2083                 :  * the tuple multiple times for multiple grouping sets, it can be partitioned
    2084                 :  * for each grouping set, making the refilling of the hash table very
    2085                 :  * efficient.
    2086                 :  */
    2087                 : static void
    2088 GIC     2694173 : lookup_hash_entries(AggState *aggstate)
    2089                 : {
    2090         2694173 :     AggStatePerGroup *pergroup = aggstate->hash_pergroup;
    2091         2694173 :     TupleTableSlot *outerslot = aggstate->tmpcontext->ecxt_outertuple;
    2092                 :     int         setno;
    2093                 : 
    2094         5455494 :     for (setno = 0; setno < aggstate->num_hashes; setno++)
    2095                 :     {
    2096         2761321 :         AggStatePerHash perhash = &aggstate->perhash[setno];
    2097         2761321 :         TupleHashTable hashtable = perhash->hashtable;
    2098         2761321 :         TupleTableSlot *hashslot = perhash->hashslot;
    2099 ECB             :         TupleHashEntry entry;
    2100                 :         uint32      hash;
    2101 CBC     2761321 :         bool        isnew = false;
    2102 ECB             :         bool       *p_isnew;
    2103                 : 
    2104                 :         /* if hash table already spilled, don't create new entries */
    2105 CBC     2761321 :         p_isnew = aggstate->hash_spill_mode ? NULL : &isnew;
    2106                 : 
    2107         2761321 :         select_current_set(aggstate, setno, true);
    2108         2761321 :         prepare_hash_slot(perhash,
    2109 ECB             :                           outerslot,
    2110                 :                           hashslot);
    2111                 : 
    2112 CBC     2761321 :         entry = LookupTupleHashEntry(hashtable, hashslot,
    2113                 :                                      p_isnew, &hash);
    2114                 : 
    2115 GIC     2761321 :         if (entry != NULL)
    2116 ECB             :         {
    2117 GIC     2644897 :             if (isnew)
    2118 CBC      220244 :                 initialize_hash_entry(aggstate, hashtable, entry);
    2119         2644897 :             pergroup[setno] = entry->additional;
    2120                 :         }
    2121                 :         else
    2122                 :         {
    2123          116424 :             HashAggSpill *spill = &aggstate->hash_spills[setno];
    2124 GIC      116424 :             TupleTableSlot *slot = aggstate->tmpcontext->ecxt_outertuple;
    2125                 : 
    2126 CBC      116424 :             if (spill->partitions == NULL)
    2127 UIC           0 :                 hashagg_spill_init(spill, aggstate->hash_tapeset, 0,
    2128 LBC           0 :                                    perhash->aggnode->numGroups,
    2129 ECB             :                                    aggstate->hashentrysize);
    2130                 : 
    2131 GIC      116424 :             hashagg_spill_tuple(aggstate, spill, slot, hash);
    2132          116424 :             pergroup[setno] = NULL;
    2133                 :         }
    2134 ECB             :     }
    2135 CBC     2694173 : }
    2136                 : 
    2137 ECB             : /*
    2138 EUB             :  * ExecAgg -
    2139                 :  *
    2140                 :  *    ExecAgg receives tuples from its outer subplan and aggregates over
    2141                 :  *    the appropriate attribute for each aggregate function use (Aggref
    2142 ECB             :  *    node) appearing in the targetlist or qual of the node.  The number
    2143                 :  *    of tuples to aggregate over depends on whether grouped or plain
    2144                 :  *    aggregation is selected.  In grouped aggregation, we produce a result
    2145                 :  *    row for each group; in plain aggregation there's a single result row
    2146                 :  *    for the whole query.  In either case, the value of each aggregate is
    2147                 :  *    stored in the expression context to be used when ExecProject evaluates
    2148                 :  *    the result tuple.
    2149                 :  */
    2150                 : static TupleTableSlot *
    2151 GIC      499340 : ExecAgg(PlanState *pstate)
    2152                 : {
    2153          499340 :     AggState   *node = castNode(AggState, pstate);
    2154          499340 :     TupleTableSlot *result = NULL;
    2155                 : 
    2156          499340 :     CHECK_FOR_INTERRUPTS();
    2157                 : 
    2158          499340 :     if (!node->agg_done)
    2159                 :     {
    2160                 :         /* Dispatch based on strategy */
    2161          445329 :         switch (node->phase->aggstrategy)
    2162 ECB             :         {
    2163 GIC      288377 :             case AGG_HASHED:
    2164 CBC      288377 :                 if (!node->table_filled)
    2165           41196 :                     agg_fill_hash_table(node);
    2166                 :                 /* FALLTHROUGH */
    2167 ECB             :             case AGG_MIXED:
    2168 GIC      302049 :                 result = agg_retrieve_hash_table(node);
    2169 CBC      302049 :                 break;
    2170 GIC      143280 :             case AGG_PLAIN:
    2171                 :             case AGG_SORTED:
    2172 CBC      143280 :                 result = agg_retrieve_direct(node);
    2173 GIC      143232 :                 break;
    2174 ECB             :         }
    2175                 : 
    2176 CBC      445281 :         if (!TupIsNull(result))
    2177 GIC      403289 :             return result;
    2178                 :     }
    2179 ECB             : 
    2180 CBC       96003 :     return NULL;
    2181 ECB             : }
    2182                 : 
    2183                 : /*
    2184                 :  * ExecAgg for non-hashed case
    2185                 :  */
    2186                 : static TupleTableSlot *
    2187 CBC      143280 : agg_retrieve_direct(AggState *aggstate)
    2188 ECB             : {
    2189 GIC      143280 :     Agg        *node = aggstate->phase->aggnode;
    2190                 :     ExprContext *econtext;
    2191 ECB             :     ExprContext *tmpcontext;
    2192                 :     AggStatePerAgg peragg;
    2193                 :     AggStatePerGroup *pergroups;
    2194                 :     TupleTableSlot *outerslot;
    2195                 :     TupleTableSlot *firstSlot;
    2196                 :     TupleTableSlot *result;
    2197 GIC      143280 :     bool        hasGroupingSets = aggstate->phase->numsets > 0;
    2198 CBC      143280 :     int         numGroupingSets = Max(aggstate->phase->numsets, 1);
    2199                 :     int         currentSet;
    2200 ECB             :     int         nextSetSize;
    2201                 :     int         numReset;
    2202                 :     int         i;
    2203                 : 
    2204                 :     /*
    2205                 :      * get state info from node
    2206                 :      *
    2207                 :      * econtext is the per-output-tuple expression context
    2208                 :      *
    2209                 :      * tmpcontext is the per-input-tuple expression context
    2210                 :      */
    2211 GIC      143280 :     econtext = aggstate->ss.ps.ps_ExprContext;
    2212          143280 :     tmpcontext = aggstate->tmpcontext;
    2213                 : 
    2214          143280 :     peragg = aggstate->peragg;
    2215          143280 :     pergroups = aggstate->pergroups;
    2216          143280 :     firstSlot = aggstate->ss.ss_ScanTupleSlot;
    2217                 : 
    2218                 :     /*
    2219                 :      * We loop retrieving groups until we find one matching
    2220                 :      * aggstate->ss.ps.qual
    2221                 :      *
    2222 ECB             :      * For grouping sets, we have the invariant that aggstate->projected_set
    2223                 :      * is either -1 (initial call) or the index (starting from 0) in
    2224                 :      * gset_lengths for the group we just completed (either by projecting a
    2225                 :      * row or by discarding it in the qual).
    2226                 :      */
    2227 CBC      178598 :     while (!aggstate->agg_done)
    2228                 :     {
    2229                 :         /*
    2230                 :          * Clear the per-output-tuple context for each group, as well as
    2231                 :          * aggcontext (which contains any pass-by-ref transvalues of the old
    2232                 :          * group).  Some aggregate functions store working state in child
    2233                 :          * contexts; those now get reset automatically without us needing to
    2234                 :          * do anything special.
    2235                 :          *
    2236                 :          * We use ReScanExprContext not just ResetExprContext because we want
    2237                 :          * any registered shutdown callbacks to be called.  That allows
    2238 ECB             :          * aggregate functions to ensure they've cleaned up any non-memory
    2239                 :          * resources.
    2240                 :          */
    2241 GIC      178537 :         ReScanExprContext(econtext);
    2242                 : 
    2243                 :         /*
    2244                 :          * Determine how many grouping sets need to be reset at this boundary.
    2245                 :          */
    2246          178537 :         if (aggstate->projected_set >= 0 &&
    2247          122845 :             aggstate->projected_set < numGroupingSets)
    2248          122842 :             numReset = aggstate->projected_set + 1;
    2249                 :         else
    2250           55695 :             numReset = numGroupingSets;
    2251                 : 
    2252 ECB             :         /*
    2253                 :          * numReset can change on a phase boundary, but that's OK; we want to
    2254                 :          * reset the contexts used in _this_ phase, and later, after possibly
    2255                 :          * changing phase, initialize the right number of aggregates for the
    2256                 :          * _new_ phase.
    2257                 :          */
    2258                 : 
    2259 CBC      368195 :         for (i = 0; i < numReset; i++)
    2260                 :         {
    2261          189658 :             ReScanExprContext(aggstate->aggcontexts[i]);
    2262                 :         }
    2263                 : 
    2264                 :         /*
    2265                 :          * Check if input is complete and there are no more groups to project
    2266                 :          * in this phase; move to next phase or mark as done.
    2267                 :          */
    2268 GIC      178537 :         if (aggstate->input_done == true &&
    2269             762 :             aggstate->projected_set >= (numGroupingSets - 1))
    2270 ECB             :         {
    2271 GIC         360 :             if (aggstate->current_phase < aggstate->numphases - 1)
    2272 ECB             :             {
    2273 GIC          93 :                 initialize_phase(aggstate, aggstate->current_phase + 1);
    2274              93 :                 aggstate->input_done = false;
    2275              93 :                 aggstate->projected_set = -1;
    2276              93 :                 numGroupingSets = Max(aggstate->phase->numsets, 1);
    2277              93 :                 node = aggstate->phase->aggnode;
    2278              93 :                 numReset = numGroupingSets;
    2279 ECB             :             }
    2280 CBC         267 :             else if (aggstate->aggstrategy == AGG_MIXED)
    2281                 :             {
    2282 ECB             :                 /*
    2283                 :                  * Mixed mode; we've output all the grouped stuff and have
    2284                 :                  * full hashtables, so switch to outputting those.
    2285                 :                  */
    2286 CBC          69 :                 initialize_phase(aggstate, 0);
    2287              69 :                 aggstate->table_filled = true;
    2288              69 :                 ResetTupleHashIterator(aggstate->perhash[0].hashtable,
    2289 ECB             :                                        &aggstate->perhash[0].hashiter);
    2290 GIC          69 :                 select_current_set(aggstate, 0, true);
    2291 CBC          69 :                 return agg_retrieve_hash_table(aggstate);
    2292                 :             }
    2293                 :             else
    2294                 :             {
    2295 GIC         198 :                 aggstate->agg_done = true;
    2296             198 :                 break;
    2297 ECB             :             }
    2298                 :         }
    2299                 : 
    2300                 :         /*
    2301                 :          * Get the number of columns in the next grouping set after the last
    2302                 :          * projected one (if any). This is the number of columns to compare to
    2303                 :          * see if we reached the boundary of that set too.
    2304                 :          */
    2305 GIC      178270 :         if (aggstate->projected_set >= 0 &&
    2306 CBC      122485 :             aggstate->projected_set < (numGroupingSets - 1))
    2307           13638 :             nextSetSize = aggstate->phase->gset_lengths[aggstate->projected_set + 1];
    2308                 :         else
    2309 GIC      164632 :             nextSetSize = 0;
    2310                 : 
    2311                 :         /*----------
    2312                 :          * If a subgroup for the current grouping set is present, project it.
    2313                 :          *
    2314                 :          * We have a new group if:
    2315                 :          *  - we're out of input but haven't projected all grouping sets
    2316 ECB             :          *    (checked above)
    2317                 :          * OR
    2318                 :          *    - we already projected a row that wasn't from the last grouping
    2319                 :          *      set
    2320                 :          *    AND
    2321                 :          *    - the next grouping set has at least one grouping column (since
    2322                 :          *      empty grouping sets project only once input is exhausted)
    2323                 :          *    AND
    2324                 :          *    - the previous and pending rows differ on the grouping columns
    2325                 :          *      of the next grouping set
    2326                 :          *----------
    2327                 :          */
    2328 GIC      178270 :         tmpcontext->ecxt_innertuple = econtext->ecxt_outertuple;
    2329          178270 :         if (aggstate->input_done ||
    2330          177868 :             (node->aggstrategy != AGG_PLAIN &&
    2331          122889 :              aggstate->projected_set != -1 &&
    2332          122083 :              aggstate->projected_set < (numGroupingSets - 1) &&
    2333            9970 :              nextSetSize > 0 &&
    2334            9970 :              !ExecQualAndReset(aggstate->phase->eqfunctions[nextSetSize - 1],
    2335                 :                                tmpcontext)))
    2336                 :         {
    2337            7069 :             aggstate->projected_set += 1;
    2338                 : 
    2339 CBC        7069 :             Assert(aggstate->projected_set < numGroupingSets);
    2340            7069 :             Assert(nextSetSize > 0 || aggstate->input_done);
    2341 ECB             :         }
    2342                 :         else
    2343                 :         {
    2344                 :             /*
    2345                 :              * We no longer care what group we just projected, the next
    2346                 :              * projection will always be the first (or only) grouping set
    2347                 :              * (unless the input proves to be empty).
    2348                 :              */
    2349 GIC      171201 :             aggstate->projected_set = 0;
    2350 ECB             : 
    2351                 :             /*
    2352                 :              * If we don't already have the first tuple of the new group,
    2353                 :              * fetch it from the outer plan.
    2354                 :              */
    2355 GIC      171201 :             if (aggstate->grp_firstTuple == NULL)
    2356                 :             {
    2357           55785 :                 outerslot = fetch_input_tuple(aggstate);
    2358           55776 :                 if (!TupIsNull(outerslot))
    2359                 :                 {
    2360 ECB             :                     /*
    2361                 :                      * Make a copy of the first input tuple; we will use this
    2362                 :                      * for comparisons (in group mode) and for projection.
    2363                 :                      */
    2364 GIC       50882 :                     aggstate->grp_firstTuple = ExecCopySlotHeapTuple(outerslot);
    2365                 :                 }
    2366 ECB             :                 else
    2367                 :                 {
    2368                 :                     /* outer plan produced no tuples at all */
    2369 CBC        4894 :                     if (hasGroupingSets)
    2370                 :                     {
    2371                 :                         /*
    2372                 :                          * If there was no input at all, we need to project
    2373                 :                          * rows only if there are grouping sets of size 0.
    2374                 :                          * Note that this implies that there can't be any
    2375 ECB             :                          * references to ungrouped Vars, which would otherwise
    2376                 :                          * cause issues with the empty output slot.
    2377                 :                          *
    2378                 :                          * XXX: This is no longer true, we currently deal with
    2379                 :                          * this in finalize_aggregates().
    2380                 :                          */
    2381 GIC          27 :                         aggstate->input_done = true;
    2382                 : 
    2383              39 :                         while (aggstate->phase->gset_lengths[aggstate->projected_set] > 0)
    2384                 :                         {
    2385              15 :                             aggstate->projected_set += 1;
    2386              15 :                             if (aggstate->projected_set >= numGroupingSets)
    2387                 :                             {
    2388                 :                                 /*
    2389                 :                                  * We can't set agg_done here because we might
    2390                 :                                  * have more phases to do, even though the
    2391                 :                                  * input is empty. So we need to restart the
    2392 ECB             :                                  * whole outer loop.
    2393                 :                                  */
    2394 CBC           3 :                                 break;
    2395                 :                             }
    2396 ECB             :                         }
    2397                 : 
    2398 GIC          27 :                         if (aggstate->projected_set >= numGroupingSets)
    2399               3 :                             continue;
    2400                 :                     }
    2401                 :                     else
    2402                 :                     {
    2403            4867 :                         aggstate->agg_done = true;
    2404                 :                         /* If we are grouping, we should produce no tuples too */
    2405 CBC        4867 :                         if (node->aggstrategy != AGG_PLAIN)
    2406 GIC          76 :                             return NULL;
    2407                 :                     }
    2408                 :                 }
    2409 ECB             :             }
    2410                 : 
    2411                 :             /*
    2412                 :              * Initialize working state for a new input tuple group.
    2413                 :              */
    2414 CBC      171113 :             initialize_aggregates(aggstate, pergroups, numReset);
    2415                 : 
    2416          171113 :             if (aggstate->grp_firstTuple != NULL)
    2417 ECB             :             {
    2418                 :                 /*
    2419                 :                  * Store the copied first input tuple in the tuple table slot
    2420                 :                  * reserved for it.  The tuple will be deleted when it is
    2421                 :                  * cleared from the slot.
    2422                 :                  */
    2423 GIC      166298 :                 ExecForceStoreHeapTuple(aggstate->grp_firstTuple,
    2424                 :                                         firstSlot, true);
    2425 CBC      166298 :                 aggstate->grp_firstTuple = NULL; /* don't keep two pointers */
    2426                 : 
    2427 ECB             :                 /* set up for first advance_aggregates call */
    2428 GIC      166298 :                 tmpcontext->ecxt_outertuple = firstSlot;
    2429                 : 
    2430                 :                 /*
    2431                 :                  * Process each outer-plan tuple, and then fetch the next one,
    2432                 :                  * until we exhaust the outer plan or cross a group boundary.
    2433                 :                  */
    2434 ECB             :                 for (;;)
    2435                 :                 {
    2436                 :                     /*
    2437                 :                      * During phase 1 only of a mixed agg, we need to update
    2438                 :                      * hashtables as well in advance_aggregates.
    2439                 :                      */
    2440 GIC     9882641 :                     if (aggstate->aggstrategy == AGG_MIXED &&
    2441           19022 :                         aggstate->current_phase == 1)
    2442                 :                     {
    2443           19022 :                         lookup_hash_entries(aggstate);
    2444                 :                     }
    2445                 : 
    2446                 :                     /* Advance the aggregates (or combine functions) */
    2447         9882641 :                     advance_aggregates(aggstate);
    2448                 : 
    2449                 :                     /* Reset per-input-tuple context after each tuple */
    2450         9882608 :                     ResetExprContext(tmpcontext);
    2451 ECB             : 
    2452 CBC     9882608 :                     outerslot = fetch_input_tuple(aggstate);
    2453 GIC     9882608 :                     if (TupIsNull(outerslot))
    2454 ECB             :                     {
    2455                 :                         /* no more outer-plan tuples available */
    2456                 : 
    2457                 :                         /* if we built hash tables, finalize any spills */
    2458 CBC       50846 :                         if (aggstate->aggstrategy == AGG_MIXED &&
    2459 GIC          63 :                             aggstate->current_phase == 1)
    2460              63 :                             hashagg_finish_initial_spills(aggstate);
    2461 ECB             : 
    2462 GIC       50846 :                         if (hasGroupingSets)
    2463 ECB             :                         {
    2464 CBC         333 :                             aggstate->input_done = true;
    2465 GIC         333 :                             break;
    2466                 :                         }
    2467                 :                         else
    2468                 :                         {
    2469 CBC       50513 :                             aggstate->agg_done = true;
    2470           50513 :                             break;
    2471 ECB             :                         }
    2472                 :                     }
    2473                 :                     /* set up for next advance_aggregates call */
    2474 GIC     9831762 :                     tmpcontext->ecxt_outertuple = outerslot;
    2475 ECB             : 
    2476                 :                     /*
    2477                 :                      * If we are grouping, check whether we've crossed a group
    2478                 :                      * boundary.
    2479                 :                      */
    2480 GNC     9831762 :                     if (node->aggstrategy != AGG_PLAIN && node->numCols > 0)
    2481 ECB             :                     {
    2482 GIC      887984 :                         tmpcontext->ecxt_innertuple = firstSlot;
    2483          887984 :                         if (!ExecQual(aggstate->phase->eqfunctions[node->numCols - 1],
    2484                 :                                       tmpcontext))
    2485 ECB             :                         {
    2486 GIC      115419 :                             aggstate->grp_firstTuple = ExecCopySlotHeapTuple(outerslot);
    2487          115419 :                             break;
    2488                 :                         }
    2489                 :                     }
    2490                 :                 }
    2491 ECB             :             }
    2492                 : 
    2493                 :             /*
    2494                 :              * Use the representative input tuple for any references to
    2495                 :              * non-aggregated input columns in aggregate direct args, the node
    2496                 :              * qual, and the tlist.  (If we are not grouping, and there are no
    2497                 :              * input rows at all, we will come here with an empty firstSlot
    2498                 :              * ... but if not grouping, there can't be any references to
    2499                 :              * non-aggregated input columns, so no problem.)
    2500                 :              */
    2501 GIC      171080 :             econtext->ecxt_outertuple = firstSlot;
    2502                 :         }
    2503                 : 
    2504          178149 :         Assert(aggstate->projected_set >= 0);
    2505                 : 
    2506          178149 :         currentSet = aggstate->projected_set;
    2507                 : 
    2508          178149 :         prepare_projection_slot(aggstate, econtext->ecxt_outertuple, currentSet);
    2509                 : 
    2510          178149 :         select_current_set(aggstate, currentSet, false);
    2511                 : 
    2512 CBC      178149 :         finalize_aggregates(aggstate,
    2513                 :                             peragg,
    2514 GIC      178149 :                             pergroups[currentSet]);
    2515 ECB             : 
    2516                 :         /*
    2517                 :          * If there's no row to project right now, we must continue rather
    2518                 :          * than returning a null since there might be more groups.
    2519                 :          */
    2520 GIC      178143 :         result = project_aggregates(aggstate);
    2521 CBC      178143 :         if (result)
    2522 GIC      142828 :             return result;
    2523 ECB             :     }
    2524                 : 
    2525                 :     /* No more groups */
    2526 GIC         259 :     return NULL;
    2527                 : }
    2528                 : 
    2529                 : /*
    2530                 :  * ExecAgg for hashed case: read input and build hash table
    2531 ECB             :  */
    2532                 : static void
    2533 CBC       41196 : agg_fill_hash_table(AggState *aggstate)
    2534                 : {
    2535                 :     TupleTableSlot *outerslot;
    2536 GIC       41196 :     ExprContext *tmpcontext = aggstate->tmpcontext;
    2537 ECB             : 
    2538                 :     /*
    2539                 :      * Process each outer-plan tuple, and then fetch the next one, until we
    2540                 :      * exhaust the outer plan.
    2541                 :      */
    2542                 :     for (;;)
    2543                 :     {
    2544 CBC     2716347 :         outerslot = fetch_input_tuple(aggstate);
    2545 GIC     2716347 :         if (TupIsNull(outerslot))
    2546                 :             break;
    2547 ECB             : 
    2548                 :         /* set up for lookup_hash_entries and advance_aggregates */
    2549 GIC     2675151 :         tmpcontext->ecxt_outertuple = outerslot;
    2550                 : 
    2551                 :         /* Find or build hashtable entries */
    2552         2675151 :         lookup_hash_entries(aggstate);
    2553                 : 
    2554                 :         /* Advance the aggregates (or combine functions) */
    2555 CBC     2675151 :         advance_aggregates(aggstate);
    2556 ECB             : 
    2557                 :         /*
    2558                 :          * Reset per-input-tuple context after each tuple, but note that the
    2559                 :          * hash lookups do this too
    2560                 :          */
    2561 GIC     2675151 :         ResetExprContext(aggstate->tmpcontext);
    2562                 :     }
    2563 ECB             : 
    2564                 :     /* finalize spills, if any */
    2565 GIC       41196 :     hashagg_finish_initial_spills(aggstate);
    2566 ECB             : 
    2567 GIC       41196 :     aggstate->table_filled = true;
    2568                 :     /* Initialize to walk the first hash table */
    2569           41196 :     select_current_set(aggstate, 0, true);
    2570           41196 :     ResetTupleHashIterator(aggstate->perhash[0].hashtable,
    2571                 :                            &aggstate->perhash[0].hashiter);
    2572 CBC       41196 : }
    2573                 : 
    2574                 : /*
    2575                 :  * If any data was spilled during hash aggregation, reset the hash table and
    2576 ECB             :  * reprocess one batch of spilled data. After reprocessing a batch, the hash
    2577                 :  * table will again contain data, ready to be consumed by
    2578                 :  * agg_retrieve_hash_table_in_memory().
    2579                 :  *
    2580                 :  * Should only be called after all in memory hash table entries have been
    2581                 :  * finalized and emitted.
    2582                 :  *
    2583                 :  * Return false when input is exhausted and there's no more work to be done;
    2584                 :  * otherwise return true.
    2585                 :  */
    2586                 : static bool
    2587 GIC       55163 : agg_refill_hash_table(AggState *aggstate)
    2588                 : {
    2589                 :     HashAggBatch *batch;
    2590                 :     AggStatePerHash perhash;
    2591                 :     HashAggSpill spill;
    2592           55163 :     LogicalTapeSet *tapeset = aggstate->hash_tapeset;
    2593           55163 :     bool        spill_initialized = false;
    2594                 : 
    2595           55163 :     if (aggstate->hash_batches == NIL)
    2596           41657 :         return false;
    2597                 : 
    2598 ECB             :     /* hash_batches is a stack, with the top item at the end of the list */
    2599 GIC       13506 :     batch = llast(aggstate->hash_batches);
    2600           13506 :     aggstate->hash_batches = list_delete_last(aggstate->hash_batches);
    2601                 : 
    2602           13506 :     hash_agg_set_limits(aggstate->hashentrysize, batch->input_card,
    2603 ECB             :                         batch->used_bits, &aggstate->hash_mem_limit,
    2604                 :                         &aggstate->hash_ngroups_limit, NULL);
    2605                 : 
    2606                 :     /*
    2607                 :      * Each batch only processes one grouping set; set the rest to NULL so
    2608                 :      * that advance_aggregates() knows to ignore them. We don't touch
    2609                 :      * pergroups for sorted grouping sets here, because they will be needed if
    2610                 :      * we rescan later. The expressions for sorted grouping sets will not be
    2611                 :      * evaluated after we recompile anyway.
    2612                 :      */
    2613 CBC      103782 :     MemSet(aggstate->hash_pergroup, 0,
    2614                 :            sizeof(AggStatePerGroup) * aggstate->num_hashes);
    2615                 : 
    2616                 :     /* free memory and reset hash tables */
    2617 GIC       13506 :     ReScanExprContext(aggstate->hashcontext);
    2618          103782 :     for (int setno = 0; setno < aggstate->num_hashes; setno++)
    2619           90276 :         ResetTupleHashTable(aggstate->perhash[setno].hashtable);
    2620                 : 
    2621           13506 :     aggstate->hash_ngroups_current = 0;
    2622                 : 
    2623                 :     /*
    2624 ECB             :      * In AGG_MIXED mode, hash aggregation happens in phase 1 and the output
    2625                 :      * happens in phase 0. So, we switch to phase 1 when processing a batch,
    2626                 :      * and back to phase 0 after the batch is done.
    2627                 :      */
    2628 CBC       13506 :     Assert(aggstate->current_phase == 0);
    2629           13506 :     if (aggstate->phase->aggstrategy == AGG_MIXED)
    2630 ECB             :     {
    2631 GIC       13131 :         aggstate->current_phase = 1;
    2632 CBC       13131 :         aggstate->phase = &aggstate->phases[aggstate->current_phase];
    2633                 :     }
    2634                 : 
    2635 GIC       13506 :     select_current_set(aggstate, batch->setno, true);
    2636                 : 
    2637           13506 :     perhash = &aggstate->perhash[aggstate->current_set];
    2638                 : 
    2639 ECB             :     /*
    2640                 :      * Spilled tuples are always read back as MinimalTuples, which may be
    2641                 :      * different from the outer plan, so recompile the aggregate expressions.
    2642                 :      *
    2643                 :      * We still need the NULL check, because we are only processing one
    2644                 :      * grouping set at a time and the rest will be NULL.
    2645                 :      */
    2646 CBC       13506 :     hashagg_recompile_expressions(aggstate, true, true);
    2647                 : 
    2648 ECB             :     for (;;)
    2649 GIC      337596 :     {
    2650          351102 :         TupleTableSlot *spillslot = aggstate->hash_spill_rslot;
    2651          351102 :         TupleTableSlot *hashslot = perhash->hashslot;
    2652                 :         TupleHashEntry entry;
    2653                 :         MinimalTuple tuple;
    2654                 :         uint32      hash;
    2655          351102 :         bool        isnew = false;
    2656          351102 :         bool       *p_isnew = aggstate->hash_spill_mode ? NULL : &isnew;
    2657 ECB             : 
    2658 GIC      351102 :         CHECK_FOR_INTERRUPTS();
    2659                 : 
    2660 CBC      351102 :         tuple = hashagg_batch_read(batch, &hash);
    2661          351102 :         if (tuple == NULL)
    2662           13506 :             break;
    2663                 : 
    2664 GIC      337596 :         ExecStoreMinimalTuple(tuple, spillslot, true);
    2665          337596 :         aggstate->tmpcontext->ecxt_outertuple = spillslot;
    2666 ECB             : 
    2667 CBC      337596 :         prepare_hash_slot(perhash,
    2668 GIC      337596 :                           aggstate->tmpcontext->ecxt_outertuple,
    2669 ECB             :                           hashslot);
    2670 GIC      337596 :         entry = LookupTupleHashEntryHash(perhash->hashtable, hashslot,
    2671 ECB             :                                          p_isnew, hash);
    2672                 : 
    2673 CBC      337596 :         if (entry != NULL)
    2674                 :         {
    2675          116424 :             if (isnew)
    2676           48117 :                 initialize_hash_entry(aggstate, perhash->hashtable, entry);
    2677 GIC      116424 :             aggstate->hash_pergroup[batch->setno] = entry->additional;
    2678 CBC      116424 :             advance_aggregates(aggstate);
    2679 ECB             :         }
    2680                 :         else
    2681                 :         {
    2682 GIC      221172 :             if (!spill_initialized)
    2683                 :             {
    2684 ECB             :                 /*
    2685                 :                  * Avoid initializing the spill until we actually need it so
    2686                 :                  * that we don't assign tapes that will never be used.
    2687                 :                  */
    2688 CBC        6255 :                 spill_initialized = true;
    2689            6255 :                 hashagg_spill_init(&spill, tapeset, batch->used_bits,
    2690                 :                                    batch->input_card, aggstate->hashentrysize);
    2691                 :             }
    2692                 :             /* no memory for a new group, spill */
    2693          221172 :             hashagg_spill_tuple(aggstate, &spill, spillslot, hash);
    2694                 : 
    2695 GIC      221172 :             aggstate->hash_pergroup[batch->setno] = NULL;
    2696                 :         }
    2697                 : 
    2698                 :         /*
    2699 ECB             :          * Reset per-input-tuple context after each tuple, but note that the
    2700                 :          * hash lookups do this too
    2701                 :          */
    2702 GIC      337596 :         ResetExprContext(aggstate->tmpcontext);
    2703                 :     }
    2704 ECB             : 
    2705 GIC       13506 :     LogicalTapeClose(batch->input_tape);
    2706 ECB             : 
    2707                 :     /* change back to phase 0 */
    2708 GIC       13506 :     aggstate->current_phase = 0;
    2709           13506 :     aggstate->phase = &aggstate->phases[aggstate->current_phase];
    2710                 : 
    2711           13506 :     if (spill_initialized)
    2712                 :     {
    2713 CBC        6255 :         hashagg_spill_finish(aggstate, &spill, batch->setno);
    2714 GIC        6255 :         hash_agg_update_metrics(aggstate, true, spill.npartitions);
    2715                 :     }
    2716 ECB             :     else
    2717 GIC        7251 :         hash_agg_update_metrics(aggstate, true, 0);
    2718                 : 
    2719 CBC       13506 :     aggstate->hash_spill_mode = false;
    2720 ECB             : 
    2721                 :     /* prepare to walk the first hash table */
    2722 CBC       13506 :     select_current_set(aggstate, batch->setno, true);
    2723 GIC       13506 :     ResetTupleHashIterator(aggstate->perhash[batch->setno].hashtable,
    2724 ECB             :                            &aggstate->perhash[batch->setno].hashiter);
    2725                 : 
    2726 GIC       13506 :     pfree(batch);
    2727                 : 
    2728 CBC       13506 :     return true;
    2729                 : }
    2730 ECB             : 
    2731                 : /*
    2732                 :  * ExecAgg for hashed case: retrieving groups from hash table
    2733                 :  *
    2734                 :  * After exhausting in-memory tuples, also try refilling the hash table using
    2735                 :  * previously-spilled tuples. Only returns NULL after all in-memory and
    2736                 :  * spilled tuples are exhausted.
    2737                 :  */
    2738                 : static TupleTableSlot *
    2739 CBC      302118 : agg_retrieve_hash_table(AggState *aggstate)
    2740                 : {
    2741 GIC      302118 :     TupleTableSlot *result = NULL;
    2742                 : 
    2743          576085 :     while (result == NULL)
    2744                 :     {
    2745          315624 :         result = agg_retrieve_hash_table_in_memory(aggstate);
    2746          315624 :         if (result == NULL)
    2747                 :         {
    2748           55163 :             if (!agg_refill_hash_table(aggstate))
    2749                 :             {
    2750 CBC       41657 :                 aggstate->agg_done = true;
    2751 GIC       41657 :                 break;
    2752 ECB             :             }
    2753                 :         }
    2754                 :     }
    2755                 : 
    2756 CBC      302118 :     return result;
    2757 ECB             : }
    2758                 : 
    2759                 : /*
    2760                 :  * Retrieve the groups from the in-memory hash tables without considering any
    2761                 :  * spilled tuples.
    2762                 :  */
    2763                 : static TupleTableSlot *
    2764 GIC      315624 : agg_retrieve_hash_table_in_memory(AggState *aggstate)
    2765                 : {
    2766                 :     ExprContext *econtext;
    2767 ECB             :     AggStatePerAgg peragg;
    2768                 :     AggStatePerGroup pergroup;
    2769                 :     TupleHashEntryData *entry;
    2770                 :     TupleTableSlot *firstSlot;
    2771                 :     TupleTableSlot *result;
    2772                 :     AggStatePerHash perhash;
    2773                 : 
    2774                 :     /*
    2775                 :      * get state info from node.
    2776                 :      *
    2777                 :      * econtext is the per-output-tuple expression context.
    2778                 :      */
    2779 GIC      315624 :     econtext = aggstate->ss.ps.ps_ExprContext;
    2780          315624 :     peragg = aggstate->peragg;
    2781          315624 :     firstSlot = aggstate->ss.ss_ScanTupleSlot;
    2782                 : 
    2783                 :     /*
    2784                 :      * Note that perhash (and therefore anything accessed through it) can
    2785                 :      * change inside the loop, as we change between grouping sets.
    2786                 :      */
    2787          315624 :     perhash = &aggstate->perhash[aggstate->current_set];
    2788                 : 
    2789                 :     /*
    2790 ECB             :      * We loop retrieving groups until we find one satisfying
    2791                 :      * aggstate->ss.ps.qual
    2792                 :      */
    2793                 :     for (;;)
    2794 GIC       58946 :     {
    2795          374570 :         TupleTableSlot *hashslot = perhash->hashslot;
    2796                 :         int         i;
    2797                 : 
    2798 CBC      374570 :         CHECK_FOR_INTERRUPTS();
    2799                 : 
    2800                 :         /*
    2801                 :          * Find the next entry in the hash table
    2802                 :          */
    2803 GIC      374570 :         entry = ScanTupleHashTable(perhash->hashtable, &perhash->hashiter);
    2804          374570 :         if (entry == NULL)
    2805 ECB             :         {
    2806 CBC      105279 :             int         nextset = aggstate->current_set + 1;
    2807                 : 
    2808 GIC      105279 :             if (nextset < aggstate->num_hashes)
    2809 ECB             :             {
    2810                 :                 /*
    2811                 :                  * Switch to next grouping set, reinitialize, and restart the
    2812                 :                  * loop.
    2813                 :                  */
    2814 CBC       50116 :                 select_current_set(aggstate, nextset, true);
    2815 ECB             : 
    2816 GIC       50116 :                 perhash = &aggstate->perhash[aggstate->current_set];
    2817 ECB             : 
    2818 GIC       50116 :                 ResetTupleHashIterator(perhash->hashtable, &perhash->hashiter);
    2819 ECB             : 
    2820 GIC       50116 :                 continue;
    2821                 :             }
    2822                 :             else
    2823                 :             {
    2824           55163 :                 return NULL;
    2825 ECB             :             }
    2826                 :         }
    2827                 : 
    2828                 :         /*
    2829                 :          * Clear the per-output-tuple context for each group
    2830                 :          *
    2831                 :          * We intentionally don't use ReScanExprContext here; if any aggs have
    2832                 :          * registered shutdown callbacks, they mustn't be called yet, since we
    2833                 :          * might not be done with that agg.
    2834                 :          */
    2835 CBC      269291 :         ResetExprContext(econtext);
    2836                 : 
    2837                 :         /*
    2838                 :          * Transform representative tuple back into one with the right
    2839                 :          * columns.
    2840                 :          */
    2841 GIC      269291 :         ExecStoreMinimalTuple(entry->firstTuple, hashslot, false);
    2842          269291 :         slot_getallattrs(hashslot);
    2843                 : 
    2844          269291 :         ExecClearTuple(firstSlot);
    2845          269291 :         memset(firstSlot->tts_isnull, true,
    2846 CBC      269291 :                firstSlot->tts_tupleDescriptor->natts * sizeof(bool));
    2847                 : 
    2848 GIC      707202 :         for (i = 0; i < perhash->numhashGrpCols; i++)
    2849                 :         {
    2850          437911 :             int         varNumber = perhash->hashGrpColIdxInput[i] - 1;
    2851                 : 
    2852 CBC      437911 :             firstSlot->tts_values[varNumber] = hashslot->tts_values[i];
    2853          437911 :             firstSlot->tts_isnull[varNumber] = hashslot->tts_isnull[i];
    2854                 :         }
    2855          269291 :         ExecStoreVirtualTuple(firstSlot);
    2856 ECB             : 
    2857 CBC      269291 :         pergroup = (AggStatePerGroup) entry->additional;
    2858                 : 
    2859 ECB             :         /*
    2860                 :          * Use the representative input tuple for any references to
    2861                 :          * non-aggregated input columns in the qual and tlist.
    2862                 :          */
    2863 CBC      269291 :         econtext->ecxt_outertuple = firstSlot;
    2864 ECB             : 
    2865 GIC      269291 :         prepare_projection_slot(aggstate,
    2866 ECB             :                                 econtext->ecxt_outertuple,
    2867                 :                                 aggstate->current_set);
    2868                 : 
    2869 GIC      269291 :         finalize_aggregates(aggstate, peragg, pergroup);
    2870                 : 
    2871          269291 :         result = project_aggregates(aggstate);
    2872          269291 :         if (result)
    2873          260461 :             return result;
    2874 ECB             :     }
    2875                 : 
    2876                 :     /* No more groups */
    2877                 :     return NULL;
    2878                 : }
    2879                 : 
    2880                 : /*
    2881                 :  * hashagg_spill_init
    2882                 :  *
    2883                 :  * Called after we determined that spilling is necessary. Chooses the number
    2884                 :  * of partitions to create, and initializes them.
    2885                 :  */
    2886                 : static void
    2887 GIC        6318 : hashagg_spill_init(HashAggSpill *spill, LogicalTapeSet *tapeset, int used_bits,
    2888                 :                    double input_groups, double hashentrysize)
    2889                 : {
    2890                 :     int         npartitions;
    2891                 :     int         partition_bits;
    2892                 : 
    2893            6318 :     npartitions = hash_choose_num_partitions(input_groups, hashentrysize,
    2894                 :                                              used_bits, &partition_bits);
    2895                 : 
    2896            6318 :     spill->partitions = palloc0(sizeof(LogicalTape *) * npartitions);
    2897            6318 :     spill->ntuples = palloc0(sizeof(int64) * npartitions);
    2898 CBC        6318 :     spill->hll_card = palloc0(sizeof(hyperLogLogState) * npartitions);
    2899                 : 
    2900 GIC       31590 :     for (int i = 0; i < npartitions; i++)
    2901           25272 :         spill->partitions[i] = LogicalTapeCreate(tapeset);
    2902                 : 
    2903            6318 :     spill->shift = 32 - used_bits - partition_bits;
    2904 CBC        6318 :     spill->mask = (npartitions - 1) << spill->shift;
    2905 GIC        6318 :     spill->npartitions = npartitions;
    2906                 : 
    2907 CBC       31590 :     for (int i = 0; i < npartitions; i++)
    2908           25272 :         initHyperLogLog(&spill->hll_card[i], HASHAGG_HLL_BIT_WIDTH);
    2909            6318 : }
    2910                 : 
    2911 ECB             : /*
    2912                 :  * hashagg_spill_tuple
    2913                 :  *
    2914                 :  * No room for new groups in the hash table. Save for later in the appropriate
    2915                 :  * partition.
    2916                 :  */
    2917                 : static Size
    2918 CBC      337596 : hashagg_spill_tuple(AggState *aggstate, HashAggSpill *spill,
    2919 ECB             :                     TupleTableSlot *inputslot, uint32 hash)
    2920                 : {
    2921                 :     TupleTableSlot *spillslot;
    2922                 :     int         partition;
    2923                 :     MinimalTuple tuple;
    2924                 :     LogicalTape *tape;
    2925 GIC      337596 :     int         total_written = 0;
    2926                 :     bool        shouldFree;
    2927                 : 
    2928          337596 :     Assert(spill->partitions != NULL);
    2929 ECB             : 
    2930                 :     /* spill only attributes that we actually need */
    2931 GIC      337596 :     if (!aggstate->all_cols_needed)
    2932                 :     {
    2933            2454 :         spillslot = aggstate->hash_spill_wslot;
    2934            2454 :         slot_getsomeattrs(inputslot, aggstate->max_colno_needed);
    2935            2454 :         ExecClearTuple(spillslot);
    2936 CBC        7362 :         for (int i = 0; i < spillslot->tts_tupleDescriptor->natts; i++)
    2937                 :         {
    2938 GIC        4908 :             if (bms_is_member(i + 1, aggstate->colnos_needed))
    2939 ECB             :             {
    2940 GIC        2454 :                 spillslot->tts_values[i] = inputslot->tts_values[i];
    2941            2454 :                 spillslot->tts_isnull[i] = inputslot->tts_isnull[i];
    2942 ECB             :             }
    2943                 :             else
    2944 CBC        2454 :                 spillslot->tts_isnull[i] = true;
    2945 ECB             :         }
    2946 CBC        2454 :         ExecStoreVirtualTuple(spillslot);
    2947 ECB             :     }
    2948                 :     else
    2949 CBC      335142 :         spillslot = inputslot;
    2950                 : 
    2951          337596 :     tuple = ExecFetchSlotMinimalTuple(spillslot, &shouldFree);
    2952 ECB             : 
    2953 GIC      337596 :     partition = (hash & spill->mask) >> spill->shift;
    2954          337596 :     spill->ntuples[partition]++;
    2955 ECB             : 
    2956                 :     /*
    2957                 :      * All hash values destined for a given partition have some bits in
    2958                 :      * common, which causes bad HLL cardinality estimates. Hash the hash to
    2959                 :      * get a more uniform distribution.
    2960                 :      */
    2961 GIC      337596 :     addHyperLogLog(&spill->hll_card[partition], hash_bytes_uint32(hash));
    2962 ECB             : 
    2963 GIC      337596 :     tape = spill->partitions[partition];
    2964 ECB             : 
    2965 GNC      337596 :     LogicalTapeWrite(tape, &hash, sizeof(uint32));
    2966 GIC      337596 :     total_written += sizeof(uint32);
    2967                 : 
    2968 GNC      337596 :     LogicalTapeWrite(tape, tuple, tuple->t_len);
    2969 GIC      337596 :     total_written += tuple->t_len;
    2970                 : 
    2971          337596 :     if (shouldFree)
    2972 CBC      116424 :         pfree(tuple);
    2973                 : 
    2974          337596 :     return total_written;
    2975                 : }
    2976 ECB             : 
    2977                 : /*
    2978                 :  * hashagg_batch_new
    2979                 :  *
    2980                 :  * Construct a HashAggBatch item, which represents one iteration of HashAgg to
    2981                 :  * be done.
    2982                 :  */
    2983                 : static HashAggBatch *
    2984 GIC       13506 : hashagg_batch_new(LogicalTape *input_tape, int setno,
    2985 ECB             :                   int64 input_tuples, double input_card, int used_bits)
    2986                 : {
    2987 GIC       13506 :     HashAggBatch *batch = palloc0(sizeof(HashAggBatch));
    2988                 : 
    2989           13506 :     batch->setno = setno;
    2990           13506 :     batch->used_bits = used_bits;
    2991           13506 :     batch->input_tape = input_tape;
    2992           13506 :     batch->input_tuples = input_tuples;
    2993           13506 :     batch->input_card = input_card;
    2994                 : 
    2995 CBC       13506 :     return batch;
    2996                 : }
    2997                 : 
    2998 ECB             : /*
    2999                 :  * read_spilled_tuple
    3000                 :  *      read the next tuple from a batch's tape.  Return NULL if no more.
    3001                 :  */
    3002                 : static MinimalTuple
    3003 CBC      351102 : hashagg_batch_read(HashAggBatch *batch, uint32 *hashp)
    3004 ECB             : {
    3005 GIC      351102 :     LogicalTape *tape = batch->input_tape;
    3006 ECB             :     MinimalTuple tuple;
    3007                 :     uint32      t_len;
    3008                 :     size_t      nread;
    3009                 :     uint32      hash;
    3010                 : 
    3011 GIC      351102 :     nread = LogicalTapeRead(tape, &hash, sizeof(uint32));
    3012          351102 :     if (nread == 0)
    3013           13506 :         return NULL;
    3014 CBC      337596 :     if (nread != sizeof(uint32))
    3015 UIC           0 :         ereport(ERROR,
    3016 ECB             :                 (errcode_for_file_access(),
    3017                 :                  errmsg("unexpected EOF for tape %p: requested %zu bytes, read %zu bytes",
    3018                 :                         tape, sizeof(uint32), nread)));
    3019 GIC      337596 :     if (hashp != NULL)
    3020          337596 :         *hashp = hash;
    3021                 : 
    3022 CBC      337596 :     nread = LogicalTapeRead(tape, &t_len, sizeof(t_len));
    3023          337596 :     if (nread != sizeof(uint32))
    3024 LBC           0 :         ereport(ERROR,
    3025 ECB             :                 (errcode_for_file_access(),
    3026 EUB             :                  errmsg("unexpected EOF for tape %p: requested %zu bytes, read %zu bytes",
    3027                 :                         tape, sizeof(uint32), nread)));
    3028                 : 
    3029 GIC      337596 :     tuple = (MinimalTuple) palloc(t_len);
    3030 CBC      337596 :     tuple->t_len = t_len;
    3031 ECB             : 
    3032 GIC      337596 :     nread = LogicalTapeRead(tape,
    3033                 :                             (char *) tuple + sizeof(uint32),
    3034 ECB             :                             t_len - sizeof(uint32));
    3035 GBC      337596 :     if (nread != t_len - sizeof(uint32))
    3036 UIC           0 :         ereport(ERROR,
    3037                 :                 (errcode_for_file_access(),
    3038                 :                  errmsg("unexpected EOF for tape %p: requested %zu bytes, read %zu bytes",
    3039                 :                         tape, t_len - sizeof(uint32), nread)));
    3040 ECB             : 
    3041 CBC      337596 :     return tuple;
    3042                 : }
    3043 ECB             : 
    3044                 : /*
    3045                 :  * hashagg_finish_initial_spills
    3046                 :  *
    3047 EUB             :  * After a HashAggBatch has been processed, it may have spilled tuples to
    3048                 :  * disk. If so, turn the spilled partitions into new batches that must later
    3049                 :  * be executed.
    3050                 :  */
    3051                 : static void
    3052 CBC       41259 : hashagg_finish_initial_spills(AggState *aggstate)
    3053                 : {
    3054                 :     int         setno;
    3055 GIC       41259 :     int         total_npartitions = 0;
    3056                 : 
    3057           41259 :     if (aggstate->hash_spills != NULL)
    3058                 :     {
    3059              96 :         for (setno = 0; setno < aggstate->num_hashes; setno++)
    3060                 :         {
    3061              63 :             HashAggSpill *spill = &aggstate->hash_spills[setno];
    3062                 : 
    3063 CBC          63 :             total_npartitions += spill->npartitions;
    3064 GIC          63 :             hashagg_spill_finish(aggstate, spill, setno);
    3065                 :         }
    3066 ECB             : 
    3067                 :         /*
    3068                 :          * We're not processing tuples from outer plan any more; only
    3069                 :          * processing batches of spilled tuples. The initial spill structures
    3070                 :          * are no longer needed.
    3071                 :          */
    3072 CBC          33 :         pfree(aggstate->hash_spills);
    3073 GIC          33 :         aggstate->hash_spills = NULL;
    3074 ECB             :     }
    3075                 : 
    3076 GIC       41259 :     hash_agg_update_metrics(aggstate, false, total_npartitions);
    3077           41259 :     aggstate->hash_spill_mode = false;
    3078           41259 : }
    3079                 : 
    3080                 : /*
    3081                 :  * hashagg_spill_finish
    3082                 :  *
    3083 ECB             :  * Transform spill partitions into new batches.
    3084                 :  */
    3085                 : static void
    3086 GIC        6318 : hashagg_spill_finish(AggState *aggstate, HashAggSpill *spill, int setno)
    3087 ECB             : {
    3088                 :     int         i;
    3089 CBC        6318 :     int         used_bits = 32 - spill->shift;
    3090                 : 
    3091 GIC        6318 :     if (spill->npartitions == 0)
    3092 UIC           0 :         return;                 /* didn't spill */
    3093                 : 
    3094 GIC       31590 :     for (i = 0; i < spill->npartitions; i++)
    3095                 :     {
    3096           25272 :         LogicalTape *tape = spill->partitions[i];
    3097 ECB             :         HashAggBatch *new_batch;
    3098                 :         double      cardinality;
    3099                 : 
    3100                 :         /* if the partition is empty, don't create a new batch of work */
    3101 GIC       25272 :         if (spill->ntuples[i] == 0)
    3102 CBC       11766 :             continue;
    3103 EUB             : 
    3104 GIC       13506 :         cardinality = estimateHyperLogLog(&spill->hll_card[i]);
    3105 CBC       13506 :         freeHyperLogLog(&spill->hll_card[i]);
    3106                 : 
    3107 ECB             :         /* rewinding frees the buffer while not in use */
    3108 GIC       13506 :         LogicalTapeRewindForRead(tape, HASHAGG_READ_BUFFER_SIZE);
    3109                 : 
    3110           13506 :         new_batch = hashagg_batch_new(tape, setno,
    3111           13506 :                                       spill->ntuples[i], cardinality,
    3112 ECB             :                                       used_bits);
    3113 CBC       13506 :         aggstate->hash_batches = lappend(aggstate->hash_batches, new_batch);
    3114 GIC       13506 :         aggstate->hash_batches_used++;
    3115 ECB             :     }
    3116                 : 
    3117 GIC        6318 :     pfree(spill->ntuples);
    3118            6318 :     pfree(spill->hll_card);
    3119 CBC        6318 :     pfree(spill->partitions);
    3120                 : }
    3121 ECB             : 
    3122                 : /*
    3123                 :  * Free resources related to a spilled HashAgg.
    3124                 :  */
    3125                 : static void
    3126 GIC       59102 : hashagg_reset_spill_state(AggState *aggstate)
    3127                 : {
    3128 ECB             :     /* free spills from initial pass */
    3129 CBC       59102 :     if (aggstate->hash_spills != NULL)
    3130 ECB             :     {
    3131                 :         int         setno;
    3132                 : 
    3133 UIC           0 :         for (setno = 0; setno < aggstate->num_hashes; setno++)
    3134                 :         {
    3135               0 :             HashAggSpill *spill = &aggstate->hash_spills[setno];
    3136                 : 
    3137 LBC           0 :             pfree(spill->ntuples);
    3138 UIC           0 :             pfree(spill->partitions);
    3139                 :         }
    3140 LBC           0 :         pfree(aggstate->hash_spills);
    3141 UIC           0 :         aggstate->hash_spills = NULL;
    3142                 :     }
    3143                 : 
    3144 EUB             :     /* free batches */
    3145 GIC       59102 :     list_free_deep(aggstate->hash_batches);
    3146 GBC       59102 :     aggstate->hash_batches = NIL;
    3147                 : 
    3148 EUB             :     /* close tape set */
    3149 GBC       59102 :     if (aggstate->hash_tapeset != NULL)
    3150                 :     {
    3151              33 :         LogicalTapeSetClose(aggstate->hash_tapeset);
    3152              33 :         aggstate->hash_tapeset = NULL;
    3153                 :     }
    3154 GIC       59102 : }
    3155                 : 
    3156 ECB             : 
    3157                 : /* -----------------
    3158                 :  * ExecInitAgg
    3159                 :  *
    3160                 :  *  Creates the run-time information for the agg node produced by the
    3161                 :  *  planner and initializes its outer subtree.
    3162                 :  *
    3163                 :  * -----------------
    3164                 :  */
    3165                 : AggState *
    3166 GIC       21244 : ExecInitAgg(Agg *node, EState *estate, int eflags)
    3167                 : {
    3168                 :     AggState   *aggstate;
    3169                 :     AggStatePerAgg peraggs;
    3170                 :     AggStatePerTrans pertransstates;
    3171                 :     AggStatePerGroup *pergroups;
    3172                 :     Plan       *outerPlan;
    3173                 :     ExprContext *econtext;
    3174                 :     TupleDesc   scanDesc;
    3175                 :     int         max_aggno;
    3176                 :     int         max_transno;
    3177 ECB             :     int         numaggrefs;
    3178                 :     int         numaggs;
    3179                 :     int         numtrans;
    3180                 :     int         phase;
    3181                 :     int         phaseidx;
    3182                 :     ListCell   *l;
    3183 GIC       21244 :     Bitmapset  *all_grouped_cols = NULL;
    3184           21244 :     int         numGroupingSets = 1;
    3185                 :     int         numPhases;
    3186                 :     int         numHashes;
    3187           21244 :     int         i = 0;
    3188           21244 :     int         j = 0;
    3189           38596 :     bool        use_hashing = (node->aggstrategy == AGG_HASHED ||
    3190           17352 :                                node->aggstrategy == AGG_MIXED);
    3191                 : 
    3192                 :     /* check for unsupported flags */
    3193           21244 :     Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
    3194 ECB             : 
    3195                 :     /*
    3196                 :      * create state structure
    3197                 :      */
    3198 CBC       21244 :     aggstate = makeNode(AggState);
    3199           21244 :     aggstate->ss.ps.plan = (Plan *) node;
    3200           21244 :     aggstate->ss.ps.state = estate;
    3201           21244 :     aggstate->ss.ps.ExecProcNode = ExecAgg;
    3202                 : 
    3203 GIC       21244 :     aggstate->aggs = NIL;
    3204 CBC       21244 :     aggstate->numaggs = 0;
    3205 GIC       21244 :     aggstate->numtrans = 0;
    3206           21244 :     aggstate->aggstrategy = node->aggstrategy;
    3207           21244 :     aggstate->aggsplit = node->aggsplit;
    3208           21244 :     aggstate->maxsets = 0;
    3209 CBC       21244 :     aggstate->projected_set = -1;
    3210           21244 :     aggstate->current_set = 0;
    3211           21244 :     aggstate->peragg = NULL;
    3212           21244 :     aggstate->pertrans = NULL;
    3213 GIC       21244 :     aggstate->curperagg = NULL;
    3214 CBC       21244 :     aggstate->curpertrans = NULL;
    3215           21244 :     aggstate->input_done = false;
    3216           21244 :     aggstate->agg_done = false;
    3217           21244 :     aggstate->pergroups = NULL;
    3218           21244 :     aggstate->grp_firstTuple = NULL;
    3219           21244 :     aggstate->sort_in = NULL;
    3220           21244 :     aggstate->sort_out = NULL;
    3221 ECB             : 
    3222                 :     /*
    3223                 :      * phases[0] always exists, but is dummy in sorted/plain mode
    3224                 :      */
    3225 CBC       21244 :     numPhases = (use_hashing ? 1 : 2);
    3226           21244 :     numHashes = (use_hashing ? 1 : 0);
    3227 ECB             : 
    3228                 :     /*
    3229                 :      * Calculate the maximum number of grouping sets in any phase; this
    3230                 :      * determines the size of some allocations.  Also calculate the number of
    3231                 :      * phases, since all hashed/mixed nodes contribute to only a single phase.
    3232                 :      */
    3233 GIC       21244 :     if (node->groupingSets)
    3234                 :     {
    3235             358 :         numGroupingSets = list_length(node->groupingSets);
    3236 ECB             : 
    3237 CBC         767 :         foreach(l, node->chain)
    3238                 :         {
    3239 GIC         409 :             Agg        *agg = lfirst(l);
    3240                 : 
    3241             409 :             numGroupingSets = Max(numGroupingSets,
    3242                 :                                   list_length(agg->groupingSets));
    3243                 : 
    3244 ECB             :             /*
    3245                 :              * additional AGG_HASHED aggs become part of phase 0, but all
    3246                 :              * others add an extra phase.
    3247                 :              */
    3248 CBC         409 :             if (agg->aggstrategy != AGG_HASHED)
    3249 GIC         212 :                 ++numPhases;
    3250 ECB             :             else
    3251 GIC         197 :                 ++numHashes;
    3252 ECB             :         }
    3253                 :     }
    3254                 : 
    3255 GIC       21244 :     aggstate->maxsets = numGroupingSets;
    3256           21244 :     aggstate->numphases = numPhases;
    3257                 : 
    3258           21244 :     aggstate->aggcontexts = (ExprContext **)
    3259 CBC       21244 :         palloc0(sizeof(ExprContext *) * numGroupingSets);
    3260 ECB             : 
    3261                 :     /*
    3262                 :      * Create expression contexts.  We need three or more, one for
    3263                 :      * per-input-tuple processing, one for per-output-tuple processing, one
    3264                 :      * for all the hashtables, and one for each grouping set.  The per-tuple
    3265                 :      * memory context of the per-grouping-set ExprContexts (aggcontexts)
    3266                 :      * replaces the standalone memory context formerly used to hold transition
    3267                 :      * values.  We cheat a little by using ExecAssignExprContext() to build
    3268                 :      * all of them.
    3269                 :      *
    3270                 :      * NOTE: the details of what is stored in aggcontexts and what is stored
    3271                 :      * in the regular per-query memory context are driven by a simple
    3272                 :      * decision: we want to reset the aggcontext at group boundaries (if not
    3273                 :      * hashing) and in ExecReScanAgg to recover no-longer-wanted space.
    3274                 :      */
    3275 GIC       21244 :     ExecAssignExprContext(estate, &aggstate->ss.ps);
    3276           21244 :     aggstate->tmpcontext = aggstate->ss.ps.ps_ExprContext;
    3277                 : 
    3278           42902 :     for (i = 0; i < numGroupingSets; ++i)
    3279                 :     {
    3280           21658 :         ExecAssignExprContext(estate, &aggstate->ss.ps);
    3281           21658 :         aggstate->aggcontexts[i] = aggstate->ss.ps.ps_ExprContext;
    3282                 :     }
    3283                 : 
    3284           21244 :     if (use_hashing)
    3285            3990 :         aggstate->hashcontext = CreateWorkExprContext(estate);
    3286 ECB             : 
    3287 CBC       21244 :     ExecAssignExprContext(estate, &aggstate->ss.ps);
    3288                 : 
    3289 ECB             :     /*
    3290                 :      * Initialize child nodes.
    3291                 :      *
    3292                 :      * If we are doing a hashed aggregation then the child plan does not need
    3293                 :      * to handle REWIND efficiently; see ExecReScanAgg.
    3294                 :      */
    3295 CBC       21244 :     if (node->aggstrategy == AGG_HASHED)
    3296            3892 :         eflags &= ~EXEC_FLAG_REWIND;
    3297 GIC       21244 :     outerPlan = outerPlan(node);
    3298 CBC       21244 :     outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
    3299                 : 
    3300                 :     /*
    3301                 :      * initialize source tuple type.
    3302                 :      */
    3303 GIC       21244 :     aggstate->ss.ps.outerops =
    3304           21244 :         ExecGetResultSlotOps(outerPlanState(&aggstate->ss),
    3305                 :                              &aggstate->ss.ps.outeropsfixed);
    3306 CBC       21244 :     aggstate->ss.ps.outeropsset = true;
    3307 ECB             : 
    3308 CBC       21244 :     ExecCreateScanSlotFromOuterPlan(estate, &aggstate->ss,
    3309 ECB             :                                     aggstate->ss.ps.outerops);
    3310 GIC       21244 :     scanDesc = aggstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
    3311                 : 
    3312                 :     /*
    3313                 :      * If there are more than two phases (including a potential dummy phase
    3314 ECB             :      * 0), input will be resorted using tuplesort. Need a slot for that.
    3315                 :      */
    3316 GIC       21244 :     if (numPhases > 2)
    3317 ECB             :     {
    3318 GIC          93 :         aggstate->sort_slot = ExecInitExtraTupleSlot(estate, scanDesc,
    3319 ECB             :                                                      &TTSOpsMinimalTuple);
    3320                 : 
    3321                 :         /*
    3322                 :          * The output of the tuplesort, and the output from the outer child
    3323                 :          * might not use the same type of slot. In most cases the child will
    3324                 :          * be a Sort, and thus return a TTSOpsMinimalTuple type slot - but the
    3325                 :          * input can also be presorted due an index, in which case it could be
    3326                 :          * a different type of slot.
    3327                 :          *
    3328                 :          * XXX: For efficiency it would be good to instead/additionally
    3329                 :          * generate expressions with corresponding settings of outerops* for
    3330                 :          * the individual phases - deforming is often a bottleneck for
    3331                 :          * aggregations with lots of rows per group. If there's multiple
    3332                 :          * sorts, we know that all but the first use TTSOpsMinimalTuple (via
    3333                 :          * the nodeAgg.c internal tuplesort).
    3334                 :          */
    3335 GIC          93 :         if (aggstate->ss.ps.outeropsfixed &&
    3336              93 :             aggstate->ss.ps.outerops != &TTSOpsMinimalTuple)
    3337              12 :             aggstate->ss.ps.outeropsfixed = false;
    3338                 :     }
    3339                 : 
    3340                 :     /*
    3341                 :      * Initialize result type, slot and projection.
    3342                 :      */
    3343           21244 :     ExecInitResultTupleSlotTL(&aggstate->ss.ps, &TTSOpsVirtual);
    3344           21244 :     ExecAssignProjectionInfo(&aggstate->ss.ps, NULL);
    3345                 : 
    3346 ECB             :     /*
    3347                 :      * initialize child expressions
    3348                 :      *
    3349                 :      * We expect the parser to have checked that no aggs contain other agg
    3350                 :      * calls in their arguments (and just to be sure, we verify it again while
    3351                 :      * initializing the plan node).  This would make no sense under SQL
    3352                 :      * semantics, and it's forbidden by the spec.  Because it is true, we
    3353                 :      * don't need to worry about evaluating the aggs in any particular order.
    3354                 :      *
    3355                 :      * Note: execExpr.c finds Aggrefs for us, and adds them to aggstate->aggs.
    3356                 :      * Aggrefs in the qual are found here; Aggrefs in the targetlist are found
    3357                 :      * during ExecAssignProjectionInfo, above.
    3358                 :      */
    3359 GIC       21244 :     aggstate->ss.ps.qual =
    3360           21244 :         ExecInitQual(node->plan.qual, (PlanState *) aggstate);
    3361                 : 
    3362                 :     /*
    3363                 :      * We should now have found all Aggrefs in the targetlist and quals.
    3364                 :      */
    3365           21244 :     numaggrefs = list_length(aggstate->aggs);
    3366           21244 :     max_aggno = -1;
    3367           21244 :     max_transno = -1;
    3368           43754 :     foreach(l, aggstate->aggs)
    3369                 :     {
    3370 CBC       22510 :         Aggref     *aggref = (Aggref *) lfirst(l);
    3371 ECB             : 
    3372 GIC       22510 :         max_aggno = Max(max_aggno, aggref->aggno);
    3373           22510 :         max_transno = Max(max_transno, aggref->aggtransno);
    3374                 :     }
    3375           21244 :     numaggs = max_aggno + 1;
    3376 CBC       21244 :     numtrans = max_transno + 1;
    3377 ECB             : 
    3378                 :     /*
    3379                 :      * For each phase, prepare grouping set data and fmgr lookup data for
    3380                 :      * compare functions.  Accumulate all_grouped_cols in passing.
    3381                 :      */
    3382 GIC       21244 :     aggstate->phases = palloc0(numPhases * sizeof(AggStatePerPhaseData));
    3383 ECB             : 
    3384 CBC       21244 :     aggstate->num_hashes = numHashes;
    3385 GIC       21244 :     if (numHashes)
    3386 ECB             :     {
    3387 CBC        3990 :         aggstate->perhash = palloc0(sizeof(AggStatePerHashData) * numHashes);
    3388 GIC        3990 :         aggstate->phases[0].numsets = 0;
    3389            3990 :         aggstate->phases[0].gset_lengths = palloc(numHashes * sizeof(int));
    3390            3990 :         aggstate->phases[0].grouped_cols = palloc(numHashes * sizeof(Bitmapset *));
    3391                 :     }
    3392                 : 
    3393 CBC       21244 :     phase = 0;
    3394 GIC       42897 :     for (phaseidx = 0; phaseidx <= list_length(node->chain); ++phaseidx)
    3395 ECB             :     {
    3396                 :         Agg        *aggnode;
    3397                 :         Sort       *sortnode;
    3398                 : 
    3399 CBC       21653 :         if (phaseidx > 0)
    3400 ECB             :         {
    3401 CBC         409 :             aggnode = list_nth_node(Agg, node->chain, phaseidx - 1);
    3402 GNC         409 :             sortnode = castNode(Sort, outerPlan(aggnode));
    3403                 :         }
    3404 ECB             :         else
    3405                 :         {
    3406 GIC       21244 :             aggnode = node;
    3407           21244 :             sortnode = NULL;
    3408                 :         }
    3409                 : 
    3410 CBC       21653 :         Assert(phase <= 1 || sortnode);
    3411                 : 
    3412           21653 :         if (aggnode->aggstrategy == AGG_HASHED
    3413           17564 :             || aggnode->aggstrategy == AGG_MIXED)
    3414 GIC        4187 :         {
    3415            4187 :             AggStatePerPhase phasedata = &aggstate->phases[0];
    3416                 :             AggStatePerHash perhash;
    3417 CBC        4187 :             Bitmapset  *cols = NULL;
    3418 ECB             : 
    3419 GIC        4187 :             Assert(phase == 0);
    3420            4187 :             i = phasedata->numsets++;
    3421 CBC        4187 :             perhash = &aggstate->perhash[i];
    3422                 : 
    3423 ECB             :             /* phase 0 always points to the "real" Agg in the hash case */
    3424 CBC        4187 :             phasedata->aggnode = node;
    3425            4187 :             phasedata->aggstrategy = node->aggstrategy;
    3426 ECB             : 
    3427                 :             /* but the actual Agg node representing this hash is saved here */
    3428 CBC        4187 :             perhash->aggnode = aggnode;
    3429                 : 
    3430            4187 :             phasedata->gset_lengths[i] = perhash->numCols = aggnode->numCols;
    3431 ECB             : 
    3432 CBC       12211 :             for (j = 0; j < aggnode->numCols; ++j)
    3433 GIC        8024 :                 cols = bms_add_member(cols, aggnode->grpColIdx[j]);
    3434                 : 
    3435 CBC        4187 :             phasedata->grouped_cols[i] = cols;
    3436 ECB             : 
    3437 GIC        4187 :             all_grouped_cols = bms_add_members(all_grouped_cols, cols);
    3438            4187 :             continue;
    3439 ECB             :         }
    3440                 :         else
    3441                 :         {
    3442 GIC       17466 :             AggStatePerPhase phasedata = &aggstate->phases[++phase];
    3443 ECB             :             int         num_sets;
    3444                 : 
    3445 GIC       17466 :             phasedata->numsets = num_sets = list_length(aggnode->groupingSets);
    3446 ECB             : 
    3447 GIC       17466 :             if (num_sets)
    3448 ECB             :             {
    3449 CBC         431 :                 phasedata->gset_lengths = palloc(num_sets * sizeof(int));
    3450 GIC         431 :                 phasedata->grouped_cols = palloc(num_sets * sizeof(Bitmapset *));
    3451                 : 
    3452             431 :                 i = 0;
    3453 CBC        1312 :                 foreach(l, aggnode->groupingSets)
    3454                 :                 {
    3455 GIC         881 :                     int         current_length = list_length(lfirst(l));
    3456 CBC         881 :                     Bitmapset  *cols = NULL;
    3457                 : 
    3458 ECB             :                     /* planner forces this to be correct */
    3459 GIC        1749 :                     for (j = 0; j < current_length; ++j)
    3460 CBC         868 :                         cols = bms_add_member(cols, aggnode->grpColIdx[j]);
    3461 ECB             : 
    3462 GIC         881 :                     phasedata->grouped_cols[i] = cols;
    3463 CBC         881 :                     phasedata->gset_lengths[i] = current_length;
    3464 ECB             : 
    3465 GIC         881 :                     ++i;
    3466 ECB             :                 }
    3467                 : 
    3468 GIC         431 :                 all_grouped_cols = bms_add_members(all_grouped_cols,
    3469             431 :                                                    phasedata->grouped_cols[0]);
    3470 ECB             :             }
    3471                 :             else
    3472                 :             {
    3473 CBC       17035 :                 Assert(phaseidx == 0);
    3474 ECB             : 
    3475 GIC       17035 :                 phasedata->gset_lengths = NULL;
    3476 CBC       17035 :                 phasedata->grouped_cols = NULL;
    3477                 :             }
    3478                 : 
    3479 ECB             :             /*
    3480                 :              * If we are grouping, precompute fmgr lookup data for inner loop.
    3481                 :              */
    3482 GIC       17466 :             if (aggnode->aggstrategy == AGG_SORTED)
    3483                 :             {
    3484                 :                 /*
    3485                 :                  * Build a separate function for each subset of columns that
    3486                 :                  * need to be compared.
    3487                 :                  */
    3488            1044 :                 phasedata->eqfunctions =
    3489 CBC        1044 :                     (ExprState **) palloc0(aggnode->numCols * sizeof(ExprState *));
    3490                 : 
    3491                 :                 /* for each grouping set */
    3492 GNC        1804 :                 for (int k = 0; k < phasedata->numsets; k++)
    3493                 :                 {
    3494             760 :                     int         length = phasedata->gset_lengths[k];
    3495 ECB             : 
    3496                 :                     /* nothing to do for empty grouping set */
    3497 GIC         760 :                     if (length == 0)
    3498             163 :                         continue;
    3499 ECB             : 
    3500                 :                     /* if we already had one of this length, it'll do */
    3501 CBC         597 :                     if (phasedata->eqfunctions[length - 1] != NULL)
    3502 GIC          69 :                         continue;
    3503                 : 
    3504 CBC         528 :                     phasedata->eqfunctions[length - 1] =
    3505             528 :                         execTuplesMatchPrepare(scanDesc,
    3506                 :                                                length,
    3507 GIC         528 :                                                aggnode->grpColIdx,
    3508 CBC         528 :                                                aggnode->grpOperators,
    3509             528 :                                                aggnode->grpCollations,
    3510                 :                                                (PlanState *) aggstate);
    3511 ECB             :                 }
    3512                 : 
    3513                 :                 /* and for all grouped columns, unless already computed */
    3514 GNC        1044 :                 if (aggnode->numCols > 0 &&
    3515             997 :                     phasedata->eqfunctions[aggnode->numCols - 1] == NULL)
    3516 ECB             :                 {
    3517 CBC         651 :                     phasedata->eqfunctions[aggnode->numCols - 1] =
    3518 GIC         651 :                         execTuplesMatchPrepare(scanDesc,
    3519                 :                                                aggnode->numCols,
    3520             651 :                                                aggnode->grpColIdx,
    3521             651 :                                                aggnode->grpOperators,
    3522 CBC         651 :                                                aggnode->grpCollations,
    3523 ECB             :                                                (PlanState *) aggstate);
    3524                 :                 }
    3525                 :             }
    3526                 : 
    3527 GIC       17466 :             phasedata->aggnode = aggnode;
    3528 CBC       17466 :             phasedata->aggstrategy = aggnode->aggstrategy;
    3529           17466 :             phasedata->sortnode = sortnode;
    3530 ECB             :         }
    3531                 :     }
    3532                 : 
    3533                 :     /*
    3534                 :      * Convert all_grouped_cols to a descending-order list.
    3535                 :      */
    3536 CBC       21244 :     i = -1;
    3537           29545 :     while ((i = bms_next_member(all_grouped_cols, i)) >= 0)
    3538 GIC        8301 :         aggstate->all_grouped_cols = lcons_int(i, aggstate->all_grouped_cols);
    3539                 : 
    3540                 :     /*
    3541                 :      * Set up aggregate-result storage in the output expr context, and also
    3542                 :      * allocate my private per-agg working storage
    3543                 :      */
    3544 CBC       21244 :     econtext = aggstate->ss.ps.ps_ExprContext;
    3545           21244 :     econtext->ecxt_aggvalues = (Datum *) palloc0(sizeof(Datum) * numaggs);
    3546           21244 :     econtext->ecxt_aggnulls = (bool *) palloc0(sizeof(bool) * numaggs);
    3547                 : 
    3548 GIC       21244 :     peraggs = (AggStatePerAgg) palloc0(sizeof(AggStatePerAggData) * numaggs);
    3549           21244 :     pertransstates = (AggStatePerTrans) palloc0(sizeof(AggStatePerTransData) * numtrans);
    3550                 : 
    3551           21244 :     aggstate->peragg = peraggs;
    3552 CBC       21244 :     aggstate->pertrans = pertransstates;
    3553 ECB             : 
    3554                 : 
    3555 GIC       21244 :     aggstate->all_pergroups =
    3556 CBC       21244 :         (AggStatePerGroup *) palloc0(sizeof(AggStatePerGroup)
    3557           21244 :                                      * (numGroupingSets + numHashes));
    3558 GIC       21244 :     pergroups = aggstate->all_pergroups;
    3559 ECB             : 
    3560 CBC       21244 :     if (node->aggstrategy != AGG_HASHED)
    3561                 :     {
    3562 GIC       35118 :         for (i = 0; i < numGroupingSets; i++)
    3563 ECB             :         {
    3564 CBC       17766 :             pergroups[i] = (AggStatePerGroup) palloc0(sizeof(AggStatePerGroupData)
    3565 ECB             :                                                       * numaggs);
    3566                 :         }
    3567                 : 
    3568 CBC       17352 :         aggstate->pergroups = pergroups;
    3569 GIC       17352 :         pergroups += numGroupingSets;
    3570 ECB             :     }
    3571                 : 
    3572                 :     /*
    3573                 :      * Hashing can only appear in the initial phase.
    3574                 :      */
    3575 GIC       21244 :     if (use_hashing)
    3576 ECB             :     {
    3577 CBC        3990 :         Plan       *outerplan = outerPlan(node);
    3578 GIC        3990 :         uint64      totalGroups = 0;
    3579                 : 
    3580            3990 :         aggstate->hash_metacxt = AllocSetContextCreate(aggstate->ss.ps.state->es_query_cxt,
    3581                 :                                                        "HashAgg meta context",
    3582 ECB             :                                                        ALLOCSET_DEFAULT_SIZES);
    3583 GIC        3990 :         aggstate->hash_spill_rslot = ExecInitExtraTupleSlot(estate, scanDesc,
    3584 ECB             :                                                             &TTSOpsMinimalTuple);
    3585 CBC        3990 :         aggstate->hash_spill_wslot = ExecInitExtraTupleSlot(estate, scanDesc,
    3586                 :                                                             &TTSOpsVirtual);
    3587 ECB             : 
    3588                 :         /* this is an array of pointers, not structures */
    3589 GIC        3990 :         aggstate->hash_pergroup = pergroups;
    3590 ECB             : 
    3591 GIC        7980 :         aggstate->hashentrysize = hash_agg_entry_size(aggstate->numtrans,
    3592 CBC        3990 :                                                       outerplan->plan_width,
    3593                 :                                                       node->transitionSpace);
    3594                 : 
    3595                 :         /*
    3596 ECB             :          * Consider all of the grouping sets together when setting the limits
    3597                 :          * and estimating the number of partitions. This can be inaccurate
    3598                 :          * when there is more than one grouping set, but should still be
    3599                 :          * reasonable.
    3600                 :          */
    3601 GNC        8177 :         for (int k = 0; k < aggstate->num_hashes; k++)
    3602            4187 :             totalGroups += aggstate->perhash[k].aggnode->numGroups;
    3603                 : 
    3604 GIC        3990 :         hash_agg_set_limits(aggstate->hashentrysize, totalGroups, 0,
    3605                 :                             &aggstate->hash_mem_limit,
    3606                 :                             &aggstate->hash_ngroups_limit,
    3607                 :                             &aggstate->hash_planned_partitions);
    3608 CBC        3990 :         find_hash_columns(aggstate);
    3609 ECB             : 
    3610                 :         /* Skip massive memory allocation if we are just doing EXPLAIN */
    3611 CBC        3990 :         if (!(eflags & EXEC_FLAG_EXPLAIN_ONLY))
    3612 GIC        3466 :             build_hash_tables(aggstate);
    3613                 : 
    3614            3990 :         aggstate->table_filled = false;
    3615 ECB             : 
    3616                 :         /* Initialize this to 1, meaning nothing spilled, yet */
    3617 GIC        3990 :         aggstate->hash_batches_used = 1;
    3618 ECB             :     }
    3619                 : 
    3620                 :     /*
    3621                 :      * Initialize current phase-dependent values to initial phase. The initial
    3622                 :      * phase is 1 (first sort pass) for all strategies that use sorting (if
    3623                 :      * hashing is being done too, then phase 0 is processed last); but if only
    3624                 :      * hashing is being done, then phase 0 is all there is.
    3625                 :      */
    3626 GIC       21244 :     if (node->aggstrategy == AGG_HASHED)
    3627                 :     {
    3628            3892 :         aggstate->current_phase = 0;
    3629            3892 :         initialize_phase(aggstate, 0);
    3630            3892 :         select_current_set(aggstate, 0, true);
    3631                 :     }
    3632                 :     else
    3633 ECB             :     {
    3634 GIC       17352 :         aggstate->current_phase = 1;
    3635 CBC       17352 :         initialize_phase(aggstate, 1);
    3636           17352 :         select_current_set(aggstate, 0, false);
    3637 ECB             :     }
    3638                 : 
    3639                 :     /*
    3640                 :      * Perform lookups of aggregate function info, and initialize the
    3641                 :      * unchanging fields of the per-agg and per-trans data.
    3642                 :      */
    3643 CBC       43751 :     foreach(l, aggstate->aggs)
    3644                 :     {
    3645 GIC       22510 :         Aggref     *aggref = lfirst(l);
    3646                 :         AggStatePerAgg peragg;
    3647                 :         AggStatePerTrans pertrans;
    3648                 :         Oid         aggTransFnInputTypes[FUNC_MAX_ARGS];
    3649                 :         int         numAggTransFnArgs;
    3650 ECB             :         int         numDirectArgs;
    3651                 :         HeapTuple   aggTuple;
    3652                 :         Form_pg_aggregate aggform;
    3653                 :         AclResult   aclresult;
    3654                 :         Oid         finalfn_oid;
    3655                 :         Oid         serialfn_oid,
    3656                 :                     deserialfn_oid;
    3657                 :         Oid         aggOwner;
    3658                 :         Expr       *finalfnexpr;
    3659                 :         Oid         aggtranstype;
    3660                 : 
    3661                 :         /* Planner should have assigned aggregate to correct level */
    3662 GIC       22510 :         Assert(aggref->agglevelsup == 0);
    3663                 :         /* ... and the split mode should match */
    3664           22510 :         Assert(aggref->aggsplit == aggstate->aggsplit);
    3665                 : 
    3666           22510 :         peragg = &peraggs[aggref->aggno];
    3667                 : 
    3668                 :         /* Check if we initialized the state for this aggregate already. */
    3669 CBC       22510 :         if (peragg->aggref != NULL)
    3670 GIC         212 :             continue;
    3671 ECB             : 
    3672 GIC       22298 :         peragg->aggref = aggref;
    3673 CBC       22298 :         peragg->transno = aggref->aggtransno;
    3674                 : 
    3675                 :         /* Fetch the pg_aggregate row */
    3676           22298 :         aggTuple = SearchSysCache1(AGGFNOID,
    3677 ECB             :                                    ObjectIdGetDatum(aggref->aggfnoid));
    3678 GIC       22298 :         if (!HeapTupleIsValid(aggTuple))
    3679 LBC           0 :             elog(ERROR, "cache lookup failed for aggregate %u",
    3680 ECB             :                  aggref->aggfnoid);
    3681 GIC       22298 :         aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
    3682                 : 
    3683 ECB             :         /* Check permission to call aggregate function */
    3684 GNC       22298 :         aclresult = object_aclcheck(ProcedureRelationId, aggref->aggfnoid, GetUserId(),
    3685 ECB             :                                      ACL_EXECUTE);
    3686 GBC       22298 :         if (aclresult != ACLCHECK_OK)
    3687 GIC           3 :             aclcheck_error(aclresult, OBJECT_AGGREGATE,
    3688 CBC           3 :                            get_func_name(aggref->aggfnoid));
    3689 GIC       22295 :         InvokeFunctionExecuteHook(aggref->aggfnoid);
    3690                 : 
    3691 ECB             :         /* planner recorded transition state type in the Aggref itself */
    3692 GIC       22295 :         aggtranstype = aggref->aggtranstype;
    3693 CBC       22295 :         Assert(OidIsValid(aggtranstype));
    3694 ECB             : 
    3695                 :         /* Final function only required if we're finalizing the aggregates */
    3696 CBC       22295 :         if (DO_AGGSPLIT_SKIPFINAL(aggstate->aggsplit))
    3697 GIC        2107 :             peragg->finalfn_oid = finalfn_oid = InvalidOid;
    3698                 :         else
    3699 CBC       20188 :             peragg->finalfn_oid = finalfn_oid = aggform->aggfinalfn;
    3700 ECB             : 
    3701 GIC       22295 :         serialfn_oid = InvalidOid;
    3702           22295 :         deserialfn_oid = InvalidOid;
    3703 ECB             : 
    3704                 :         /*
    3705                 :          * Check if serialization/deserialization is required.  We only do it
    3706                 :          * for aggregates that have transtype INTERNAL.
    3707                 :          */
    3708 CBC       22295 :         if (aggtranstype == INTERNALOID)
    3709 ECB             :         {
    3710                 :             /*
    3711                 :              * The planner should only have generated a serialize agg node if
    3712                 :              * every aggregate with an INTERNAL state has a serialization
    3713                 :              * function.  Verify that.
    3714                 :              */
    3715 CBC        9128 :             if (DO_AGGSPLIT_SERIALIZE(aggstate->aggsplit))
    3716                 :             {
    3717                 :                 /* serialization only valid when not running finalfn */
    3718 GIC         168 :                 Assert(DO_AGGSPLIT_SKIPFINAL(aggstate->aggsplit));
    3719                 : 
    3720             168 :                 if (!OidIsValid(aggform->aggserialfn))
    3721 UIC           0 :                     elog(ERROR, "serialfunc not provided for serialization aggregation");
    3722 CBC         168 :                 serialfn_oid = aggform->aggserialfn;
    3723                 :             }
    3724                 : 
    3725 ECB             :             /* Likewise for deserialization functions */
    3726 GIC        9128 :             if (DO_AGGSPLIT_DESERIALIZE(aggstate->aggsplit))
    3727 ECB             :             {
    3728 EUB             :                 /* deserialization only valid when combining states */
    3729 CBC          60 :                 Assert(DO_AGGSPLIT_COMBINE(aggstate->aggsplit));
    3730                 : 
    3731 GIC          60 :                 if (!OidIsValid(aggform->aggdeserialfn))
    3732 UIC           0 :                     elog(ERROR, "deserialfunc not provided for deserialization aggregation");
    3733 CBC          60 :                 deserialfn_oid = aggform->aggdeserialfn;
    3734                 :             }
    3735                 :         }
    3736 ECB             : 
    3737                 :         /* Check that aggregate owner has permission to call component fns */
    3738                 :         {
    3739 EUB             :             HeapTuple   procTuple;
    3740 ECB             : 
    3741 GIC       22295 :             procTuple = SearchSysCache1(PROCOID,
    3742                 :                                         ObjectIdGetDatum(aggref->aggfnoid));
    3743           22295 :             if (!HeapTupleIsValid(procTuple))
    3744 UIC           0 :                 elog(ERROR, "cache lookup failed for function %u",
    3745                 :                      aggref->aggfnoid);
    3746 GIC       22295 :             aggOwner = ((Form_pg_proc) GETSTRUCT(procTuple))->proowner;
    3747           22295 :             ReleaseSysCache(procTuple);
    3748 ECB             : 
    3749 GIC       22295 :             if (OidIsValid(finalfn_oid))
    3750 ECB             :             {
    3751 GNC        9825 :                 aclresult = object_aclcheck(ProcedureRelationId, finalfn_oid, aggOwner,
    3752                 :                                              ACL_EXECUTE);
    3753 CBC        9825 :                 if (aclresult != ACLCHECK_OK)
    3754 LBC           0 :                     aclcheck_error(aclresult, OBJECT_FUNCTION,
    3755 UIC           0 :                                    get_func_name(finalfn_oid));
    3756 CBC        9825 :                 InvokeFunctionExecuteHook(finalfn_oid);
    3757                 :             }
    3758           22295 :             if (OidIsValid(serialfn_oid))
    3759                 :             {
    3760 GNC         168 :                 aclresult = object_aclcheck(ProcedureRelationId, serialfn_oid, aggOwner,
    3761 EUB             :                                              ACL_EXECUTE);
    3762 GBC         168 :                 if (aclresult != ACLCHECK_OK)
    3763 LBC           0 :                     aclcheck_error(aclresult, OBJECT_FUNCTION,
    3764 UIC           0 :                                    get_func_name(serialfn_oid));
    3765 CBC         168 :                 InvokeFunctionExecuteHook(serialfn_oid);
    3766                 :             }
    3767           22295 :             if (OidIsValid(deserialfn_oid))
    3768                 :             {
    3769 GNC          60 :                 aclresult = object_aclcheck(ProcedureRelationId, deserialfn_oid, aggOwner,
    3770 EUB             :                                              ACL_EXECUTE);
    3771 GBC          60 :                 if (aclresult != ACLCHECK_OK)
    3772 LBC           0 :                     aclcheck_error(aclresult, OBJECT_FUNCTION,
    3773 UIC           0 :                                    get_func_name(deserialfn_oid));
    3774 CBC          60 :                 InvokeFunctionExecuteHook(deserialfn_oid);
    3775                 :             }
    3776 ECB             :         }
    3777                 : 
    3778                 :         /*
    3779 EUB             :          * Get actual datatypes of the (nominal) aggregate inputs.  These
    3780                 :          * could be different from the agg's declared input types, when the
    3781 ECB             :          * agg accepts ANY or a polymorphic type.
    3782                 :          */
    3783 GIC       22295 :         numAggTransFnArgs = get_aggregate_argtypes(aggref,
    3784                 :                                                    aggTransFnInputTypes);
    3785                 : 
    3786                 :         /* Count the "direct" arguments, if any */
    3787           22295 :         numDirectArgs = list_length(aggref->aggdirectargs);
    3788                 : 
    3789                 :         /* Detect how many arguments to pass to the finalfn */
    3790 CBC       22295 :         if (aggform->aggfinalextra)
    3791 GIC        6598 :             peragg->numFinalArgs = numAggTransFnArgs + 1;
    3792                 :         else
    3793           15697 :             peragg->numFinalArgs = numDirectArgs + 1;
    3794 ECB             : 
    3795                 :         /* Initialize any direct-argument expressions */
    3796 GIC       22295 :         peragg->aggdirectargs = ExecInitExprList(aggref->aggdirectargs,
    3797 ECB             :                                                  (PlanState *) aggstate);
    3798                 : 
    3799                 :         /*
    3800                 :          * build expression trees using actual argument & result types for the
    3801                 :          * finalfn, if it exists and is required.
    3802                 :          */
    3803 CBC       22295 :         if (OidIsValid(finalfn_oid))
    3804                 :         {
    3805 GIC        9825 :             build_aggregate_finalfn_expr(aggTransFnInputTypes,
    3806                 :                                          peragg->numFinalArgs,
    3807                 :                                          aggtranstype,
    3808                 :                                          aggref->aggtype,
    3809                 :                                          aggref->inputcollid,
    3810 ECB             :                                          finalfn_oid,
    3811                 :                                          &finalfnexpr);
    3812 CBC        9825 :             fmgr_info(finalfn_oid, &peragg->finalfn);
    3813 GIC        9825 :             fmgr_info_set_expr((Node *) finalfnexpr, &peragg->finalfn);
    3814                 :         }
    3815                 : 
    3816                 :         /* get info about the output value's datatype */
    3817           22295 :         get_typlenbyval(aggref->aggtype,
    3818                 :                         &peragg->resulttypeLen,
    3819 ECB             :                         &peragg->resulttypeByVal);
    3820                 : 
    3821                 :         /*
    3822                 :          * Build working state for invoking the transition function, if we
    3823                 :          * haven't done it already.
    3824                 :          */
    3825 GIC       22295 :         pertrans = &pertransstates[aggref->aggtransno];
    3826           22295 :         if (pertrans->aggref == NULL)
    3827                 :         {
    3828                 :             Datum       textInitVal;
    3829                 :             Datum       initValue;
    3830                 :             bool        initValueIsNull;
    3831                 :             Oid         transfn_oid;
    3832 ECB             : 
    3833                 :             /*
    3834                 :              * If this aggregation is performing state combines, then instead
    3835                 :              * of using the transition function, we'll use the combine
    3836                 :              * function.
    3837                 :              */
    3838 GIC       22166 :             if (DO_AGGSPLIT_COMBINE(aggstate->aggsplit))
    3839                 :             {
    3840             671 :                 transfn_oid = aggform->aggcombinefn;
    3841                 : 
    3842                 :                 /* If not set then the planner messed up */
    3843             671 :                 if (!OidIsValid(transfn_oid))
    3844 UIC           0 :                     elog(ERROR, "combinefn not set for aggregate function");
    3845 ECB             :             }
    3846                 :             else
    3847 CBC       21495 :                 transfn_oid = aggform->aggtransfn;
    3848                 : 
    3849 GNC       22166 :             aclresult = object_aclcheck(ProcedureRelationId, transfn_oid, aggOwner, ACL_EXECUTE);
    3850 CBC       22166 :             if (aclresult != ACLCHECK_OK)
    3851 UBC           0 :                 aclcheck_error(aclresult, OBJECT_FUNCTION,
    3852 UIC           0 :                                get_func_name(transfn_oid));
    3853 GIC       22166 :             InvokeFunctionExecuteHook(transfn_oid);
    3854 ECB             : 
    3855                 :             /*
    3856                 :              * initval is potentially null, so don't try to access it as a
    3857                 :              * struct field. Must do it the hard way with SysCacheGetAttr.
    3858 EUB             :              */
    3859 GBC       22166 :             textInitVal = SysCacheGetAttr(AGGFNOID, aggTuple,
    3860 ECB             :                                           Anum_pg_aggregate_agginitval,
    3861                 :                                           &initValueIsNull);
    3862 GIC       22166 :             if (initValueIsNull)
    3863           13119 :                 initValue = (Datum) 0;
    3864                 :             else
    3865            9047 :                 initValue = GetAggInitVal(textInitVal, aggtranstype);
    3866 ECB             : 
    3867 GIC       22166 :             if (DO_AGGSPLIT_COMBINE(aggstate->aggsplit))
    3868                 :             {
    3869 CBC         671 :                 Oid         combineFnInputTypes[] = {aggtranstype,
    3870 ECB             :                 aggtranstype};
    3871                 : 
    3872                 :                 /*
    3873                 :                  * When combining there's only one input, the to-be-combined
    3874                 :                  * transition value.  The transition value is not counted
    3875                 :                  * here.
    3876                 :                  */
    3877 GIC         671 :                 pertrans->numTransInputs = 1;
    3878                 : 
    3879                 :                 /* aggcombinefn always has two arguments of aggtranstype */
    3880             671 :                 build_pertrans_for_aggref(pertrans, aggstate, estate,
    3881                 :                                           aggref, transfn_oid, aggtranstype,
    3882                 :                                           serialfn_oid, deserialfn_oid,
    3883                 :                                           initValue, initValueIsNull,
    3884 ECB             :                                           combineFnInputTypes, 2);
    3885                 : 
    3886                 :                 /*
    3887                 :                  * Ensure that a combine function to combine INTERNAL states
    3888                 :                  * is not strict. This should have been checked during CREATE
    3889                 :                  * AGGREGATE, but the strict property could have been changed
    3890                 :                  * since then.
    3891                 :                  */
    3892 GIC         671 :                 if (pertrans->transfn.fn_strict && aggtranstype == INTERNALOID)
    3893 UIC           0 :                     ereport(ERROR,
    3894                 :                             (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
    3895                 :                              errmsg("combine function with transition type %s must not be declared STRICT",
    3896                 :                                     format_type_be(aggtranstype))));
    3897                 :             }
    3898                 :             else
    3899 ECB             :             {
    3900 EUB             :                 /* Detect how many arguments to pass to the transfn */
    3901 GIC       21495 :                 if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
    3902             126 :                     pertrans->numTransInputs = list_length(aggref->args);
    3903                 :                 else
    3904           21369 :                     pertrans->numTransInputs = numAggTransFnArgs;
    3905                 : 
    3906           21495 :                 build_pertrans_for_aggref(pertrans, aggstate, estate,
    3907                 :                                           aggref, transfn_oid, aggtranstype,
    3908 ECB             :                                           serialfn_oid, deserialfn_oid,
    3909                 :                                           initValue, initValueIsNull,
    3910                 :                                           aggTransFnInputTypes,
    3911                 :                                           numAggTransFnArgs);
    3912                 : 
    3913                 :                 /*
    3914                 :                  * If the transfn is strict and the initval is NULL, make sure
    3915                 :                  * input type and transtype are the same (or at least
    3916                 :                  * binary-compatible), so that it's OK to use the first
    3917                 :                  * aggregated input value as the initial transValue.  This
    3918                 :                  * should have been checked at agg definition time, but we
    3919                 :                  * must check again in case the transfn's strictness property
    3920                 :                  * has been changed.
    3921                 :                  */
    3922 GIC       21495 :                 if (pertrans->transfn.fn_strict && pertrans->initValueIsNull)
    3923                 :                 {
    3924            2400 :                     if (numAggTransFnArgs <= numDirectArgs ||
    3925            2400 :                         !IsBinaryCoercible(aggTransFnInputTypes[numDirectArgs],
    3926                 :                                            aggtranstype))
    3927 UIC           0 :                         ereport(ERROR,
    3928                 :                                 (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
    3929 ECB             :                                  errmsg("aggregate %u needs to have compatible input type and transition type",
    3930                 :                                         aggref->aggfnoid)));
    3931                 :                 }
    3932                 :             }
    3933                 :         }
    3934 EUB             :         else
    3935 GIC         129 :             pertrans->aggshared = true;
    3936           22295 :         ReleaseSysCache(aggTuple);
    3937                 :     }
    3938                 : 
    3939                 :     /*
    3940                 :      * Update aggstate->numaggs to be the number of unique aggregates found.
    3941                 :      * Also set numstates to the number of unique transition states found.
    3942 ECB             :      */
    3943 CBC       21241 :     aggstate->numaggs = numaggs;
    3944 GIC       21241 :     aggstate->numtrans = numtrans;
    3945                 : 
    3946                 :     /*
    3947                 :      * Last, check whether any more aggregates got added onto the node while
    3948                 :      * we processed the expressions for the aggregate arguments (including not
    3949                 :      * only the regular arguments and FILTER expressions handled immediately
    3950 ECB             :      * above, but any direct arguments we might've handled earlier).  If so,
    3951                 :      * we have nested aggregate functions, which is semantically nonsensical,
    3952                 :      * so complain.  (This should have been caught by the parser, so we don't
    3953                 :      * need to work hard on a helpful error message; but we defend against it
    3954                 :      * here anyway, just to be sure.)
    3955                 :      */
    3956 GIC       21241 :     if (numaggrefs != list_length(aggstate->aggs))
    3957 UIC           0 :         ereport(ERROR,
    3958                 :                 (errcode(ERRCODE_GROUPING_ERROR),
    3959                 :                  errmsg("aggregate function calls cannot be nested")));
    3960                 : 
    3961                 :     /*
    3962                 :      * Build expressions doing all the transition work at once. We build a
    3963 ECB             :      * different one for each phase, as the number of transition function
    3964 EUB             :      * invocation can differ between phases. Note this'll work both for
    3965                 :      * transition and combination functions (although there'll only be one
    3966                 :      * phase in the latter case).
    3967                 :      */
    3968 GIC       59945 :     for (phaseidx = 0; phaseidx < aggstate->numphases; phaseidx++)
    3969                 :     {
    3970           38704 :         AggStatePerPhase phase = &aggstate->phases[phaseidx];
    3971           38704 :         bool        dohash = false;
    3972           38704 :         bool        dosort = false;
    3973                 : 
    3974                 :         /* phase 0 doesn't necessarily exist */
    3975 CBC       38704 :         if (!phase->aggnode)
    3976 GIC       17251 :             continue;
    3977 ECB             : 
    3978 CBC       21453 :         if (aggstate->aggstrategy == AGG_MIXED && phaseidx == 1)
    3979 ECB             :         {
    3980                 :             /*
    3981                 :              * Phase one, and only phase one, in a mixed agg performs both
    3982                 :              * sorting and aggregation.
    3983                 :              */
    3984 GIC          98 :             dohash = true;
    3985 CBC          98 :             dosort = true;
    3986                 :         }
    3987 GIC       21355 :         else if (aggstate->aggstrategy == AGG_MIXED && phaseidx == 0)
    3988                 :         {
    3989                 :             /*
    3990                 :              * No need to compute a transition function for an AGG_MIXED phase
    3991 ECB             :              * 0 - the contents of the hashtables will have been computed
    3992                 :              * during phase 1.
    3993                 :              */
    3994 CBC          98 :             continue;
    3995                 :         }
    3996 GIC       21257 :         else if (phase->aggstrategy == AGG_PLAIN ||
    3997            4905 :                  phase->aggstrategy == AGG_SORTED)
    3998                 :         {
    3999           17365 :             dohash = false;
    4000           17365 :             dosort = true;
    4001 ECB             :         }
    4002 GIC        3892 :         else if (phase->aggstrategy == AGG_HASHED)
    4003 ECB             :         {
    4004 CBC        3892 :             dohash = true;
    4005 GIC        3892 :             dosort = false;
    4006 ECB             :         }
    4007                 :         else
    4008 UIC           0 :             Assert(false);
    4009 ECB             : 
    4010 GIC       21355 :         phase->evaltrans = ExecBuildAggTrans(aggstate, phase, dosort, dohash,
    4011 ECB             :                                              false);
    4012                 : 
    4013                 :         /* cache compiled expression for outer slot without NULL check */
    4014 GIC       21355 :         phase->evaltrans_cache[0][0] = phase->evaltrans;
    4015 EUB             :     }
    4016                 : 
    4017 CBC       21241 :     return aggstate;
    4018                 : }
    4019                 : 
    4020                 : /*
    4021 ECB             :  * Build the state needed to calculate a state value for an aggregate.
    4022                 :  *
    4023                 :  * This initializes all the fields in 'pertrans'. 'aggref' is the aggregate
    4024                 :  * to initialize the state for. 'transfn_oid', 'aggtranstype', and the rest
    4025                 :  * of the arguments could be calculated from 'aggref', but the caller has
    4026                 :  * calculated them already, so might as well pass them.
    4027                 :  *
    4028                 :  * 'transfn_oid' may be either the Oid of the aggtransfn or the aggcombinefn.
    4029                 :  */
    4030                 : static void
    4031 GIC       22166 : build_pertrans_for_aggref(AggStatePerTrans pertrans,
    4032                 :                           AggState *aggstate, EState *estate,
    4033                 :                           Aggref *aggref,
    4034                 :                           Oid transfn_oid, Oid aggtranstype,
    4035                 :                           Oid aggserialfn, Oid aggdeserialfn,
    4036                 :                           Datum initValue, bool initValueIsNull,
    4037                 :                           Oid *inputTypes, int numArguments)
    4038 ECB             : {
    4039 GIC       22166 :     int         numGroupingSets = Max(aggstate->maxsets, 1);
    4040                 :     Expr       *transfnexpr;
    4041                 :     int         numTransArgs;
    4042           22166 :     Expr       *serialfnexpr = NULL;
    4043           22166 :     Expr       *deserialfnexpr = NULL;
    4044                 :     ListCell   *lc;
    4045                 :     int         numInputs;
    4046 ECB             :     int         numDirectArgs;
    4047                 :     List       *sortlist;
    4048                 :     int         numSortCols;
    4049                 :     int         numDistinctCols;
    4050                 :     int         i;
    4051                 : 
    4052                 :     /* Begin filling in the pertrans data */
    4053 GIC       22166 :     pertrans->aggref = aggref;
    4054           22166 :     pertrans->aggshared = false;
    4055           22166 :     pertrans->aggCollation = aggref->inputcollid;
    4056           22166 :     pertrans->transfn_oid = transfn_oid;
    4057           22166 :     pertrans->serialfn_oid = aggserialfn;
    4058           22166 :     pertrans->deserialfn_oid = aggdeserialfn;
    4059           22166 :     pertrans->initValue = initValue;
    4060 CBC       22166 :     pertrans->initValueIsNull = initValueIsNull;
    4061 ECB             : 
    4062                 :     /* Count the "direct" arguments, if any */
    4063 CBC       22166 :     numDirectArgs = list_length(aggref->aggdirectargs);
    4064 ECB             : 
    4065                 :     /* Count the number of aggregated input columns */
    4066 CBC       22166 :     pertrans->numInputs = numInputs = list_length(aggref->args);
    4067 ECB             : 
    4068 GIC       22166 :     pertrans->aggtranstype = aggtranstype;
    4069                 : 
    4070 ECB             :     /* account for the current transition state */
    4071 GIC       22166 :     numTransArgs = pertrans->numTransInputs + 1;
    4072                 : 
    4073 ECB             :     /*
    4074                 :      * Set up infrastructure for calling the transfn.  Note that invtrans is
    4075                 :      * not needed here.
    4076                 :      */
    4077 GIC       22166 :     build_aggregate_transfn_expr(inputTypes,
    4078 ECB             :                                  numArguments,
    4079                 :                                  numDirectArgs,
    4080 GIC       22166 :                                  aggref->aggvariadic,
    4081                 :                                  aggtranstype,
    4082                 :                                  aggref->inputcollid,
    4083                 :                                  transfn_oid,
    4084 ECB             :                                  InvalidOid,
    4085                 :                                  &transfnexpr,
    4086                 :                                  NULL);
    4087                 : 
    4088 GIC       22166 :     fmgr_info(transfn_oid, &pertrans->transfn);
    4089           22166 :     fmgr_info_set_expr((Node *) transfnexpr, &pertrans->transfn);
    4090                 : 
    4091           22166 :     pertrans->transfn_fcinfo =
    4092           22166 :         (FunctionCallInfo) palloc(SizeForFunctionCallInfo(numTransArgs));
    4093           22166 :     InitFunctionCallInfoData(*pertrans->transfn_fcinfo,
    4094                 :                              &pertrans->transfn,
    4095 ECB             :                              numTransArgs,
    4096                 :                              pertrans->aggCollation,
    4097                 :                              (void *) aggstate, NULL);
    4098                 : 
    4099                 :     /* get info about the state value's datatype */
    4100 CBC       22166 :     get_typlenbyval(aggtranstype,
    4101                 :                     &pertrans->transtypeLen,
    4102                 :                     &pertrans->transtypeByVal);
    4103                 : 
    4104 GIC       22166 :     if (OidIsValid(aggserialfn))
    4105                 :     {
    4106             168 :         build_aggregate_serialfn_expr(aggserialfn,
    4107 ECB             :                                       &serialfnexpr);
    4108 GIC         168 :         fmgr_info(aggserialfn, &pertrans->serialfn);
    4109             168 :         fmgr_info_set_expr((Node *) serialfnexpr, &pertrans->serialfn);
    4110                 : 
    4111 CBC         168 :         pertrans->serialfn_fcinfo =
    4112 GIC         168 :             (FunctionCallInfo) palloc(SizeForFunctionCallInfo(1));
    4113 CBC         168 :         InitFunctionCallInfoData(*pertrans->serialfn_fcinfo,
    4114                 :                                  &pertrans->serialfn,
    4115 ECB             :                                  1,
    4116                 :                                  InvalidOid,
    4117                 :                                  (void *) aggstate, NULL);
    4118                 :     }
    4119                 : 
    4120 CBC       22166 :     if (OidIsValid(aggdeserialfn))
    4121                 :     {
    4122 GIC          60 :         build_aggregate_deserialfn_expr(aggdeserialfn,
    4123                 :                                         &deserialfnexpr);
    4124              60 :         fmgr_info(aggdeserialfn, &pertrans->deserialfn);
    4125              60 :         fmgr_info_set_expr((Node *) deserialfnexpr, &pertrans->deserialfn);
    4126                 : 
    4127 CBC          60 :         pertrans->deserialfn_fcinfo =
    4128 GIC          60 :             (FunctionCallInfo) palloc(SizeForFunctionCallInfo(2));
    4129 CBC          60 :         InitFunctionCallInfoData(*pertrans->deserialfn_fcinfo,
    4130                 :                                  &pertrans->deserialfn,
    4131 ECB             :                                  2,
    4132                 :                                  InvalidOid,
    4133                 :                                  (void *) aggstate, NULL);
    4134                 :     }
    4135                 : 
    4136                 :     /*
    4137                 :      * If we're doing either DISTINCT or ORDER BY for a plain agg, then we
    4138                 :      * have a list of SortGroupClause nodes; fish out the data in them and
    4139                 :      * stick them into arrays.  We ignore ORDER BY for an ordered-set agg,
    4140                 :      * however; the agg's transfn and finalfn are responsible for that.
    4141                 :      *
    4142                 :      * When the planner has set the aggpresorted flag, the input to the
    4143                 :      * aggregate is already correctly sorted.  For ORDER BY aggregates we can
    4144                 :      * simply treat these as normal aggregates.  For presorted DISTINCT
    4145                 :      * aggregates an extra step must be added to remove duplicate consecutive
    4146                 :      * inputs.
    4147                 :      *
    4148                 :      * Note that by construction, if there is a DISTINCT clause then the ORDER
    4149                 :      * BY clause is a prefix of it (see transformDistinctClause).
    4150                 :      */
    4151 GIC       22166 :     if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
    4152                 :     {
    4153             126 :         sortlist = NIL;
    4154             126 :         numSortCols = numDistinctCols = 0;
    4155 GNC         126 :         pertrans->aggsortrequired = false;
    4156                 :     }
    4157           22040 :     else if (aggref->aggpresorted && aggref->aggdistinct == NIL)
    4158                 :     {
    4159             686 :         sortlist = NIL;
    4160             686 :         numSortCols = numDistinctCols = 0;
    4161             686 :         pertrans->aggsortrequired = false;
    4162                 :     }
    4163 GIC       21354 :     else if (aggref->aggdistinct)
    4164                 :     {
    4165             267 :         sortlist = aggref->aggdistinct;
    4166             267 :         numSortCols = numDistinctCols = list_length(sortlist);
    4167             267 :         Assert(numSortCols >= list_length(aggref->aggorder));
    4168 GNC         267 :         pertrans->aggsortrequired = !aggref->aggpresorted;
    4169                 :     }
    4170                 :     else
    4171                 :     {
    4172 CBC       21087 :         sortlist = aggref->aggorder;
    4173 GIC       21087 :         numSortCols = list_length(sortlist);
    4174 CBC       21087 :         numDistinctCols = 0;
    4175 GNC       21087 :         pertrans->aggsortrequired = (numSortCols > 0);
    4176 ECB             :     }
    4177                 : 
    4178 GIC       22166 :     pertrans->numSortCols = numSortCols;
    4179 CBC       22166 :     pertrans->numDistinctCols = numDistinctCols;
    4180                 : 
    4181 ECB             :     /*
    4182                 :      * If we have either sorting or filtering to do, create a tupledesc and
    4183                 :      * slot corresponding to the aggregated inputs (including sort
    4184                 :      * expressions) of the agg.
    4185                 :      */
    4186 GIC       22166 :     if (numSortCols > 0 || aggref->aggfilter)
    4187 ECB             :     {
    4188 CBC         626 :         pertrans->sortdesc = ExecTypeFromTL(aggref->args);
    4189             626 :         pertrans->sortslot =
    4190             626 :             ExecInitExtraTupleSlot(estate, pertrans->sortdesc,
    4191                 :                                    &TTSOpsMinimalTuple);
    4192                 :     }
    4193                 : 
    4194           22166 :     if (numSortCols > 0)
    4195 ECB             :     {
    4196                 :         /*
    4197                 :          * We don't implement DISTINCT or ORDER BY aggs in the HASHED case
    4198                 :          * (yet)
    4199                 :          */
    4200 CBC         330 :         Assert(aggstate->aggstrategy != AGG_HASHED && aggstate->aggstrategy != AGG_MIXED);
    4201 ECB             : 
    4202                 :         /* ORDER BY aggregates are not supported with partial aggregation */
    4203 GIC         330 :         Assert(!DO_AGGSPLIT_COMBINE(aggstate->aggsplit));
    4204                 : 
    4205                 :         /* If we have only one input, we need its len/byval info. */
    4206             330 :         if (numInputs == 1)
    4207                 :         {
    4208 CBC         273 :             get_typlenbyval(inputTypes[numDirectArgs],
    4209                 :                             &pertrans->inputtypeLen,
    4210 ECB             :                             &pertrans->inputtypeByVal);
    4211                 :         }
    4212 CBC          57 :         else if (numDistinctCols > 0)
    4213                 :         {
    4214                 :             /* we will need an extra slot to store prior values */
    4215 GIC          42 :             pertrans->uniqslot =
    4216 CBC          42 :                 ExecInitExtraTupleSlot(estate, pertrans->sortdesc,
    4217                 :                                        &TTSOpsMinimalTuple);
    4218                 :         }
    4219                 : 
    4220                 :         /* Extract the sort information for use later */
    4221 GIC         330 :         pertrans->sortColIdx =
    4222 CBC         330 :             (AttrNumber *) palloc(numSortCols * sizeof(AttrNumber));
    4223 GIC         330 :         pertrans->sortOperators =
    4224             330 :             (Oid *) palloc(numSortCols * sizeof(Oid));
    4225 CBC         330 :         pertrans->sortCollations =
    4226 GIC         330 :             (Oid *) palloc(numSortCols * sizeof(Oid));
    4227             330 :         pertrans->sortNullsFirst =
    4228 CBC         330 :             (bool *) palloc(numSortCols * sizeof(bool));
    4229                 : 
    4230             330 :         i = 0;
    4231 GIC         747 :         foreach(lc, sortlist)
    4232                 :         {
    4233             417 :             SortGroupClause *sortcl = (SortGroupClause *) lfirst(lc);
    4234 CBC         417 :             TargetEntry *tle = get_sortgroupclause_tle(sortcl, aggref->args);
    4235                 : 
    4236                 :             /* the parser should have made sure of this */
    4237             417 :             Assert(OidIsValid(sortcl->sortop));
    4238 ECB             : 
    4239 GIC         417 :             pertrans->sortColIdx[i] = tle->resno;
    4240             417 :             pertrans->sortOperators[i] = sortcl->sortop;
    4241             417 :             pertrans->sortCollations[i] = exprCollation((Node *) tle->expr);
    4242             417 :             pertrans->sortNullsFirst[i] = sortcl->nulls_first;
    4243 CBC         417 :             i++;
    4244 ECB             :         }
    4245 CBC         330 :         Assert(i == numSortCols);
    4246 ECB             :     }
    4247                 : 
    4248 CBC       22166 :     if (aggref->aggdistinct)
    4249 ECB             :     {
    4250                 :         Oid        *ops;
    4251                 : 
    4252 CBC         267 :         Assert(numArguments > 0);
    4253             267 :         Assert(list_length(aggref->aggdistinct) == numDistinctCols);
    4254                 : 
    4255             267 :         ops = palloc(numDistinctCols * sizeof(Oid));
    4256 ECB             : 
    4257 GIC         267 :         i = 0;
    4258             612 :         foreach(lc, aggref->aggdistinct)
    4259 CBC         345 :             ops[i++] = ((SortGroupClause *) lfirst(lc))->eqop;
    4260                 : 
    4261 ECB             :         /* lookup / build the necessary comparators */
    4262 CBC         267 :         if (numDistinctCols == 1)
    4263             225 :             fmgr_info(get_opcode(ops[0]), &pertrans->equalfnOne);
    4264 ECB             :         else
    4265 CBC          42 :             pertrans->equalfnMulti =
    4266 GIC          42 :                 execTuplesMatchPrepare(pertrans->sortdesc,
    4267 ECB             :                                        numDistinctCols,
    4268 GIC          42 :                                        pertrans->sortColIdx,
    4269                 :                                        ops,
    4270 CBC          42 :                                        pertrans->sortCollations,
    4271                 :                                        &aggstate->ss.ps);
    4272 GIC         267 :         pfree(ops);
    4273                 :     }
    4274 ECB             : 
    4275 CBC       22166 :     pertrans->sortstates = (Tuplesortstate **)
    4276 GIC       22166 :         palloc0(sizeof(Tuplesortstate *) * numGroupingSets);
    4277 CBC       22166 : }
    4278                 : 
    4279 ECB             : 
    4280                 : static Datum
    4281 CBC        9047 : GetAggInitVal(Datum textInitVal, Oid transtype)
    4282                 : {
    4283                 :     Oid         typinput,
    4284 ECB             :                 typioparam;
    4285                 :     char       *strInitVal;
    4286                 :     Datum       initVal;
    4287                 : 
    4288 CBC        9047 :     getTypeInputInfo(transtype, &typinput, &typioparam);
    4289 GIC        9047 :     strInitVal = TextDatumGetCString(textInitVal);
    4290 CBC        9047 :     initVal = OidInputFunctionCall(typinput, strInitVal,
    4291                 :                                    typioparam, -1);
    4292            9047 :     pfree(strInitVal);
    4293 GIC        9047 :     return initVal;
    4294 ECB             : }
    4295                 : 
    4296                 : void
    4297 CBC       21188 : ExecEndAgg(AggState *node)
    4298 ECB             : {
    4299                 :     PlanState  *outerPlan;
    4300                 :     int         transno;
    4301 GIC       21188 :     int         numGroupingSets = Max(node->maxsets, 1);
    4302                 :     int         setno;
    4303 ECB             : 
    4304                 :     /*
    4305                 :      * When ending a parallel worker, copy the statistics gathered by the
    4306                 :      * worker back into shared memory so that it can be picked up by the main
    4307                 :      * process to report in EXPLAIN ANALYZE.
    4308                 :      */
    4309 GIC       21188 :     if (node->shared_info && IsParallelWorker())
    4310 ECB             :     {
    4311                 :         AggregateInstrumentation *si;
    4312                 : 
    4313 GIC          87 :         Assert(ParallelWorkerNumber <= node->shared_info->num_workers);
    4314 CBC          87 :         si = &node->shared_info->sinstrument[ParallelWorkerNumber];
    4315              87 :         si->hash_batches_used = node->hash_batches_used;
    4316 GIC          87 :         si->hash_disk_used = node->hash_disk_used;
    4317              87 :         si->hash_mem_peak = node->hash_mem_peak;
    4318                 :     }
    4319 ECB             : 
    4320                 :     /* Make sure we have closed any open tuplesorts */
    4321                 : 
    4322 GIC       21188 :     if (node->sort_in)
    4323 CBC          72 :         tuplesort_end(node->sort_in);
    4324 GIC       21188 :     if (node->sort_out)
    4325              21 :         tuplesort_end(node->sort_out);
    4326                 : 
    4327           21188 :     hashagg_reset_spill_state(node);
    4328                 : 
    4329           21188 :     if (node->hash_metacxt != NULL)
    4330                 :     {
    4331 CBC        3986 :         MemoryContextDelete(node->hash_metacxt);
    4332 GIC        3986 :         node->hash_metacxt = NULL;
    4333                 :     }
    4334                 : 
    4335 CBC       43299 :     for (transno = 0; transno < node->numtrans; transno++)
    4336 ECB             :     {
    4337 CBC       22111 :         AggStatePerTrans pertrans = &node->pertrans[transno];
    4338 ECB             : 
    4339 CBC       44723 :         for (setno = 0; setno < numGroupingSets; setno++)
    4340                 :         {
    4341 GIC       22612 :             if (pertrans->sortstates[setno])
    4342 UIC           0 :                 tuplesort_end(pertrans->sortstates[setno]);
    4343                 :         }
    4344 ECB             :     }
    4345                 : 
    4346                 :     /* And ensure any agg shutdown callbacks have been called */
    4347 CBC       42790 :     for (setno = 0; setno < numGroupingSets; setno++)
    4348 GIC       21602 :         ReScanExprContext(node->aggcontexts[setno]);
    4349 CBC       21188 :     if (node->hashcontext)
    4350 GIC        3986 :         ReScanExprContext(node->hashcontext);
    4351 ECB             : 
    4352                 :     /*
    4353                 :      * We don't actually free any ExprContexts here (see comment in
    4354                 :      * ExecFreeExprContext), just unlinking the output one from the plan node
    4355                 :      * suffices.
    4356                 :      */
    4357 CBC       21188 :     ExecFreeExprContext(&node->ss.ps);
    4358                 : 
    4359 ECB             :     /* clean up tuple table */
    4360 GIC       21188 :     ExecClearTuple(node->ss.ss_ScanTupleSlot);
    4361 ECB             : 
    4362 GIC       21188 :     outerPlan = outerPlanState(node);
    4363 CBC       21188 :     ExecEndNode(outerPlan);
    4364 GBC       21188 : }
    4365                 : 
    4366                 : void
    4367 GIC       82509 : ExecReScanAgg(AggState *node)
    4368                 : {
    4369 CBC       82509 :     ExprContext *econtext = node->ss.ps.ps_ExprContext;
    4370           82509 :     PlanState  *outerPlan = outerPlanState(node);
    4371           82509 :     Agg        *aggnode = (Agg *) node->ss.ps.plan;
    4372 ECB             :     int         transno;
    4373 GIC       82509 :     int         numGroupingSets = Max(node->maxsets, 1);
    4374                 :     int         setno;
    4375                 : 
    4376           82509 :     node->agg_done = false;
    4377                 : 
    4378           82509 :     if (node->aggstrategy == AGG_HASHED)
    4379 ECB             :     {
    4380                 :         /*
    4381                 :          * In the hashed case, if we haven't yet built the hash table then we
    4382                 :          * can just return; nothing done yet, so nothing to undo. If subnode's
    4383                 :          * chgParam is not NULL then it will be re-scanned by ExecProcNode,
    4384                 :          * else no reason to re-scan it at all.
    4385                 :          */
    4386 CBC       38650 :         if (!node->table_filled)
    4387 GIC         319 :             return;
    4388                 : 
    4389 ECB             :         /*
    4390                 :          * If we do have the hash table, and it never spilled, and the subplan
    4391                 :          * does not have any parameter changes, and none of our own parameter
    4392                 :          * changes affect input expressions of the aggregated functions, then
    4393                 :          * we can just rescan the existing hash table; no need to build it
    4394                 :          * again.
    4395                 :          */
    4396 GIC       38331 :         if (outerPlan->chgParam == NULL && !node->hash_ever_spilled &&
    4397             444 :             !bms_overlap(node->ss.ps.chgParam, aggnode->aggParams))
    4398 ECB             :         {
    4399 GIC         432 :             ResetTupleHashIterator(node->perhash[0].hashtable,
    4400 ECB             :                                    &node->perhash[0].hashiter);
    4401 GIC         432 :             select_current_set(node, 0, true);
    4402             432 :             return;
    4403                 :         }
    4404                 :     }
    4405                 : 
    4406                 :     /* Make sure we have closed any open tuplesorts */
    4407          125659 :     for (transno = 0; transno < node->numtrans; transno++)
    4408 ECB             :     {
    4409 CBC       87820 :         for (setno = 0; setno < numGroupingSets; setno++)
    4410                 :         {
    4411 GIC       43919 :             AggStatePerTrans pertrans = &node->pertrans[transno];
    4412                 : 
    4413           43919 :             if (pertrans->sortstates[setno])
    4414                 :             {
    4415 UIC           0 :                 tuplesort_end(pertrans->sortstates[setno]);
    4416               0 :                 pertrans->sortstates[setno] = NULL;
    4417                 :             }
    4418 ECB             :         }
    4419                 :     }
    4420                 : 
    4421                 :     /*
    4422                 :      * We don't need to ReScanExprContext the output tuple context here;
    4423                 :      * ExecReScan already did it. But we do need to reset our per-grouping-set
    4424                 :      * contexts, which may have transvalues stored in them. (We use rescan
    4425                 :      * rather than just reset because transfns may have registered callbacks
    4426                 :      * that need to be run now.) For the AGG_HASHED case, see below.
    4427                 :      */
    4428                 : 
    4429 CBC      163534 :     for (setno = 0; setno < numGroupingSets; setno++)
    4430                 :     {
    4431           81776 :         ReScanExprContext(node->aggcontexts[setno]);
    4432                 :     }
    4433 ECB             : 
    4434                 :     /* Release first tuple of group, if we have made a copy */
    4435 CBC       81758 :     if (node->grp_firstTuple != NULL)
    4436                 :     {
    4437 UBC           0 :         heap_freetuple(node->grp_firstTuple);
    4438               0 :         node->grp_firstTuple = NULL;
    4439                 :     }
    4440 GIC       81758 :     ExecClearTuple(node->ss.ss_ScanTupleSlot);
    4441                 : 
    4442                 :     /* Forget current agg values */
    4443          125659 :     MemSet(econtext->ecxt_aggvalues, 0, sizeof(Datum) * node->numaggs);
    4444           81758 :     MemSet(econtext->ecxt_aggnulls, 0, sizeof(bool) * node->numaggs);
    4445                 : 
    4446                 :     /*
    4447                 :      * With AGG_HASHED/MIXED, the hash table is allocated in a sub-context of
    4448                 :      * the hashcontext. This used to be an issue, but now, resetting a context
    4449                 :      * automatically deletes sub-contexts too.
    4450                 :      */
    4451 CBC       81758 :     if (node->aggstrategy == AGG_HASHED || node->aggstrategy == AGG_MIXED)
    4452                 :     {
    4453           37914 :         hashagg_reset_spill_state(node);
    4454                 : 
    4455 GIC       37914 :         node->hash_ever_spilled = false;
    4456           37914 :         node->hash_spill_mode = false;
    4457 CBC       37914 :         node->hash_ngroups_current = 0;
    4458                 : 
    4459 GBC       37914 :         ReScanExprContext(node->hashcontext);
    4460 EUB             :         /* Rebuild an empty hash table */
    4461 GIC       37914 :         build_hash_tables(node);
    4462 CBC       37914 :         node->table_filled = false;
    4463                 :         /* iterator will be reset when the table is filled */
    4464                 : 
    4465           37914 :         hashagg_recompile_expressions(node, false, false);
    4466 ECB             :     }
    4467                 : 
    4468 GIC       81758 :     if (node->aggstrategy != AGG_HASHED)
    4469                 :     {
    4470                 :         /*
    4471                 :          * Reset the per-group state (in particular, mark transvalues null)
    4472                 :          */
    4473 CBC       87736 :         for (setno = 0; setno < numGroupingSets; setno++)
    4474                 :         {
    4475          131667 :             MemSet(node->pergroups[setno], 0,
    4476                 :                    sizeof(AggStatePerGroupData) * node->numaggs);
    4477 ECB             :         }
    4478                 : 
    4479                 :         /* reset to phase 1 */
    4480 GIC       43859 :         initialize_phase(node, 1);
    4481 ECB             : 
    4482 GIC       43859 :         node->input_done = false;
    4483 CBC       43859 :         node->projected_set = -1;
    4484 ECB             :     }
    4485                 : 
    4486 GIC       81758 :     if (outerPlan->chgParam == NULL)
    4487 CBC          76 :         ExecReScan(outerPlan);
    4488                 : }
    4489                 : 
    4490 ECB             : 
    4491                 : /***********************************************************************
    4492                 :  * API exposed to aggregate functions
    4493                 :  ***********************************************************************/
    4494                 : 
    4495                 : 
    4496                 : /*
    4497                 :  * AggCheckCallContext - test if a SQL function is being called as an aggregate
    4498                 :  *
    4499                 :  * The transition and/or final functions of an aggregate may want to verify
    4500                 :  * that they are being called as aggregates, rather than as plain SQL
    4501                 :  * functions.  They should use this function to do so.  The return value
    4502                 :  * is nonzero if being called as an aggregate, or zero if not.  (Specific
    4503                 :  * nonzero values are AGG_CONTEXT_AGGREGATE or AGG_CONTEXT_WINDOW, but more
    4504                 :  * values could conceivably appear in future.)
    4505                 :  *
    4506                 :  * If aggcontext isn't NULL, the function also stores at *aggcontext the
    4507                 :  * identity of the memory context that aggregate transition values are being
    4508                 :  * stored in.  Note that the same aggregate call site (flinfo) may be called
    4509                 :  * interleaved on different transition values in different contexts, so it's
    4510                 :  * not kosher to cache aggcontext under fn_extra.  It is, however, kosher to
    4511                 :  * cache it in the transvalue itself (for internal-type transvalues).
    4512                 :  */
    4513                 : int
    4514 GIC     2367717 : AggCheckCallContext(FunctionCallInfo fcinfo, MemoryContext *aggcontext)
    4515                 : {
    4516         2367717 :     if (fcinfo->context && IsA(fcinfo->context, AggState))
    4517                 :     {
    4518         2330657 :         if (aggcontext)
    4519                 :         {
    4520          980689 :             AggState   *aggstate = ((AggState *) fcinfo->context);
    4521          980689 :             ExprContext *cxt = aggstate->curaggcontext;
    4522                 : 
    4523          980689 :             *aggcontext = cxt->ecxt_per_tuple_memory;
    4524                 :         }
    4525         2330657 :         return AGG_CONTEXT_AGGREGATE;
    4526                 :     }
    4527           37060 :     if (fcinfo->context && IsA(fcinfo->context, WindowAggState))
    4528                 :     {
    4529           36126 :         if (aggcontext)
    4530             295 :             *aggcontext = ((WindowAggState *) fcinfo->context)->curaggcontext;
    4531           36126 :         return AGG_CONTEXT_WINDOW;
    4532                 :     }
    4533                 : 
    4534                 :     /* this is just to prevent "uninitialized variable" warnings */
    4535             934 :     if (aggcontext)
    4536 CBC         910 :         *aggcontext = NULL;
    4537 GIC         934 :     return 0;
    4538 ECB             : }
    4539                 : 
    4540                 : /*
    4541                 :  * AggGetAggref - allow an aggregate support function to get its Aggref
    4542                 :  *
    4543                 :  * If the function is being called as an aggregate support function,
    4544                 :  * return the Aggref node for the aggregate call.  Otherwise, return NULL.
    4545                 :  *
    4546                 :  * Aggregates sharing the same inputs and transition functions can get
    4547                 :  * merged into a single transition calculation.  If the transition function
    4548                 :  * calls AggGetAggref, it will get some one of the Aggrefs for which it is
    4549                 :  * executing.  It must therefore not pay attention to the Aggref fields that
    4550                 :  * relate to the final function, as those are indeterminate.  But if a final
    4551                 :  * function calls AggGetAggref, it will get a precise result.
    4552                 :  *
    4553                 :  * Note that if an aggregate is being used as a window function, this will
    4554                 :  * return NULL.  We could provide a similar function to return the relevant
    4555                 :  * WindowFunc node in such cases, but it's not needed yet.
    4556                 :  */
    4557                 : Aggref *
    4558 CBC         123 : AggGetAggref(FunctionCallInfo fcinfo)
    4559 ECB             : {
    4560 GIC         123 :     if (fcinfo->context && IsA(fcinfo->context, AggState))
    4561                 :     {
    4562             123 :         AggState   *aggstate = (AggState *) fcinfo->context;
    4563                 :         AggStatePerAgg curperagg;
    4564                 :         AggStatePerTrans curpertrans;
    4565                 : 
    4566                 :         /* check curperagg (valid when in a final function) */
    4567             123 :         curperagg = aggstate->curperagg;
    4568                 : 
    4569             123 :         if (curperagg)
    4570 UIC           0 :             return curperagg->aggref;
    4571                 : 
    4572                 :         /* check curpertrans (valid when in a transition function) */
    4573 GIC         123 :         curpertrans = aggstate->curpertrans;
    4574                 : 
    4575             123 :         if (curpertrans)
    4576             123 :             return curpertrans->aggref;
    4577                 :     }
    4578 UIC           0 :     return NULL;
    4579                 : }
    4580 ECB             : 
    4581                 : /*
    4582                 :  * AggGetTempMemoryContext - fetch short-term memory context for aggregates
    4583                 :  *
    4584                 :  * This is useful in agg final functions; the context returned is one that
    4585                 :  * the final function can safely reset as desired.  This isn't useful for
    4586                 :  * transition functions, since the context returned MAY (we don't promise)
    4587                 :  * be the same as the context those are called in.
    4588                 :  *
    4589                 :  * As above, this is currently not useful for aggs called as window functions.
    4590                 :  */
    4591                 : MemoryContext
    4592 UBC           0 : AggGetTempMemoryContext(FunctionCallInfo fcinfo)
    4593                 : {
    4594 UIC           0 :     if (fcinfo->context && IsA(fcinfo->context, AggState))
    4595 ECB             :     {
    4596 UIC           0 :         AggState   *aggstate = (AggState *) fcinfo->context;
    4597 ECB             : 
    4598 LBC           0 :         return aggstate->tmpcontext->ecxt_per_tuple_memory;
    4599                 :     }
    4600 UBC           0 :     return NULL;
    4601                 : }
    4602                 : 
    4603                 : /*
    4604                 :  * AggStateIsShared - find out whether transition state is shared
    4605                 :  *
    4606                 :  * If the function is being called as an aggregate support function,
    4607                 :  * return true if the aggregate's transition state is shared across
    4608                 :  * multiple aggregates, false if it is not.
    4609                 :  *
    4610                 :  * Returns true if not called as an aggregate support function.
    4611                 :  * This is intended as a conservative answer, ie "no you'd better not
    4612                 :  * scribble on your input".  In particular, will return true if the
    4613                 :  * aggregate is being used as a window function, which is a scenario
    4614 EUB             :  * in which changing the transition state is a bad idea.  We might
    4615                 :  * want to refine the behavior for the window case in future.
    4616                 :  */
    4617                 : bool
    4618 GBC         123 : AggStateIsShared(FunctionCallInfo fcinfo)
    4619                 : {
    4620             123 :     if (fcinfo->context && IsA(fcinfo->context, AggState))
    4621                 :     {
    4622             123 :         AggState   *aggstate = (AggState *) fcinfo->context;
    4623                 :         AggStatePerAgg curperagg;
    4624                 :         AggStatePerTrans curpertrans;
    4625                 : 
    4626                 :         /* check curperagg (valid when in a final function) */
    4627 GIC         123 :         curperagg = aggstate->curperagg;
    4628                 : 
    4629             123 :         if (curperagg)
    4630 UIC           0 :             return aggstate->pertrans[curperagg->transno].aggshared;
    4631                 : 
    4632                 :         /* check curpertrans (valid when in a transition function) */
    4633 GIC         123 :         curpertrans = aggstate->curpertrans;
    4634                 : 
    4635             123 :         if (curpertrans)
    4636             123 :             return curpertrans->aggshared;
    4637                 :     }
    4638 UIC           0 :     return true;
    4639                 : }
    4640 ECB             : 
    4641                 : /*
    4642                 :  * AggRegisterCallback - register a cleanup callback for an aggregate
    4643                 :  *
    4644                 :  * This is useful for aggs to register shutdown callbacks, which will ensure
    4645                 :  * that non-memory resources are freed.  The callback will occur just before
    4646                 :  * the associated aggcontext (as returned by AggCheckCallContext) is reset,
    4647                 :  * either between groups or as a result of rescanning the query.  The callback
    4648                 :  * will NOT be called on error paths.  The typical use-case is for freeing of
    4649                 :  * tuplestores or tuplesorts maintained in aggcontext, or pins held by slots
    4650                 :  * created by the agg functions.  (The callback will not be called until after
    4651                 :  * the result of the finalfn is no longer needed, so it's safe for the finalfn
    4652 EUB             :  * to return data that will be freed by the callback.)
    4653                 :  *
    4654                 :  * As above, this is currently not useful for aggs called as window functions.
    4655 ECB             :  */
    4656                 : void
    4657 CBC         330 : AggRegisterCallback(FunctionCallInfo fcinfo,
    4658 ECB             :                     ExprContextCallbackFunction func,
    4659                 :                     Datum arg)
    4660 EUB             : {
    4661 GIC         330 :     if (fcinfo->context && IsA(fcinfo->context, AggState))
    4662                 :     {
    4663             330 :         AggState   *aggstate = (AggState *) fcinfo->context;
    4664             330 :         ExprContext *cxt = aggstate->curaggcontext;
    4665                 : 
    4666             330 :         RegisterExprContextCallback(cxt, func, arg);
    4667                 : 
    4668             330 :         return;
    4669                 :     }
    4670 UIC           0 :     elog(ERROR, "aggregate function cannot register a callback in this context");
    4671                 : }
    4672                 : 
    4673                 : 
    4674                 : /* ----------------------------------------------------------------
    4675                 :  *                      Parallel Query Support
    4676                 :  * ----------------------------------------------------------------
    4677                 :  */
    4678                 : 
    4679 ECB             :  /* ----------------------------------------------------------------
    4680                 :   *     ExecAggEstimate
    4681                 :   *
    4682                 :   *     Estimate space required to propagate aggregate statistics.
    4683                 :   * ----------------------------------------------------------------
    4684                 :   */
    4685                 : void
    4686 CBC         280 : ExecAggEstimate(AggState *node, ParallelContext *pcxt)
    4687                 : {
    4688 ECB             :     Size        size;
    4689                 : 
    4690                 :     /* don't need this if not instrumenting or no workers */
    4691 GIC         280 :     if (!node->ss.ps.instrument || pcxt->nworkers == 0)
    4692 GBC         226 :         return;
    4693                 : 
    4694 GIC          54 :     size = mul_size(pcxt->nworkers, sizeof(AggregateInstrumentation));
    4695              54 :     size = add_size(size, offsetof(SharedAggInfo, sinstrument));
    4696              54 :     shm_toc_estimate_chunk(&pcxt->estimator, size);
    4697              54 :     shm_toc_estimate_keys(&pcxt->estimator, 1);
    4698                 : }
    4699                 : 
    4700                 : /* ----------------------------------------------------------------
    4701                 :  *      ExecAggInitializeDSM
    4702                 :  *
    4703                 :  *      Initialize DSM space for aggregate statistics.
    4704                 :  * ----------------------------------------------------------------
    4705                 :  */
    4706                 : void
    4707             280 : ExecAggInitializeDSM(AggState *node, ParallelContext *pcxt)
    4708 ECB             : {
    4709                 :     Size        size;
    4710                 : 
    4711                 :     /* don't need this if not instrumenting or no workers */
    4712 GIC         280 :     if (!node->ss.ps.instrument || pcxt->nworkers == 0)
    4713 CBC         226 :         return;
    4714 ECB             : 
    4715 GIC          54 :     size = offsetof(SharedAggInfo, sinstrument)
    4716 CBC          54 :         + pcxt->nworkers * sizeof(AggregateInstrumentation);
    4717              54 :     node->shared_info = shm_toc_allocate(pcxt->toc, size);
    4718 ECB             :     /* ensure any unfilled slots will contain zeroes */
    4719 CBC          54 :     memset(node->shared_info, 0, size);
    4720 GIC          54 :     node->shared_info->num_workers = pcxt->nworkers;
    4721              54 :     shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id,
    4722              54 :                    node->shared_info);
    4723                 : }
    4724                 : 
    4725                 : /* ----------------------------------------------------------------
    4726                 :  *      ExecAggInitializeWorker
    4727                 :  *
    4728                 :  *      Attach worker to DSM space for aggregate statistics.
    4729 ECB             :  * ----------------------------------------------------------------
    4730                 :  */
    4731                 : void
    4732 GIC         768 : ExecAggInitializeWorker(AggState *node, ParallelWorkerContext *pwcxt)
    4733                 : {
    4734 CBC         768 :     node->shared_info =
    4735             768 :         shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, true);
    4736 GIC         768 : }
    4737 ECB             : 
    4738                 : /* ----------------------------------------------------------------
    4739                 :  *      ExecAggRetrieveInstrumentation
    4740                 :  *
    4741                 :  *      Transfer aggregate statistics from DSM to private memory.
    4742                 :  * ----------------------------------------------------------------
    4743                 :  */
    4744                 : void
    4745 GIC          54 : ExecAggRetrieveInstrumentation(AggState *node)
    4746                 : {
    4747                 :     Size        size;
    4748                 :     SharedAggInfo *si;
    4749                 : 
    4750              54 :     if (node->shared_info == NULL)
    4751 UIC           0 :         return;
    4752                 : 
    4753 GIC          54 :     size = offsetof(SharedAggInfo, sinstrument)
    4754 CBC          54 :         + node->shared_info->num_workers * sizeof(AggregateInstrumentation);
    4755 GIC          54 :     si = palloc(size);
    4756 CBC          54 :     memcpy(si, node->shared_info, size);
    4757              54 :     node->shared_info = si;
    4758 ECB             : }
        

Generated by: LCOV version v1.16-55-g56c0a2a