LCOV - Differential Code Coverage HEAD vs 15 - src/backend/access/brin/brin

LCOV - differential code coverage report

Current view:	top level - src/backend/access/brin - brin_bloom.c (source / functions)		Coverage	Total	Hit	UNC	UBC	CBC	DUB
Current:	Differential Code Coverage HEAD vs 15	Lines:	76.8 %	151	116	1	34	116	1
Current Date:	2023-04-08 17:13:01	Functions:	64.3 %	14	9	1	4	9
Baseline:	15	Line coverage date bins:
Baseline Date:	2023-04-08 15:09:40	(180,240] days:	0.0 %	1	0	1
Legend:	Lines: hit not hit	(240..) days:	77.3 %	150	116		34	116
		Function coverage date bins:
		(240..) days:	64.3 %	14	9	1	4	9

 Age         Owner                  TLA  Line data    Source code

                                  1                 : /*
                                  2                 :  * brin_bloom.c
                                  3                 :  *      Implementation of Bloom opclass for BRIN
                                  4                 :  *
                                  5                 :  * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
                                  6                 :  * Portions Copyright (c) 1994, Regents of the University of California
                                  7                 :  *
                                  8                 :  *
                                  9                 :  * A BRIN opclass summarizing page range into a bloom filter.
                                 10                 :  *
                                 11                 :  * Bloom filters allow efficient testing whether a given page range contains
                                 12                 :  * a particular value. Therefore, if we summarize each page range into a small
                                 13                 :  * bloom filter, we can easily (and cheaply) test whether it contains values
                                 14                 :  * we get later.
                                 15                 :  *
                                 16                 :  * The index only supports equality operators, similarly to hash indexes.
                                 17                 :  * Bloom indexes are however much smaller, and support only bitmap scans.
                                 18                 :  *
                                 19                 :  * Note: Don't confuse this with bloom indexes, implemented in a contrib
                                 20                 :  * module. That extension implements an entirely new AM, building a bloom
                                 21                 :  * filter on multiple columns in a single row. This opclass works with an
                                 22                 :  * existing AM (BRIN) and builds bloom filter on a column.
                                 23                 :  *
                                 24                 :  *
                                 25                 :  * values vs. hashes
                                 26                 :  * -----------------
                                 27                 :  *
                                 28                 :  * The original column values are not used directly, but are first hashed
                                 29                 :  * using the regular type-specific hash function, producing a uint32 hash.
                                 30                 :  * And this hash value is then added to the summary - i.e. it's hashed
                                 31                 :  * again and added to the bloom filter.
                                 32                 :  *
                                 33                 :  * This allows the code to treat all data types (byval/byref/...) the same
                                 34                 :  * way, with only minimal space requirements, because we're working with
                                 35                 :  * hashes and not the original values. Everything is uint32.
                                 36                 :  *
                                 37                 :  * Of course, this assumes the built-in hash function is reasonably good,
                                 38                 :  * without too many collisions etc. But that does seem to be the case, at
                                 39                 :  * least based on past experience. After all, the same hash functions are
                                 40                 :  * used for hash indexes, hash partitioning and so on.
                                 41                 :  *
                                 42                 :  *
                                 43                 :  * hashing scheme
                                 44                 :  * --------------
                                 45                 :  *
                                 46                 :  * Bloom filters require a number of independent hash functions. There are
                                 47                 :  * different schemes how to construct them - for example we might use
                                 48                 :  * hash_uint32_extended with random seeds, but that seems fairly expensive.
                                 49                 :  * We use a scheme requiring only two functions described in this paper:
                                 50                 :  *
                                 51                 :  * Less Hashing, Same Performance:Building a Better Bloom Filter
                                 52                 :  * Adam Kirsch, Michael Mitzenmacher, Harvard School of Engineering and
                                 53                 :  * Applied Sciences, Cambridge, Massachusetts [DOI 10.1002/rsa.20208]
                                 54                 :  *
                                 55                 :  * The two hash functions h1 and h2 are calculated using hard-coded seeds,
                                 56                 :  * and then combined using (h1 + i * h2) to generate the hash functions.
                                 57                 :  *
                                 58                 :  *
                                 59                 :  * sizing the bloom filter
                                 60                 :  * -----------------------
                                 61                 :  *
                                 62                 :  * Size of a bloom filter depends on the number of distinct values we will
                                 63                 :  * store in it, and the desired false positive rate. The higher the number
                                 64                 :  * of distinct values and/or the lower the false positive rate, the larger
                                 65                 :  * the bloom filter. On the other hand, we want to keep the index as small
                                 66                 :  * as possible - that's one of the basic advantages of BRIN indexes.
                                 67                 :  *
                                 68                 :  * Although the number of distinct elements (in a page range) depends on
                                 69                 :  * the data, we can consider it fixed. This simplifies the trade-off to
                                 70                 :  * just false positive rate vs. size.
                                 71                 :  *
                                 72                 :  * At the page range level, false positive rate is a probability the bloom
                                 73                 :  * filter matches a random value. For the whole index (with sufficiently
                                 74                 :  * many page ranges) it represents the fraction of the index ranges (and
                                 75                 :  * thus fraction of the table to be scanned) matching the random value.
                                 76                 :  *
                                 77                 :  * Furthermore, the size of the bloom filter is subject to implementation
                                 78                 :  * limits - it has to fit onto a single index page (8kB by default). As
                                 79                 :  * the bitmap is inherently random (when "full" about half the bits is set
                                 80                 :  * to 1, randomly), compression can't help very much.
                                 81                 :  *
                                 82                 :  * To reduce the size of a filter (to fit to a page), we have to either
                                 83                 :  * accept higher false positive rate (undesirable), or reduce the number
                                 84                 :  * of distinct items to be stored in the filter. We can't alter the input
                                 85                 :  * data, of course, but we may make the BRIN page ranges smaller - instead
                                 86                 :  * of the default 128 pages (1MB) we may build index with 16-page ranges,
                                 87                 :  * or something like that. This should reduce the number of distinct values
                                 88                 :  * in the page range, making the filter smaller (with fixed false positive
                                 89                 :  * rate). Even for random data sets this should help, as the number of rows
                                 90                 :  * per heap page is limited (to ~290 with very narrow tables, likely ~20
                                 91                 :  * in practice).
                                 92                 :  *
                                 93                 :  * Of course, good sizing decisions depend on having the necessary data,
                                 94                 :  * i.e. number of distinct values in a page range (of a given size) and
                                 95                 :  * table size (to estimate cost change due to change in false positive
                                 96                 :  * rate due to having larger index vs. scanning larger indexes). We may
                                 97                 :  * not have that data - for example when building an index on empty table
                                 98                 :  * it's not really possible. And for some data we only have estimates for
                                 99                 :  * the whole table and we can only estimate per-range values (ndistinct).
                                100                 :  *
                                101                 :  * Another challenge is that while the bloom filter is per-column, it's
                                102                 :  * the whole index tuple that has to fit into a page. And for multi-column
                                103                 :  * indexes that may include pieces we have no control over (not necessarily
                                104                 :  * bloom filters, the other columns may use other BRIN opclasses). So it's
                                105                 :  * not entirely clear how to distribute the space between those columns.
                                106                 :  *
                                107                 :  * The current logic, implemented in brin_bloom_get_ndistinct, attempts to
                                108                 :  * make some basic sizing decisions, based on the size of BRIN ranges, and
                                109                 :  * the maximum number of rows per range.
                                110                 :  *
                                111                 :  *
                                112                 :  * IDENTIFICATION
                                113                 :  *    src/backend/access/brin/brin_bloom.c
                                114                 :  */
                                115                 : #include "postgres.h"
                                116                 : 
                                117                 : #include "access/genam.h"
                                118                 : #include "access/brin.h"
                                119                 : #include "access/brin_internal.h"
                                120                 : #include "access/brin_page.h"
                                121                 : #include "access/brin_tuple.h"
                                122                 : #include "access/hash.h"
                                123                 : #include "access/htup_details.h"
                                124                 : #include "access/reloptions.h"
                                125                 : #include "access/stratnum.h"
                                126                 : #include "catalog/pg_type.h"
                                127                 : #include "catalog/pg_amop.h"
                                128                 : #include "utils/builtins.h"
                                129                 : #include "utils/datum.h"
                                130                 : #include "utils/lsyscache.h"
                                131                 : #include "utils/rel.h"
                                132                 : #include "utils/syscache.h"
                                133                 : 
                                134                 : #include <math.h>
                                135                 : 
                                136                 : #define BloomEqualStrategyNumber    1
                                137                 : 
                                138                 : /*
                                139                 :  * Additional SQL level support functions. We only have one, which is
                                140                 :  * used to calculate hash of the input value.
                                141                 :  *
                                142                 :  * Procedure numbers must not use values reserved for BRIN itself; see
                                143                 :  * brin_internal.h.
                                144                 :  */
                                145                 : #define     BLOOM_MAX_PROCNUMS      1   /* maximum support procs we need */
                                146                 : #define     PROCNUM_HASH            11  /* required */
                                147                 : 
                                148                 : /*
                                149                 :  * Subtract this from procnum to obtain index in BloomOpaque arrays
                                150                 :  * (Must be equal to minimum of private procnums).
                                151                 :  */
                                152                 : #define     PROCNUM_BASE            11
                                153                 : 
                                154                 : /*
                                155                 :  * Storage type for BRIN's reloptions.
                                156                 :  */
                                157                 : typedef struct BloomOptions
                                158                 : {
                                159                 :     int32       vl_len_;        /* varlena header (do not touch directly!) */
                                160                 :     double      nDistinctPerRange;  /* number of distinct values per range */
                                161                 :     double      falsePositiveRate;  /* false positive for bloom filter */
                                162                 : } BloomOptions;
                                163                 : 
                                164                 : /*
                                165                 :  * The current min value (16) is somewhat arbitrary, but it's based
                                166                 :  * on the fact that the filter header is ~20B alone, which is about
                                167                 :  * the same as the filter bitmap for 16 distinct items with 1% false
                                168                 :  * positive rate. So by allowing lower values we'd not gain much. In
                                169                 :  * any case, the min should not be larger than MaxHeapTuplesPerPage
                                170                 :  * (~290), which is the theoretical maximum for single-page ranges.
                                171                 :  */
                                172                 : #define     BLOOM_MIN_NDISTINCT_PER_RANGE       16
                                173                 : 
                                174                 : /*
                                175                 :  * Used to determine number of distinct items, based on the number of rows
                                176                 :  * in a page range. The 10% is somewhat similar to what estimate_num_groups
                                177                 :  * does, so we use the same factor here.
                                178                 :  */
                                179                 : #define     BLOOM_DEFAULT_NDISTINCT_PER_RANGE   -0.1    /* 10% of values */
                                180                 : 
                                181                 : /*
                                182                 :  * Allowed range and default value for the false positive range. The exact
                                183                 :  * values are somewhat arbitrary, but were chosen considering the various
                                184                 :  * parameters (size of filter vs. page size, etc.).
                                185                 :  *
                                186                 :  * The lower the false-positive rate, the more accurate the filter is, but
                                187                 :  * it also gets larger - at some point this eliminates the main advantage
                                188                 :  * of BRIN indexes, which is the tiny size. At 0.01% the index is about
                                189                 :  * 10% of the table (assuming 290 distinct values per 8kB page).
                                190                 :  *
                                191                 :  * On the other hand, as the false-positive rate increases, larger part of
                                192                 :  * the table has to be scanned due to mismatches - at 25% we're probably
                                193                 :  * close to sequential scan being cheaper.
                                194                 :  */
                                195                 : #define     BLOOM_MIN_FALSE_POSITIVE_RATE   0.0001  /* 0.01% fp rate */
                                196                 : #define     BLOOM_MAX_FALSE_POSITIVE_RATE   0.25    /* 25% fp rate */
                                197                 : #define     BLOOM_DEFAULT_FALSE_POSITIVE_RATE   0.01    /* 1% fp rate */
                                198                 : 
                                199                 : #define BloomGetNDistinctPerRange(opts) \
                                200                 :     ((opts) && (((BloomOptions *) (opts))->nDistinctPerRange != 0) ? \
                                201                 :      (((BloomOptions *) (opts))->nDistinctPerRange) : \
                                202                 :      BLOOM_DEFAULT_NDISTINCT_PER_RANGE)
                                203                 : 
                                204                 : #define BloomGetFalsePositiveRate(opts) \
                                205                 :     ((opts) && (((BloomOptions *) (opts))->falsePositiveRate != 0.0) ? \
                                206                 :      (((BloomOptions *) (opts))->falsePositiveRate) : \
                                207                 :      BLOOM_DEFAULT_FALSE_POSITIVE_RATE)
                                208                 : 
                                209                 : /*
                                210                 :  * And estimate of the largest bloom we can fit onto a page. This is not
                                211                 :  * a perfect guarantee, for a couple of reasons. For example, the row may
                                212                 :  * be larger because the index has multiple columns.
                                213                 :  */
                                214                 : #define BloomMaxFilterSize \
                                215                 :     MAXALIGN_DOWN(BLCKSZ - \
                                216                 :                   (MAXALIGN(SizeOfPageHeaderData + \
                                217                 :                             sizeof(ItemIdData)) + \
                                218                 :                    MAXALIGN(sizeof(BrinSpecialSpace)) + \
                                219                 :                    SizeOfBrinTuple))
                                220                 : 
                                221                 : /*
                                222                 :  * Seeds used to calculate two hash functions h1 and h2, which are then used
                                223                 :  * to generate k hashes using the (h1 + i * h2) scheme.
                                224                 :  */
                                225                 : #define BLOOM_SEED_1    0x71d924af
                                226                 : #define BLOOM_SEED_2    0xba48b314
                                227                 : 
                                228                 : /*
                                229                 :  * Bloom Filter
                                230                 :  *
                                231                 :  * Represents a bloom filter, built on hashes of the indexed values. That is,
                                232                 :  * we compute a uint32 hash of the value, and then store this hash into the
                                233                 :  * bloom filter (and compute additional hashes on it).
                                234                 :  *
                                235                 :  * XXX We could implement "sparse" bloom filters, keeping only the bytes that
                                236                 :  * are not entirely 0. But while indexes don't support TOAST, the varlena can
                                237                 :  * still be compressed. So this seems unnecessary, because the compression
                                238                 :  * should do the same job.
                                239                 :  *
                                240                 :  * XXX We can also watch the number of bits set in the bloom filter, and then
                                241                 :  * stop using it (and not store the bitmap, to save space) when the false
                                242                 :  * positive rate gets too high. But even if the false positive rate exceeds the
                                243                 :  * desired value, it still can eliminate some page ranges.
                                244                 :  */
                                245                 : typedef struct BloomFilter
                                246                 : {
                                247                 :     /* varlena header (do not touch directly!) */
                                248                 :     int32       vl_len_;
                                249                 : 
                                250                 :     /* space for various flags (unused for now) */
                                251                 :     uint16      flags;
                                252                 : 
                                253                 :     /* fields for the HASHED phase */
                                254                 :     uint8       nhashes;        /* number of hash functions */
                                255                 :     uint32      nbits;          /* number of bits in the bitmap (size) */
                                256                 :     uint32      nbits_set;      /* number of bits set to 1 */
                                257                 : 
                                258                 :     /* data of the bloom filter */
                                259                 :     char        data[FLEXIBLE_ARRAY_MEMBER];
                                260                 : } BloomFilter;
                                261                 : 
                                262                 : 
                                263                 : /*
                                264                 :  * bloom_init
                                265                 :  *      Initialize the Bloom Filter, allocate all the memory.
                                266                 :  *
                                267                 :  * The filter is initialized with optimal size for ndistinct expected values
                                268                 :  * and the requested false positive rate. The filter is stored as varlena.
                                269                 :  */
                                270                 : static BloomFilter *
  744 tomas.vondra              271 CBC        3882 : bloom_init(int ndistinct, double false_positive_rate)
                                272                 : {
                                273                 :     Size        len;
                                274                 :     BloomFilter *filter;
                                275                 : 
                                276                 :     int         nbits;          /* size of filter / number of bits */
                                277                 :     int         nbytes;         /* size of filter / number of bytes */
                                278                 : 
                                279                 :     double      k;              /* number of hash functions */
                                280                 : 
                                281            3882 :     Assert(ndistinct > 0);
                                282            3882 :     Assert((false_positive_rate >= BLOOM_MIN_FALSE_POSITIVE_RATE) &&
                                283                 :            (false_positive_rate < BLOOM_MAX_FALSE_POSITIVE_RATE));
                                284                 : 
                                285                 :     /* sizing bloom filter: -(n * ln(p)) / (ln(2))^2 */
                                286            3882 :     nbits = ceil(-(ndistinct * log(false_positive_rate)) / pow(log(2.0), 2));
                                287                 : 
                                288                 :     /* round m to whole bytes */
                                289            3882 :     nbytes = ((nbits + 7) / 8);
                                290            3882 :     nbits = nbytes * 8;
                                291                 : 
                                292                 :     /*
                                293                 :      * Reject filters that are obviously too large to store on a page.
                                294                 :      *
                                295                 :      * Initially the bloom filter is just zeroes and so very compressible, but
                                296                 :      * as we add values it gets more and more random, and so less and less
                                297                 :      * compressible. So initially everything fits on the page, but we might
                                298                 :      * get surprising failures later - we want to prevent that, so we reject
                                299                 :      * bloom filter that are obviously too large.
                                300                 :      *
                                301                 :      * XXX It's not uncommon to oversize the bloom filter a bit, to defend
                                302                 :      * against unexpected data anomalies (parts of table with more distinct
                                303                 :      * values per range etc.). But we still need to make sure even the
                                304                 :      * oversized filter fits on page, if such need arises.
                                305                 :      *
                                306                 :      * XXX This check is not perfect, because the index may have multiple
                                307                 :      * filters that are small individually, but too large when combined.
                                308                 :      */
                                309            3882 :     if (nbytes > BloomMaxFilterSize)
  744 tomas.vondra              310 UBC           0 :         elog(ERROR, "the bloom filter is too large (%d > %zu)", nbytes,
                                311                 :              BloomMaxFilterSize);
                                312                 : 
                                313                 :     /*
                                314                 :      * round(log(2.0) * m / ndistinct), but assume round() may not be
                                315                 :      * available on Windows
                                316                 :      */
  744 tomas.vondra              317 CBC        3882 :     k = log(2.0) * nbits / ndistinct;
                                318            3882 :     k = (k - floor(k) >= 0.5) ? ceil(k) : floor(k);
                                319                 : 
                                320                 :     /*
                                321                 :      * We allocate the whole filter. Most of it is going to be 0 bits, so the
                                322                 :      * varlena is easy to compress.
                                323                 :      */
                                324            3882 :     len = offsetof(BloomFilter, data) + nbytes;
                                325                 : 
                                326            3882 :     filter = (BloomFilter *) palloc0(len);
                                327                 : 
                                328            3882 :     filter->flags = 0;
                                329            3882 :     filter->nhashes = (int) k;
                                330            3882 :     filter->nbits = nbits;
                                331                 : 
                                332            3882 :     SET_VARSIZE(filter, len);
                                333                 : 
                                334            3882 :     return filter;
                                335                 : }
                                336                 : 
                                337                 : 
                                338                 : /*
                                339                 :  * bloom_add_value
                                340                 :  *      Add value to the bloom filter.
                                341                 :  */
                                342                 : static BloomFilter *
  697 tgl                       343           22023 : bloom_add_value(BloomFilter *filter, uint32 value, bool *updated)
                                344                 : {
                                345                 :     int         i;
                                346                 :     uint64      h1,
                                347                 :                 h2;
                                348                 : 
                                349                 :     /* compute the hashes, used for the bloom filter */
  744 tomas.vondra              350           22023 :     h1 = hash_bytes_uint32_extended(value, BLOOM_SEED_1) % filter->nbits;
                                351           22023 :     h2 = hash_bytes_uint32_extended(value, BLOOM_SEED_2) % filter->nbits;
                                352                 : 
                                353                 :     /* compute the requested number of hashes */
                                354          176184 :     for (i = 0; i < filter->nhashes; i++)
                                355                 :     {
                                356                 :         /* h1 + h2 + f(i) */
                                357          154161 :         uint32      h = (h1 + i * h2) % filter->nbits;
                                358          154161 :         uint32      byte = (h / 8);
                                359          154161 :         uint32      bit = (h % 8);
                                360                 : 
                                361                 :         /* if the bit is not set, set it and remember we did that */
                                362          154161 :         if (!(filter->data[byte] & (0x01 << bit)))
                                363                 :         {
                                364           54402 :             filter->data[byte] |= (0x01 << bit);
                                365           54402 :             filter->nbits_set++;
                                366           54402 :             if (updated)
                                367           54402 :                 *updated = true;
                                368                 :         }
                                369                 :     }
                                370                 : 
                                371           22023 :     return filter;
                                372                 : }
                                373                 : 
                                374                 : 
                                375                 : /*
                                376                 :  * bloom_contains_value
                                377                 :  *      Check if the bloom filter contains a particular value.
                                378                 :  */
                                379                 : static bool
  697 tgl                       380            4104 : bloom_contains_value(BloomFilter *filter, uint32 value)
                                381                 : {
                                382                 :     int         i;
                                383                 :     uint64      h1,
                                384                 :                 h2;
                                385                 : 
                                386                 :     /* calculate the two hashes */
  744 tomas.vondra              387            4104 :     h1 = hash_bytes_uint32_extended(value, BLOOM_SEED_1) % filter->nbits;
                                388            4104 :     h2 = hash_bytes_uint32_extended(value, BLOOM_SEED_2) % filter->nbits;
                                389                 : 
                                390                 :     /* compute the requested number of hashes */
                                391            5217 :     for (i = 0; i < filter->nhashes; i++)
                                392                 :     {
                                393                 :         /* h1 + h2 + f(i) */
                                394            5082 :         uint32      h = (h1 + i * h2) % filter->nbits;
                                395            5082 :         uint32      byte = (h / 8);
                                396            5082 :         uint32      bit = (h % 8);
                                397                 : 
                                398                 :         /* if the bit is not set, the value is not there */
                                399            5082 :         if (!(filter->data[byte] & (0x01 << bit)))
                                400            3969 :             return false;
                                401                 :     }
                                402                 : 
                                403                 :     /* all hashes found in bloom filter */
                                404             135 :     return true;
                                405                 : }
                                406                 : 
                                407                 : typedef struct BloomOpaque
                                408                 : {
                                409                 :     /*
                                410                 :      * XXX At this point we only need a single proc (to compute the hash), but
                                411                 :      * let's keep the array just like inclusion and minmax opclasses, for
                                412                 :      * consistency. We may need additional procs in the future.
                                413                 :      */
                                414                 :     FmgrInfo    extra_procinfos[BLOOM_MAX_PROCNUMS];
                                415                 :     bool        extra_proc_missing[BLOOM_MAX_PROCNUMS];
                                416                 : } BloomOpaque;
                                417                 : 
                                418                 : static FmgrInfo *bloom_get_procinfo(BrinDesc *bdesc, uint16 attno,
                                419                 :                                     uint16 procnum);
                                420                 : 
                                421                 : 
                                422                 : Datum
                                423            2300 : brin_bloom_opcinfo(PG_FUNCTION_ARGS)
                                424                 : {
                                425                 :     BrinOpcInfo *result;
                                426                 : 
                                427                 :     /*
                                428                 :      * opaque->strategy_procinfos is initialized lazily; here it is set to
                                429                 :      * all-uninitialized by palloc0 which sets fn_oid to InvalidOid.
                                430                 :      *
                                431                 :      * bloom indexes only store the filter as a single BYTEA column
                                432                 :      */
                                433                 : 
                                434            2300 :     result = palloc0(MAXALIGN(SizeofBrinOpcInfo(1)) +
                                435                 :                      sizeof(BloomOpaque));
                                436            2300 :     result->oi_nstored = 1;
                                437            2300 :     result->oi_regular_nulls = true;
                                438            2300 :     result->oi_opaque = (BloomOpaque *)
                                439            2300 :         MAXALIGN((char *) result + SizeofBrinOpcInfo(1));
                                440            2300 :     result->oi_typcache[0] = lookup_type_cache(PG_BRIN_BLOOM_SUMMARYOID, 0);
                                441                 : 
                                442            2300 :     PG_RETURN_POINTER(result);
                                443                 : }
                                444                 : 
                                445                 : /*
                                446                 :  * brin_bloom_get_ndistinct
                                447                 :  *      Determine the ndistinct value used to size bloom filter.
                                448                 :  *
                                449                 :  * Adjust the ndistinct value based on the pagesPerRange value. First,
                                450                 :  * if it's negative, it's assumed to be relative to maximum number of
                                451                 :  * tuples in the range (assuming each page gets MaxHeapTuplesPerPage
                                452                 :  * tuples, which is likely a significant over-estimate). We also clamp
                                453                 :  * the value, not to over-size the bloom filter unnecessarily.
                                454                 :  *
                                455                 :  * XXX We can only do this when the pagesPerRange value was supplied.
                                456                 :  * If it wasn't, it has to be a read-only access to the index, in which
                                457                 :  * case we don't really care. But perhaps we should fall-back to the
                                458                 :  * default pagesPerRange value?
                                459                 :  *
                                460                 :  * XXX We might also fetch info about ndistinct estimate for the column,
                                461                 :  * and compute the expected number of distinct values in a range. But
                                462                 :  * that may be tricky due to data being sorted in various ways, so it
                                463                 :  * seems better to rely on the upper estimate.
                                464                 :  *
                                465                 :  * XXX We might also calculate a better estimate of rows per BRIN range,
                                466                 :  * instead of using MaxHeapTuplesPerPage (which probably produces values
                                467                 :  * much higher than reality).
                                468                 :  */
                                469                 : static int
                                470            3882 : brin_bloom_get_ndistinct(BrinDesc *bdesc, BloomOptions *opts)
                                471                 : {
                                472                 :     double      ndistinct;
                                473                 :     double      maxtuples;
                                474                 :     BlockNumber pagesPerRange;
                                475                 : 
                                476            3882 :     pagesPerRange = BrinGetPagesPerRange(bdesc->bd_index);
                                477            3882 :     ndistinct = BloomGetNDistinctPerRange(opts);
                                478                 : 
                                479            3882 :     Assert(BlockNumberIsValid(pagesPerRange));
                                480                 : 
                                481            3882 :     maxtuples = MaxHeapTuplesPerPage * pagesPerRange;
                                482                 : 
                                483                 :     /*
                                484                 :      * Similarly to n_distinct, negative values are relative - in this case to
                                485                 :      * maximum number of tuples in the page range (maxtuples).
                                486                 :      */
                                487            3882 :     if (ndistinct < 0)
                                488            3882 :         ndistinct = (-ndistinct) * maxtuples;
                                489                 : 
                                490                 :     /*
                                491                 :      * Positive values are to be used directly, but we still apply a couple of
                                492                 :      * safeties to avoid using unreasonably small bloom filters.
                                493                 :      */
                                494            3882 :     ndistinct = Max(ndistinct, BLOOM_MIN_NDISTINCT_PER_RANGE);
                                495                 : 
                                496                 :     /*
                                497                 :      * And don't use more than the maximum possible number of tuples, in the
                                498                 :      * range, which would be entirely wasteful.
                                499                 :      */
                                500            3882 :     ndistinct = Min(ndistinct, maxtuples);
                                501                 : 
                                502            3882 :     return (int) ndistinct;
                                503                 : }
                                504                 : 
                                505                 : /*
                                506                 :  * Examine the given index tuple (which contains partial status of a certain
                                507                 :  * page range) by comparing it to the given value that comes from another heap
                                508                 :  * tuple.  If the new value is outside the bloom filter specified by the
                                509                 :  * existing tuple values, update the index tuple and return true.  Otherwise,
                                510                 :  * return false and do not modify in this case.
                                511                 :  */
                                512                 : Datum
                                513           22023 : brin_bloom_add_value(PG_FUNCTION_ARGS)
                                514                 : {
                                515           22023 :     BrinDesc   *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
                                516           22023 :     BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
                                517           22023 :     Datum       newval = PG_GETARG_DATUM(2);
                                518           22023 :     bool        isnull PG_USED_FOR_ASSERTS_ONLY = PG_GETARG_DATUM(3);
                                519           22023 :     BloomOptions *opts = (BloomOptions *) PG_GET_OPCLASS_OPTIONS();
                                520           22023 :     Oid         colloid = PG_GET_COLLATION();
                                521                 :     FmgrInfo   *hashFn;
                                522                 :     uint32      hashValue;
                                523           22023 :     bool        updated = false;
                                524                 :     AttrNumber  attno;
                                525                 :     BloomFilter *filter;
                                526                 : 
                                527           22023 :     Assert(!isnull);
                                528                 : 
                                529           22023 :     attno = column->bv_attno;
                                530                 : 
                                531                 :     /*
                                532                 :      * If this is the first non-null value, we need to initialize the bloom
                                533                 :      * filter. Otherwise just extract the existing bloom filter from
                                534                 :      * BrinValues.
                                535                 :      */
                                536           22023 :     if (column->bv_allnulls)
                                537                 :     {
                                538            7764 :         filter = bloom_init(brin_bloom_get_ndistinct(bdesc, opts),
                                539            3882 :                             BloomGetFalsePositiveRate(opts));
                                540            3882 :         column->bv_values[0] = PointerGetDatum(filter);
                                541            3882 :         column->bv_allnulls = false;
                                542            3882 :         updated = true;
                                543                 :     }
                                544                 :     else
                                545           18141 :         filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
                                546                 : 
                                547                 :     /*
                                548                 :      * Compute the hash of the new value, using the supplied hash function,
                                549                 :      * and then add the hash value to the bloom filter.
                                550                 :      */
                                551           22023 :     hashFn = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
                                552                 : 
                                553           22023 :     hashValue = DatumGetUInt32(FunctionCall1Coll(hashFn, colloid, newval));
                                554                 : 
                                555           22023 :     filter = bloom_add_value(filter, hashValue, &updated);
                                556                 : 
                                557           22023 :     column->bv_values[0] = PointerGetDatum(filter);
                                558                 : 
                                559           22023 :     PG_RETURN_BOOL(updated);
                                560                 : }
                                561                 : 
                                562                 : /*
                                563                 :  * Given an index tuple corresponding to a certain page range and a scan key,
                                564                 :  * return whether the scan key is consistent with the index tuple's bloom
                                565                 :  * filter.  Return true if so, false otherwise.
                                566                 :  */
                                567                 : Datum
                                568            4104 : brin_bloom_consistent(PG_FUNCTION_ARGS)
                                569                 : {
                                570            4104 :     BrinDesc   *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
                                571            4104 :     BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
                                572            4104 :     ScanKey    *keys = (ScanKey *) PG_GETARG_POINTER(2);
                                573            4104 :     int         nkeys = PG_GETARG_INT32(3);
                                574            4104 :     Oid         colloid = PG_GET_COLLATION();
                                575                 :     AttrNumber  attno;
                                576                 :     Datum       value;
                                577                 :     Datum       matches;
                                578                 :     FmgrInfo   *finfo;
                                579                 :     uint32      hashValue;
                                580                 :     BloomFilter *filter;
                                581                 :     int         keyno;
                                582                 : 
                                583            4104 :     filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
                                584                 : 
                                585            4104 :     Assert(filter);
                                586                 : 
                                587            4104 :     matches = true;
                                588                 : 
                                589            4239 :     for (keyno = 0; keyno < nkeys; keyno++)
                                590                 :     {
                                591            4104 :         ScanKey     key = keys[keyno];
                                592                 : 
                                593                 :         /* NULL keys are handled and filtered-out in bringetbitmap */
                                594            4104 :         Assert(!(key->sk_flags & SK_ISNULL));
                                595                 : 
                                596            4104 :         attno = key->sk_attno;
                                597            4104 :         value = key->sk_argument;
                                598                 : 
                                599            4104 :         switch (key->sk_strategy)
                                600                 :         {
                                601            4104 :             case BloomEqualStrategyNumber:
                                602                 : 
                                603                 :                 /*
                                604                 :                  * In the equality case (WHERE col = someval), we want to
                                605                 :                  * return the current page range if the minimum value in the
                                606                 :                  * range <= scan key, and the maximum value >= scan key.
                                607                 :                  */
                                608            4104 :                 finfo = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
                                609                 : 
                                610            4104 :                 hashValue = DatumGetUInt32(FunctionCall1Coll(finfo, colloid, value));
                                611            4104 :                 matches &= bloom_contains_value(filter, hashValue);
                                612                 : 
                                613            4104 :                 break;
  744 tomas.vondra              614 UBC           0 :             default:
                                615                 :                 /* shouldn't happen */
                                616               0 :                 elog(ERROR, "invalid strategy number %d", key->sk_strategy);
                                617                 :                 matches = 0;
                                618                 :                 break;
                                619                 :         }
                                620                 : 
  744 tomas.vondra              621 CBC        4104 :         if (!matches)
                                622            3969 :             break;
                                623                 :     }
                                624                 : 
                                625            4104 :     PG_RETURN_DATUM(matches);
                                626                 : }
                                627                 : 
                                628                 : /*
                                629                 :  * Given two BrinValues, update the first of them as a union of the summary
                                630                 :  * values contained in both.  The second one is untouched.
                                631                 :  *
                                632                 :  * XXX We assume the bloom filters have the same parameters for now. In the
                                633                 :  * future we should have 'can union' function, to decide if we can combine
                                634                 :  * two particular bloom filters.
                                635                 :  */
                                636                 : Datum
  744 tomas.vondra              637 UBC           0 : brin_bloom_union(PG_FUNCTION_ARGS)
                                638                 : {
                                639                 :     int         i;
                                640                 :     int         nbytes;
                                641               0 :     BrinValues *col_a = (BrinValues *) PG_GETARG_POINTER(1);
                                642               0 :     BrinValues *col_b = (BrinValues *) PG_GETARG_POINTER(2);
                                643                 :     BloomFilter *filter_a;
                                644                 :     BloomFilter *filter_b;
                                645                 : 
                                646               0 :     Assert(col_a->bv_attno == col_b->bv_attno);
                                647               0 :     Assert(!col_a->bv_allnulls && !col_b->bv_allnulls);
                                648                 : 
                                649               0 :     filter_a = (BloomFilter *) PG_DETOAST_DATUM(col_a->bv_values[0]);
                                650               0 :     filter_b = (BloomFilter *) PG_DETOAST_DATUM(col_b->bv_values[0]);
                                651                 : 
                                652                 :     /* make sure the filters use the same parameters */
                                653               0 :     Assert(filter_a && filter_b);
                                654               0 :     Assert(filter_a->nbits == filter_b->nbits);
                                655               0 :     Assert(filter_a->nhashes == filter_b->nhashes);
                                656               0 :     Assert((filter_a->nbits > 0) && (filter_a->nbits % 8 == 0));
                                657                 : 
                                658               0 :     nbytes = (filter_a->nbits) / 8;
                                659                 : 
                                660                 :     /* simply OR the bitmaps */
                                661               0 :     for (i = 0; i < nbytes; i++)
                                662               0 :         filter_a->data[i] |= filter_b->data[i];
                                663                 : 
                                664               0 :     PG_RETURN_VOID();
                                665                 : }
                                666                 : 
                                667                 : /*
                                668                 :  * Cache and return inclusion opclass support procedure
                                669                 :  *
                                670                 :  * Return the procedure corresponding to the given function support number
                                671                 :  * or null if it does not exist.
                                672                 :  */
                                673                 : static FmgrInfo *
  744 tomas.vondra              674 CBC       26127 : bloom_get_procinfo(BrinDesc *bdesc, uint16 attno, uint16 procnum)
                                675                 : {
                                676                 :     BloomOpaque *opaque;
                                677           26127 :     uint16      basenum = procnum - PROCNUM_BASE;
                                678                 : 
                                679                 :     /*
                                680                 :      * We cache these in the opaque struct, to avoid repetitive syscache
                                681                 :      * lookups.
                                682                 :      */
                                683           26127 :     opaque = (BloomOpaque *) bdesc->bd_info[attno - 1]->oi_opaque;
                                684                 : 
                                685                 :     /*
                                686                 :      * If we already searched for this proc and didn't find it, don't bother
                                687                 :      * searching again.
                                688                 :      */
                                689           26127 :     if (opaque->extra_proc_missing[basenum])
  744 tomas.vondra              690 UBC           0 :         return NULL;
                                691                 : 
  744 tomas.vondra              692 CBC       26127 :     if (opaque->extra_procinfos[basenum].fn_oid == InvalidOid)
                                693                 :     {
                                694             357 :         if (RegProcedureIsValid(index_getprocid(bdesc->bd_index, attno,
                                695                 :                                                 procnum)))
                                696                 :         {
                                697             357 :             fmgr_info_copy(&opaque->extra_procinfos[basenum],
                                698                 :                            index_getprocinfo(bdesc->bd_index, attno, procnum),
                                699                 :                            bdesc->bd_context);
                                700                 :         }
                                701                 :         else
                                702                 :         {
  744 tomas.vondra              703 UBC           0 :             opaque->extra_proc_missing[basenum] = true;
                                704               0 :             return NULL;
                                705                 :         }
                                706                 :     }
                                707                 : 
  744 tomas.vondra              708 CBC       26127 :     return &opaque->extra_procinfos[basenum];
                                709                 : }
                                710                 : 
                                711                 : Datum
                                712             248 : brin_bloom_options(PG_FUNCTION_ARGS)
                                713                 : {
                                714             248 :     local_relopts *relopts = (local_relopts *) PG_GETARG_POINTER(0);
                                715                 : 
                                716             248 :     init_local_reloptions(relopts, sizeof(BloomOptions));
                                717                 : 
                                718             248 :     add_local_real_reloption(relopts, "n_distinct_per_range",
                                719                 :                              "number of distinct items expected in a BRIN page range",
                                720                 :                              BLOOM_DEFAULT_NDISTINCT_PER_RANGE,
                                721                 :                              -1.0, INT_MAX, offsetof(BloomOptions, nDistinctPerRange));
                                722                 : 
                                723             248 :     add_local_real_reloption(relopts, "false_positive_rate",
                                724                 :                              "desired false-positive rate for the bloom filters",
                                725                 :                              BLOOM_DEFAULT_FALSE_POSITIVE_RATE,
                                726                 :                              BLOOM_MIN_FALSE_POSITIVE_RATE,
                                727                 :                              BLOOM_MAX_FALSE_POSITIVE_RATE,
                                728                 :                              offsetof(BloomOptions, falsePositiveRate));
                                729                 : 
                                730             248 :     PG_RETURN_VOID();
                                731                 : }
                                732                 : 
                                733                 : /*
                                734                 :  * brin_bloom_summary_in
                                735                 :  *      - input routine for type brin_bloom_summary.
                                736                 :  *
                                737                 :  * brin_bloom_summary is only used internally to represent summaries
                                738                 :  * in BRIN bloom indexes, so it has no operations of its own, and we
                                739                 :  * disallow input too.
                                740                 :  */
                                741                 : Datum
  744 tomas.vondra              742 UBC           0 : brin_bloom_summary_in(PG_FUNCTION_ARGS)
                                743                 : {
                                744                 :     /*
                                745                 :      * brin_bloom_summary stores the data in binary form and parsing text
                                746                 :      * input is not needed, so disallow this.
                                747                 :      */
                                748               0 :     ereport(ERROR,
                                749                 :             (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                750                 :              errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
                                751                 : 
                                752                 :     PG_RETURN_VOID();           /* keep compiler quiet */
                                753                 : }
                                754                 : 
                                755                 : 
                                756                 : /*
                                757                 :  * brin_bloom_summary_out
                                758                 :  *      - output routine for type brin_bloom_summary.
                                759                 :  *
                                760                 :  * BRIN bloom summaries are serialized into a bytea value, but we want
                                761                 :  * to output something nicer humans can understand.
                                762                 :  */
                                763                 : Datum
                                764               0 : brin_bloom_summary_out(PG_FUNCTION_ARGS)
                                765                 : {
                                766                 :     BloomFilter *filter;
                                767                 :     StringInfoData str;
                                768                 : 
                                769                 :     /* detoast the data to get value with a full 4B header */
  224 peter                     770 UNC           0 :     filter = (BloomFilter *) PG_DETOAST_DATUM_PACKED(PG_GETARG_DATUM(0));
                                771                 : 
  744 tomas.vondra              772 UBC           0 :     initStringInfo(&str);
                                773               0 :     appendStringInfoChar(&str, '{');
                                774                 : 
                                775               0 :     appendStringInfo(&str, "mode: hashed  nhashes: %u  nbits: %u  nbits_set: %u",
                                776               0 :                      filter->nhashes, filter->nbits, filter->nbits_set);
                                777                 : 
                                778               0 :     appendStringInfoChar(&str, '}');
                                779                 : 
                                780               0 :     PG_RETURN_CSTRING(str.data);
                                781                 : }
                                782                 : 
                                783                 : /*
                                784                 :  * brin_bloom_summary_recv
                                785                 :  *      - binary input routine for type brin_bloom_summary.
                                786                 :  */
                                787                 : Datum
                                788               0 : brin_bloom_summary_recv(PG_FUNCTION_ARGS)
                                789                 : {
                                790               0 :     ereport(ERROR,
                                791                 :             (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                792                 :              errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
                                793                 : 
                                794                 :     PG_RETURN_VOID();           /* keep compiler quiet */
                                795                 : }
                                796                 : 
                                797                 : /*
                                798                 :  * brin_bloom_summary_send
                                799                 :  *      - binary output routine for type brin_bloom_summary.
                                800                 :  *
                                801                 :  * BRIN bloom summaries are serialized in a bytea value (although the
                                802                 :  * type is named differently), so let's just send that.
                                803                 :  */
                                804                 : Datum
                                805               0 : brin_bloom_summary_send(PG_FUNCTION_ARGS)
                                806                 : {
                                807               0 :     return byteasend(fcinfo);
                                808                 : }

Generated by: LCOV version v1.16-55-g56c0a2a