- goal of talk
  - explain to others why the topic is important
  - build a shared understanding
  - get other people to work on the problem
  - fixing NUMA issues will often help on non-NUMA systems too
  - approaches implemented in a hacky way, not claiming the topics
- what is NUMA
  - increased latency, decreased bandwidth
  - "official" part
  - "unofficial" part
- why it is more important than it used to be
  - core counts keep increasing
  - core-to-core latency is also increasing
  - chiplet designs
- how NUMA works on Linux
  - default placement: local to the allocating CPU
  - placement determined on first use! -> prewarming (or similar) will drastically change where memory ends up
  - NUMA balancing (costly, but better than nothing)
- problems with NUMA in Postgres
  - no insight -> can't explain performance behaviour
  - buffer-related state is the worst problem
    - buffer replacement is not NUMA aware: everything is interspersed
      - also bad for non-NUMA systems! (TLB misses, mdreadv())
    - examples:
      - N sequential scans on independent tables
      - 1 [parallel] sequential scan on one table
    - solution:
      - #cpus freelists (one per CPU)
      - NUMA-aware clock sweep
  - heavy contention for frequently accessed read-only buffers
    - one NUMA node can be faster than all NUMA nodes!
    - using a different database on each NUMA node scales linearly
    - duplicating data reduces cache efficiency
    - solution: fastpath locking for buffers
      - hard part: deciding when to use it
  - without numactl --interleave=all: often all memory ends up on one node
  - with --interleave=all: local memory is also allocated in interleaved fashion, ~10-20% perf loss
  - other stuff:
    - read-mostly and frequently changing data on the same cacheline
      - example: TransamVariablesData
      - 50% faster concurrent xid assignment with subxids
    - procarray probably "too dense"
      - possible solution: per-NUMA-node freelists?
- how to identify
  - perf c2c can be helpful
  - perf stat -e mem_load_l3_miss_retired.remote_dram,mem_load_l3_miss_retired.remote_fwd,mem_load_l3_miss_retired.remote_hitm,mem_load_l3_miss_retired.local_dram,uncore_imc/cas_count_read/,uncore_imc/cas_count_write/