- What's vertical scalability?
- Architectures we care about.
- Why do we even care about vertical vs. horizontal scalability?
  - the number of cores keeps growing
  - latency, latency, latency

Past bottlenecks:

- Heavyweight locks
  - dynamic "lock identities"
  - fair
  - shared lock table
- Fast-path locking - 9.2
  - most relation locks don't conflict with each other
  - if no conflicting lock exists, acquire the lock locally, per backend
  - no shared locks!
  - when acquiring a conflicting lock: check all fast-path owners

  2x E5-2676, pgbench read-only:

  nclients       fastpath          plain
         1   15767.613029   14648.321682
         2   30181.087726   31050.552320
         4   58022.919055   53580.015882
         8   99929.592542   79902.211657
        16  197863.674721  116520.584584
        32  402500.476521  125716.291930
        64  516190.013200  103557.810298
        96  524033.645406   98968.013220
       128  526929.427577   97300.116397
       196  518387.253132   94800.963682
       256  513310.256754   89605.783986

- XLOG insertion - 9.4
  - WAL is a sequential structure
  - buffered in memory
  - old way: insertions serialized behind an exclusive lock
  - new way: reserve space first, then fill it in in parallel

   1 -  52.711939
   8 - 286.496054
  16 - 346.113313
  24 - 363.242111

  nclients    old way     new way
         1  45.054616   44.896155
         2  61.825701   63.758291
         4  84.540911  101.886975
         8  86.992427  123.295212
        16  81.344399  142.994028
        32  82.789298  180.576711
        64  71.673193  186.595098
        96  60.401167  188.734743
       128  57.654713  183.889669
       196  50.175884  175.800850
       256  48.403708  175.850582

- LWLock scalability - 9.5
  - queued reader/writer lock
  - used for buffer locks, protecting hash tables, and a lot of other things
  - often used in shared mode
  - but the lock's state was itself protected by a spinlock => massive spinlock contention

    89.53%  postgres  postgres  [.] s_lock
     2.53%  postgres  postgres  [.] LWLockAcquire
     1.79%  postgres  postgres  [.] LWLockRelease
     0.63%  postgres  postgres  [.] hash_search_with_hash_value

  - now: protect the lock state using only atomics
  - complexities around queueing

       old code  new code
    1     11466     11395
    4     53846     53876
    8    102673    102040
   16    174818    176274
   32    293249    295961
   48    348542    377979
   64    217754    447015
   96    149011    461657
  128    135191    457799

- Buffer replacement - 9.5
  - used to be protected by a single lock
  - first made to use a spinlock in a granular way
  - then removed the lock entirely

Future problems:

- Extension lock
  - diagram
  - somewhat simple
- Snapshot computation
  - cost is linear in the number of connections
  - worse in practice due to cache-hierarchy effects
  - conflicts with commits
  - two different approaches discussed
- Cache replacement
- Buffer pinning
- Root pages