DB: THE TILE AWAKENS
MAY 13, 2024 | CONFIDENTIAL

The Tile Cache Probe
• The Depth Block process begins with a tile arriving on the SC_DB_tile interface. The entry point is the tile cache probe (TCP) dispatcher.
• The TCP holds an input FIFO, usc_db_tile_fifo, which handles the requirements of the credit/debit interface. This FIFO is compiled-RAM based.
• After the first FIFO, the TCP dispatcher sends the tile coordinates to the TCP cache manager (CM) if necessary. The cache manager will send hit/miss information to the next stage, the hi-z-s checker.
• The tile is deposited, along with whether the CM was queried, in a reg-based FIFO of the same depth as the first. It waits here for the cache manager to respond.
• The TCP will assert RTS when the CM has responded.
[Diagram labels: SC_DB_TILE, Input Fifo, Cache Manager, s2s3 fifo, HIZ_S_CHECKER]

The Hierarchical Z S Checker
• First, tile information is accepted from the TCP and deposited in the TCP_TILE_FIFO. The depth of the tile FIFO matches the cache lookup time, so that the cache data, containing the “dst”, is aligned with the “src” data from the TCP.
• The output of the FIFO is the q0 stage. Plane information is converted to 14-bit UNORM; this is the q1 stage.
• The q1 stage goes to both the “depth test” and “stencil compare” subblocks.
• In the depth test, the current dst value is checked against a register-determined minz and maxz. If the dst is outside the range, the tile fails the DB test and will be culled; otherwise, the Z test is done and a z_result is generated. The output of the Z test is q2.
• In the stencil compare, the stencil values are evaluated and an stest_result is generated. The output of the S compare is q2.
• The z_result and stest_result are brought into db_his_stencil_test, where a final hiz_result is generated.
  There is no register stage in this block, so we are still at q2.
• The final result is flopped to the hz_tk outputs, which send the information to the tile_kill block.
[Diagram: tcp_hz -> TCP_TILE_FIFO (q0) -> float to U14 (q1) -> DEPTH TEST (DB and Z) and STENCIL COMPARE (q2) -> HIS_TEST (q2) -> OUTPUT FLOP -> hz_tk (q3)]

Updating the Tile Cache
[Diagram labels: HI-Z, TILE CACHE, TILE SUMMARIZER]

Hi Z Test
• Function: >, =, <
• Does the tile absolutely meet the test criteria?
• Example:
  • zfunc is 1, which means <
  • current (dst): min = 0x30, max = 0x3000
  • new (src): min = 0x20, max = 0x800
  • Result: MAYPASS, MAYFAIL
• The result of Hi-Z updates the tile cache:
  1. HiZ grows ZNear
     o MinZ is set to min(old MinZ, tile MinZ) for any tile that can pass if the Z is < the destination.
     o MaxZ is set to max(old MaxZ, tile MaxZ) for any tile that can pass if the Z is > the destination.
     o The tile MinZ and MaxZ are treated as the viewport min and viewport max if shader Z export is enabled.
     o The HiZ logic does the min and max operations above. The tile cache just writes the values handed to it.
  2. HiZ shrinks ZFar if the tile is fully covered
     o MaxZ is set to min(old MaxZ, tile MaxZ) for any tile that can pass if the Z is < the destination.
     o MinZ is set to max(old MinZ, tile MinZ) for any tile that can pass if the Z is > the destination.

Hi Stencil Test
[Block diagram; recoverable labels: Reg, Tile, Back/Front Face States, Backfacing, Stencil State, Compare State, Compare Mask, Expand Compare State, Compare 0, Compare 1, Compare Val/Func/Enable, Get Known, Expand Known, Known Mask[7:0], Known Value[7:0], Known Merge (OR), Value/Mask, Convert to <=, >=, Stencil Test Mask, Stencil Func, Compare Vs Ref?, Tile may be {< > =} Ref, Final Stencil State, SResults (May Pass, May Fail)]
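The Hi Z Test example and the two tile cache update rules for the "<" function can be sketched in a few lines of Python. This is only an illustrative model, not the RTL: the names ZRange, hiz_less_test and update_for_less are hypothetical, and it assumes Z writes are enabled and that Z values are plain unsigned integers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ZRange:
    """Inclusive [zmin, zmax] depth range (hypothetical helper type)."""
    zmin: int
    zmax: int


def hiz_less_test(src: ZRange, dst: ZRange) -> Tuple[bool, bool]:
    """Classify a tile under the '<' function; returns (may_pass, may_fail)."""
    # A sample may pass if the smallest source Z is below the largest dest Z.
    may_pass = src.zmin < dst.zmax
    # A sample may fail if the largest source Z reaches the smallest dest Z.
    may_fail = src.zmax >= dst.zmin
    return may_pass, may_fail


def update_for_less(old: ZRange, tile: ZRange, fully_covered: bool) -> ZRange:
    """Apply rules 1 and 2 for a tile that can pass with Z < destination.

    Assumes Z writes are enabled (an assumption of this sketch).
    """
    # Rule 1, grow ZNear: MinZ = min(old MinZ, tile MinZ).
    new_min = min(old.zmin, tile.zmin)
    # Rule 2, shrink ZFar, only when the tile is fully covered:
    # MaxZ = min(old MaxZ, tile MaxZ). Otherwise MaxZ is left alone.
    new_max = min(old.zmax, tile.zmax) if fully_covered else old.zmax
    return ZRange(new_min, new_max)


# The slide's example: zfunc '<', dst = [0x30, 0x3000], src = [0x20, 0x800].
dst = ZRange(0x30, 0x3000)
src = ZRange(0x20, 0x800)
may_pass, may_fail = hiz_less_test(src, dst)
print("MAYPASS" if may_pass else "", "MAYFAIL" if may_fail else "")
print(update_for_less(dst, src, fully_covered=True))
```

The shrink in rule 2 stays conservative even for a MAYFAIL tile: with "<" and full coverage, every sample ends up holding the smaller of its source and destination Z, so no sample can exceed min(old MaxZ, tile MaxZ).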

Hi Stencil Test
• Unlike Z, an operation can occur on the stencil value even when the test fails.
• An operation is specified for STENCIL FAIL, STENCIL PASS/ZPASS, and STENCIL PASS/ZFAIL.
• There are also FrontFace and BackFace tiles.
  00 - STENCIL_KEEP : New value = Old Value
  01 - STENCIL_ZERO : New value = 0
  02 - STENCIL_ONES : New value = 8'hff
  03 - STENCIL_REPLACE_TEST : New value = STENCIL_TEST_VAL
  04 - STENCIL_REPLACE_OP : New value = STENCIL_OP_VAL
  05 - STENCIL_ADD_CLAMP : New value = Old Value + STENCIL_OP_VAL (clamp)
  06 - STENCIL_SUB_CLAMP : New value = Old Value - STENCIL_OP_VAL (clamp)
  07 - STENCIL_INVERT : New value = ~Old Value
  08 - STENCIL_ADD_WRAP : New value = Old Value + STENCIL_OP_VAL (wrap)
  09 - STENCIL_SUB_WRAP : New value = Old Value - STENCIL_OP_VAL (wrap)
  10 - STENCIL_AND : New value = Old Value & STENCIL_OP_VAL
  11 - STENCIL_OR : New value = Old Value | STENCIL_OP_VAL
  12 - STENCIL_XOR : New value = Old Value ^ STENCIL_OP_VAL
  13 - STENCIL_NAND : New value = ~(Old Value & STENCIL_OP_VAL)
  14 - STENCIL_NOR : New value = ~(Old Value | STENCIL_OP_VAL)
  15 - STENCIL_XNOR : New value = ~(Old Value ^ STENCIL_OP_VAL)

Tile Kill
[Diagram labels: HZ_TK_HIZS INFO, TILE_KILL, DB_SC_TILE_MASK, DB_SC_TILE_TILE_RATE*, TK_DF_HIZS INFO, TK_DF_FAST_Z_OP, TK_DF_FAST_STENCIL_OP]
*Pixel Rate Tiles are what happens to Fast Tile Ops when color is on.
The DB still does its operations at tile rate instead of sample rate, but the DB tells the SC to run its detail walker at 1xAA (pixel rate), so the tile runs at 1xAA rates instead of the potentially slower AA sample rates.

Tile Kill
• One flop stage at the output, thus tk_db (output) <= hz_tk (input).
• Output from hi_z_s_checker includes HiZS Info: { UNMAPPED S, UNMAPPED Z, MAY S UPDATE, MAY DB FAIL, MAY S FAIL, MAY Z FAIL, MAY Z PASS }
• This can be used to accelerate QUAD operations:

Z Op Determination
• FAST_NO_OP: If the Z test is known (either !MayZPass or !MayZFail), and it will not write (either !MayZPass or Z writes disabled), then the Z Op is a NO-OP and can be run at tile rate since no work remains.
• FAST_SET: If the Z test is known to pass and will write to a fully covered tile (which also means Alpha2Mask, etc. is not on, as that case is preprocessed away by the HiZ checker), then this tile’s Z op is a FAST_SET and can run at tile rate.
• SLOW_OP: Otherwise, this tile either has Z tests being performed, or has Z writes being performed on a per-sample basis, which makes it a SLOW_OP.

Stencil Op Determination
• FAST_NO_OP: If the stencil test is known (either !MaySFail or only MaySFail) or has no effect, and it will not write stencil (!MayStencilUpdate), then the Stencil Op is a NO-OP and can be run at tile rate since no work remains.
• FAST_SET: If the stencil test is known or has no effect, and all possible test outcomes will perform the same stencil op, which happens to be REPLACE or ZERO, then this tile’s Stencil op is a FAST_SET and can run at tile rate.
• CONST_OP: If the stencil test is known or has no effect, and all possible test outcomes will perform the same stencil op, which is not REPLACE or ZERO, then this tile’s Stencil op is a CONST_OP and can possibly run at tile rate if the dest data is single stencil compressed. The Quad Op Pipe will determine this later and compensate accordingly.
• SLOW_OP: Otherwise, this tile either has stencil tests being performed, or has stencil writes being performed on a per-sample basis, which makes it a SLOW_OP.

Questions?
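
As a recap, the Z Op and Stencil Op determination rules from the Tile Kill slide can be written as a small decision function each. The sketch below is a behavioral model only, under stated assumptions: the function names, the boolean inputs (z_write_enable, fully_covered, test_known_or_no_effect) and the possible_ops set are hypothetical stand-ins for the HiZS Info bits and register state, and it assumes that both STENCIL_REPLACE_TEST and STENCIL_REPLACE_OP count as "REPLACE".

```python
from enum import Enum
from typing import FrozenSet


class TileOp(Enum):
    FAST_NO_OP = "FAST_NO_OP"
    FAST_SET = "FAST_SET"
    CONST_OP = "CONST_OP"  # stencil only
    SLOW_OP = "SLOW_OP"


def z_op(may_z_pass: bool, may_z_fail: bool,
         z_write_enable: bool, fully_covered: bool) -> TileOp:
    """Z Op determination as described on the slide (hypothetical signature)."""
    # The Z test is known when one of the two outcomes is ruled out.
    known = (not may_z_pass) or (not may_z_fail)
    # Nothing is written if no sample can pass or Z writes are disabled.
    will_not_write = (not may_z_pass) or (not z_write_enable)
    if known and will_not_write:
        return TileOp.FAST_NO_OP
    # Known to pass, and writing to a fully covered tile.
    if (not may_z_fail) and z_write_enable and fully_covered:
        return TileOp.FAST_SET
    return TileOp.SLOW_OP


# Assumption of this sketch: both REPLACE variants qualify as "REPLACE".
FAST_SET_OPS = frozenset(
    {"STENCIL_REPLACE_TEST", "STENCIL_REPLACE_OP", "STENCIL_ZERO"})


def stencil_op(test_known_or_no_effect: bool, may_s_update: bool,
               possible_ops: FrozenSet[str]) -> TileOp:
    """Stencil Op determination.

    possible_ops is the set of stencil ops that the still-possible test
    outcomes would perform (computed by the caller in this sketch).
    """
    if test_known_or_no_effect and not may_s_update:
        return TileOp.FAST_NO_OP
    if test_known_or_no_effect and len(possible_ops) == 1:
        (op,) = possible_ops
        return TileOp.FAST_SET if op in FAST_SET_OPS else TileOp.CONST_OP
    return TileOp.SLOW_OP


# Known-pass Z with writes on a fully covered tile.
print(z_op(may_z_pass=True, may_z_fail=False,
           z_write_enable=True, fully_covered=True).value)
# Known stencil test where every outcome inverts the stencil value.
print(stencil_op(True, True, frozenset({"STENCIL_INVERT"})).value)
```

Whether a CONST_OP actually runs at tile rate is not decided here; per the slide, that depends on the destination data being single stencil compressed and is resolved later by the Quad Op Pipe.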