Masking utilities
Boolean masks broadcastable to (batch, num_heads, seq_q, seq_kv).
True means attend; False means the position is masked out.
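| attnax.make_padding_mask | Returns a key-padding mask. |
| attnax.make_causal_mask | Returns a lower-triangular causal mask. |
| attnax.make_sliding_window_mask | Returns a sliding-window attention mask. |
| attnax.make_document_mask | Returns a document-boundary mask for sequence packing. |
| | Element-wise logical AND of boolean masks. |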
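Because every mask is broadcastable to (batch, num_heads, seq_q, seq_kv), masks of different shapes can be combined with an element-wise AND and then applied to attention logits. The following is a minimal sketch of that idea, not the library's implementation: `combine_masks_sketch` is a hypothetical helper name, and NumPy stands in for JAX so the example runs without an accelerator stack (the same expressions should carry over to `jax.numpy`).

```python
import numpy as np

def combine_masks_sketch(*masks):
    """Element-wise AND of boolean masks, relying on broadcasting.

    Hypothetical helper for illustration; not part of attnax.
    """
    out = masks[0]
    for m in masks[1:]:
        out = np.logical_and(out, m)
    return out

# Key-padding mask, shape (batch, 1, 1, seq_kv).
# The last key of the first batch row is padding.
padding = np.array([[True, True, False],
                    [True, True, True]])[:, None, None, :]

# Causal mask, shape (1, 1, seq_q, seq_kv).
causal = np.tril(np.ones((3, 3), dtype=bool))[None, None, :, :]

# Broadcasts to (batch, 1, seq_q, seq_kv) = (2, 1, 3, 3).
combined = combine_masks_sketch(padding, causal)

# Apply to logits of shape (batch, num_heads, seq_q, seq_kv):
# positions where the mask is False are set to -inf before the softmax.
logits = np.zeros((2, 4, 3, 3))
masked_logits = np.where(combined, logits, -np.inf)
```

The size-1 head axis means a single mask is shared across all heads without being materialized per head.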
- attnax.make_padding_mask(input_ids, pad_token_id=0)
Returns a key-padding mask.
- Parameters:
input_ids (Array) – Integer ids of shape (batch, seq_len).
pad_token_id (int) – Token id treated as padding.
- Returns:
Boolean array of shape (batch, 1, 1, seq_len); True for non-padding positions.
- Return type:
Array
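The documented behavior can be reproduced in a few lines. This is a reference sketch under the shapes stated above, not the attnax source; `padding_mask_sketch` is a hypothetical name, and NumPy is used in place of JAX for portability.

```python
import numpy as np

def padding_mask_sketch(input_ids, pad_token_id=0):
    """Return a (batch, 1, 1, seq_len) mask that is True at non-padding tokens."""
    input_ids = np.asarray(input_ids)
    # Compare against the pad id, then add singleton head and query axes.
    return (input_ids != pad_token_id)[:, None, None, :]

ids = np.array([[5, 7, 0, 0],
                [3, 0, 0, 0]])
mask = padding_mask_sketch(ids)
```

The two singleton axes let the mask broadcast over heads and query positions, so each query sees the same set of valid keys.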
- attnax.make_causal_mask(seq_len)
Returns a lower-triangular causal mask.
- Parameters:
seq_len (int) – Sequence length.
- Returns:
Boolean array of shape (1, 1, seq_len, seq_len); True at positions (i, j) with j <= i.
- Return type:
Array
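The condition j <= i translates directly into a comparison of two index grids. The sketch below illustrates the documented semantics only; `causal_mask_sketch` is a hypothetical name, and NumPy stands in for JAX.

```python
import numpy as np

def causal_mask_sketch(seq_len):
    """Return a (1, 1, seq_len, seq_len) mask, True where key j <= query i."""
    i = np.arange(seq_len)[:, None]  # query positions as a column
    j = np.arange(seq_len)[None, :]  # key positions as a row
    return (j <= i)[None, None, :, :]

mask = causal_mask_sketch(3)
```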
- attnax.make_sliding_window_mask(seq_q, seq_kv=None, *, window_size, causal=True)
Returns a sliding-window attention mask.
- Parameters:
- Returns:
Boolean array of shape (1, 1, seq_q, seq_kv).
- Raises:
ValueError – If window_size <= 0.
- Return type:
Array
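The exact window convention is not spelled out in this entry, so the sketch below rests on labeled assumptions: `seq_kv` defaults to `seq_q`; query and key positions are aligned at index 0; with `causal=True` a query at position i attends to keys j with 0 <= i - j < window_size; with `causal=False` it attends to keys with |i - j| < window_size. Only the output shape and the `ValueError` come from the documentation. `sliding_window_mask_sketch` is a hypothetical name, and NumPy stands in for JAX.

```python
import numpy as np

def sliding_window_mask_sketch(seq_q, seq_kv=None, *, window_size, causal=True):
    """Return a (1, 1, seq_q, seq_kv) sliding-window mask (assumed convention)."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if seq_kv is None:
        seq_kv = seq_q  # assumption: self-attention when seq_kv is omitted
    i = np.arange(seq_q)[:, None]
    j = np.arange(seq_kv)[None, :]
    dist = i - j  # how far the key lies behind the query
    if causal:
        # Assumption: the window covers the current token and the
        # window_size - 1 tokens before it.
        mask = (dist >= 0) & (dist < window_size)
    else:
        # Assumption: symmetric window around the query position.
        mask = np.abs(dist) < window_size
    return mask[None, None, :, :]

causal_window = sliding_window_mask_sketch(4, window_size=2)
symmetric_window = sliding_window_mask_sketch(4, window_size=2, causal=False)
```

Under these assumptions, a causal window of size 2 lets each query see itself and the immediately preceding key.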
- attnax.make_document_mask(document_ids)
Returns a document-boundary mask for sequence packing.
mask[b, 0, i, j] is True iff document_ids[b, i] == document_ids[b, j].
- Parameters:
document_ids (Array) – Integer array of shape (batch, seq_len).
- Returns:
Boolean array of shape (batch, 1, seq_len, seq_len).
- Return type:
Array
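The equality rule above amounts to comparing each row of `document_ids` against itself along two different axes. A minimal sketch of that rule follows; `document_mask_sketch` is a hypothetical name rather than the library function, and NumPy stands in for JAX.

```python
import numpy as np

def document_mask_sketch(document_ids):
    """Return a (batch, 1, seq_len, seq_len) mask, True within the same document."""
    document_ids = np.asarray(document_ids)
    # (batch, seq_len, 1) == (batch, 1, seq_len) -> (batch, seq_len, seq_len)
    same_doc = document_ids[:, :, None] == document_ids[:, None, :]
    return same_doc[:, None, :, :]  # add the singleton head axis

# One packed row holding two documents: tokens 0-1 and tokens 2-4.
doc_ids = np.array([[0, 0, 1, 1, 1]])
mask = document_mask_sketch(doc_ids)
```

The result is block-diagonal per batch row, so tokens never attend across a document boundary. For packed causal language modeling, this mask would typically be combined with a causal mask through an element-wise AND.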