Class Attention
java.lang.Object
smile.llm.llama.Attention
Multi-head attention. It caches key and value tensors, applies rotary embeddings, and performs linear transformations.
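To illustrate the rotary embeddings mentioned above, here is a minimal conceptual sketch using plain arrays rather than the Smile tensor API; the class name, method name, and the base of 10000 are illustrative assumptions, not part of this class.

```java
// Conceptual sketch of rotary position embedding (RoPE); not the Smile API.
public class RotaryExample {
    /** Rotates consecutive pairs (x[2i], x[2i+1]) by position-dependent angles. */
    static double[] applyRotary(double[] x, int position, double base) {
        int dim = x.length;
        double[] out = new double[dim];
        for (int i = 0; i < dim; i += 2) {
            // Frequency decreases with the pair index, as in RoPE.
            double theta = position * Math.pow(base, -(double) i / dim);
            double cos = Math.cos(theta), sin = Math.sin(theta);
            out[i]     = x[i] * cos - x[i + 1] * sin;
            out[i + 1] = x[i] * sin + x[i + 1] * cos;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 0.0, 1.0, 0.0};
        // Position 0 is the identity rotation; other positions rotate each
        // pair but preserve its norm, encoding position in the phase.
        double[] r0 = applyRotary(q, 0, 10000.0);
        double[] r5 = applyRotary(q, 5, 10000.0);
        System.out.println(java.util.Arrays.toString(r0));
        System.out.println(Math.hypot(r5[0], r5[1]));
    }
}
```

Because each pair is only rotated, dot products between query and key pairs at the same relative offset are position-invariant, which is why RoPE composes well with the attention caching described here.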
Constructor Details

Attention

Constructor.
Parameters:
    args - the model configuration parameters.
Method Details

forward

Forward pass through the attention module.
Parameters:
    x - the input tensor.
    startPos - the starting position for attention caching.
    cis - the precomputed frequency tensor.
    mask - the attention mask tensor.
Returns:
    the output tensor.
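The caching behavior behind startPos can be sketched as single-head scaled dot-product attention over an incrementally grown key/value cache. This uses plain arrays instead of the Smile tensor API, and the class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of attention with a KV cache; not the Smile API.
public class AttentionSketch {
    final List<double[]> keyCache = new ArrayList<>();
    final List<double[]> valueCache = new ArrayList<>();

    /** Appends (k, v) to the cache, then attends from q over all cached entries. */
    double[] forward(double[] q, double[] k, double[] v) {
        keyCache.add(k);
        valueCache.add(v);
        int n = keyCache.size();
        double scale = 1.0 / Math.sqrt(q.length);
        double[] scores = new double[n];
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n; i++) {
            double dot = 0.0;
            double[] ki = keyCache.get(i);
            for (int j = 0; j < q.length; j++) dot += q[j] * ki[j];
            scores[i] = dot * scale;
            max = Math.max(max, scores[i]);
        }
        // Numerically stabilized softmax over the scores.
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            scores[i] = Math.exp(scores[i] - max);
            sum += scores[i];
        }
        // Output is the softmax-weighted sum of cached values.
        double[] out = new double[v.length];
        for (int i = 0; i < n; i++) {
            double w = scores[i] / sum;
            double[] vi = valueCache.get(i);
            for (int j = 0; j < v.length; j++) out[j] += w * vi[j];
        }
        return out;
    }

    public static void main(String[] args) {
        AttentionSketch attn = new AttentionSketch();
        // With one cached entry the softmax weight is 1, so the output equals v.
        double[] out = attn.forward(new double[]{1, 0}, new double[]{1, 0}, new double[]{3, 4});
        System.out.println(out[0] + ", " + out[1]);
    }
}
```

Because keys and values are cached, each decoding step only computes attention for the newest token against all previous positions, rather than recomputing the full sequence.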