Package smile.llm.llama
Class Transformer
java.lang.Object
smile.deep.layer.LayerBlock
smile.llm.llama.Transformer
The Transformer model. It consists of token embeddings, stacked
Transformer blocks, and the final output layer. This model can
be used for various natural language processing tasks, such as
language modeling or text generation.
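
As a rough illustration of the layout described above (token embeddings, then a stack of blocks applied in order, then an output layer), the following toy sketch may help. It does not call the SMILE API. Every name, size, and weight in it is hypothetical, and the toy blocks are simple per-token functions rather than real attention blocks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Toy stand-in for the three stages; NOT the smile.llm.llama implementation. */
final class ToyTransformerPipeline {
    // Stage 1: embedding table, one hidden vector per vocabulary entry (hypothetical values).
    static final double[][] EMBEDDING = {
        {0.0, 1.0}, {1.0, 0.0}, {1.0, 1.0}, {0.5, 0.5}
    };

    // Stage 2: stacked blocks, applied in order. Each toy block adds a residual-style update.
    static final List<UnaryOperator<double[]>> BLOCKS = List.of(
        h -> new double[] {h[0] + 0.5 * h[1], h[1]},
        h -> new double[] {h[0], h[1] + 0.5 * h[0]}
    );

    // Stage 3: output weights, one row per vocabulary entry (hypothetical values).
    static final double[][] OUTPUT = {
        {1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}, {1.0, -1.0}
    };

    /** Maps one token index to one score per vocabulary entry. */
    static double[] forward(int token) {
        double[] h = EMBEDDING[token].clone();          // embedding lookup
        for (UnaryOperator<double[]> block : BLOCKS) {  // stacked blocks
            h = block.apply(h);
        }
        double[] logits = new double[OUTPUT.length];    // output layer
        for (int v = 0; v < OUTPUT.length; v++) {
            for (int d = 0; d < h.length; d++) {
                logits[v] += OUTPUT[v][d] * h[d];
            }
        }
        return logits;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(forward(2)));
    }
}
```

The point of the sketch is only the order of the stages. In the real model the blocks also mix information across token positions.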
Constructor Details
Transformer
Constructor.
Parameters:
args - the model configuration parameters.
device - the compute device.
Method Details
forward
Forward pass through the model.
Parameters:
tokens - the input token indices.
startPos - the starting position for attention caching.
Returns:
the output tensor.
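
The sketch below shows one way the startPos argument might advance during text generation. It assumes the common key/value-cache decoding pattern: the whole prompt is fed once at position 0, then each newly generated token is fed alone at the next cache position. The class, record, and method names are hypothetical, and no SMILE API is called. It only computes the (startPos, number of tokens) pair for each forward call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that plans forward calls; not part of smile.llm.llama. */
final class StartPosSchedule {
    /** One planned call: the cache offset and how many tokens are fed. */
    record Call(int startPos, int numTokens) {}

    /**
     * Plans the calls needed to generate newTokens tokens after a prompt
     * of promptLen tokens, assuming the cache keeps every position
     * that has already been processed.
     */
    static List<Call> plan(int promptLen, int newTokens) {
        List<Call> calls = new ArrayList<>();
        if (promptLen <= 0 || newTokens <= 0) {
            return calls; // nothing to do
        }
        // Prefill: the whole prompt at position 0. Its output yields the first new token.
        calls.add(new Call(0, promptLen));
        int cached = promptLen;
        // Each later step feeds only the most recently generated token,
        // starting at the first position not yet in the cache.
        for (int i = 1; i < newTokens; i++) {
            calls.add(new Call(cached, 1));
            cached += 1;
        }
        return calls;
    }

    public static void main(String[] args) {
        for (Call c : plan(5, 3)) {
            System.out.println("startPos=" + c.startPos() + ", tokens=" + c.numTokens());
        }
    }
}
```

Under this assumption, startPos always equals the number of positions already cached, so a call never recomputes attention keys and values for earlier tokens.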
forward
Description copied from interface: Layer
Forward propagation (or forward pass) through the layer.
Parameters:
tokens - the input tensor.
Returns:
the output tensor.