Class Transformer

java.lang.Object
smile.deep.layer.LayerBlock
smile.llm.llama.Transformer
All Implemented Interfaces:
Function<Tensor,Tensor>, Layer

public class Transformer extends LayerBlock
The Transformer model. It consists of token embeddings, stacked Transformer blocks, and the final output layer. This model can be used for various natural language processing tasks, such as language modeling or text generation.
  • Constructor Details

    • Transformer

      public Transformer(ModelArgs args, Device device)
      Creates a Transformer model from the given configuration on the given compute device.
      Parameters:
      args - the model configuration parameters.
      device - the compute device.
  • Method Details

    • forward

      public Tensor forward(Tensor tokens, int startPos)
      Forward pass through the model.
      Parameters:
      tokens - the input token indices.
      startPos - the starting position for attention caching, i.e. the sequence position of the first token in tokens.
      Returns:
      the output tensor.
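      The role of startPos in incremental decoding can be illustrated with a minimal, self-contained sketch. It does not call the SMILE API: CachedModel, ToyModel, and generate are hypothetical stand-ins, and the toy "prediction" rule is invented for the example. The sketch assumes startPos is the absolute sequence position of the first token passed in, as in the Llama reference implementation, so the caller passes 0 for the prompt and the count of already-processed tokens on each later step.

```java
import java.util.ArrayList;
import java.util.List;

final class StartPosSketch {
    /** Hypothetical stand-in for a model exposing forward(tokens, startPos). */
    interface CachedModel {
        /** Consumes tokens at positions startPos, startPos + 1, ... and returns the next token id. */
        int forward(int[] tokens, int startPos);
    }

    /** Toy model: caches every token it sees and "predicts" (sum of cached tokens) mod vocabSize. */
    static final class ToyModel implements CachedModel {
        private final int[] cache;
        private final int vocabSize;
        int tokensProcessed = 0; // counts how many tokens were actually fed through forward

        ToyModel(int maxSeqLen, int vocabSize) {
            this.cache = new int[maxSeqLen];
            this.vocabSize = vocabSize;
        }

        @Override
        public int forward(int[] tokens, int startPos) {
            // Store the new tokens at their absolute positions; earlier entries are reused, not recomputed.
            System.arraycopy(tokens, 0, cache, startPos, tokens.length);
            tokensProcessed += tokens.length;
            int end = startPos + tokens.length;
            int sum = 0;
            for (int i = 0; i < end; i++) {
                sum += cache[i];
            }
            return sum % vocabSize;
        }
    }

    /** Generates up to maxNewTokens tokens, feeding only the newest token after the prompt. */
    static List<Integer> generate(CachedModel model, int[] prompt, int maxNewTokens) {
        List<Integer> out = new ArrayList<>();
        if (maxNewTokens <= 0) {
            return out;
        }
        // Prefill: the whole prompt starts at position 0.
        int next = model.forward(prompt, 0);
        out.add(next);
        int pos = prompt.length;
        for (int i = 1; i < maxNewTokens; i++) {
            // Decode step: one new token, placed right after everything already cached.
            next = model.forward(new int[] {next}, pos);
            out.add(next);
            pos++;
        }
        return out;
    }

    public static void main(String[] args) {
        ToyModel model = new ToyModel(16, 10);
        List<Integer> generated = generate(model, new int[] {1, 2, 3}, 4);
        System.out.println(generated);             // [6, 2, 4, 8]
        System.out.println(model.tokensProcessed); // 6 (3 prompt tokens + 3 single-token steps)
    }
}
```

      With a real model the same calling pattern would presumably apply, with tokens as a Tensor of token indices and sampling applied to the returned tensor; those details depend on the library and are not shown here.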
    • forward

      public Tensor forward(Tensor tokens)
      Description copied from interface: Layer
      Forward propagation (or forward pass) through the layer.
      Parameters:
      tokens - the input tensor.
      Returns:
      the output tensor.