Class Llama

java.lang.Object
smile.llm.llama.Llama

public class Llama extends Object
LLaMA model specification.
  • Constructor Details

    • Llama

      public Llama(String name, Transformer model, Tokenizer tokenizer)
      Constructor.
      Parameters:
      name - the model name.
      model - the transformer model.
      tokenizer - the tokenizer.
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • family

      public String family()
      Returns the model family name.
      Returns:
      the model family name.
    • name

      public String name()
      Returns the model instance name.
      Returns:
      the model instance name.
    • build

      public static Llama build(String checkpointDir, String tokenizerPath, int maxBatchSize, int maxSeqLen) throws IOException
      Builds a Llama instance by initializing and loading a model checkpoint.
      Parameters:
      checkpointDir - the directory path of the checkpoint files.
      tokenizerPath - the path of the tokenizer model file.
      maxBatchSize - the maximum batch size for inference.
      maxSeqLen - the maximum sequence length for input text.
      Returns:
      an instance of Llama model.
      Throws:
      IOException - if an I/O error occurs when loading the model checkpoint.
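
      A minimal usage sketch. The checkpoint directory and tokenizer paths below are placeholders for your local files, not paths shipped with the library:

          import java.io.IOException;
          import smile.llm.llama.Llama;

          public class LlamaBuildExample {
              public static void main(String[] args) throws IOException {
                  // Placeholder paths; point these at a real checkpoint
                  // directory and tokenizer model file.
                  Llama llama = Llama.build(
                          "models/Llama3-8B",       // checkpointDir (hypothetical)
                          "models/tokenizer.model", // tokenizerPath (hypothetical)
                          4,                        // maxBatchSize
                          2048);                    // maxSeqLen
                  System.out.println(llama);        // toString() describes the model
              }
          }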
    • build

      public static Llama build(String checkpointDir, String tokenizerPath, int maxBatchSize, int maxSeqLen, Integer deviceId) throws IOException
      Builds a Llama instance by initializing and loading a model checkpoint.
      Parameters:
      checkpointDir - the directory path of the checkpoint files.
      tokenizerPath - the path of the tokenizer model file.
      maxBatchSize - the maximum batch size for inference.
      maxSeqLen - the maximum sequence length for input text.
      deviceId - the optional CUDA device ID.
      Returns:
      an instance of Llama model.
      Throws:
      IOException - if an I/O error occurs when loading the model checkpoint.
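
      To pin the model to a specific GPU, pass the CUDA device ID. A short sketch, using the same placeholder paths as above and assuming device 0 exists:

          // Pin the model to CUDA device 0.
          Llama llama = Llama.build("models/Llama3-8B", "models/tokenizer.model",
                  4, 2048, 0);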
    • generate

      public CompletionPrediction[] generate(int[][] prompts, int maxGenLen, double temperature, double topp, boolean logprobs, Long seed, SubmissionPublisher<String> publisher)
      Generates text sequences based on the provided prompts, employing nucleus (top-p) sampling to produce text with controlled randomness.
      Parameters:
      prompts - the tokenized prompts, where each prompt is an array of token IDs.
      maxGenLen - the maximum length of the generated text sequence.
      temperature - the temperature value for controlling randomness in sampling.
      topp - the top-p probability threshold for nucleus sampling.
      logprobs - the flag indicating whether to compute token log probabilities.
      seed - the optional random number generation seed for deterministic sampling.
      publisher - an optional flow publisher that asynchronously emits generated text chunks. When a publisher is supplied, the batch size must be 1.
      Returns:
      the generated text completions.
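
      A sketch of calling generate with pre-tokenized input. The token IDs below are made up for illustration; real IDs come from the model's tokenizer:

          // Each row is one tokenized prompt; these IDs are placeholders.
          int[][] prompts = { {1, 15043, 3186} };
          CompletionPrediction[] out = llama.generate(
                  prompts,
                  128,   // maxGenLen
                  0.6,   // temperature: lower is more deterministic
                  0.9,   // topp: nucleus sampling threshold
                  false, // logprobs: skip per-token log probabilities
                  42L,   // seed for reproducible sampling (null for random)
                  null); // no streaming publisher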
    • complete

      public CompletionPrediction[] complete(String[] prompts, int maxGenLen, double temperature, double topp, boolean logprobs, Long seed, SubmissionPublisher<String> publisher)
      Performs text completion for a list of prompts.
      Parameters:
      prompts - the text prompts.
      maxGenLen - the maximum length of the generated text sequence.
      temperature - the temperature value for controlling randomness in sampling.
      topp - the top-p probability threshold for nucleus sampling.
      logprobs - the flag indicating whether to compute token log probabilities.
      seed - the optional random number generation seed for deterministic sampling.
      publisher - an optional flow publisher that asynchronously emits generated text chunks. When a publisher is supplied, the batch size must be 1.
      Returns:
      the generated text completions.
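
      A sketch of streaming completion through the JDK's java.util.concurrent.SubmissionPublisher; note the single prompt, since the batch size must be 1 when a publisher is supplied:

          SubmissionPublisher<String> publisher = new SubmissionPublisher<>();
          var done = publisher.consume(System.out::print); // print each chunk as it arrives
          llama.complete(new String[] {"Once upon a time"},
                  256,   // maxGenLen
                  0.7,   // temperature
                  0.95,  // topp
                  false, // logprobs
                  null,  // seed: random sampling
                  publisher);
          publisher.close(); // signals onComplete to the consumer
          done.join();       // wait until every chunk has been printed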
    • chat

      public CompletionPrediction[] chat(Message[][] dialogs, int maxGenLen, double temperature, double topp, boolean logprobs, Long seed, SubmissionPublisher<String> publisher)
      Generates assistant responses for a list of conversational dialogs.
      Parameters:
      dialogs - the conversational dialogs, where each dialog is an array of messages.
      maxGenLen - the maximum length of the generated text sequence.
      temperature - the temperature value for controlling randomness in sampling.
      topp - the top-p probability threshold for nucleus sampling.
      logprobs - the flag indicating whether to compute token log probabilities.
      seed - the optional random number generation seed for deterministic sampling.
      publisher - an optional flow publisher that asynchronously emits generated text chunks. When a publisher is supplied, the batch size must be 1.
      Returns:
      the generated chat responses.
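
      A sketch of a single-turn dialog. The Message constructor and the Role values shown here are assumptions about the smile.llm API, not confirmed by this page; check the actual Message documentation:

          // Assumed shape: a Message pairs a role with text content.
          Message[][] dialogs = {
              {
                  new Message(Role.system, "You are a helpful assistant."),
                  new Message(Role.user, "Explain nucleus sampling in one sentence.")
              }
          };
          CompletionPrediction[] replies = llama.chat(
                  dialogs, 256, 0.7, 0.95, false, null, null);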