Package smile.llm.llama
Record Class ModelArgs
java.lang.Object
java.lang.Record
smile.llm.llama.ModelArgs
- Record Components:
dim - the dimension of the token embeddings.
numLayers - the number of transformer blocks.
numHeads - the number of attention heads.
numKvHeads - the number of key and value heads.
vocabSize - the size of the vocabulary.
multipleOf - makes the SwiGLU hidden layer size a multiple of a large power of 2.
ffnDimMultiplier - the multiplier for the hidden dimension of the feedforward layers.
normEps - the epsilon value used for numerical stability in normalization layers.
ropeTheta - the theta parameter in rotary positional encoding.
maxBatchSize - the maximum batch size.
maxSeqLen - the maximum sequence length for input data.
public record ModelArgs(int dim, int numLayers, int numHeads, Integer numKvHeads, int vocabSize, int multipleOf, Double ffnDimMultiplier, double normEps, double ropeTheta, boolean scaledRope, int maxBatchSize, int maxSeqLen)
extends Record
LLaMA model hyperparameters.
-
Constructor Summary
Constructor - Description
ModelArgs() - Constructor with default parameter values.
ModelArgs(int dim, int numLayers, int numHeads, Integer numKvHeads, int vocabSize, int multipleOf, Double ffnDimMultiplier, double normEps, double ropeTheta, boolean scaledRope, int maxBatchSize, int maxSeqLen) - Creates an instance of a ModelArgs record class.
-
Method Summary
Modifier and Type Method - Description
int dim() - Returns the value of the dim record component.
final boolean equals(Object o) - Indicates whether some other object is "equal to" this one.
Double ffnDimMultiplier() - Returns the value of the ffnDimMultiplier record component.
static ModelArgs from(path, maxBatchSize, maxSeqLen) - Loads the model hyperparameters from a JSON file.
final int hashCode() - Returns a hash code value for this object.
int maxBatchSize() - Returns the value of the maxBatchSize record component.
int maxSeqLen() - Returns the value of the maxSeqLen record component.
int multipleOf() - Returns the value of the multipleOf record component.
double normEps() - Returns the value of the normEps record component.
int numHeads() - Returns the value of the numHeads record component.
Integer numKvHeads() - Returns the value of the numKvHeads record component.
int numLayers() - Returns the value of the numLayers record component.
double ropeTheta() - Returns the value of the ropeTheta record component.
boolean scaledRope() - Returns the value of the scaledRope record component.
final String toString() - Returns a string representation of this record class.
int vocabSize() - Returns the value of the vocabSize record component.
-
Constructor Details
-
ModelArgs
public ModelArgs()
Constructor with default parameter values.
-
ModelArgs
public ModelArgs(int dim, int numLayers, int numHeads, Integer numKvHeads, int vocabSize, int multipleOf, Double ffnDimMultiplier, double normEps, double ropeTheta, boolean scaledRope, int maxBatchSize, int maxSeqLen)
Creates an instance of a ModelArgs record class.
- Parameters:
dim - the value for the dim record component
numLayers - the value for the numLayers record component
numHeads - the value for the numHeads record component
numKvHeads - the value for the numKvHeads record component
vocabSize - the value for the vocabSize record component
multipleOf - the value for the multipleOf record component
ffnDimMultiplier - the value for the ffnDimMultiplier record component
normEps - the value for the normEps record component
ropeTheta - the value for the ropeTheta record component
scaledRope - the value for the scaledRope record component
maxBatchSize - the value for the maxBatchSize record component
maxSeqLen - the value for the maxSeqLen record component
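The canonical constructor takes twelve positional arguments, so argument order is easy to get wrong. The sketch below shows one way to call it with each argument annotated. ModelArgsSketch is a hypothetical local stand-in that mirrors the documented component list so the snippet compiles without the smile dependency; the values are illustrative and are not the library's defaults.

```java
// Hypothetical stand-in mirroring the documented component list of
// smile.llm.llama.ModelArgs. It is NOT the library class.
record ModelArgsSketch(int dim, int numLayers, int numHeads, Integer numKvHeads,
                       int vocabSize, int multipleOf, Double ffnDimMultiplier,
                       double normEps, double ropeTheta, boolean scaledRope,
                       int maxBatchSize, int maxSeqLen) {

    /** Builds an instance from illustrative values (not the library defaults). */
    static ModelArgsSketch example() {
        return new ModelArgsSketch(
            4096,      // dim
            32,        // numLayers
            32,        // numHeads
            8,         // numKvHeads (boxed Integer in the documented signature)
            128256,    // vocabSize
            1024,      // multipleOf
            1.3,       // ffnDimMultiplier (boxed Double in the documented signature)
            1e-5,      // normEps
            500000.0,  // ropeTheta
            false,     // scaledRope
            1,         // maxBatchSize
            2048);     // maxSeqLen
    }

    public static void main(String[] args) {
        // Records generate a toString that lists every component by name.
        System.out.println(example());
    }
}
```

With the real class on the classpath, the same argument list should apply to new ModelArgs(...), given the signature documented above.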
-
-
Method Details
-
from
Loads the model hyperparameters from a JSON file.
- Parameters:
path - the file path.
maxBatchSize - the maximum batch size.
maxSeqLen - the maximum sequence length for input data.
- Returns:
the model hyperparameters.
- Throws:
IOException
-
toString
public final String toString()
Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
-
hashCode
public final int hashCode()
Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
-
equals
public final boolean equals(Object o)
Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. Reference components are compared with Objects::equals(Object,Object); primitive components are compared with '=='.
-
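The value-based semantics described for equals and hashCode can be checked with a small record. The sketch below uses a hypothetical two-component record, Args, with one primitive and one reference component, rather than the full ModelArgs. It shows that two separately constructed instances with equal components are equal, and that a null reference component is handled by Objects::equals.

```java
final class RecordEqualityDemo {
    // Hypothetical two-component record: one primitive, one reference component.
    record Args(int dim, Integer numKvHeads) {}

    /** Two distinct instances with equal components are equal and share a hash code. */
    static boolean equalComponentsAreEqual() {
        Args a = new Args(4096, 1000);
        Args b = new Args(4096, 1000);
        return a != b && a.equals(b) && a.hashCode() == b.hashCode();
    }

    /** Null reference components compare equal to each other via Objects.equals. */
    static boolean nullComponentsAreEqual() {
        return new Args(4096, null).equals(new Args(4096, null));
    }

    /** A null component is not equal to a non-null one. */
    static boolean nullDiffersFromValue() {
        return !new Args(4096, 8).equals(new Args(4096, null));
    }

    public static void main(String[] args) {
        System.out.println(equalComponentsAreEqual());
        System.out.println(nullComponentsAreEqual());
        System.out.println(nullDiffersFromValue());
    }
}
```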
dim
public int dim()
Returns the value of the dim record component.
- Returns:
the value of the dim record component
-
numLayers
public int numLayers()
Returns the value of the numLayers record component.
- Returns:
the value of the numLayers record component
-
numHeads
public int numHeads()
Returns the value of the numHeads record component.
- Returns:
the value of the numHeads record component
-
numKvHeads
public Integer numKvHeads()
Returns the value of the numKvHeads record component.
- Returns:
the value of the numKvHeads record component
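Because numKvHeads is a boxed Integer, callers may need to handle a missing value. The sketch below assumes the convention of Meta's reference LLaMA code, where an absent key/value head count falls back to numHeads and each key/value head is shared by numHeads / numKvHeads query heads. Whether this library follows the same convention is an assumption, and the class and method names are hypothetical.

```java
final class KvHeadsSketch {
    /** Assumption: a null numKvHeads means "same as numHeads" (no grouping). */
    static int effectiveKvHeads(int numHeads, Integer numKvHeads) {
        return numKvHeads == null ? numHeads : numKvHeads;
    }

    /** Number of query heads that share one key/value head. */
    static int queryHeadsPerKvHead(int numHeads, Integer numKvHeads) {
        int kvHeads = effectiveKvHeads(numHeads, numKvHeads);
        if (kvHeads <= 0 || numHeads % kvHeads != 0) {
            throw new IllegalArgumentException(
                "numHeads must be a positive multiple of numKvHeads");
        }
        return numHeads / kvHeads;
    }

    public static void main(String[] args) {
        System.out.println(queryHeadsPerKvHead(32, null)); // no grouping
        System.out.println(queryHeadsPerKvHead(32, 8));    // grouped-query attention
    }
}
```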
-
vocabSize
public int vocabSize()
Returns the value of the vocabSize record component.
- Returns:
the value of the vocabSize record component
-
multipleOf
public int multipleOf()
Returns the value of the multipleOf record component.
- Returns:
the value of the multipleOf record component
-
ffnDimMultiplier
public Double ffnDimMultiplier()
Returns the value of the ffnDimMultiplier record component.
- Returns:
the value of the ffnDimMultiplier record component
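The multipleOf and ffnDimMultiplier components together determine the SwiGLU hidden layer size. The sketch below follows the arithmetic of Meta's reference LLaMA code: start from 4 * dim, take two thirds, apply the optional multiplier, then round up to a multiple of multipleOf. That this library computes the size the same way is an assumption, and the class and method names are hypothetical.

```java
final class FfnHiddenDimSketch {
    /**
     * Feedforward hidden size, following the reference LLaMA arithmetic
     * (an assumption about this library, not a documented contract).
     */
    static int hiddenDim(int dim, int multipleOf, Double ffnDimMultiplier) {
        int hidden = 4 * dim;
        hidden = 2 * hidden / 3;                       // integer division truncates
        if (ffnDimMultiplier != null) {
            hidden = (int) (ffnDimMultiplier * hidden); // cast truncates toward zero
        }
        // Round up to the next multiple of multipleOf.
        return multipleOf * ((hidden + multipleOf - 1) / multipleOf);
    }

    public static void main(String[] args) {
        System.out.println(hiddenDim(4096, 256, null));  // 11008
        System.out.println(hiddenDim(4096, 1024, 1.3));  // 14336
    }
}
```

For dim = 4096 and multipleOf = 256 with no multiplier, the intermediate value 10922 rounds up to 11008; with multipleOf = 1024 and a multiplier of 1.3, the intermediate value 14198 rounds up to 14336.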
-
normEps
public double normEps()
Returns the value of the normEps record component.
- Returns:
the value of the normEps record component
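To illustrate where normEps enters the computation, the sketch below implements RMS normalization on a plain double array, the normalization used in the LLaMA architecture. It is a standalone illustration with hypothetical names, not the library's tensor-based implementation. The epsilon is added under the square root so that an all-zero input does not divide by zero.

```java
final class RmsNormSketch {
    /** Returns weight[i] * x[i] / sqrt(mean(x^2) + eps) for each element. */
    static double[] rmsNorm(double[] x, double[] weight, double eps) {
        double sumSquares = 0.0;
        for (double v : x) {
            sumSquares += v * v;
        }
        // eps keeps the denominator strictly positive, even for a zero vector.
        double scale = 1.0 / Math.sqrt(sumSquares / x.length + eps);
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = x[i] * scale * weight[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] ones = {1.0, 1.0};
        double[] y = rmsNorm(new double[] {0.0, 0.0}, ones, 1e-5);
        System.out.println(Double.isNaN(y[0])); // false: eps avoids 0 * infinity
    }
}
```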
-
ropeTheta
public double ropeTheta()
Returns the value of the ropeTheta record component.
- Returns:
the value of the ropeTheta record component
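In rotary positional encoding, theta is the base of a geometric sequence of rotation frequencies, one per pair of head dimensions. The sketch below computes the standard inverse frequencies 1 / theta^(2i / headDim). The names are hypothetical, headDim is assumed to be dim / numHeads, and the additional frequency scaling implied by scaledRope is not covered.

```java
final class RopeFrequenciesSketch {
    /** Inverse frequencies 1 / theta^(2i / headDim) for i in [0, headDim / 2). */
    static double[] inverseFrequencies(int headDim, double theta) {
        if (headDim <= 0 || headDim % 2 != 0) {
            throw new IllegalArgumentException("headDim must be a positive even number");
        }
        double[] freqs = new double[headDim / 2];
        for (int i = 0; i < freqs.length; i++) {
            freqs[i] = 1.0 / Math.pow(theta, (2.0 * i) / headDim);
        }
        return freqs;
    }

    public static void main(String[] args) {
        double[] freqs = inverseFrequencies(128, 10000.0);
        // The first pair rotates fastest; later pairs rotate progressively slower.
        System.out.println(freqs[0] + " ... " + freqs[freqs.length - 1]);
    }
}
```

A larger theta lowers the slowest frequencies, so distant positions stay distinguishable over longer sequences.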
-
scaledRope
public boolean scaledRope()
Returns the value of the scaledRope record component.
- Returns:
the value of the scaledRope record component
-
maxBatchSize
public int maxBatchSize()
Returns the value of the maxBatchSize record component.
- Returns:
the value of the maxBatchSize record component
-
maxSeqLen
public int maxSeqLen()
Returns the value of the maxSeqLen record component.
- Returns:
the value of the maxSeqLen record component
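maxBatchSize and maxSeqLen matter mostly for memory planning. The sketch below estimates the number of elements in a key/value cache, assuming the layout of Meta's reference LLaMA code: one key tensor and one value tensor per layer, each of shape [maxBatchSize, maxSeqLen, numKvHeads, headDim]. Whether this library preallocates its cache the same way is an assumption, and the names are hypothetical.

```java
final class KvCacheSizeSketch {
    /** Total cached elements across all layers (keys plus values). */
    static long kvCacheElements(int numLayers, int maxBatchSize, int maxSeqLen,
                                int numKvHeads, int headDim) {
        // The leading 2L covers keys and values and forces long arithmetic,
        // which avoids int overflow for large configurations.
        return 2L * numLayers * maxBatchSize * maxSeqLen * numKvHeads * headDim;
    }

    public static void main(String[] args) {
        // 32 layers, batch 1, 2048 positions, 8 KV heads, head dimension 128.
        System.out.println(kvCacheElements(32, 1, 2048, 8, 128)); // 134217728
    }
}
```

At 2 bytes per element (a 16-bit float type), the example configuration comes to 256 MiB, and the figure scales linearly with both maxBatchSize and maxSeqLen.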
-