Llama-3.1-Nemotron-Ultra-253B-CPT-v1
Other
Llama-3.1-Nemotron-Ultra-253B-CPT-v1 is a large language model derived from Meta's Llama-3.1-405B-Instruct. It supports a context length of 128K tokens and was optimized through Neural Architecture Search (NAS) to balance accuracy against efficiency.
Large Language Model
Transformers English