
Together AI launches fine-tuning tools to improve LLMs

Thu, 17th Apr 2025

Together AI has introduced a new Fine-Tuning Platform enabling users to customise AI models that continuously evolve and adapt to user preferences and fresh data inputs.

The updated platform features Direct Preference Optimization and continued training capabilities, designed to support businesses and developers in refining large language models (LLMs) over time, rather than as a one-off process. This approach allows models such as Llama and Gemma to align more closely with user expectations and specific industry requirements.

A newly launched browser-based interface now lets users run fine-tuning jobs without writing any code. Previously, users needed to install a Python SDK or interact with an API, which involved additional setup. The browser UI supports uploading datasets, specifying training parameters, and viewing experiment results in one place.
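
For context, the SDK route the browser UI now replaces looks roughly like the sketch below. This uses the `together` Python package; the model name, filename, and hyperparameters are illustrative rather than taken from the announcement.

```python
# Minimal sketch of launching a fine-tuning job via the Together Python SDK.
# Assumes `pip install together` and TOGETHER_API_KEY set in the environment.
from together import Together

client = Together()

# Upload a JSONL training file (one example per line) -- filename is illustrative.
train_file = client.files.upload(file="training_data.jsonl")

# Launch the fine-tuning job; model and hyperparameters are example values.
job = client.fine_tuning.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",
    training_file=train_file.id,
    n_epochs=3,
    learning_rate=1e-5,
)
print(job.id, job.status)
```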

The platform's Direct Preference Optimization (DPO) feature enables training LLMs on preference data, rather than simply imitating example responses as in conventional supervised fine-tuning. With DPO, models can be steered towards producing responses that users prefer and away from undesired replies, which adds nuance and flexibility to model behaviour. DPO does not require an additional reward model and is available by uploading a preference dataset and selecting the corresponding training method within the platform.
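
Under the hood, DPO optimises a contrastive objective over (prompt, preferred response, rejected response) triples. The following is a minimal PyTorch-style sketch of that published loss, not Together AI's implementation; each argument is a tensor of summed token log-probabilities from either the model being tuned (policy) or a frozen reference copy.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    No separate reward model is needed: the implicit reward for a
    response is the log-ratio between the policy and the reference.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the margin between preferred and dispreferred responses upward.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```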

The continued training function allows fine-tuning jobs to start from previous model checkpoints. This capability is particularly valuable for businesses seeking models that adapt over time as new data becomes available from user interactions. Users can initiate continued training by specifying the checkpoint of a prior job, enabling an iterative approach to model improvement.
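
In SDK terms, continued training amounts to pointing a new job at an earlier job's checkpoint. The sketch below assumes a `from_checkpoint` parameter on the job-creation call; that parameter name, the file, and the checkpoint ID are illustrative assumptions rather than confirmed details from the announcement.

```python
from together import Together

client = Together()

# Upload data gathered since the previous run (illustrative filename).
new_data = client.files.upload(file="new_interactions.jsonl")

# Start a new job from the prior job's checkpoint rather than the base
# model; `from_checkpoint` and the ID below are assumptions for illustration.
job = client.fine_tuning.create(
    training_file=new_data.id,
    from_checkpoint="ft-abc123",  # hypothetical earlier job/checkpoint ID
)
print(job.id)
```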

Protege AI, an early user of the platform's continued fine-tuning feature, has applied it in developing hyper-personalised models for enterprise marketing compliance. Alex Chung, Founder of Protege AI, said, "After thoroughly evaluating multiple LLM infrastructure providers, we're thrilled to be partnering with Together for fine-tuning. The new resuming from a checkpoint functionality combined with LoRA serving has enabled our customers to deeply tune our foundational model, ShieldLlama, for their enterprise's precise risk posture. The level of accuracy would never be possible with vanilla open source or prompt engineering."

Additional improvements include support for new open models, such as Google's Gemma 3 and various versions of distilled DeepSeek-R1, ranging from DeepSeek-R1-Distill-Qwen-1.5B to DeepSeek-R1-Distill-Llama-70B. Plans are in place to add training support for even larger models, including Llama 4 and DeepSeek R1, in the coming weeks.

New fine-tuning features allow users to assign weights to specific messages in conversational data, encouraging the model to ignore or de-emphasise irrelevant or low-quality responses during training. The platform now also offers a cosine learning rate scheduler, a technique widely used for training leading LLMs, in addition to the linear scheduler. Users can configure the number of scheduler cycles and the portion of training steps allocated to warmup.
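
For readers unfamiliar with the technique, a cosine schedule ramps the learning rate up linearly during warmup and then decays it along a cosine curve, optionally restarting across several cycles. The following is a minimal standalone sketch of that idea, not Together AI's code:

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0,
              warmup_frac=0.05, num_cycles=1):
    """Cosine learning-rate schedule with linear warmup.

    `warmup_frac` is the portion of training steps spent warming up;
    `num_cycles` is how many full decay cycles follow the warmup.
    """
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * (step + 1) / warmup_steps
    # Progress through the post-warmup portion, in [0, 1).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # With num_cycles > 1, the rate restarts at base_lr each cycle.
    cycle_pos = (progress * num_cycles) % 1.0
    cosine = 0.5 * (1.0 + math.cos(math.pi * cycle_pos))
    return min_lr + (base_lr - min_lr) * cosine

# Example: 10% warmup over a 1,000-step run.
lrs = [cosine_lr(s, 1000, base_lr=1e-5, warmup_frac=0.1) for s in range(1000)]
```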

Data preprocessing has been optimised for greater efficiency, particularly with large datasets. Internal large-scale training runs indicated training speed improvements of up to 32% for the largest jobs and 17% for regular ones. These enhancements have been available since March, reducing processing times with no additional charge and no impact on model quality.

Together AI has revised the pricing structure for fine-tuning services to lower and clarify costs. For models with up to 16 billion parameters, LoRA training is priced at USD $0.48 per one million tokens and full training at USD $0.54. Models in the 16.1–69 billion parameter range are priced at USD $1.50 and USD $1.65, respectively, while 70–100 billion parameter models are available at USD $2.90 for LoRA and USD $3.20 for full training. Pricing calculations include both training and validation data. Using the DPO feature multiplies the cost by 2.5 due to additional computing requirements. There is no minimum spending threshold, allowing users to undertake smaller customisation projects and pay only for resources used.
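
As a worked example of how those rates combine, the short sketch below computes the cost of a hypothetical job from the published per-million-token prices; the job size is invented for illustration.

```python
# Published rates in USD per one million tokens, by parameter tier.
RATES = {
    "up_to_16B": {"lora": 0.48, "full": 0.54},
    "16.1B_69B": {"lora": 1.50, "full": 1.65},
    "70B_100B":  {"lora": 2.90, "full": 3.20},
}

def job_cost(tokens, tier, method="lora", dpo=False):
    """Cost in USD; `tokens` counts training plus validation tokens,
    and DPO multiplies the price by 2.5."""
    price = RATES[tier][method] * tokens / 1_000_000
    return price * 2.5 if dpo else price

# e.g. 10M tokens of LoRA fine-tuning on an 8B model:
print(job_cost(10_000_000, "up_to_16B"))            # 4.80
# The same data trained with DPO:
print(job_cost(10_000_000, "up_to_16B", dpo=True))  # 12.00
```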

The platform provides users complete control and ownership over model outputs, with deployment possible either through Together AI's service or as a local download, regardless of how the model was trained. Together AI plans to expand the platform to support larger-scale training, introduce additional tools for end-to-end model development, and broaden the selection of available models and features.
