

How to Use Hugging Face AutoTrain to Fine-tune LLMs
Image by Editor

 

 

In recent times, Large Language Models (LLMs) have changed how people work and have been applied in many fields, such as education, marketing, and research. Given this potential, an LLM can be enhanced to solve our business problems better, which is why we might perform LLM fine-tuning.

We want to fine-tune our LLM for several reasons, including adapting it to specific domain use cases, improving accuracy, data privacy and security, and controlling model bias, among many others. With all these benefits, it is essential to learn how to fine-tune our LLM before putting one into production.

One way to perform LLM fine-tuning automatically is with Hugging Face's AutoTrain. HF AutoTrain is a no-code platform with a Python API for training state-of-the-art models on various tasks, such as Computer Vision, Tabular, and NLP tasks. We can use AutoTrain even if we don't understand much about the LLM fine-tuning process.

So, how does it work? Let's explore further.

 

 

Even though HF AutoTrain is a no-code solution, we can build on top of AutoTrain with its Python API. We will take the code route here, as the no-code platform isn't stable for training. However, if you want to use the no-code platform, you can create an AutoTrain space using the following page. The overall platform is shown in the image below.

 

Image by Author

 

To fine-tune the LLM with the Python API, we need to install the Python package, which you can do by running the following command.

pip install -U autotrain-advanced

 

Also, we will use the Alpaca sample dataset from Hugging Face, which requires the datasets package to acquire.
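If the datasets library is not installed yet, it can be added the same way:

pip install datasets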

 

Then, use the following code to acquire the data we need.

from datasets import load_dataset

# Load the Alpaca dataset and take its training split
dataset = load_dataset("tatsu-lab/alpaca")
train = dataset["train"]
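To confirm the data loaded correctly, and that it already has the 'text' column that AutoTrain will read later, we can take a quick look (the column names shown are those provided by the Alpaca dataset):

# Inspect the available columns and preview one sample
print(train.column_names)
print(train[0]["text"][:200])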

 

Additionally, we will save the data in CSV format, as we will need it for our fine-tuning.

train.to_csv('train.csv', index=False)

 

With the environment and the dataset ready, let's try using Hugging Face AutoTrain to fine-tune our LLM.

 

 

I will adapt the fine-tuning process from the AutoTrain example, which you can find here. To start the process, we put the data we will use for fine-tuning in a folder called data.

 

Image by Author

 

For this tutorial, I sample only 100 rows of data so our training process is much faster. After the data is ready, we will use a Jupyter Notebook to fine-tune our model. Make sure the data contains a 'text' column, as AutoTrain reads from that column only.
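A minimal sketch of that sampling step might look like this (the data/train.csv path is an assumption that matches the --data-path data/ flag used later):

import os
import pandas as pd

# Load the CSV saved earlier and keep a small random sample
df = pd.read_csv("train.csv")
sample = df.sample(n=100, random_state=42)

# AutoTrain only reads the 'text' column, so that is all we keep;
# the data/ folder is the assumed location AutoTrain will be pointed at
os.makedirs("data", exist_ok=True)
sample[["text"]].to_csv("data/train.csv", index=False)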

First, let’s run the AutoTrain setup utilizing the next command.
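The autotrain-advanced CLI provides a setup command for this step; it installs the remaining training dependencies, so it may take a while.

# Set up the AutoTrain environment (installs additional dependencies)
!autotrain setup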

 

Next, we will provide the information required for AutoTrain to run. The following is the project name and the pre-trained model you want to fine-tune. You can only choose a model that is available on Hugging Face.

project_name="my_autotrain_llm"
model_name="tiiuae/falcon-7b"

 

Then, we will add the HF information if you want to push your model to the repository or use a private model.

push_to_hub = False
hf_token = "YOUR HF TOKEN"
repo_id = "username/repo_name"

 

Lastly, we will initialize the model training parameters in the variables below. You can change them as you like to see whether the result improves.

learning_rate = 2e-4          # step size for the optimizer
num_epochs = 4                # number of passes over the training data
batch_size = 1                # samples per device step
block_size = 1024             # maximum sequence length in tokens
trainer = "sft"               # supervised fine-tuning
warmup_ratio = 0.1            # fraction of steps used for learning-rate warmup
weight_decay = 0.01           # L2 regularization strength
gradient_accumulation = 4     # accumulate gradients to simulate a larger batch
use_fp16 = True               # mixed-precision training
use_peft = True               # parameter-efficient fine-tuning (LoRA)
use_int4 = True               # 4-bit quantization of the base model
lora_r = 16                   # LoRA rank
lora_alpha = 32               # LoRA scaling factor
lora_dropout = 0.045          # dropout applied to the LoRA layers

 

With all the information ready, we will set the environment variables to hold everything we have configured previously.

import os
os.environ["PROJECT_NAME"] = project_name
os.environ["MODEL_NAME"] = model_name
os.environ["PUSH_TO_HUB"] = str(push_to_hub)
os.environ["HF_TOKEN"] = hf_token
os.environ["REPO_ID"] = repo_id
os.environ["LEARNING_RATE"] = str(learning_rate)
os.environ["NUM_EPOCHS"] = str(num_epochs)
os.environ["BATCH_SIZE"] = str(batch_size)
os.environ["BLOCK_SIZE"] = str(block_size)
os.environ["WARMUP_RATIO"] = str(warmup_ratio)
os.environ["WEIGHT_DECAY"] = str(weight_decay)
os.environ["GRADIENT_ACCUMULATION"] = str(gradient_accumulation)
os.environ["USE_FP16"] = str(use_fp16)
os.environ["USE_PEFT"] = str(use_peft)
os.environ["USE_INT4"] = str(use_int4)
os.environ["LORA_R"] = str(lora_r)
os.environ["LORA_ALPHA"] = str(lora_alpha)
os.environ["LORA_DROPOUT"] = str(lora_dropout)

 

To run AutoTrain in our notebook, we will use the following command.

!autotrain llm \
--train \
--model ${MODEL_NAME} \
--project-name ${PROJECT_NAME} \
--data-path data/ \
--text-column text \
--lr ${LEARNING_RATE} \
--batch-size ${BATCH_SIZE} \
--epochs ${NUM_EPOCHS} \
--block-size ${BLOCK_SIZE} \
--warmup-ratio ${WARMUP_RATIO} \
--lora-r ${LORA_R} \
--lora-alpha ${LORA_ALPHA} \
--lora-dropout ${LORA_DROPOUT} \
--weight-decay ${WEIGHT_DECAY} \
--gradient-accumulation ${GRADIENT_ACCUMULATION} \
$( [[ "$USE_FP16" == "True" ]] && echo "--fp16" ) \
$( [[ "$USE_PEFT" == "True" ]] && echo "--use-peft" ) \
$( [[ "$USE_INT4" == "True" ]] && echo "--use-int4" ) \
$( [[ "$PUSH_TO_HUB" == "True" ]] && echo "--push-to-hub --token ${HF_TOKEN} --repo-id ${REPO_ID}" )

 

If you run AutoTrain successfully, you should find the following folder in your directory with all the model and tokenizer files produced by AutoTrain.
 

Image by Author

 

To test the model, we will use the Hugging Face transformers package with the following code.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "my_autotrain_llm"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

 

Then, we can try to evaluate our model based on the training input we have given. For example, we use “Health benefits of regular exercise” as the input.

input_text = "Well being advantages of standard train"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = mannequin.generate(input_ids)
predicted_text = tokenizer.decode(output[0], skip_special_tokens=False)
print(predicted_text)
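By default, generate produces only a short continuation. A variant with explicit generation settings (the values below are purely illustrative) can make the output easier to compare against the training data:

output = model.generate(
    input_ids,
    max_new_tokens=128,   # allow a longer completion than the default
    do_sample=True,       # sample instead of greedy decoding
    temperature=0.7,      # illustrative value; tune as needed
    top_p=0.9,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))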

 


 

The result could certainly still be better, but at least it is closer to the sample data we have provided. We can try playing around with the pre-trained model and the parameters to improve the fine-tuning.

 

 

There are a few best practices that you might want to know to improve the fine-tuning process, including:

  1. Prepare our dataset with quality that matches the representative task,
  2. Study the pre-trained model that we use,
  3. Use appropriate regularization techniques to avoid overfitting,
  4. Try out the learning rate, starting small and gradually increasing it,
  5. Use fewer epochs for the training, as LLMs usually learn new data quite fast,
  6. Don't ignore the computational cost, as it becomes higher with bigger data, parameters, and models,
  7. Make sure you follow ethical considerations regarding the data you use.

 

 

Fine-tuning our Large Language Model is beneficial to our business process, especially when there are specific requirements we need to meet. With Hugging Face AutoTrain, we can speed up the training process and easily use the available pre-trained models to fine-tune for our use case.
 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media.
