# 🚀 Orkhan/llama-2-7b-absa

**Orkhan/llama-2-7b-absa** is a fine-tuned version of the Llama-2-7b model, optimized for Aspect-Based Sentiment Analysis (ABSA) using a manually labelled dataset of 2,000 sentences. It identifies aspects and analyzes their sentiment accurately, making it valuable for nuanced sentiment analysis in diverse applications. Its advantage over traditional ABSA models is that it generalizes well, so you don't need to train a model on domain-specific labeled data. The trade-off is that it may require more computing power.
## 🚀 Quick Start

### What does it do?

You prompt it with a sentence and get back the aspects, opinions, sentiments, and phrases (opinion + aspect) found in that sentence.

### Example
```python
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)
```

```python
>>> {'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
 'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
 'aspects': ['weather', 'birds', 'smell'],
 'opinions': ['nice', 'flying', 'bad'],
 'sentiments': ['Positive', 'Positive', 'Negative'],
 'phrases': ['nice weather', 'flying birds', 'bad smell']}
```
## ✨ Features

- Optimized for Aspect-Based Sentiment Analysis (ABSA).
- Identifies aspects and analyzes their sentiment accurately.
- Generalizes well without domain-specific labeled training data.
## 📦 Installation

```bash
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
```
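The usage example below loads the model in float16 on GPU 0, so a CUDA-capable GPU is required (a free Colab T4 is enough, per the notes further down). A quick sanity check before loading:

```python
import torch

# Verify a CUDA GPU is visible before loading the model in float16.
assert torch.cuda.is_available(), "A CUDA GPU (e.g. a Colab T4) is required."
print(torch.cuda.get_device_name(0))
```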
## 💻 Usage Examples

### Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "Orkhan/llama-2-7b-absa"

# Load the fine-tuned model in half precision on GPU 0.
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1

# The tokenizer has no pad token; reuse EOS, as is common for Llama models.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
```
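If the float16 weights do not fit your GPU, loading in 4-bit via bitsandbytes (already pinned in the install step) is an option. This is a minimal sketch using common NF4 defaults, not settings verified against this model's training setup:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit NF4 configuration; not from the original card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map={"": 0},
)
```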
```python
def process_output(result, user_prompt):
    # The model echoes the input between '### Human:' and '### Assistant:'.
    generated = result[0]['generated_text']
    interpreted_input = generated.split('### Assistant:')[0].split('### Human:')[1]

    # The assistant's answer ends at the first closing parenthesis.
    new_output = generated.split('### Assistant:')[1].split(')')[0].strip()

    aspects = new_output.split('Aspect detected:')[1].split('##')[0]
    opinions = new_output.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = new_output.split('## Sentiment detected:')[1]

    # Split the comma-separated fields; filtering empty chunks also handles
    # outputs with a single aspect/opinion/sentiment (no comma present).
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]

    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]

    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases,
    }
    return output_dict
```
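For reference, the parsing above implies that raw generations follow a template along these lines (reconstructed from the split markers in `process_output`; the exact layout is an inference, not documented by the card):

```text
### Human: <input sentence>.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)
```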
```python
def process_prompt(user_prompt, model):
    # Wrap the sentence in the '### Human: ... ###' template the model was trained on.
    edited_prompt = "### Human: " + user_prompt + '.###'

    # Cap generation at roughly 4x the prompt's token length to leave room for the answer.
    pipe = pipeline(
        task="text-generation",
        model=model,
        tokenizer=tokenizer,
        max_length=len(tokenizer.encode(user_prompt)) * 4,
    )
    result = pipe(edited_prompt)
    output_dict = process_output(result, user_prompt)
    return result, output_dict


prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)
```
### Advanced Usage

The code above covers the complete flow from model loading to inference. Swap in your own sentence as the prompt to analyze other text, as in the sketch below.
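Because the model was trained on single sentences (see the notes below), longer inputs are best split into sentences first. A minimal sketch, using a naive period split for illustration (a proper sentence tokenizer, e.g. NLTK's `sent_tokenize`, would be more robust):

```python
paragraph = (
    "The food was delicious. However, the service was slow "
    "and the room was way too noisy."
)

# Naive sentence split for illustration; prefer a real sentence tokenizer in practice.
sentences = [s.strip() for s in paragraph.split('.') if s.strip()]

for sentence in sentences:
    _, output_dict = process_prompt(sentence, base_model)
    print(output_dict['phrases'], output_dict['sentiments'])
```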
## 📚 Documentation

### Important Notes

- The model was trained on single sentences, not paragraphs, so feed it one sentence at a time when inferencing.
- It fits in a free, T4-GPU-enabled Google Colab notebook.
- The full code is available in this Colab: [Colab Link](https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing)
## 📄 License

This project is licensed under the Apache-2.0 license.