(Meta AI) How to use LLaMA
LLaMA (Large Language Model Meta AI)
Released in February 2023
//-----------------------------------------------------------------------------
Paper: LLaMA: Open and Efficient Foundation Language Models
https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
Includes benchmarks comparing performance against GPT-3 and other models
//-----------------------------------------------------------------------------
Usage
https://huggingface.co/docs/transformers/main/model_doc/llama
* How to download the leaked model weights (a download sanity-check sketch follows these links)
https://aituts.com/llama/
https://rentry.org/llama-tard-v2
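- A minimal sanity check before converting, assuming the standard LLaMA v1 release layout (tokenizer.model at the top level, consolidated.00.pth and params.json inside the 7B folder); the directory name is the hypothetical download path reused in the conversion example below:
import os

llama_dir = "LLaMA-m"  # hypothetical path to the downloaded weights
expected = ["tokenizer.model", "7B/consolidated.00.pth", "7B/params.json"]
for rel in expected:
    path = os.path.join(llama_dir, rel)
    print(f"{path}: {'OK' if os.path.exists(path) else 'MISSING'}")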
* Convert the original model weights to the Hugging Face Transformers format (see the verification sketch after the example command below)
python src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
- convert_llama_weights_to_hf.py is included in the https://github.com/huggingface/transformers repository
- Example:
python convert_llama_weights_to_hf.py --input_dir LLaMA-m --model_size 7B --output_dir LLaMA-m/output7b
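- A quick way to confirm the conversion succeeded without loading the full weights (a sketch; the llama-7b/ and tokenizer/ subfolders match the output layout assumed by the example code below and may differ with newer versions of the conversion script):
from transformers import AutoConfig, LlamaTokenizer

# Reads only config.json, so this is fast and memory-light
config = AutoConfig.from_pretrained("LLaMA-m/output7b/llama-7b")
print(config.model_type, config.hidden_size, config.num_hidden_layers)

tokenizer = LlamaTokenizer.from_pretrained("LLaMA-m/output7b/tokenizer")
print(tokenizer.tokenize("Hello LLaMA"))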
* Example code
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the converted 7B weights in float16 onto the GPU (loading takes roughly 3 minutes)
model = LlamaForCausalLM.from_pretrained(
    "../llama/output7b/llama-7b/", torch_dtype=torch.float16
).cuda()
tokenizer = LlamaTokenizer.from_pretrained("../llama/output7b/tokenizer/")

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt").input_ids.cuda()

# Generate up to 30 tokens (prompt included) with greedy decoding, then decode back to text
generate_ids = model.generate(inputs, max_length=30)
ret = tokenizer.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(ret)
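- generate() above uses greedy decoding by default; a variant with sampling (a sketch that reuses the model, tokenizer, and inputs loaded above; temperature and top_p values are illustrative):
generate_ids = model.generate(
    inputs,
    max_new_tokens=64,   # cap newly generated tokens instead of total length
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative value
    top_p=0.9,           # nucleus sampling, illustrative value
)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])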