How to use GPT-J

AI 2023. 3. 3. 06:06

//-------------------------------------
eleuther.ai
* GPT-Neo (2021-03)
https://github.com/EleutherAI/gpt-neo


* GPT-J (2021-06) - 6 billion parameters (6B) - roughly Curie-class among OpenAI's GPT models
An open-source counterpart to GPT-3
https://huggingface.co/EleutherAI/gpt-j-6B

- Usage
https://huggingface.co/docs/transformers/model_doc/gptj

https://github.com/kingoflolz/mesh-transformer-jax


* GPT-NeoX (2022-01) - 20 billion parameters (20B)
https://github.com/EleutherAI/gpt-neox
Offered as an API service by NLP Cloud

 


//-----------------------------------------------------------------------------
Usage example: sample code

import torch
from transformers import GPTJForCausalLM, AutoTokenizer


def gptj(text, dev):
    """Generate a continuation of `text` with GPT-J 6B on device `dev` ("cuda" or "cpu")."""
    device = torch.device(dev)

    # Load the half-precision checkpoint. torch_dtype=torch.float16 keeps the
    # weights in fp16 after loading; without it they are loaded as float32.
    # gradient_checkpointing is a training-time option, so it is not passed here.
    # Note: the model is reloaded on every call to this function.
    model = GPTJForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        revision="float16",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        use_cache=True,
    )

    model.to(device)

    # Reuse <|endoftext|> as the pad token; truncation_side='left' keeps the
    # end of a prompt that is too long.
    tokenizer = AutoTokenizer.from_pretrained(
        "EleutherAI/gpt-j-6B", pad_token='<|endoftext|>', eos_token='<|endoftext|>', truncation_side='left')

    # Tokenize and truncate to at most 2048 tokens, then move tensors to the device.
    prompt = tokenizer(text, return_tensors='pt',
                       truncation=True, max_length=2048)
    prompt = {key: value.to(device) for key, value in prompt.items()}
    out = model.generate(**prompt,
                         min_length=16,
                         max_new_tokens=75,
                         do_sample=True,
                         top_k=35,
                         top_p=0.9,
                         temperature=0.75,
                         no_repeat_ngram_size=4,
                         use_cache=True,
                         pad_token_id=tokenizer.eos_token_id
                         )
    # out[0] contains the prompt tokens followed by the generated tokens.
    res = tokenizer.decode(out[0])
    return res


# Example run
text = "The Belgian national football team "
print("generated_text", gptj(text, "cuda"))
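
The temperature, top_k and top_p arguments passed to generate() above control how the next token is sampled. As a rough illustration only, here is a simplified pure-Python sketch of the three settings acting as successive filters on a next-token distribution. It is not the transformers implementation, filter_distribution is a hypothetical name, and the logits are made up.

```python
import math


def filter_distribution(logits, temperature=0.75, top_k=35, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: dividing logits by a value below 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]

    # Softmax (shifted by the maximum for numerical stability).
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely tokens, then renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)
    top = [(i, probs[i] / k_mass) for i in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in top:
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    p_mass = sum(p for _, p in kept)
    return {i: p / p_mass for i, p in kept}


# Toy example with four candidate tokens (made-up logits).
print(filter_distribution([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9))
```

In this toy case only the two most likely tokens survive, and the next token would be drawn from those two. Lower temperature, smaller top_k or smaller top_p all make the output more conservative.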

Takes about 14 seconds on a GPU
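
Because gptj() loads the model on every call, a measurement like this may include model loading as well as generation. The following is a minimal sketch of how the two costs could be measured separately. It assumes two hypothetical helpers that are not in the original code: load_gptj(), which caches the model, and timed(), which measures one call.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def load_gptj(dev="cuda"):
    """Load GPT-J once and reuse it on later calls (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require torch.
    import torch
    from transformers import AutoTokenizer, GPTJForCausalLM

    model = GPTJForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        revision="float16",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
    ).to(dev)
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
    return model, tokenizer


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage (downloads the model, so it is left commented out):
# (model, tokenizer), load_s = timed(load_gptj, "cuda")
# inputs = tokenizer("The Belgian national football team ", return_tensors="pt").to("cuda")
# out, gen_s = timed(model.generate, **inputs, max_new_tokens=75)
# print(f"load: {load_s:.1f}s, generate: {gen_s:.1f}s")
```

With the model cached, a second prompt pays only the generation cost.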

 





//-----------------------------------------------------------------------------

< Reference >
Notebook project file
https://github.com/NielsRogge/Transformers-Tutorials/blob/master/GPT-J-6B/Inference_with_GPT_J_6B.ipynb

pip install -q git+https://github.com/huggingface/transformers.git
pip install accelerate

- Works on CPU, but fails on CUDA

model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16")  # , low_cpu_mem_usage=True)
model.to(device)  # <==== error raised here

OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 15.99 GiB total capacity; 15.10 GiB already allocated; 0 bytes free; 15.10 GiB
reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF

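Solution
Pass torch_dtype=torch.float16 to GPTJForCausalLM.from_pretrained(). Without it the weights are loaded as float32, at 4 bytes per parameter instead of 2, which is too large for the 16 GiB GPU in the error above.

# gradient_checkpointing is a training-time option, so it is not passed here.
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    use_cache=True,
)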
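
The arithmetic behind this fix can be sketched as follows. The sketch assumes roughly 6 billion parameters (the 6B figure above) and counts only the weights, not activations or the CUDA context. weight_memory_gib is a hypothetical helper, not part of any library.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 6e9  # assumption: about 6 billion parameters ("6B")

fp32_gib = weight_memory_gib(N_PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gib = weight_memory_gib(N_PARAMS, 2)  # float16: 2 bytes per parameter

print(f"float32: {fp32_gib:.1f} GiB, float16: {fp16_gib:.1f} GiB")
```

Under these assumptions the float32 weights alone exceed the 15.99 GiB reported in the error message, while the float16 weights fit with some room left for activations.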

 

//-----------------------------------------------------------------------------

< Reference >

 

OpenAI
* GPT-3 (2020-06)
* GPT-3.5 (2022-03)

//-------------------------------------
eleuther.ai

* GPT-NeoX-20B (2022-02)
EleutherAI/gpt-neox-20b 
https://huggingface.co/EleutherAI/gpt-neox-20b

 

 

Posted by codens