[NLP] How to use GPT-2 (Windows)
//------------------------------------- 
    - The gpt-2 code targets TensorFlow v1.x or lower 
The latest TensorFlow v1.x release is v1.15 
    - Python versions required by TensorFlow 
https://www.tensorflow.org/install/source#tested_build_configurations 
tensorflow-1.15.0 : Python v2.7, v3.3-3.7 
tensorflow-2.9.0 : Python v3.7-3.10 
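
As a quick sanity check before installing anything, the version range listed above for tensorflow-1.15.0 can be encoded in a few lines. This is only a minimal sketch: the function name `supports_tf115` is made up for illustration, and the range is simply the one quoted from the tested build configurations page.

```python
import sys

def supports_tf115(version=None):
    """Return True if (major, minor) is in the range listed for tensorflow-1.15.0.

    The range (2.7 and 3.3 through 3.7) is copied from the notes above;
    the function name is hypothetical.
    """
    major, minor = (version or sys.version_info)[:2]
    if major == 2:
        return minor == 7
    return major == 3 and 3 <= minor <= 7

# Report on the interpreter that is running this script.
print("Python %d.%d usable for tensorflow 1.15: %s"
      % (sys.version_info[0], sys.version_info[1], supports_tf115()))
```

Running it inside the activated environment shows whether the interpreter there falls in the supported range.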
//------------------------------------- 
    - Download the source 
> git clone https://github.com/openai/gpt-2.git 
> cd gpt-2 
//------------------------------------- 
    - Create a virtual environment 
> conda create --name gpt2 python=3.7 
> conda activate gpt2 
//------------------------------------- 
    - Install pip if it is missing 
https://pip.pypa.io/en/stable/installation/ 
> py -m ensurepip --upgrade 
Alternatively, download https://bootstrap.pypa.io/get-pip.py and run it 
> py get-pip.py 
> pip --version
> python -m pip install --upgrade pip
    - Check installed package versions 
> pip list 
> pip show <package-name> 
//-------------------------------------
    - Install PyTorch
https://pytorch.org/get-started/locally/
//------------------------------------- 
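    - Install TensorFlow (the first command is the CPU-only package, the second the GPU package) 
> pip install tensorflow==1.15 
> pip3 install tensorflow-gpu==1.15.0 
    - Install protobuf v3.20 or lower 
    - Install the required packages 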
> pip install -r requirements.txt 
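
To confirm that the protobuf constraint above is met, something like the following could be run in the environment. It is a sketch under assumptions: `protobuf_version_ok` is a hypothetical helper, and it treats any 3.20.x release as "v3.20 or lower", which is an interpretation of the note rather than a documented rule.

```python
def protobuf_version_ok(version_string, max_major_minor=(3, 20)):
    """True if the major.minor part of version_string is <= max_major_minor.

    Assumption: patch releases such as 3.20.3 count as "3.20 or lower".
    """
    parts = version_string.split(".")
    major_minor = (int(parts[0]), int(parts[1]))
    return major_minor <= max_major_minor

try:
    import google.protobuf  # the installed protobuf package
    installed = google.protobuf.__version__
    print("protobuf", installed, "OK" if protobuf_version_ok(installed) else "too new")
except ImportError:
    print("protobuf is not installed in this environment")
```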
//------------------------------------- 
    - Download a pre-trained model 
GPT-2 model list: 117M, 124M, 355M, 774M, 1558M 
> python download_model.py 117M 
> python download_model.py 124M 
> python download_model.py 355M 
> python download_model.py 774M 
> python download_model.py 1558M 
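
After a download finishes, it can help to check that nothing is missing. The sketch below assumes that download_model.py saves each model under `models\<model name>` and that it fetches the seven files listed in `EXPECTED_FILES`; both are assumptions based on the script as commonly published, so compare them with your own checkout. The helper name `missing_model_files` is hypothetical.

```python
import os

# Assumed file list and folder layout; verify against download_model.py
# in your copy of the repository.
EXPECTED_FILES = [
    "checkpoint", "encoder.json", "hparams.json",
    "model.ckpt.data-00000-of-00001", "model.ckpt.index",
    "model.ckpt.meta", "vocab.bpe",
]

def missing_model_files(model_name, models_dir="models"):
    """Return the expected files that are absent from models_dir/model_name."""
    model_path = os.path.join(models_dir, model_name)
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(model_path, name))]

# Run from the gpt-2 folder; an empty list means all files were found.
print(missing_model_files("124M"))
```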
//------------------------------------- 
    - Run 
> python src\generate_unconditional_samples.py
    - Generates random text 
> python src\interactive_conditional_samples.py
    - Generates text that continues the input 
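
Both scripts take their options as `--name=value` flags (for example `--model_name`, seen later in these notes). A small helper for assembling such a command line might look like the sketch below; `build_command` is a hypothetical name, and `nsamples` and `length` are assumed to be accepted by the script in the same way as `model_name`.

```python
import sys

def build_command(script, **options):
    """Build an argv list such as [python, script, '--length=256']."""
    cmd = [sys.executable, script]
    for name, value in sorted(options.items()):
        cmd.append("--%s=%s" % (name, value))
    return cmd

cmd = build_command(r"src\generate_unconditional_samples.py",
                    model_name="124M", nsamples=1, length=256)
print(" ".join(cmd))
# To actually launch it from the gpt-2 folder:
#   import subprocess; subprocess.run(cmd, check=True)
```

Options are sorted by name only so that the generated command is predictable; the order does not matter to the script.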
//------------------------------------- 
    - Error 
dlerror: cudart64_110.dll not found 
tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found 
    - Fix 
Install the CUDA Toolkit 
https://developer.nvidia.com/cuda-toolkit 
Older versions can be downloaded from 
https://developer.nvidia.com/cuda-toolkit-archive 
    - Download the package, extract it, and copy only the dll files into the CUDA folder (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin) 
https://developer.nvidia.com/rdp/cudnn-archive 
    cudnn64_7.dll
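
To see whether the DLLs named above are reachable after copying them, one option is to scan the folders on PATH. This is a simplified sketch: Windows also searches other locations when loading a DLL, so a miss here is a hint rather than proof, and `find_on_path` is a hypothetical helper.

```python
import os

def find_on_path(filename, path_value=None):
    """Return every folder in a PATH-style string that contains filename."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for folder in path_value.split(os.pathsep):
        if folder and os.path.isfile(os.path.join(folder, filename)):
            hits.append(folder)
    return hits

# DLL names taken from the error message and the cuDNN note above.
for dll in ("cudart64_110.dll", "cudnn64_7.dll"):
    print(dll, "->", find_on_path(dll) or "not found on PATH")
```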
//-----------------------------------------------------------------------------
//
Execution time (seconds)
- generate_unconditional_samples.py
CPU = Ryzen 5950X , GPU = RTX 2060
| Device | Length | 124M | 355M | 774M |
| CPU | 1024 | 66 | 165 | 320 |
| GPU | 1024 | 17 | 30 | 53 |
| GPU speedup (x) | 1024 | 5.1 | 7.1 | 7.2 |
| Startup time | | 5 | 8 | 10 |
| GPU | 512 | 11 | 20 | 33 |
| GPU | 256 | 8 | 15 | 23 |
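
The speedup figures (5.1, 7.1, 7.2) do not match a plain CPU/GPU ratio of the length-1024 timings. They do match if the startup time (5, 8, 10 seconds) is subtracted from both measurements first. That reading is an inference from the numbers, not something the notes state; the sketch below reproduces it.

```python
# Timings in seconds, copied from the table (length 1024).
cpu = {"124M": 66, "355M": 165, "774M": 320}
gpu = {"124M": 17, "355M": 30, "774M": 53}
startup = {"124M": 5, "355M": 8, "774M": 10}

def speedup(model):
    """CPU/GPU ratio after removing startup time (assumed method)."""
    return (cpu[model] - startup[model]) / (gpu[model] - startup[model])

for model in ("124M", "355M", "774M"):
    print(model, round(speedup(model), 1))
```

For 124M this gives (66 - 5) / (17 - 5) = 61 / 12, which rounds to 5.1.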
//----------------------------------------------------------------------------- 
* gpt-2: output of interactive_conditional_samples.py for each model 
    - Input prompt (taken from a GPT-3 example) 
Suggest three names for an animal that is a superhero. 
Animal: Cat 
Names: Captain Sharpclaw, Agent Fluffball, The Incredible Feline 
Animal: Dog 
Names: Ruff the Protector, Wonder Canine, Sir Barks-a-Lot 
Animal: Duck 
Names: 
//------------------------------------- 
i --model_name="117M" 
 Hawkeye, Falcon 
Wild: Cat 
--- 
Real life: Toothbrush 
Game Marketing Approach 
//------------------------------------- 
i --model_name="124M" 
 Beast Maximoff, Aquaman , Tiger Shark 
Names: Folly the Beast, Warrior Marvel, and Other Animals 
Similarly excellent work online. 
//------------------------------------- 
i --model_name="355M" 
 Pharaoh's Snot Mug, Birdman 
Animal: Dolphin 
Names: Kudwani, Jack Hammerheart, A* resurrected 
Animal: Unicorn 
Names: Casino Koala, Heretic 
Animal: Wolf 
Names: Dirk 
//------------------------------------- 
i --model_name="774M" 
_______, Doxy Bloggie, Bunch of Dudes, Giant Voyageurs 
Animal: Chicken 
Names: Spetumor 
Animal: Creepy Clown 
Names: Orange Capri, Chima Mac% 
Animal: Robot 
//------------------------------------- 
i --model_name="1558M" 
 Mr. Karate 
Animal: Frog 
Names: Burning Dagger, Scion of Flame, Creature of Fairy Rain 
//-----------------------------------------------------------------------------
//------------------------------------- 
// References
https://github.com/openai/gpt-2/blob/master/DEVELOPERS.md
Explanation of the gpt-2 options temperature, top_k, top_p and nsamples
https://towardsdatascience.com/how-to-sample-from-language-models-682bceb97277
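
As a rough illustration of the sampling options mentioned above, here is a plain-Python sketch of temperature scaling, top-k filtering and top-p (nucleus) filtering. It is a generic version of these techniques, not the code from the gpt-2 repository; the function names are made up, and treating `k = 0` as "no restriction" is an assumption about how the option behaves. `nsamples` is not shown, since it only sets how many samples are generated.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def _renormalize(probs, keep):
    """Zero out tokens not in keep and rescale the rest to sum to 1."""
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_k_filter(probs, k):
    """Keep the k most likely tokens; k = 0 is assumed to mean no limit."""
    if k == 0:
        return list(probs)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, set(order[:k]))

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose total probability is >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)

probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, 2))
print(top_p_filter(probs, 0.7))
```

With these example probabilities, both filters keep only the first two tokens and rescale them so they sum to 1.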