How to Use OpenAI Whisper for Speech Recognition

AI 2023. 2. 24. 23:44

- A free, open-source program for speech recognition, transcription, and dictation of audio files

https://openai.com/blog/whisper/

https://github.com/openai/whisper

//-------------------------------------
* Installation
> pip install -U openai-whisper

- Alternative: install directly from the GitHub repository
> pip install git+https://github.com/openai/whisper.git

  - Upgrade
> pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git

  - ffmpeg must be installed (Windows example, using Chocolatey)
> choco install ffmpeg

  - Rust build support (needed only if pip fails with a missing setuptools_rust module)
> pip install setuptools-rust
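
Whisper calls ffmpeg to decode audio, so it can help to confirm that both executables are reachable before the first run. A minimal sketch using only the Python standard library; the helper name missing_tools is made up for illustration and is not part of Whisper:

```python
import shutil


def missing_tools(tools=("ffmpeg", "whisper")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and whisper are both on PATH")
```

If something is reported as missing right after installation, opening a new terminal so that PATH is reloaded is usually worth trying first.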

//-------------------------------------
Model types
https://github.com/openai/whisper/blob/main/model-card.md

tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large (= large-v2 at the time of writing), large-v1 (older version)

Models are downloaded automatically to: C:\Users\<username>\.cache\whisper\<model>.
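
To see which models have already been downloaded and how much disk space they take, the cache folder can be listed from Python. This is a sketch that assumes the default location noted above (the .cache\whisper folder under the user's home directory, or under XDG_CACHE_HOME if that variable is set); both function names are hypothetical:

```python
import os
from pathlib import Path


def whisper_cache_dir():
    """Assumed default download folder: <home>/.cache/whisper."""
    base = os.getenv("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "whisper"


def downloaded_models(cache_dir=None):
    """Return (file name, size in MiB) for each file in the cache folder."""
    folder = Path(cache_dir) if cache_dir else whisper_cache_dir()
    if not folder.is_dir():
        return []  # nothing downloaded yet, or a different folder is in use
    return sorted(
        (p.name, round(p.stat().st_size / 2**20, 1))
        for p in folder.iterdir()
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size_mib in downloaded_models():
        print(f"{name}\t{size_mib} MiB")
```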

//-------------------------------------
  - Using the GPU
      - Install torch (CUDA build)
https://github.com/openai/whisper/discussions/47
> pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118

  Then pass the --device cuda option when running whisper
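
Before passing --device cuda, it is worth checking that the installed torch build can actually see the GPU. A small sketch; pick_device is a made-up helper name, while torch.cuda.is_available() is the standard PyTorch call:

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch is not installed at all
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Suggested --device value:", pick_device())
```

If this prints cpu on a machine with an Nvidia card, the installed torch is probably a CPU-only build, and reinstalling it with the pip3 command above may help.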

//-------------------------------------
* Usage
> whisper --model medium.en --language en --device cuda "<audio file name, including extension>"

- When running on the CPU, omit the --device cuda option
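
For batch jobs, the same command line can be assembled and launched from a script. The sketch below only wraps the whisper command shown above with the flags used in this post; the function names are hypothetical, and device is left out by default so that the CPU case matches the note above:

```python
import subprocess


def build_whisper_command(audio_path, model="medium.en", language="en", device=None):
    """Build the argument list for the whisper CLI, using the flags from this post."""
    cmd = ["whisper", "--model", model, "--language", language]
    if device:  # omit --device entirely when running on the CPU
        cmd += ["--device", device]
    cmd.append(audio_path)
    return cmd


def transcribe(audio_path, **options):
    """Run whisper on one file; raises CalledProcessError if the command fails."""
    subprocess.run(build_whisper_command(audio_path, **options), check=True)


if __name__ == "__main__":
    # "interview.mp3" is a placeholder file name
    print(" ".join(build_whisper_command("interview.mp3", device="cuda")))
```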

  - Speed: time taken to transcribe an 8-second audio clip
      - CPU (5950X): 45 s
      - GPU (RTX 4080): 15 s

//-------------------------------------

// Elapsed time (with an RTX 4080 GPU, English interview audio)

Program loading time: 13-15 s

Audio length (min)	Total time (s)	Net analysis time (s)
1	23	10
2	33	20
5	70	57
10	135	122
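
The arithmetic behind the table can be checked with a few lines of Python. The numbers are copied from the rows above; the derived columns (fixed overhead and speed relative to real time) are simple calculations and not additional measurements:

```python
# (audio length in minutes, total seconds, net analysis seconds), from the table above
ROWS = [(1, 23, 10), (2, 33, 20), (5, 70, 57), (10, 135, 122)]


def summarize(rows=ROWS):
    """Return (minutes, overhead seconds, speed as a multiple of real time) per row."""
    out = []
    for minutes, total_s, net_s in rows:
        overhead_s = total_s - net_s   # time not spent on analysis
        speed = minutes * 60 / net_s   # audio seconds processed per second
        out.append((minutes, overhead_s, round(speed, 1)))
    return out


if __name__ == "__main__":
    for minutes, overhead_s, speed in summarize():
        print(f"{minutes} min: overhead {overhead_s} s, about {speed}x real time")
```

Every row works out to 13 s of overhead, which matches the low end of the loading time, and the analysis itself runs at roughly 5 to 6 times real time on this GPU.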


Posted by codens
