코드루덴스

A Collection of LLM Benchmark Sites

AI 2024. 9. 27. 03:16

A collection of sites for measuring LLM (Large Language Model) performance

* Chatbot Arena LLM Leaderboard: Community-driven Evaluation for Best LLM and AI chatbots
https://lmarena.ai/?leaderboard=

* Comparison of Models: Quality, Performance & Price Analysis
https://artificialanalysis.ai/models/

//-------------------------------------
* SEAL - LLM Leaderboards, Expert-Driven Private Evaluations
https://scale.com/leaderboard

* KLU - LLM Leaderboard
https://klu.ai/llm-leaderboard

//-------------------------------------
* LiveBench: A Challenging, Contamination-Free LLM Benchmark
https://livebench.ai/

//-------------------------------------
* LLM Benchmarks: Overview, Limits and Model Comparison
https://www.vellum.ai/blog/llm-benchmarks-overview-limits-and-model-comparison


Posted by codens
