AI Solutions
Discover and compare the best AI tools, rated by the community
API access to Claude models for AI applications.
API access to GPT-4, DALL-E, Whisper, and more AI models.
Enterprise AI platform for text generation and embeddings.
Jurassic language models for enterprise AI applications.
Build with Gemini models through Google's AI platform.
Cloud platform for running open-source AI models.
Run open-source ML models with a cloud API.
Platform for open-source AI models and datasets.
Ultra-fast AI inference with custom LPU hardware.
Fast inference platform for generative AI models.
Platform for scaling AI workloads with Ray.
API for Perplexity's search-augmented language models.
European AI company with powerful open-weight models.
AI compute platform with wafer-scale chips.
Cloud platform for running AI workloads serverlessly.
Serverless GPU infrastructure for ML inference.
Platform for deploying ML models as APIs.
Amazon's managed service for foundation models.
Efficient AI inference cloud for GenAI applications.
Serverless inference for popular AI models.
Enterprise AI platform with custom hardware.
OpenAI models hosted on Microsoft Azure.
Google Cloud's unified AI platform.
GPU cloud for AI inference and training.
Platform for fine-tuning and serving LLMs.
Enterprise LLM platform for fine-tuning and inference.
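Most of the hosted model APIs and inference platforms above expose a similar chat-style request/response interface. The sketch below uses the official OpenAI Python SDK as one concrete instance; the model name and prompt are illustrative, and other providers differ in client details but follow the same messages-in, text-out pattern.

```python
# Minimal sketch of a hosted chat-completion API call (OpenAI Python SDK).
# Model name and prompt are illustrative; most providers listed above
# expose a similar interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any chat-capable model name works here
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what a vector database does."},
    ],
)
print(response.choices[0].message.content)
```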
AI speech recognition and transcription API.
API for speech-to-text and audio intelligence.
Speech-to-text API with high accuracy.
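The transcription APIs above generally accept an audio file and return text. A minimal sketch using OpenAI's Whisper endpoint as one concrete example; the file name is illustrative, and other providers' clients differ in details.

```python
# Sketch of a typical speech-to-text API call, using OpenAI's Whisper
# endpoint as one example.
from openai import OpenAI

client = OpenAI()

with open("meeting.mp3", "rb") as audio_file:  # illustrative file name
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```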
Vector database for AI applications.
Open-source vector search engine.
Vector similarity search engine and database.
Open-source embedding database for AI apps.
Open-source vector database for AI workloads.
Serverless vector database for AI applications.
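The vector databases above share a common workflow: embed documents, index them, and query by similarity. A minimal sketch with the open-source Chroma client; the collection name and documents are illustrative, and the other databases follow the same add-then-query pattern with their own clients.

```python
# Minimal vector-search sketch using Chroma's in-memory client.
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="docs")  # illustrative name

# Chroma embeds the documents with its default embedding function.
collection.add(
    ids=["1", "2"],
    documents=[
        "Vector databases store embeddings for similarity search.",
        "LPUs are custom chips for fast LLM inference.",
    ],
)

results = collection.query(
    query_texts=["How do I search embeddings?"],
    n_results=1,
)
print(results["documents"])
```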
A pioneering benchmark designed to comprehensively assess honesty in LLMs.
Evaluates LLMs' ability to call external functions and tools.
A challenging, contamination-free LLM benchmark.
An automatic evaluator for instruction-following language models using the Nous benchmark suite.
A benchmark designed to evaluate large language models in the legal domain.
A benchmark designed to evaluate LLMs on their ability to answer real-world coding questions.
A benchmark evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes).
A comprehensive benchmarking platform that evaluates large models' mathematical abilities across 20 fields and nearly 30,000 math problems.
CompassRank is dedicated to exploring the most advanced language and visual models, offering a comprehensive, objective, and neutral evaluation reference for industry and research.
A ground-truth-based dynamic benchmark derived from off-the-shelf benchmark mixtures; its rankings correlate highly with Chatbot Arena (0.96) while running locally and quickly (about 6% of the time and cost of running MMLU).
A benchmark that evaluates large language models on a variety of multimodal reasoning tasks, including language, natural and social sciences, physical and social commonsense, temporal reasoning, algebra, and geometry.
Focuses on how models perform across various scenarios and analyzes the results from an interpretability perspective.
A meta-benchmark that evaluates how well factuality evaluators assess the outputs of large language models (LLMs).
A benchmark for evaluating the performance of LLMs on tasks involving both textual and visual imagination.
A multimodal question-answering benchmark designed to evaluate AI models' cognitive ability to understand human beliefs and goals.
A biomedical question-answering benchmark for answering research-related questions using PubMed abstracts.
A benchmark that evaluates large language models' ability to answer medical questions across multiple languages.
A large-scale document visual question answering (VQA) dataset designed for complex document understanding, particularly of financial reports.
A Swedish language understanding benchmark that evaluates NLP models on tasks such as argumentation analysis, semantic similarity, and textual entailment.
A benchmark designed to evaluate LLMs on solving complex, college-level scientific problems from domains such as chemistry, physics, and mathematics.
A benchmark platform for evaluating LLMs across a range of tasks, focusing on natural language understanding, reasoning, and generalization.
A benchmark that evaluates large multimodal models (LMMs) on their ability to perform human-like mathematical reasoning.
A benchmark designed to assess the performance of multimodal web agents on realistic, visually grounded tasks.
A benchmark dataset testing AI's ability to reason about visual commonsense using images that defy normal expectations.
A playground for developers to fine-tune and deploy LLMs.
MLflow: An open-source framework for the end-to-end machine learning lifecycle, helping developers track experiments, evaluate models/prompts, deploy models, and add observability with tracing.
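As a rough illustration of the tracking workflow the MLflow entry describes, here is a minimal sketch; the experiment name, parameter, and metric values are illustrative placeholders.

```python
# Minimal MLflow experiment-tracking sketch; names and values are
# illustrative placeholders, not a real training run.
import mlflow

mlflow.set_experiment("demo")

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.93)
```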
High-quality educational materials you don't want to miss.
Recent Advances on Foundation Models.
An introductory LLM textbook based on [A Survey of Large Language Models](https://arxiv.org/abs/2303.18223).
Explore the world of large language models with over 275 custom-made figures in this illustrated guide!
The latest AI news, curated & explained by GPT-4.
Introducing Cohere Summarize Beta: A New Endpoint for Text Summarization
PFA (Portable Format for Analytics) is a JSON-based standard for representing and exchanging predictive models and analytics workflows in a portable way.
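As a rough illustration, a PFA scoring engine is a JSON document declaring input/output types and an action. The sketch below, expressed as a Python dict, is an assumption-level example in the spirit of the PFA tutorial's "add a constant" engine, not a definitive specification.

```python
# Hedged sketch of a minimal PFA scoring engine as a Python dict:
# it declares input/output types and an action that adds 100 to the input.
# Field names follow the PFA spec; treat the exact document as illustrative.
import json

pfa_engine = {
    "input": "double",
    "output": "double",
    "action": [{"+": ["input", 100]}],
}
print(json.dumps(pfa_engine, indent=2))
```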
Neural Network Exchange Format (NNEF) is an open standard for representing neural network models to enable interoperability and portability across different machine learning frameworks and platforms.