Playground
Compare and test multiple LLMs side by side
Overview
The ModelBench Playground is a powerful environment for testing and comparing LLMs. It lets you interact with multiple models simultaneously, add custom tools, and refine your prompts in real time.
Key Features
Model Selection
- Choose from a wide range of models, including GPT-4, GPT-4 Mini, and many others.
- Compare multiple models side by side to evaluate their performance.
Tool Integration
- Add custom tools to enhance model capabilities.
- Example: adding a `fetch_url_content` tool to let models browse web content (see the sketch below).
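ModelBench's exact tool format isn't spelled out on this page, so treat the snippet below as a minimal sketch that assumes the common function-calling layout (a name, a description, and JSON Schema `parameters`). The `fetch_url_content` name comes from the example above; the `url` parameter and the descriptions are illustrative, not taken from ModelBench's documentation.

```python
import json

# Sketch of a tool definition in the common function-calling format
# (name, description, JSON Schema parameters). The exact fields the
# ModelBench tool section expects may differ; check the Playground UI.
fetch_url_content_tool = {
    "name": "fetch_url_content",
    "description": "Fetch the text content of a web page so the model can read it.",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {
                # The "url" parameter is illustrative, not part of ModelBench itself.
                "type": "string",
                "description": "The URL of the page to fetch.",
            }
        },
        "required": ["url"],
    },
}

# Print the JSON to paste into the Playground's tool section.
print(json.dumps(fetch_url_content_tool, indent=2))
```

Paste the printed JSON into the tool section; models that support tool calling can then choose to invoke `fetch_url_content` when a prompt needs web content.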
Prompt Engineering
- Write and refine prompts in real time.
- Test how different models respond to the same prompt.
Response Analysis
- View detailed logs of each interaction, including:
  - Exact API requests
  - Token usage
  - Associated costs (a rough calculation is sketched below)
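If it helps to see where the cost figure comes from, the snippet below is a minimal sketch of the usual calculation: providers bill per token, so cost follows from the token counts shown in the log times a per-million-token price. The prices and token counts here are hypothetical placeholders, not ModelBench or provider pricing.

```python
# Sketch of how per-request cost is typically derived from the token counts
# shown in the log. The prices below are placeholders, not ModelBench or
# provider pricing; check your provider's price list for real numbers.
INPUT_PRICE_PER_M = 2.50    # hypothetical USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = 10.00  # hypothetical USD per 1M completion tokens

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request from its token usage."""
    return (prompt_tokens / 1_000_000) * INPUT_PRICE_PER_M \
        + (completion_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: 1,200 prompt tokens and 300 completion tokens.
print(f"${estimate_cost(1_200, 300):.4f}")  # $0.0060
```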
Sharing
- Easily share your prompts and results with others using a generated link.
- Collaborate with prompt engineers and get feedback on your work.
How to Use
- Select the models you want to compare from the available options.
- Add any necessary tools by pasting their JSON schema into the tool section.
- Write your prompt in the input area.
- Run the prompt and observe how different models respond.
- Refine your prompt based on the results and repeat the process.
- Use the “Show Log” feature to view detailed information about each interaction.
- Share your work using the “Share” button to generate a public link.
The Playground is your sandbox for quick experimentation and model comparison. For more structured testing and benchmarking, check out our Workbench feature.