Popular repositories
- gorilla Public Forked from ShishirPatil/gorilla
  Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
  Python
- tau2-bench Public Forked from sierra-research/tau2-bench
  τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
  Python
Repositories
Showing 10 of 15 repositories
- benchmarked-free-ride-ci Public
- cracker Public
- inference-benchmark Public
- openclawbench Public
- agentdojo-benchmark Public Forked from ethz-spylab/agentdojo
  A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
- tau2-bench Public Forked from sierra-research/tau2-bench
  τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
People
This organization has no public members. You must be a member to see who’s a part of this organization.