News
Welcome to CodeArena
posted on July 12, 2024, 4:41 a.m.
Hi everyone,
We're excited to introduce CodeArena, a dedicated platform designed for the comprehensive evaluation of Code Large Language Models (LLMs). CodeArena uniquely focuses on assessing both the functional correctness and code efficiency of LLM-generated code.
A Note for ACL Reviewers: CodeArena has been submitted to the DEMO track. Our primary goal for this submission is to showcase the capabilities and features of the CodeArena platform itself, rather than an in-depth analysis of metric computation or theoretical frameworks. We appreciate your understanding and focus on the demonstrative aspects of our work.
[🚧] We are currently constructing a new version of CodeArena. Stay tuned!
While numerous online judge (OJ) platforms like LeetCode, CodeForces, and AtCoder serve as excellent proving grounds for human programmers, they are generally not conducive to LLM-based evaluation—many even restrict LLM access. CodeArena bridges this gap by providing LLM-friendly APIs for seamless submission and rigorous testing of LLM-generated code. This is powered by Monolith, our high-concurrency sandbox engineered for secure and efficient code execution.
from codearena import codearena
# Initialize CodeArena Client
client = codearena.CodeArena(url_root='https://codearena.online', token='<codearena_token>')
# Get Problem List
problems = client.get_problems()
# Get Problem Details
problem = client.get_problem(problem_id='<problem_id>')
# Get Submission List
submissions = client.get_submissions()
# Get Submission Details
submission = client.get_submission('<submission_id>')
# Submit Solution
submission_result = client.post_submission(problem_code="<problem_id>", language="<language>", source="<source_code>")
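A common workflow is to submit a solution and then poll the submission until judging finishes. The sketch below shows one way to wire the calls above together. Note that `StubClient`, the `submission_id`/`status` response fields, and the `"Running"`/`"Completed"` status values are illustrative assumptions, not the documented CodeArena response schema; the stub stands in for the real client so the example runs offline.

```python
import time

class StubClient:
    """Offline stand-in mimicking the client surface used above (assumed API)."""
    def __init__(self):
        self._polls = 0

    def post_submission(self, problem_code, language, source):
        # The real client returns the judge's response; field names here are assumed.
        return {"submission_id": 1}

    def get_submission(self, submission_id):
        # Pretend the judge finishes after two polls.
        self._polls += 1
        status = "Completed" if self._polls >= 2 else "Running"
        return {"submission_id": submission_id, "status": status}

def submit_and_wait(client, problem_code, language, source,
                    poll_interval=1.0, timeout=30.0):
    """Submit a solution, then poll until the judge reports a final status."""
    result = client.post_submission(problem_code=problem_code,
                                    language=language, source=source)
    sid = result["submission_id"]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        submission = client.get_submission(sid)
        if submission["status"] != "Running":
            return submission
        time.sleep(poll_interval)
    raise TimeoutError(f"submission {sid} did not finish within {timeout}s")

final = submit_and_wait(StubClient(), "demo-problem", "python",
                        "print('hello')", poll_interval=0.01)
print(final["status"])
```

For live use, replace `StubClient()` with the `codearena.CodeArena` client initialized as shown above; the polling loop itself is independent of which client it is given.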
To facilitate exploration, we provide a trial account (Account: Test / Password: Haveatry!) for anyone interested in browsing our data. Enjoy!