Benchmark .Net Tutorial

NeurIPS 2024 Datasets and Benchmarks Track

JailbreakBench is an open-source robustness benchmark for jailbreaking large language models (LLMs). The goal of this benchmark is to comprehensively track progress toward (1) generating successful ...

GitHub

nanoFramework.Benchmark

The nanoFramework.Benchmark tool helps you measure and track the performance of nanoFramework code. You can easily turn a normal method into a benchmark by adding just one attribute! Heavily inspired ...

officechai.com

OpenAI Releases GPT 5.2, Beats Google Gemini 3 Pro On Several Benchmarks

OpenAI had been stung by Google’s release of Gemini 3 Pro, which had eclipsed it on most benchmarks, but it has thrown a counterpunch with GPT-5.2. The new model, which OpenAI is calling GPT-5.2 Thinking ...

Microsoft

SWE-Sharp-Bench: A Reproducible Benchmark for C# Software Engineering Tasks

AI coding agents have shown great progress on Python software engineering benchmarks like SWE-Bench, and for other languages like Java and C in benchmarks like Multi-SWE-Bench. However, C# — a ...
