LLM Benchmarks - Search News

Simbian Announces Industry’s First Benchmark to Comprehensively Measure LLM Performance in Security Operations Centers

New “AI SOC LLM Leaderboard” Uniquely Measures LLMs in Realistic IT Environment to Give SOC Teams and Vendors Guidance to Pick the Best LLM for Their Organization. Simbian®, on a mission to solve ...

VentureBeat

MLPerf 3.1 adds large language model benchmarks for inference

MLCommons is growing its suite of MLPerf AI benchmarks with the addition ...

Business Wire

Cognite Launches the Cognite Atlas AI™ LLM & SLM Benchmark Report for Industrial Agents

AUSTIN, Texas & OSLO, Norway--(BUSINESS WIRE)--Cognite, the global leader in AI for industry, today announced the launch of the Cognite Atlas AI™ LLM & SLM Benchmark Report for Industrial Agents. The ...

Security

Simbian launches new security benchmark with AI SOC LLM Leaderboard

Simbian today announced the “AI SOC LLM Leaderboard,” a comprehensive benchmark to measure LLM performance in Security Operations Centers (SOCs). The new benchmark compares LLMs across a diverse range ...

SiliconANGLE

Researchers develop new LiveBench benchmark for measuring AI models’ response accuracy

A group of researchers has developed a new benchmark, dubbed LiveBench, to ease the task of evaluating large language models’ question-answering capabilities. The researchers released the benchmark on ...

VietNamNet

CMC OpenAI unveils Vietnam’s first legal LLM and benchmark suite

The Vietnamese tech group CMC is shaping the country’s legal AI future through VLegal-Bench and CMC-AI-Legal-32B, pioneering ...

SiliconANGLE

Nvidia claims first place in MLCommons’ first benchmarks for LLM inference, but Intel is a close second

MLCommons, the open engineering consortium for benchmarking the performance of chipsets for artificial intelligence, today unveiled the results of a new test that’s geared to determine how quickly ...

VentureBeat

Nvidia, Intel claim new LLM training speed records in new MLPerf 3.1 benchmark

Training AI models is a whole lot faster in 2023, according to the results from the MLPerf Training 3.1 benchmark released today. The pace of innovation in the generative AI space is breathtaking to ...

dbta

Deci Unveils Latest LLM, Sets New Benchmarks in Accuracy

Deci, the deep learning company harnessing AI to build AI, is adding a large language model, DeciLM-7B, to its suite of innovative generative AI models—setting new benchmarks in accuracy and ...

CNX Software

Rockchip RK1820/RK1828 SO-DIMM and M.2 LLM/VLM AI accelerator modules, devkits, and benchmarks

Rockchip unveiled two RK182X LLM/VLM accelerators at its developer conference last July, namely the RK1820 with 2.5GB RAM for ...

Geeky Gadgets

AI Benchmarks Are Broken: The Leaderboard Illusion

What if the tools we trust to measure progress are actually holding us back? In the rapidly evolving world of large language models (LLMs), AI benchmarks and leaderboards have become the gold standard ...

Hackaday

Examining The Vulnerability Of Large Language Models To Data-Poisoning

Large language models (LLMs) are wholly dependent on the quality of the input data with which these models are trained. While suggestions that people eat rocks are funny to you and me, in the case of ...
