Articles by amarble
23

Task-free intelligence testing of LLMs (marble.onl)

1

Task-free intelligence testing of LLMs (marble.onl)

1

Intelligence is not just about task completion (marble.onl)

2

If You Meet ET in Space, Kill Him (2024) (nautil.us)

1

Intelligence is not just about task completion (marble.onl)

1

Show HN: Gen AI Writing Showdown (writing-showdown.com)

2

IFRRO member Kopinor signs agreement on newspaper content for AI in Norway (ifrro.org)

1

Comparing language model performance on creative writing transformations (writing-showdown.com)

1

Eminembench (marble.onl)

1

Promptware Attacks Against LLM-Powered Assistants in Production (sites.google.com)

2

Managing LLM application performance through code standards (marble.onl)

1

Catching Claude Cheating (marble.onl)

2

Catching Claude Cheating (marble.onl)

3

Scanning AI application code for vulnerabilities and performance issues (marble.onl)

3

Show HN: A static scanner for LLM app code (github.com/kereva-dev)

2

Scanning AI application code for vulnerabilities and performance issues (marble.onl)

2

The Model Trust Score: The Framework for Strategic Enterprise AI Model Selection (credo.ai)

18

Evals are not all you need (marble.onl)

1

An AI Cyber Incident in Plain Sight (marble.onl)

2

AI agent using Anthropic's tool calling and the Pandas Python library (github.com/rbitr)

2

Following LLM Manufacturer's Instructions (armilla.ai)

1

AI Cybersecurity Lessons from GenAI (marble.onl)