AI Research / Model Behavior Analysis

GPT Guesses Between 1 and 100

A GitHub research project that measures how gpt-4.1 responds when asked to pick a random number between 1 and 100, using 10,000 API calls and comparing the results to a uniform baseline.
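One way to make the comparison against a uniform baseline concrete is a chi-square goodness-of-fit statistic. The sketch below is illustrative only and is not taken from the repository: the function name `uniform_chi_square` and its parameters are hypothetical, and it assumes the guesses have already been collected as a list of integers.

```python
from collections import Counter


def uniform_chi_square(guesses, low=1, high=100):
    """Return (chi2, degrees_of_freedom) for guesses vs. a uniform baseline.

    Hypothetical helper, not the project's own code. Values outside
    [low, high] are ignored.
    """
    k = high - low + 1
    in_range = [g for g in guesses if low <= g <= high]
    if not in_range:
        raise ValueError("no guesses fall inside the requested range")

    # Under a uniform baseline every value is expected equally often.
    expected = len(in_range) / k
    counts = Counter(in_range)

    # Sum squared deviations from the expected count over every possible value,
    # including values that were never guessed.
    chi2 = sum(
        (counts.get(v, 0) - expected) ** 2 / expected
        for v in range(low, high + 1)
    )
    return chi2, k - 1
```

A statistic near the degrees of freedom (99 for the range 1 to 100) is consistent with uniform sampling, while a much larger value indicates that some numbers are strongly favored.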

AI research, AI tool, Research, bias, dataset, distribution, llm, openai

Why it was accepted

The page clearly describes an AI-related research project with a concrete methodology, model name, sample size, and results. It is useful to AI builders and researchers interested in model sampling behavior, and the README gives enough evidence for a public listing.

Weakness

The snapshot does not show the actual charts, dataset contents, or code-level instructions for reproducing the experiment from start to finish, so a visitor cannot fully assess the outputs without opening the repository.

Review status

Last evaluated 53 days ago. Current rank #1082. Down 1 spot in the rankings.

Related listings

#263 Prometheus

AI Research / Autonomous Research Systems

An autonomous research system that runs on a single workstation and aggressively checks its own claims with adversarial self-verification, replication, and calibration audits.

↑ +2 7 days ago

#733 Socrates

AI Research / Multi-agent systems

Open-source multi-agent protocol for AI research agents. It pairs a tool-using Scientist with an advisor that can only ask questions and approve plans. The README includes quick-start setup and notes on reproducing results on MLE-bench/Kaggle tasks.

↓ -2 22 days ago

#818 EuroMesh

AI Research / Analysis / Reports

A sourced model and short report exploring whether Europe could train a sovereign frontier AI model using public compute it already owns, with reproducible code, datasets, and a PDF report.

↑ +2 32 days ago

#835 MarCognity-AI

AI Research / Evaluation / Verification Framework

An open-source research framework for structured LLM evaluation, claim verification, and source-grounded reflective reasoning. The repo describes modular components for retrieval, semantic scoring, skeptical claim checking, and benchmark-style epistemic assessment.

↓ -35 73 days ago