Scientific AI needs more than models. It needs shared knowledge, trainable scientific models, agent harnesses, rigorous evaluation, and public discovery artifacts — built as open infrastructure.
A commons of research skills — authored by human experts, executed by AI agents.
ResearchSkills.ai is an open platform that turns real scientific workflows into reusable skills for AI agents. Instead of storing raw chat logs, it reconstructs research sessions as decision trajectories — how researchers form hypotheses, diagnose failures, choose methods, and decide when to pivot — and distills them into portable skills that agents can retrieve and execute. Contributions are extracted locally, automatically de-identified, reviewed by domain experts for scientific accuracy, and published to an open library spanning 155+ scientific subdomains under CC BY 4.0. The goal is to give AI systems not just raw capability, but the tacit research judgment that determines which experiments matter, which dead ends to avoid, and when to persist versus change direction.
This paper systematically reveals how model scale, data volume, and compute jointly govern the scaling behavior of RL post-training for mathematical reasoning in large language models.
Recipes that turn open base models into Skills-fluent scientific agent brains.
BioUniGen.xyz is an integrated model platform for computational biology and drug discovery that unifies recognition and generation within a shared biological representation framework. Instead of treating tasks like molecule design, protein folding, function annotation, and de novo sequence generation as separate problems, it connects molecular sequences, 3D structures, and functional mechanisms in one adaptable system. By combining multi-modal biological inputs with joint predictive analysis and generative design, BioUniGen supports end-to-end research workflows such as molecular optimization, structural simulation, and functional mining. The goal is to overcome the fragmentation of existing biological AI tools and provide a more coherent engine for life science research.
An agent harness whose primitives are lab- and literature-native.
SUDP (Secret-Use Delegation Protocol) is a protocol for agentic systems that lets AI agents perform secret-backed operations without ever holding the underlying secret. Instead of placing reusable credentials such as API keys or OAuth tokens inside the agent runtime, it keeps secret ownership with the user and delegates only a narrowly scoped, single-use, transaction-bound authorization for a specific action, recipient, and validity window. In practice, it works in three phases (setup, authorization grant, and consumption): an agent requests an operation, the user approves that exact operation with an authenticator-backed gesture, and the system executes it without exposing the raw credential to the agent. This makes credential use more auditable and more resistant to leakage, replay, and misuse.
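The three phases can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the SUDP specification: the names `Operation`, `Grant`, `SecretHolder`, `approve`, and `consume` are hypothetical, the user's authenticator-backed gesture is simulated by a plain method call, the validity window is reduced to a single expiry time, and an HMAC tag stands in for whatever binding mechanism the real protocol uses.

```python
import hashlib
import hmac
import json
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """The exact transaction the agent asks for (hypothetical shape)."""
    action: str
    recipient: str
    not_after: float  # end of the validity window, Unix seconds

    def digest(self) -> bytes:
        # Canonical hash of the operation; the grant is bound to this value.
        payload = json.dumps(
            {"action": self.action, "recipient": self.recipient,
             "not_after": self.not_after},
            sort_keys=True,
        ).encode()
        return hashlib.sha256(payload).digest()


@dataclass(frozen=True)
class Grant:
    """Single-use authorization handed to the agent; contains no secret."""
    nonce: str
    tag: bytes


class SecretHolder:
    """User-side component that owns the secret (illustrative only)."""

    def __init__(self, api_secret: bytes):
        # Phase 1, setup: the secret stays here, outside the agent runtime.
        self._api_secret = api_secret
        self._grant_key = secrets.token_bytes(32)
        self._used_nonces = set()

    def _tag(self, nonce: str, op: Operation) -> bytes:
        return hmac.new(self._grant_key, nonce.encode() + op.digest(),
                        hashlib.sha256).digest()

    def approve(self, op: Operation) -> Grant:
        # Phase 2, authorization grant: stands in for the user's
        # authenticator-backed approval of this exact operation.
        nonce = secrets.token_hex(16)
        return Grant(nonce, self._tag(nonce, op))

    def consume(self, op: Operation, grant: Grant) -> str:
        # Phase 3, consumption: verify binding, expiry, and single use.
        if not hmac.compare_digest(self._tag(grant.nonce, op), grant.tag):
            raise PermissionError("grant does not match this operation")
        if time.time() > op.not_after:
            raise PermissionError("grant expired")
        if grant.nonce in self._used_nonces:
            raise PermissionError("grant already used")
        self._used_nonces.add(grant.nonce)
        # The secret is used here; only a receipt goes back to the agent.
        return hmac.new(self._api_secret, op.digest(),
                        hashlib.sha256).hexdigest()
```

In this sketch the agent only ever sees an `Operation` and a `Grant`. Replaying a grant, reusing it for a different recipient, or presenting it after the window closes each raise `PermissionError`, which mirrors the leakage, replay, and misuse properties described above.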
The most popular social simulation framework, with 4.4k GitHub stars. It is the first large-scale agent society simulator and AI social scientist framework.
A self-evolving multi-agent framework.
This work introduces a unified scientific reasoning framework that combines token-level implicit retrieval with structured multi-agent refinement to improve accuracy while substantially reducing token usage and interaction steps.
A pluralistic research environment where AI agent scientists actually work.
SciAgentGym provides a scalable scientific tool-use environment with 1,780 domain-specific tools, a tiered benchmark for long-horizon agent evaluation, and SciForge for synthesizing logic-aware training trajectories to advance autonomous scientific agents.
LabUtopia is the first comprehensive laboratory-scale embodied intelligence platform that unifies multi-physics simulation, chemically meaningful interactions, procedural scientific scene generation, and hierarchical long-horizon benchmarks to push scientific agents from simple manipulation toward generalizable experimental reasoning.
The papers and findings we produce using our own stack.
This paper envisions AI-driven Science of Science as a new paradigm for automatically discovering large-scale research patterns, simulating scientific societies, and revealing the hidden mechanisms that drive innovation beyond the reach of traditional statistical and rule-based methods.
Agentic Reinforcement Learning (Agentic RL) reframes large language models from passive text generators into autonomous agents that learn to make decisions in dynamic, partially observable environments. This survey synthesizes over 500 recent works, systematically organizing core agentic capabilities, application domains, open-source environments, benchmarks, and frameworks to guide the development of scalable general-purpose AI agents.
Every output here links back to the agent runs, skills, and evaluation tasks that produced it. Help us extend the stack into your subdomain.