Skills Officially Comes to Codex

(developers.openai.com)

55 points | by rochansinha 4 hours ago

10 comments

  • freakynit 19 minutes ago
    I was already doing something similar on a regular basis.

    I have many "folders"... each with a README.md, a scripts folder, and an optional GUIDE.md.

    Whenever I arrive at some code that I know can be reused easily (for example, a clerk.dev integration that spans both frontend and backend), I would create a "folder" for it.
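
    Each one looks roughly like this (the folder name here is just illustrative):

        clerk-integration/
        ├── README.md     # what it does, how to wire it up
        ├── GUIDE.md      # optional, longer integration notes
        └── scripts/      # copy-pasteable frontend and backend snippets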

    When needed, I used to just copy-paste all the folder content using my https://www.npmjs.com/package/merge-to-md package.

    This has worked flawlessly for me until now.

    Glad we are bringing such capability natively into these coding agents.

  • cube2222 47 minutes ago
    It's so nice that skills are becoming a standard; they are imo a much bigger deal long-term than e.g. MCP.

    Easy to author (at its most basic, just a markdown file), context-efficient by default (only the YAML front-matter is preloaded; additional markdown files can be lazy-loaded as needed), and able to piggyback on top of existing tooling (for instance, instead of the GitHub MCP, you just write a skill describing how to use the `gh` cli; rough sketch at the end of this comment).

    Compared to purpose-tuned system prompts they don't require a purpose-specific agent, and they also compose (the agent can load multiple skills that make sense for a given task).

    Part of the effectiveness here is that AI models are heavy enough that running a sandboxed VM alongside them is likely irrelevant cost-wise, so the major chat UI providers all give the model such a sandboxed environment. That means skills can also contain Python and/or JS scripts, which is again much simpler, more straightforward, and more flexible than e.g. requiring the target to expose remote MCPs.

    Finally, you can use a skill to tell your model how to properly use your MCP server, which previously often required either long prompting or a purpose-specific system prompt, with the cons I've already described.
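
    At its most basic, the `gh` skill I mentioned above is just a directory with a SKILL.md along these lines (following the Agent Skills convention of name/description front-matter; the commands below are only examples):

        ---
        name: github-cli
        description: Use the gh CLI for GitHub issues, PRs, and CI instead of a GitHub MCP server.
        ---

        # GitHub via `gh`

        - Check auth first with `gh auth status`.
        - List open PRs: `gh pr list --state open`.
        - Read an issue with its discussion: `gh issue view <number> --comments`.

    Only the front-matter is preloaded; the body is read once the skill is actually relevant.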

    • hu3 31 minutes ago
      Perhaps you could help me.

      I'm having a hard time figuring out how I could leverage skills in a medium-sized web application project.

      It's Python, PostgreSQL, and Django.

      Thanks in advance.

      I wonder if skills are more useful for projects that aren't CRUD-like. Maybe data science and DevOps.

      • freakynit 9 minutes ago
        Skills are not useful for single-shot cases. They are for cross-team standardization (of LLM-generated code) and reliable reuse of existing code/learnings.
      • jonrosner 21 minutes ago
        You could, for example, create a skill for accessing your database during testing and pass in your table specifications, so that the agent can easily retrieve data for you on the fly.
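
        Roughly, such a skill could look like this (the name and commands are just illustrative; adapt them to your project):

            ---
            name: django-test-db
            description: Query the local Postgres dev database through Django for test data and schema checks.
            ---

            # Test database access

            - Table definitions live in the Django models; `python manage.py inspectdb` dumps the raw schema if needed.
            - For ad-hoc queries use `python manage.py dbshell` (never against production).
            - Export fixtures with `python manage.py dumpdata app_label.ModelName --indent 2`.
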
  • rdli 18 minutes ago
    This is great. At my startup, we have a mix of Codex/CC users so having a common set of skills we can all use for building is exciting.

    It’s also interesting to see that, instead of a plan mode like CC's, Codex is implementing planning as a skill.

  • jonrosner 23 minutes ago
    One thing I am missing from the specification is a way to inject specific variables into skills. If I create, let's say, a postgres skill, then I can either (1) provide the password on every skill execution or (2) hardcode the password into my script. To make this really useful there needs to be some kind of secret storage that the agent can read/write. This would also make it easier for me, as a programmer, to sell the skills I create to customers.
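
    The closest workaround for now seems to be having the skill point at environment variables rather than embedding credentials (assuming the agent's sandbox passes env vars through to the skill's scripts), e.g. roughly:

        ---
        name: postgres-skill
        description: Run read-only queries against the team's Postgres instance.
        ---

        # Connecting

        Read the connection string from the DATABASE_URL environment variable
        (or PGHOST / PGUSER / PGPASSWORD for psql); never write credentials into
        this file or the bundled scripts.

        Example: psql "$DATABASE_URL" -c "select count(*) from users;"
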
  • stared 57 minutes ago
    Yes! I was raving about Claude Skills a few days ago (vide https://quesma.com/blog/claude-skills-not-antigravity/), and I'm excited they're coming to Codex as well!
  • mikaelaast 56 minutes ago
    Are we sure that unrestricted free-form Markdown content is the best configuration format for this kind of thing? I know there is a YAML frontmatter component to this, but doesn't the free-form nature of the "body" part of these configuration files lead to an inevitably unverifiable process? I would like my agents to be inherently evaluable, and free-text instructions do not lend themselves easily to systematic evaluation.
    • Etheryte 48 minutes ago
      The modern state of the art is inherently not verifiable. Which way you give it input is really secondary to that fact. When you don't see weights or know anything else about the system, any idea of verifiability is an illusion.
      • mikaelaast 33 minutes ago
        Sure. Verifiability is far-fetched. But say I want to produce a statistically significant evaluation result from this – essentially testing a piece of prose. How do I go about this, short of relying on a vague LLM-as-a-judge metric? What are the parameters?
      • hu3 29 minutes ago
        At least MCPs can be unit tested.

        With Skills, however, you just selectively append more text to the prompt and pray.

  • karolcodes 1 hour ago
    anyone using this in an agentic workflow already? how is it?
  • rochansinha 4 hours ago
    Agent Skills let you extend Codex with task-specific capabilities. A skill packages instructions, resources, and optional scripts so Codex can perform a specific workflow reliably. You can share skills across teams or the community, and they build on the open Agent Skills standard.

    Skills are available in both the Codex CLI and IDE extensions.

    • dan_wood 2 hours ago
      Thanks to Anthropic.
  • haffi112 1 hour ago
    What are your favourite skills?