If you have looked into AI agents for your business, you have probably run into the word "tokens" within the first five minutes. It sounds like jargon from a developer conference. In practice, it is one of the simplest concepts in AI, and understanding it will save you money and stop vendors from overcharging you.
This guide breaks down what tokens are, how pricing works across the major AI providers, what real business tasks actually cost, and how token prices have changed dramatically over the past two years.
What is a token?
A token is a chunk of text that an AI model reads or writes. It is not exactly a word and not exactly a character. Most modern language models use a system called byte-pair encoding (BPE) to split text into tokens. The rough rule of thumb: one token is about three-quarters of an English word, or about four characters.
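The rule of thumb above can be turned into a quick estimator. This is a rough sketch, not a real tokeniser: exact counts depend on the model's BPE vocabulary, so use the provider's own tool for precise figures.

```python
# Rough token estimators from the rules of thumb above:
# ~4 characters per token, ~0.75 words per token. Approximations only;
# exact counts depend on the model's BPE vocabulary.

def estimate_tokens_from_chars(text: str) -> int:
    """One token per roughly four characters."""
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(text: str) -> int:
    """Roughly four tokens per three words."""
    return max(1, round(len(text.split()) * 4 / 3))

email = "Hi team, please find attached the quarterly report for review."
print(estimate_tokens_from_chars(email))  # → 16 (62 characters)
print(estimate_tokens_from_words(email))  # → 13 (10 words)
```

The two heuristics disagree slightly, which is normal: either one is close enough for budgeting purposes.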
Some concrete examples: a short, common word like "the" or "and" is usually a single token; a longer or rarer word like "tokenisation" splits into two or three tokens; punctuation marks often count as tokens of their own. The exact split depends on the model's vocabulary.
If you want to see exactly how text gets tokenised, OpenAI has a free Tokenizer tool on their website where you can paste text and watch it split into tokens in real time.
Why does tokenisation matter for cost?
Because every AI provider charges per token. When your AI agent reads a customer email, that email is converted into tokens. When the agent writes a reply, those output tokens are also counted. Your bill is a direct function of how many tokens go in and how many come out.
Input tokens vs output tokens
This distinction matters because most providers charge different rates for each.
Input tokens are everything the AI reads: the customer's email, your business context, the instructions you have given the agent, any documents it needs to reference. Think of input tokens as the AI listening.
Output tokens are everything the AI writes back: the reply to that email, a summary of a document, a drafted invoice. Think of output tokens as the AI speaking.
Output tokens are typically more expensive than input tokens, often two to four times the price. This is because generating new text is computationally harder than reading existing text.
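As a sketch of how this two-rate billing works. The rates below are illustrative assumptions, roughly GPT-4o-class USD prices per million tokens; always check your provider's current price list.

```python
# Two-rate billing: tokens in at one price, tokens out at another.
# Rates are illustrative USD per million tokens (roughly GPT-4o-class).

INPUT_RATE_PER_M = 2.50    # assumed input rate
OUTPUT_RATE_PER_M = 10.00  # assumed output rate (4x the input rate here)

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single call under the two-rate scheme."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# A 1,200-token email read plus a 400-token reply:
print(f"{request_cost_usd(1200, 400):.4f}")  # → 0.0070
```

Note that the 400 output tokens cost more here than the 1,200 input tokens, which is why chatty agents that write long replies cost more than ones that read a lot and answer briefly.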
How context windows work
Every AI model has a context window, which is the maximum number of tokens it can hold in a single conversation or task. Think of it as the model's working memory.
The context window includes both the input and the output. If you feed the model a 50,000-token document and ask it to write a 2,000-token summary, you have used 52,000 tokens of the context window.
For most business tasks, you will never come close to hitting these limits. A typical customer email exchange uses a few hundred tokens. Even a lengthy contract review might use 20,000 to 30,000 tokens.
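A quick way to sanity-check whether a job fits, using the document-plus-summary example above. The 128,000-token window is an assumed figure for illustration; window sizes vary by model.

```python
# The window must hold input and output together. The 128,000-token
# window below is an assumed figure for illustration; sizes vary by model.

CONTEXT_WINDOW = 128_000

def fits_in_window(input_tokens: int, expected_output_tokens: int,
                   window: int = CONTEXT_WINDOW) -> bool:
    return input_tokens + expected_output_tokens <= window

print(fits_in_window(50_000, 2_000))   # → True: the 52,000-token example above
print(fits_in_window(130_000, 2_000))  # → False: the document alone overflows
```

When a document does overflow, the usual fixes are splitting it into chunks or summarising sections first, rather than paying for a larger model.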
How the major models price tokens
Pricing varies between providers and between models within the same provider. Here are the current rates for the most commonly used business-grade models as of early 2026:
OpenAI (GPT-4o)
Anthropic (Claude 3.5 Sonnet)
Google (Gemini 1.5 Pro)
What do these numbers actually mean?
One million tokens sounds abstract. In practical terms, one million tokens is roughly 750,000 words, or about ten full-length novels. That US$2.50 to US$3.00 input cost is the price of reading ten novels. For a business processing a few hundred emails a day, the monthly token cost is usually trivial.
Real cost calculations for common business tasks
Let us put actual dollar figures on the tasks a small Australian business might automate.
Email handling
An average business email is about 150 to 300 words, or roughly 200 to 400 tokens. A reply is usually a similar length. Including system instructions (the context your agent needs about your business), a single email read-and-reply cycle uses about 1,000 to 1,500 tokens total.
At GPT-4o rates, that works out to less than a cent per email.
If your business handles 50 emails a day, that is about $0.35 per day, or roughly $10 per month.
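Here is that email arithmetic as a sketch. The input/output split per cycle and the USD rates are illustrative assumptions, so the figures land near, not exactly on, the rounded numbers above.

```python
# Sketch of the email-agent cost maths. The 900/350 input/output
# split per cycle and the USD rates are illustrative assumptions.

INPUT_RATE = 2.50 / 1_000_000    # USD per input token (assumed GPT-4o-class)
OUTPUT_RATE = 10.00 / 1_000_000  # USD per output token (assumed)

def email_cycle_cost(input_tokens: int = 900, output_tokens: int = 350) -> float:
    """Cost of one read-and-reply cycle, ~1,250 tokens total."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

per_email = email_cycle_cost()  # just over half a US cent
per_day = per_email * 50        # ~US$0.29 for 50 emails a day
per_month = per_day * 30        # ~US$8.60 a month
```

Swap in your own volumes and your provider's published rates to get a figure for your inbox.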
Document drafting
Drafting a one-page business letter from a brief set of instructions typically uses about 500 input tokens and 800 output tokens.
Call transcription and summarisation
A 10-minute phone call generates about 1,500 words of transcript, or roughly 2,000 tokens. Summarising that transcript produces another 300 to 500 output tokens.
If your business transcribes and summarises 30 calls a month, that is about $0.30 per month.
Adding it all together
A typical small business running an AI agent for email handling, document drafting, and call summaries might use:
These figures use mid-range GPT-4o pricing converted to AUD at roughly 1.55 AUD per USD. Your actual costs will vary based on the model you use and the complexity of your tasks, but the order of magnitude is right: tens of dollars per month, not hundreds.
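A combined monthly sketch under one set of assumptions. The volumes, token splits, and USD rates are all illustrative, and the AUD conversion uses the ~1.55 rate mentioned above.

```python
# Combined monthly sketch in AUD across the three workloads.
# All volumes, token splits, and rates are illustrative assumptions.

AUD_PER_USD = 1.55
IN_RATE = 2.50 / 1_000_000    # USD per input token (assumed)
OUT_RATE = 10.00 / 1_000_000  # USD per output token (assumed)

def task_cost_usd(in_tok: int, out_tok: int, runs_per_month: int) -> float:
    return (in_tok * IN_RATE + out_tok * OUT_RATE) * runs_per_month

monthly_usd = (
    task_cost_usd(900, 350, 50 * 30)   # email handling: 50 emails a day
    + task_cost_usd(500, 800, 40)      # drafting: 40 letters a month (assumed)
    + task_cost_usd(2000, 400, 30)     # call summaries: 30 calls a month
)
monthly_aud = monthly_usd * AUD_PER_USD  # lands in the low tens of AUD
```

Under these assumptions the total comes out around A$14 a month, consistent with the tens-of-dollars estimate above.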
How to monitor and control your spend
Every major AI provider gives you tools to manage costs:
Spending caps. OpenAI, Anthropic, and Google all let you set hard monthly limits on your API account. Set it to $30, $50, or whatever your comfort level is. When the cap is hit, API calls stop. No surprise bills.
Usage dashboards. Each provider has a dashboard showing your daily and monthly token usage broken down by model. Check it weekly for the first month or two until you understand your patterns.
Model selection. You do not always need the most powerful model. Many routine tasks (email classification, simple replies, data extraction) work perfectly well on cheaper models like GPT-4o-mini (US$0.15 per million input tokens) or Claude 3 Haiku (US$0.25 per million input tokens). Reserve the bigger models for complex reasoning tasks.
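To see how much model selection matters, here is a sketch comparing the same routine workload on a flagship model and a mini model. The US$0.15 mini input rate is quoted above; the flagship rates and the mini output rate are assumptions.

```python
# Same routine workload priced on a flagship model vs a mini model.
# All rates are USD per million tokens. The US$0.15 mini input rate is
# quoted in the text; the other three rates are assumptions.

def monthly_cost(in_rate: float, out_rate: float,
                 in_tok: int, out_tok: int, runs: int) -> float:
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000 * runs

runs = 50 * 30  # e.g. 50 email classifications a day
flagship = monthly_cost(2.50, 10.00, 500, 50, runs)  # assumed GPT-4o-class
mini = monthly_cost(0.15, 0.60, 500, 50, runs)       # assumed mini-class
print(round(flagship / mini, 1))  # → 16.7: the mini model is ~17x cheaper here
```

For a task where both models give acceptable answers, that ratio goes straight to the bottom line.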
Prompt optimisation. The instructions you give your AI agent (called the system prompt) are sent with every request. A bloated 2,000-word system prompt adds unnecessary tokens to every single call. Keeping prompts concise can reduce costs by 20 to 40 percent.
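The saving from a leaner system prompt can be sketched directly, since the prompt rides along on every call. Word-to-token conversion uses the rough 4/3 rule from earlier; the input rate is illustrative.

```python
# Cost of the system prompt alone, which is re-sent on every call.
# Tokens-per-word uses the rough 4/3 rule; the input rate is illustrative.

def prompt_tokens(words: int) -> int:
    return round(words * 4 / 3)

def monthly_prompt_cost(words: int, calls: int,
                        usd_per_m_input: float = 2.50) -> float:
    return prompt_tokens(words) * usd_per_m_input / 1_000_000 * calls

calls = 50 * 30                              # 50 calls a day
bloated = monthly_prompt_cost(2000, calls)   # ~2,667 prompt tokens each call
concise = monthly_prompt_cost(500, calls)    # ~667 prompt tokens each call
saving = 1 - concise / bloated               # 75% off the prompt portion
```

Note this is the saving on the prompt portion of the bill only; the overall reduction depends on how large the prompt is relative to the rest of each call, which is why 20 to 40 percent is a realistic whole-bill figure.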
How token costs compare to other business expenses
To put AI token costs in perspective for an Australian small business:
The token cost of running an AI agent is comparable to a single Xero subscription. For many businesses, it is the cheapest software line item on the books.
Token prices have dropped dramatically
One of the most important trends in AI is that token prices have been falling rapidly, and this trend shows no sign of stopping.
According to data compiled by a16z and reported widely across the industry:
ARK Invest's Big Ideas 2025 report projected that AI inference costs would continue to fall at roughly 50 to 70 percent per year, driven by hardware improvements, model efficiency gains, and competition between providers.
What this means for your business: the AI agent you deploy today will cost less to run next year, and even less the year after. The economics only improve over time.
Who do you actually pay?
You pay the AI provider directly. If your agent runs on GPT-4o, you pay OpenAI. If it runs on Claude, you pay Anthropic. You set up your own API account, you own the billing relationship, and you can switch providers at any time.
This is different from some AI platforms that mark up token costs by 300 to 500 percent and bundle them into a flat monthly fee. When you pay the provider directly, you get wholesale rates and full transparency.
The bottom line
Tokens are just the units AI models use to measure text. The costs are low, they are transparent, and they are falling every year. For most Australian small businesses, running an AI agent costs less per month than a single team lunch. Understanding tokens puts you in control of your AI costs instead of relying on a vendor to tell you what things should cost.