AI Agents Explained Without Hype, From The Ground Up
AI agents are Big Data and Data Science in 2013 all over again. Everyone talks about them, but everyone means something different. This causes marketing and sales challenges.
AI agents have a huge communication problem.
Everyone’s talking about them—but few can explain what they actually are, let alone agree on a definition. It’s Big Data in 2013 all over again.
This confusion slows down innovation and sales, since people aren’t on the same page.
There are three main reasons for this confusion.
First, people are talking about AI agents at different conceptual levels. Some describe AI agents as AI “employees”. Others say agents are smarter workflow automations. The Wall Street Journal calls them “autonomous bots”. Meanwhile, OpenAI and Anthropic have slightly different definitions of agents. This is confusing and frustrating.
Second, most people simply don’t get why we need autonomous software—let alone what it actually means to give software autonomy in practice. So all these definitions? They’re flying way over people’s heads.
Third, there is too much hype around the technology and too little focus on concrete use cases. In particular, no one talks about when NOT to build AI agents. It's like the early days of “data science”: people can't see why they should bother. On top of that, every automation company has repositioned itself as an AI agent company, so buyers are confused about how AI agents differ from automation.
So today, I will take a different approach. We will start from scratch, assuming almost nothing, without the buzzwords, and get clear on AI agents.
I will explain what AI agents are from the ground up in plain English (*). Along the way, I will provide my definitions of AI agents, workflows, and other agent concepts.
This post can be used as a common AI agent glossary to align with your customers or colleagues. For more technical coverage, please see my piece on AI agents for the AWS Machine Learning Blog (2023).
Here, we’ll cover:
What is an AI workflow
What it means for software to be “agentic”
What is an AI agent, and what is not an AI agent
The types of problems you should use AI agents for, and why
The downsides of AI agents
What is an AI Workflow, or AI Automation?
Before we talk about AI agents, let’s talk about automated workflows.
A workflow (automation) is essentially a predefined script that runs on some triggering event: a hardcoded, step-by-step sequence of actions that a computer executes to complete a task. Each action could be an isolated task like reading an email or creating a Slack message.
Sample use case: “when a lead arrives, send a Slack message to the marketing team channel”.
You can define workflows in various ways: with Python scripts, with no-code or low-code builders (e.g. Zapier, Alteryx, DataRobot), and so on.
But they are all ultimately scripts under the hood. You - the workflow creator - need to predefine your logic and program it. No-code and low-code tools just make that programming easier.
An AI workflow, then, is simply a traditional workflow that includes one or more steps that use AI models like LLMs. Such steps might classify emails, pull structured data from a PDF, summarize a conversation, or generate different variations of tweets.
Prior to the release of OpenAI's GPT-3.5-Turbo model in 2023, these types of steps were really hard to build. That's why AI workflows exploded after 2023: OpenAI democratized AI tasks.
For example, imagine a 3-step support workflow that checks an inbox, classifies each message by topic, and then forwards it to the right team inbox. An OpenAI model is used to classify the email messages. After classifying an email (e.g. “refund”), the workflow routes it to the refunds department. This is an AI workflow - a trivial one - since one of its steps uses an AI model.
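Here is a minimal sketch of what that workflow might look like in code. The topic labels, inbox addresses, and model choice are my own illustrative assumptions (it also presumes the OpenAI Python SDK and an API key), not a production setup:

```python
# Hypothetical 3-step AI workflow: triggered when a new email arrives,
# classify it with an LLM, then route it to a hard-coded team inbox.
from openai import OpenAI

client = OpenAI()

TEAM_INBOXES = {
    "refund": "refunds@example.com",
    "cancellation": "cancellations@example.com",
    "other": "support@example.com",
}

def classify_email(body: str) -> str:
    """The one AI step: ask an LLM to label the email with a single topic."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Classify this email as exactly one of: refund, cancellation, other."},
            {"role": "user", "content": body},
        ],
    )
    label = reply.choices[0].message.content.strip().lower()
    return label if label in TEAM_INBOXES else "other"

def handle_new_email(body: str) -> str:
    """Everything around the LLM call is still a predefined script."""
    topic = classify_email(body)
    return TEAM_INBOXES[topic]   # hard-coded routing rule
```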
But the key to remember is that AI workflows are hard-coded scripts. Everything is predefined.
You - the builder - have to think through every path in the workflow in advance. You program workflow automations like traditional software, e.g. thinking through “If this, then that.” You can slap an LLM into one of the steps, but the overall system is still static.
Why AI Workflows Are Limited
Predefined workflows are fine when the inputs are predictable. But many real-life problems aren't clean, and they require dynamic adaptability, i.e. “thinking on your feet”.
Let’s say we get another email, but this time more complicated:
“I want to cancel my subscription, but I already got charged—can I get a refund for this month?”
Now you've got multiple customer requests in one email (a refund and a cancellation). The existing workflow doesn't know what to do, because it's currently set up to forward emails to a single department.
The logic you hardcoded is now breaking down. So you now need to reprogram the workflow to handle this new situation.
If another edge case emerges (“billing address change”), you patch that too.
Soon, you’re duct-taping your way through a growing mess of exceptions. That’s the reality of building workflows, and software engineering at large.
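In code, that duct-taping looks something like this (the topics and inboxes are made up for illustration):

```python
def route(topic: str) -> str:
    # Each new edge case means another branch someone has to program.
    if topic == "refund":
        return "refunds@example.com"
    elif topic == "cancellation":
        return "cancellations@example.com"
    elif topic == "refund_and_cancellation":   # patched after the email above
        return "retention@example.com"
    elif topic == "billing_address_change":    # patched after the next edge case
        return "billing@example.com"
    # ...and so on, one patch at a time
    else:
        return "support@example.com"
```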
This is the core weakness of AI workflows: they are rigid, unadaptable, and only as good as their predefined logic. They only get smarter when you program in new logic, adding complexity each time.
This quickly becomes unscalable. It is the fundamental issue with traditional software in general, not just AI workflows.
To illustrate, Alexa - the popular AI assistant - was itself a giant AI workflow (I was an early Product Manager at Alexa). That's why Amazon needed over 5,000 engineers to build Alexa: it used pre-LLM-era methods to handle edge cases. This approach is too expensive and doesn't work at scale.
What is agency?
That brings us to agency—the core idea behind AI agents.
In the context of software, agency means the ability of software to make decisions and take actions without you spelling out every step in advance.
Here's the core premise: if we let software figure things out on its own, it can be more useful, especially for handling complex tasks.
What do I mean?
The main bottleneck with traditional software - AI workflows included - is that it needs predefinition, whether it's a simple Zapier workflow or Microsoft Flight Simulator. But that predefinition is either infeasible or extremely costly in many situations, such as:
reacting intelligently to ad hoc situations (e.g. an AI customer service agent)
interfacing with real-world environments (e.g. a humanoid robot walking on new surfaces)
exploring a vast search space to find answers (e.g. OpenAI’s Deep Research)
For example, it's infeasible to hard-code heuristics into Deep Research for how it should conduct research across the effectively infinite number of domains and billions of websites. We reach a point where giving agency to software is the only viable solution.
Even seemingly simple workflows like triaging emails can be hard to program with traditional automation, because there are a lot of edge cases. Hard-coding execution paths for all those edge cases is exhausting and expensive, because each path has to be programmed by hand.
This is where software with agency could shine. What if we could integrate programmatic intelligence such as LLMs to handle business logic “on the fly”, as opposed to predefining everything?
To reiterate, software with agency is valuable for dealing with problems that are either difficult or impossible to predefine and analyze upfront. That’s the crux of the value of giving software agency.
What is an AI agent?
So how do we give agency to software? In 2025, that means prompting an LLM to make decisions - in other words, building an AI agent.
An AI agent is a system that can independently plan and take actions to pursue goals across multiple steps. An AI agent works by using an AI model (typically LLM) to plan actions, take actions using tools (e.g. API calls, etc), then observe the results of its actions to make progress toward its goals.
This is a mouthful, but the easiest way to “get” this is by observing how OpenAI’s Deep Research works.
Deep Research is a popular AI agent from OpenAI capable of writing research papers on any topic by searching the web. It uses an LLM to plan its research, web search as a tool to gather information, and the LLM again to process what it learns. This happens in a continuous loop.
This is best illustrated by this visual showing Deep Research’s thought trail when it researches a question, e.g. “non-dilutive financing options”.
As you can see, Deep Research is going in loops with different stages of planning, acting, and observing:
Planning: Thinking about what to search next, based on its research goals (user prompt, system prompt), and what it has researched so far.
Acting: Pulling in content from websites.
Observing: Reflecting on the content to decide if the research is done, or if it should research some more.
This loop of planning, acting, and observing is called the “agentic loop” - akin to someone's mind spinning around. It is the engine that drives progress toward the goal. Each stage has a purpose.
At the center of this agentic loop is an AI model (typically an LLM) that can reason and use tools (via methods like function calling).
Planning: The LLM decides which actions (tools) to use, or returns a final answer instead. Ex) Deep Research's o3 model decides what information it needs and where to get it.
Acting: The system uses the tools. Ex) Deep Research pulls the web content into the context.
Observing: The LLM ponders the output of the tool use to assess progress, and decides whether the task is done or needs to continue. Ex) Deep Research's o3 model ponders the new info and updates its picture of the research so far.
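To make the loop concrete, here is a minimal, hand-rolled agentic loop in Python. This is a sketch, not how Deep Research is actually implemented: the tool, prompts, and step limit are my own assumptions, and it presumes the OpenAI Python SDK's function-calling interface.

```python
import json
from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "search_web",
        "description": "Search the web and return result snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def search_web(query: str) -> str:
    # Stub: plug in a real search API here.
    return f"(search results for: {query})"

def run_agent(goal: str, max_steps: int = 10) -> str:
    messages = [
        {"role": "system", "content": "Research the user's question. Use search_web until you can answer."},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_steps):  # simple guardrail: cap the number of loop iterations
        # PLAN: the LLM decides the next action, or produces a final answer.
        reply = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS
        ).choices[0].message
        if not reply.tool_calls:          # no action requested -> we're done
            return reply.content
        messages.append(reply)
        for call in reply.tool_calls:
            # ACT: execute the tool the LLM asked for.
            args = json.loads(call.function.arguments)
            result = search_web(**args)
            # OBSERVE: feed the result back so the LLM can reflect on it.
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return "Step limit reached without a final answer."
```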
So in a sense, the LLM inside an AI agent serves as the “brain” - like the brain of a mouse finding its way out of a maze, or the software inside self-driving cars like Waymo. The LLM lets the agent adapt and solve problems on the fly.
This ability to adapt leads to emergent problem solving behaviors.
If the agent is stuck, it can backtrack and try a different approach, just like a human who hits a wall. The diagram below shows how Deep Research can use backtracking to steer itself toward the goal over time.
As a result, every AI agent execution can follow a different path, because agents adapt in real time and adjust course as they observe new information. That's the opposite of traditional AI workflows, where the logic is hardcoded and every execution walks a deterministic path.
That’s how agency is created. Through this loop. Actions are decided on-the-fly by the agent - not preprogrammed.
How To Create AI Agents
The word “agent” anthropomorphizes the technology, which confuses many people. To clarify, AI agents are code, not the T-1000 or Agent Smith.
But programming an AI agent is drastically different from traditional programming or making workflows.
Instead of predefining all the logic (i.e. specifying exactly how the script should execute), you simply define the agent's goals, capabilities (tools), and constraints.
You don’t focus on the “how” as much as the “what” - what the AI agent should do, what it shouldn’t, and what rules it should abide by. Then the LLM will execute your bidding by iterating through an agentic loop.
This style of programming, where you mainly declare the behavior of a system, is called agentic programming.
To actually code an agent, there are many options - just like with AI workflows.
You can use any of the enterprise automation tools, or use frameworks like Pydantic.AI, OpenAI Agents SDK, Langchain, etc. But these frameworks are all doing the same thing internally: spinning up code that runs agentic loops.
For example, our AI agent can be written in Python or created as a workflow graph - but notice that the instruction (contained in the red box) is basically doing all the heavy lifting. I wrote about the importance of prompts here.
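As one concrete (and hedged) illustration, here is roughly what a declarative agent definition looks like with the OpenAI Agents SDK. The tool and instructions are made up for this example, and the exact API surface may differ from what's shown:

```python
from agents import Agent, Runner, function_tool

@function_tool
def lookup_order(order_id: str) -> str:
    """Return the status of an order (stubbed for illustration)."""
    return f"Order {order_id}: charged this month, refund eligible."

support_agent = Agent(
    name="Support triage agent",
    # The instruction carries most of the "programming".
    instructions=(
        "You handle customer emails. Decide whether the request involves a refund, "
        "a cancellation, or both; look up the order if needed; then draft a reply."
    ),
    tools=[lookup_order],
)

result = Runner.run_sync(support_agent, "I want to cancel, but I was already charged this month.")
print(result.final_output)
```

Notice that you declare the goal, tools, and constraints; the framework runs the agentic loop for you.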
The point is, you can create AI agents in multiple ways. An AI agent is not a concept that’s attached to a specific company’s framework (Langchain, Zapier, etc). So no, you do not need to use Langchain, for example, to create AI agents.
Building Blocks of an AI Agent
A minimally viable AI agent is composed of these five building blocks, working together as a single unit. The first three must be present to qualify as an AI agent; the last two are optional.
Instruction: a prompt that defines the agent’s goals, constraints, guidance, etc.
ex) This will be the system prompt for Deep Research that’s invisible to users.
AI model (typically an LLM): used as the “brain” that handles planning and observing inside the agentic loop. In practice, a multi-modal LLM (e.g. GPT-4o or o1) outputs the next action(s) as tokens, e.g. “I need to use a calculator”.
ex) In Deep Research, it may look like “I need to open up cnbc.com to read news about stocks”.
Tools: what the AI agent uses to act inside the agentic loop and make progress toward the goal (e.g. a calculator, a web browser).
Environment (optional): the environment is the immediate system that the AI agent is running on, and has access to.
ex) For Deep Research, the environment consists of its tools, the server(s) it runs on, and their security boundaries. Technically, the environment refers to the entire “stack” the AI agent runs on, including hardware, operating system, security policies, network, etc.
Guardrails (optional): guardrails are programmatically enforced policies - specified as code - that “hard-stop” AI agents or raise warnings when they display undesirable behaviors. To reduce an AI agent's agency, provide stricter guardrails.
ex) Deep Research may have a guardrail to avoid accidentally encouraging self-harm or teaching people how to make chemical weapons.
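If it helps, you can picture these building blocks as one bundled unit of configuration. This is a conceptual sketch with field names of my own choosing, not any particular framework's API:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentSpec:
    instruction: str                       # goals, constraints, guidance (the prompt)
    model: str                             # the LLM serving as the "brain"
    tools: list[Callable]                  # actions available inside the agentic loop
    environment: str = "local-sandbox"     # optional: where the agent runs and what it can touch
    guardrails: list[Callable] = field(default_factory=list)  # optional: hard stops on bad behavior
```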
Levels of Autonomy, and Detecting AI Agents
To recap so far:
What is an AI agent? A system that has some degree of agency by incorporating one or more agentic loops.
What ISN'T an AI agent? A system that is 100% predefined and has no agency. A tell is that it doesn't have an agentic loop anywhere inside.
This means AI workflows that don’t have agentic loops are not technically AI agents.
On the flip side, a workflow that contains one or more agentic components is an AI agent.
For example, you can have a straightforward automation for churning out marketing blogs, but if one of the steps is calling Deep Research (which is an AI agent), then your whole automation is an AI agent. That's because the workflow's behavior is no longer fully predefined.
Another example: if you have an AI workflow where every step is a call to an LLM, but the steps are merely chained one after another, that's not an AI agent.
Also, different AI agents have different amounts of autonomy. More tools, looser instructions, and fewer guardrails give an agent more freedom. This comes at the cost of reliability and predictability.
Autonomy also does not equate to complexity. Deep Research is a very complicated agent, but it doesn't have tremendous autonomy because it works on a narrow task with a limited set of tools. In contrast, an agent running inside a humanoid robot with unlimited use of its arms and legs has more autonomy.
Why Build AI Agents Now?
So why care about AI agents now?
The bet here is that LLMs in 2025 are getting “good enough” that you can feed one your business logic as a prompt, give it some tools, and it can figure out how to accomplish tasks without humans predefining complex logic upfront.
For now, in 2025, the answer to “good enough” depends highly on the domain you are building in. But based on OpenAI's recent releases and their product roadmap, the timelines are accelerating fast enough that it makes sense to start building AI agents now.
For more on OpenAI's roadmap, I recommend reading my previous posts.
When and When Not To Build Agents
This has a simple answer.
Essentially, you create AI agents when your problem benefits from agency, because the problem is too complex or too ambiguous to fully predefine. That's really the main reason.
Why be selective? Because AI agents also have downsides.
Granting agency to AI agents can be problematic if you need very high predictability, which is where traditional software and workflows shine.
Since LLMs are ultimately probabilistic models, 100% predictability is impossible. LLMs can - even in 2025 - hallucinate “wrong actions” or behave unpredictably, especially if you are using weak, small LLMs.
AI workflows versus AI agents
Ultimately, it's about weighing the costs of unpredictability against the benefits of agency. Generally speaking, AI agents have proven value in production for 1) automation use cases that require small amounts of reasoning, such as document processing, 2) exploratory use cases such as report generation, coding, or research, or 3) interfacing with customers.
It’s critical to add only as much agency as you actually need.
Not every use case demands full autonomy. Think of agency as a dial rather than an on/off switch. Turn it up when tasks involve high uncertainty, nuanced decision-making, or frequent edge cases. Dial it back down for predictable tasks where workflows excel. For more details, refer to this post on Why AI agents feel scammy, despite impressive demos.
Crucially, AI agents and workflows are not mutually exclusive. An agent can handle complex decision-making steps within a larger, structured workflow. Conversely, workflows can manage the repeatable steps surrounding agent-driven tasks.
When to use agentic components:
Tasks involving significant ambiguity, complexity, or dynamic conditions.
Situations where handling edge cases individually would be impractical.
Processes that benefit significantly from flexibility and adaptive decision-making.
When not to use agentic components:
Predictable, repetitive tasks where traditional workflows suffice.
Highly regulated scenarios with little room for error or experimentation.
Simple linear sequences where adding agency adds unnecessary risk or complexity.
🚀 If you have read until here, you may be a good fit for the coming AI workshop, which is for those who are passionate about AI and serious about taking the next step.
The first cohort starts on April 23rd, and I am currently accepting applications. For more information, check out this page. Don't hesitate to apply.
About John Hwang
I write the "Enterprise AI Trends" newsletter, read by over 30K readers worldwide. I mainly spend time writing, trading, or coding. Previously, I was a Generative AI architect at AWS, an early PM at Alexa, and the Head of Volatility Index Trading at Morgan Stanley. I studied CS and Math at Stanford (BS, MS).