Generative AI is just the Beginning AI Agents are what Comes next | Daoud Abdel Hadi | TEDxPSUT
Introduction
About six years ago, I was nearing the end of my master's degree in AI, and despite exploring various projects involving machine learning, genetic algorithms, and generative AI, I still felt that we were far from creating true intelligence in computers. While AI, especially machine learning, has transformed many fields—diagnosing illnesses, detecting fraud, optimizing traffic flows—it often operates as a specialist, excelling in very specific tasks. Unlike humans, AI systems struggle to generalize across different tasks. At the time, I believed that complete automation of work would remain a far-fetched idea.
However, I soon discovered I was wrong, very wrong. Just two years later, OpenAI introduced the world to large language models with the release of GPT-3, the predecessor to ChatGPT. This marked a massive leap forward for AI. It was an experiment in what would happen if we gathered enormous data sets (books, articles, research papers) and trained an AI on them using the most powerful computers available. Many of you have likely used ChatGPT and witnessed its capabilities firsthand. It writes naturally, answers questions on a multitude of topics, understands and generates code, and composes articles, songs, and poetry. Its ability to reason and recognize patterns in a manner akin to human cognition is striking.
Yet, while many have appreciated ChatGPT for brainstorming ideas, writing content, and answering queries, these functionalities merely scratch the surface of what generative AI can achieve. To truly grasp the potential of this technology, it's essential to recognize what it does not do well. For one, language models can make errors, including fabricating facts, a phenomenon known as "hallucination." Additionally, their information may not always be up to date, and they may struggle with basic math and multitasking. Though these limitations can pose significant challenges depending on the tasks at hand, it's important to note that humans share similar shortcomings.
Where we excel is in problem-solving capabilities that go beyond mere knowledge. Our intelligence involves planning, breaking problems into smaller components, reflecting on the outcomes of actions, and effectively using tools to accomplish our goals. This brings us to a fascinating idea: what if we applied a human-like problem-solving approach to language models? By rethinking AI like ChatGPT not simply as a chatbot needing constant human input but as an autonomous agent that can automate workflows end-to-end, we start to glimpse a powerful future.
Instead of relying on humans to manage tools, we can simply describe our tasks and goals to an AI. The AI will then determine which tools to use, how to execute them, and complete the task autonomously. This has profound implications. Picture yourself as an entrepreneur who has just launched a business but lacks the technical skills to create a website. Instead of hiring a web developer, you can describe your business to an AI, outline how you envision the website, and the AI builds it in seconds. Alternatively, if you've been accumulating data and require insights for sound business decisions, the AI uses analysis tools to provide instant responses to your queries. Or imagine wanting to plan a holiday; the AI organizes travel options, accommodations, and activities for you.
Such agents function as digital labor, capable of automatically browsing the web, managing files, using applications, and potentially controlling devices on our behalf. This vision might sound like science fiction, but today's technology is making it increasingly viable. Importantly, everything on our screens is simply a visual representation of code. Every action and interface corresponds to underlying code, which allows us to create programs that combine functionality from various applications in innovative ways.
The essence of how agents work lies in their ability to switch between different tools as a task requires. When a user requests an action, like "book me the cheapest flight to London", the agent enters a feedback loop of querying and responding: it plans the necessary actions, executes them, and adapts based on the outputs until the task is completed. We are already seeing agents in action. Microsoft's Copilot assists in Excel, analyzing spreadsheets through natural language. Similarly, Shopify's Sidekick helps users create websites, while HyperWrite acts as a personal assistant that can book flights, order food, and manage emails. Even ChatGPT now features a catalog of agents referred to as GPTs.
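The feedback loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any of the products named here actually work. The flight data, the two tools (search_flights, book_flight), and the function names are all hypothetical, and plan_next_step is a hand-written stand-in for the language model that a real agent would ask to choose the next action.

```python
from typing import Any, Callable, Optional

# Hypothetical flight data; a real agent would query a travel API instead.
FLIGHTS = [
    {"id": "F1", "to": "London", "price": 420},
    {"id": "F2", "to": "London", "price": 310},
    {"id": "F3", "to": "Paris", "price": 150},
]


def search_flights(destination: str) -> list[dict]:
    """Tool: return every known flight to the destination."""
    return [f for f in FLIGHTS if f["to"] == destination]


def book_flight(flight_id: str) -> str:
    """Tool: pretend to book a flight and return a confirmation string."""
    return f"booked {flight_id}"


# The tools the agent may use, looked up by name.
TOOLS: dict[str, Callable[..., Any]] = {
    "search_flights": search_flights,
    "book_flight": book_flight,
}

# Each history entry is (tool name, tool output).
History = list[tuple[str, Any]]


def plan_next_step(goal: str, history: History) -> Optional[tuple[str, dict]]:
    """Stand-in for the language model (an assumption of this sketch).

    Looks at the goal and the outputs so far, then returns the next
    (tool name, arguments) pair, or None when there is nothing left to do.
    """
    if not history:
        # Naive goal parsing: take whatever follows the last " to ".
        destination = goal.rsplit(" to ", 1)[-1]
        return "search_flights", {"destination": destination}
    last_tool, last_output = history[-1]
    if last_tool == "search_flights" and last_output:
        # Adapt to the search results: pick the cheapest flight found.
        cheapest = min(last_output, key=lambda f: f["price"])
        return "book_flight", {"flight_id": cheapest["id"]}
    return None  # Either the booking is done or no flights were found.


def run_agent(goal: str, max_steps: int = 5) -> History:
    """Plan, execute, observe, and repeat until the planner stops."""
    history: History = []
    for _ in range(max_steps):  # Step cap guards against endless loops.
        step = plan_next_step(goal, history)
        if step is None:
            break
        tool_name, arguments = step
        output = TOOLS[tool_name](**arguments)  # Execute the chosen tool.
        history.append((tool_name, output))     # Feed the output back in.
    return history


if __name__ == "__main__":
    for tool_name, output in run_agent("book me the cheapest flight to London"):
        print(tool_name, "->", output)
```

The point of the design is that the loop itself knows nothing about flights. Swapping the planner for a real language model and the toy tools for real ones would leave run_agent unchanged, which is one way to read the claim that agents work by moving between tools.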
As language models become more affordable and user-friendly, I predict that numerous businesses will start incorporating agents into their operations and client services. The barriers to accessing such technology are already falling. And as agents continue to evolve, becoming more intelligent and sophisticated, they will transform our understanding of computing. The shift from command-line interfaces to graphical user interfaces revolutionized human-computer interaction; I believe the next leap will involve AI-assisted interfaces, akin to Tony Stark's AI, Jarvis.
Embracing a future with intelligent assistants may seem daunting: technical skills we once believed were unique to humans are being delegated to AI. As a data scientist, I wonder how long it will be before an AI can perform the tasks I once specialized in. This thought is simultaneously frightening and empowering. As skills are democratized, the barriers to innovation shrink, enabling a broader range of people to contribute to problem-solving and creative pursuits that were once dominated by large corporations and specialized professionals. In summary, while AI may outperform us in executing tasks, it offers us the chance to focus on what truly matters: our creativity, ingenuity, and human experience.
Keywords
- Generative AI
- AI Agents
- Automation
- Large Language Models
- ChatGPT
- Machine Learning
- Problem-Solving
- Digital Labor
- Intelligent Assistants
- Innovation
FAQ
Q: What is generative AI?
A: Generative AI refers to technology that can generate new content, such as text, images, and music, based on patterns learned from its training data.
Q: How do AI agents differ from traditional AI applications?
A: AI agents can automate entire workflows with minimal human intervention, planning and executing tasks independently based on user goals.
Q: Can AI agents perform complex tasks without human assistance?
A: Increasingly, yes. AI agents can analyze data, perform calculations, and use tools to complete tasks without constant human input, although the underlying language models can still make errors such as fabricating facts.
Q: What are some examples of AI agents in use today?
A: Examples include Microsoft’s Copilot for Excel, Shopify’s Sidekick for website creation, and personal assistants like HyperWrite for booking and task management.
Q: What are the implications of AI agents for the workforce?
A: AI agents may change job roles and skill requirements, democratizing access to tools and innovation while allowing humans to focus on creativity and problem-solving.
Q: Is there a risk of AI replacing human jobs?
A: While AI agents may automate certain tasks, they are also likely to create opportunities for new types of work that focus on human creativity and experience.