Focus AI: When AI goes from talking to acting

From generative to “interactive” AI: AI systems like ChatGPT will soon not only deliver texts but also be able to fulfill tasks as assistants. To do this, they need to learn how to use tools. ChatGPT is already making a start by integrating the image AI DALL-E and other plug-ins.

Helmut Spudich

Enough chattering; it’s time to get down to business. Since ChatGPT saw the light of day ten months ago, there has been a lot of chatting – from love poems, drafts for authors, and social media texts to school assignments and term papers, which have left teachers in serious doubt. ChatGPT can write long essays for us, summarize texts, or interpret blood values copied from a PDF. But it fails to answer simple questions about the latest developments. “I’m sorry, but my last training data set ends in January 2022,” is the sparse response of the otherwise talkative AI on current topics.

This limitation is not surprising, as ChatGPT is a so-called “Large Language Model” (LLM), which means that it draws its “knowledge” from an incredibly massive amount of text on which it was trained at time X. It cannot yet simply search the Internet and include breaking news. Its range of action is also still limited; for example, it cannot work with Excel or summarize information from a video file.

The era of chatting is now giving way to action: ChatGPT has new features. These include a direct link to DALL-E, the image-generating AI also developed by OpenAI. What looks like a small step for users is a big step for the generative AI system.

Previously, ChatGPT could be used to refine the “prompts” – instructions for an AI system – needed to get better image results from DALL-E. But it took the user’s copy and paste to get from one AI to the other. This now happens in a single process, assuming a paid subscription to ChatGPT Plus.

Plug-ins, small additional programs, also aim to help ChatGPT learn new tricks. One example of this still-young development is opening up the world of PDF files. AskYourPDF, for example, can “read” PDF documents and create summaries for users or answer specific questions about the content. To do so, it requires a web link to the PDF or an upload of the file.

AI systems will gradually develop into real assistants. The travel website Expedia and the restaurant reservation system OpenTable are among the first plug-in developers. They will soon be able to provide their information on travel and restaurant reservations via ChatGPT. These are still beta programs intended to gain experience from the “real world.” However, in the not too distant future, we can expect to carry out routine tasks such as making appointments or booking flights with the help of AI assistants.

Today’s systems are a long way from that. Mustafa Suleyman, a co-founder of the British AI company DeepMind, which Google later acquired, sees generative AI systems as a further step towards “interactive AI.” By this, he means bots that can complete tasks by relying on other software as well as the input of other people.

AI, explains Suleyman in an interview with WIRED magazine, developed in three phases. The first phase was about classification, in which AI systems learned to classify and analyze different types of input data, such as images, video, or speech. The current second phase of generative AI can use this input data to produce new data.

“The third stage of AI development will be an interactive phase. I bet this is the future interface of computers: instead of clicking buttons and typing, you talk to the AI. And this AI can then do things based on a given goal and use the necessary tools,” says Suleyman. The AI researcher wants to implement this with his own AI,

Let us take the example of booking a flight: To fulfill this task, an AI assistant must be able to research airline databases independently, possibly obtain additional information from service centers via email, make suggestions to the client, and ultimately make a booking using credit card information.
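For readers who want to picture these steps in code, here is a minimal sketch in Python. It is purely illustrative: the flight data is made up, and the functions `search_flights`, `suggest_cheapest`, `book_flight`, and `run_assistant` are hypothetical stand-ins, not the interface of ChatGPT, any plug-in, or any airline. The sketch also assumes that the client must approve a suggestion before anything is booked.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Flight:
    airline: str
    price_eur: int


def search_flights(origin: str, destination: str) -> List[Flight]:
    # Hypothetical stand-in for researching airline databases.
    # The data is invented for illustration.
    return [Flight("AirA", 180), Flight("AirB", 140), Flight("AirC", 210)]


def suggest_cheapest(flights: List[Flight]) -> Flight:
    # The "suggestion to the client": here simply the lowest price.
    return min(flights, key=lambda f: f.price_eur)


def book_flight(flight: Flight, approved_by_user: bool) -> str:
    # Assumption: nothing is booked without explicit approval.
    if not approved_by_user:
        return "not booked: waiting for user approval"
    return f"booked {flight.airline} for {flight.price_eur} EUR"


def run_assistant(origin: str, destination: str,
                  approve: Callable[[Flight], bool]) -> str:
    # Chain the tools: research, suggest, then book if approved.
    flights = search_flights(origin, destination)
    choice = suggest_cheapest(flights)
    return book_flight(choice, approve(choice))


if __name__ == "__main__":
    # The client accepts any suggestion up to 150 EUR.
    print(run_assistant("VIE", "LHR", lambda f: f.price_eur <= 150))
```

In a real assistant, a language model would decide which tool to call and when; here the order is fixed to keep the idea visible.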

However, the capabilities of such interactive AI must be carefully controlled and limited by humans, warns Suleyman. The AI’s ability to self-improve is particularly critical: “You don’t want your little AI to go off and update its own code without you having oversight.” Suleyman is convinced that government regulation is essential for the further development of AI. “I think everybody is having a complete panic that we will not be able to regulate it. But that is complete nonsense. As with other successfully regulated areas, this will also be possible with AI.”

Published: September 29, 2023