DeepMind – Google’s Approach to Artificial Intelligence
Artificial intelligence is a very current topic, and we have already published several posts about it, including ones on ChatGPT, MaxAI, and VeniceAI. In this post, we focus on the projects of a company dedicated to deep learning and artificial intelligence research. DeepMind was founded in 2010 and acquired by Google in 2014, which marked Google’s significant entry into this now highly relevant field.
The Origin and Development of DeepMind
The stated goal of DeepMind is to use artificial intelligence to improve various areas of human life: health, work, play, and so on. The long-term aim is to achieve this with artificial general intelligence (AGI) rather than narrowly specialized AI systems. DeepMind combines knowledge from mathematics, neuroscience, computer science, machine learning, and simulation.
Today’s Google DeepMind brings together two laboratories that were previously separate: Google Brain and DeepMind, which merged in 2023. Google Brain was founded in 2011 with the idea of making Google products and services accessible to as many people as possible through artificial intelligence. Its contributions include the open-source libraries JAX and TensorFlow. In 2017, researchers at Google Brain introduced the Transformer architecture, which is the basis of most language models today.
DeepMind has had a large number of more or less significant projects. Here are just a few of the most successful ones:
AlphaGo is a program that uses AI to play Go, a strategy game considered more complex than chess in terms of the number of possible moves. In 2016, AlphaGo defeated Lee Sedol, one of the world’s strongest players, by four games to one.
AlphaGo Zero learned to play Go solely by playing against itself, starting from the rules of the game and without any human game records or human-provided strategies.
AlphaCode is a system that writes programs to solve competitive programming problems. In evaluations on Codeforces contests, it ranked at roughly the level of a median human competitor.
AlphaFold predicts the 3D structure of proteins from their amino acid sequences, making a significant contribution to medicine, biology, and drug discovery. The AlphaFold Database makes the predicted 3D protein structures available.
DeepMind Health collaborated with hospitals in the UK on tools for diagnosing and predicting various diseases; the team later became part of Google Health.
WaveNet generates realistic human speech, changing the way we interact with computers.
Gato is a generalist agent: a single model that can create text, play games, or manipulate objects. It is sometimes described as a step toward artificial general intelligence (AGI), not an example of it.
The publicly visible part of DeepMind is currently developing in three main directions, which we will briefly present:
DeepMind assistant – Gemini
Gemini is Google’s new digital assistant, which provides a wide range of services using artificial intelligence. It can be used for internet searches, where questions can now be formulated in natural language. Gemini also serves as a translator, somewhat like an enhanced Google Translate. You can converse with Gemini as with any other AI chatbot. It can answer complex questions and assist with various tasks by providing advice, reminding you of obligations, or drafting emails. Gemini is trained on a large amount of data so that it recognizes the context of questions well. The model is constantly updated and belongs to the group of general-purpose assistants.
Project Astra is based on the Gemini model. The idea is to combine artificial intelligence with cameras, microphones, speakers, and other devices so that the assistant can recognize and locate objects in the user’s environment and help the user make better use of them.
DeepMind image generator – Imagen 3
Imagen 3 is Google’s model for generating images from text. It understands the user’s textual descriptions well, renders more realistic details, and handles lighting well. It covers a wide range of visual styles and can produce anything from simple sketches to high-resolution images. Currently, it is in the pre-release phase within ImageFX. Each image carries a pixel-level watermark (SynthID) that the human eye cannot see, and much work has gone into prompt filters that make the content of the output images safer. Imagen 3 is expected to be used soon across a broad spectrum of Google services and products.
DeepMind video generator – Veo
Veo is Google’s new model for generating video content. It creates high-resolution videos that can run longer than a minute. Like the previous two models, it relies on a good understanding of natural language. Besides text, the input can be an image or a combination of image and text. Veo can apply a range of cinematic styles and follows text prompts closely. It is not yet publicly available, but interested parties can join a waiting list. It is expected to find use in many areas, from visual storytelling to education.
The Future of DeepMind
It is evident that AI systems are now solving tasks that until recently were considered out of reach for machines and difficult even for humans. This raises several important issues, such as job loss, potential misuse, and ethical concerns. DeepMind has a dedicated ethics team, which signals awareness of these issues. The company also says it strives to reduce its energy footprint and contribute to sustainable development, which is very much in vogue now.
The competition in the AI market is currently fierce, and it will undoubtedly become even more intense in the future. It is clear that DeepMind will do everything to remain at the forefront of the AI revolution, regardless of ethics, sustainable development, and energy footprint. The open question remains: Does DeepMind, backed by the power of the company behind it, contribute to overall progress, or does it stifle and buy out the ideas of other creative individuals and companies?
With these three current projects, DeepMind has positioned itself in the most attractive part of the AI market. In the coming period, we can expect significant integrations of artificial intelligence with robotics and medicine. Google will certainly not miss these changes.