Artificial intelligence continues to advance at a rapid pace. Even in 2020, a year that did not lack compelling news, AI advances commanded mainstream attention on multiple occasions. OpenAI’s GPT-3, in particular, showed new and surprising ways we may soon be seeing AI penetrate daily life. Such rapid progress makes prediction about the future of AI somewhat difficult, but some areas do seem ripe for breakthroughs. Here are a few areas in AI that we feel particularly optimistic about in 2021.
Transformers
Two of 2020’s biggest AI achievements quietly shared the same underlying AI structure. Both OpenAI’s GPT-3 and DeepMind’s AlphaFold are based on a sequence processing model called the Transformer. Although Transformer structures have been around since 2017, GPT-3 and AlphaFold demonstrated the Transformer’s remarkable ability to learn more deeply and quickly than the previous generation of sequence models, and to perform well on problems outside of natural language processing.
Unlike prior sequence modeling structures such as recurrent neural networks and LSTMs, Transformers depart from the paradigm of processing data sequentially. They process the whole input sequence at once, using a mechanism called attention to learn which parts of the input are relevant to which other parts. This allows Transformers to easily relate distant parts of the input sequence, a task that recurrent models have famously struggled with. It also allows significant parts of the training to be done in parallel, better leveraging the massively parallel hardware that has become available in recent years and greatly reducing training time. Researchers will undoubtedly be looking for new places to apply this promising structure in 2021, and there’s good reason to expect positive results. In fact, OpenAI has already adapted GPT-3 this year to generate images from text descriptions. The Transformer looks ready to dominate 2021.
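To make the attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not how GPT-3 or AlphaFold is implemented: real Transformers use learned projections to produce the queries, keys, and values, whereas this toy example reuses the raw token vectors for all three. The function names and the toy embeddings are ours.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a whole sequence at once.

    Each argument is a list of equal-length vectors, one per position.
    Returns (outputs, weights); weights[i][j] is how strongly position i
    attends to position j.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs, weights = [], []
    # Each position is handled independently of the others, which is
    # why this computation parallelizes well on modern hardware.
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        )
    return outputs, weights

# Toy sequence of five positions (assumed embeddings); the first and
# last positions share the same vector.
tokens = [[4.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0], [4.0, 0.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

In this toy sequence, the first position attends as strongly to the last position as it does to itself, even though the two sit at opposite ends of the input. No information has to be carried step by step through the positions in between, which is the property that lets Transformers relate distant parts of a sequence.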
Graph neural networks
Many domains have data that naturally lend themselves to graph structures: computer networks, social networks, molecules/proteins, and transportation routes are just a few examples. Graph neural networks (GNNs) enable the application of deep learning to graph-structured data, and we expect GNNs to become an increasingly important AI method in the future. More specifically, in 2021, we expect that methodological advances in a few key areas will drive broader adoption of GNNs.
Dynamic graphs are the first area of importance. While most GNN research to date has assumed a static, unchanging graph, the scenarios above necessarily involve changes over time: in social networks, for example, members join (new nodes) and friendships change (different edges). In 2020, we saw some efforts to model time-evolving graphs as a series of snapshots, but we expect 2021 to extend this nascent research direction with a focus on approaches that model a dynamic graph as a continuous time series. Such continuous modeling should enable GNNs to discover and learn from temporal structure in graphs in addition to the usual topological structure.
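The difference between the two views of a dynamic graph can be sketched with a few lines of Python. This is a data-representation illustration only, not a GNN: the `EdgeEvent` record, the member names, and the timestamps are hypothetical. The snapshot view buckets timestamped edges into fixed windows, while the continuous view keeps every timestamp.

```python
from collections import namedtuple

# Hypothetical record of one edge appearing in a social network.
EdgeEvent = namedtuple("EdgeEvent", ["time", "source", "target"])

events = [
    EdgeEvent(0.5, "alice", "bob"),
    EdgeEvent(1.2, "bob", "carol"),
    EdgeEvent(1.9, "carol", "dave"),
    EdgeEvent(3.4, "alice", "dave"),
]

def snapshots(events, window):
    """Discrete view: bucket events into fixed-width windows of edge sets."""
    buckets = {}
    for event in events:
        index = int(event.time // window)
        buckets.setdefault(index, set()).add((event.source, event.target))
    return buckets

def inter_event_gaps(events):
    """Continuous view: keep exact timestamps, here the gaps between events."""
    ordered = sorted(events, key=lambda event: event.time)
    return [round(b.time - a.time, 6) for a, b in zip(ordered, ordered[1:])]

discrete = snapshots(events, window=2.0)
gaps = inter_event_gaps(events)
```

With a window of 2.0, the first three edges land in the same snapshot as an unordered set, so the order and timing in which they formed is no longer visible. The continuous view retains that information, which is the kind of temporal structure a continuous-time model could learn from.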
Improvements on the message-passing paradigm will be another enabling advancement. A common method of implementing graph neural networks, message passing is a means of aggregating information about nodes by “passing” information along the edges that connect neighbors. Although intuitive, message passing struggles to capture effects that require information to propagate across long distances on a graph. In 2021, we expect breakthroughs to move beyond this paradigm, such as by iteratively learning which information propagation pathways are the most relevant or even learning an entirely novel causal graph on a relational dataset.
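The long-distance limitation is easy to see in a stripped-down sketch. The code below assumes the simplest possible scheme, in which each node averages its own scalar feature with those of its neighbors; real GNN layers use learned transformations and vector features, and the graph here is a hypothetical five-node path.

```python
def message_passing_round(adjacency, features):
    """One round: each node averages its own value with its neighbors' values."""
    updated = {}
    for node, neighbors in adjacency.items():
        messages = [features[n] for n in neighbors] + [features[node]]
        updated[node] = sum(messages) / len(messages)
    return updated

# Path graph 0 - 1 - 2 - 3 - 4, with a signal present only at node 0.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
features = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}

# history[k] holds every node's feature after k rounds of message passing.
history = [features]
for _ in range(4):
    history.append(message_passing_round(adjacency, history[-1]))
```

Node 4 sits four edges away from node 0, so it sees nothing of the signal until the fourth round, and what finally arrives has been diluted by every averaging step along the way. Reaching more distant nodes requires more rounds, which is why effects that span long distances on a graph are hard for this paradigm to capture.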
Applications
Many of last year’s top stories highlighted nascent advances in practical applications of AI, and 2021 looks poised to capitalize on these advances. Applications that depend on natural language understanding, in particular, are likely to see advances as access to the GPT-3 API becomes more widely available. The API allows users to access GPT-3’s abilities without requiring them to train their own AI, an otherwise expensive endeavor. With Microsoft having secured an exclusive license to GPT-3, we may also see the technology appear in Microsoft products.
Other application areas also appear likely to benefit substantially from AI technology in 2021. AI and machine learning (ML) have already made steady inroads into cyber security, and 2021 has the potential to steepen that trajectory. As highlighted by the SolarWinds breach, companies are coming to terms with impending threats from cyber criminals and nation-state actors, and with the constantly evolving configurations of malware and ransomware. In 2021, we expect an aggressive push to augment network defense systems with AI-driven behavioral analytics. AI and behavioral analytics are critical for identifying new threats, including variants of earlier threats.
We also expect an uptick in applications that default to running machine learning models on edge devices in 2021. Devices like Google’s Coral, which features an onboard tensor processing unit (TPU), are bound to become more widespread with advancements in processing power and quantization technologies. Edge AI eliminates the need to send data to the cloud for inference, saving bandwidth and reducing execution time, both of which are critical in fields such as health care. Edge computing may also open new applications in areas that require privacy, security, or low latency, and in regions of the world that lack access to high-speed internet.
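Quantization is what lets a model fit within the memory and compute budget of a small device: weights stored as 32-bit floats are mapped to low-precision integers. The sketch below assumes the simplest variant, symmetric 8-bit quantization with a single shared scale. It is not the toolchain used by Coral or any other specific device, and the weight values are made up for illustration.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers using one shared scale (symmetric scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    # Fall back to a scale of 1.0 if every weight is zero.
    scale = (max(abs(w) for w in weights) / qmax) or 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.82, -1.27, 0.05, 0.4, -0.33]  # illustrative values only
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now occupies one byte instead of four, and the rounding error per weight is bounded by half the scale. Whether that loss of precision is acceptable depends on the model and the task, which is part of why advances in quantization methods matter for edge deployment.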
The bottom line
AI technology continues to proliferate in practical domains, and advances in Transformer structures and GNNs are likely to spur advances in domains that haven’t yet readily lent themselves to existing AI techniques and algorithms. We’ve highlighted here several areas that seem ready for advancement this year, but there will undoubtedly be surprises as the year unfolds. Predictions are hard, especially about the future, as the saying goes, but right or wrong, 2021 looks to be an exciting year for the field of AI.
Ben Wiener is a data scientist at Vectra AI and has a PhD in physics and a variety of skills in related topics including computer modeling, optimization, machine learning, and robotics.
Daniel Hannah is a data scientist and researcher with more than 8 years of experience turning messy data into actionable insights. At Vectra AI, he works at the interface of artificial intelligence and network security. Previously, he applied machine learning approaches to anomaly detection as a fellow at Insight Data Science.
Allan Ogwang is a data scientist at Vectra AI with a strong math background and experience in econometrics, statistical modeling, and machine learning.
Christopher Thissen is a data scientist at Vectra AI, where he uses machine learning to detect malicious cyber behaviors. Before joining Vectra, Chris led several DARPA-funded machine learning research projects at Boston Fusion Corporation.