Dileep George

Rajkumar Vasudeva Raju, J. Swaroop Guntupalli, Guangyao Zhou, Miguel Lazaro-Gredilla, Dileep George

December 2022 arXiv 2022

Space is a latent sequence: Structured sequence learning as a unified theory of representation in the hippocampus

Fascinating and puzzling phenomena, such as landmark vector cells, splitter cells, and event-specific representations to name a few, are regularly discovered in the hippocampus. Without a unifying principle that can explain these divergent observations, each experiment seemingly discovers a new anomaly or coding type. Here, we provide a unifying principle that the mental representation of space is an emergent property of latent higher-order sequence learning. Treating space as a sequence resolves myriad phenomena, and suggests that the place-field mapping methodology where sequential neuron responses are interpreted in spatial and Euclidean terms might itself be a source of anomalies. Our model, called Clone-structured Causal Graph (CSCG), uses a specific higher-order graph scaffolding to learn latent representations by mapping sensory inputs to unique contexts. Learning to compress sequential and episodic experiences using CSCGs results in the emergence of cognitive maps - mental representations of spatial and conceptual relationships in an environment that are suited for planning, introspection, consolidation, and abstraction. We demonstrate that over a dozen different hippocampal phenomena, ranging from those reported in classic experiments to the most recent ones, are succinctly and mechanistically explained by our model.

arXiv

Dileep George, Rajeev V. Rikhye, Nishad Gothoskar, J. Swaroop Guntupalli, Antoine Dedieu, Miguel Lázaro-Gredilla

April 2021 Biorxiv Preprint

Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps

Cognitive maps are mental representations of spatial and conceptual relationships in an environment, and are critical for flexible behavior. To form these abstract maps, the hippocampus has to learn to separate or merge aliased observations appropriately in different contexts in a manner that enables generalization and efficient planning. Here we propose a specific higher-order graph structure, clone-structured cognitive graph (CSCG), which forms clones of an observation for different contexts as a representation that addresses these problems. CSCGs can be learned efficiently using a probabilistic sequence model that is inherently robust to uncertainty. We show that CSCGs can explain a variety of cognitive map phenomena such as discovering spatial relations from aliased sensations, transitive inference between disjoint episodes, and formation of transferable schemas. Learning different clones for different contexts explains the emergence of splitter cells observed in maze navigation and event-specific responses in lap-running experiments. Moreover, learning and inference dynamics of CSCGs offer a coherent explanation for disparate place cell remapping phenomena. By lifting aliased observations into a hidden space, CSCGs reveal latent modularity useful for hierarchical abstraction and planning. Altogether, CSCG provides a simple unifying framework for understanding hippocampal function, and could be a pathway for forming relational abstractions in artificial intelligence.

DOI Nature Communications (2021)

Dileep George, Miguel Lazaro-Gredilla, Wolfgang Lehrach, Antoine Dedieu, Guangyao Zhou

October 2020 Biorxiv Preprint

A detailed mathematical theory of thalamic and cortical microcircuits based on inference in a generative vision model

Understanding the information processing roles of cortical circuits is an outstanding problem in neuroscience and artificial intelligence. Theory-driven efforts will be required to tease apart the functional logic of cortical circuits from the vast amounts of experimental data on cortical connectivity and physiology. Although the theoretical setting of Bayesian inference has been suggested as a framework for understanding cortical computation, making precise and falsifiable biological mappings needs models that tackle the challenge of real-world tasks. Based on a recent generative model, Recursive Cortical Networks, that demonstrated excellent performance on visual task benchmarks, we derive a family of anatomically instantiated and functional cortical circuit models. Efficient inference and generalization guided the representational choices in the original computational model. The cortical circuit model is derived by systematically comparing the computational requirements of this model with known anatomical constraints. The derived model suggests precise functional roles for the feed-forward, feedback, and lateral connections observed in different laminae and columns, assigns a computational role for the path through the thalamus, predicts the interactions between blobs and inter-blobs, and offers an algorithmic explanation for the innate inter-laminar connectivity between clonal neurons within a cortical column. The model also explains several visual phenomena, including the subjective contour effect, and neon-color spreading effect, with circuit-level precision. Our work paves a new path forward in understanding the logic of cortical and thalamic circuits.

Biorxiv Preprint (2020)

Daniel P. Sawyer, Miguel Lázaro-Gredilla, Dileep George

February 2020 Cognitive Science

A Model of Fast Concept Inference with Object-Factorized Cognitive Programs

The ability of humans to quickly identify general concepts from a handful of images has proven difficult to emulate with robots. Recently, a computer architecture was developed that allows robots to mimic some aspects of this human ability by modeling concepts as cognitive programs using an instruction set of primitive cognitive functions. This allowed a robot to emulate human imagination by simulating candidate programs in a world model before generalizing to the physical world. However, this model used a naive search algorithm that required 30 minutes to discover a single concept, and became intractable for programs with more than 20 instructions. To circumvent this bottleneck, we present an algorithm that emulates the human cognitive heuristics of object factorization and sub-goaling, allowing human-level inference speed, improving accuracy, and making the output more explainable.

Cognitive Science 2020

Rajeev V. Rikhye, Nishad Gothoskar, J. Swaroop Guntupalli, Antoine Dedieu, Miguel Lázaro-Gredilla, Dileep George

January 2020 Biorxiv Preprint

Learning cognitive maps as structured graphs for vicarious evaluation

Cognitive maps enable us to learn the layout of environments, encode and retrieve episodic memories, and navigate vicariously for mental evaluation of options. A unifying model of cognitive maps will need to explain how the maps can be learned scalably with sensory observations that are non-unique over multiple spatial locations (aliased), retrieved efficiently in the face of uncertainty, and form the fabric of efficient hierarchical planning. We propose learning higher-order graphs – structured in a specific way that allows efficient learning, hierarchy formation, and inference – as the general principle that connects these different desiderata. We show that these graphs can be learned efficiently from experienced sequences using a cloned Hidden Markov Model (CHMM), and uncertainty-aware planning can be achieved using message-passing inference. Using diverse experimental settings, we show that CHMMs can be used to explain the emergence of context-specific representations, formation of transferable structural knowledge, transitive inference, shortcut finding in novel spaces, remapping of place cells, and hierarchical planning. Structured higher-order graph learning and probabilistic inference might provide a simple unifying framework for understanding hippocampal function, and a pathway for relational abstractions in artificial intelligence.

Biorxiv Preprint (2020)

Miguel Lazaro-Gredilla, Wolfgang Lehrach, Dileep George

December 2019 Advances in approximate Bayesian inference – 2019.

Learning undirected models via query training

Query training is a technique that lets you train graphical models using ideas from deep learning.

2019 Approximate Bayesian Inference Workshop

Nishad Gothoskar, J. Swaroop Guntupalli, Rajeev Rikhye, Miguel Lazaro-Gredilla, Dileep George

October 2019 Cognitive Computational Neuroscience

Different clones for different contexts: Hippocampal cognitive maps as higher-order graphs of a cloned HMM

Hippocampus encodes cognitive maps that support episodic memories, navigation, and planning. Understanding the commonality among those maps as well as how those maps are structured, learned from experience, and used for inference and planning is an interesting but unsolved problem. We propose higher-order graphs as the general principle and present, as a plausible model, a cloned hidden Markov model (HMM) that can learn these graphs efficiently from experienced sequences. In our experiments, we use the cloned HMM for learning spatial and abstract representations. We show that inference and planning in the learned CHMM encapsulate many of the key properties of hippocampal cells observed in rodents and humans. Cloned HMM thus provides a new framework for understanding hippocampal function.

2019 Cognitive Computational Neuroscience

Rajeev Rikhye, Nishad Gothoskar, J. Swaroop Guntupalli, Miguel Lazaro-Gredilla, Dileep George

October 2019 Cognitive Computational Neuroscience

Memorize-Generalize: An online algorithm for learning higher-order sequential structure with cloned Hidden Markov Models

Sequence learning is a vital cognitive function and has been observed in numerous brain areas. Discovering the algorithms underlying sequence learning has been a major endeavour in both neuroscience and machine learning. In earlier work we showed that by constraining the sparsity of the emission matrix of a Hidden Markov Model (HMM) in a biologically-plausible manner we are able to efficiently learn higher-order temporal dependencies and recognize contexts in noisy signals. The central basis of our model, referred to as the Cloned HMM (CHMM), is the observation that cortical neurons sharing the same receptive field properties can learn to represent unique incidences of bottom-up information within different temporal contexts. CHMMs can efficiently learn higher-order temporal dependencies, recognize long-range contexts and, unlike recurrent neural networks, are able to natively handle uncertainty. In this paper we introduce a biologically plausible CHMM learning algorithm, memorize-generalize, that can rapidly memorize sequences as they are encountered, and gradually generalize as more data is accumulated. We demonstrate that CHMMs trained with the memorize-generalize algorithm can model long-range structure in bird songs with only a slight degradation in performance compared to expectation-maximization, while still outperforming other representations.

2019 Cognitive Computational Neuroscience

Antoine Dedieu, Nishad Gothoskar, Scott Swingle, Wolfgang Lehrach, Miguel Lazaro-Gredilla, Dileep George

May 2019 Arxiv

Learning higher-order sequential structure with cloned HMMs

Variable order sequence modeling is an important problem in artificial and natural intelligence. While overcomplete Hidden Markov Models (HMMs), in theory, have the capacity to represent long-term temporal structure, they often fail to learn and converge to local minima. We show that by constraining HMMs with a simple sparsity structure inspired by biology, we can make them learn variable order sequences efficiently. We call this model cloned HMM (CHMM) because the sparsity structure enforces that many hidden states map deterministically to the same emission state. CHMMs with over 1 billion parameters can be efficiently trained on GPUs without being severely affected by the credit diffusion problem of standard HMMs. Unlike n-grams and sequence memoizers, CHMMs can model temporal dependencies at arbitrarily long distances and recognize contexts with “holes” in them. Compared to Recurrent Neural Networks and their Long Short-Term Memory extensions (LSTMs), CHMMs are generative models that can natively deal with uncertainty. Moreover, CHMMs return a higher-order graph that represents the temporal structure of the data which can be useful for community detection, and for building hierarchical models. Our experiments show that CHMMs can beat n-grams, sequence memoizers, and LSTMs on character-level language modeling tasks. CHMMs can be a viable alternative to these methods in some tasks that require variable order sequence modeling and the handling of uncertainty.

Arxiv 2019

Miguel Lazaro-Gredilla, Dianhuan Lin, J. Swaroop Guntupalli, Dileep George

February 2019 Science Robotics

Beyond Imitation: Zero-shot task transfer on robots by learning concepts as cognitive programs

Concepts are formalized as programs on a special computer architecture called the Visual Cognitive Computer (VCC). By learning programs on VCC, concepts transfer from schematic inputs to real-world robots.

Science Robotics. Open Access (2019) Science Magazine Video Vicarious Blog: A thought is a program Code: Learning abstractions Dataset: Tabletop visual cognitive concepts Press: Fortune magazine Press: TechCrunch Press: ScienceNews

Dileep George, Wolfgang Lehrach, Miguel Lazaro-Gredilla

October 2018 Cognitive Computational Neuroscience

Cortical micro-circuits from a generative vision model

A hierarchical vision model that emphasizes the role of lateral and feedback connections and treats classification, segmentation, generation, and occlusion-reasoning in a unified framework.

Cognitive Computational Neuroscience 2018

Dileep George, Wolfgang Lehrach, Miguel Lazaro-Gredilla

October 2018 Cognitive Computational Neuroscience

Explaining Visual Cortex Phenomena using Recursive Cortical Network

A hierarchical vision model that emphasizes the role of lateral and feedback connections and treats classification, segmentation, generation, and occlusion-reasoning in a unified framework.

CCN 2018

Nicholas Hay, Michael Stark, Alexander Schlegel, Carter Wendelken, Dennis Park, Eric Purdy, Tom Silver, D. Scott Phoenix, Dileep George

February 2018 AAAI

Behavior is Everything: Towards Representing Concepts with Sensorimotor Contingencies

AI has seen remarkable progress in recent years, due to a switch from hand-designed shallow representations, to learned deep representations. While these methods excel with plentiful training data, they are still far from the human ability to learn concepts from just a few examples by reusing previously learned conceptual knowledge in new contexts. We argue that this gap might come from a fundamental misalignment between human and typical AI representations: while the former are grounded in rich sensorimotor experience, the latter are typically passive and limited to a few modalities such as vision and text. We take a step towards closing this gap by proposing an interactive, behavior-based model that represents concepts using sensorimotor contingencies grounded in an agent’s experience. On a novel conceptual learning and benchmark suite, we demonstrate that conceptually meaningful behaviors can be learned, given supervision via training curricula.

AAAI 2018 Vicarious Blog. From action to abstraction.

Dileep George, Wolfgang Lehrach, Ken Kansky, Miguel Lazaro-Gredilla, Christopher Laan, Bhaskara Marthi, Xinghua Lou, Zhaoshi Meng, Yi Liu, Huayan Wang, Alex Lavin, D. Scott Phoenix

October 2017 Science

A generative model for vision that trains with high data efficiency and breaks text-based CAPTCHAs

Learning from a few examples and generalizing to markedly different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing–based inference handles recognition, segmentation, and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects such as data efficiency and compositionality that may be important in the path toward general artificial intelligence.

Science: Open access (2017) Vicarious Blog: Commonsense, Cortex and CAPTCHA Code/Github BBC NPR Wired The Independent

Ken Kansky, Tom Silver, David A. Mély, Mohamed Eldawy, Miguel Lázaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George

September 2017 ICML

Schema Networks: Zero-shot transfer with a generative causal model of intuitive physics

The recent adaptation of deep neural network-based methods to reinforcement learning and planning domains has yielded remarkable progress on individual tasks. Nonetheless, progress on task-to-task transfer remains limited. In pursuit of efficient and robust generalization, we introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals. The richly structured architecture of the Schema Network can learn the dynamics of an environment directly from data. We compare Schema Networks with Asynchronous Advantage Actor-Critic and Progressive Networks on a suite of Breakout variations, reporting results on training efficiency and zero-shot generalization, consistently demonstrating faster, more robust learning and better transfer. We argue that generalizing from limited data and learning causal relationships are essential abilities on the path toward generally intelligent systems.

ICML 2017 Vicarious Blog Press: Wired

Miguel Lázaro-Gredilla, Yi Liu, D. Scott Phoenix, Dileep George

June 2017 Arxiv

Hierarchical Compositional Feature Learning

We introduce the hierarchical compositional network (HCN), a directed generative model able to discover and disentangle, without supervision, the building blocks of a set of binary images. The building blocks are binary features defined hierarchically as a composition of some of the features in the layer immediately below, arranged in a particular manner. At a high level, HCN is similar to a sigmoid belief network with pooling. Inference and learning in HCN are very challenging and existing variational approximations do not work satisfactorily. A main contribution of this work is to show that both can be addressed using max-product message passing (MPMP) with a particular schedule (no EM required). Also, using MPMP as an inference engine for HCN makes new tasks simple: adding supervision information, classifying images, or performing inpainting all correspond to clamping some variables of the model to their known values and running MPMP on the rest. When used for classification, fast inference with HCN has exactly the same functional form as a convolutional neural network (CNN) with linear activations and binary weights. However, HCN's features are qualitatively very different.

ArXiv 2017 Vicarious Blog

Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George

June 2017 CVPR 2017

Teaching compositionality to CNNs

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting and training CNNs so that their learned features are compositional. It encourages networks to form representations that disentangle objects from their surroundings and from each other, thereby promoting better generalization. Our method is agnostic to the specific details of the underlying CNN to which it is applied and can in principle be used with any CNN. As we show in our experiments, the learned representations lead to feature activations that are more localized and improve performance over non-compositional baselines in object recognition tasks.

CVPR 2017 Vicarious Blog: Toward Learning a Compositional Visual Representation

Dileep George

January 2017 Behavioral and Brain Sciences

What can the brain teach us about building artificial intelligence?

This paper is an invited commentary on Lake et al.'s Behavioral and Brain Sciences article titled “Building machines that learn and think like people”. Lake et al.'s paper offers a timely critique on the recent accomplishments in artificial intelligence from the vantage point of human intelligence, and provides insightful suggestions about research directions for building more human-like intelligence. Since we agree with most of the points raised in that paper, we will offer a few points that are complementary.

Behavioral and Brain Sciences (2017) Arxiv version (2017)

Xinghua Lou, Ken Kansky, Wolfgang Lehrach, CC Laan, Bhaskara Marthi, D. Scott Phoenix, Dileep George

December 2016 NeurIPS

Generative shape models

NeurIPS 2016

Dileep George, Jeff Hawkins

September 2009 PLoS Computational Biology

Towards a Mathematical Theory of Cortical Micro-circuits

The theoretical setting of hierarchical Bayesian inference is gaining acceptance as a framework for understanding cortical computation. In this paper, we describe how Bayesian belief propagation in a spatio-temporal hierarchical model, called Hierarchical Temporal Memory (HTM), can lead to a mathematical model for cortical circuits. An HTM node is abstracted using a coincidence detector and a mixture of Markov chains. Bayesian belief propagation equations for such an HTM node define a set of functional constraints for a neuronal implementation. Anatomical data provide a contrasting set of organizational constraints. The combination of these two constraints suggests a theoretically derived interpretation for many anatomical and physiological features and predicts several others. We describe the pattern recognition capabilities of HTM networks and demonstrate the application of the derived circuits for modeling the subjective contour effect. We also discuss how the theory and the circuit can be extended to explain cortical features that are not explained by the current model and describe testable predictions that can be derived from the model.

PLoS Computational Biology (2009)

Dileep George

September 2008 PhD Thesis. Electrical Engineering. Stanford University

How the brain might work: A hierarchical model of learning and recognition

Stanford University PhD Thesis (2008)

Jeff Hawkins, Dileep George, Jamie Niemasik

September 2008 Proceedings of the Royal Society B

Sequence memory for prediction, inference and behaviour

In this paper, we propose a mechanism which the neocortex may use to store sequences of patterns. Storing and recalling sequences are necessary for making predictions, recognizing time-based patterns and generating behaviour. Since these tasks are major functions of the neocortex, the ability to store and recall time-based sequences is probably a key attribute of many, if not all, cortical areas. Previously, we have proposed that the neocortex can be modelled as a hierarchy of memory regions, each of which learns and recalls sequences. This paper proposes how each region of neocortex might learn the sequences necessary for this theory. The basis of the proposal is that all the cells in a cortical column share bottom-up receptive field properties, but individual cells in a column learn to represent unique incidences of the bottom-up receptive field property within different sequences. We discuss the proposal, the biological constraints that led to it and some results modelling it.

Phil Trans. Royal Society B (2008)

Dileep George, Jeff Hawkins

September 2005 International Joint Conference on Neural Networks

A Hierarchical Bayesian Model of Invariant Pattern Recognition in the Visual Cortex

We describe a hierarchical model of invariant visual pattern recognition in the visual cortex. In this model, the knowledge of how patterns change when objects move is learned and encapsulated in terms of high probability sequences at each level of the hierarchy. Configuration of object parts is captured by the patterns of coincident high probability sequences. This knowledge is then encoded in a highly efficient Bayesian Network structure. The learning algorithm uses a temporal stability criterion to discover object concepts and movement patterns. We show that the architecture and algorithms are biologically plausible. The large scale architecture of the system matches the large scale organization of the cortex and the micro-circuits derived from the local computations match the anatomical data on cortical circuits. The system exhibits invariance across a wide variety of transformations and is robust in the presence of noise. Moreover, the model also offers alternative explanations for various known cortical phenomena.

IJCNN 2005

Pat Langley, Dileep George

September 2003 ICML

Robust induction of process models from time-series data

In this paper, we revisit the problem of inducing a process model from time-series data. We illustrate this task with a realistic ecosystem model, review an initial method for its induction, then identify three challenges that require extension of this method. These include dealing with unobservable variables, finding numeric conditions on processes, and preventing the creation of models that overfit the training data. We describe responses to these challenges and present experimental evidence that they have the desired effects. After this, we show that this extended approach to inductive process modeling can explain and predict time-series data from batteries on the International Space Station. In closing, we discuss related work and consider directions for future research.

ICML 2003

Dileep George

AGI Research at DeepMind

Entrepreneur, Scientist, and Engineer

Previously Co-founder & CTO at Vicarious AI
