Raven's Progressive Matrices Solver

31 Jul 2016

The goal of this project was to produce an agent capable of reasoning about and solving complex visual problems: Raven’s Progressive Matrices (RPMs). The agent used knowledge-based AI principles such as problem reduction, analogical reasoning, and visuospatial reasoning.

Across a battery of RPM tests, the agent achieved test accuracies between 64% and 90%.

In broad terms, the primary algorithm followed a sequential strategy:

  1. Initialize a store of figure transformation concepts. In the first step, the agent initialized a knowledge bank of applicable figure transformations. These transformations were the concepts applied during analogical reasoning in later steps. They included unary image operations, which transform one image into a new image, as well as binary image operations, which take two images as input.
  2. Problem reduction. In the next step, the agent attempted to reduce the problem and build a smarter tester for candidate solutions. It examined the example figures for duplicates and, if none existed, eliminated any candidate solutions that matched an example figure.
  3. Analogical reasoning. In the third step, the agent built canonical analogies between figures by constructing semantic networks from the transformations it observed.
  4. Application and generalization. In the penultimate step, the agent tested the remaining candidate solutions (figures that were not discarded during Step 2) against the canonical analogies, assigning each candidate heuristic scores for its likelihood of correctness according to the semantic networks.
  5. Decision-making. In the final step, the agent combined the heuristics collected in Step 4 across all selected analogies to determine which candidate solution was most likely correct, and selected the highest-scoring candidate.
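As a rough illustration of how the five steps above fit together, here is a minimal sketch for a 2x2 problem (A is to B as C is to ?). It is not the agent's actual code: every name in it (`UNARY_TRANSFORMS`, `similarity`, `reduce_candidates`, `solve_2x2`) is hypothetical, figures are assumed to be square boolean numpy arrays where `True` marks a dark pixel, and a simple Jaccard pixel overlap stands in for the semantic-network heuristics. The problem-reduction rule reflects one reading of Step 2.

```python
import numpy as np

# Step 1: a small store of unary transformations (hypothetical selection).
UNARY_TRANSFORMS = {
    "identity": lambda img: img,
    "flip_horizontal": np.fliplr,
    "flip_vertical": np.flipud,
    "rotate_90": np.rot90,  # assumes square figures so shapes still match
}


def similarity(a, b):
    """Jaccard overlap of dark pixels: 1.0 for identical, 0.0 for disjoint."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both figures are blank
    return np.logical_and(a, b).sum() / union


def reduce_candidates(examples, candidates):
    """Step 2: if no example figure repeats, drop candidates equal to an example."""
    has_duplicates = any(
        np.array_equal(x, y)
        for i, x in enumerate(examples)
        for y in examples[i + 1:]
    )
    if has_duplicates:
        return dict(candidates)
    return {
        name: fig
        for name, fig in candidates.items()
        if not any(np.array_equal(fig, ex) for ex in examples)
    }


def solve_2x2(a, b, c, candidates):
    """Steps 3-5: score candidates by how well each transformation explains A -> B."""
    # Fall back to the full candidate set if reduction removed everything.
    remaining = reduce_candidates([a, b, c], candidates) or dict(candidates)
    scores = {name: 0.0 for name in remaining}
    for transform in UNARY_TRANSFORMS.values():
        # Step 3: how well does this transformation explain A -> B?
        weight = similarity(transform(a), b)
        # Step 4: apply it to C and compare the prediction with each candidate.
        predicted = transform(c)
        for name, fig in remaining.items():
            scores[name] += weight * similarity(predicted, fig)
    # Step 5: choose the candidate with the highest combined score.
    return max(scores, key=scores.get)
```

Weighting each candidate's score by how well the transformation explained the example pair means that weak analogies contribute little to the final decision, which is one simple way to combine evidence across several analogies.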

The agent was built in Python and used imaging and numerical libraries such as Pillow and numpy; no computer vision (CV) libraries were used, and all CV algorithms were coded from scratch.
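The post does not say which CV algorithms were written by hand, so the following is only an example of what "from scratch" can look like with Pillow and numpy alone: counting the separate shapes in a figure with a breadth-first flood fill. The function names, the threshold of 128, and the choice of 4-connectivity are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def load_binary(path, threshold=128):
    """Load an image with Pillow and mark dark pixels as True."""
    # Imported lazily so the rest of the sketch only needs numpy.
    from PIL import Image

    gray = np.asarray(Image.open(path).convert("L"))
    return gray < threshold


def count_components(mask):
    """Count 4-connected groups of True pixels using a breadth-first flood fill."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    rows, cols = mask.shape
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            # Found an unvisited dark pixel: it starts a new shape.
            count += 1
            seen[r, c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
    return count
```

A shape count like this could feed a comparison such as "figure B has one more shape than figure A", although whether the agent used such a feature is not stated in the post.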

Due to proprietary, privacy, or academic concerns, the source code is not publicly available, but I am happy to provide it upon request.


I'm a software engineer, proud veteran, and even prouder husband and father. I live and work in Silicon Valley, and love to learn about learning (EdTech), ML/AI/RL, and cybersecurity.