How You Can Solve Lumosity's Pet Detective Puzzle with chatGPT and OpenCV Without Writing a Single Line of Code

OK, consider this a WIP. Wanted to get something out there to accompany the video. It'll keep improving - your comments welcome!


This one has been on my list for years. Like Prof. Brad Vander Zanden says, it's gnarly to have a problem you can't solve and don't know if it can be or not.

----------

Extracting subject locations from the puzzle image (animals, houses, the car, empty nodes and links!)

How would one go about solving this problem in python?

You have a few image files. These are small images each containing a single subject.

Now, you are given an test image containing some of these subjects without any modification - that is, the small image files, if contained in the large test image, are contained without any modification.

What is required:

Dump out the locations of the subjects that are contained in the test image. Dump out the connections between adjacent subjects - that is, where a "branch" exists - a "road", versus "no road" - where the image contains just the background.

----------

OK, let's do one problem at a time. Give me a script that will just dump out the locations of the subjects. Let's not worry about connectivity for now

------------

Can you tweak the script to accept argument1 as test-mage-path and argument2 as subject-image-folder? Then, it needs to load all the subject images from that folder

------------

Are you sure the script is right? When I run it python3 detect_subjects.py pet_det_for_chatGPT.png templates I get: Detected subject positions: empty_PD.png at (np.int64(318), np.int64(25)) to (np.int64(343), np.int64(50)) (center: (np.int64(330), np.int64(37)), score: 0.93) empty_PD.png at (np.int64(317), np.int64(26)) to (np.int64(342), np.int64(51)) (center: (np.int64(329), np.int64(38)), score: 0.93) empty_PD.png at (np.int64(318), np.int64(26)) to (np.int64(343), np.int64(51)) (center: (np.int64(330), np.int64(38)), score: 1.00) empty_PD.png at (np.int64(319), np.int64(26)) to (np.int64(344), np.int64(51)) (center: (np.int64(331), np.int64(38)), score: 0.93) empty_PD.png at (np.int64(318), np.int64(27)) to (np.int64(343), np.int64(52)) (center: (np.int64(330), np.int64(39)), score: 0.93) that is, it's reporting the same template found multiple times (the position is correct).

Ans:

You're totally right to question that — and yes, the script does detect the same template multiple times in a very small neighborhood, which is common when using cv2.matchTemplate(). This happens because similar matches within a few pixels all score high — especially for small templates or uniform backgrounds.

--------------

One update needed: We expect to find EVERY template image in the test image. If this is violated, at the end of the listing, can you list the template images that were NOT found in the test image?

==========================

Capturing branch info - connectivity information - build up the graph that you can feed to the "solver"

I would like a script that takes input similar to the following:

car_PD_right.png at (278, 198) to (349, 248) (center: (313, 223), score: 1.00)

cat_pink_PD.png at (8, 8) to (55, 52) (center: (31, 30), score: 0.99)

cat_pink_house_PD.png at (213, 14) to (253, 59) (center: (233, 36), score: 1.00)

cat_purple_PD.png at (409, 104) to (455, 155) (center: (432, 129), score: 0.97)

cat_purple_house_PD.png at (414, 219) to (452, 259) (center: (433, 239), score: 1.00)

corgi_PD.png at (104, 207) to (162, 252) (center: (133, 229), score: 1.00)

road_horiz_PD.png at (76, 203) to (88, 260) (center: (82, 231), score: 1.00)

road_horiz_PD.png at (94, 203) to (106, 260) (center: (100, 231), score: 1.00)

road_horiz_PD.png at (467, 303) to (479, 360) (center: (473, 331), score: 1.00)

road_horiz_PD.png at (278, 103) to (290, 160) (center: (284, 131), score: 1.00)

road_horiz_PD.png at (480, 303) to (492, 360) (center: (486, 331), score: 1.00)

road_horiz_PD.png at (184, 103) to (196, 160) (center: (190, 131), score: 1.00)

road_horiz_PD.png at (367, 3) to (379, 60) (center: (373, 31), score: 1.00)

road_vert_PD.png at (206, 83) to (252, 96) (center: (229, 89), score: 1.00)

road_vert_PD.png at (406, 184) to (452, 197) (center: (429, 190), score: 1.00)

road_vert_PD.png at (406, 173) to (452, 186) (center: (429, 179), score: 1.00)

road_vert_PD.png at (106, 89) to (152, 102) (center: (129, 95), score: 1.00)

road_vert_PD.png at (206, 65) to (252, 78) (center: (229, 71), score: 1.00)

road_vert_PD.png at (206, 284) to (252, 297) (center: (229, 290), score: 1.00)


The script should only consider lines containing the word "center" The input contains information about the positions of objects in lines NOT containing the word "road" The script contains information about links (branches) between the objects in lines containing the word "road".

Examples of objects are "car_PD_right" and "cat_purple_house_PD". That is - from a line NOT containing the word "road", take the initial part of the line upto ".png". Note that an "object" line (that is NOT a "road" line) contains bounding box information giving opposite corners of the rectangle bounding the image.

The script will use the center coordinate information from the "road" lines to report connections between the objects using the following scheme.

Read the object information into a data structure(s) that allows querying later. For each "road" line, take the center coordinates and look through the object list and report a branch between two objects if the center coordinates of this road is contained within the rectangle formed by the edges of two non-road objects that are neighbors - that is, if A and B are neighbors in one direction (say in the horizontal going right direction) then there is no other object between A and B (other than a road).

What is the best language to code this in?

------------

Yes, please provide the complete script. Note that a branch or connection between two non-road objects should NOT be reported if there is another non-road object whose center coordinates are within the rectangle formed by the parallel nearby edges of the two objects.

For example:

car_PD_right ( (278, 198) to (349, 248) ) and empty_PD ( (316, 117) to (341, 142) ) are adjacent because the shape formed by their nearby parallel edges : (empty_PD : (316,142) to (341,142) and car_PD_right: (278,248) to (349,248)) contain at least one "road" : road_vert_PD center: (329,159) but no other non-road object

------------

Can you fix it so that the script reads from STDIN and echoes lines to the STDOUT before printing branch info at the end?

------------

Can you add this refinement to aid debugging:

When reporting a branch, can you add the road object (road_horiz or road_vert) and its center coordinates that led to the conclusion of the branch's existence?

-------------

Can you ensure that only one neighbor is found in each direction - whether you call them up,down,left,right or north,south,west,east?

Here is a problem:

empty_PD.png at (316, 117) to (341, 142) (center: (328, 129), score: 1.00) car_PD_right.png at (278, 198) to (349, 248) (center: (313, 223), score: 1.00) road_vert_PD.png at (306, 153) to (352, 166) (center: (329, 159), score: 0.99) road_vert_PD.png at (306, 153) to (352, 166) (center: (329, 159), score: 0.99)

It is reporting:

car_PD_right <--> empty_PD [via road_vert_PD at (329, 159)] empty_PD <--> parrot_PD [via road_vert_PD at (329, 159)]

We know that cannot be - the road from the empty_PD (above) to car_PD_right ends at car_PD_right. parrot_PD is below car_PD_right.

Only the "nearer" neighbor relationship should be retained

====================================

The solver!

What do you know about Lumosity Pet Detective game?

-------------

OK, so it's a vehicle routing problem.

You are provided the capacity of 4 and a number representing fuel - which, in our case, once the problem is formulated as a graph, represents the number of branches that may be traversed. The car does NOT have to return to the starting point.

Suppose the problem is presented to the script in the following format:

car_right <--> raccoon_house

car_right <--> empty

cat_pink <--> turtle_house

cat_pink <--> rabbit

cat_pink_house <--> turtle_house

cat_pink_house <--> lizard

cat_pink_house <--> hedgehog

cat_purple <--> cat_purple_house

cat_purple <--> empty

cat_purple <--> raccoon

cat_purple_house <--> turtle

corgi <--> corgi_house

corgi <--> lizard_house

corgi_house <--> fox_house

corgi_house <--> wiener_house

empty <--> hedgehog

fox <--> parrot_house

fox <--> raccoon

fox_house <--> lizard_house

hedgehog <--> wiener

hedgehog <--> raccoon_house

hedgehog_house <--> parrot

hedgehog_house <--> rabbit_house

lizard <--> parrot_house

lizard_house <--> rabbit

parrot <--> wiener_house

rabbit <--> wiener

rabbit_house <--> turtle

raccoon <--> turtle

raccoon_house <--> wiener_house

turtle_house <--> wiener


The car - the vehicle - may have some trailing text in its name that can be ignored. In this case, it is "car_right". The animals - such as raccoon and parrot are obviously to be delivered to their houses - denoted by parrot_house and raccoon house.

A delivery only happens if a house is visited with the corresponding animal in the car.

If the animal corresponding to that house is NOT in the car during the visit, it is merely a traversal en route to visiting another node. 

"empty" merely represent a node in the branch without an animal or a house. It is merely a stopping point and visiting it does cost the traversal of the connecting branch.

In the given example, the car starts in a position where it can travel to the empty node at a cost of 1 or visit (but not drop off, because it starts off empty) the raccoon_house.

Does this give you enough information to propose a means of solving this optimization problem using Python?

------------------

You can create the solver and, when it is run, it will take as argument1 the file containing this problem text containing the links between nodes. Argument 2 will be the "fuel limit" - that is number of branch traversals permitted.

In the program, you can have a variable for "capacity" - which will be set to 4 - the maximum number of animals the car can hold

Note that, with regards to both animal and house nodes of the graph, we can speak of "pick up" or "drop off" and "visit" actions. In the case of visiting an animal, we visit the node, but do not add the animal to the car.

The output will be the sequence of actions to take:

For example


Step 1 : Go to empty node

Step 2 : Visit the cat_purple

Step 3 : Pick up the raccoon

Step 4 : Visit the cat_purple

Step 5 : Visit the empty node

Step 6 : Visit the starting position

Step 7 : Drop off the raccoon


Note that it is acceptable to say "drop off the raccoon" if the current position is NOT one branch away from the raccoon_house if the cost (number of branches ) is accounted for correctly

--------------

You created a solution from scratch instead of using an optimization library?

Ans:

Yes, I created a custom state-space search solution using BFS (Breadth-First Search) instead of relying on an optimization library like ortools, PuLP, or networkx for a few practical reasons:


🧠 Why a Custom BFS?

  1. The problem isn't purely numerical (e.g. not just distances and costs), but has complex state transitions:

    • Pick-up and drop-off constraints

    • Vehicle capacity

    • Fuel limit

    • Conditional actions like “visit” vs. “pick up” vs. “drop off” → These are easier to represent as state machines than as linear programming constraints.

  2. BFS makes it easier to trace the exact sequence of actions, which you specifically wanted:

    • You don’t just care about “how many deliveries” — you want the full step-by-step log of the actions taken.

  3. Highly customizable:

    • You can easily swap in A* or DFS with heuristics later.

    • You can insert custom rules (e.g., don’t visit the same pet twice, don’t exceed capacity) inline.


🧰 Could We Use an Optimization Library?

Absolutely. If you’re aiming for:

  • Faster performance on large graphs

  • Guaranteed global optimum

  • Integration with real-world route costs or timings

Then libraries like:

  • Google OR-Tools

  • networkx with shortest path + greedy capacity logic

  • PuLP for LP-based route solving

...can be used. But they'd require:

  • Encoding the problem into cost matrices,

  • Handling logical constraints as binary variables (more complex to express pick-up/drop-off logic),

  • Post-processing to reconstruct readable step-by-step instructions.

---------------

Is it currently a brute-force solution?

(yes)

---------------

Yes, please add the smarter heuristics - the brute force solution simply does not scale

--------------

Yes, it's taking too long to run. Please enhance, and also check for the word DEBUG (all caps) on command line at end - which indicates that debug info must be printed - to give user an idea of progress

--------------

Thanks. Can you print the elapsed time at the end and also, in each step, print at end of line, fuel remaining and the car contents? (this is in the normal (not DEBUG) output)

--------------

I made an error in specification. 

There will be only one car and only one of each animal and one of each animal's house. But, there can be multiple "empty" nodes. They are not all the same "empty"

So, please update to consider any "empty" followed by an integer - for example "empty1" or "empty2" as another unique empty node that contains no animal and no house

---------------

How does it terminate? It did well on one puzzle, but on another one it didn't.

On each output line, we are adding car contents and fuel. Can you add one more item:

ONLY when remaining animals to be delivered is LESS than 4, mention them as

pending: list of ..

-----------------

OK, that was good..

Here is how my most recent run ended

Step 54 : Go to corgi_house (fuel: 5, cargo: ['corgi'], pending: ['rabbit'])

Step 55 : Drop off the corgi (fuel: 5, cargo: [], pending: ['rabbit'])

Elapsed time: 93.53 seconds

So, clearly, it did not "finish" the job. Why does it thing it's done?

It appears that, at any time, after dropping off an animal, if the car contents are zero, it thinks it's done. Is that right?

----------------

How do we know that full is impossible?

Ans:

That’s the key insight — we don’t currently have a guarantee that full delivery is impossible. The solver:
  • Searches exhaustively, within the fuel constraint

  • Keeps track of the best delivery count seen

  • But does not prove whether any full-delivery path exists unless it explicitly finds one


--------------

One enhancement needed in the printing.

Add a step in which the output is piped through a module which does some clean up:

If two successive steps are of the form

Step N: Go to X

Step N+1: Visit X

Then remove both, since there is no actual action for the user to take in those two

---------------

Add one more cleanup step:

If there is

Step N : Go to X (which could also be go to X_house)

Step N+1: Go to Y

Meaning there is no action to take at X

then eliminate the step


Comments

Popular posts from this blog

Modern Magic - How Do They Do It?

Why Does chatGPT Fail Utterly with this One?