How You Can Solve Lumosity's Pet Detective Puzzle with chatGPT and OpenCV Without Writing a Single Line of Code
OK, consider this a WIP. Wanted to get something out there to accompany the video. It'll keep improving - your comments welcome!
This one has been on my list for years. Like Prof. Brad Vander Zanden says, it's gnarly to have a problem you can't solve and don't know if it can be or not.
----------
Extracting subject locations from the puzzle image (animals, houses, the car, empty nodes and links!)
How would one go about solving this problem in python?
You have a few image files. These are small images each containing a single subject.
Now, you are given an test image containing some of these subjects without any modification - that is, the small image files, if contained in the large test image, are contained without any modification.
What is required:
Dump out the locations of the subjects that are contained in the test image. Dump out the connections between adjacent subjects - that is, where a "branch" exists - a "road", versus "no road" - where the image contains just the background.
----------
OK, let's do one problem at a time. Give me a script that will just dump out the locations of the subjects. Let's not worry about connectivity for now
------------
Can you tweak the script to accept argument1 as test-mage-path and argument2 as subject-image-folder? Then, it needs to load all the subject images from that folder
------------
Are you sure the script is right? When I run it python3 detect_subjects.py pet_det_for_chatGPT.png templates I get: Detected subject positions: empty_PD.png at (np.int64(318), np.int64(25)) to (np.int64(343), np.int64(50)) (center: (np.int64(330), np.int64(37)), score: 0.93) empty_PD.png at (np.int64(317), np.int64(26)) to (np.int64(342), np.int64(51)) (center: (np.int64(329), np.int64(38)), score: 0.93) empty_PD.png at (np.int64(318), np.int64(26)) to (np.int64(343), np.int64(51)) (center: (np.int64(330), np.int64(38)), score: 1.00) empty_PD.png at (np.int64(319), np.int64(26)) to (np.int64(344), np.int64(51)) (center: (np.int64(331), np.int64(38)), score: 0.93) empty_PD.png at (np.int64(318), np.int64(27)) to (np.int64(343), np.int64(52)) (center: (np.int64(330), np.int64(39)), score: 0.93) that is, it's reporting the same template found multiple times (the position is correct).
Ans:
You're totally right to question that — and yes, the script does detect the same template multiple times in a very small neighborhood, which is common when using cv2.matchTemplate(). This happens because similar matches within a few pixels all score high — especially for small templates or uniform backgrounds.
--------------
One update needed: We expect to find EVERY template image in the test image. If this is violated, at the end of the listing, can you list the template images that were NOT found in the test image?
==========================
Capturing branch info - connectivity information - build up the graph that you can feed to the "solver"
I would like a script that takes input similar to the following:
car_PD_right.png at (278, 198) to (349, 248) (center: (313, 223), score: 1.00)
cat_pink_PD.png at (8, 8) to (55, 52) (center: (31, 30), score: 0.99)
cat_pink_house_PD.png at (213, 14) to (253, 59) (center: (233, 36), score: 1.00)
cat_purple_PD.png at (409, 104) to (455, 155) (center: (432, 129), score: 0.97)
cat_purple_house_PD.png at (414, 219) to (452, 259) (center: (433, 239), score: 1.00)
corgi_PD.png at (104, 207) to (162, 252) (center: (133, 229), score: 1.00)
road_horiz_PD.png at (76, 203) to (88, 260) (center: (82, 231), score: 1.00)
road_horiz_PD.png at (94, 203) to (106, 260) (center: (100, 231), score: 1.00)
road_horiz_PD.png at (467, 303) to (479, 360) (center: (473, 331), score: 1.00)
road_horiz_PD.png at (278, 103) to (290, 160) (center: (284, 131), score: 1.00)
road_horiz_PD.png at (480, 303) to (492, 360) (center: (486, 331), score: 1.00)
road_horiz_PD.png at (184, 103) to (196, 160) (center: (190, 131), score: 1.00)
road_horiz_PD.png at (367, 3) to (379, 60) (center: (373, 31), score: 1.00)
road_vert_PD.png at (206, 83) to (252, 96) (center: (229, 89), score: 1.00)
road_vert_PD.png at (406, 184) to (452, 197) (center: (429, 190), score: 1.00)
road_vert_PD.png at (406, 173) to (452, 186) (center: (429, 179), score: 1.00)
road_vert_PD.png at (106, 89) to (152, 102) (center: (129, 95), score: 1.00)
road_vert_PD.png at (206, 65) to (252, 78) (center: (229, 71), score: 1.00)
road_vert_PD.png at (206, 284) to (252, 297) (center: (229, 290), score: 1.00)
The script should only consider lines containing the word "center" The input contains information about the positions of objects in lines NOT containing the word "road" The script contains information about links (branches) between the objects in lines containing the word "road".
Examples of objects are "car_PD_right" and "cat_purple_house_PD". That is - from a line NOT containing the word "road", take the initial part of the line upto ".png". Note that an "object" line (that is NOT a "road" line) contains bounding box information giving opposite corners of the rectangle bounding the image.
The script will use the center coordinate information from the "road" lines to report connections between the objects using the following scheme.
Read the object information into a data structure(s) that allows querying later. For each "road" line, take the center coordinates and look through the object list and report a branch between two objects if the center coordinates of this road is contained within the rectangle formed by the edges of two non-road objects that are neighbors - that is, if A and B are neighbors in one direction (say in the horizontal going right direction) then there is no other object between A and B (other than a road).
What is the best language to code this in?
------------
Yes, please provide the complete script. Note that a branch or connection between two non-road objects should NOT be reported if there is another non-road object whose center coordinates are within the rectangle formed by the parallel nearby edges of the two objects.
For example:
car_PD_right ( (278, 198) to (349, 248) ) and empty_PD ( (316, 117) to (341, 142) ) are adjacent because the shape formed by their nearby parallel edges : (empty_PD : (316,142) to (341,142) and car_PD_right: (278,248) to (349,248)) contain at least one "road" : road_vert_PD center: (329,159) but no other non-road object
------------
Can you fix it so that the script reads from STDIN and echoes lines to the STDOUT before printing branch info at the end?
------------
Can you add this refinement to aid debugging:
When reporting a branch, can you add the road object (road_horiz or road_vert) and its center coordinates that led to the conclusion of the branch's existence?
-------------
Can you ensure that only one neighbor is found in each direction - whether you call them up,down,left,right or north,south,west,east?
Here is a problem:
empty_PD.png at (316, 117) to (341, 142) (center: (328, 129), score: 1.00) car_PD_right.png at (278, 198) to (349, 248) (center: (313, 223), score: 1.00) road_vert_PD.png at (306, 153) to (352, 166) (center: (329, 159), score: 0.99) road_vert_PD.png at (306, 153) to (352, 166) (center: (329, 159), score: 0.99)
It is reporting:
car_PD_right <--> empty_PD [via road_vert_PD at (329, 159)] empty_PD <--> parrot_PD [via road_vert_PD at (329, 159)]
We know that cannot be - the road from the empty_PD (above) to car_PD_right ends at car_PD_right. parrot_PD is below car_PD_right.
Only the "nearer" neighbor relationship should be retained
====================================
The solver!
What do you know about Lumosity Pet Detective game?
-------------
OK, so it's a vehicle routing problem.
You are provided the capacity of 4 and a number representing fuel - which, in our case, once the problem is formulated as a graph, represents the number of branches that may be traversed. The car does NOT have to return to the starting point.
Suppose the problem is presented to the script in the following format:
car_right <--> raccoon_house
car_right <--> empty
cat_pink <--> turtle_house
cat_pink <--> rabbit
cat_pink_house <--> turtle_house
cat_pink_house <--> lizard
cat_pink_house <--> hedgehog
cat_purple <--> cat_purple_house
cat_purple <--> empty
cat_purple <--> raccoon
cat_purple_house <--> turtle
corgi <--> corgi_house
corgi <--> lizard_house
corgi_house <--> fox_house
corgi_house <--> wiener_house
empty <--> hedgehog
fox <--> parrot_house
fox <--> raccoon
fox_house <--> lizard_house
hedgehog <--> wiener
hedgehog <--> raccoon_house
hedgehog_house <--> parrot
hedgehog_house <--> rabbit_house
lizard <--> parrot_house
lizard_house <--> rabbit
parrot <--> wiener_house
rabbit <--> wiener
rabbit_house <--> turtle
raccoon <--> turtle
raccoon_house <--> wiener_house
turtle_house <--> wiener
The car - the vehicle - may have some trailing text in its name that can be ignored. In this case, it is "car_right". The animals - such as raccoon and parrot are obviously to be delivered to their houses - denoted by parrot_house and raccoon house.
A delivery only happens if a house is visited with the corresponding animal in the car.
If the animal corresponding to that house is NOT in the car during the visit, it is merely a traversal en route to visiting another node.
"empty" merely represent a node in the branch without an animal or a house. It is merely a stopping point and visiting it does cost the traversal of the connecting branch.
In the given example, the car starts in a position where it can travel to the empty node at a cost of 1 or visit (but not drop off, because it starts off empty) the raccoon_house.
Does this give you enough information to propose a means of solving this optimization problem using Python?
------------------
You can create the solver and, when it is run, it will take as argument1 the file containing this problem text containing the links between nodes. Argument 2 will be the "fuel limit" - that is number of branch traversals permitted.
In the program, you can have a variable for "capacity" - which will be set to 4 - the maximum number of animals the car can hold
Note that, with regards to both animal and house nodes of the graph, we can speak of "pick up" or "drop off" and "visit" actions. In the case of visiting an animal, we visit the node, but do not add the animal to the car.
The output will be the sequence of actions to take:
For example
Step 1 : Go to empty node
Step 2 : Visit the cat_purple
Step 3 : Pick up the raccoon
Step 4 : Visit the cat_purple
Step 5 : Visit the empty node
Step 6 : Visit the starting position
Step 7 : Drop off the raccoon
Note that it is acceptable to say "drop off the raccoon" if the current position is NOT one branch away from the raccoon_house if the cost (number of branches ) is accounted for correctly
--------------
You created a solution from scratch instead of using an optimization library?
Ans:
Yes, I created a custom state-space search solution using BFS (Breadth-First Search) instead of relying on an optimization library like ortools, PuLP, or networkx for a few practical reasons:
🧠Why a Custom BFS?
-
The problem isn't purely numerical (e.g. not just distances and costs), but has complex state transitions:
-
Pick-up and drop-off constraints
-
Vehicle capacity
-
Fuel limit
-
Conditional actions like “visit” vs. “pick up” vs. “drop off” → These are easier to represent as state machines than as linear programming constraints.
-
-
BFS makes it easier to trace the exact sequence of actions, which you specifically wanted:
-
You don’t just care about “how many deliveries” — you want the full step-by-step log of the actions taken.
-
-
Highly customizable:
-
You can easily swap in A* or DFS with heuristics later.
-
You can insert custom rules (e.g., don’t visit the same pet twice, don’t exceed capacity) inline.
-
🧰 Could We Use an Optimization Library?
Absolutely. If you’re aiming for:
-
Faster performance on large graphs
-
Guaranteed global optimum
-
Integration with real-world route costs or timings
Then libraries like:
-
networkxwith shortest path + greedy capacity logic -
PuLPfor LP-based route solving
...can be used. But they'd require:
-
Encoding the problem into cost matrices,
-
Handling logical constraints as binary variables (more complex to express pick-up/drop-off logic),
-
Post-processing to reconstruct readable step-by-step instructions.
---------------
Is it currently a brute-force solution?
(yes)
---------------
Yes, please add the smarter heuristics - the brute force solution simply does not scale
--------------
Yes, it's taking too long to run. Please enhance, and also check for the word DEBUG (all caps) on command line at end - which indicates that debug info must be printed - to give user an idea of progress
--------------
Thanks. Can you print the elapsed time at the end and also, in each step, print at end of line, fuel remaining and the car contents? (this is in the normal (not DEBUG) output)
--------------
I made an error in specification.
There will be only one car and only one of each animal and one of each animal's house. But, there can be multiple "empty" nodes. They are not all the same "empty"
So, please update to consider any "empty" followed by an integer - for example "empty1" or "empty2" as another unique empty node that contains no animal and no house
---------------
How does it terminate? It did well on one puzzle, but on another one it didn't.
On each output line, we are adding car contents and fuel. Can you add one more item:
ONLY when remaining animals to be delivered is LESS than 4, mention them as
pending: list of ..
-----------------
OK, that was good..
Here is how my most recent run ended
Step 54 : Go to corgi_house (fuel: 5, cargo: ['corgi'], pending: ['rabbit'])
Step 55 : Drop off the corgi (fuel: 5, cargo: [], pending: ['rabbit'])
Elapsed time: 93.53 seconds
So, clearly, it did not "finish" the job. Why does it thing it's done?
It appears that, at any time, after dropping off an animal, if the car contents are zero, it thinks it's done. Is that right?
----------------
-
Searches exhaustively, within the fuel constraint
-
Keeps track of the best delivery count seen
-
But does not prove whether any full-delivery path exists unless it explicitly finds one
--------------
One enhancement needed in the printing.
Add a step in which the output is piped through a module which does some clean up:
If two successive steps are of the form
Step N: Go to X
Step N+1: Visit X
Then remove both, since there is no actual action for the user to take in those two
---------------
Add one more cleanup step:
If there is
Step N : Go to X (which could also be go to X_house)
Step N+1: Go to Y
Meaning there is no action to take at X
then eliminate the step
Comments
Post a Comment