But Is It a Cat?

05 Jul, 2025

I wanted to have some fun with AI. As you do.

So I thought it would be super fun to mess with AI and see if I could replicate a well-known weakness: take an image, start changing its pixels, and see how many you can change before the AI actually figures out that the photo is different.

But I cannot be trusted.

Instead of starting subtle, I sent AI this:

cat

Which I think we can all agree is a cat.

AI is so very unsure about this:

a photo of a cat: 0.2721
a screenshot of a video game: 0.2554
a photo of a toy: 0.2515
a computer-generated image: 0.2485
a cartoon character: 0.2383

So I guess we have to be boring now.

I also sent it a picture of a skeleton.

skeleton

This time, the model felt only marginally more certain:

a photo of a skeleton: 0.2729
a computer-generated image: 0.2494
a photo of a person: 0.2467
a corrupted JPEG: 0.2340
a photo of a toy: 0.2338

At this point, a reasonable person would go find a better image and move on.

I am not a reasonable person.

So I figured, maybe the adversarial attack can make it look more like a cat? I mean, the skeleton has a much more chill background.

Of course these photos have too many pixels for me to be willing to deal with them right now. So we made them smaller.
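For anyone following along, here is a minimal sketch of the shrinking step using Pillow. The 64x64 target size and the `resize_pair` name are assumptions for illustration; the post doesn't say what size was actually used. Both images are forced to the same size because the pixel-swapping later on compares them coordinate by coordinate.

```python
from PIL import Image


def resize_pair(base, overlay, size=(64, 64)):
    """Resize two images to the same RGB size (hypothetical helper).

    Note: this ignores aspect ratio, so images may look squashed.
    """
    return base.convert("RGB").resize(size), overlay.convert("RGB").resize(size)


# Stand-in images; in practice these would be the cat and skeleton photos.
cat = Image.new("RGB", (640, 480), (200, 150, 100))
skeleton = Image.new("RGB", (300, 500), (20, 20, 20))

small_cat, small_skeleton = resize_pair(cat, skeleton)
print(small_cat.size, small_skeleton.size)
```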

resized_cat

a photo of a cat: 0.2538
a computer-generated image: 0.2422
a highly pixelated image: 0.2328
a distorted photo: 0.2295
a screenshot of a video game: 0.2274

resized_skeleton

a photo of a skeleton: 0.2652
a computer-generated image: 0.2643
a blurry photo: 0.2511
a corrupted JPEG: 0.2486
a photo of a toy: 0.2448

We can make a note:

Cosine similarity between images: 0.6902
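The post doesn't say whether that number was computed on raw pixels or on model embeddings. As a rough sketch, assuming raw pixels, cosine similarity between two same-sized images could look like this (`image_cosine_similarity` is a made-up name, and the random arrays are stand-ins for the real images):

```python
import numpy as np


def image_cosine_similarity(a, b) -> float:
    """Cosine similarity between two same-shaped images, treated as flat vectors."""
    va = np.asarray(a, dtype=np.float64).ravel()
    vb = np.asarray(b, dtype=np.float64).ravel()
    if va.shape != vb.shape:
        raise ValueError("images must have the same shape")
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    if denom == 0:
        raise ValueError("cosine similarity is undefined for an all-zero image")
    return float(va @ vb / denom)


# Random stand-in images, seeded so the run is repeatable.
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
img_b = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

print(f"{image_cosine_similarity(img_a, img_b):.4f}")
```

An image compared with itself scores 1.0, and because pixel values are never negative, two different images land somewhere between 0 and 1.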

So let's play.

I'm not the most patient person and I mostly just wanted to know how much I could change the image and have it still be a cat.

So I used the following:

import numpy as np
from PIL import Image

# get_category_value(image_array, label) and count_changed_pixels(a, b)
# are helpers defined elsewhere; they are not shown in this post.


def modify_image(image_base, image_overlay, image_base_expected: str = "a photo of a cat", iteration: int = 1):
    image_base_np = np.array(image_base)
    image_overlay_np = np.array(image_overlay)

    initial_score = get_category_value(image_base_np, image_base_expected)
    print(f"Initial score for '{image_base_expected}': {initial_score:.4f}")

    # Both images must have the same dimensions: the overlay is indexed
    # with the base image's coordinates.
    height, width, _ = image_base_np.shape

    current_image_copy = image_base_np.copy()
    # Separate copy, so edits to the working image don't silently change the best one.
    best_scoring_array = current_image_copy.copy()
    best_score = initial_score
    worst_scoring_array = None
    worst_score = None

    for i in range(height):
        print(f"Processing row {i + 1}/{height}...")
        print(f"    Best score so far: {best_score:.4f}")
        print(f"    Worst score so far: {worst_score:.4f}" if worst_score is not None else "    No worst score yet")
        for j in range(width):
            overlay_pixel = image_overlay_np[i, j]
            # Skip pixels that already match the overlay.
            if np.array_equal(current_image_copy[i, j], overlay_pixel):
                continue
            current_image_copy[i, j] = overlay_pixel

            new_score = get_category_value(current_image_copy, image_base_expected)

            if new_score >= best_score:
                # The swap did not hurt: record the new best image.
                best_score = new_score
                best_scoring_array = current_image_copy.copy()
            elif worst_score is None or new_score < worst_score:
                # New low: record it, then roll back to the best image so far.
                worst_score = new_score
                worst_scoring_array = current_image_copy.copy()
                current_image_copy = best_scoring_array.copy()
            # Otherwise (between worst and best) the swap stays in the working copy.

    print(f"Best score: {best_score:.4f}")
    best_changed_pixels = count_changed_pixels(image_base_np, best_scoring_array)
    print(f"Changed pixels for best score: {best_changed_pixels}")
    # The images/ directory must already exist.
    Image.fromarray(best_scoring_array).save(f"images/resized_cat_{iteration}.png")

    # There is only a worst image if at least one swap lowered the score.
    if worst_score is not None:
        print(f"Worst score: {worst_score:.4f}")
        worst_changed_pixels = count_changed_pixels(image_base_np, worst_scoring_array)
        print(f"Changed pixels for worst score: {worst_changed_pixels}")
        Image.fromarray(worst_scoring_array).save(f"images/worst_resized_cat_{iteration}.png")

This is not the best code I’ve ever written, but that’s also not the point.

What it does is take images and see how much of the skeleton it can paste into the cat image before the AI stops believing it’s a cat.

It loops through every pixel, swaps in the skeleton's pixel, and counts the result as the new best whenever the swap doesn't hurt the AI's confidence that it's still a cat. It's not trying to preserve the best-looking cat image; it's trying to see how far we can push it before the AI notices.
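The function above calls `count_changed_pixels`, but its body isn't shown in the post. Here is one plausible sketch, assuming a pixel counts as "changed" if any of its color channels differs; the real helper may define it differently.

```python
import numpy as np


def count_changed_pixels(original: np.ndarray, modified: np.ndarray) -> int:
    """Count pixels that differ in at least one channel (assumed definition)."""
    if original.shape != modified.shape:
        raise ValueError("images must have the same shape")
    # Compare channel by channel, then collapse the channel axis per pixel.
    return int(np.any(original != modified, axis=-1).sum())


base = np.zeros((4, 4, 3), dtype=np.uint8)
edited = base.copy()
edited[0, 0] = (255, 0, 0)   # one channel changed
edited[2, 3] = (10, 20, 30)  # all three channels changed

print(count_changed_pixels(base, edited))  # 2
```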

This is the "best" cat it found:

resized_cat_1

The model is 27.64% sure this is a cat, which is pretty fantastic since we started at 25.38%. That's almost a 9% relative improvement in the score, which is pretty impressive, especially given how little of a cat is actually left at this point. I guess all we need is paws and whiskers.
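Since "almost 9%" is a relative figure rather than a jump of nine points, here is the arithmetic spelled out with the two scores from the post (the variable names are just for illustration):

```python
start_score = 0.2538  # resized cat before any swaps
best_score = 0.2764   # "best" cat after the first round

absolute_gain = best_score - start_score
relative_gain = absolute_gain / start_score

print(f"absolute gain: {absolute_gain * 100:.2f} percentage points")  # 2.26
print(f"relative gain: {relative_gain * 100:.1f}%")                   # 8.9%
```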

It is really interesting that it is not obviously a skeleton at this point:

a photo of a cat: 0.2764
a computer-generated image: 0.2755
a corrupted JPEG: 0.2637
a distorted photo: 0.2542
a blurry photo: 0.2537

Skeleton is one of the lowest categories, coming in at 21%.

We can do even better, though. After a second round this is 28.5% definitely a cat. After the third round it looked 28.65% like a cat. I was hoping at this point that if we hoped really hard we could hit 33%, but that's a climb. After the fourth round it only made it to 28.9%. Unfortunately, cat 5 didn't manage much: she only went up 0.08 points, which left her at 28.99% like a cat. Cat 6 was at 29.14%, so it did more than the previous cat, but not a ton.

I thought it would be interesting to check how much of the mask we can get if we only keep improvements. So I did that too; I wanted to see how much more like a cat I could make it look.

better_resized_cat_1

I want to note here, these are ONLY changes that make it look more like a cat. Replacing the entire center of the image made it more similar to a cat.
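The code for this improvements-only run isn't in the post. A minimal sketch of the idea, assuming the only difference is that a swap is undone unless it strictly raises the score, might look like this. `greedy_overlay`, `score_fn`, and the toy `brightness` scorer are all hypothetical stand-ins; the real run would score against the model's "a photo of a cat" label.

```python
import numpy as np


def greedy_overlay(base: np.ndarray, overlay: np.ndarray, score_fn):
    """Copy overlay pixels into base one at a time, keeping only strict improvements."""
    current = base.copy()
    best = score_fn(current)
    height, width = current.shape[:2]
    for i in range(height):
        for j in range(width):
            if np.array_equal(current[i, j], overlay[i, j]):
                continue  # nothing to swap
            previous = current[i, j].copy()
            current[i, j] = overlay[i, j]
            new = score_fn(current)
            if new > best:
                best = new                 # keep the swap
            else:
                current[i, j] = previous   # undo it
    return current, best


def brightness(img: np.ndarray) -> float:
    """Toy scorer: mean pixel value. Stands in for the model's cat score."""
    return float(img.mean())


base = np.full((2, 2, 3), 100, dtype=np.uint8)
overlay = np.array(
    [[[200, 200, 200], [50, 50, 50]],
     [[100, 100, 100], [255, 255, 255]]],
    dtype=np.uint8,
)

result, score = greedy_overlay(base, overlay, brightness)
print(result[..., 0])
print(score)
```

With the toy scorer, the two brighter overlay pixels are kept, the darker one is rejected, and the identical one is skipped.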

What's weird about this one is that putting a skull in the middle of the cat is apparently helpful? I guess then it's not standing up anymore?

For the improvements-only version, the first round scored 27.64% and the second round scored 28.52%, which was basically the same as the version keeping all changes that weren't negative. In the third round it started to pull ahead of the other version, making it to 29%. In the fourth round it made it to 29.44%; it may be our great hope to hit 30%. In the fifth round it made it to 29.57%. We might be slowing down.
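To put a number on the slowdown, here is a quick sketch that computes the round-over-round gains from the scores quoted so far. The list names are made up, and since the post rounds some scores, the gains can be off by a hundredth of a point from the ones mentioned in the text.

```python
# Scores in percent, as quoted in the post (rounded).
keep_non_negative = [27.64, 28.50, 28.65, 28.90, 28.99, 29.14]  # rounds 1-6
only_improvements = [27.64, 28.52, 29.00, 29.44, 29.57]         # rounds 1-5


def round_gains(scores):
    """Gain in percentage points from each round to the next."""
    return [round(later - earlier, 2) for earlier, later in zip(scores, scores[1:])]


print("keep non-negative:", round_gains(keep_non_negative))
print("only improvements:", round_gains(only_improvements))
```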

We can basically keep making small gains from here, but it takes absolutely forever. So, are you ready to see round 6, which is 29.6% almost definitely a cat?

better_resized_cat_6

Awwww look at its little ears!

What have we learned here? That skulls really bring out the cattiness of my cat.

I did run a final round of categorization, and it is worth noting that cat is no longer winning on either image. It's in second place.

For the highly masked image we have:

a computer-generated image: 0.2939
a photo of a cat: 0.2919
a corrupted JPEG: 0.2818
a distorted photo: 0.2754
a highly pixelated image: 0.2735
...
a photo of a skeleton: 0.2455

For the attempt to only improve the image:

a computer-generated image: 0.3028
a photo of a cat: 0.2960
a corrupted JPEG: 0.2940
a highly pixelated image: 0.2870
a distorted photo: 0.2800
...
a photo of a skeleton: 0.2476

I mean, it's not not a computer-generated image, so good job, AI?

I am quite surprised. I actually expected it to mostly steal the background of the skeleton image since it's a lot less busy and sort of block out everything but the cat. I wasn't prepared for it to just abandon her entire tummy.

#ai #image recognition