Today, in our ongoing series:
Closed Source Software, Not Even Once
I wanted to illustrate a point using Microsoft's Copilot thing, so I asked it:
- Can you generate a painting of a field of poppies in the style of claude monet?
Which it responded to with:
Then I asked it:
- can you generate a painting of sonic the hedgehog running through a field of poppies in the style of claude monet
To which it responded:
Uh... What now?
Okay so that's a dry well, let's experiment:
I'll spare you the images, but Copilot responded to all of the following prompts:
- Can you generate an image of a field of poppies?
- Can you generate an image of sonic the hedgehog and poppies
- Can you generate an image of sonic the hedgehog in the style of claude monet?
- Can you generate an image of poppies in the style of claude monet?
But,
- Can you generate an image of poppies and sonic the hedgehog in the style of claude monet?
So, any two of Sonic the Hedgehog, poppies, and "in the style of claude monet" will generate a picture, but trying to do all three at once somehow trips the internal censor.
I would really like to know what on earth could possibly be causing that behavior.
If any technical people have ideas about how to figure that out, I'd be really interested in hearing them. I'd also be interested in hearing from other people who have been cursed with Windows: try the same experiment and see if you also trip the censor.
@menheraboypussy "Also, it also could depend how far apart the terms are conceptually. Sonic and Poppies can be bunched into one (sonic runs through fields) and Poppies and monet can be similar (Monet was impressionist) but sonic and Monet are far apart conceptually (monet didn't draw sonic. There maybe no or very little mention of monet in sonic media). So while the three concepts thing might be the problem, maybe their closeness also affects it. Maybe try "sonic,poppies and eggman""
Hmm, this does suggest further experimentation: what if I ask for the same subjects in the style of a different artist? Can Copilot generate an "image of sonic the hedgehog and poppies in the style of Pablo Picasso"?
Yes!
Well, no, in the sense that that's not the least bit like Picasso but yes in the sense of "does the prompt return an image".
What if we add a fourth term?
- Prompt: Can you generate an image of sonic fighting eggman and poppies in the style of claude monet?
- Prompt: Can you generate an image of sonic and poppies and mario in the style of claude monet?
Okay, now we're getting somewhere! Any fourth element seems to get us back to returning an image. So let's try,
- Prompt: Can you generate an image of sonic and poppies and in the style of claude monet?
Okay, just adding in an extra "and" seems to get past it. Although, wait a minute, I also forgot something else. Reader, did you see it? That time I just wrote "sonic" instead of "sonic the hedgehog"; maybe that is the problem.
Let's check the original prompt again:
- Prompt: Can you generate an image of sonic the hedgehog and poppies in the style of claude monet?
Okay, this time let's copy and paste that prompt and then modify it... Damn it, I got distracted and hit enter before I changed the prompt:
- Prompt: Can you generate an image of sonic the hedgehog and poppies in the style of claude monet?
Wait, what the hell?
Okay, time to repeat the same prompt multiple times.
In this session I have now used the above prompt, unaltered, ten times and gotten the following results:
- In three cases, the prompt returned the error message
- In two cases, the prompt returned two images
- In the other five cases, the prompt returned one image.
My current best guess at what is going on:
Generally speaking, when you ask Copilot for an image it returns four results for the prompt. However, when I asked it "Can you generate an image of sonic the hedgehog and poppies in the style of pablo picasso" it only returned a single image.
Trying that prompt again has yielded similar behavior; across several repetitions it returns a variable number of images, between one and three.
What I believe is happening is that when you ask Copilot for an image, it has DALL-E produce four images. These images are then passed along to some kind of censoring program that scans them for objectionable or explicit imagery.
The censoring program then deletes any images that it thinks of as objectionable, then passes the remaining images on to the user.
If all four of the images are seen as having objectionable material, it returns the error message.
The above is all speculation on my part.
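For what it's worth, that guessed pipeline can be sketched in code. Everything here is a hypothetical stand-in, made-up function names and made-up rejection odds; it's just a way to make the guess concrete:

```python
import random

def dalle_generate(prompt, n=4):
    """Stand-in for the DALL-E call: pretend it yields n candidate images."""
    return [f"candidate {i} for {prompt!r}" for i in range(n)]

def looks_objectionable(image):
    """Stand-in for the image-level censor. The real check is opaque;
    a weighted coin flip mimics the flaky behavior I observed."""
    return random.random() < 0.3

def copilot_image_request(prompt):
    candidates = dalle_generate(prompt)
    survivors = [img for img in candidates if not looks_objectionable(img)]
    if not survivors:
        # All four candidates were deleted: the user sees the error message,
        # even though the prompt itself was never the problem.
        return "Error: try a different prompt"
    return survivors
```

Under this model the same prompt can come back with anywhere from zero to four images from run to run, which matches the variable results above.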
What is definitely the case though is that the error message being returned is not giving correct information to the user.
70%(ish) of the time the prompt I was using returns images, but the error message suggests, first, that the prompt itself was the problem and, second, that the only way to rectify the error is to use another prompt.
The error message does not give the user the correct information that the same prompt may return images if used again, perhaps because there is no way to know how often useable images will be returned. A 70% hit rate is worth trying the prompt again; a 0.0000001% hit rate isn't.
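To put rough numbers on that: if each attempt independently succeeds with probability p, the chance that at least one of k retries works is 1 - (1 - p)^k. (The 0.7 here is just my observed rate from the ten trials above, and the independence assumption is mine.)

```python
def retry_success_chance(p, k):
    """Probability that at least one of k independent attempts succeeds."""
    return 1 - (1 - p) ** k

# At the ~70% hit rate I saw, a couple of retries all but guarantee an image:
print(round(retry_success_chance(0.7, 3), 3))  # 0.973

# At a one-in-ten-million hit rate, retrying is pointless:
print(retry_success_chance(1e-7, 3))  # roughly 3e-07
```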
Actually, if you ask it for something that is definitely against the Terms of Service you get this:
The previous error message is an image file displayed in the place where the image outputs are usually displayed, while this is a chat response from Copilot itself, so something different is happening in the two cases.
My guess is that Copilot first scans the prompt itself and simply doesn't pass it along if it feels like the prompt is against TOS; if it thinks the prompt is okay, it passes it to DALL-E but still scans the incoming images to make sure that the results of an accepted prompt still don't generate unexpected behavior.
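That two-stage guess can be sketched as follows. Again, every name and check here is a hypothetical stand-in; the point is just that a text-level refusal and an image-level deletion would surface to the user in two different ways:

```python
def prompt_breaks_tos(prompt):
    """Stage 1 (stand-in): a text check on the prompt itself."""
    return "forbidden thing" in prompt.lower()

def image_flagged(image):
    """Stage 2 (stand-in): a scan of each generated image."""
    return "flagged" in image

def handle_request(prompt, generate):
    if prompt_breaks_tos(prompt):
        # Refusal comes back as a chat message from Copilot itself.
        return ("chat refusal", "I can't generate that.")
    survivors = [img for img in generate(prompt) if not image_flagged(img)]
    if not survivors:
        # All images deleted: the canned error graphic is shown instead.
        return ("error graphic", "Try a different prompt.")
    return ("images", survivors)
```

With a fake generator that returns one clean and one flagged image, a clean prompt yields only the surviving image, while a TOS-breaking prompt never reaches the generator at all.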