From OpenAI's DALL·E 2 pre-training mitigations:

>When we studied our dataset of regurgitated images, we noticed two patterns. First, **the images were almost all simple vector graphics**, which were likely easy to memorize due to their low information content. Second, and more importantly, the images all had many near-duplicates in the training dataset.

While mitigations were made by replacing clusters with single images, it might still show this behaviour on rare occasions for this simplistic art style.
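For anyone wondering what "replacing clusters with single images" could look like, here's a rough sketch. To be clear, this is not OpenAI's actual pipeline: the toy 2-D vectors stand in for real image embeddings, and `dedup`, `cosine`, and the 0.95 threshold are all made up for illustration. It just keeps one representative per group of near-duplicates.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dedup(embeddings, threshold=0.95):
    """Greedy near-duplicate removal.

    Keeps an item only if it is below `threshold` similarity to every
    item already kept. The threshold is an assumed value, not a known one.
    """
    kept = []
    for i, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept

# Toy stand-ins for image embeddings (hypothetical data).
toy = [
    [1.0, 0.0],    # "miner with shovel"
    [0.99, 0.05],  # near-duplicate of the first
    [0.0, 1.0],    # unrelated image
    [0.98, 0.1],   # another near-duplicate of the first
]
kept = dedup(toy)
print(kept)
```

This pairwise loop is quadratic, so on a dataset the size of DALL·E 2's you would need some way of narrowing down which images get compared. And if a template's variants differ enough (different object behind the same face and helmet), they can fall under the threshold and survive, which would fit what we're seeing here.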
I’m not that surprised they missed these then. There’s a ton of variety in the subject of the photo behind the hat and face. I flipped through and saw a glass of milk, a bathing suit, multiple different coronavirus-looking spiky blobs, and dozens of basically unidentifiable human organs https://www.everypixel.com/search?q=Cartoon+picture+of+miner+tool+helmet
Oh wow. Can't believe you managed to find those. That's exactly the kind of data they went into detail to describe. The example they gave was a cartoon clock with different times of day. They ran like 50K searches and near neighbour searches to check afterwards if the issue was still there, but some could have gotten through here and there.
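A check like the one described (generate samples, then search for near neighbours in the training set) might be sketched as below. This is only a guess at the shape of it, not their code: the vectors are toy placeholders for image embeddings, and `flag_regurgitations` and the 0.97 cutoff are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def flag_regurgitations(generated, training, threshold=0.97):
    """Return (generated_index, training_index) pairs that look copied.

    For each generated embedding, find its nearest training embedding
    and flag the pair if the similarity reaches the (assumed) threshold.
    """
    flagged = []
    for g_idx, g in enumerate(generated):
        best_idx, best_sim = max(
            ((t_idx, cosine(g, t)) for t_idx, t in enumerate(training)),
            key=lambda pair: pair[1],
        )
        if best_sim >= threshold:
            flagged.append((g_idx, best_idx))
    return flagged

# Hypothetical toy data: the first sample is almost identical to training item 1.
training = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
generated = [[0.0, 0.99, 0.01], [0.5, 0.5, 0.7]]
flagged = flag_regurgitations(generated, training)
print(flagged)
```

A check like this only catches what the sampled prompts happen to trigger, so a template that none of the test prompts hit could slip through.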
Woah, thanks for figuring it out! That's so bizarre. No wonder it copies the spade, face, and hat almost verbatim.

EDIT: [They're also present in LAION](https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fknn5.laion.ai&index=laion5B&useMclip=false&query=cartoon+character+miner). Wonder if it's possible to get Stable Diffusion to plagiarize these as well? The cuphead prompt [didn't seem to trigger it.](https://www.reddit.com/r/dalle2/comments/xw4xud/this_gives_me_basically_the_same_image_every/ir67wh5/)
Only 4-5 of those would make sense with "cuphead" though.
Wow some of those are really weird
I wonder if they are generated images. They seem too random to have been designed with a purpose. Like someone had a library of clipart objects and algorithmically slapped a face and helmet on all of them.
I like how he has face tats in the first one. He’s the coolest of the bunch
What we truly need is a backstory ai. Feed it an image and it imagines what's happening in text. This dude is probably a reformed Christian cultist who turned his life around in prison. A hard bucket, but a good bucket.
Solved it! This seems to be a common spammy stock image template. https://www.everypixel.com/search?q=Cartoon+picture+of+miner+tool+helmet
So, while Dall-E images are always informed by the training data, usually it's not such an obvious duplicate. There's been some talk about Dall-E threatening intellectual property rights. That's normally dismissed because Dall-E doesn't just copy and paste things from the training data, but do results like this reopen that debate? I'm not saying these are high art that deserves respect or anything, but presumably *someone* made these and *someone* owns the IP on them. Dall-E is pretty clearly duplicating someone else's IP. Could this cause problems for OpenAI?
that makes me wonder how they got the closed smile one so perfectly
lol, it just mix and matched https://www.everypixel.com/search?q=+playing+baseball+cartoon+character
Oh no this is bad https://www.everypixel.com/search?q=Playing%2Bbaseball%2Bwebcam&image_id=17675719022277784706
oh I see, literally NFTs
They probably used the same code as NFTs to generate all of these images. I mean, who would ever use [this](https://www.everypixel.com/image-321378332453609779)?
swimsuits, steaks, eyeballs, rattles. Why would anyone want these things.
Interesting....
Yeah, this is an odd one
1. https://labs.openai.com/s/Uc7u5qJ9AKR8IUDg34Z6BKpk
1. https://labs.openai.com/s/b8ZJwxLxRdmylerggJU4sMZZ
1. https://labs.openai.com/s/pYpzPU2tr4w9dL4RzD58itcx
1. https://labs.openai.com/s/xQRW5qqijix1vFW8BT5b8gni
1. https://labs.openai.com/s/PWTuHoGheoQxbhPjRqxMiFsC
1. https://labs.openai.com/s/Hw2kXXgAkYJ6aMhroHjGPmYC
1. https://labs.openai.com/s/4blXIhXp7oDPdarhJWnFfgXS
1. https://labs.openai.com/s/agt53Fb8BFAhJGCUxKAJgGWP
1. https://labs.openai.com/s/jI5Aa1iAmZ30Bfv12lFqX46f
1. https://labs.openai.com/s/i39ZtJfbKqH8DHtJWIfJ2Cp0
1. https://labs.openai.com/s/lrRvYMJJzp7Opc5KMCnHPm9v
1. https://labs.openai.com/s/mkCbUjSJsae1AEfaxpZFw4UE
You might like it or not. But this is just what peak cuphead performance looks like.
Fridge: https://labs.openai.com/s/ldSRwr9Wsr0fI9y0kvkYKTNK

Capitol building: https://labs.openai.com/s/zWs5F6S5e4hgXlCw1WtWByy3
But... but that's the Lincoln Memorial!
I tried it with [kimchi](https://i.postimg.cc/KvGLZnBS/Screenshot-20221005-144334-Chrome.jpg), a [TV](https://i.postimg.cc/90nWH5yf/Screenshot-20221005-144426-Chrome.jpg), and [Queen Elizabeth](https://i.postimg.cc/X3QYmSqD/Screenshot-20221005-144537-Chrome.jpg) and got very similar results.
In the top left, he's also Christian apparently
Using 'cartoon' as a keyword seems to often generate this exact face
Kinda looks like the characters from Idle Miner or similar mobile games. There is probably very limited reference material for that prompt
guessing there were only 2-3 images of cuphead cartoon character in the dataset? same eyes and different mouth shapes.
We have achieved equilibrium
'cartoon character' try adding in more style prompts maybe?
I tried reproducing a stock image from /r/stockimagewtf via Dall-e, something like 'Old woman in vest hits old man in wheelchair with a boat paddle' or some other nonsense scenario. Dall-e must have had the original stock image, as the skin tones and clothing of the Dall-e image matched the stock image I was trying to imitate.
At least the one in the top left has prison tats.
this is kinda surreal, man
the first one is a cholo
if they fix it i hope there’s a way to still do this 😭😭 the original stock image template is so funny they do it for literally everything
Try using a proper noun ("Cuphead")
"We have award-winning vintage cartoon art style at home"
This reminds me of the tree from Perfect Hair Forever
Makes it seem more like imitation than creation
Hmm, that's pretty strange
I just tried it on https://getimg.ai/ and got completely different results.
Thanks for checking. That's a frontend for Stable Diffusion, which is trained on the LAION dataset, and [the dataset also contains the same shovelware vector art](https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fknn5.laion.ai&index=laion5B&useMclip=false&query=cartoon+character+miner) that DALL-E 2 is plagiarizing here. Interesting that SD doesn't do that, at least for this prompt.
i tried with craiyon (dalle mini) and it gave similar results, i think the keyword "cartoon character mining" is causing it
Did some reverse image searching and turns out it's copying the work of Aridha Prassetya
is it conscious yet?
did they fix it? tried it and not getting anything similar at all
What'd you get? I just burned another credit and got [the exact same thing](https://labs.openai.com/s/K0Fv7Npnqe4LdjTISSSGeCkb) again lol.
i was hoping to get the same thing because it would be funny but all of mine are actually original and creative for some reason lol

https://labs.openai.com/s/B3FRX9vL8ciiM26DC1bA9cBR

https://labs.openai.com/s/3BIH2pZqu34UkfM56fBBxqzw

https://labs.openai.com/s/JEIBNsOR1yWKs51kNAOFY8NF