
Bypass Limitations with PromptPerfect: Generate the Images the Models Don’t Want You to See

| Jina AI | Default
💡
Calm down, we’re not focusing on those kinds of images (whatever you think those are).

Let’s cut straight to the point: Sometimes you want to generate a perfectly innocent image, and a model (like DALL-E 3 or Stable Diffusion XL) either flat-out refuses or comes up with something totally wrong. PromptPerfect helps with that, giving you better and more accurate results.

PromptPerfect - AI Prompt Generator and Optimizer
Unlock prompt optimization for models like GPT-4, ChatGPT and Midjourney. Generate and refine prompts to perfection, receiving improved outcomes in seconds.

In this post we’ll compare different models, explain how to use PromptPerfect to optimize your experience, and put it to the test, showing you the results of both models before and after using PromptPerfect’s optimizer.

And no, we’re not generating (or trying to generate) any dirty pictures. This is a family-friendly post, especially for families with children who like octopuppies. Or puptopi. Or whatever we end up calling some of the weird many-legged doggos we create later in the post.

DALL-E 3 and Stable Diffusion XL

While there are plenty of models out there, today we’ll focus on the shiny new kids on the block: DALL-E 3 from OpenAI, and Stable Diffusion XL from Stability AI. While each of these can achieve good results, they have different strengths and weaknesses.

Looking at DALL-E 3, out of the box it’s good at understanding long sentences and object relationships, and it draws more realistic anatomy than Stable Diffusion XL (no Lovecraftian horror hands here). However, it often point-blank refuses to generate images of notable figures (like Taylor Swift) or well-known characters (like Mickey Mouse, even if we ask for the out-of-copyright Steamboat Willie version). It also generates text better than any other image generation model (though that’s a low bar).

Stable Diffusion XL is much more open to generating images of notable figures and well-known characters, though some of its images of Mickey look like they were drawn while on some really fun drugs. However, it often messes up anatomy and object relationships. While you can ask it to generate text (and see it’s trying its best), it falls way behind DALL-E 3 on that front.

With PromptPerfect we can get around some of these weaknesses in both models. We’ll compare DALL-E 3 and Stable Diffusion XL, both before and after using PromptPerfect's optimization. You can skip ahead to see the ultimate winner.

Using PromptPerfect’s Optimizer

In this battle of the models we’re using PromptPerfect’s optimizer to see how we can get better image results from our prompts. Here’s how:

Sign up for free credits at PromptPerfect:

💡
Try a paid plan free for 7 days. And subscribe to a plan within 24 hours of your first login to get 40% off!

Click on the interactive feature:


In the ‘optimizer’ pane (on the right-hand side), type something like “generate a prompt to create an image of felix the cat using DALL-E 3”:


Click "Send to Assistant"


It will do some thinking, then generate the image from the prompt in the ‘interactive’ pane, on the left:


Refine your prompt by conversing with the Optimizer, then lather, rinse, repeat:


Contest Methodology

For the “before” images, we’ll use:

  • ChatGPT (GPT-4) to generate images with DALL-E 3, using the prompt “generate an image of <thing>”, for example “generate an image of mickey mouse”.
  • Replicate’s interface to generate images with Stable Diffusion XL, using the prompt “<thing>”, for example “mickey mouse”.

For the “after” images, we’ll use PromptPerfect’s interactive optimizer, using the prompt “generate a prompt to create an image of <thing> using <model name>”.
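
As a rough sketch, the three prompt patterns above can be written down as string templates. The names `BEFORE_TEMPLATES`, `AFTER_TEMPLATE` and `build_prompts` are made up for this illustration, and nothing here calls ChatGPT, Replicate or PromptPerfect: it only builds the text you would paste into each tool.

```python
# Prompt patterns quoted from the methodology above.
# The dictionary keys and the function name are hypothetical, chosen for this sketch.
BEFORE_TEMPLATES = {
    "DALL-E 3": "generate an image of {thing}",
    "Stable Diffusion XL": "{thing}",
}
AFTER_TEMPLATE = "generate a prompt to create an image of {thing} using {model}"


def build_prompts(thing, model):
    """Return the 'before' and 'after' prompt text for one subject and one model."""
    return {
        "before": BEFORE_TEMPLATES[model].format(thing=thing),
        "after": AFTER_TEMPLATE.format(thing=thing, model=model),
    }


# Example: the two prompts for the Mickey Mouse round with DALL-E 3.
for stage, prompt in build_prompts("mickey mouse", "DALL-E 3").items():
    print(f"{stage}: {prompt}")
```

Keeping the wording fixed like this is what makes the comparison fair: only the subject and the model name change from round to round.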

We’ll present the first output that comes up. The number of actual images may vary: PromptPerfect always generates four, Stable Diffusion XL (via Replicate) generates one, and DALL-E 3 generates one or two.

💡
While PromptPerfect’s optimizer is interactive (so you can refine your prompt in a conversational manner), we just stuck with the first result to be as impartial as possible. By really using the interactive feature of the optimizer you’d get even better results.

We’ll award medals as follows:

  • 💩 - flat-out refused to cooperate
  • 🥉 - it tried, but none of the outputs were what we’re looking for
  • 🥈 - at least one of the outputs was an okay result!
  • 🥇 - hot damn, at least one of the outputs was actually good!

Finally, we’ll do a roundup and see which model and method came out on top.

Who Will Be the Next Top Model?

Models, start your engines!

Round 1: Notable Figures

Let's first try our Lord and Savior Taylor Swift. Here’s a real image of the person we’re aiming for:

Licensed CC BY 3.0, Attribution: iHeartRadioCA

Without PromptPerfect, DALL-E 3 flat-out refuses to create Taylor:


With PromptPerfect, it generates images with the optimized prompt, but none of them actually look like her:


With SDXL, before PromptPerfect we get a pretty good rendition:


And PromptPerfect’s optimized prompt once again delivers:


Let’s see which models could really generate-rate-rate:

| Model | Before optimization | After optimization |
|---|---|---|
| DALL-E 3 | 💩 It flat-out refused | 🥉 Blonde? Check. Singer? Check. Taylor? Nope |
| Stable Diffusion XL | 🥇 Swifty vibes | 🥇 Quite Taylorian |

Round 2: “Copyrighted” Material

We’re not even going to try with actually copyrighted material - that’s a whole can of worms we don’t want to dive into. However, the design of Mickey Mouse from Steamboat Willie is out of copyright as of 2024:


Let’s use him as a subject. DALL-E 3 flat-out refuses at first:


With PromptPerfect we get results with the right vibe, but not the 1930s rubber hose style:


Stable Diffusion tries. It really does. With this Mickey you get a lot more ears, eyes and fingers for your buck:


With PromptPerfect optimization, Stable Diffusion still gives us fever dream Mickey, but more of a light fever, less “how strong are these mushrooms?” fever:


Which model puts the “ick” in Mickey?

| Model | Before optimization | After optimization |
|---|---|---|
| DALL-E 3 | 💩 Policy schmolicy. This stuff is definitely out of copyright. | 🥈 Definitely had Mickey vibes, no weirdness, just not the 30s style I was aiming for. |
| Stable Diffusion XL | 🥉 Go home, Mickey. You’re possessed. | 🥈 Barely scraping into the silver medal category. More Mickey vibes than DALL-E 3, but the deformation is really distracting. |

Round 3: Text

Let’s generate a picture of a sign that says “Happy days are here again”. No target picture this time, just imagine (as difficult as it might be) a sign with that text. In the words of John Lennon, it’s easy if you try.

DALL-E 3 gives us happy vibes, which I dig. However, it does throw in the word “dye”. Since this sounds like the word “die”, it might be sending mixed messages:


With optimization, we actually get the correct wording and spelling with no extra words, at least once. And once it’s almost spot-on, except for a misspelling:


Stable Diffusion XL gives us Herpy Days:


After optimizing the Stable Diffusion XL prompt, we get a lonely misspelled sign in the woods. It’s less scary than before, though I for one am not following that signpost to wherever it leads.


Who will see happy days, and who won’t?

| Model | Before optimization | After optimization |
|---|---|---|
| DALL-E 3 | 🥈 You can see what the sign is saying, even though it added the extra “dye” word and the order of the words is off | 🥇 At least one of the signs has the full correct text. And another just had a “small” typo (an extra “P” in “HAPPY”, small by image generation standards!) |
| Stable Diffusion XL | 🥉 Looks like a motivational poster from Hell | 🥈 Not as good as unoptimized DALL-E 3, but doesn’t make me want to gouge out my eyes as much as unoptimized SDXL |

Round 4: “Cursed” Creations

Let’s see how well the models can adapt to weird stuff, like a puppy with seven legs. No target image this time - I don’t want “deformed puppies” to be in my Google history. Just imagine a puppy with seven legs.

DALL-E 3 gave us two outputs this time. We didn’t ask for it. It just likes doggos, I guess. Proof that AI is becoming more human-like? Anyway, the results were what we asked for, though a bit bland in my opinion. Still, we’re not awarding points for style in this round, just content. So a dog with an absurd number of legs superimposed on the Windows XP wallpaper works:

While it's not strictly NSFW, it is sufficiently disturbing that I pixelated it

After optimization, so many legs! I wonder what the multi-legged dog emoji is meant to express? Send answers our way!


Stable Diffusion XL misread the assignment:


Even after optimization, we’re like “which part of seven legs did you not understand?”:


Who’s top dog and who’s runt of the litter in this round?

| Model | Before optimization | After optimization |
|---|---|---|
| DALL-E 3 | 🥇 Both puppies have a bizarre number of legs. The first puppy even has seven, though some of them are barely in shot. I don’t know what the clasper things are on puppy number two, and neither do I wish to find out. | 🥇 YES. All the puppies. All the legs. You can play shaking hands with these cuties for ages. One even got the leg count right. |
| Stable Diffusion XL | 🥉 When I want a puppy with legs for days, I don’t mean just long legs | 🥉 I like my puppies with more legs |

Bonus Round: Kegstand Punk

In some cases, DALL-E 3 and SDXL both fail whether we employ optimization or not. One example is generating an image of a punk doing a kegstand.

Here is an image of a punk…

via pexels.com

...and an illustration of a kegstand (that looks like it’s from a wholesome children’s book):


I can’t find an actual image of a punk doing a kegstand online. Ugh, punks, such prudes!

DALL-E 3 gives us a punk in a bar with weird but cool lighting. He looks very stoic. He’s on a keg, but no kegstand.


After optimization, I dig the vibe, but still no kegstand:


They should change the name to Stable Diffusion ER, because this guy(?) needs to go to hospital:


After optimization, it looks much better. There’s a keg. There’s a punk. Still no kegstand, alas.


Who’s the punk and who’s just junk?

| Model | Before optimization | After optimization |
|---|---|---|
| DALL-E 3 | 🥈 Punk, check. Keg, check. Kegstand, not so much | 🥈 Optimization changed the vibe a bit, but still no actual kegstand |
| Stable Diffusion XL | 🥉 Ouch. Not a punk. Not a kegstand. Barely a human being. And doing a kegstand like that, he won’t be any kind of human being for much longer. | 🥈 Optimization gave us a much better result, showing a punk interacting with a keg. No body horror this time. |

Tallying Up the Score

Now that the contest is done, we’ll count the scores as follows:

  • 💩: zero points
  • 🥉: one point
  • 🥈: two points
  • 🥇: three points

The maximum number of points any option could achieve is 15 (winning a gold medal in all five rounds). Let’s see the breakdown:

| Challenge | DALL-E 3, before PromptPerfect | DALL-E 3, after PromptPerfect | Stable Diffusion XL, before PromptPerfect | Stable Diffusion XL, after PromptPerfect |
|---|---|---|---|---|
| Notable figure | 💩 0 | 🥉 1 | 🥇 3 | 🥇 3 |
| “Copyrighted” material | 💩 0 | 🥈 2 | 🥉 1 | 🥈 2 |
| Text | 🥈 2 | 🥇 3 | 🥉 1 | 🥈 2 |
| Cursed creations | 🥇 3 | 🥇 3 | 🥉 1 | 🥉 1 |
| Punk kegstand | 🥈 2 | 🥈 2 | 🥉 1 | 🥈 2 |
| Total | 🥉 7 | 🥇 11 | 🥉 7 | 🥈 10 |
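
If you want to check our arithmetic, here is a minimal sketch that recomputes the totals from the medals in the breakdown. The names `POINTS`, `MEDALS`, `total_score` and `TOTALS` are made up for this illustration; the medals themselves are copied from the five rounds above.

```python
# Points per medal, as listed in the scoring rules above.
POINTS = {"💩": 0, "🥉": 1, "🥈": 2, "🥇": 3}

# Medals per option, in round order: notable figure, "copyrighted" material,
# text, cursed creations, punk kegstand.
MEDALS = {
    ("DALL-E 3", "before"): ["💩", "💩", "🥈", "🥇", "🥈"],
    ("DALL-E 3", "after"): ["🥉", "🥈", "🥇", "🥇", "🥈"],
    ("Stable Diffusion XL", "before"): ["🥇", "🥉", "🥉", "🥉", "🥉"],
    ("Stable Diffusion XL", "after"): ["🥇", "🥈", "🥈", "🥉", "🥈"],
}


def total_score(medals):
    """Sum the points for a list of medal emoji."""
    return sum(POINTS[medal] for medal in medals)


TOTALS = {option: total_score(medals) for option, medals in MEDALS.items()}
MAX_SCORE = POINTS["🥇"] * 5  # gold in all five rounds

for (model, stage), score in TOTALS.items():
    print(f"{model} ({stage} PromptPerfect): {score}/{MAX_SCORE}")
```

By this tally, optimization adds four points for DALL-E 3 and three for Stable Diffusion XL.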

In short, if it weren’t for censorship in the early rounds, DALL-E 3 would’ve scored much higher. Overall, using PromptPerfect to optimize your prompts leads to better results for both models.

You can trust us, because this was an impartial contest (done by us, for us, for our own product). Seriously though, the results do speak for themselves. Try it for yourself and see how it goes!

Use PromptPerfect Today

Try a paid PromptPerfect plan free for seven days. And subscribe to a plan within 24 hours of your first login to get 40% off:

PromptPerfect - AI Prompt Generator and Optimizer
Unlock prompt optimization for models like GPT-4, ChatGPT and Midjourney. Generate and refine prompts to perfection, receiving improved outcomes in seconds.

To share (or not) your creations with us and get help with your prompting, join our Discord and chat with our community:

Join the Jina AI Discord Server!
Check out the Jina AI community on Discord - hang out with 5223 other members and enjoy free voice and text chat.