Hacker News
Stable Diffusion and ControlNet: “Hidden” Text (see thumbnail vs. full image) (reddit.com)
101 points by b0ner_t0ner on July 23, 2023 | 37 comments


The user is using the ControlNet model: https://huggingface.co/monster-labs/control_v1p_sd15_qrcode_...

I'm so happy someone has actually trained a QR code ControlNet model so we can generate cool QR codes. The previous methods were pretty bad and didn't reach the quality of the images produced by that random guy who made them originally (without sharing details about how the ControlNet model was trained).


How do you use that? I'm a noob and the readme on that page just mentions it's a "controlnet for SD"... do I merge this with the SD weights or something?



Between this and the QR code thing, AI really shines at making images that have patterns but look natural. Honestly some of the coolest uses I have seen of AI image generation.


I wonder what equivalents might exist in other mediums... it's not that text can't have this sort of complexity, we just don't have ControlNet for LLMs. The codegen folks are working on schema enforcement, which has similar motivations.


It can definitely work in text; visually a short word like "the" at the end of a line and again at the start of the next can be missed, or the way you can mix up the spelling of words and still be comirhpebnesle so long as the start and end are correct, or ASCII art limited to real words and sentences; conceptually with rhetorical language painting emotional states without necessarily making a single claim, and I'm sure all of us here have at some point had the experience of someone getting indignant about something we neither wrote nor even intended to imply.

In audio: Yanny/Laurel, the barber-pole effect, use of instruments to fake voices.


> Make a poem that when reading the first character of each lines reads "HELP ME"

    Hear the whisper of despair, faint and low,
    Eclipsed by shadows, where the dark fears grow.
    Lost in a maze, with paths unclear,
    Please guide me out, bring me near.

    Moonlight's touch, so distant, cold,
    Every step heavy, yet the story unfolds.
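
A quick way to check an acrostic like the one above is to pull out the first character of each non-blank line. This is a minimal Python sketch; `acrostic` is a hypothetical helper name, and the poem text is copied from the comment.

```python
def acrostic(poem: str) -> str:
    """Return the first character of every non-blank line, joined together."""
    return "".join(
        line.strip()[0] for line in poem.splitlines() if line.strip()
    )


poem = """
Hear the whisper of despair, faint and low,
Eclipsed by shadows, where the dark fears grow.
Lost in a maze, with paths unclear,
Please guide me out, bring me near.

Moonlight's touch, so distant, cold,
Every step heavy, yet the story unfolds.
"""

# The target phrase has a space, so compare with spaces removed.
target = "HELP ME".replace(" ", "")
matches = acrostic(poem) == target
print(acrostic(poem), matches)
```

The stanza break stands in for the space between the two words, which is why the comparison drops spaces from the target.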


I feel like I'm missing something given the comments here and on reddit. Do people really not see the text in the enlarged image? (Even enlarging extra) A few are a tad less clear but nothing is unreadable. I'd be extremely impressed if people were not able to read Rio, Istanbul, or London. Is there some collective unspoken agreement going on, am I just an outlier, or am I not uncommon and just no one is commenting?


If you have poor eyesight then you will not get the full effect. Taking my glasses off makes the enlarged New York significantly more visible.

Edit: They are definitely obscured to different degrees, and there may be a learned aspect to it.


I don't think this is true, at least for my case. I have 20/10 and 20/13 vision in my eyes, so far from poor.


Viewing both the thumbnail and the zoomed image on a phone, I am either looking at a thumbnail too low-resolution to see, e.g., the buildings that make up the text, or zoomed in so far that I am looking at a single building and have to scroll to see its part of the text.

Are you on a desktop or super screen or something?


Desktop (MacBook Air fwiw). I've even tried zooming in and out to get intermediate values. They don't disappear till after I'm past full screen, and for some of them not ever (I mean the whole word falls off the screen but the negative space is clearly an identifiable letter). Amsterdam and Rome are the ones where I can lose text the soonest.


The only thing I didn't immediately see was Rome, which I honestly wouldn't have spotted unless I saw the thumbnail first. This is on a mobile phone with less-than-perfect eyesight.


So some are better than others.

For me, I absolutely cannot see the text in the "Rome" image when full sized.

What is even crazier is that I can see the text of the full sized image if I just hold my phone further away (say half a meter) from my face.


The enlarged version is more readable if you have a small screen or poor eyesight


I have the same confusion as yourself. The text is readable at any size for me.


Yeah I guess we're freaks lol. But it was really surreal given how convincing it is to others, that we're having a very different interpretation. Fwiw, if I zoom in on Amsterdam and Rome I can get them to disappear, although they still stand out in the scene, just become less legible.



Does this work on MacBook Pro M1 Max?


I believe Stable Diffusion can work on the M1, see https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki...


I think this is exploiting image resizing algorithms, which are inaccurate because no matter how much better anyone should know, they never do the math in linear-light space so it always gets bright areas wrong.

Every Photoshop filter does this wrong too.
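
To make the linear-light claim concrete, here is a minimal Python sketch, assuming the standard sRGB transfer function and a plain box average (real resizers use fancier kernels). The helper names are made up for illustration and do not come from any particular library. It averages a black/white checker pattern both ways.

```python
def srgb_to_linear(c8: int) -> float:
    """Decode an 8-bit sRGB value (0-255) to linear light in [0, 1]."""
    c = c8 / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin: float) -> int:
    """Encode linear light in [0, 1] back to an 8-bit sRGB value."""
    if lin <= 0.0031308:
        c = lin * 12.92
    else:
        c = 1.055 * lin ** (1 / 2.4) - 0.055
    return round(c * 255)


def average_naive(pixels: list[int]) -> int:
    """Average the encoded values directly (the shortcut being criticized)."""
    return round(sum(pixels) / len(pixels))


def average_linear(pixels: list[int]) -> int:
    """Decode to linear light, average there, then re-encode."""
    mean = sum(srgb_to_linear(p) for p in pixels) / len(pixels)
    return linear_to_srgb(mean)


# A fine black/white checker pattern collapsing into a single output pixel.
checker = [0, 255, 0, 255]
naive = average_naive(checker)
linear = average_linear(checker)
print(naive, linear)
```

The naive average lands on mid-gray 128, while the linear-light average comes out noticeably brighter (around 188), which is the kind of brightness error the comment describes. Whether that difference has anything to do with the hidden-text effect is a separate question.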


I think it's more that the human eye is much more contrast sensitive when looking at low detail images, as highlighted in the Einstein/Monroe illusion https://psychologicalscience.blog.gustavus.edu/2022/05/10/ei...


I don't think it takes doing anything wrong for diffuse spatial patterns to become well-defined when you substantially compress the space. What's dispersed across quite a few pixels in the full-size is now condensed into an easily discerned sharper edge of pixels.

If I defocus my vision, kind of like when looking at those old "MagicEye" 3D-like images, I can see the words pretty well in the full-size...

In the full-size there's a bunch of high-frequency noise (like building windows) interfering, but that's necessarily lost in the down-scaling, and now the low-frequency information forming the name is clearly visible.
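
The low-pass argument above can be sketched in a few lines of Python. This is a toy 1-D model, not what any browser or image viewer actually runs: `box_downscale` is a hypothetical helper, and the numbers are invented so that a small low-frequency step (the "letter") is buried under a large alternating pattern (the "windows").

```python
def box_downscale(signal: list[float], factor: int) -> list[float]:
    """Shrink a 1-D signal by averaging each run of `factor` samples."""
    usable = len(signal) - len(signal) % factor  # drop any ragged tail
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# Low-frequency content: a 20-level brightness step halfway along.
coarse = [100] * 32 + [120] * 32

# High-frequency content: +40 / -40 alternating every pixel.
detail = [40 if i % 2 == 0 else -40 for i in range(64)]

# Full-size row: neighbouring pixels swing by 80, hiding the 20-level step.
full_size = [c + d for c, d in zip(coarse, detail)]

# Thumbnail row: each output pixel averages 8 inputs, so the detail cancels.
thumbnail = box_downscale(full_size, 8)
print(full_size[:4], full_size[32:36])
print(thumbnail)
```

In the full-size row the values jump between 140 and 60 (then 160 and 80), so the step is hard to pick out. In the thumbnail the alternating detail averages to zero and only the clean step from 100 to 120 remains, which matches the squinting and stepping-back observations in the thread.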


You can squint and see the letters, so I’m assuming this isn’t it.


You can also take a few steps back from the screen; that works too.


> which are inaccurate because no matter how much better anyone should know, they never do the math in linear-light space so it always gets bright areas wrong.

I don't think it's that nobody knows, just that it doesn't matter too much and you can resize things much more efficiently with the incorrect algorithm (especially JPEGs).

I'm also doubtful that this is responsible for the effect.


I vaguely remember seeing effects that exploit resizing algorithms, but I do not think that is what is going on here. I can view the image full size on a high resolution monitor, walk to the other side of the room, and see the text clearly. It is also visible if I squint.


The content doesn't have high contrast, so no, that's not it.


Off-topic, but did anyone else find the broken back button super annoying?


Couldn't get back to thumbnail view on iPhone - tried like 5-6 times...


There's a tiny "Back to grid view" link. Maybe you're seeing a mobile version that doesn't have that link?

You should view this on a desktop screen, though. On an iPhone you'll probably see the text even when maximized.


This is super fascinating - the effect worked seamlessly for me, not visible when the images are large but clear enough once they're far away/I squint/zoom far out. My husband, on the other hand, could see the text constantly; if he could see the image at all, he could see the text. Worth noting that he has some pretty serious glasses and it's been a while since he last had his vision checked, so he needs another trip soon. Fascinating.


As a hobbyist artist, way before Stable Diffusion was a thing, back in 2014 (I think?), I made a similar illusion that can only be read by looking at the thumbnail or squinting your eyes!

>https://tamim.io/random_shares/thumbnail_illusion_tamimi.jpg

Can you read it!?


unieue? In other words... I don't think so


Close enough, it is unique!


I can easily see the text on an iPhone 14 Pro Max after enlarging, but it seems from the comments that some can't.

Is this an optical illusion situation or a blue/gold dress?


PSA: if you have bad eyesight, you see everything a little bit blurred all the time, so the effect won't work well (or at all) on you.



