Feb 23, 2023

Age Changer

Sci Fi fandom

In my previous blog post, I accomplished my first successful "inpainting", a digital correction of part of an AI-generated image. Because I'm a fan of The X-Files, I happened to be using an image with characters from The X-Files. Sometimes while "collaborating" with an artificial intelligence system, something unexpected happens. In this case, suddenly I was imagining Dana Scully and Monica Reyes in their childhood years, as science fiction fans.

The AI-generated image shown to the right was made by "Mr. Wombo", A.K.A. the WOMBO Dream cloud-based CLIP-guided image generating software system. I used this text prompt: "show a magazine cover of an orange space alien, Roswell magazine cover, two teenage science fiction fans, show Dana Scully at age 14, show Monica Reyes at age 12, The X-Files". Yes, I also did some post-processing with Photoshop, but the point is, the software can guesstimate how to change a person's image as appropriate for different ages. It is fun to imagine Dana Scully as a budding scientist and science fiction fan.

First Reality
I've long been frustrated by the apparent lack of photographs of Isaac Asimov as a young man, particularly color images. Shown to the left is what I got using this text prompt: "photograph of Isaac Asimov" with the WOMBO Dream "Realistic v2" style.

Figure 1. now in color
I next asked for a color image of Asimov: "color photograph of Isaac Asimov".

In my view, neither the gray-scale image nor that first color image (Figure 1) really looks much like Asimov.

In any case, I tried to get a more interesting image with this text prompt: "color photograph of Isaac Asimov, Asimov as a young man, when Asimov was thirty years old in Brooklyn, Asimov beside pulp magazines inside a candy store background" (see Figure 2, below). As usual, I did some work with Photoshop, in this case to insert an imaginary pulp science fiction magazine into the image.

Figure 2. Time travel Asimov

Figure 2 illustrates a common problem with the Stable Diffusion system for computerized image generation. The neural network was trained on 512x512 pixel images, so the tall images used by Mr. Wombo often split into two stacked images. Thus, in Figure 2, there are two stacked Asimovs. However, this is a good thing, because in the Exode Saga, I like to imagine that Isaac Asimov received infites from a positronic robot. That robot was involved in what came to be known as the Roswell New Mexico "UFO crash".

That encounter with the robot eventually led to Asimov being recruited for a time travel mission. As a consequence of that mission into the past, there were two copies of Asimov on Earth. I like to imagine that in Figure 2, the time-traveling Asimov was able to provide his earlier self with a copy of an as-yet-unpublished edition of Astounding magazine.

Figure 3. Robyn Asimov

For Figure 3, I used this text prompt: "color photograph of Isaac Asimov and his daughter Robyn, a young Asimov holding a science fiction book, Issac at age thirty, show Robyn at age twelve, book signing background, inside a bookstore". I imagine this scene in an alternate Reality where a Barclay Shaw hardcover edition of The End of Eternity was published in the 1960s. I did not ask Mr. Wombo to include a third individual in the image, but maybe this dude with the gray hair is Asimov's father.

I have no idea if Robyn Asimov looked anything like this when she was 12 years old. In the Exode Saga, the time-traveling Asimov is able to use nanites to alter his appearance and he takes the place of magazine editor John Campbell after Campbell dies in an accident. However, here in the Final Reality, Campbell lived longer and Asimov never had a chance back in the 1940s to travel through time.

Figure 4. Three Sci Fi fans

I asked Mr. Wombo to make an image depicting Asimov, Campbell and a young fan. As shown in Figure 4, I also inserted a copy of a magazine called The Atlan Intervention.

I think Mr. Wombo is handicapped by a bias towards the many images of Asimov scraped from the internet that showed him in his later years. Thus, Mr. Wombo struggles to depict a younger Asimov. In the case of Campbell, for whom there are very few digitized color images on the interwebs, Mr. Wombo simply can't make a reasonable depiction of his facial appearance.

Figure 5. age progression (tool)
I took one of the faked images of Isaac Asimov that was generated by Mr. Wombo and processed it through an age progression routine. None of the resulting images really looks like Asimov (Figure 5). It will be interesting to see how much these software algorithms can improve in the future.

Figure 6. Asimov age changer
I also fed a real image of Asimov into the age progression tool and got the results shown in Figure 6. Again, this image processing software fails to capture the "essence of Asimov".

Figure 7. Asimov in a bookstore
One of the better AI-generated depictions of Asimov that I got from Mr. Wombo is shown in Figure 7. There is actually a hint of Asimov's real appearance in this image.

Tom Hanks pulped
The more famous the person is, the more of their images went into the Stable Diffusion data set and the better Mr. Wombo can do depicting them. I tried: "color photographic portrait  of actor Tom Hanks, Tom Hanks at age thirty, Tom Hanks pulp science fiction magazine cover illustration, a furry blue space alien, NASA background, Ed Emshwiller style". I get so tired of Mr. Wombo making images of people who just stand there like a statue. 

The Tom Hanks episode
What if Tom Hanks was in an episode of The X-Files? Text Prompt: "color photograph of actor Tom Hanks with Dana Scully, The X-Files biology research, green alien head in a jar, Tom Hanks in a laboratory with Dana Scully, Ed Emshwiller style". 

Figure 8. Dana and Tom image

The image to the right had some post-processing with Photoshop, but the basic scene was generated by Mr. Wombo. I had to repair a serious problem with Tom's shoulder and then I fed that back to Mr. Wombo as a reference image and got Figure 8. Actually, Figure 8 is a composite of three different images since Mr. Wombo never made three reasonable alien parts at the same time; "color photograph of actor Tom Hanks with Dana Scully, The X-Files biology research, green alien heads floating inside glass jars, Tom Hanks in a laboratory with Dana Scully, The X-Files style".

time travel machine
I'm not particularly interested in Tom Hanks, I just assumed that there would be excellent coverage of his facial features in images that were used to train the Stable Diffusion system. In order to depict one of my own story characters such as Tyhry, I think I need to select someone who is fairly well captured in the Stable Diffusion data set, and use them as a template for Tyhry. 

Sometimes it seems like Mr. Wombo has a sense of humor. I asked for a time travel machine and got the image shown to the right. Maybe if you travel through time you sprout an extra torso. I'm trying to use Margot Robbie as a template for Tyhry.

Hair nanites experiment.

I asked Mr. Wombo for an image with Tyhry inside a nanotechnology machine and with Marda at the controls.

Manny, Tyhry, Zeta and Marda

Mr. Wombo seems to have a preference for including more than just two people in an image. I don't mind a scene in which Manny is trying to advise Marda and Tyhry on the best methods of using advanced technology. The image to the right might illustrate an upgrade to the Reality Simulation System interface, providing Tyhry and Marda with technology for viewing simulations.

Marda and Tyhry on vacation

I'm still working on illustrations for Tyhry and Marda for the time during their vacation in Alastor Cluster. The image to the left began with a reference image that was a tropical jungle and did not include any water, but Mr. Wombo started doing backgrounds that were all aerial views of tropical islands.

by Mr. Wombo
One of the mysterious alien artifacts imagined by Jack Vance was Monument Cliff on Xi Puppis X. I challenged Mr. Wombo to generate an image of such an artifact with this text prompt: "a tropical island with a stone mountain, carved face on a rocky island, giant carved figure, carved cliff face, distant view of the island, evening light, like Mount Rushmore".

Tyhry and Marda visit Xi Puppis X.
I imagine that Tyhry and Marda might use the Reality Simulation System to visit Xi Puppis X.

For the image to the left, I spliced together two separately generated images; 1) an image with Tyhry and Marda and 2) an image of the stone face. 

Generated by the WOMBO "Flora v2 style"

As usual, the "Flora v2 style" for WOMBO Dream produced a unique version of a tropical island carved with a face.  

Figure 9. tall aliens?
I asked Mr. Wombo to generate an image showing two women on a beach (see Figure 9).

alternate version of Tyhry and
Marda at Monument Cliff

Mr. Wombo then generated: "photograph of women on a beach seen from behind, holding hands, tropical island background, sculpted faces on cliffs" (see the image to the right).

I'm imagining that Tyhry and Marda visit Xi Puppis X in an attempt to obtain nanites of alien design. I revived a previous attempt to depict nanites.

Fig 10. nanites assembling cloth
I tried to determine to what extent Mr. Wombo can deal with depictions of sub-microscopic structures like a fabric that is composed of nanites. 

I can imagine that the image shown to the left is during the assembly of a garment from nanite components.

hair nanites

Of particular importance to Tyhry and Marda are hair nanites, which turn out to have unexpected value for the mind clones. The image to the right depicts hair nanites self-assembling to form artificial hair fibers.

applications of medical nanites
Game Changer. The hair nanites that Tyhry and Marda obtain in Alastor Cluster turn out to also contain built-in programs for treating some medical problems.

The image to the left was generated by Mr. Wombo by means of several iterations with: "photograph inside a futuristic pediatric hospital, closeup view of a pediatric nurse, pulp science fiction cover image, young children". Without being asked, Mr. Wombo provided a child with one leg and then I added the text: Limb Regeneration.

Big Bird
As usual, the "Flora v2" style of Mr. Wombo provided a unique perspective (see the image to the right) on a children's hospital scene. I hope that medical nanites don't transform small boys into two-headed birds.

Queen of Hearts
With the help of Mr. Wombo, two-headed birds do transform into the Queen of Hearts (see the image to the left).

In the Exode Saga, some of the telepathic mind clones (particularly Brak) can access information in the Sedron Time Stream. This allows for "visions" of the future. I'm not sure how to depict this kind of information exchange in an illustration.

The Sedron Time Stream
When I was making networks of nanites, suddenly a face appeared in the cloud of nanites. After several iterations of Photoshop and Mr. Wombo's efforts, I arrived at the image shown to the right. I'm still uncertain in my own thinking about how anyone could sort through all of the available data in the Sedron Time Stream.

teleportation

 Teleportation. I've long struggled with depicting teleportation in a static image. I asked Mr. Wombo for an image with: "photograph of a science fiction woman using teleportation device, physics laboratory background, show complex particle physics equipment in the background". In the image shown to the left, I'm not sure what the big fishbowl is for.

Marda using a teleporter
The figure to the right shows an alternative teleportation scene devised by Mr. Wombo. "photograph of a woman under blue light, with her skin radiating blue light, physics laboratory background".

Marda "in" teleportation beam.
After a lot of work, I was able to obtain the image shown to the left. I could not find the "magic word" that would make Mr. Wombo put Marda inside a glowing teleportation beam.

Marda being teleported.
Eventually, when one's frustration with Mr. Wombo grows, it is time to either 1) use Photoshop to create the desired effect or 2) start over and simplify things for Mr. Wombo. It is pretty clear that Mr. Wombo has no good data for what teleportation might look like. 

Starting fresh, rather than ask for a depiction of teleportation, I simply asked Mr. Wombo to put Marda inside a cylinder of light. This was a better approach, but eventually I needed Photoshop so I could combine the cylinder with the correct background (see the image to the right).

Next: Manny on vacation.

visit the Gallery of Movies, Book and Magazine Covers


Feb 21, 2023

What are the Rules?

Figure 1. OpenArt community
For the past month (start here), I've been experimenting with a cloud-based CLIP-guided image generating system (WOMBO Dream) as a tool for helping me make illustrations for my science fiction stories. Here in this blog post, I'm going to get serious about trying to discover the rules for making good text prompts!

The image to the right was generated with this text prompt: "This prompt book is brought to you by OpenArt, a platform and community dedicated to AI-native content and written by members from the community". The WOMBO Dream training set of images seems to have been very rich in absurd robot images. I've learned that a request for an image depicting a space alien is likely to result in something like Figure 1. Also, some WOMBO Dream-generated images randomly include text. What if I want to remove that text?

image source
Inpainting. My first goal was to find the secret of AI-based "inpainting": the removal/correction of specific parts of images. The interface for the OpenArt Stable Diffusion cloud-based system (image to the left) explicitly includes a "Negative Prompt" field.

image processing
I tried to get the OpenArt Stable Diffusion software to selectively remove the words (text) from Figure 1.

Sadly, as seems to be the industry standard, there are no instructions for how to use the "Negative Prompt" feature.

Figure 2. inpainting fail

 Figure 2 shows the image that I got when I tried to alter the image shown above in Figure 1. Clearly, I need help in figuring out how to efficiently modify AI-generated images either by means of a "Negative Prompt" or by "inpainting".

 The Rules. There are content rules for the OpenArt community. All images are supposed to be "rated G".

 Rule #1. As for sensible rules for making good text prompts, I suppose Rule #1 is: earlier items in a text prompt are supposed to be given greater priority than items that come towards the end.

Figure 3.
To test this idea of priority being given to the linear order of items in a text prompt, I used: "science fiction, a red humanoid alien from outer space, a woman in the left side of the scene, they are looking at each other, handshake, exoplanet background" (see Figure 3). I was now using WOMBO Dream as my Stable Diffusion user interface. I'll anthropomorphize this software by calling it "Mr. Wombo".

handshake
Based on Figure 3, you might wonder just how much the Stable Diffusion database was loaded up with sample images of handshakes. The image to the right was generated by: "people shake hands, handshake, a man, a woman, they are looking at each other". 

Figure 4.

 All Aliens Club. I changed the order of the words for my robot handshake scene to: "science fiction, handshake, a red humanoid alien from outer space, a woman in the left side of the scene, they are looking at each other, exoplanet background" and got my alien handshake (Figure 4). The poor woman was left out of it.
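This kind of reordering test is easy to repeat systematically. Here is a minimal Python sketch, assuming a text prompt is just a comma-separated list of items. The helper name `move_item` is my own invention for illustration and is not part of WOMBO Dream or OpenArt; the sketch only rewrites the prompt text and does not generate any images.

```python
def move_item(prompt: str, item: str, new_index: int) -> str:
    """Return the prompt with one comma-separated item moved to a new position."""
    items = [part.strip() for part in prompt.split(",")]
    items.remove(item)  # raises ValueError if the item is not in the prompt
    items.insert(new_index, item)
    return ", ".join(items)


original = (
    "science fiction, a red humanoid alien from outer space, "
    "a woman in the left side of the scene, they are looking at each other, "
    "handshake, exoplanet background"
)

# Move "handshake" from near the end to just after "science fiction".
reordered = move_item(original, "handshake", 1)
print(reordered)
```

Keeping the other items in their original order means that any change in the resulting image can be blamed on the one item that moved.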

Figure 5. blond woman

I tried to re-emphasize the woman by adding more information about her: "science fiction, handshake, a red humanoid alien from outer space, a woman in the left side of the scene, they are looking at each other, the woman has long blond hair, exoplanet background". 

I've previously concluded that this software does not know left from right and the "prompt book" says nothing about using the terms "left" and "right" in text prompts.

I was impressed by the proliferation of people in Figure 5. This image was made by the pre-defined WOMBO Dream Flora v2 style.

not HDR
 Modifiers. There is a large collection of image "modifiers" here.

The "modifiers" HDR, UHD and 64K seemed interesting based on what I read in the OpenArt prompt book. However, I suspect that "modifiers" such as these are already included in WOMBO Dream "styles" such as "Realistic v2".

Working with the "Soft Touch" style of WOMBO Dream, I used the "Two women, Dana Scully, Monica Reyes, The X-Files, alien biology research laboratory, Monica in a white lab coat, Monica has black hair, Dana wears a blue jumpsuit, alien body parts" (see the image to the left).

with HDR
I then tried, "HDR, UHD, 64K, Two women, Dana Scully, Monica Reyes, The X-Files, alien biology research laboratory, Monica in a white lab coat, Monica has black hair, Dana wears a blue jumpsuit, alien body parts, alien body parts", but I see no difference (image to the right).

Figure 6. OpenArt not HDR
I tried this same comparison at OpenArt Stable Diffusion and got similar results (Figure 6).

setting used for Figure 6
The settings used for Figure 6 are shown to the right.

The image generated with "HDR, UHD, 64K" in the prompt is shown in Figure 7.

Figure 7. OpenArt with HDR.
Again, I don't see a difference between Figure 6 and Figure 7. Since the instructions suck, I'm going to guess that "HDR, UHD, 64K" is an attempt to specify a greater weight on training images that were tagged as high resolution images.

Alternatively, there might only be one trained neural network, and "HDR, UHD, 64K" is an attempt to modify the "diffusion" process and bias it towards generating higher resolution images.

Figure 8. 50 steps
I tried increasing the number of steps for image production and that made no difference either (see Figure 8).

Another possibility is that since the Stable Diffusion algorithm was trained on 512 x 512 pixel images, the "HDR, UHD, 64K" might only affect larger generated images.
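One way to keep such comparisons honest is to change only one thing at a time. Below is a minimal Python sketch of that bookkeeping. The field names (`prompt`, `seed`, `steps`) and the seed value are assumptions of mine for illustration; they are not taken from the OpenArt or WOMBO Dream interfaces, and the sketch only builds the settings for two runs, it does not generate images.

```python
MODIFIERS = ["HDR", "UHD", "64K"]

BASE_PROMPT = (
    "Two women, Dana Scully, Monica Reyes, The X-Files, "
    "alien biology research laboratory, Monica in a white lab coat, "
    "Monica has black hair, Dana wears a blue jumpsuit, alien body parts"
)


def with_modifiers(prompt: str, modifiers: list[str]) -> str:
    """Put the modifiers at the start of the prompt."""
    return ", ".join(modifiers + [prompt])


def changed_keys(a: dict, b: dict) -> list[str]:
    """List the settings that differ between two runs."""
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))


# Hypothetical settings: the seed is a placeholder, the step count is borrowed from Figure 8.
plain = {"prompt": BASE_PROMPT, "seed": 1, "steps": 50}
modified = {**plain, "prompt": with_modifiers(BASE_PROMPT, MODIFIERS)}

print(modified["prompt"])
print(changed_keys(plain, modified))
```

If the list of changed settings contains anything besides the prompt, the two images are not a fair test of the modifiers.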

Figure 9. full sized image here
By playing around with Mr. Wombo and using Photoshop, I was able to generate the image shown in Figure 9.

Figure 10. HDR, 50 steps
I fed the Figure 9 image into the OpenArt system and got Figure 10. This was using their default of 75% for the "strength" of my reference image.

gang of three
The image to the left shows Mr. Wombo's preference for groups of three, not two. This image was made with the "Buliojourney v2" style. It was generated by text prompt and without a reference image.

I'm tempted to invent a story for the image to the left. I don't know who the third woman is, but maybe they are simply waiting for a stall in the hospital's restroom. A more interesting story line would be if they were waiting for the results of an alien DNA analysis.

I then put the Figure 9 image into "Buliojourney v2" style as a reference image.

original text prompt
add "alien heads" to start
With the original text prompt, I got the image to the left. After including an additional "alien heads" at the start of the text prompt, I got the image to the right.

Sometimes I wonder if WOMBO Dream devotes more processing steps to a job if you keep repeating it.

Mr. Wombo has three levels of dependency on reference images: weak, normal and strong. Examples for "weak" and "strong" are shown below.

weak image-to-image
strong image-to-image
The image to the left really emphasized the "alien heads" part of the text prompt. The image to the right fairly accurately reproduced the reference image (Figure 9).

Below, in its entirety, are the "instructions" for "inpainting" that are in the OpenArt prompt book. I put "instructions" in quotes, because they don't really explain how to do inpainting with their software. This is the industry standard for crappy instructions.

"instructions" for "inpainting" that are in the OpenArt prompt book.
Zombie Leyla Harrison?
transparent
One of the images from Mr. Wombo with three people, not two, is shown to the left. I tried this with Mr. Wombo and a text prompt saying: "Dana Scully, Monica Reyes, The X-Files, alien biology research laboratory, Monica in a white lab coat, Monica has black hair, Dana wears a blue jumpsuit".

I could not figure out how to get Mr. Wombo to do this inpainting. However, working with the OpenArt interface and putting "inpaint and do not show a third person" in for the "Negative Prompt" and using the offending part of the image made transparent, I got the result in Figure 11.

Figure 11. gone Leyla
This little bit of inpainting resulted in the longest lab coat in history, but it basically worked.
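I don't know what OpenArt does internally with the transparent region. My assumption is that the fully transparent pixels act as a mask that marks the area to be repainted, while everything else is kept. Here is a minimal sketch of that idea in plain Python, using a tiny made-up image of RGBA tuples; the function name and the sample pixels are hypothetical.

```python
def alpha_to_mask(pixels):
    """Return a grid with 1 where a pixel is fully transparent, else 0."""
    return [[1 if alpha == 0 else 0 for (_r, _g, _b, alpha) in row] for row in pixels]


OPAQUE = (200, 180, 160, 255)  # a pixel to keep
ERASED = (0, 0, 0, 0)          # a pixel made transparent in an image editor

image = [
    [OPAQUE, OPAQUE, ERASED],
    [OPAQUE, ERASED, ERASED],
]

mask = alpha_to_mask(image)
print(mask)  # [[0, 0, 1], [0, 1, 1]]
```

Under this assumption, the irregular "Leyla area" becomes the block of 1s, and only that block would be handed to the software for repainting.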

setting for inpaint example
The OpenArt interface settings that I used for this example of inpainting are shown to the right. 

I changed the "seed" from 17 to 23 and got a somewhat different scene (Figure 12, below), now with a return of black hair and with Dana the redhead closer to the camera, blocking our view of the lower parts of Monica.

Figure 12. seed changed to 23

In Figure 12, we don't have to look at an amazingly long lab coat. 
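The seed matters because Stable Diffusion starts each image from pseudo-random noise, and the seed fixes that starting noise. Here is a small illustration using Python's standard `random` module as a stand-in. This is not the generator that OpenArt uses, just the same idea, and the function name is my own.

```python
import random


def starting_noise(seed: int, count: int = 4) -> list[float]:
    """Draw a few Gaussian noise values from a generator fixed by the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


same_a = starting_noise(17)
same_b = starting_noise(17)
other = starting_noise(23)

print(same_a == same_b)  # True: same seed, same starting noise
print(same_a == other)   # False: a different seed gives different noise
```

So with every other setting held constant, going from seed 17 to seed 23 should be expected to give a different scene, while repeating seed 17 should give the same starting point.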

redhead cloning project
I could not figure out how to get Mr. Wombo to do inpainting, but I did get the image to the right in which the irregular transparent "Leyla area" was replaced by a white rectangle. I edited the AI-generated image to add some text to the white area.

I guess Mr. Wombo really likes redheads, even when I specify "black hair".

Next: changing the ages of characters
