Jan 11, 2025

Portrait of a Wysken as a Young Alien

Figure 1. Generated by Whisk.
 Using a hint from the "Google Labs" Discord server, I finally got Whisk to generate an image with a portrait aspect ratio (Figure 1, to the right). Previously, when I selected "portrait" or "square" from the Whisk aspect menu, nothing ever happened. What seemed to work for me was to first enter my subject image (Figure 2) and scene image (Figure 3) and then use the aspect menu (shown in Figure 5, below) to first select "square" and then select "portrait".

Figure 2. Glass subjects.
For the "subject" image that was uploaded to Whisk, I used an image that was generated by Mr. Wombo and used in a blog post called "3 Problems" in June of 2024 (Figure 2). Here is the Whisk-generated description of that image: "Three translucent, female-appearing figures stand in a shallow pool of light blue liquid. The figures are sculpted to appear anatomically correct, with visible musculature and skeletal structures. They are rendered in shades of light and dark blue, with hints of purple in the central figure. The liquid appears to be splashing around their feet and legs. The figure on the left has a wave of the same liquid forming a head of hair around her head. The central figure has a tall, slender, clear glass structure atop her head. The figure on the right has a wider, shorter, clear glass structure atop her head. All three figures have similar facial features, with pale skin tones and neutral expressions. Their hair is sculpted as part of the liquid effect. A clear glass beaker is partially submerged in the pool of liquid in front of the figures. The liquid in the beaker is the same light blue as the pool. In the background, a blurred-out laboratory setting is visible, with metal structures and glass containers. The overall lighting is cool and somewhat dim."

Figure 3. Fractal scene.

For the "scene" image that was uploaded to Whisk, I used an old image from 2020 (Figure 3). Note: there are some "floating golden metal shards" towards the bottom of Figure 3. I instructed Whisk to create a storyboard using these two "subject" and "scene" images and I included this text instruction: "Three transparent female humanoid aliens are preforming an anti-gravity experiment with floating golden metal shards".

Generated by Mr. Wombo.
Here is how Whisk described the "scene" image (Figure 3): "A digital fractal image depicting a complex, three-dimensional structure resembling an organic, floral, or coral-like formation. The image is dominated by a multitude of circular and ovular shapes, varying in size and color. These shapes are densely clustered together, creating a sense of depth and texture. The color palette is rich and varied, with shades of orange, yellow, green, blue, pink, and purple prominently featured. Many of the shapes have a textured, almost spiky or hairy appearance, giving the overall image a sense of movement and energy. There are areas of smoother, more flowing forms, particularly in the lower central portion of the image, which contrast with the more textured areas. These smoother areas are predominantly pink, purple, and blue, and have a pattern that resembles stylized floral or paisley designs. There are also elements that resemble golden, sharp, angular shapes scattered throughout the image, particularly in the lower portion. The overall lighting suggests a soft, diffused light source, with subtle highlights and shadows that enhance the three-dimensionality of the structure. The image has a slightly aged or textured appearance, as if it were painted on a slightly rough surface." I was pleased that Whisk noticed the "golden, sharp, angular shapes".

For the style I requested: "The overall style has a high degree of photo-realism, with a focus on rich colors and intricate details. The colors are vibrant and saturated. The image is sharp and clear, with a high level of detail visible in the subjects and the scope of the scene. The overall mood is futuristic evoking a spirit of adventure and wonder. The subjects are depicted photo-realistically. The level of detail is consistent throughout, creating a cohesive and immersive scene"

SSS9
Whisk changed the style to, "The image is a digital painting, rendered in a style reminiscent of concept art for science fiction video games or films.  The color palette is predominantly warm, utilizing earthy tones of browns, tans, and muted greens, contrasted by the cooler blues and grays of the sky and distant landscape.  The lighting is soft and diffused, creating a sense of depth and atmosphere without harsh shadows.  The overall rendering style is detailed but not photorealistic, with a slightly painterly quality to the textures and surfaces.  The level of detail is consistent throughout, with a focus on creating a believable and immersive environment.  The style leans towards a slightly stylized realism, avoiding overly sharp lines or overly smooth surfaces," and used the subject, scene and style to generate the image that is shown in Figure 1. The style image is shown both in the third panel of the image to the left and in Figure 4, below. I believe there is currently a bug in Whisk that places the Gemini-generated text description of the storyboard image into the text description of the style image, replacing the user-provided style description. Once again, I was dismayed by Whisk's endless efforts to include guns in the "style" images that it generates (Figure 4).

Figure 4. Whisk-generated style image.

Figure 5. The Whisk user interface displaying an image with a portrait aspect ratio.

Generated by Mr. Wombo.

As shown in Figure 5, although Whisk did generate an image with a portrait aspect ratio, it displayed that image inside a larger plain white rectangle with landscape proportions. Also shown in Figure 5 is the additional text prompt that I provided to Whisk, suggesting that the generated storyboard image depict, "The transparent humanoid aliens and an anti-gravity experiment with floating golden metal shards".

The image to the left began with a Whisk-generated "transparent woman" image that was then used as a reference image for WOMBO Dream. Both Whisk and Mr. Wombo have a hard time making transparent faces.

I forced Whisk to return to my requested "photorealism style", but then Whisk began using the landscape aspect ratio again (see Figure 6).

Figure 6. Defaulted back to landscape aspect ratio.
Generated by Mr. Wombo.

Here is the Whisk-generated description of Figure 6: "Photorealistic image, vibrant and saturated colors. Three transparent, female-appearing humanoid aliens, with pale skin tones and neutral expressions, perform an anti-gravity experiment within a complex, three-dimensional fractal structure resembling organic coral. The structure is predominantly orange, yellow, green, blue, pink, and purple, with textured, spiky, and hairy shapes densely clustered. Smoother, flowing pink, purple, and blue areas with stylized floral patterns are present. Golden, sharp, angular shapes are scattered throughout. The aliens stand in a pool of light blue liquid, splashing around their feet and legs. The alien on the left has a wave of blue liquid forming hair; the central figure has a tall, clear glass structure atop her head; the right figure has a shorter, wider glass structure. Floating golden metal shards surround them, defying gravity. The lighting is soft and diffused, highlighting the three-dimensionality of the fractal structure. The overall mood is futuristic, evoking adventure and wonder. High level of detail is consistent throughout."

When I look at Whisk-generated images such as the one shown in Figure 6, I wonder what the AI is attempting to do while illustrating "humanoid aliens". Maybe these aliens can "perform an anti-gravity experiment" while having their arms hang at their sides. One of my greatest problems with AI-generated images is an endless struggle to get images of human (or alien) figures who are doing something interesting rather than just standing around like statues.

Figure 7. Reaching out for the gold.
Generated by Mr. Wombo.


I tried setting the Whisk aspect ratio back to "portrait" and provided Whisk with this additional text added to the prompt: "The transparent female humanoid alien is reaching out and grabbing on to one of the floating golden metal objects." The resulting image is shown in Figure 7. The aspect ratio remained stuck in the landscape orientation.

I was pleased that the "alien" figure on the right side of Figure 7 now seemed to be reaching up for one of the golden crystals, but Whisk was no longer paying any attention to the idea of illustrating an anti-gravity experiment. 😔

I instructed Whisk to alter the image in Figure 7 so as to make the humanoid aliens look more alien. Whisk then generated the image that is shown in Figure 8.

Figure 8. Whisk-generated humanoid aliens.
Generated by Mr. Wombo.

After seeing Figure 8, I decided to try a different experiment using four subject images like the one that is shown to the right combined with another of my old fractal scenes from 2020. The resulting Whisk-generated storyboard is shown in Figure 9.
Figure 9.

Generated by Mr. Wombo.

Whisk went ahead and put fairly conventional space helmets on the subjects (Figure 9). I wish that Whisk had found a way to make it look like the human figures were emerging from a river of molten lava. 

I then asked Whisk to alter the image so as to make the human figures float like ghosts and Whisk generated the image that is shown in Figure 10.

Figure 10. Floating through the tunnel.

Figure 10. Pest control.
Generated by Mr. Wombo.
I had Whisk generate an alternative storyboard where the floating aliens had laser guns and I asked for a depiction of laser beams having been fired down the tunnel (see Figure 10, above). In the original Whisk-generated version of Figure 10 there were two small human figures near the bottom of the image and it looked like the blue aliens were using their laser beams to attack the humans. I had Mr. Wombo generate images of a total of four humans that I pasted into the scene, one corresponding to each of the four blue aliens.

SSS10
Now, with those small human figures on the floor of the tunnel, I wanted to add something near the roof of the tunnel. For this new "subject", I used the bio-mechanical object that is in the top panel of the image to the left. I used Figure 10 as the new "scene" image for Whisk. Whisk generated a new storyboard, which is shown in Figure 11.

Figure 11.  A flying red drone.
Turning the Table. I used WOMBO Dream to modify another Whisk-generated storyboard image that was similar to the one shown in Figure 11. Mr. Wombo made more precise and detailed depictions of the human figures and of the floating machine that is seen against the bright light at the far end of the tunnel. The resulting Whisk-generated image is shown below in Figure 12.
Figure 12. The drones are defending the humans.
Generated by Mr. Wombo.

I think Whisk got carried away when giving the blue aliens what look like clubs (Figure 12, above), but maybe that makes for a more dramatic scene, with the mechanical drones still able to protect the humans.

When I instructed Whisk to, "Make the blue floating female transparent aliens look more alien," Whisk generated the new storyboard image that is shown in Figure 13.

Figure 13.
Unlike what Whisk did for Figure 8 (above), the blue aliens in Figure 13 do not look significantly more alien than those in Figure 12.
SSS11

Whisk finally allowed the dark human figures to hold objects and defend themselves against the blue aliens. I have no idea why Whisk changed the floating red drone of Figure 12 into a crawling bug for Figure 13.

Since Whisk seemed totally devoted to placing the blue aliens in water, I provided Whisk with two new blue alien "subject" images that had been generated by Mr. Wombo and a generic image of a hot tub as the new "scene" image (see the image to the left). Whisk generated the scene below (Figure 14).

Figure 14. Aliens in a hot tub.

Sadly, Whisk prefers to have people and aliens just stand around with their arms at their sides. I told Whisk to, "Make the subjects have a more alien appearance and a less human appearance, the subjects are using their hands to splash water from the tub on each other, the subjects are laughing and having a water fight." 

alien 1
alien 2
Even with these new instructions for Whisk, once again I did not achieve an alien appearance that seemed significantly different (compare Figure 14 to Figure 15). However, I was amused by the splashed water, so I stopped massaging this "aliens in the hot tub" storyboard.

Figure 15. Aliens splashing in a hot tub.

Next: testing Whisk on golf.
