Jan 5, 2025

Making Wysken

On Tar'tron (see Making Vendela)
 According to Gemini, the English language word "whisk" can be traced back to the Middle English word "wyske". It is possible that the plural of "wyske" was "wysken". In my previous blog post, I documented my first experience using Google's Whisk software to generate images. Here in this blog post, I describe my first efforts to use Whisk as a tool for making what I'm tempted to refer to as "wysken", my silly name for images that can illustrate my science fiction stories.

Note: the numbers used for the figures (below) continue from those used in the previous blog post (1-32).

Figure 33. Tyna as a subject.
 Wysken, Take Three. When I create images for my science fiction stories, I often have a vision in my mind for a setting on a distant exoplanet. I continually struggle with AI image-generating systems to find ways to produce images as story illustrations that match what is in my imagination. For the rest of this blog post, I'm going to try to use Whisk to generate illustrations for two of my science fiction works in progress, 1) a story called "D*" which I started in December 2023 and 2) The Nanites of Love, which I started in 2022 and have not yet been able to complete. I'm currently working on Part 6 of "D*", which involves a character named Tyna who has visions of the future. These visions arrive by way of the Sedron Time Stream and often Tyna does not understand the relationship of the vision to her life.

Figure 34. Exoplanet alien life.
That's Tyna in Figure 33, with some blue alien creatures lurking behind her. The images in Figure 33 and in Figure 34 were both generated by Wombo Dream. To generate Figure 34, I used this text prompt: "colorful plants on a hillside, the hill is covered by plants, a creature that looks like an alien dinosaur on the savanna of an exoplanet, alien plant life" and the "Monster V3" style. For the Whisk style, I provided this text prompt: "The overall style is reminiscent of magazine cover illustrations, with a focus on rich colors and intricate details. The colors are vibrant and saturated. The image is sharp and clear, with a high level of detail visible in the subjects and design of the scene. The overall mood is futuristic evoking a spirit of adventure and wonder. The subjects are depicted photo-realistically. There is fractal complexity in the living plants and animals of the scene. There is photo-realistic detail in the hair and clothing of the subjects".

Figure 35. Whisk-generated style image.
The image that Whisk automatically generated (using the Imagen 3 software) to illustrate that requested style is shown in Figure 35. I was rather startled to see two dudes with blasters in this "style image". I almost never have weapons or violence in my stories and this is a good example of how AI software often forces unwanted elements on me. I was amused that Imagen 3 included some text in the Whisk-generated image. The two Whisk-generated images for this "subject", "scene" and "style" are shown below in Figure 36.
Figure 36. The two similar "storyboards" that were generated for my third Whisk experiment.

SSS#3
Here is the description of these images that was also generated by Whisk: "A digital painting in a painterly style, featuring a muted, earthy palette dominated by greens, browns, and muted yellows. Soft, diffused lighting creates depth and atmosphere. The style is reminiscent of fantasy illustration, with a focus on detail and texture.  A young woman with long, wavy blonde hair and fair skin, wearing a light beige blazer, stands looking off to the side with a serious expression. Behind her, partially obscuring her shoulders, are two blue dragons, their scales and textures highly detailed, but rendered in muted greens and browns. Their eyes are a dull yellow. The dragons and woman are positioned against a background of a vibrant, but desaturated, blue, purple, and orange dinosaur standing on a gently sloping hill. The dinosaur's spiky protrusions and fantastical plants are rendered in muted purples, pinks, yellows, and oranges. Purple mountains and a pale blue celestial body are visible under a bright, but desaturated, teal sky. The overall mood is mysterious and slightly ominous, with a vintage or nostalgic feel. The rendering is highly detailed, with a focus on realism in the depiction of natural elements."

Is that a worm?
Here is the Gemini-generated description of the "subject" image that I uploaded to Whisk (see Figure 33, above): "A painting of a young woman with long, wavy blonde hair and fair skin. She is wearing a light beige blazer. She is looking off to the side, with a serious expression. Behind her, partially obscuring her shoulders, are two blue dragons with yellow eyes and sharp teeth. The dragons are highly detailed, with scales and textures clearly visible. The background is a plain off-white. The style is realistic, with a focus on detail and light. The overall mood is mysterious and slightly ominous".

I used the right hand panel from Figure 36 as a reference image for Mr. Wombo and got the image that is shown to the left. I was amused that Mr. Wombo seemed to show the alien creature bursting up from below the ground.

Figure 37. A meaner alien
has Tyna worried (Wombo).
And here is the Gemini-generated description of the "scene" image that I uploaded to Whisk (see Figure 34, above): "A vibrant, colorful dinosaur, predominantly blue, purple, and orange, stands prominently in the foreground.  Its body is adorned with spiky, brightly colored protrusions along its back and head. The dinosaur's mouth is open, revealing sharp teeth.  Its legs are powerful and clawed. The dinosaur is positioned on a gently sloping hill, covered in a variety of brightly colored, fantastical plants. These plants range in color from deep purples and pinks to bright yellows and oranges, with various textures and shapes.  The background features a landscape of purple mountains under a bright, teal sky.  A pale blue celestial body, possibly a moon, is visible in the upper left corner of the image.  The overall style is fantastical and surreal, with an emphasis on bold colors and vibrant textures.  The lighting suggests a daytime scene, with the sun seemingly positioned behind the viewer".

Generated by Mr. Wombo.
I'll confess that I like the results in Figure 36 because for my story illustrations, I'm often trying to reproduce the kinds of science fiction book cover illustrations that I grew up enjoying back in the 1970s during my personal Golden Age of discovering Sci Fi. At the "Google Labs" Discord server, I saw a user comment on how sensitive Whisk is to small changes in the "style". I tried to understand why Whisk created Figure 36 in a "painterly style". Gemini decided that my "subject" image (see Figure 33, above) was "A painting of a young woman". Maybe Gemini's interpretation of that image of Tyna as being a painting caused Whisk to render Figure 36 in a "painterly style". I tried editing out the "painterly style" from the text description. 

Generated by Mr. Wombo.
My new description: "A photo-realistic depiction of an exoplanet, featuring an earthy palette dominated by greens, browns, and yellows. Soft, diffused lighting creates depth and atmosphere. The style is reminiscent of science fiction illustration, with a focus on detail and texture.  A young woman with long, wavy blonde hair and fair skin, wearing a light beige blazer, stands looking off to the side with a serious expression. Behind her, partially obscuring her shoulders, are two blue alien creatures, their scales and textures highly detailed, but rendered in muted greens and browns. Their eyes are a dull yellow. The blue creatures and the woman are positioned against a background of a vibrant, but desaturated, blue, purple, and orange dinosaur standing on a gently sloping hill. The dinosaur's spiky protrusions and fantastical plants are rendered in muted purples, pinks, yellows, and oranges. Purple mountains and a pale blue celestial body are visible under a bright, but desaturated, teal sky. The overall mood is mysterious and slightly ominous, with a vintage or nostalgic feel. The rendering is highly detailed, with a focus on realism in the depiction of natural elements".
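To see exactly which words change between a Whisk-generated description and an edited one, the two texts can be compared with Python's standard difflib module. This is only a convenience sketch: it uses just the first sentence of each description quoted above, and `word_changes` is a helper name made up for this illustration, not anything provided by Whisk.

```python
import difflib

# First sentence of Whisk's description and of the edited description.
original = ("A digital painting in a painterly style, featuring a muted, "
            "earthy palette dominated by greens, browns, and muted yellows.")
edited = ("A photo-realistic depiction of an exoplanet, featuring an "
          "earthy palette dominated by greens, browns, and yellows.")


def word_changes(before: str, after: str) -> tuple[list[str], list[str]]:
    """Return (removed_words, added_words) between two descriptions."""
    removed, added = [], []
    # ndiff marks removed items with "- " and added items with "+ ";
    # unchanged items ("  ") and hint lines ("? ") are ignored here.
    for token in difflib.ndiff(before.split(), after.split()):
        if token.startswith("- "):
            removed.append(token[2:])
        elif token.startswith("+ "):
            added.append(token[2:])
    return removed, added


removed, added = word_changes(original, edited)
print("removed:", removed)
print("added:  ", added)
```

Running the same comparison on the full descriptions is a quick way to confirm that every "painterly" or "muted" term was actually edited out before asking Whisk to regenerate the image.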

Figure 38. The original images from Figure 36 were updated to be more photo-realistic and science fictionish.

Alien first contact.

I was particularly intrigued by the right hand panel in Figure 38. In my imagination, there are now two bipedal aliens in front of Tyna in this image. There is a larger version of that image that was slightly processed so as to enhance the colors (Figure 39, below).

 Alternate science fiction cover illustration. The image shown to the right is a related image that was generated by WOMBO Dream and then manually turned into a book cover illustration by me. In the original image generated by Mr. Wombo, there was what looked like an observation tower on top of the mountain. I slightly enlarged that tower and added a red beacon and a red laser beam aimed into outer space from the mountain top.

Figure 39. Generated by Whisk; enlarged from Figure 38, color adjusted.
Figure 40. A talking alien.

There seems to be a mysterious darkened hemisphere towards the left side of the left hand panel in the  image shown in Figure 38, above. In my experience, AI image generators often cannot resist placing multiple moons and planets in the sky if the word "exoplanet" is mentioned. My guess is that the darkened hemisphere was almost another moon.

 Talking alien. I could not resist making a version of the "First Contact" book cover illustration in which the alien has an open mouth and I can imagine that this sentient alien is talking to Tyna (see Figure 40, the image to the right). 

 Take Four. In Part 3 of my science fiction story The Nanites of Love, there is a visit to the Erre District on the planet Tar'tron, near the Galactic Core. I asked Gemini to generate an image depicting a:

Figure 41. Image generated by Gemini.
 "science fiction scene on an Earth-like exoplanet called "Erre" that is only 5,000 light-years from the center of our galaxy. Imagine an outdoors marketplace on Erre at night, with the bright stars of the galactic core glowing above in the sky. The market has a disorganized maze of stalls and shops where various oddments of futuristic technology are on display". 

Here is my text prompt for the new Whisk "style": "The overall style is reminiscent of magazine cover illustrations, with a focus on night-time colors and intricate details. The colors of the subjects and scene are vibrant and saturated, but viewed under the dim illumination of night. 

Figure 42. Whisk's image illustrating the requested style.
 The image is sharp and clear, with a high level of detail visible in the subjects and design of the scene. The overall mood is futuristic evoking a spirit of adventure and wonder. The beautiful subjects are depicted photo-realistically. There is fractal complexity in the market place stalls of the scene, the walls of buildings and the tiled floor. There is photo-realistic detail in the hair and clothing of the subjects".

Figure 43.
Gemini's description of subject #1 (she is shown in the top panel of the image to the left, generated by Mr. Wombo.): "A digital painting of a young woman in profile view, facing left.    The woman has long, bright blue hair styled in a way that suggests a ponytail pulled back from her face. A silver metallic band or headband is visible in her hair, near the top of her head. Her skin tone is fair, almost porcelain-like, and her eyes are a light blue. Her lips are painted a light pink or rose color. She appears to be wearing a garment with a gold and light blue patterned sleeve or shoulder piece, which has a shimmering or sparkly texture. The background is blurry and dark, with hints of blue and purple tones, suggesting a nighttime or futuristic setting.  The overall style is highly stylized and painterly, with a focus on smooth gradients and soft lighting." The uploaded image for subject #2 (see the second panel in Figure 43) was also described as a "digital painting" by Gemini.

Gemini's description of the uploaded scene (the image was generated by Mr. Wombo, Figure 43): "A digital painting depicts a cobblestone street at night, under a vibrant Milky Way galaxy.    The street is lined with stalls or shops on either side. The stalls are dark, with some having illuminated signs or displays. The stalls on the left side of the street appear to be lit with a cool, bluish light, while the stalls on the right are lit with a warmer, yellowish light. The stalls appear to sell various items, including what looks like technological devices and artifacts.   Three figures are visible in the distance walking down the street. One figure is in the center, wearing a long, dark robe. The other two figures are slightly behind and to the sides of the central figure, also appearing to wear dark clothing.    

Generated by Mr. Wombo.
 Two additional figures are visible closer to the viewer. One is standing near a stall on the left, appearing to be a statue-like figure with light-colored skin and what appears to be a robe or cloak. The other figure is also near a stall on the left, appearing to be a slender figure with light-colored skin.   The buildings lining the street are tall and dark, with pointed tops, giving the scene a medieval or fantasy feel. A tall, slender tower is visible in the background, also dark in color.    The Milky Way is a prominent feature in the sky, with its bright, swirling colors contrasting with the dark buildings and street. The stars are numerous and clearly visible against the dark sky. The overall color palette is dark, with cool blues and purples dominating the left side of the street and warmer oranges and yellows on the right. The cobblestones are a mix of dark and light tones." One of the two Whisk-generated "storyboards" is shown below...

Figure 44. A Whisk-generated storyboard created using the input that is shown in Figure 43.

Generated by Mr. Wombo.
Here is Gemini's description of the Whisk-generated storyboard shown in Figure 44: "A painterly digital painting in a warm, slightly desaturated fantasy art style.  The scene depicts a cobblestone street at night, illuminated by the vibrant Milky Way.  Warm oranges, reds, and yellows dominate the lighting from the market stalls, contrasting with cooler blues and purples in the shadows and background.  Two young women stand on the street. One has long, bright blue hair styled in a ponytail, fair skin, light blue eyes, and rose-colored lips. She wears a shimmering gold and light blue garment. The other has long, wavy purple hair, light skin, blue eyes, and wears a teal robe with gold accents and blue gemstone jewelry.  Her expression is serious.  The street is lined with dark stalls selling technological artifacts, lit with cool blue and warm yellow light.  Tall, dark buildings with pointed tops line the street, and a slender tower is visible in the distance.  Three figures in dark robes walk down the street in the distance. Two additional light-skinned figures stand near stalls on the left.  Visible brushstrokes and varied textures create a sense of depth and atmosphere."

Generated by Mr. Wombo.
I tried editing the AI-generated description to make this more of a science fiction setting: "A photo-realistic, slightly desaturated science fiction art style.  The scene depicts a futuristic tiled street at night, illuminated by the many bright stars that surround this exoplanet near the center of the galaxy.  Warm oranges, reds, and yellows dominate the lighting from the market stalls, contrasting with cooler blues and purples in the shadows and background.  Two beautiful young women are seen shopping for a new digital language translation device. One cute woman has long, bright blue hair styled in a ponytail, fair skin, light blue eyes, and rose-colored lips. She wears a shimmering gold and light blue garment. The other pretty girl has long, wavy purple hair, light skin, blue eyes, and wears a teal robe with gold accents and blue gemstone jewelry.  Her expression is serious.  The street is lined with dark stalls selling technological artifacts, lit with cool blue and warm yellow light.  Tall, dark buildings with pointed tops line the street, and a slender tower is visible in the distance.  Three figures dressed in futuristic metallic jumpsuits walk down the street in the distance. Two additional light-skinned figures stand near stalls on the left.  The photo-realistic depiction of the two human subjects and varied textures create a sense of depth and atmosphere". The updated storyboard image is shown below in Figure 45.

Figure 45. Whisk-generated; updated storyboard, more of a science fiction scene.
Figure 46. By Mr. Wombo.

The Whisk-generated "figures dressed in futuristic metallic jumpsuits" in Figure 45 are rather strange and not what I was expecting. In my experience, AI image-generating software tends to trot out some pretty lame "standardized" versions of robots and aliens and insert them into images, regardless of what users actually want. It is often hard work to avoid these powerful attractors. I had Mr. Wombo generate the alternative jumpsuits shown in Figure 46.

I tried to have Whisk, "Change the human figures in the background to dress them in flashy metallic jumpsuits, walking Victoria's Secret catwalk style," but Whisk then changed the entire scene, as shown in Figure 47.

Figure 47. By Whisk; flashy metallic jumpsuits, catwalk style.



From the new Gemini-generated description of Figure 47: "Three figures in flashy metallic jumpsuits walk down the street, Victoria's Secret catwalk style.  Two additional figures stand near stalls on the left." I had to re-edit the storyboard description in Whisk once again (as above, for Figure 45) and got the updated storyboard that is shown below in Figure 48.

Figure 48. Whisk generated two figures in metallic jumpsuits (now on the right hand side).

futuristic device by Mr. Wombo.

 This (Figure 48, above) is not too bad for an illustration of the Erre District on Tar'tron. I could not resist making some alternative versions of this scene in which there was a more unusual device being held, such as the one shown in the image to the right (and the image at the top right corner of this blog post). Some other variants on these themes by both Whisk and Mr. Wombo are shown here.

Whisk options.
After playing with Whisk for two days and working through the creation of four "storyboards" (why not call them wysken?), it is clear that Whisk has substantial advantages over working with the free version of Gemini, which still does not like to generate human images.

 To Do: crack the code on the secret of using other aspect ratios for the images that are created by Whisk.


Bonus: AI-generated music made with MusicFx .....

 Music text prompt: "science fiction music composed by a technologically advanced sentient humanoid alien on a distant exoplanet".

Visit the Gallery of Movies, Book and Magazine Covers

Jan 4, 2025

Whisk

An imaginary science fiction novel
called Pharism. image source
 Near the beginning of "Trullion" by Jack Vance, the star-drive that is used by spaceships in Alastor Cluster is casually referred to as "whisk". At labs.google/fx Google now has a generative imagery experiment called "Whisk". Here is how Google describes the Whisk image-generating software: "Behind the scenes, the Gemini model automatically writes a detailed caption of your images. It then feeds those descriptions into Google’s latest image generation model, Imagen 3." During the past year I've been occasionally using Gemini to generate images, but Whisk has a specific formalized system for uploading two reference images ("subject" and "scene") and combining them with a specific "style" (see Figure 1, below).

  Figure 1. The Whisk formula for image creation.
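The flow that Google describes (caption each reference image, then feed the captions to the image model) can be sketched in a few lines of Python. This is only an illustration of the idea: the names `WhiskInputs` and `compose_prompt` are hypothetical, the captions are typed in by hand rather than written by Gemini, plain concatenation is an assumption about how the pieces might be merged, and nothing here calls a real Google API.

```python
from dataclasses import dataclass


@dataclass
class WhiskInputs:
    """Text captions standing in for the three Whisk inputs (hypothetical)."""
    subject: str  # caption of the "subject" reference image
    scene: str    # caption of the "scene" reference image
    style: str    # text describing the requested "style"


def compose_prompt(inputs: WhiskInputs) -> str:
    """Join the three captions into one text prompt for an image model.

    Assumption: Whisk's real merging step is not described in this post;
    simple concatenation is used only to show how the information flows.
    """
    parts = [inputs.style, inputs.subject, inputs.scene]
    return " ".join(p.strip() for p in parts if p.strip())


# Made-up captions, just to show the shape of the data.
example = WhiskInputs(
    subject="A tall woman with blond hair wearing a black shirt.",
    scene="The interior of a futuristic spaceship.",
    style="A photograph of a chibi plushie.",
)
print(compose_prompt(example))
```

The point of the sketch is that the image model only ever sees text, so any detail that the captioning step gets wrong or leaves out cannot appear in the final image.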

Following Google's link to Whisk, I arrived at the "subject" upload user interface page shown in Figure 2, below.

Figure 2. The Whisk landing page.
I dragged one of my existing AI-generated images (see Figure 4, below) into the Whisk "subject" field and this is what was generated (Figure 3):

Figure 3. After uploading a "subject" image.

 

 

 

 

 

 Note: the first time I tried to use Whisk I got the standard "login with Google" prompt. That's my Google user icon in the upper right corner of Figure 2.

Figure 4. A "subject" reference
image by Mr. Wombo.

The "subject" image (Figure 4) that I uploaded (drag and drop) into Whisk was an old AI-generated image that was made by Mr. Wombo (WOMBO Dream) back in 2024.

 Excremental. My guess is that the hideous yellow background color that was selected for the Whisk user interface is intended to be a color that most people would not include in their own reference images. So although there might be a good reason for using such a bizarre color, this strikes me as a choice that is just as bad as using the name "Bard" for a chatbot. After releasing the Bard chatbot, Google quickly changed the name to Gemini. I won't be surprised if Google switches Whisk to a new background color that is easier on the eye.

Figure 5. Add image.
There is a mysterious "add your own image" field in the Whisk interface (Figure 5), but when I used that to upload a "subject" image, Whisk did not process it; the uploaded image was just left there doing nothing (see Figure 6).

Figure 6. After upload using the
"ADD YOUR OWN IMAGE" field.
I suppose that since this is all just an "experiment", we should not expect much from the Whisk user interface. However, I find "drag and drop" to be very annoying. Fortunately, you do not have to drag and drop where it says "DROP AN IMAGE HERE". If you click there, then you can use a standard file selection dialog window to select and upload your image.

Figure 7. Two subjects from one upload.

Here is what the Whisk FAQ says about "subject" reference  images: "That’s what the image is about! Character, objects or a combination of such. An old rotary phone! A cool chair! A cardboard movie display. A mysterious renaissance vampire." When I uploaded the image from Figure 6 into Whisk, it generated two new "plushies" (see Figure 7).

Figure 8.
Another "subject".

The image that I uploaded into the "ADD YOUR OWN IMAGE" field is shown in Figure 8. As can be seen in Figure 7, the AI-generated "plushie" came with a black shirt (that shows her belly button), blue pants and blond hair, all correctly matching the reference image in Figure 8. I'm not sure how Whisk selected the hair color/style for the "vampire" plushie (to the right in Figure 7). Maybe the hair for the vampire was "borrowed" from the lady in Figure 8. It does look like the plushie for the tall blond is slightly taller than the vampire plushie.

Figure 9. Enamel pin example.
Currently, there are only three "styles" available in the drop-down menu. An example with the "enamel pin" style is shown in Figure 9. I uploaded the plushie from the left side of Figure 7 as a new reference image and Whisk converted it into a "pin".

Figure 10. The "tool" interface.
As seen in Figure 9, as soon as Whisk generates a new image, there is a new button: "OPEN IN TOOL". Apparently that "tool" is the "real" user interface. The interface shown in Figure 2 is apparently just a simplified "starter page" for new users. When in the "tool" user interface (see Figure 10) there is a tab on the left that allows you to add a "scene", either by providing a text prompt or by uploading another image. At this point, I tried to change the hair of the vampire with this text prompt: "Change the vampire subject's hair color to black hair."

Not shown in Figure 10 is the fact that Whisk generates two versions of each image. 

Figure 11. Two edited versions of the plushies.  Click on image to enlarge.
The image in Figure 11 shows the two Whisk-generated plushie images side-by-side (upper row) and also the two edited versions (lower row) in which the vampire was given black hair. There were also some modifications to the belly button plushie: she was allowed to stand up and she was given just the slightest hint of cleavage.

Figure 12. Whisk image buttons; upper left, REFINE. Upper right, delete.
I then uploaded a background image for the "scene", but nothing happened (see Figure 15, below). Shown in Figure 12 are the buttons that appear when you hover over an image (this Whisk-generated image originated from Figure 3, above). Clicking on the "REFINE" button zooms you in on that one image and allows you to apply a text prompt that will modify the image.

13
Clicking on the other button (the icon looks like a printed page and a pencil) that is just to the right of the "REFINE" button, gives you access to a Gemini-generated text description of that image. 

No scene.
Gemini-generated image description example for Figure 12: "A photograph of a chibi plushie of a young woman with shoulder-length, wavy blonde hair and fair skin.  The plushie is made of soft, cuddly fabric with soft, button eyes and a friendly expression. She wears a light gray, short-sleeved crop top and a dark gray, short A-line skirt. A dark brown belt is cinched around her waist. The plushie is sitting on a table, centered and uncropped against a plain white background. The lighting is even and soft. The background is a detailed depiction of a futuristic spaceship interior, rendered in a chibi style. The walls are dark blue and metallic, with various technological elements such as control panels and screens, all appearing soft and plush. A large circular window with orange accents is visible. A tall, slender metallic structure extends from the floor to the ceiling. The floor is dark with glowing lines.  The overall aesthetic is soft, cuddly, and friendly, with a focus on the plushie's features and the spaceship's details rendered in a similarly soft and plush style." (Note: the free version of Gemini refused to generate an image when provided with this text prompt, saying: "Generating images of people is only available in early access with Gemini Advanced".)

Figure 13. Whisk's pink hotdog fingers (and buns).
That Gemini-generated description of Figure 12 is rather confusing. First it mentions "a plain white background" then says, "The background is a detailed depiction of a futuristic spaceship interior". I tried altering the text description to, "A photograph of a chibi plushie of a young woman with pink hotdog fingers jutting out from the tip of each hand". The edited images are shown in Figure 13. I was thinking of "Everything Everywhere All at Once".

Figure 14. New hotdog fingers.

I tried to edit ("refine") with this text prompt: "Change the subject's fingers to look like the hotdog fingers of Michelle Yeoh in the film "Everything Everywhere All at Once", but Whisk refused to make the edit. Suspecting that mention of a real-world person was blocking the edit, I tried altering the text prompt to: "Change the subject's fingers to look like the hotdog fingers in the film "Everything Everywhere All at Once". Whisk then generated the altered image that is shown in Figure 14. (Note: some AI-generated (WOMBO Dream) images of Michelle Yeoh with hotdogs are shown here.)

Figure 15.
Returning to the mystery of how to get the Whisk software to combine the subject and scene images according to the specified style, I eventually realized that you have to click on the arrow icon. I really despise the "modern" software user interface where there are meaningless icons, some of which are hidden until you know the magic trick to get them to even appear on the screen.

Figure 16. Alien vampire hunter "storyboard". In my imagination, the lady on the left is an alien.
Figure 17.


Before letting Whisk generate the image shown in Figure 16, I first edited the style by providing this text prompt: "A photograph of the subject as a beautiful alien creature from outer space. The alien creature is similar in appearance to a human female. The alien creature is tall and slim". The subject, scene and style that were used to generate Figure 16 are shown in Figure 17. Notice that the woman with red hair and a red dress that was in the original "scene" image got included in Figure 16. It is particularly amusing that the vampire is sunk into the ground. 

Here is Gemini's description of Figure 16: "A photograph of two alien creatures, styled as beautiful extraterrestrial beings, against a backdrop of a miniature scene.  The top creature is a tall, slender, fair-skinned female with a visible abdomen, wearing a black, open jacket and low-rise jeans. The bottom creature has pale skin and black, pointed ears, wearing a wide-brimmed hat adorned with pink and red roses and a dark burgundy jacket.  Dark eyeshadow and lipstick accentuate their serious expression, and small, pointed fangs are visible.  

Figure 18. The original Whisk-generated
image (from Figure 16) was modified by Mr. Wombo.
 Both figures are positioned in front of a dark, textured background featuring stylized, bioluminescent blue tendrils resembling wet roots or vines.  A miniature figurine of a young woman with light skin and shoulder-length reddish-brown hair, wearing a rust-colored dress, stands facing away in the background. The base is textured mud with small, dark green plants and moss. The lighting emphasizes the creatures against the darker background. The overall aesthetic is a mystical, fantasy scene with an otherworldly feel".

Figure 19. Wombo-generated.

Once again, there are some interesting errors in Gemini's description of the image. Sadly, while Gemini claims that "pointed fangs are visible", I don't see the fangs. Gemini also incorrectly says that the redhead is, "facing away in the background". While my uploaded image did have her facing away, Figure 16 has her facing the camera. Some fangs that were generated by Mr. Wombo are visible in Figure 19, but these small fangs are hard to see where I pasted them into Figure 18. Mr. Wombo was even able to generate some nipples for Figure 18, which is rather rare for the free version of Wombo Dream.

20
One of the icons in the Whisk user interface provides a link to the "Google Labs" Discord server. The whole process of getting started with Discord was confusing, because as soon as I put in my email address (you also have to provide your phone number), the software indicated that I already had a Discord account, which I did not (Discord support).

Figure 21.
I've never used Discord before, so when I clicked on that button I was presented with an "invite" to the Google Labs Discord server (Figure 21). The Discord support pages show users how to go to User Settings > Privacy & Safety. However, there is no such tab in my user settings.

Figure 22. Data and Privacy.
As shown in Figure 22, there is a tab called "Data & Privacy" which has three green toggle buttons for restricting the user data that is kept by Discord.

Figure 23. Discord email spam.
Sadly, emails from Discord seem to go into my primary Gmail inbox while emails from other online sites like DeviantArt go into the "promotions" or "social" inboxes. So, I have to wonder how much email and phone spam I will be getting from Discord. I set my Discord screen name to be "JWSAISCIFI", which, after the leading initials of my name, means artificial intelligence science fiction. I will learn if the discussions on the Discord server are useful.

Figure 24. Server rules.
image by WOMBO Dream

 It is free - you get what you pay for. A word of warning about using text prompts to generate images with Whisk. This Gemini-based system will often refuse to generate an image corresponding to what you ask for. If you compose an intricate text prompt in Whisk, be sure to save it in a location outside of Whisk so that you don't lose your work. You might only need to change one word in your text prompt to get Whisk to generate the image, so there is no point in re-composing a long prompt from scratch while repeatedly trying to get Whisk to generate your image. Also, sometimes Whisk will act like it is generating an image, but it has crashed and will never complete the image no matter how long you wait. Other image generating systems will give you an error message if there is a problem, but Whisk can just hang, with no indication that anything is wrong. As a Macintosh user, I use Apple's Notes application to save copies of all of my text prompts while making AI-generated images.
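For anyone who would rather keep prompts in a plain file than in a notes application, here is a minimal Python sketch of the same habit. It is not connected to Whisk in any way; the function names `save_prompt` and `load_prompts` and the one-record-per-line JSON format are simply choices made for this example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_prompt(prompt: str, log_path: Path, note: str = "") -> None:
    """Append one prompt to a JSON Lines file so it survives a refusal or hang."""
    record = {
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "note": note,      # e.g. "refused" or "worked after removing a name"
        "prompt": prompt,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_prompts(log_path: Path) -> list[str]:
    """Return every saved prompt, oldest first (empty list if no file yet)."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line)["prompt"] for line in f if line.strip()]
```

A call such as `save_prompt(text, Path("whisk_prompts.jsonl"), note="refused")` before each attempt means that a long prompt only needs a one-word edit, not a rewrite, when Whisk declines to generate the image.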

Figure 25. Introductions (right side) and major discussion categories (left side).
Figure 26.
When I found and used the "start from scratch" option (lower center in Figure 2), I was able to enter new subject and scene images (see Figure 26) and there was a fourth holiday themed style with this description: "A glass ornament, hanging from a Christmas tree, is depicted in a close-up shot. The background is blurred, focusing attention on the ornament. The lighting is soft and warm, creating a gentle glow around the ornament. The overall style is reminiscent of traditional Christmas decorations, with a focus on rich colors and intricate details. The image has a slightly vintage feel, suggesting a handcrafted or antique aesthetic. The colors are vibrant and saturated, with a focus on reds, greens, and golds. The image is sharp and clear, with a high level of detail visible in the ornament's texture and design. The overall mood is festive and cheerful, evoking the spirit of the Christmas season.
Figure 27. Useful information.

If a location is provided, incorporate characters into a creative festive scene with their location AS the ornament. The characters and scene should be made out of fun 3D materials to form an intricate little sculpture on of the ornament.
 OTHERWISE any characters should be drawn as a single blown glass Christmas ornament. Show the final product hanging on a cute Christmas tree branch
".

Figure 28. Edited alien.
After "starting from scratch", there was now the useful tip (about clicking on the arrow button) that is shown in Figure 27. After clicking on the "generate" button, there was no image generated and no message provided. In the Discord discussions, I saw mention of there being some limit on how much a user can generate with Whisk in one day. Had I reached the limit? Also, I know that Gemini will refuse to generate images for some text prompts.

The "subject" image that I used (top panel of Figure 26) was the image shown in Figure 18. Here is Gemini's description of that image: "Three female figures are posed in a diorama.

The figure on the left is tall and slender with pale skin, long blonde hair, and pointed ears. She is topless, wearing a black leather jacket and blue jeans. She has a necklace and what appears to be a jeweled bracelet on her left wrist.

Figure 29. From bust to burial.

The central figure is a bust of a female with pale skin and dark, pointed ears. She wears a wide-brimmed burgundy hat adorned with pink roses, and a dark burgundy jacket. Her makeup is dark and dramatic, with dark lipstick and eye makeup.

Whisk's bust.
 The figure on the right is a woman with long, auburn hair and fair skin. She wears a long, brown dress.
The background of the diorama is dark and moody, with teal, branch-like structures. The figures are positioned on a base of dark soil and moss".

I removed the nipples (see Figure 28) and made certain that Whisk did not convert the vampire into a bust and I continued with the new "subject" that is shown in Figure 30, below. 

Figure 30. Updated "subject image".
Using that new "subject" image (Figure 30, above), Whisk generated the new "storyboard" shown in Figure 31, below...
Figure 31. More of a science fiction storyboard, but still with an alien vampire.
Figure 32. Is her hand passing through glass?
The Whisk "storyboard" in Figure 31 has more of a science fiction flavor than what Whisk generated for Figure 16, above. I don't know how Whisk decided to put the vampire inside the glass globe and leave the other two figures outside. Whisk did have some trouble dealing with the bottle, with hands sometimes seemingly passing through the glass (see the image to the right).

alternate alien

It was a lucky turn of events that Whisk interpreted one of my uploaded images (Figure 18) as being a diorama and there was the holiday themed "style" that looked like an ornamental tree bulb. That led to me placing the vampire character inside a glass bottle (Figure 31), a classic pulp Sci Fi topic. In my next blog post, I'll continue my exploration of Whisk as a tool for making story illustrations. Above on this page, I got caught up with the "renaissance vampire" that is present as an example on the introductory page for Whisk. My personal interests are in the domain of science fiction. Next, I'll turn my efforts to using Whisk for creating illustrations for science fiction stories.

Next: Making Sci Fi Wysken.
