|
Championship trophy.
|
Back in 2022, I had a blog post called "
G-syms for G-world" in which there was an exoplanet called "G-world" where all of the galaxy's golf champions gathered to compete. In that science fiction story, I was concerned with the influence that the existence of human telepathic ability would have on how a sport like golf was played. Here in this blog post, I return to the challenge of creating images of people who are doing interesting things, such as playing golf. In a 2023 blog post called "
Action", I explored the ability of
WOMBO Dream to generate sensible images of human figures performing interesting activities. Here, I'll be using
Whisk to obtain AI-generated images.
|
reference images
|
Subject, Scene, Style. I provided Whisk with four "subject" images, each image depicting a golfer (see the image to the left). I also provided a "scene" image showing a golfing tee box and a golfing green floating in outer space. I also specified the same photo-realism "style" that I used in
my previous blog post. I asked Whisk to, "
Place the four subjects on an isolated golfing tee box located in outer space. Three subjects are watching the other subject swing a gold club. In the distance is an isolated golfing green floating in outer space. The dark of outer space with sparkling stars surround the verdant green of the tee box and the golfing green". With the reference images shown to the left and those text instructions, Whisk generated
Figure 1, below.
|
Figure 1. Golfing in space. Whisk-generated golfing "storyboard" version 1.
|
|
More clubs, please. And balls, too.
|
Space Golf. It is amusing that Whisk depicted the alien golfer as being barefoot (Figure 1, above). The "subject" image for the blue alien was generated by Gemini using the following text prompt: "a tall, thin, blue space alien who is standing on a golfing green and swinging a golf club".
Sadly, one of the golfers in Figure 1 (the one with the socks) is swinging two clubs at the same time. Also, Whisk should have had only one golfer swinging their club and lined up to hit a ball onto the green that is in the distance. One golfer in Figure 1 seems to be swinging a putter on the tee.
|
Figure 2. Whisk-generated alien.
|
Here is the Whisk-generated description of the "storyboard" image that is shown in
Figure 1: "
Photorealistic depiction of four golfers on a space-faring tee box, observing a swing against a backdrop of a distant, floating green and a star-studded cosmos. A tall, slender, light blue-skinned humanoid with pointed ears, androgynous in appearance, holds a dark golf club, poised attentively near a white golf ball on a vibrant green tee box. A young woman with light-tan skin, long brown hair, and a white baseball cap, is mid-golf swing, wearing a white collared shirt, dark-green skirt, and white socks with dark golf shoes. Two additional young women with light skin tones, brown hair (one in a ponytail), and white baseball caps, watch; one in a green shirt and beige skirt, the other in a green shirt and black pleated mini-skirt, both with golf clubs and bags nearby. The tee box and a smaller, circular putting green float in the inky blackness of space, speckled with bright, varied stars. The vibrant greens of the tee boxes and putting green contrast sharply with the dark background. The overall mood is one of futuristic adventure and wonder, with sharp details and saturated colors."
|
Figure 3. Updated Whisk-generated golfing in space "storyboard" image version 2a. Click image to enlarge.
|
|
Figure 4. From storyboard 2b.
|
I changed that description of the "storyboard" to: "Photorealistic depiction of four golfers on a tee box that is floating in outer space. One of the golfers is swinging a golf club while the other three golfer watch the swing against a backdrop of a distant, floating green and a star-studded cosmos. A tall, slender, light blue-skinned humanoid with pointed ears, androgynous in appearance, holds a driver. There is a white golf ball on golf tee near the golfer who is swinging her golf club on the vibrant green tee box. A young woman with light-tan skin, long brown hair, and a white baseball cap, is mid-golf swing, wearing a white collared shirt, dark-green skirt, and white socks with dark golf shoes. The golfer who is swinging the golf club is lined up to strike the ball that is on the golf tee and hit the ball onto the distant golf green. Two additional young women hold their golf clubs and watch the woman who is swinging her golf club.
|
Figure 5. Golf girls with alien ears.
|
One of these two other women has light skin tones, brown hair (one in a ponytail), and white baseball caps, watch; one in a green shirt and beige skirt, the other in a green shirt and black pleated mini-skirt. There are golf clubs bags nearby, at the side of the tee box. The tee box and a smaller, circular putting green float in the inky blackness of space, speckled with bright, varied stars. The vibrant greens of the tee boxes and putting green contrast sharply with the dark background. The overall mood is one of futuristic adventure and wonder, with sharp details and saturated colors." Using this new description, Whisk generated
Figure 3. Maybe the putting green that is behind the golfers in
Figure 3 is the previous hole that they already played and the woman swinging her club is facing the next putting green, which is not visible in this image.
|
Golfer girl alien.
|
Whisk generally generates two "storyboard" images at a time. The golfing storyboard image version 2b had an alien with a huge tail, but this time the alien was given shoes to wear (see
Figure 2). Also, that version 2b "storyboard" had a more distant putting green in the background, as shown in
Figure 4. Whisk gave pointed "alien ears" to the two women who are watching the swinger (see
Figure 5).
The image to the right originated with a Whisk-generated alien wearing shorts and shoes. I used that as a reference image for Mr. Wombo who generated this golfer girl.
I added an additional sentence to the "storyboard" description: "The distant putting green is shown to have green top surface of grass while the bottom of the floating putting green looks like an asteroid. A brightly-colored alien spaceship is seen in the distance, hovering over the putting green".
|
Figure 6. Spaceship added. Whisk-generated golfing storyboard image version 3a. |
|
The hover cart "subject" image.
|
I tried adding a golf cart to the "storyboard" with this text: "
A golf caddie is riding in a hovering golf cart behind the tee box. The caddie is a young girl with red hair". I provided a new "subject", as shown in the image to the right. Whisk generated the new "storyboard" in
Figure 7.
|
Figure 7. White golf cart; "storyboard" version 4a.
|
Sadly, Whisk seemed to completely ignore my desire to have a futuristic hovering golf cart and just inserted a conventional golf cart (Figure 7).
|
A blue alien.
|
Also, when Whisk re-writes the "storyboard" description after a user enters text in the "
Add additional details..." text entry line, Whisk then often takes out existing directives such as making images with "photo-realism", resulting in the kind of low-resolution, cartoon-like human figures that are seen in
Figure 7.
I tried again with these instructions for Whisk: "A golf caddie is riding in a futuristic hovering anti-gravity golf cart behind the tee box. The caddie is a young girl with red hair who is seated in the hover cart, waiting to fly the golfers to the next golf green after they all hit their tee shots." An updated version of the "storyboard" is shown below in Figure 7.
|
Figure 7. Hovering blue golf cart; "storyboard" version 5a by Whisk. |
I was pleased that in Figure 7 Whisk did not try to park the golf cart on either the tee box or the putting green. However, this version of the golf cart did not really look like the kind of futuristic stream-lined hover cart that I had in mind.
Here are my final instructions for this golfing in space "storyboard": "
Place the four subjects on an isolated golfing tee box located in outer space. Three subjects are watching the other subject swing a golf club. In the distance is an isolated golfing green floating in outer space. The dark of outer space with sparkling stars surround the verdant green of the tee box and the golfing green. A golf caddie is riding in a futuristic hovering anti-gravity golf cart. The hovering cart is floating in space between the tee box and the distant golf green. The caddie is a young girl with red hair who is seated in the hover cart, waiting to fly the golfers to the next golf green after they all hit their tee shots."
I had
Mr. Wombo make a version of the golf cart from
Figure 7 (see the image that is shown to the left). Both Whisk and Mr. Wombo were into glorifying the red hair of the caddie.
The final version of the detailed image instructions was edited to include, "A hovering futuristic anti-gravity golf cart is behind the tee box. The hovercraft golf cart is sleek and modern, reflecting an overall futuristic aesthetic appropriate for anti-gravity technology." The final "storyboard" version is shown below in Figure 8.
|
Figure 8. Final version (version 6a) of the golf in space "storyboard". |
|
reference images
|
|
the "style" image
|
I uploaded an old (see
this 2024 blog post) alien baseball player "subject" image and an "Alien Baseball League" baseball card as the "scene" image. An un-cropped version of the "scene" image is shown to the right. Once again, I applied the same photo-realism style that I used in
my previous blog post and above on this page for the golf in space "storyboards" (see the image to the left). When using these three reference images, Whisk generated images like the one shown in
Figure 9, below.
|
Portrait orientation.
|
I was also able to switch Whisk to using the portrait aspect ratio (see the image with two UFOs that is shown to the right), but as soon as some additional editing action was done, Whisk returned to the default landscape aspect ratio. According to what has been posted to the
"Google Labs" Discord server, they are aware of this bug.
|
Figure 9. Whisk-generated alien with a big tail.
|
Artificial Intelligence and text in images. I was impressed that Whisk was able to put "Lefty Baker" on the uniform of the alien in Figure 9. I provided this text addition: "Place the text 'Dbacks' on the chest of the alien baseball player," but Whisk left out the letter "B".
|
Mr. Wombo "Monster v3" style.
|
I then specified: "Spell 'Dbacks' with a "B" in the word." and Whisk was able to make the "storyboard" version that is shown below in Figure 10.
I edited the text elements in Figure 10. At the upper left corner, I cleaned up some overlap between the "L" and "I" in "Alien". In the lower left, the Whisk-generated image originally said, "ALEEN BASEBAGUS". And originally it said "Lefty Balok" in the lower right corner, but I replaced that with, "Arizona 2042", trying to place this scene after the future arrival of aliens on Earth.
|
Logo. |
I liked the green Alien Baseball League logo that Whisk invented. Also notice the logo at the lower left corner of
Figure 10 which has similarities to the logo for the Arizona Diamondbacks. I suppose Google could get in trouble for allowing its AI to use the logos of professional sports teams.
|
Figure 10. Updated Whisk-generated alien with "DBACKS" on the front of the uniform.
|
Sadly, Whisk cannot spontaneously create a reasonable layout for a baseball field. Whisk's problems are not limited to baseball. I did not mind when Whisk started making night-time images of ball parks, but the evening stars in
Figure 10 seem randomly sprinkled into the night sky, even on the clouds.
The image to the right was generated by Mr. Wombo using part of the "style" image that Whisk generated for this alien baseball card "storyboard" experiment. There was also a man with a gun on the left hand side of the original Whisk-generated image, but I was interested in the control panels arrayed near the woman. Poor Mr. Wombo seemed confused about what era this control tower scene was set in. I don't understand why Whisk always generates images for the "styles" that involve military scenes and settings.
|
reference images
|
Football. I provided Whisk with the inputs shown to the left and asked for, "
The subjects are on a football playing field with the blue aliens playing flag football against the human girls".
For the style, I requested, "The overall style is reminiscent of magazine cover illustrations, with a focus on rich colors and intricate details. The colors are vibrant and saturated. The image is sharp and clear, with a high level of detail visible in the subjects and design of the scene. The overall mood is futuristic evoking a spirit of adventure and wonder. The subjects are depicted photo-realistically. There is fractal complexity in the living plants and animals of the scene. There is photo-realistic detail in the hair and clothing of the subjects."
However, Whisk changed it to, "A young person with light skin and antlers attached to their head, wearing a tan, utilitarian outfit and carrying a rifle, stands in a lush, overgrown forest. Beside them, a young person with light skin, wearing similar clothing, sits on a large root. A deer and a small, tawny-colored feline-like creature are also present. In the background, a figure appears to be standing behind a large, partially obscured orb. The overall scene is fantastical, with bioluminescent plants and unusual flora."
I tried several times to make Whisk accept my requested style, but it just kept making up its own alternative styles. On previous days, I was able to get my text description of the style to stay in the style tab if I entered it twice. Not this time! Being frustrated by my inability to control the style, I submitted the "bug report" that is shown in the image below.
Here is the description of the resulting "storyboard" that is shown below in
Figure 11: "
A fantastical, bioluminescent forest-style football field. Three light-skinned teenage girls in purple athletic outfits play flag football against two blue-skinned, slender humanoid extraterrestrials with elongated heads and pointed ears. The girls wear light gray and white athletic shoes. One girl bends over, another stands behind her, hands on her waist, and a third is partially visible in the background. The aliens have a serious, focused expression as they engage in a handoff of an orange-brown football. The field is short-cut, bright green grass with a faint white boundary line. A paved path and a fenced-in grassy field are in the mid-ground, with two distant figures (a girl in light clothing and a man in dark clothing). Lush green trees and a clear blue sky form the background. The overall lighting is dramatic, with a bright, sun-like celestial body illuminating the scene. The aliens and girls are rendered with a slightly textured, fantastical style, incorporating bioluminescent elements into the grass and foliage." The Whisk generated "storyboard" is shown below in
Figure 11.
|
Figure 11. Alien invasion plan: "We need two footballs so that we can give one to the cute girls."
|
I changed the "storyboard" description to "A photo-realistic depiction of a football field for playing flag football. Two girls are playing flag football against two humanoid aliens. The light-skinned teenage girls in purple athletic outfits play flag football against two blue-skinned, slender humanoid extraterrestrials with large eyes and pointed ears. The girls wear light gray and white athletic shoes. One girl is running, another stands behind her, hands poised near her waist. The aliens have a serious, focused expression as they engage in a hand-off of an orange-brown football. The field is short-cut, bright green grass with a faint white boundary lines of a football playing field. A paved path and a fenced-in grassy field are in the mid-ground. Lush green trees and a clear blue sky form the background. The overall lighting is dramatic, with a bright, sun-like celestial illumination of the playing field. The faces and hair of the girls are rendered with a photo-realism and exquisite detail." The resulting "storyboard" is shown in Figure 12, below.
|
Figure 12. Updated alien football "storyboard" version 2.
|
The humans and aliens in
Figure 12 do look like they might be on a football field, but Whisk seems quite random in painting the field and positioning the players on the field.
The image to the right was generated by WOMBO Dream using a reference image that was made by Whisk. I tend to imagine humanoid aliens who are not very different from us humans. Text prompt for Mr. Wombo: "a tall slender blue-skinned woman with large pointed ears, the woman has long blue hair, the woman with blue skin is cute, the woman with blue skin has long pointed ears, dressed Victoria’s secret style in silk".
When I added to my instructions: "An alien spaceship is on the ground behind the fence in the background," Whisk generated the "storyboard" that is shown in Figure 13.
|
Figure 13. Alien football "storyboard" version 3. |
I congratulate Whisk on dressing the aliens in the purple football uniforms. The alien spaceship is also pretty interesting.
The image to the right shows an interesting alien spaceship that was generated by WOMBO Dream using the "Monster v3" style.
I asked Whisk to make the aliens float above the ground and Whisk made the image in Figure 14.
|
Figure 14. Alien football "storyboard" version 4. |
Sadly, only one alien is floating above the ground in
Figure 14, but the alien spaceship got even fancier.
|
AI-generated |
The image shown to the left was generated by WOMBO Dream, starting with a Whisk-generated reference image. The original AI-generated alien had a large rib cage, which inspired me to make the altered version of this alien that is shown to the right. One last version of the football "storyboard" is here and at the very bottom of this blog post.
Maybe the game should be just for girls, as shown in Figure 15.
|
Figure 15. Alien football "storyboard" version 5. |
Here is the version of the "storyboard" description used to generate Figure 15: "A fantastical, photorealistic scene on a sunny day in a park. Three light-skinned teenage girls in purple athletic outfits play flag football against two blue-skinned, slender, humanoid female extraterrestrials. The aliens, each hovering a foot above the short-cut green grass, have elongated heads and large pointed ears, their skin slightly textured. They are seriously focused on the game. The girls are wearing light gray and white athletic shoes. One girl bends over, another stands with hands on her waist, and a third is partially visible in the background. The game takes place on a grassy football field with a faint white boundary line. A paved path and a fenced-in grassy field are in the mid-ground, with two distant figures (a girl in light clothing and a man in dark clothing) visible.
Lush green trees and a clear blue sky form the background. Behind the fence, a large alien spaceship rests on the ground, glowing faintly with sparkling light. The overall style is fantastical, with an emphasis on vibrant greens, purples, and the contrasting blue of the aliens. The lighting is bright and sunny, highlighting the details of the scene. The aliens float in the air, their feet ten inches above the grass of the playing field. The humanoid female aliens have long blue hair that is depicted photorealistically."
Next: testing Whisk on a different task without humans or aliens.