|
SSS chess |
I've been experimenting with Google's Whisk for creating illustrations for science fiction stories. A question arises: are there any limitations imposed by Google on Whisk? Maybe there is an upper limit on the length of text prompts or possibly an upper limit on the number of "subject" images that can be used at one time. In my previous blog post, I used a Whisk text prompt that had 328 words and almost 3,000 characters, but I had the feeling that Whisk could not really usefully process that many instructions.
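For anyone who wants to track prompt length while experimenting, here is a minimal sketch in Python. The helper name `prompt_stats` is my own invention for illustration and is not part of Whisk; it simply counts whitespace-separated words and raw characters, which may not match however Whisk measures prompts internally (if it enforces a limit at all).

```python
def prompt_stats(prompt: str) -> dict:
    """Return simple length measurements for a text prompt.

    Words are counted by splitting on whitespace; characters include
    spaces and punctuation. This is an assumption about what "length"
    means, not a documented Whisk rule.
    """
    return {
        "words": len(prompt.split()),
        "characters": len(prompt),
    }


# Example: one of the short instructions used later in this post.
prompt = (
    "Place two complete sets of humanoid chess pieces on the chess board. "
    "Make each chess piece look like a humanoid creature."
)
stats = prompt_stats(prompt)
print(stats)
```

Running the same function on each prompt before pasting it into Whisk makes it easy to note at what length the results start ignoring instructions.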
Ten Chess Pieces. As shown in the image to the right, I uploaded seven "subject" images to Whisk. These seven subject images contained a human with red hair and nine humanoid aliens. I also uploaded a "scene" image that was a chess board with chess playing pieces on the board. I used the following text prompt for my Whisk "style": "The overall style has a high degree of photo-realism, with a focus on rich colors and intricate details. The colors are vibrant and saturated. The image is sharp and clear, with a high level of detail visible in stars the scene. The overall mood is scientific evoking a spirit of precision in details with no extraneous objects included. The natural features of the subject are depicted photo-realistically. The level of detail is consistent throughout, creating a cohesive and immersive scene."
The first two Whisk-generated "storyboards" are shown below.
|
Figure 1. Chess storyboard #1a.
|
|
Figure 2. Chess storyboard #1b. |
The images shown in Figure 1 and Figure 2 included both standard chess pieces and humanoid figures. I was hoping for more humanoids to be placed in the "storyboard" image and no conventional chess pieces. I provided this text prompt to Whisk: "Place two complete sets of humanoid chess pieces on the chess board. Make each chess piece look like a humanoid creature." The new Whisk-generated storyboard is shown below in Figure 3.
|
Figure 3. Chess storyboard #2. |
Whisk abandoned the original "scene" image and failed to make all of the chess pieces have the form of a humanoid (Figure 3, above). I changed the text prompt to: "Place two complete sets of humanoid chess pieces on the chess board. Make each chess piece look like a humanoid creature. The chess pieces are all in the shapes of humanoid creatures. The pieces are arranged in a mid-game configuration." Whisk then generated the image shown in Figure 5.
|
Figure 5. Chess storyboard #3. |
Fail. I was expecting that Whisk could generate large numbers of humanoid chess pieces, more like what Mr. Wombo generated for the image shown in Figure 4.
|
Figure 6. Generated by Gemini.
|
I asked Gemini to make humanoid chess pieces and got the image shown in Figure 6. Using this slightly longer text prompt, "Generate an image depicting two complete sets of humanoid chess pieces on the chess board. Make every chess piece have the form of a humanoid creature. The chess pieces are all in the shapes of humanoid alien creatures with skin of various colors including green and blue. The pieces are arranged in a mid-game configuration," I got the image shown in Figure 7.
|
Figure 7. Generated by Gemini. |
At this point, I tried replacing the "scene" image used by Whisk. I uploaded a new "scene" that was similar to the image in Figure 5, but with no conventional chess pieces. I changed the description of the storyboard to: "A cosmic chess match unfolds on a starlit board, where humanoid chess pieces, rendered in a realistic style with dramatic lighting and atmospheric effects, engage in a mid-game battle. Two complete sets of humanoid chess pieces occupy a mid-game configuration on a chessboard set against a backdrop of a dark space speckled stars.
Each piece is a unique humanoid figure: a young woman with long, wavy, bright red hair and fair skin in a teal and gold jumpsuit; an avian creature with purple feathers and a beige robe; a purple-skinned humanoid with pointed ears and a dark purple, sparkly outfit; a purple-skinned female with backward-curving horns, yellow-green eyes, and a jeweled headdress; a light purple-skinned elf with platinum blonde hair and dark green armor; a young woman with pointed ears, long white hair, and purple skin in a green and gold outfit; a blue-skinned woman in a red baseball cap, light green shirt, and denim shorts; and a bald, blue-skinned female humanoid partially submerged in water, surrounded by orange coral flowers. Some half-height humanoid figures represent shorter pawn pieces on the board. The lighting is dramatic, with a bright, intense core surrounded by softer, diffused light, creating a sense of depth and smooth rendering with photorealistic precision."
Whisk then generated the "storyboard" shown in Figure 8.
|
Figure 8. Whisk-generated chess storyboard #4. |
|
Figure 9. Whisk-generated chess storyboard #5. |
|
Two scenes.
|
I applied these instructions to the image in Figure 8 and Whisk generated the image in Figure 9. I then tried making a new "style" image, an edited version of Figure 9, covering up the conventional chess pieces (see the image to the right). I tried leaving the older "style" image in place, but only the newer one seemed to be used. The resulting "storyboard" is shown in Figure 10.
|
Figure 10. Whisk-generated chess storyboard #6. |
Golf-chess. One of the subject images that I used was a golfing alien from this blog page. I was amused that Whisk included golf clubs in Figure 10.
|
Figure 11. Whisk-generated chess storyboard #7. |
|
Generated by Mr. Wombo.
|
Once again I applied these instructions, "Place two complete sets of humanoid chess pieces on the chess board. Replace each conventional chess piece with a humanoid creature. The pieces are arranged in a mid-game configuration," and got an image very similar to the one shown in Figure 11. The only edit that I made to Figure 11 was that I spliced in a replacement humanoid (image shown to the right) where Whisk had placed a conventional chess piece.
Continuing the iterative process, I now made the image in Figure 11 the new "scene". Whisk then generated the new "storyboard" image that is shown in Figure 12.
|
Figure 12. Whisk-generated chess storyboard #8. |
Grass. Mysteriously, for the image in Figure 12, Whisk made some of the spaces on the chess board look like plots of grass. However, I was pleased that this image only had one conventional chess piece, at the far left. I suppose the "hot tub" in the lower right corner should be surrounded by a lawn.
|
Generated by Mr. Wombo.
|
I again gave Whisk the instructions: "Place two complete sets of humanoid chess pieces on the chess board. Replace each conventional chess piece with a humanoid creature. The pieces are arranged in a mid-game configuration," and Whisk generated an image close in appearance to the one shown in Figure 13.
The only change I made to the image that is shown in Figure 13 was to chop off the two sides of the Whisk-generated image which had a conventional chess piece in each corner. Sadly, Whisk seemed to forget about the idea of having some shorter "pawn" pieces.
|
Figure 13. Whisk-generated chess storyboard #9. |
|
Figure 14. Whisk-generated chess storyboard #10a. |
|
Figure 15. Whisk-generated chess storyboard #10b |
|
Figure 16. Whisk-generated chess storyboard #11a |
|
The final "scene" uploaded to Whisk.
|
At this point, I made one more new "scene", shown to the right.
Whisk then generated the images shown in Figure 17 and Figure 18, which were both rather amusing.
Once more I provided my "fine tuning" instructions and then Whisk generated the image that is shown in Figure 19, below.
|
Figure 17. Whisk-generated chess storyboard #12a |
|
Figure 18. Whisk-generated chess storyboard #12b |
|
Figure 19. Whisk-generated chess storyboard #12b |
Not expecting an AI to count (Whisk almost got the correct number of playing pieces on the board), I was fairly satisfied with Figure 19, even though Whisk seemed to have no idea what a chess "mid-game configuration" is. My final instructions, provided before Figure 19 was generated, were: "Place two complete sets of humanoid chess pieces on the chess board. Replace each conventional chess piece with a humanoid creature. The pieces are arranged in a mid-game configuration. Shorter humanoids act like pawns in the middle of the board. Place in the background of the rendered image a simple dark star-field." Also, Whisk could never stop producing at least a few conventional chess pieces for these scenes. Perhaps my greatest disappointment was that although every "subject" was female, for this "final" image (Figure 19), Whisk removed all the estrogen.
|
Figure 20. Generated by Gemini.
|
I had Claude describe the scene that is shown in Figure 19. I then provided that Claude-generated verbal description to Gemini and the image shown to the left in Figure 20 was generated by Gemini.
ImageFX made chess-playing alien scenes that were very similar to those made by Whisk.