Me and The Machine

interaction, illustration, machine learning
project overview
An investigation and self-reflection on my relationship with AI image generation and recognition.
details
tools
Google Colab, Stable Diffusion, Teachable Machine
deliverables
AI Generation Model
AI Recognition Model
Process Blog
Case Study
timeline
Fall 2023
initialization
"in fact, computers don't know shit"
Since the rise of AI image generation, I have kept a close eye on its development, whether I wanted to or not. Discussions over the ethics of training AI without artists' consent happened frequently among my acquaintances on Twitter, and while I understood what was happening to an extent, I was relatively lax about having my work trained on. I treated it as a "Yeah, that was bound to happen," and I didn't want to bother trying to fight the inevitable.
On the other hand, I was absolutely tired of hearing how AI could replace my entire career in design. The endless drivel from AI enthusiasts showing off their work as if it could even compare to human-made work drove me up the wall. It did spark a small worry, though. What if AI were able to get to a working state? How accurate is it now, and how accurate will it be in another year? Another month? These thoughts dug at my mind as I started this project.
We started by using Teachable Machine to get acclimated to machine learning models and artificial intelligence. My immediate thought was to try to create an "artificial version of myself." Something about making flawed versions of me felt very appealing, so as a simple starter project I decided to create a FUNNY or UNFUNNY image classifier. By feeding the AI a small dataset of what I found humorous, I got it to "assume" (more on this later) what I would think of other images.
Unsurprisingly, the AI was inaccurate (because I find this image extremely funny). Because of all the :naruhodo: images in the dataset (see The Grand Naruhodo Count), the model was heavily biased toward finding black-and-white images FUNNY. In comparison, there were many pictures of people in the UNFUNNY category, which led realistic photos to be sorted together there.
Over the course of the project introduction, we continued to read about and experiment with AI. An in-class demo creating a snake game with image recognition controls showed me how fast AI was to train with relatively good results, but also how it still had glaring flaws.
I already knew much of the information from the lectures through exposure on social media, but they reinforced the negative aspects of AI and model training, such as the intensive carbon emissions and the disregard for privacy and image rights. What was new to me, though, was the negative impact across the entire chain, leading all the way back to the labeling of images.
There were also projects like Triple Chaser that put AI to legitimately positive use, in contrast to the many intrusive and abusive AI projects currently in circulation.
Takeaways
conceptualization
how does AI impact me?
The next few days of class were spent discussing different topics around AI, analyzing AI anatomy, and questioning whether certain designs should be using AI at all. All in all, I continued to develop my existing negative opinion of AI, which capitalizes off of others' work in the dataset-gathering, labeling, and generation processes. While the technology itself was very interesting, 99% of the time it was in the hands of bad actors.
I started developing ideas for my final model. Building on my earlier fun with creating "flawed mes," I brainstormed many different ideas that involved the creation of a false self through assumptions and the correction of myself through active monitoring.
While many of these involved my physical body or my daily routines, they didn't feel personal - it wasn't like I had a stake in the situation. The idea that really captured my attention, though, involved AI art. With the sudden rise of AI art generation through DALL-E and Midjourney, my hobby as an artist on Twitter has seen numerous ups and downs, with human artists constantly fighting against "AI Artists" (those air quotes should show where I stand in this conversation). At the time, I held the opinion that my art getting trained on was inevitable and that I couldn't do much to fight it. I hold a generally negative opinion of AI art generation, but I don't involve myself in online discourse due to my lack of knowledge on the topic, though I have dabbled a bit in Stable Diffusion and DALL-E to stay aware of the technology.
My plan was to gather a collection of art I had drawn so that I could train a Stable Diffusion style LORA to generate bootleg versions of my own art. I would then train a second, image recognition model in Teachable Machine on my art and the AI-generated art, to see if it could detect whether an input piece was drawn by me or not. AI counter-programs like Glaze and Nightshade had always interested me, and I knew of other AI image detectors, but I was curious how accurate they could be. With this concept in place, I was ready for feedback from my peers.
Alfredo's Feedback
“Conversating if AI can replicate or learn to understand a style from an artist and recreate/copy it. Creates an interesting aspect of being the data that is being fed into this machine to create an output, yet also being guided by the machine. I find the aspect that you are creating the art that is being fed into the ai model to be really interesting to follow and wondering if the art you create for this will be affected by the project. Will some of the artwork exaggerate certain features or styles to make it more apparent for the machine to learn? I wonder what the machine will exaggerate or highlight about your style that makes it so distinct.”
Alice's Feedback
“Interesting how you're examining how AI art recognizes art "style"/legitimacy — I could see some long term implications if style recognition turns positive, such as features to generate AI art in a specific (person's) style, or if AI can be tricked into believing an AI generated work is original artwork...? There could be lots of debate on that...unless this is a feature that already exists, oops. This makes me wonder if AI can recognize file metadatas as well…”
A lot of this feedback raised interesting questions that I would like to revisit upon completing this project. I wasn't sure if Teachable Machine could read metadata, but I certainly believe some AI out there can read the metadata of images.
Takeaways
Postmortem Questions
manifestation
creating the monster
With critique in mind, I began work on training my first model, with classes all being workdays from this point on. I already had Stable Diffusion web UI, a free interface for Stable Diffusion that runs locally, installed from previous AI escapades, so my first step was to gather and label my training dataset.
To keep the style consistent, I decided to pick around 50 drawings from 2022-2023 to train on. I knew that more images would make the training process take significantly longer, and I was also curious how accurate the model could be on a small dataset. Because of how LORA training works, though, I still had to train on top of a base style checkpoint - in my case Anything 3.0, along with whatever base Stable Diffusion was trained on - which made me feel conflicted about this step. I was going about training my style LORA ethically by only using my own art, but the fact that I still had to use others' work as a foundation unsettled me slightly.
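For the curious, here's a minimal sketch of how a dataset like this could be staged, assuming a kohya-style trainer that reads "<repeats>_<concept>" folders (all paths and names below are hypothetical, not my actual setup):

```python
# A minimal sketch of staging a small LORA dataset, assuming a
# kohya-style trainer that expects "<repeats>_<concept>" folder names.
# All paths and names here are hypothetical.
import shutil
from pathlib import Path

src = Path("art_2022_2023")            # hypothetical folder of source PNGs
dst = Path("train_data/10_mystyle")    # 10 repeats of each image per epoch
dst.mkdir(parents=True, exist_ok=True)

for img in sorted(src.glob("*.png"))[:50]:   # cap the set at ~50 drawings
    shutil.copy2(img, dst / img.name)

print(f"Staged {len(list(dst.glob('*.png')))} images for training")
```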
The next step was using built-in tools to label my art. Since Anything 3.0 was trained using booru (image board, often anime-focused) tags, I would have to do the same, using BLIP captioning. The process was simple: I just had to throw my images in, tweak some numbers, and wait for the program to process everything. Roughly 30 minutes later my images were done, and the results were pretty surprising. Overall the tags were fairly accurate, with only one misgendered subject as a noticeable error. What was more significant, though, were some of the tags themselves. The AI seemed to need to mention breasts, cleavage, and large breasts, which does follow the average human booru tagging habits, but was still funny to look at regardless.
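A rough standalone equivalent of that captioning step, sketched here with Hugging Face's BLIP models rather than the exact bundled tool I used (paths are hypothetical):

```python
# A rough standalone equivalent of the bundled BLIP captioning step,
# using Hugging Face transformers (not the exact tool I used).
from pathlib import Path
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

for path in sorted(Path("train_data/10_mystyle").glob("*.png")):
    image = Image.open(path).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=50)
    caption = processor.decode(out[0], skip_special_tokens=True)
    # kohya-style trainers read captions from a sidecar .txt per image
    path.with_suffix(".txt").write_text(caption)
    print(path.name, "->", caption)
```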
Crawford argues in Atlas of AI that this categorization comes from the bias of the AI designers, handing them the grand role of judge on their own: "Designers get to decide what the variables are and how people are allocated to categories. Again, the practice of classification is centralizing power: the power to decide which differences make a difference." In a similar vein, Danbooru and other booru sites tend to have image tags reviewed only by the poster, leaving everyone as the judge of their own uploads - and the uploader is often not the original creator.
With tagging done, I moved on to training the model. After running some programs for an hour, I was met with the issue that I was simply too broke to train a model locally. My GPU, a 4GB 1050 Ti, was too weak and was actively rejected by the training program, which surprised me, since I had assumed it would just let me run the job longer to achieve the same result. I ended up using a Google Colab cloud GPU to train my model, which took only an hour to complete. The entire process was well documented, which reinforced my belief that AI is actually fairly easy to train because of all the community-built tools available.
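If you want to check whether your own card will make the cut before wasting an hour, a quick VRAM check looks something like this:

```python
# A quick sanity check for available VRAM (works locally or in Colab).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU available")
```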
As the model trained, it would spit out sample images for each training epoch, and I was already shocked by the midpoint. Some aspects were still rough, but the model was able to capture how I drew eyes and separated hair strands. The training continued to epoch 15, at which point the influence of my style became too strong and began creating distortions.

At this point I was already in a state of horror and amazement, but I proceeded to test the model at different epochs and strengths to find the best combination. With the model complete, I switched to my local install of Stable Diffusion and began generating strength/epoch plots to see which combination gave the most accurate style imitation while avoiding the distortions seen earlier. This process took several hours, so I left it to run overnight; I woke up to the completed tests and observed the results.
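I generated the grids through the web UI itself, but the same sweep could be scripted against its local API, assuming it was launched with the --api flag (the epoch-numbered LORA filenames below are hypothetical, modeled on my own naming):

```python
# A minimal sketch of a strength/epoch sweep via the Stable Diffusion
# web UI's local API (assumes the server was started with --api; the
# LORA filenames are hypothetical).
import base64
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"
EPOCHS = [6, 9, 12, 15]
STRENGTHS = [0.4, 0.6, 0.8, 1.0]

for epoch in EPOCHS:
    for strength in STRENGTHS:
        prompt = (
            "masterpiece, best quality, 1girl, solo, white background, "
            f"<lora:frostillust-{epoch:06d}:{strength}>"
        )
        # fix the seed so cells of the grid are comparable to each other
        r = requests.post(URL, json={"prompt": prompt, "steps": 20, "seed": 42})
        image_b64 = r.json()["images"][0]
        with open(f"grid_e{epoch}_s{strength}.png", "wb") as f:
            f.write(base64.b64decode(image_b64))
```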
Overall, the lower-right quadrants of both tests looked the most accurate, but some of the minor details were more interesting to me. The generations at the far bottom right were actively taking full inspiration from specific training pictures, such as a two-character composition or the position of an arm. This was fascinating, as I felt like I was finally grasping how AI "copies" from its dataset.
At this point, I realized that I had included images I had drawn of my friends' characters. Even though I was still the artist, and this model would never interact with anything outside of my computer, I felt bad about using even my friends' ideas to train my model. Regardless, I pushed forward with testing.
Takeaways
generation
stitching together amalgamations of my art
Now that training was done, I began to experiment with various prompts. After generating a few one-off images from random prompts, I had the idea to feed the model the captions generated from its own dataset and see what came out.
What came out were what I can only describe as bastardizations of my own work. The AI was desperately trying to replicate my compositions, but it could only go so far. The AI attempting to write words and ending up with meaningless glyphs was pretty interesting, but more concerning was the "sexification" of women in a lot of the generations. I could only assume this was due to the influence of the Anything 3.0 model I had trained on top of, since it drew from boorus that usually contain a large amount of NSFW artwork.
More experimentation with my style LORA at strengths far higher than recommended resulted in interesting artwork corruption. Some of the results were repulsive, but I recognized that this kind of inhuman distortion could be harnessed as its own work of art. The aforementioned "sexification" also seemed to increase at higher strengths.
As a test, I presented a grid of one of my own pieces and 24 AI artworks based on it to people unfamiliar with my art. I could notice the differences in how certain parts were drawn, but around half of my participants guessed wrong.
The more I generated, the more I noticed that the generated images seemed to be pulling most of their "inspiration" from 4 specific images in the original dataset. This was honestly extremely damning for my view of how AI steals artwork. Some pieces had basically the EXACT same composition as my own, which was incredibly alarming to see.
Finally, to prepare for the image recognition model, I used variable prompts to generate a variety of pieces for training, which took several hours. Every one gave me an uncanny valley feeling as I scrolled through hundreds of bootleg artworks. Each was just slightly off, and many had an extra degree of sexualization. An overall sense of dread grew from all of my testing: what if there was just that "one guy" who decided to try to train on and profit off my work? 50 images was all it took to achieve a relatively similar style.
Takeaways - Trainer
Takeaways - Trainee
identification
rock'em sock'em robots - 2023 edition
Like my previous explorations with Teachable Machine, getting the skeleton of the model trained was very simple. After noticing the 4 "inspiration" pieces, I wanted my recognition model to detect whether an image was:
- My own art
- AI-generated art, split into:
  - Generations with elements of one of the 4 inspiration pieces
  - Miscellaneous AI generations
- Not my art
I then went through and manually sorted all of my art, AI generations, and random other images to go in each category. Once they were all put into Teachable Machine, I waited for the model to finish training, put in one of my sample AI generations, and…
Success! It was able to identify the generated image as inspired by one of my pieces. I tested the rest of my categories and all was going well until I started running into false positives. Most of the time the model could tell what a piece of AI art was inspired by, but the main source of false positives was generated art being incorrectly identified as my own. I assumed this was because I didn't have enough of my own art to supply to the dataset, so I instead continued generating even more AI images to help improve the accuracy.

"Prompt - “masterpiece, best quality, 1girl, solo,, {blonde hair|green hair|blue hair|red hair|white hair|black hair}, {short hair|long hair|very long hair} ,white background, {yellow eyes|green eyes|blue eyes|red eyes|black eyes}, {hair ornament|hat|circlet} < lora:frostillust-000012:0.7 >,”
A lot of these generations turned out sexualized despite me not including many pieces of that nature in my dataset. How much of this was due to my own work and how much of this was due to the influence of the Anything 3.0 model I had trained on?
After copious cycles of generation and retraining, I ended up with a fairly high detection accuracy. With my model working, I moved on to building my web interface. I added the model link to a p5.js + ml5.js project, put in a basic image upload option, and tested once more…
There was an immediate issue. While the final classification was often the same between my project site and Teachable Machine, the output confidence levels were slightly different. Further testing yielded even stranger results: the same image in the two interfaces could give me completely different outputs.
Upon further investigation, I found that this was due to Teachable Machine resizing and cropping images to 224x224px for both training and detection, whereas my interface was detecting based on the full image. Copying the cropped image from Teachable into my interface yielded the same result.
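To see what the model was actually looking at, the center-crop and downscale to 224x224 can be reproduced offline; a small sketch with Pillow (the filename is hypothetical, and the exact resampling filter may still differ from Teachable's):

```python
# A small sketch reproducing a center-crop + resize to 224x224, roughly
# what Teachable Machine feeds its model (filename is hypothetical).
from PIL import Image, ImageOps

img = Image.open("test_artwork.png").convert("RGB")
# ImageOps.fit crops to the target aspect ratio around the center,
# then resizes to exactly 224x224
model_input = ImageOps.fit(img, (224, 224))
model_input.save("test_artwork_224.png")
```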
I then tried numerous fixes, all to no avail, including:
- Cropping images before detection
- Resizing images before detection
- Using Teachable's boilerplate code
Specifically, even though I was resizing the image before recognition, that change wasn't reflected in the final output. And even if it had been, p5.js's downscaling differs from Teachable Machine's, which would still lead to different results. Overall, compared to around 95% accuracy with Teachable, I only had about a 70% accuracy rate with my own interface, and adding 100 more images to the dataset didn't help.
After all of this, I'd had it. I was done with AI. I could get to this 99% point but I couldn't ever reach perfection. I gave up. I may have bested the AI at making better art, but it beat me in how frustrating it is to train.
Amy's Feedback
“It's scary to think that AI is able to copy your art style from 54 pieces of original work, and I find it troubling that it blatantly copies the compositions and the color palette. It's cool that you did a two-step process of making AI art and then using a model to discern AI art from original work. That training model seems to have pretty good results which hopefully helps deter AI theft. I wonder if that brings you any comfort? How can future designers of AI use AI to enforce boundaries and regulations?”
Alfredo's Feedback
“Through the exploration of this ai design it became more eye opening just how fast the machine you trained was able to copy and replicate certain aspects of your art style. The message within your ai model/presentation I found your contradiction and reflection on teaching this machine while also going against what this machine is a really strong message. Its basically a question of how ai is affecting art and the creative process and where it could take this skill while being a reflection of the emotion behind the art/artist.”
Future Developments
Takeaways - Trainee
contemplation
what am i left with?
Overall, I was actually fairly surprised at the strength of AI when it came to the generation side of the project. The fact that it only took 50 images to replicate my style was astonishing and quite violating, and I wondered how much it really needed, since I had read that some LORAs could accurately copy characters with just 10-15 images. The accuracy of the AI detection model also surprised me a bit, though I did know that at the pixel level, AI artwork has inherent qualities that make it easy to detect even when they aren't visible to the human eye. The inconsistency between Teachable Machine and ml5.js was also slightly surprising, with results varying wildly even though it was the same image at different resolutions.
In general, AI's affordances tended to push me toward feeding it more and more training data to create a more accurate result, which I noticed primarily with Teachable Machine. The inherent loss of control once the final outcome is spit out, along with the specifics of the training data used (images, resolution, colors), also encouraged a lot of fiddling with different aspects of the AI to get it to behave how I wanted. This was reflected in my use of Stable Diffusion as well, where I had to run multiple epoch/strength tests to see which version of my model produced the most stable results.
I think the most ethical use of AI in art creation is for very early concept drafting. Like looking for reference images on websites, the AI could help get some very basic ideas out, but in the end the artist still puts in most of the work. Otherwise, I feel that using AI later in the process leads to "non-authentic work" from an artist.
Revisiting some of my questions from earlier...
- I don't think the way I draw changed much after starting this project, but I do feel much more aware of what makes my art "my style" - from the eyes, to how I separate chunks of hair, to my preference for drawing neutral-faced women. In this respect, AI is an interesting way to explore what makes one's art their own at a fundamental level.
- The image recognition AI did end up misidentifying some AI-generated images as my own, but overall that was due more to the translation to ml5; the Teachable Machine-hosted version worked surprisingly accurately - better than some human reviewers.
- I think the implications of counter-AI are bright. With my new perspective on having my art trained on, programs like Glaze and Nightshade to protect my own art, and AI image detectors like the one I made to protect viewers, I can definitely keep some peace of mind.
This endeavor definitely pushed me to use Glaze and Nightshade, whereas before I hadn't felt the need to. Reading an article on Nightshade, I echo the statement by illustrator Eva Toorenent that "It is going to make [AI companies] think twice, because they have the possibility of destroying their entire model by taking our work without our consent." Hopefully, through this pushback, artists and AI companies will come to an agreement on respecting artists' work, but I do feel this problem is systemic, extending out to people not respecting artists in general.
As this project was wrapping up, I happened to stumble upon something extremely relevant. I found that a person in a community for the game I drew most of my art for was creating LORAs of the characters from that game. This immediately set me off, as the sub-community had always touted respect for creators and their work.

Looking at some of the sample images, I could already recognize compositions made by other fan artists. This made me suspicious, so I questioned the creator on where they were sourcing their dataset images from: official art or fan artists?
Their answer? Both.
I am a fairly significant content creator in the community (humble brag), so the chance that I was part of that dataset was very likely.
And that disgusted me.
After all I had experienced generating my own art, I didn't want my work defiled by AI. The fact that "that guy" I thought wouldn't exist now did made me angry. On top of this, the LORAs were publicly available, meaning my work was being spread without my consent in a manner I didn't appreciate.
I didn't want to cause a big stink in another person's sub-community, but I made sure to express my disdain for what was happening from my perspective as an artist. I was able to convince the creator to take down public access, but the response from the sub-community admin was appalling.
While they weren't in 100% support of AI art generation, the fact that they still accepted it in the space despite my discomfort was upsetting. I had worked with this person on commissions and they had respected my work, so hearing them take a stance like that stung all the more.
I did end up downloading some of the LORAs to see what I could recognize in their generative outputs. While I wasn't able to identify any of my own art, I did recognize copious fan art and official art from over the years. Like my own model based on Anything 3.0, these models also seemed to carry some inherent sexualization, generating alternative costumes with more skin showing. What was incredibly concerning was that there were LORAs of child characters with the same degree of sexualization, which delves into a whole different realm of AI image generation abuse: illegal pornography and highly specialized fetishes.
With the existence of this bastardization, I was very glad I had taken the time to investigate how generative AI pulls from its dataset. While it isn't "stealing" per se, and artists do take reference from other art, the degree to which the AI pulls from source images is far too great for my liking.
In addition, though the creator did relent and prevent further spread by privating the LORAs, I am now more aware of the limitations of AI image detectors after developing my own. Though they are accurate, they are only accurate to an extent. False positives are bound to happen, and I don't want to make false accusations against real artists, so I feel I must tread carefully. There is also the ethics of using human-made art to train an AI image detector, but in this case I feel it is justified: it works to the benefit of the artist, and there isn't a way to replicate their work through the interface the way an image generation model does.
Crawford brings up that "In this sense, AI systems are expressions of power that emerge from wider economic and political forces, created to increase profits and centralize control for those who wield them." This current realm of AI image generation and social media artists is one of constant strife as artists continuously fight for fair ownership rights but are usually powerless against the large corporations who only seek to profit off their work and disrespect the craft of art. This goes all the way down to the average consumer level, where AI enthusiasts and small businesses generate without thought or care for the artist's role in the work or even the state of the final product.
Overall, my entire experience - from learning more about image generation and identification to my eventual confrontation with "that one guy" - was very eye-opening and has led me to take a much harder stance against AI image generation. Whereas before I was somewhat ambivalent about its existence, learning how easy it is for someone to replicate my style without my consent and distribute it to make content I don't approve of has made my view of AI far more negative. The technology itself is still interesting, and I did get some enjoyment out of number-fiddling, but the number of bad actors using AI today has led me to more actively condemn AI "artists".
And to conclude this ramble on my ever-growing hatred of the weird and twisted users of AI, I leave you with a self-indicating AI-generated piece from my experiments.
works cited