AIBooru

On the future of partial metadata

Posted under Tags

Given the lengthy argument on the Discord over this, I believe it is worthwhile to bring it to the forum to reach a general consensus on what the tag should or should not govern. I'll present the basic argument and leave it to the rest of the users here to judge it. Please do not argue with other users here about their views; I simply want to ensure that this tag is no longer ambiguous.

The basic argument is that partial metadata should not cover posts containing only the positive prompt if the model used is a proprietary model where the end user cannot specify other values; in that case the metadata is already "complete" as far as partiality goes. For moderation this makes sense, since the tag is meant for metadata that is actually partial. SD and NAI expose many parameters to the end user, and these need to be documented for the metadata to be complete. DALL-E, Midjourney, Nijijourney and others offer the end user little control beyond the positive prompt and a few other features. To view posts whose metadata is truly partial, I would have to filter out these posts, even though nothing more can be done for them.
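One way to picture this proposal is as a rule that depends on what each generator lets the user set. The sketch below is only an illustration: the generator keys, the field names, and the lookup table itself are assumptions made for the example, not how AIBooru works or an authoritative list of what each service exposes.

```python
# Hypothetical table: which fields a user can actually set per generator.
# The entries follow the framing of this post and are illustrative guesses.
USER_SETTABLE = {
    "stable_diffusion": {"prompt", "negative_prompt", "cfg_scale", "seed", "steps", "sampler"},
    "novelai": {"prompt", "negative_prompt", "cfg_scale", "seed", "steps", "sampler"},
    "dall-e": {"prompt"},
    "midjourney": {"prompt"},
    "nijijourney": {"prompt"},
}


def should_tag_partial(generator, provided_fields):
    """Tag as partial only if a field the user could have set is missing."""
    expected = USER_SETTABLE.get(generator)
    if expected is None:
        # Unknown generator: we cannot say what "complete" means, so do not tag.
        return False
    provided = set(provided_fields)
    if not provided:
        # No metadata at all is a different case from partial metadata.
        return False
    return not expected.issubset(provided)
```

Under this rule a Midjourney post with only a prompt would not be tagged, while a Stable Diffusion post with only a prompt would be.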

I would like to hear people's thoughts on the tag, as I do not want how it should or should not be used to remain a lingering issue; if it stays ambiguous, it would make more sense to nuke the tag outright. So I would like to see whether a general consensus can be reached on how the tag ought to be used for posts made with proprietary models.

The issue with assuming that there are parameters that we aren't being given in regards to closed black-box image generators is that we actually have no idea if there are missing parameters or even what they are. We don't understand Midjourney or Nijijourney architecture because they do not release any information on it. For all we know, these truly are the entirety of the parameters that you input into their program. They're called closed-source for a reason.

While Stable Diffusion makes up the vast majority of the image set here, we should not assume all image generators are based on Stable Diffusion. If we can't even define which parameters these image generators use, we have no reason to know or even suspect that anything is missing.

Metadata in Stable Diffusion isn't going to be definitive, either-- much can be lost from the initial gen after inpainting. As much as I try to include all of the relevant metadata in my posts, there are times where I absentmindedly upload an image after inpainting and upscale and find I no longer have the original after cleanup (take post #33451 as an example, where I inpainted the face of the second subject but deleted the rest of the prompt to do so). Never mind preserving the original seed after multiple inpaints or manipulation in Photoshop.

The data can be manipulated with EXIF editors as well; we really cannot assume everything with metadata here is accurate. Does anyone here have motivation to manipulate metadata? Probably not. But I think it should be treated as a "nice to have" rather than a definitive reference for generation prompt techniques and trends.

The techniques of modern image generation and the involvement of closed source ecosystems simply make most metadata shaky in reliability at best.

Therefore I suggest the partial metadata tag be nuked, since there is no point in documenting incomplete metadata if it is impossible to determine what complete metadata is for any kind of image generation, open source or not.

azreturned said:

Metadata in Stable Diffusion isn't going to be definitive, either-- much can be lost from the initial gen after inpainting. As much as I try to include all of the relevant metadata in my posts, there are times where I absentmindedly upload an image after inpainting and upscale and find I no longer have the original after cleanup (take post #33451 as an example, where I inpainted the face of the second subject but deleted the rest of the prompt to do so). Never mind preserving the original seed after multiple inpaints or manipulation in Photoshop.

The data can be manipulated with EXIF editors as well; we really cannot assume everything with metadata here is accurate. Does anyone here have motivation to manipulate metadata? Probably not. But I think it should be treated as a "nice to have" rather than a definitive reference for generation prompt techniques and trends.

The techniques of modern image generation and the involvement of closed source ecosystems simply make most metadata shaky in reliability at best.

Therefore I suggest the partial metadata tag be nuked, since there is no point in documenting incomplete metadata if it is impossible to determine what complete metadata is for any kind of image generation, open source or not.

Complete metadata is still nice to have if it's an SD model or a NAI model. I already more or less defined what counts as partial on the wiki page itself when I rewrote it a bit. Of course, you're correct for anything that isn't Stable Diffusion, which is why I only defined it for SD models. However, SD is more "open" than a lot of AI image models out there, so it would be worthwhile if the metadata could at least give a very good view of what it took to generate the image. That is precisely why I pushed so hard for metadata fields here: so that I could define more parts of the metadata and people could recreate my images easily. Of course, the only way people can truly recreate an image is if they have the same GPU as the original generation or if the original generation used the CPU for randomization. Yet, it is still important because it can give people an idea of how to go about generating their own images and what settings to start from. If you look at my oldest posts, I was using a neg prompt that was pretty much copy/pasted from some post I have since forgotten.

Lopi999 said:

Yet, it is still important because it can give people an idea of how to go about generating their own images and what settings to start from. If you look at my oldest posts, I was using a neg prompt that was pretty much copy/pasted from some post I have since forgotten.

I'll admit, I have used posts on AIBooru for this same purpose. But I still think it should be considered a "nice to have" and doesn't deserve a dedicated tag. Since my post I also noticed you made BUR #327 to have partial_metadata imply metadata_request. Wouldn't your BUR make this whole thread pointless if partial_metadata is rolled into another tag?

azreturned said:

I'll admit, I have used posts on AIBooru for this same purpose. But I still think it should be considered a "nice to have" and doesn't deserve a dedicated tag. Since my post I also noticed you made BUR #327 to have partial_metadata imply metadata_request. Wouldn't your BUR make this whole thread pointless if partial_metadata is rolled into another tag?

I don't think of it as simply a nice-to-have; I see it as a case where you either give it all or give nothing. Sure, the base metadata fields we were stuck with until we could customize metadata further didn't get you 100% of the way to regenerating an image, especially without the ability to add variation seeds and clip skip, but they're good enough to begin the process of recreation, or at least to create similar images. Metadata should at least provide the bare minimum needed for an image to be replicable, even if CFB says that's not really viable. It is mainly meant to be helpful and, again, to let newcomers to Stable Diffusion see what kinds of prompts they can test out. I distinctly remember a guy who had trouble replicating one of my images; I just assumed he didn't use CPU randomization, but of course I can't know that.

The BUR is simply meant to help set the tag in stone, and I think nuking is officially off the table now that we seem to have a more general consensus on how to use this tag.

Ocean3 said:

@azreturned the tag is useful as is for sifting out posts with partial metadata, so why would it need to be nuked? Say a user searches with `has:metadata` and finds an image with a positive prompt, but the post is missing other elements that would be useful, such as CFG scale, steps, etc.

...

And again @Lopi999, other than GPU and CPU randomization, there are hundreds of other factors. For example, there are now many minor settings in webUI/Forge/Comfy etc. that alter seed/image replication: how samplers function, the software version, compatibility settings, and so on.

If you acknowledge the endless possibilities of metadata prompts, where is the help in flagging what prompts are complete and which are not? How could completeness ever be defined without further input or knowledge from the artist of the image? If I genned an image with a CFG of 30 with CFG Fix on, but inpainted with CFG Fix off, that metadata would never be included unless I went out of my way to include it. To anyone else but me, is that metadata "complete?" Who would know?

My point is that there is no help to be found in such a tag, because it is impossible to define "complete" metadata; Stable Diffusion works with and without many parameters. You could hypothetically gen without a negative prompt in SD and no one would be the wiser; that is what pony_diffusion_xl was trained to do. Therefore the tag should be nuked, as it does not provide value to users of the site.

Attention to the completeness of AI metadata is unhelpful and a distraction for newcomers. I agree the metadata on AIBooru is invaluable and brings attention to tools and techniques one could use to improve genning. I myself have used AIBooru to this end. If any tagging should be done, it should be tagging the tools and techniques found in the metadata, like FreeU or Soft Inpainting.

Ocean3 said:

I think you've got a limited pov there tbh and are only looking at things from your own use.

So instead of answering where there is value in this you want to attack my perspective instead.

There are many uses for the tag, both now and in the future. The tag is not 'confusing' for new users as there is little to no focus given to it and is a small pool atm anyways.

If there is no focus for it, there is no point for it. It is a waste of time. It is equally as wasteful to try to define completeness of metadata.

Searching with it as a negative tag (-partial_metadata) is already niche for a non-new user. On the technical side, someone who may be using a lot of images for parameter sourcing/referencing, downloading the images, etc. etc. would be able to sift out unwanted posts that lack what they are looking for. Should be pretty self-explanatory.

This I might be able to agree with, if it weren't for my previous misgivings about defining the completeness of metadata. If you can define what constitutes complete metadata, particularly in regard to extensions and tools like FreeU or Soft Inpainting, I would concede this matter to you.

I also don't find this tag particularly useful. The average user has no use for it, and advanced users can filter through metadata in a more effective manner otherwise. It feels like a weird stopgap that nobody will use. Tagging it for the sake of tagging it, and not because anybody will use it.

azreturned said:

So instead of answering where there is value in this you want to attack my perspective instead.

If there is no focus for it, there is no point for it. It is a waste of time. It is equally as wasteful to try to define completeness of metadata.

This I might be able to agree with, if it weren't for my previous misgivings about defining the completeness of metadata. If you can define what constitutes complete metadata, particularly in regard to extensions and tools like FreeU or Soft Inpainting, I would concede this matter to you.

Look at the wiki page; I have already basically written what counts as metadata being "complete". If a post is missing anything that's listed there, then it's partial.

Well, it looks like @Ocean3 has dipped out of the conversation, but I am hoping for their return and some positive discussion going forward on this.

I was looking around at some meta tags and discovered the partial_commentary tag, which parallels partial_metadata. It is dawning on me what @Lopi999 is trying to accomplish; however, the issue I have remains unaddressed. I believe the potential parameters that could be passed into metadata are too numerous to ever define what can be considered "complete." Unlike partial_commentary, where a complete translation is easily identifiable, partial_metadata assumes that the bare minimum of parameters needed to generate an image in SD is enough to assert completeness.

Additionally, limiting the scope of what is "complete metadata" to only the parameters Stable Diffusion produces may not be portable should another open source image/video generator rise to prominence.

As an example of why I'm so bothered by limiting "complete metadata" to just the bare minimum, consider comparing this image (Image 1) with the same image generated with PAG (Image 2). The work PAG puts in alone is enough to significantly improve the appeal of Image 2-- enough to say Image 2 would lose its quality without PAG. That being said, I would assert that "complete" metadata for this image must include the PAG parameters, and only I, the uploader, would know that information. Because only I am privy to it, how could anyone judge that Image 2 without PAG metadata deserves the partial_metadata tag? That knowledge is available only to the original uploader/artist, and the tag does not help anyone else categorize or pull data from the site.

Sure, the majority of images already tagged with partial_metadata do have many parameters missing. If advanced users are going to be scraping metadata from the site, it would be best to search with the dedicated AI metadata search tool (which will automatically exclude missing parameters) or to use the API.

azreturned said:

Well, it looks like @Ocean3 has dipped out of the conversation, but I am hoping for their return and some positive discussion going forward on this.

I was looking around at some meta tags and discovered the partial_commentary tag, which parallels partial_metadata. It is dawning on me what @Lopi999 is trying to accomplish; however, the issue I have remains unaddressed. I believe the potential parameters that could be passed into metadata are too numerous to ever define what can be considered "complete." Unlike partial_commentary, where a complete translation is easily identifiable, partial_metadata assumes that the bare minimum of parameters needed to generate an image in SD is enough to assert completeness.

Additionally, limiting the scope of what is "complete metadata" to only the parameters Stable Diffusion produces may not be portable should another open source image/video generator rise to prominence.

As an example of why I'm so bothered by limiting "complete metadata" to just the bare minimum, consider comparing this image (Image 1) with the same image generated with PAG (Image 2). The work PAG puts in alone is enough to significantly improve the appeal of Image 2-- enough to say Image 2 would lose its quality without PAG. That being said, I would assert that "complete" metadata for this image must include the PAG parameters, and only I, the uploader, would know that information. Because only I am privy to it, how could anyone judge that Image 2 without PAG metadata deserves the partial_metadata tag? That knowledge is available only to the original uploader/artist, and the tag does not help anyone else categorize or pull data from the site.

Sure, the majority of images already tagged with partial_metadata do have many parameters missing. If advanced users are going to be scraping metadata from the site, it would be best to search with the dedicated AI metadata search tool (which will automatically exclude missing parameters) or to use the API.

Here's the problem: you have to define complete metadata in a way that wouldn't result in almost every single post on this site being partial metadata. If I included that PAG stuff in the definition, well, literally every single post would have partial metadata and you'd have to ask so many people about it that it wouldn't be worth it. Complete metadata was defined on the wiki under partial metadata in a way that accounts for older posts, where you could originally only enter the prompt, neg prompt, CFG, seed, steps, sampler and model hash. For newer posts, of course, you now have metadata fields, but we have to account for these older posts. It's just the way it has to be, and to me at least, the metadata is "complete" when all of those fields are given. Don't take "complete metadata" as meaning "100% completeness"; only uploaders like me would do that with metadata.
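As a rough illustration of the baseline described above, a check over those seven fields might look like the sketch below. The dictionary keys (`prompt`, `negative_prompt`, and so on) and the function names are hypothetical; they are not AIBooru's actual field names or the wiki's wording, and treating "nothing provided" as a separate case from "partial" is an assumption of this example.

```python
# Hypothetical field names for the baseline listed in this post; AIBooru's
# real metadata keys may differ.
BASELINE_FIELDS = (
    "prompt",
    "negative_prompt",
    "cfg_scale",
    "seed",
    "steps",
    "sampler",
    "model_hash",
)


def missing_baseline_fields(metadata):
    """Return the baseline fields that are absent or empty in `metadata`."""
    missing = []
    for field in BASELINE_FIELDS:
        value = metadata.get(field)
        # Treat None and blank strings as missing; 0 is a valid numeric value.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def is_partial(metadata):
    """Partial here means some, but not all, baseline fields are present."""
    missing = missing_baseline_fields(metadata)
    return 0 < len(missing) < len(BASELINE_FIELDS)
```

Note that a deliberately empty negative prompt would be flagged as missing by this check, which is the kind of ambiguity raised earlier in the thread.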

Lopi999 said:

Here's the problem: you have to define complete metadata in a way that wouldn't result in almost every single post on this site being partial metadata. If I included that PAG stuff in the definition, well, literally every single post would have partial metadata and you'd have to ask so many people about it that it wouldn't be worth it.

Yes, that is the problem I have: I don't think 100% complete metadata can be defined. Also, I will clarify: I'm not suggesting PAG should be under the definition of 100% completeness. I'm saying that because of the very existence of technology and tools like PAG, we wouldn't be able to define 100% completeness. But more on that...

Complete metadata was defined on the wiki under partial metadata in a way that accounts for older posts, where you could originally only enter the prompt, neg prompt, CFG, seed, steps, sampler and model hash. For newer posts, of course, you now have metadata fields, but we have to account for these older posts.

With regard to older posts, if metadata is missing, it is because the generation parameters are not present in the EXIF metadata of the uploaded image. A tag for missing EXIF data already exists: metadata_request. Old post or not, partial_metadata doesn't add any more value than metadata_request does.

It's just the way it has to be, and to me at least, the metadata is "complete" when all of those fields are given. Don't take "complete metadata" as meaning "100% completeness"; only uploaders like me would do that with metadata.

And here, I think, is where we are at an impasse. I'm convinced we need a strict definition of what is considered complete. I believe strict definitions make tidy Boorus, and (in my opinion) an untidy Booru will unmake its community. I still think the tag should be nuked, but I think more discussion should be had before any BUR.

I think I've said and hashed out all I really need to say on the matter (perhaps I've said too much). I'm open to any more ideas or defense of the partial_metadata tag, but I will probably lurk this thread for the time being.

azreturned said:

I still think the tag should be nuked, but I think more discussion should be had before any BUR.

I mean, you can make a BUR requesting whatever you want. If you want an easy way to directly see what people are feeling about your idea (by way of voting), it probably wouldn't hurt to give it a try.

BUR #331 is pending approval.

deprecate partial_metadata

Some time has passed with no further discussion, so I believe it is appropriate to poll the community on the idea. The reasoning for deprecating partial_metadata is simple: it is redundant. What is considered partial or complete is ambiguous. This tag won't help new users filter out images with missing metadata. In my opinion, this tag adds no value to AIBooru. Meta tags need to be kept tidy; I do not believe we should let our meta tagging conventions sprawl out of control.

I know I mentioned nuking before, but I think deprecation would be a better approach. It leaves the option to undeprecate the tag should its definition and purpose become precise.
