I see that some posted images have no image generation metadata, yet there are many tags in the list.
How are the tags generated, and how accurate are they?
Posted under General
box_want said:
I see that some posted images have no image generation metadata, yet there are many tags in the list.
How are the tags generated, and how accurate are they?
They are not generated; they are typed in by hand, much like on any other -booru site. They are only as accurate as the people who tagged the image are good at tagging. You can see a post's tag history on the post page by scrolling down to the "History" category and pressing "Tags".
(Everything below is trivia, but if you are curious, carry on reading.)
PS. I said that they are not generated, which is the ideal situation, as AI tags are bad. We previously had a system enabled where users could pick tags from a "Suggested Tags" list, and a bunch of users clicked every single tag on that list. So I guess you could call those posts' tags "generated", and they are not that accurate. Here's an example: look at the tags of post #36004.
As for how those tags were generated/suggested for the Suggested Tags feature: the model was trained on hundreds of thousands of Danbooru posts or so, and the same AI tag model was used over here. It is still in place for features such as searching with the ai: prefix, for example ai:short_hair limit:10.
Inaccuracy reasons:
[1] Its accuracy drops as time goes on, as it was only trained once, and it is super expensive to train it again. It was also only trained on tags with X thousand posts, so it doesn't/didn't suggest any obscure tags.
[2] The suggestions are obviously based on the tags people have manually typed in, so if a tag was misunderstood, or if it's tagged differently now, the AI model will suggest the wrong thing.
[3] The AI does not understand some concepts at all, which is expected. ai:virtual_youtuber limit:5 is a good example. At the time of my search it shows 3 Watson Amelia posts, but also 1 Genshin Impact character and 1 Honkai character. I suppose miHoYo characters visually share design choices with typical virtual youtubers, but the AI does not understand that the tag is not applied on that basis.
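To make the first two reasons a bit more concrete, here is a rough Python sketch. It is not the site's or Danbooru's actual code; the function names, the toy posts, and the scores are all made up for illustration. It only shows the general idea that a tagger trained once on a fixed snapshot can never suggest a tag that fell below its post-count cutoff or that did not exist at training time.

```python
from collections import Counter

def build_vocabulary(tagged_posts, min_posts):
    """Keep only tags that appear on at least `min_posts` training posts.

    `min_posts` stands in for the unspecified cutoff; the value is an assumption.
    """
    counts = Counter(tag for tags in tagged_posts for tag in set(tags))
    return {tag for tag, n in counts.items() if n >= min_posts}

def suggest_tags(scores, vocabulary, threshold=0.5):
    """Return suggestions: only tags in the frozen vocabulary that score high enough."""
    return sorted(
        tag for tag, score in scores.items()
        if tag in vocabulary and score >= threshold
    )

# Toy training snapshot (made-up data, frozen at training time).
training_posts = [
    ["1girl", "short_hair", "smile"],
    ["1girl", "short_hair"],
    ["1girl", "long_hair", "obscure_tag"],
]
vocab = build_vocabulary(training_posts, min_posts=2)

# Hypothetical relevance scores for a new image. A real model would have no
# output at all for tags outside its vocabulary; the filter models that.
scores = {"1girl": 0.97, "short_hair": 0.81, "obscure_tag": 0.99, "new_tag": 0.90}
suggested = suggest_tags(scores, vocab)
print(suggested)
```

Even though obscure_tag and new_tag would fit the image in this toy example, neither can ever be suggested: the first was too rare in the snapshot, and the second did not exist when the vocabulary was built.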
Thank you, Lyren, for the detailed answer. I had assumed that posts without prompts or metadata were being tagged automatically when the uploader did not provide the details or when the image had no generation data. The tags seem quite accurate, at least for the posts where I grabbed all the tags and built a prompt from them, so I suppose the quality depends on which suggested tags the uploader chose.