@Swayam4414 informed us today that they would like to use StanfordAIMI/CheXagent-8b but are unable to feed it images.
We explained that this is because a model marked as text-generation accepts only text, whereas a model in the image-text-to-text category can process both images and text.
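To illustrate the distinction, here is a minimal sketch of how a task category could gate the inputs a model accepts. The `ACCEPTED_INPUTS` table and the `accepts_images` helper are hypothetical names made up for this example; they are not part of any library or of our actual implementation.

```python
# Hypothetical lookup table (illustration only): task category -> input types it accepts.
ACCEPTED_INPUTS = {
    "text-generation": {"text"},
    "image-text-to-text": {"text", "image"},
}


def accepts_images(task: str) -> bool:
    """Return True if a model in the given task category can take image input."""
    # Unknown categories are treated as accepting no inputs at all.
    return "image" in ACCEPTED_INPUTS.get(task, set())


print(accepts_images("text-generation"))
print(accepts_images("image-text-to-text"))
```

Under these assumptions, a model listed only as text-generation would reject image input even if the underlying weights are multimodal, which matches what was reported.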
As a result, we'll be expanding the capabilities of text-generation models that support multimodal inputs, starting with StanfordAIMI/CheXagent-8b and then moving on to others that may meet the criteria.