Image Description
The image description analysis task uses an AI model to generate a textual description of an image or video frame.
IMPORTANT: Image description can take a significant amount of processing time when running on a CPU. With the pre-trained model ImageDescription_LLaVA-v1.6-Mistral-7B-HF.dat, producing a description for a single image takes several minutes.
IMPORTANT: Image description requires a significant amount of memory. OpenText recommends that your Media Server has at least 64 GB of RAM when you use the pre-trained model ImageDescription_LLaVA-v1.6-Mistral-7B-HF.dat.
| Configuration Parameter | Description |
|---|---|
| GPUDeviceID | The device ID of the GPU to use. |
| Input | The image track to process. |
| MaxOutputTokens | Limits the length of the text produced by the AI model. |
| Model | The name of the model to use to generate image descriptions. |
| Region | A region of the image or video to restrict analysis to. |
| SampleInterval | The interval at which frames are selected to be analyzed. |
| SyncDatabase | Specifies whether to synchronize with the training database before beginning the analysis task. |
| Type | The analysis engine to use. Set this parameter to ImageDescription. |
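
The parameters above might be combined in a task section along the following lines. This is a minimal sketch, not a verified configuration: the section name ImageDesc, the input track name Default_Image, the MaxOutputTokens value, and passing the model by its file name are all illustrative assumptions. Only the parameter names, the Type value, and the model file name come from this page.

```
// Hypothetical task section; the name "ImageDesc" is an assumption.
[ImageDesc]
// Selects the image description analysis engine (value taken from the table above).
Type=ImageDescription
// Assumed input track name; use the image track that your ingest engine produces.
Input=Default_Image
// Pre-trained model named on this page; whether a file name or a path is expected is an assumption.
Model=ImageDescription_LLaVA-v1.6-Mistral-7B-HF.dat
// Illustrative limit on the length of the generated text.
MaxOutputTokens=100
```

Because a single description can take several minutes on a CPU, a low MaxOutputTokens value and a sparse SampleInterval are reasonable starting points when you analyze video.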
Output Tracks
| Output track | Description |
|---|---|
| Data | Contains a record, with a textual description, for each processed image or video frame. |
| DataWithSource | The same as the |