Cognitive Services includes:
Transcription (speech to text)
Real-time content moderation
As the volume of media sources increases, the challenge facing owners is to know what content they have and where it is. Whether the content is user generated or simply drives of rushes, all such “unstructured data” remains inaccessible to the business and is slow and painful to locate when needed.
Artificial Intelligence (AI) and Machine Learning (ML) services in the cloud are becoming significant business capabilities, providing a wide range of features such as speech to text, label detection and automatic content moderation (to detect violence and nudity). The challenge is to integrate all these services into the same media repository that houses the content, so that existing metadata can be enriched and users can perform semantic searches across the combined media and metadata.
It is important to integrate these functions in a cost-effective manner, thereby removing the historic concerns of ‘time and budget’ as reasons for a business not to proceed.
Most DAMs that offer cloud integrations make use of the standard “video” based APIs from providers like Amazon and Microsoft. However, the opex costs of running these services can be significant, even prohibitive. Furthermore, the value of such metadata is diminished unless it is returned in a contextual way. The Content Discovery engine we use is different: it has been designed with an understanding of the primary challenges faced by media companies such as broadcasters and post-production facilities, and can therefore solve them.
Building on its tradition of integration, the platform supports a wide range of AI and ML based public cloud environments and uses its Orchestration engine to request only those indexing services required for the content. These “in flight” decisions are based on reviewing the results of previous AI / ML calls, which are then combined with configured business rules to allow more focused and more detailed indexing to occur.
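As a rough illustration of an “in flight” decision, the sketch below takes the labels returned by an earlier indexing call and uses a small rule table to choose which follow-up services to request. The rule table, service names, confidence threshold and function name are all hypothetical, chosen for the example; they are not the product's actual configuration or API.

```python
# Hypothetical business rules: a label found by an earlier, cheaper
# indexing pass maps to the follow-up services worth requesting.
RULES = {
    "person": ["face_detection"],
    "text": ["ocr"],
    "speech": ["speech_to_text"],
}


def plan_next_calls(previous_labels, rules=RULES, min_confidence=0.7):
    """Return follow-up services to request, without duplicates.

    previous_labels maps a label to the confidence reported by an
    earlier AI / ML call. Labels below min_confidence are ignored.
    """
    planned = []
    for label, confidence in previous_labels.items():
        if confidence < min_confidence:
            continue  # too uncertain to justify a further paid call
        for service in rules.get(label, []):
            if service not in planned:
                planned.append(service)
    return planned


# Results of an assumed first pass over one asset.
first_pass = {"person": 0.92, "text": 0.40, "speech": 0.81}
next_calls = plan_next_calls(first_pass)
print(next_calls)
```

Here only face detection and speech to text would be requested, because the “text” label falls below the assumed confidence threshold, so no OCR call is paid for.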
Another key feature of the Content Discovery engine is that it does not need to send whole chunks of video into the cloud. Instead, it automatically extracts full-frame still images from content (e.g. once every second) and submits those to the cloud using image APIs rather than video APIs. There are two benefits to this approach: firstly, it is more cost effective; secondly, and more importantly, image APIs are a lot “richer” in features than their corresponding video APIs. The engine automatically combines the resulting metadata back into the asset, understanding where objects appear and disappear within the timeline, and provides a rich contextual experience at a fraction of the cost of using the standard video APIs.
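One way to picture how per-frame results can be folded back into a timeline is sketched below. It assumes each sampled frame has already been sent to an image API and returned a set of labels; the function name, the input shape and the one-second sampling interval are assumptions made for the example, not a description of the engine's internals.

```python
from collections import defaultdict


def labels_to_segments(frame_labels, interval=1.0):
    """Merge per-frame labels into (start, end) segments per label.

    frame_labels is a list of (timestamp_seconds, set_of_labels),
    sorted by time, with one entry per sampled frame. A label's segment
    is assumed to last until one sampling interval after the last frame
    in which it was seen.
    """
    segments = defaultdict(list)
    open_since = {}  # label -> start time of its currently open segment
    last_seen = {}   # label -> timestamp of the last frame containing it

    for timestamp, labels in frame_labels:
        # Close the segment of any label that is missing from this frame.
        for label in list(open_since):
            if label not in labels:
                start = open_since.pop(label)
                segments[label].append((start, last_seen[label] + interval))
        # Open new segments and refresh the last-seen time.
        for label in labels:
            open_since.setdefault(label, timestamp)
            last_seen[label] = timestamp

    # Close whatever is still open at the end of the content.
    for label, start in open_since.items():
        segments[label].append((start, last_seen[label] + interval))
    return dict(segments)


frames = [
    (0.0, {"car"}),
    (1.0, {"car", "person"}),
    (2.0, {"person"}),
    (3.0, set()),
    (4.0, {"car"}),
]
segments = labels_to_segments(frames)
print(segments)
```

With this sample input, “car” is present from 0 to 2 seconds and again from 4 to 5 seconds, and “person” from 1 to 3 seconds, which is the kind of appear/disappear information a timeline view needs.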
By combining speech to text with label detection, as well as existing ancillary files such as subtitles, still images and PDFs, the platform provides smart, contextual content recommendations that match the current search. Results are weighted by relevance and can easily be exported for use within edit environments and elsewhere.
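A minimal sketch of relevance weighting across combined metadata sources follows. The source names, the weights and the simple word-count scoring are illustrative assumptions only; the actual ranking method is not described in this text.

```python
# Hypothetical weights per metadata source; higher means a match in
# that source counts for more.
WEIGHTS = {"transcript": 1.0, "labels": 0.8, "subtitles": 0.6, "documents": 0.4}


def score_asset(asset_metadata, query_terms, weights=WEIGHTS):
    """Sum weighted counts of query terms across an asset's metadata."""
    score = 0.0
    for source, text in asset_metadata.items():
        words = text.lower().split()
        hits = sum(words.count(term.lower()) for term in query_terms)
        score += weights.get(source, 0.0) * hits
    return score


def rank_assets(assets, query_terms):
    """Return asset ids with a non-zero score, most relevant first."""
    scored = [(score_asset(meta, query_terms), asset_id)
              for asset_id, meta in assets.items()]
    return [asset_id for score, asset_id in sorted(scored, reverse=True)
            if score > 0]


assets = {
    "clip_a": {"transcript": "the goal was scored late", "labels": "stadium crowd"},
    "clip_b": {"labels": "goal net ball", "subtitles": "what a goal"},
    "clip_c": {"transcript": "weather report"},
}
ranking = rank_assets(assets, ["goal"])
print(ranking)
```

In this example clip_b ranks above clip_a because the term is matched in two sources, and clip_c is left out because it has no match.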