Cookbook

Audio Intelligence at Scale: Automate Insights from Meetings, Calls, and Interviews

If you're drowning in audio recordings from meetings, sales calls, or interviews, you're not alone. Many professionals struggle to efficiently process and extract value from hours of spoken content. Mantis AI solves this problem by automating the extraction of intelligence from audio files, saving you time and unlocking insights that would otherwise remain buried. In this post, we'll explore how you can use Mantis to transform your audio content into actionable data.
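
As a rough illustration of the pattern Mantis automates, here's a minimal sketch that transcribes a recording and then asks an LLM for insights, using the Groq SDK directly rather than Mantis's own API; the file name and model choices are assumptions.

```python
# Sketch of the transcribe-then-summarize pattern (not Mantis's API).
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# 1. Transcribe the recording (file path and model id are assumptions).
with open("sales_call.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        file=("sales_call.mp3", audio_file.read()),
        model="whisper-large-v3",
    )

# 2. Ask an LLM for the insights you care about.
summary = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "Summarize the call and list action items."},
        {"role": "user", "content": transcript.text},
    ],
)
print(summary.choices[0].message.content)
```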

How to Make Any Image Speak to the Visually Impaired

Imagine a world where anyone, regardless of their ability to see, can hear a detailed description of an image. This Large Language Model (LLM)-based solution does just that: it transforms images into spoken descriptions, making them accessible to people who are visually impaired. Let's explore how it works and how you can use it.
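
Here's a hedged sketch of the pipeline: a vision model writes the description and a text-to-speech library reads it aloud. The model id and the use of gTTS for speech are assumptions, not necessarily what the full post uses.

```python
# Image-to-speech sketch: describe with a vision model, then speak with TTS.
import base64
import os
from groq import Groq
from gtts import gTTS  # any TTS engine would work here; gTTS is an assumption

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Encode the image so it can be sent inline as a data URL.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="llama-3.2-90b-vision-preview",  # model id is an assumption
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail for a visually impaired listener."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
description = response.choices[0].message.content

# Convert the description to audio and save it for playback.
gTTS(text=description, lang="en").save("description.mp3")
```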

Stop Wasting Hours on Manual Data Entry Forever

If you're tired of manually entering data from documents or forms, you're not alone. Many businesses and organizations handle large volumes of paperwork by hand. Large Language Models (LLMs) can solve this problem by automating the extraction and processing of information from documents. In this post, we'll explore how you can use LLMs to streamline data entry, saving time and reducing errors.
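
As a minimal sketch of the idea, the snippet below feeds a document's text to an LLM and asks for structured JSON back; the field names, model id, and file path are assumptions.

```python
# LLM-based data entry sketch: document text in, structured JSON out.
import json
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# OCR output or plain-text form content (file path is an assumption).
document_text = open("invoice.txt").read()

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    response_format={"type": "json_object"},  # ask for machine-readable output
    messages=[
        {"role": "system", "content": "Extract vendor, date, line_items, and total as JSON."},
        {"role": "user", "content": document_text},
    ],
)
record = json.loads(response.choices[0].message.content)
print(record)
```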

How to Make Your Images Speak Multiple Languages

Are you looking to enhance your application's accessibility or localization by providing image descriptions in multiple languages?

With Groq’s fast inference and the llama-3.2-90b-vision model, you can generate detailed, accurate image descriptions in English, Spanish, German, and more.

This implementation allows you to upload an image, convert it to base64 format, and request descriptions in multiple languages. Perfect for projects where visual content needs to be understood globally!
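
A minimal sketch of that flow, assuming the Groq Python SDK and the preview id of the vision model: encode the image as base64, then request a description once per target language.

```python
# Multilingual image descriptions: one request per target language.
import base64
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Convert the uploaded image to base64 so it can travel as a data URL.
with open("product.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

for language in ["English", "Spanish", "German"]:
    response = client.chat.completions.create(
        model="llama-3.2-90b-vision-preview",  # exact model id may differ
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": f"Describe this image in {language}."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    print(f"--- {language} ---")
    print(response.choices[0].message.content)
```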

Step-by-Step Guide to Building Visual Conversation Apps

Ever wished you could have a conversation with a Large Language Model (LLM) about the images you see? With recent improvements in LLM technology, this is now possible. You can show an image to an LLM, ask questions about it, and get answers in real time. It's like chatting with a smart assistant that can "see" the picture and understand it.

In this post, we'll walk you through a simple setup that lets you start a visual conversation with an LLM, using just an image and your questions. You'll learn how to set up the system and ask the LLM anything you like about the image.
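
Below is a hedged sketch of such a setup, assuming the Groq Python SDK and a vision-capable model id: the image goes into the first message, and each follow-up question is appended to the same history.

```python
# Visual conversation loop: send the image once, then keep chatting about it.
import base64
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

with open("scene.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# The first user turn carries both the opening question and the image.
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is happening in this picture?"},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
    ],
}]

while True:
    response = client.chat.completions.create(
        model="llama-3.2-90b-vision-preview",  # model id is an assumption
        messages=messages,
    )
    answer = response.choices[0].message.content
    print("Assistant:", answer)
    messages.append({"role": "assistant", "content": answer})

    question = input("You: ")
    if not question:  # empty input ends the conversation
        break
    messages.append({"role": "user", "content": question})
```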