[Everyone’s AI] Explore AI Model Jina: Neural Search Framework

Today’s topic is Jina, a neural search framework related to a theme from Google I/O 2021.

Jina is a Neural Search Framework that provides large-scale indexing and querying of different kinds of data, including video, images, text, music, source code, and PDFs. As an example of Jina, we’ll show you the Cross-Modal Search System, a model that takes either text or an image as input: enter text to obtain a picture that matches it, or insert an image to obtain a description of that image.

You can try Jina right away by clicking on the following links.

Demo page of Jina: https://master-jina-dleunji.endpoint.ainize.ai/

GitHub Repository of Jina: https://github.com/jina-ai/jina

“Being helpful in moments that matter” was the theme of this year’s Google I/O Conference. Many attendees seemed to be interested in “Responsible next-generation AI”, so I’d like to introduce a model related to it.

Although chatbots have made great strides in the field of natural language processing in recent years, they are still trained only on text. Communication between people, however, takes the form of images, text, audio, and video. For that reason, the keynote presented MUM (Multitask Unified Model), a model meant to let people naturally ask questions about various types of information. If, for example, you ask MUM, “Show me where the lion roars at sunset,” you can get a video that shows the exact moment the lion roars.

Google Keynote (Google I/O ‘21)

What I’m going to introduce today is Jina, a search framework that can search images, text, audio, and video, as in the example above.


Jina is a Neural Search Framework based on deep learning. Traditionally, search relied on Symbolic Search, in which a machine is given a set of rules on how to interpret data. This method is time-consuming because the entire set of rules has to be written by hand. Neural Search was developed to deal with this issue. It is a search method that uses pre-trained neural networks to understand the data without requiring all the rules, and it improves over time.


With Jina, you can search for various types of data, including photos, videos, and audio. It also exposes interfaces such as a REST API and gRPC so that it can be used in a cloud environment. More information about Jina can be found at this link.


Here’s an example that uses Jina.

Build A Cross-Modal Search System: To Look For Images From Captions

This is a model that allows users to search for images with captions. When you input an image, it is encoded into a vector and compared with the data set, which was encoded into vectors beforehand, to find the most similar image. When you input text, it is likewise converted into a vector and compared with the previously calculated image index vectors to find the most similar image.
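The encode-and-compare steps described above can be sketched as follows. This is only a minimal illustration, not Jina's actual implementation: the three-dimensional vectors are hand-made stand-ins for what a real image or text encoder would produce, and the names `build_index` and `search` are hypothetical. It assumes cosine similarity as the measure of "most similar".

```python
import numpy as np

def normalize(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def build_index(image_vectors):
    """Store the pre-encoded image vectors in normalized form (hypothetical helper)."""
    return normalize(np.asarray(image_vectors, dtype=np.float64))

def search(index, query_vector, top_k=3):
    """Return (image position, similarity score) pairs, best match first."""
    query = normalize(np.asarray(query_vector, dtype=np.float64).reshape(1, -1))[0]
    scores = index @ query                 # cosine similarity to every indexed image
    order = np.argsort(-scores)[:top_k]    # highest scores first
    return [(int(i), float(scores[i])) for i in order]

# ASSUMPTION: toy embeddings standing in for the output of a real encoder.
image_names = ["lion.jpg", "beach.jpg", "city.jpg"]
image_vectors = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]
index = build_index(image_vectors)

# ASSUMPTION: pretend this is the embedding of the caption "a lion roaring".
text_query_vector = [0.8, 0.2, 0.1]
results = search(index, text_query_vector, top_k=2)
best_name = image_names[results[0][0]]
print(best_name, results)
```

In this toy data set the caption vector lies closest to the vector for `lion.jpg`, so that image ranks first. An image query works the same way: encode the image into a vector and pass it to `search` instead of the text vector.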

  • Try a demo page of Jina

We’ll look at a demo page provided by Ainize to see how the model works.

Use the Search Box to find a photo matching a specific text or image. You can try the demo here. Besides cross-modal search, you can also use other demos such as object search and GIF search.

  • Try an API of Jina

This time, let’s search for an image using the API provided by Ainize. You can try it with the Open API in the demo.
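As a rough sketch of what such an API call could look like from Python, the snippet below only builds a search request and does not send it. The base URL is the demo link from this post, but the route `/api/search` and the payload fields `top_k` and `data` are assumptions made for illustration; check the Open API page of the demo for the real route and schema.

```python
import json
import urllib.request

# Base URL taken from the demo link in this post.
BASE_URL = "https://master-jina-dleunji.endpoint.ainize.ai"
# ASSUMPTION: hypothetical route; the real one is listed on the demo's Open API page.
SEARCH_PATH = "/api/search"

def build_search_request(query_text, top_k=5):
    """Build (but do not send) a JSON POST request for a text query."""
    # ASSUMPTION: hypothetical payload fields, used here for illustration only.
    payload = {"top_k": top_k, "data": [query_text]}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + SEARCH_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request("a lion roaring at sunset", top_k=3)
print(request.get_method(), request.full_url)

# To actually send it (needs network access and the real route and schema):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the request construction separate from the network call makes it easy to swap in the real route and fields once you have confirmed them.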

A search engine built this way would look quite different from the ones we use today. I expect this type of neural search engine to be quite useful in a number of fields, such as education and entertainment.

As new Jina models are released, I will share them with you. Stay tuned!


  1. Google Keynote (Google I/O ‘21)
  2. Jina GitHub
  3. Jina Demo