GR-183 - Deep Learning Search Engine

Document Type

Event

Start Date

28-4-2022 5:00 PM

Description

Deep learning information retrieval (IR) is a booming area of research. Work in this field focuses on retrieving the most relevant search results based on the meaning of the query, not just its keywords. State-of-the-art approaches generally take existing deep neural networks (such as the Universal Sentence Encoder or Google's BERT) and train them to rank search results. However, some search engines rely on text matching algorithms such as BM25 (Best Match 25), which score documents using term frequency and other word statistics and work surprisingly well. The most common problem in machine learning is overfitting: even if a model gives better search results for programming-related questions, that does not mean it will work for searching research articles. Machine learning-based models are also often slow and do not scale well. In this project, we tackle these problems by combining deep neural networks with approximate nearest neighbor search to build a search engine that accurately ranks results for a given query while remaining fast and scalable. The model accepts images as well as text as input; image input is passed through an image captioning model that generates text for the search. This work uses the 200,000 Jeopardy! Questions and MS COCO datasets.
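As a concrete illustration of the ranking pipeline described above, the sketch below encodes documents with a BERT-style sentence encoder and retrieves them through an approximate nearest neighbor index. The specific libraries and model name (sentence-transformers, hnswlib, all-MiniLM-L6-v2) are illustrative assumptions; the abstract only specifies a deep encoder such as the Universal Sentence Encoder or BERT together with ANN search.

```python
# Sketch: dense retrieval with a sentence encoder + approximate nearest neighbor search.
# Library and model choices (sentence-transformers, hnswlib) are assumptions for
# illustration; the project only states that a deep encoder and ANN search are used.
import hnswlib
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any BERT-style encoder works

# Toy corpus standing in for the Jeopardy! question collection.
docs = [
    "This U.S. state is home to the Grand Canyon.",
    "Python decorators wrap a function to extend its behavior.",
    "MS COCO is a large-scale object detection and captioning dataset.",
]
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

# Build an HNSW index over the normalized embeddings (cosine space).
index = hnswlib.Index(space="cosine", dim=doc_vecs.shape[1])
index.init_index(max_elements=len(docs), ef_construction=200, M=16)
index.add_items(doc_vecs, list(range(len(docs))))

def search(query: str, k: int = 2):
    """Rank documents by semantic similarity to the query."""
    q_vec = encoder.encode([query], normalize_embeddings=True)
    labels, distances = index.knn_query(q_vec, k=k)
    return [(docs[i], 1.0 - d) for i, d in zip(labels[0], distances[0])]

print(search("canyon in Arizona"))
```

Because the documents are embedded once offline and only the query is encoded at search time, ranking reduces to a single ANN lookup, which is what keeps this approach fast and scalable.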
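For comparison, the BM25-style text matching mentioned above can be reproduced with a term-frequency-based scorer. The rank_bm25 package and the whitespace tokenizer used here are convenience assumptions, not necessarily what the project used.

```python
# Sketch of a BM25 (Best Match 25) baseline; rank_bm25 is an assumed convenience
# library, and the whitespace tokenizer is deliberately simplistic.
from rank_bm25 import BM25Okapi

docs = [
    "This U.S. state is home to the Grand Canyon.",
    "Python decorators wrap a function to extend its behavior.",
    "MS COCO is a large-scale object detection and captioning dataset.",
]
tokenized = [d.lower().split() for d in docs]
bm25 = BM25Okapi(tokenized)

query = "grand canyon state".split()
print(bm25.get_scores(query))            # per-document BM25 scores
print(bm25.get_top_n(query, docs, n=1))  # best-matching document text
```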
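For image queries, the abstract describes running the image through an image captioning model and then searching with the generated text. A minimal sketch follows; the Hugging Face image-to-text pipeline and checkpoint are assumptions standing in for whatever captioning model the project trained or used on MS COCO, and the file path is hypothetical.

```python
# Sketch: turn an image query into text with an image captioning model, then
# reuse the text search path. The pipeline and checkpoint below are illustrative
# assumptions, not necessarily the project's own captioning model.
from transformers import pipeline

captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")

def image_query_to_text(image_path: str) -> str:
    """Generate a caption for the query image; the caption becomes the search query."""
    result = captioner(image_path)
    return result[0]["generated_text"]

# Hypothetical usage, feeding the caption into the text search function sketched above:
# caption = image_query_to_text("query.jpg")
# results = search(caption)
```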
