Process & Story
Image recognition is always challenging, and it is especially so when the images are poorly labeled or contain a variety of objects. In hospitality, this means developing a model that can correctly identify and categorize photos of hotels. That was the task our team took on as we set out to create an AI solution based on visual image recognition.
Problem
Hospitality platforms deal with hundreds of thousands of images for hotel offers. For each offer, booking engines present a selection of images that range from bedrooms, bathrooms, the lobby, and the pool to the restaurant, window views, and nearby attractions. To provide the best user experience, the platform should show an optimal number of photos from each category so that the hotel offer is presented as well as possible. We’ve observed that booking engine sites handle vast numbers of hotel photos, some of very low quality and some not particularly relevant to travelers. The core problem, however, is that the images lack a description of their content, which results in photos being shown in the wrong order.
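To make the idea of "an optimal number of photos from each category" concrete, here is a minimal sketch of how a gallery could be assembled once every photo has a category. The category names, the quotas, and the quality score are all assumptions made for illustration; they are not taken from the project.

```python
from collections import defaultdict

# Hypothetical display order and per-category limits (illustrative values only).
CATEGORY_QUOTAS = {
    "bedroom": 2,
    "bathroom": 1,
    "lobby": 1,
    "pool": 1,
    "restaurant": 1,
    "view": 1,
}


def select_gallery(photos, quotas=CATEGORY_QUOTAS):
    """Pick up to quotas[category] photos per category, in quota order.

    `photos` is a list of (photo_id, category, quality_score) tuples.
    The category and score are assumed to come from an upstream classifier.
    """
    by_category = defaultdict(list)
    for photo_id, category, score in photos:
        by_category[category].append((score, photo_id))

    gallery = []
    for category, limit in quotas.items():
        # Highest-scoring photos come first within each category.
        ranked = sorted(
            by_category.get(category, []),
            key=lambda item: item[0],
            reverse=True,
        )
        gallery.extend(photo_id for _, photo_id in ranked[:limit])
    return gallery
```

Photos whose category has no quota are simply left out of the gallery, which is one possible way to handle images that are not relevant to travelers.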
To show the right image to the user, the platform must first categorize the image. This can be done either manually by visual content specialists or automatically with the help of machine learning and visual image recognition.
For big hospitality platforms that integrate with various vendors and update their offers daily, a costly manual process cannot keep up. Booking engines are in great need of an artificial intelligence tool that can automatically detect and categorize photos for all their hotel offers.
This process of image classification is a complex task as it not only requires identifying what’s in the image but also understanding the context of the image and how it relates to other images in the same category.
Solution
The solution was to create a machine learning model that could sift through the image data and automatically categorize the photos. The first step was to gather a dataset of images representing the different types of hotels and their interiors found in the hospitality industry. This dataset was used to train a machine learning model that could be used for image recognition.
As this is an image recognition problem, we decided to use the ResNet architecture for the machine learning model. The Residual Network (ResNet) is a type of Convolutional Neural Network (CNN) architecture that addresses the “vanishing gradient” issue by adding skip connections between layers, allowing networks with hundreds of convolutional layers to outperform shallower ones. ResNet architectures, and convolutional neural networks in general, are frequently used in deep learning computer vision applications such as object detection, image classification, and image segmentation.
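The key idea behind ResNet can be shown in a few lines. The following is a minimal sketch of a basic residual block in PyTorch, not the exact block used in our model: the class name and the choice of two 3x3 convolutions with a fixed channel count are assumptions made for illustration.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Basic residual block: output = relu(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        # padding=1 keeps the spatial size, so F(x) and x can be added directly.
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # Skip connection: the input is added back unchanged, which gives
        # gradients a direct path through the block during backpropagation.
        return self.relu(residual + x)
```

Because the block only has to learn a correction on top of its input, stacking many such blocks is less likely to degrade training than stacking plain convolutional layers.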
To build and train the deep learning model, we used the fast.ai library, which provides a higher-level API on top of PyTorch. It is relatively new but already encourages good practices and keeps pace with advancements in deep learning. We took a ResNet model pre-trained on the ImageNet dataset and fine-tuned it on the hotel images dataset.
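A fine-tuning run of this kind can be sketched with fastai's high-level vision API. This is a minimal sketch under stated assumptions, not our actual training script: the function name, the one-subfolder-per-category directory layout, the 20% validation split, the 224-pixel resize, the ResNet-50 depth, and the epoch count are all hypothetical choices.

```python
from pathlib import Path


def train_hotel_classifier(image_dir: str, epochs: int = 3):
    """Fine-tune an ImageNet-pretrained ResNet on a folder of hotel photos.

    Assumes `image_dir` holds one subfolder per category, for example
    image_dir/bedroom/*.jpg and image_dir/pool/*.jpg (hypothetical layout).
    """
    # Imported here so the module can be loaded without fastai installed.
    from fastai.vision.all import ImageDataLoaders, Resize, accuracy, vision_learner
    from torchvision.models import resnet50

    # Labels come from the subfolder names; 20% of images are held out for validation.
    dls = ImageDataLoaders.from_folder(
        Path(image_dir),
        valid_pct=0.2,
        seed=42,
        item_tfms=Resize(224),
    )

    # vision_learner loads pretrained weights by default and replaces the
    # classification head with one sized for the hotel categories.
    learn = vision_learner(dls, resnet50, metrics=accuracy)

    # fine_tune first trains the new head with the backbone frozen,
    # then unfreezes and trains the whole network for `epochs` epochs.
    learn.fine_tune(epochs)
    return learn
```

Calling `train_hotel_classifier("data/hotel_photos")` (a hypothetical path) would return a learner whose `predict` method can then be applied to new images.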
Once the model was trained, we tested it on a new set of images to see how well it could categorize them. The results were promising: the model was able to identify and categorize a variety of hotel images correctly. However, there were still some challenges that needed to be addressed.
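One simple way to look at results on a held-out set is accuracy per category, which shows which kinds of photos the model still confuses. The sketch below assumes the true and predicted labels are available as two parallel lists; the function name is hypothetical and no real results are shown.

```python
from collections import Counter


def per_category_accuracy(true_labels, predicted_labels):
    """Return {category: fraction of its test images classified correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have the same length")

    # Number of test images per true category.
    totals = Counter(true_labels)
    # Number of correctly classified images per true category.
    correct = Counter(
        true for true, predicted in zip(true_labels, predicted_labels)
        if true == predicted
    )
    return {category: correct[category] / totals[category] for category in totals}
```

A category with noticeably lower accuracy than the rest is a natural place to start when collecting more training images or reviewing labels.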
The next step was to create a prototype application that would allow users to upload photos and see the category of each one, or to do the same through our dedicated API endpoints.