January 31, 2023
How Our AI System Fights Fraud in International Shipping

In the world of logistics, fraudulent and dangerous packages are among the industry's biggest challenges. That's why a major multinational logistics company turned to BigHub for help with implementing a system for their early detection. With the goal of deploying a solution for real-time evaluation of shipments as they enter the transportation network, our team at BigHub faced several challenges, such as scaling the REST API and managing the machine learning lifecycle. Thanks to our expertise and an agile approach, we successfully delivered a solution that keeps the company one step ahead in the fight against bad actors.


BigHub has a longstanding partnership with a major international logistics firm, during which it has successfully delivered a diverse range of data projects. These projects span data engineering, real-time data processing, and cloud and machine learning-based applications, all designed and developed to enhance the logistics company's operations, from warehouse management and supply chain optimization to the daily transportation of thousands of packages around the globe.


In 2022, BigHub was presented with a new challenge: to help implement a system for the early detection of suspicious or fraudulent shipments entering the company's logistics network. Starting from the client's pilot solution, which had been developed and tested on historical data, BigHub improved the algorithms and deployed them in a production environment for real-time evaluation of shipments as they enter the transportation network. The pilot solution was based on batch evaluation, but our team was asked to build a REST API that could handle individual queries with a response time under 200 milliseconds. This API would be connected to the client's network, where further operations would be carried out on the data.
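To give a flavour of what such a single-query scoring endpoint can look like, here is a minimal sketch using FastAPI and a pre-loaded model. The endpoint name, request fields and model file are illustrative assumptions, not the client's actual interface.

```python
# Minimal sketch of a single-shipment scoring endpoint.
# Hypothetical names and fields; assumes a binary classifier saved with joblib.
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("fraud_model.pkl")  # loaded once at startup to keep per-request latency low

class Shipment(BaseModel):
    shipment_id: str
    origin_country: str
    declared_value: float
    weight_kg: float

class Score(BaseModel):
    shipment_id: str
    fraud_probability: float

@app.post("/score", response_model=Score)
def score_shipment(shipment: Shipment) -> Score:
    # Payload validation is handled by pydantic; derive model features from the request.
    features = [[shipment.declared_value, shipment.weight_kg]]
    proba = float(model.predict_proba(features)[0][1])
    return Score(shipment_id=shipment.shipment_id, fraud_probability=proba)
```

Loading the model once at startup, rather than per request, is one of the simplest ways to stay within a tight response-time budget.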

High-level Architecture

The application's high-level architecture is illustrated in the accompanying diagram. The core of the system is the REST API, which is connected to the client's network to receive and process queries. Incoming queries are validated and evaluated, and the results are returned to the end user. The data layer serves as the foundation for the calculations, as well as for training the models and pre-processing the feature tables. Evaluation results are also stored in the data layer so that summary analyses can be produced in the reporting layer. The MLOps layer manages the lifecycle of the machine learning model, including training, validation, storing metrics for each model version, and making the current model version accessible via the REST API. To achieve this, the solution leverages a variety of modern data technologies, including Kubernetes, MLflow, Airflow, Teradata, Redis and Tableau.
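As an illustration of how an MLOps layer can expose the current model version to the API, here is a small sketch based on the MLflow Model Registry. The tracking URI, model name and use of the "Production" stage are assumptions for the example, not the client's actual configuration.

```python
# Sketch: resolving and loading the model version currently marked as "Production"
# in the MLflow Model Registry, so the REST API always serves the approved model.
# Tracking URI and model name are hypothetical.
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow.internal:5000")  # assumed internal MLflow server

MODEL_NAME = "fraud-detection"

def load_production_model():
    # Look up the latest registered version in the "Production" stage,
    # then load it as a generic pyfunc model for scoring.
    client = MlflowClient()
    versions = client.get_latest_versions(MODEL_NAME, stages=["Production"])
    if not versions:
        raise RuntimeError(f"No Production version registered for {MODEL_NAME}")
    model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/Production")
    return model, versions[0].version
```

Keeping model promotion in the registry means the API can pick up a newly validated version without any change to the serving code.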


During the development of the system, our team needed to address several challenges, including scaling the REST API to meet the sub-200-millisecond response-time requirement and managing the lifecycle of the machine learning models, from training and validation to serving the current version in production.
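One common way to meet such a latency requirement is to pre-compute feature tables in the batch pipeline and serve them from a low-latency store such as Redis, so the API only performs a key lookup per request. The sketch below shows this pattern; the host, key layout and feature contents are illustrative assumptions.

```python
# Sketch: serving pre-computed shipment features from Redis so each scoring request
# stays well within the latency budget (host, key layout and fields are hypothetical).
import json
import redis

r = redis.Redis(host="redis.internal", port=6379, decode_responses=True)

def get_shipment_features(shipment_id: str) -> dict | None:
    # Features are pre-computed by the batch pipeline (e.g. an Airflow job) and
    # written to Redis; the API performs a single key lookup at request time.
    raw = r.get(f"features:{shipment_id}")
    return json.loads(raw) if raw is not None else None
```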


We are proud to have successfully deployed the solution in a production environment within six months. Our ongoing performance monitoring and validation evaluations for 12 origin countries have been successful, and additional countries are gradually being added and tested over time. The goal is to roll out the application globally within the first half of 2023.