Artificial Intelligence (AI) is one of the areas that has increasingly entered our lives in recent years. Having machines that in some ways mimic how humans reason and make decisions, from the simplest to the most complex, opens up a world of possibilities to transform and improve areas such as health, agriculture, space exploration, sports, the automotive industry, and others. Although we might not realize it yet, AI is already very present in our everyday lives. For example, social media platforms use so-called recommendation systems that, in short, prioritize the content they believe the user will prefer. The decision to present a certain type of content to a given user takes into account their personal data (age, gender, etc.) and the interactions they have with the different types of content on the platform. In addition to social media, personal assistants (e.g., Google Assistant, Siri, Alexa), smartphones, cars, and many other products already have some form of AI embedded in them. The presence of AI in our lives will continue to grow as we better understand how it works and how it can be applied in different scenarios.
Whether it's a chatbot, optical character recognition (OCR), text analysis, facial recognition, or any other type of AI task, we are seeing increasing demand for these AI capabilities in the solutions our customers have us build, as they become more aware of AI in general and the benefits it can bring. With this in mind, we decided to spend some of our weekly self-learning time developing a framework that can tackle some of these AI tasks, so that we can easily deploy it across multiple projects.
In this article, we are going to share with you our progress, how we approached the development of this framework, what technologies we used, what tasks we wanted the framework to accomplish, what concepts we learned, and what difficulties we encountered along the way.
Before we started developing the framework, we had to decide which tasks we wanted it to perform in this first phase, which technologies we would use for development, and what the framework's architecture would look like.
In terms of tasks, we decided that the framework should be able to extract entities and intents from arbitrary text and perform predictive maintenance. We chose entity and intent extraction because it is one of the core tasks within Robotic Process Automation (RPA). Essentially, RPA consists of software that can be easily programmed to perform routine activities, currently carried out by human workers, in a controlled, flexible, and scalable manner. This relieves employees from repetitive and error-prone tasks and allows them to focus on more meaningful work, which ultimately helps organizations become more productive and efficient. The decision to include predictive maintenance in the initial phase of the framework stemmed from the fact that modern equipment is fitted with numerous sensors, and we felt it was important for our framework to be able to receive data from these sensors and forecast when a piece of equipment will need maintenance in order to avoid future failures.
To better understand these two tasks and get a better idea of how to solve them, we had to dive deeper and explore some of the approaches the community uses to tackle them. Entity and intent extraction is a subfield of natural language processing (NLP), a branch of artificial intelligence whose main goal is to give computers the ability to understand text and spoken words in much the same way humans do. To solve NLP-related tasks, we found the following concepts and approaches:
(Transformer architecture)
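To make the transformer approach a bit more concrete, here is a minimal, purely illustrative sketch of transformer-based entity extraction using the Hugging Face `transformers` library and a public pretrained NER model. Neither the library nor the model is part of our framework; they simply show the idea of reusing a pretrained transformer for entity recognition.

```python
# Illustrative only: entity extraction with a pretrained transformer model.
# "dslim/bert-base-NER" is a publicly available NER model on the Hugging Face Hub,
# used here purely as an example.
from transformers import pipeline

ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

# Each result groups the sub-word tokens back into a full entity span.
for entity in ner("Alan Turing worked at the University of Manchester."):
    print(entity["word"], entity["entity_group"], round(entity["score"], 3))
```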
Predictive maintenance combines data science and predictive analytics to estimate when a device might fail so that corrective maintenance can be planned before the point of failure. The goal is to schedule maintenance at the most convenient and cost-effective moment, in order to maximize the life of the equipment. Predictive maintenance can be framed in two ways: classification and regression. In the first, we try to predict whether a failure is imminent within the next n time steps; in the second, we try to predict how much time is left until the next failure. To solve predictive maintenance problems, we can use classic machine learning algorithms such as Support Vector Machines (SVM) or K-Nearest Neighbors (k-NN or KNN), or we can follow the deep learning route and use auto-encoders, recurrent neural networks (RNN), and long short-term memory (LSTM) networks. To give you a little more context, we will give you a brief summary of how the previously mentioned algorithms work; a small sketch of the classification framing follows the figures below.
(High-level overview of an Auto-encoder model)
(Structure of an LSTM network)
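To illustrate the classification framing described above, here is a minimal, hypothetical sketch (not the framework's actual code). It labels each window of sensor readings with whether a failure occurs within the next n steps and trains an SVM, one of the classic algorithms mentioned earlier, on synthetic stand-in data.

```python
# A minimal sketch of predictive maintenance as binary classification:
# given features summarizing a window of sensor readings, predict whether a
# failure happens within the next n time steps. Data here is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for real sensor data: each row is one window of readings
# (e.g. temperature, vibration, pressure) reduced to six numeric features.
X = rng.normal(size=(1000, 6))
# Label: 1 if the equipment fails within the next n steps, 0 otherwise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale the features and fit an SVM classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```

The regression framing would look almost identical, except the target would be the remaining time until the next failure and the classifier would be replaced by a regressor.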
As for the implementation of the tool, we decided to approach it by abstracting away the use of the most common open-source machine learning libraries. With this approach, we can quickly add as many new problems to solve as we want, and we can choose any library without having to commit to a specific one. We achieved this by creating a common interface that every library-specific implementation follows, so the tool behaves the same regardless of the problem being solved.
With this in mind, the following diagram shows the chosen architecture:
The ProblemSolver is the main class and the entry point through which the developer interacts with the tool. You pass it the problem you are trying to solve, the data to be used, and the format of that data. Internally, the data is converted into a pandas DataFrame that is consumed by the concrete implementation of the ProblemInterface, which defines what a problem looks like. For a given problem, you can train a custom model, which is assigned an ID and saved to disk as a pickle, or you can make predictions using an existing model from disk.
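To give an idea of what this abstraction can look like in practice, here is a simplified, hypothetical sketch; class and method names are illustrative and do not necessarily match the framework's actual API.

```python
# Hypothetical sketch of the abstraction described above; names are illustrative.
import pickle
import uuid
from abc import ABC, abstractmethod

import pandas as pd


class ProblemInterface(ABC):
    """Common interface every concrete problem implementation follows."""

    @abstractmethod
    def train(self, data: pd.DataFrame):
        """Train a model on the given data and return it."""

    @abstractmethod
    def predict(self, model, data: pd.DataFrame):
        """Run predictions with a previously trained model."""


class ProblemSolver:
    """Entry point: receives the problem, the data, and the data format."""

    def __init__(self, problem: ProblemInterface, data, data_format: str):
        self.problem = problem
        # Internally the data is normalized into a pandas DataFrame.
        self.data = pd.DataFrame(data) if data_format == "records" else pd.read_csv(data)

    def train(self) -> str:
        model = self.problem.train(self.data)
        model_id = str(uuid.uuid4())
        # Trained models are assigned an ID and persisted to disk as pickles.
        with open(f"{model_id}.pkl", "wb") as f:
            pickle.dump(model, f)
        return model_id

    def predict(self, model_id: str):
        # Load a previously trained model from disk and run predictions.
        with open(f"{model_id}.pkl", "rb") as f:
            model = pickle.load(f)
        return self.problem.predict(model, self.data)
```

A concrete implementation, for example an entity-extraction problem, would then implement train and predict behind this interface, and the developer would only ever interact with the ProblemSolver.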
The problems currently being solved are entity extraction, binary classification, and regression.
In terms of technologies, we decided to use Python for the development of our framework, as it is one of the most widely used programming languages in general and is a staple for machine and deep learning work. For the entity extraction problem we used spaCy, a widely used open-source NLP library with a feature that particularly interested us and that is not very common among NLP libraries: support for the Portuguese language out of the box. For the binary classification and regression problems, we used PyCaret, another widely used open-source library with modules for classification, regression, NLP, anomaly detection, and many other tasks.
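As a quick illustration of the kind of entity extraction spaCy enables, here is a minimal example using its Portuguese pipeline; the sentence and the model size are arbitrary choices, and this is not the framework's own code.

```python
# Illustrative only: named-entity extraction with spaCy's Portuguese pipeline.
# The small model must be downloaded first: python -m spacy download pt_core_news_sm
import spacy

nlp = spacy.load("pt_core_news_sm")
doc = nlp("A Maria mudou-se para Lisboa em 2020 e trabalha na Microsoft.")

# Print each recognized entity and its label (e.g. PER, LOC, ORG).
for ent in doc.ents:
    print(ent.text, ent.label_)
```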
As mentioned before, we wanted the freedom to choose which libraries to use in the concrete implementations, since some are better suited to one type of task than others, and we also wanted an easy and straightforward way for developers to interact with the framework. Since the concrete implementations always sit behind the proposed abstraction layer, developers do not need to worry about future updates to those implementations and can easily integrate new ones, as they will always follow the abstraction layer. It is important to highlight that, while the framework is already able to provide helpful solutions to our customers, its development is far from over. We intend to continuously improve it so that it can quickly answer the most prevalent machine learning tasks and be swiftly incorporated into the products our customers have us build. As future steps, we intend to add new tasks, such as OCR, image classification, object detection, and object tracking, and to improve the results obtained in the tasks mentioned in this article. Stay tuned for upcoming articles on the updates to this framework. 🙂