Automotive Industry — EOE Data Science & ML LifeCycle

Zeeshan Malik
7 min read · Mar 6, 2021

In the automotive industry today, almost every common car is technologically sophisticated, bristling with multiple microprocessors and millions of lines of software code.

The IoT is progressing quickly and leading us toward autonomous vehicles that were once considered a vision of the future. IoT has also enabled improved transportation efficiency, a superior driving experience and advanced vehicle management capabilities, with the help of state-of-the-art decentralized edge computing architectures and the power of 5G. The intensive use of IoT devices in the automotive industry is driving several notable trends, including adaptive and predictive maintenance, driver monitoring systems, road condition analytics, digital cockpit solutions and advanced driver assistance systems (ADAS).

  1. Predictive maintenance is a technique of gathering data through tools to “predict” possible defects in a device or piece of equipment before it fails.
  2. A driver monitoring system can offer a large number of data points to analyze the driver’s actions and send alerts in case he/she is drowsy. With driver behavioural analysis, the driver monitoring system software can detect rash driving, distracted driving, driver drowsiness and even driver emotions.
  3. Road condition analytics and navigation powered by AI can detect the condition of roads in real time. This keeps the driver updated on accidents, road closures, speed limits, construction works etc. on his/her route. Empowered with this data, the driver can choose to take an alternate route to reach the destination.
  4. The digital cockpit in a car offers a luxurious and personalized experience to the driver and passengers. It also facilitates seamless connectivity between the car and external devices. Driver safety and assistance are considered to be of the highest priority in a digital cockpit solution.
  5. Advanced driver assistance systems (ADAS) have found a new position in the market. Companies aim to reduce the number of road traffic accidents by deploying AI to the edge to create a safer driving experience. By using radar/lidar/camera sensors on powerful edge compute systems, more frames per second of video can be analyzed with fewer redundancies.
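
To make the predictive maintenance idea in item 1 a bit more concrete, here is a minimal sketch that flags sensor readings which drift far from their recent history, using a rolling z-score. The function name, window size, threshold and vibration values are all assumptions made up for illustration; a real system would more likely rely on a trained model and actual vehicle telemetry.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from the preceding window."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip perfectly flat histories to avoid dividing by zero.
        if sigma == 0:
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical vibration amplitudes from a single wheel-bearing sensor.
vibration = [0.50, 0.52, 0.49, 0.51, 0.50, 0.51, 0.95, 0.50]
anomalies = flag_anomalies(vibration)
```

With this made-up data only the spike at index 6 is flagged; the reading that follows it is not, because the spike itself inflates the spread of the next window.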

Lastly, the value chain in the automotive industry can be broadly described by the following subprocesses:

  1. Development
  2. Procurement
  3. Logistics
  4. Production
  5. Marketing
  6. Sales, after-sales and retail
  7. Connected Customer

Development

Vehicle development has become a largely virtual process that is now the accepted state of the art for all manufacturers. CAD models and simulations (typically of physical processes, such as mechanics, flow, acoustics, vibration, etc. on the basis of finite element models) are used extensively in all stages of the development process.

Procurement

The procurement process uses a wide variety of data concerning suppliers, purchase prices, discounts, delivery reliability, hourly rates, raw material specifications and other variables.

Logistics

In the field of logistics, a distinction can be made between procurement logistics, production logistics, distribution logistics and spare parts logistics.

Production

Every sub-step of the production process will benefit from the consistent use of data mining. It is therefore essential for all manufacturing process parameters to be continuously recorded and stored.

Marketing

The focus in marketing is to reach the end customer as efficiently as possible and to convince people either to become customers of the company or to remain customers.

Sales, after-sales and retail

The diversity of potential and existing applications in this area is significant. Since the “human factor”, embodied by the end customer, plays a crucial role in this context, it is not only necessary to take into account objective data such as sales figures, individual price discounts and dealer campaigns; subjective customer data, such as customer satisfaction analyses based on surveys or third-party market studies covering subjects like brand image, breakdown rates and brand loyalty, may also be required.

Connected Customer

While this term is not yet established, it describes a future in which both the customer and their vehicle are fully integrated with state-of-the-art information technology. This aspect is closely linked to marketing and sales issues, such as customer loyalty, personalized user interfaces, vehicle behaviour in general, and other visionary aspects.

Vision

Vehicle development already makes use of “modular systems” that allow components to be used across multiple model series. At the same time, development cycles are becoming ever shorter. Nevertheless, the field of virtual vehicle development has not yet seen any effective attempts to use machine learning methods to facilitate automatic learning, that is, learning that extracts both knowledge built upon historical knowledge and knowledge that applies to more than one model series. Such learning would assist with future development projects and help organize them more efficiently.

AI and data science in the automotive industry, as in other business niches, require a level of precision that is important to capture up front:

  1. Describe the problem to be solved.
  2. Specify all the business questions as precisely as possible.
  3. Determine any other business requirements, such as not losing a customer while increasing cross-sell opportunities.
  4. Specify the expected benefits in business terms, such as reducing churn among high-value customers by X percent.

After the business problem is clarified, you need to translate it into data analysis goals and activities, for example:

  1. Identify high-value customers based on recent purchase data.
  2. Build a model by using available customer data to predict the likelihood of churn for each customer.
  3. Rank customers based on churn propensity and customer value.
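
The three activities above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Customer` fields, the value threshold and the sample figures are invented, and the churn probabilities are assumed to come from a model that has already been trained. Ranking by value multiplied by churn probability is one possible way to combine the two criteria, not the only one.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    annual_value: float        # e.g. yearly service and parts revenue
    churn_probability: float   # model output between 0 and 1


def rank_by_value_at_risk(customers, high_value_threshold=1000.0):
    """Keep high-value customers and sort them by expected revenue at risk."""
    high_value = [c for c in customers if c.annual_value >= high_value_threshold]
    return sorted(
        high_value,
        key=lambda c: c.annual_value * c.churn_probability,
        reverse=True,
    )


# Hypothetical customers with made-up values and churn scores.
customers = [
    Customer("A", 2500.0, 0.10),
    Customer("B", 1200.0, 0.60),
    Customer("C", 400.0, 0.90),
    Customer("D", 3000.0, 0.30),
]
ranking = [c.customer_id for c in rank_by_value_at_risk(customers)]
```

Customer C has the highest churn probability but falls below the value threshold, so it is left out of the ranking entirely.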

To keep the analysis on track, define success in technical terms:

  1. Describe the criteria for model assessment, such as accuracy and performance.
  2. Define a benchmark for evaluating success, being sure to provide specific numbers.
  3. Define subjective measurements as best as you can and determine the arbiter of success.
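
As a small illustration of items 1 and 2, the sketch below computes accuracy, precision and recall for a binary classifier and checks them against a numeric benchmark. The benchmark values and the labels are hypothetical; the right metrics and targets depend on the business problem.

```python
def evaluate(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def meets_benchmark(metrics, benchmark):
    """True only if every benchmarked metric reaches its target."""
    return all(metrics[name] >= target for name, target in benchmark.items())


# Hypothetical targets agreed with the business up front.
BENCHMARK = {"accuracy": 0.75, "recall": 0.70}

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
passed = meets_benchmark(metrics, BENCHMARK)
```

Writing the benchmark down as data, rather than judging results by eye, is what makes "success" checkable after every retraining.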

After you have translated the business problem into an AI and data science solution and understood the data needed to support it, it is time to prepare the data. The data must be in a format that can be used for model development, measurement, and training an ML model.

Generalized Data Preparation Phase

The data preparation phase mostly includes one or more of the following key steps, depending on the particular use case.

  1. Select a sample subset of data.
  2. Merge data sets or records.
  3. Derive new attributes.
  4. Evaluate the skewness in data.
  5. Format and sort the data for modeling.
  6. Remove, replace or impute blank or missing values.
  7. Replace or correct data and measurement errors.
  8. Normalize the data.
  9. Scale the data.
  10. Prepare categorical data.
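
A few of these steps, namely imputing missing values, scaling and categorical data preparation, can be sketched in plain Python. The helper names and the sample columns are assumptions for illustration only; in practice a library such as pandas or scikit-learn would usually handle this.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = sum(known) / len(known)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each category as a 0/1 vector, with categories in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical vehicle records with one missing mileage value.
mileage_km = [12000, None, 48000, 30000]
fuel_type = ["petrol", "diesel", "electric", "petrol"]

mileage_filled = impute_mean(mileage_km)
mileage_scaled = min_max_scale(mileage_filled)
fuel_encoded = one_hot(fuel_type)
```

Mean imputation is only one of several options; dropping the record or using the median may suit skewed data better.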

You have defined business goals and spent hours digging through data. It is time to create a model. Fortunately, selecting a model isn’t as challenging as it sounds. This phase mostly includes one or more of the following key steps, depending on the particular use case.

  1. Feature Extraction/Selection
  2. Association or Sequential Analysis
  3. Clustering
  4. Classification
  5. Natural Language Processing
  6. Modeling based on Markov Decision Processes
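
To give a feel for one of these techniques, here is a minimal sketch of clustering (item 3) using k-means on one-dimensional data. The starting centroids and the trip lengths are invented for illustration, and a production model would normally come from an ML library rather than hand-written code.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical average trip lengths in km for eight drivers.
trip_km = [4, 5, 6, 5, 40, 42, 38, 41]
centroids, clusters = kmeans_1d(trip_km, centroids=[0.0, 50.0])
```

On this toy data the drivers split into a short-trip group and a long-trip group, which is the kind of segmentation that could feed later classification or marketing steps.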

Typically, the challenge isn’t which technique to use, but how to train and optimize it so that it is efficient in production.

Generalized Modelling Directions

When you are ready to move your models from research to production, one option is to use TFX to create and manage a production pipeline, preferably on Google Cloud.

TFX includes both libraries and pipeline components. The diagram below illustrates the relationship between TFX libraries and pipeline components.

TFX pipeline components

TFX libraries include:

  1. TensorFlow Data Validation (for analyzing and validating data)
  2. TensorFlow Transform (for preprocessing data)
  3. TensorFlow (for training models)
  4. TensorFlow Model Analysis (for evaluating TensorFlow models)
  5. TensorFlow Metadata (provides standard representations of metadata)
  6. ML Metadata (for recording and retrieving metadata)
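
The idea behind the data validation step can be shown without any TensorFlow dependency. The sketch below is not the TensorFlow Data Validation API; it is a plain-Python stand-in, with hypothetical function names and sample rows, that infers a simple schema from training data and then reports serving rows that violate it.

```python
def infer_schema(rows):
    """Record, per column, either the numeric range or the set of categories seen."""
    schema = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            schema[column] = {"type": "numeric", "min": min(values), "max": max(values)}
        else:
            schema[column] = {"type": "categorical", "domain": set(values)}
    return schema


def find_anomalies(rows, schema):
    """Return (row index, column, reason) for every value that breaks the schema."""
    anomalies = []
    for index, row in enumerate(rows):
        for column, spec in schema.items():
            if column not in row:
                anomalies.append((index, column, "missing"))
                continue
            value = row[column]
            if spec["type"] == "numeric":
                in_range = isinstance(value, (int, float)) and spec["min"] <= value <= spec["max"]
                if not in_range:
                    anomalies.append((index, column, "out of range"))
            elif value not in spec["domain"]:
                anomalies.append((index, column, "unexpected value"))
    return anomalies


# Hypothetical telemetry rows: training data defines what "normal" looks like.
training = [
    {"speed_kmh": 40, "gear": "D"},
    {"speed_kmh": 120, "gear": "D"},
    {"speed_kmh": 0, "gear": "P"},
]
serving = [
    {"speed_kmh": 80, "gear": "D"},
    {"speed_kmh": 310, "gear": "R"},
    {"gear": "P"},
]
schema = infer_schema(training)
anomalies = find_anomalies(serving, schema)
```

A real validation library works on dataset statistics at scale rather than row by row, but the principle is the same: learn expectations from trusted data, then compare new data against them.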

Supporting Technologies with TFX

  1. Apache Beam: an open source, unified model for defining both batch and streaming data-parallel processing pipelines.
  2. Apache Airflow: a platform to programmatically author, schedule and monitor workflows.
  3. Kubeflow: a toolkit dedicated to making deployments of machine learning workflows on Kubernetes simple, portable and scalable.

Once you have developed and trained a model that you’re happy with, it is now time to deploy it to one or more deployment target(s) where it will receive inference requests.

CI/CD Pipeline to Your Production Environment

Another integral part is the set of top-level components for continuous integration, delivery and deployment of your code. Here the best choice for efficient and bundled integration would be GitLab.

A CI/CD pipeline primarily comprises:

  1. Jobs, which define what to do. For example, jobs that compile or test code.
  2. Stages, which define when to run the jobs. For example, stages that run tests after stages that compile the code.
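
The relationship between jobs and stages can be modelled in a few lines of Python. This is a toy simulation, not GitLab's runner: the job names and scripts are hypothetical, and jobs inside a stage run one after another here, whereas GitLab can run them in parallel.

```python
def run_pipeline(stages, jobs):
    """Run jobs stage by stage; later stages are skipped if any job fails."""
    log = []
    for stage in stages:
        stage_jobs = [job for job in jobs if job["stage"] == stage]
        results = [(job["name"], job["script"]()) for job in stage_jobs]
        log.extend(results)
        if not all(ok for _, ok in results):
            break  # a failed stage stops the pipeline
    return log


stages = ["build", "test", "deploy"]
jobs = [
    {"name": "compile", "stage": "build", "script": lambda: True},
    {"name": "unit-tests", "stage": "test", "script": lambda: True},
    {"name": "lint", "stage": "test", "script": lambda: False},  # simulated failure
    {"name": "release", "stage": "deploy", "script": lambda: True},
]
log = run_pipeline(stages, jobs)
```

Because the simulated `lint` job fails in the test stage, the `release` job in the deploy stage never runs.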

CI/CD pipelines can be configured in many different ways:

  1. Basic pipeline
  2. Directed Acyclic Graph (DAG) pipelines
  3. Multi-project pipelines
  4. Parent-Child pipelines
  5. Pipelines for Merge Requests
  6. Pipelines for Merged Results
  7. Merge Trains
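
The DAG variant (item 2) orders jobs by their dependencies rather than strictly by stage. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to group hypothetical jobs into waves that could start together. The job names are made up, and the wave grouping is a simplification: in a real DAG pipeline a job can start as soon as its own dependencies finish.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs that must finish before it can start.
needs = {
    "build-model": set(),
    "build-api": set(),
    "test-model": {"build-model"},
    "test-api": {"build-api"},
    "deploy": {"test-model", "test-api"},
}

sorter = TopologicalSorter(needs)
sorter.prepare()

waves = []
while sorter.is_active():
    # Jobs whose dependencies are all done; sorted for a stable order.
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)
```

Here `test-model` depends only on `build-model`, so a slow `build-api` job would not have to hold it up, which is the main appeal of DAG pipelines over purely stage-based ones.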

The basic steps involved after pushing the code to a GitLab repository are given below:

GitLab also provides API endpoints to:

  1. Perform basic functions. For more information, see Pipelines API.
  2. Maintain pipeline schedules. For more information, see Pipeline Schedules.
  3. Trigger pipeline runs. For more information, see Triggering pipelines.
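
As an example of item 3, the sketch below builds (but does not send) a request to GitLab's pipeline trigger endpoint using only the standard library. The host name, project ID, token and branch are placeholders, and the endpoint path and parameters are assumed from GitLab's v4 REST API; check the documentation for your GitLab version before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trigger_request(base_url, project_id, trigger_token, ref):
    """Build a POST request that asks GitLab to run a pipeline on the given ref."""
    url = f"{base_url}/api/v4/projects/{project_id}/trigger/pipeline"
    body = urlencode({"token": trigger_token, "ref": ref}).encode("utf-8")
    return Request(url, data=body, method="POST")


# Placeholder values; replace them with your own instance, project and token.
request = build_trigger_request(
    "https://gitlab.example.com", 123, "TRIGGER_TOKEN_PLACEHOLDER", "main"
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from sending makes it easy to inspect or test the call without touching a real GitLab instance.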

In closing, I would say that data science and AI will increasingly drive the automotive industry in the future. The use of IoT, data science and 5G in the automotive industry will bring about better production line performance, shorter manufacturing cycle times, improved quality of the vehicles produced, and greater savings through increased operational efficiency, reduced waste and lower labor costs.
