This tutorial will guide you through the process of building a machine learning model with SensiML for the AVR® Curiosity Nano. In this example, we will develop a classification model that can indicate the operational state of a fan including whether the fan is on, the speed setting, and whether the fan is experiencing a fault condition (tapping, shaking, or unknown). In addition to a step-by-step walkthrough on the usage of the ML development tools, this tutorial will provide high-level details on the data collection and model development process that can be applied to your application.
A fully developed fan condition monitoring project including dataset, pre-trained model, and firmware source code is provided with this guide to help you get your machine learning project up and running quickly.
This tutorial is an abridged version of the SAMD21 fan condition monitoring tutorial. Visit the Fan Condition Monitoring with SensiML page for a more detailed explanation of the application design and analysis.
- A fan of your choosing. For this example, we'll be using the Honeywell HT-900 table fan as shown in Figure 5.
- Standard mounting putty such as the Loctite Fun-Tak as shown in Figure 6.
- MPLAB® X IDE
- MPLAB® Code Configurator
- SensiML Analytics Studio and Data Capture Lab (Premium account)
- The firmware and MPLAB X project files can be found in the GitHub repository that accompanies this demo.
- The dataset used in this tutorial can be downloaded from the latest GitHub release.
- IMU data collection firmware and usage information for the Curiosity Nano can be downloaded from the AVR DA Curiosity Nano Data Logger repository.
Before we get started, you'll need to install and set up the required software as detailed in the steps below.
Install the MPLAB X IDE and XC8 compiler. These are required to load the demo project and to program the AVR DA board. You can use the default, free license for the XC8 compiler as you will not need any of the pro functionality here.
Sign up for a premium edition account with SensiML if you have not already. We'll use this to process our sensor data and generate the fan condition classifier library. Since the Curiosity Nano does not currently have native support in the SensiML Analytics Studio, the premium edition is required to enable access to the SensiML source code.
Download the SensiML Data Capture Lab from the SensiML Downloads page and install it. We'll use this to capture and label data for our SensiML project.
In this example, we're targeting a predictive maintenance application. The question we are trying to answer is whether we can use machine learning to monitor and predict machine failure, thereby reducing maintenance costs and increasing uptime. To demonstrate how you can approach this type of application, we'll develop a classifier model using the SensiML Analytics Toolkit that can recognize the state of a Honeywell HT-900 fan. The model is deployed on an AVR DA Curiosity Nano mounted to the fan housing and can distinguish between the three speed modes of the HT-900 fan as well as three abnormal states: tapping, shaking, and unknown. The example application setup is pictured in Figure 7.
Data Collection Overview
Now let's cover how we should collect the data samples that will be used to develop the fan state classifier model.
The first step in the data collection process is to determine an appropriate sensor configuration for your application; this includes the geometric placement of the sensor, the installation method, and the signal processing parameters like sample rate and sensitivity.
A more detailed analysis of the project setup and sensor configuration can be found on the SAMD21 version of this tutorial.
To affix the AVR DA board to the fan, a standard mounting putty was used; this is the same type as is used to mount posters and other lightweight items to a wall.
In terms of placement, the board was installed in its natural orientation (i.e., the accelerometer should nominally read X = 0 g, Y = 0 g, Z = 1 g) with the bottom of the Curiosity Nano Base board attached to the topmost area of the housing. This placement was chosen only because it is the easiest way to install the board.
Sensor Sampling Configuration
The sensor sampling parameters are summarized below:
- Sensor: 3-axis Accelerometer + 3-axis Gyrometer
- Sample Rate / Frequency Range: 100 Hz (~40 Hz 3 dB cutoff)
- Accelerometer Full Scale Range: ±2 g (most sensitive setting)
- Gyrometer Full Scale Range: ±125 dps (most sensitive setting)
Keep in mind that the configuration above was derived from analysis on a specific fan, so it may not be optimal for different setups.
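To make the full-scale settings above concrete, here is a minimal sketch of converting raw IMU counts into physical units. It assumes the IMU reports each axis as a signed 16-bit value, which is common but not stated in this guide; the function and macro names are illustrative and are not part of the demo firmware.

```c
#include <stdint.h>

/* Assumption: each axis is a signed 16-bit count, so full scale
   corresponds to 32768 counts. Check your IMU's datasheet. */
#define COUNTS_FULL_SCALE 32768.0f
#define ACCEL_RANGE_G     2.0f    /* +/-2 g, from the configuration above */
#define GYRO_RANGE_DPS    125.0f  /* +/-125 dps, from the configuration above */

/* Convert a raw accelerometer count to g. */
float accel_counts_to_g(int16_t raw)
{
    return (float)raw * (ACCEL_RANGE_G / COUNTS_FULL_SCALE);
}

/* Convert a raw gyro count to degrees per second. */
float gyro_counts_to_dps(int16_t raw)
{
    return (float)raw * (GYRO_RANGE_DPS / COUNTS_FULL_SCALE);
}
```

Under that 16-bit assumption, a board resting in its natural orientation would report roughly 16384 counts on the Z axis, which is the nominal 1 g reading.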
Data Collection Protocol
The next step in the data collection process is putting together a protocol to use when collecting your data. This includes deciding how many samples to collect, what metadata parameters to collect, and other parameters that determine the procedure by which data is collected.
Data Collection: Metadata
Let's cover metadata first as this determines how we contextualize our data. The metadata variables determined for this example application are summarized in the table below.
|Variable|Description|
|---|---|
|Fan ID|Tag for the make, model, and serial number identifier of the fan being used.|
|Environment ID|Tag for the specific environment that the data was captured in.|
|Mount ID|Tag for the installation instance of the sensor.|
|Collection ID|Tag for the data collection effort where multiple samples were captured.|
Data Collection: Sampling Methodology
At this point, we need to decide how to sample data for our application; this includes choosing how many samples to capture and defining what steps are needed to take the measurements. The methodology for this example application is summarized in the steps below:
- Record the metadata values for this data collection in a log. Alternatively, these can be stored directly as metadata variables in SensiML Data Capture Lab.
- Record and label one 30-second segment for each of the following states: fan off, speed 1, speed 2, and speed 3.
- Record tapping on the fan housing in the same spot for at most 15 seconds at a time; repeat this until you have 30 seconds of tapping data.
- With the fan speed set at 1 (the slowest setting), record sensor data as you gently shake the table by grabbing either the tabletop or one of the table legs and gently rocking back and forth for at most 15 seconds; repeat until you have 30 seconds of labeled shaking data.
A single run of this process should generate enough data (and enough variation) to create a simple machine learning model that should work well under the constrained conditions of your experiment. To develop a more generalized model, you might perform the above collection with several different fan types or by introducing vibration interference that is realistic for your application.
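As a rough sizing check, the protocol above can be turned into sample counts. This sketch assumes the 100 Hz sample rate from the sensor configuration, the 100-sample inference window used later in this guide, and non-overlapping windows; the function names are illustrative.

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ       100u  /* from the sensor configuration */
#define SECONDS_PER_CLASS    30u   /* from the protocol above */
#define NUM_RECORDED_CLASSES 6u    /* off, speed 1-3, tapping, shaking */
#define WINDOW_SIZE_SAMPLES  100u  /* inference window used later in this guide */

/* Samples captured per axis for one class. */
uint32_t samples_per_class(void)
{
    return (uint32_t)SAMPLE_RATE_HZ * SECONDS_PER_CLASS;
}

/* Samples captured per axis across one full run of the protocol. */
uint32_t total_samples(void)
{
    return samples_per_class() * NUM_RECORDED_CLASSES;
}

/* Non-overlapping windows available per class. */
uint32_t windows_per_class(void)
{
    return samples_per_class() / WINDOW_SIZE_SAMPLES;
}
```

One run therefore yields 3,000 samples per axis per class, or about 30 non-overlapping windows per class, which is a small dataset by machine learning standards.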
Data Collection: Data Capture Tools
To record and label IMU data from the evaluation kit for use in the SensiML Analytics Studio, it is simplest to stream data to the SensiML Data Capture Lab directly. Follow the steps below to connect the AVR DA board directly to Data Capture Lab:
Head to the AVR DA Curiosity Nano Data Logger repository to download the data collection firmware.
Follow the "How to Configure, Compile, and Flash" instructions to compile the data streaming firmware for your desired sensor configuration. Note you must use the "SensiML Simple Stream" data stream format to enable direct streaming to Data Capture Lab.
Once you've flashed the data collection firmware, open up Data Capture Lab and create a new project.
Follow the "Usage with the SensiML Data Capture Lab" guide to connect the AVR DA board to Data Capture Lab.
Alternatively, you can use the integrated MPLAB Machine Learning and MPLAB Data Visualizer plugins to collect data and export it to your SensiML project. Refer to the "Using the ML Plugin with SensiML" guide for more information on that process.
For more details on recording and labeling with the Data Capture Lab, please visit the SensiML documentation page.
At this point, you should have an initial dataset collected for your application. Let's now move into the Analytics Studio to generate our classifier model.
If you don't have your own data yet but still want to evaluate the model development process, download the dataset that was used in developing this guide from the releases page and import it into your SensiML project to follow along.
Open up the Analytics Studio in your web browser and log in.
Navigate over to the Prepare Data tab to create the query that will be used to train your machine learning model. Fill out the fields as shown in Figure 9; these query parameters will select only the samples in the training fold.
The SensiML Query determines what data from our dataset will be selected for training. We can use this to exclude test data from our training process.
Switch over to the Build Model tab to start developing the machine learning model pipeline.
Enable the Advanced Pipeline Settings and add a Segment Transform block right after the Windowing block. Set the Transform to Strip and the Type to mean.
The Strip Mean Transform ensures that the model removes the bias in the IMU readings, which can vary among individual sensors and installation instances.
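Conceptually, stripping the mean amounts to subtracting each axis's average over the window from every sample in that window. The sketch below illustrates the idea for a single axis; it is not the code SensiML generates, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Subtract the window mean from every sample of one axis.
   Results go to a 32-bit buffer so the difference cannot overflow. */
void strip_mean(const int16_t *in, int32_t *out, size_t n)
{
    if (n == 0) {
        return;
    }

    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += in[i];
    }

    /* Integer mean; truncates toward zero. */
    int32_t mean = sum / (int32_t)n;

    for (size_t i = 0; i < n; i++) {
        out[i] = (int32_t)in[i] - mean;
    }
}
```

For example, a Z-axis window of {16380, 16384, 16388} becomes {-4, 0, 4}: the constant gravity offset is removed and only the vibration component remains.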
Edit the Validation block and set Validation Method to Stratified K-Fold Cross-Validation and Number of Folds to 3.
The Number of Folds used for validation has been reduced to 3 here since the example dataset is small and we want to ensure that each fold has enough data to provide an accurate estimate of model performance.
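To illustrate what "stratified" means here, the sketch below assigns labeled segments to folds so that each class is spread as evenly as possible across them. It is a simplified illustration, not SensiML's implementation: real tools typically shuffle the data first, and the class count and function name are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_CLASSES 7  /* hypothetical: class IDs 0..6 */

/* Give each segment a fold index in [0, k) by dealing out the members
   of each class round-robin, so every fold sees every class. */
void assign_stratified_folds(const uint8_t *labels, uint8_t *folds,
                             size_t n, uint8_t k)
{
    size_t seen[NUM_CLASSES] = {0};

    if (k == 0) {
        return;
    }

    for (size_t i = 0; i < n; i++) {
        uint8_t c = labels[i];
        if (c >= NUM_CLASSES) {
            folds[i] = 0;  /* out-of-range label: park it in fold 0 */
            continue;
        }
        folds[i] = (uint8_t)(seen[c] % k);
        seen[c]++;
    }
}
```

With 3 folds and roughly 30 windows per class, each fold holds about 10 windows of every class; a larger fold count would leave very few examples of each class per fold.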
Edit the AutoML Parameters block and set the Prediction Target to f1-score and toggle the Anomaly Detection switch to On.
We choose the f1-score to account for the class imbalance in the example dataset. We also enable the Anomaly Detection functionality so that an unknown classification (classID=0) is returned whenever the input is far outside any of the known classes; when this is disabled, the input will instead be classified as the closest matching class.
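The difference between the two Anomaly Detection settings can be sketched with a generic distance-threshold classifier. This is only an illustration of the behavior described above, not the algorithm that AutoML selects; the feature count, region structure, and function name are all hypothetical.

```c
#include <stddef.h>

#define NUM_FEATURES  2  /* hypothetical feature count */
#define CLASS_UNKNOWN 0  /* classID=0 is the unknown class */

/* A known class, modeled as a center point and a squared radius. */
typedef struct {
    int   class_id;
    float center[NUM_FEATURES];
    float max_dist_sq;
} class_region_t;

/* Return the closest class. With anomaly detection on, return
   CLASS_UNKNOWN when the input lies outside the closest region. */
int classify(const float *features, const class_region_t *regions,
             size_t num_regions, int anomaly_detection)
{
    int   best_class = CLASS_UNKNOWN;
    float best_dist = 0.0f;
    int   best_inside = 0;

    for (size_t r = 0; r < num_regions; r++) {
        float dist = 0.0f;
        for (size_t f = 0; f < NUM_FEATURES; f++) {
            float diff = features[f] - regions[r].center[f];
            dist += diff * diff;
        }
        if (r == 0 || dist < best_dist) {
            best_dist = dist;
            best_class = regions[r].class_id;
            best_inside = (dist <= regions[r].max_dist_sq);
        }
    }

    if (anomaly_detection && !best_inside) {
        return CLASS_UNKNOWN;
    }
    return best_class;
}
```

An input far from every region is reported as class 0 when the switch is on, and as the nearest class when it is off.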
Once you've entered the pipeline settings, click the Optimize button. This step will use AutoML techniques to automatically select the best features and machine learning algorithms for the classification task given your input data. This process will usually take several minutes.
When the process is completed, take a moment to explore the models that were generated and verify they have good performance. Note that the rank 0 model is usually the best compromise among all the generated candidate models.
If the candidate models have poor accuracy or seem unnecessarily complex, chances are you need to go back and check your dataset. You might consider starting with a smaller dataset or even limiting the number of classes during initial development to discover where your model is failing.
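When comparing candidate models, it can help to recall how the f1-score is derived from a class's confusion-matrix counts. The sketch below uses the standard definition, F1 = 2·TP / (2·TP + FP + FN); the function name is illustrative.

```c
#include <stdint.h>

/* Per-class F1 from true positives, false positives, and false negatives.
   Returns 0 when the score is undefined (no positives at all). */
float f1_score(uint32_t tp, uint32_t fp, uint32_t fn)
{
    uint32_t denom = 2u * tp + fp + fn;
    if (denom == 0u) {
        return 0.0f;
    }
    return (2.0f * (float)tp) / (float)denom;
}
```

Because true negatives do not appear in the formula, a small class with 3 true positives, 1 false positive, and 1 false negative scores 0.75 no matter how many samples of the larger classes were classified correctly. This is why the metric is less skewed by class imbalance than plain accuracy.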
Once the Build Model optimization step is completed, navigate to the Download Model tab. Fill out the Knowledge Pack settings using the Pipeline, Model, and Data Source you created in the previous steps, and select Source as the output format (see Figure 15 for reference). Click the Download button to deploy your model.
The Source format, available to premium SensiML subscribers, will generate the collection of C source code required for your model. Note that although we select the SAMD21 ML Eval Kit as the hardware platform, the resulting source code should be compatible with any Microchip platform.
You now should have the source code for your machine learning model that you can integrate into your MPLAB X project.
Knowledge Pack Integration
Let's take our SensiML deployment (i.e., knowledge pack) and integrate it into an existing MPLAB X project using the fan condition monitoring demo project as a template.
Use the MPLAB X project that accompanies this guide as a starting point for your project.
Download or clone the demo source code from the GitHub repository.
Unzip the contents of the SensiML knowledge pack (the ZIP archive downloaded in the previous section) into the root folder where your MPLAB X project is located; this should overwrite the existing knowledgepack folder.
Open up the avrda-cnano-sensiml-fan-condition-demo.X project in the MPLAB X IDE.
Follow the instructions provided in the sensiml-template repository to add your knowledge pack's source files to your project.
Now open the main.c file under Source Files.
Scroll down inside the main while loop until you reach the section shown in Figure 17, which begins with a call to ringbuffer_get_read_buffer. This is the essence of the inferencing code: the code calls into the SensiML knowledge pack via the sml_recognition_run function for every sample we get from the IMU.
If you're creating an application with different classes, make modifications to the LED code here to reflect your class mapping.
The sml_recognition_run function is the main entry point into the SensiML SDK; it internally buffers the samples we give it and makes an inference when it has enough data. For the project in this guide, an inference is made every 100 samples, which is once per second at the 100 Hz sample rate; this corresponds to the Window Size parameter we defined in the Query step of the model development in Analytics Studio. Note that sml_recognition_run returns a negative integer until it has enough data to make a prediction.
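The return-value handling described above can be sketched as follows. To keep the example self-contained, it uses a hypothetical stand-in, fake_recognition_run, that mimics the described behavior (negative until 100 samples are buffered); it is not the real sml_recognition_run, whose exact signature should be taken from the knowledge pack headers. The class map is also hypothetical, apart from class 0 being unknown.

```c
#include <stdint.h>

#define WINDOW_SIZE 100  /* samples per inference, per this guide */
#define NUM_AXES    6    /* 3-axis accelerometer + 3-axis gyro */

/* Hypothetical stand-in for the knowledge pack entry point: returns -1
   while buffering, then a class ID once a full window has arrived. */
static int window_fill = 0;

int fake_recognition_run(const int16_t *sample, int num_axes)
{
    (void)sample;
    (void)num_axes;
    if (++window_fill < WINDOW_SIZE) {
        return -1;
    }
    window_fill = 0;
    return 1;  /* pretend the model recognized class 1 */
}

/* Hypothetical class map; only 0 = unknown comes from this guide.
   Replace the rest with the class map of your own model. */
const char *class_name(int class_id)
{
    switch (class_id) {
    case 0:  return "unknown";
    case 1:  return "fan off";
    case 2:  return "speed 1";
    default: return "unmapped";
    }
}

/* Feed one IMU sample; returns 1 when a new classification is ready. */
int handle_sample(const int16_t *sample, int *last_class)
{
    int ret = fake_recognition_run(sample, NUM_AXES);
    if (ret < 0) {
        return 0;  /* still buffering, nothing to report yet */
    }
    *last_class = ret;
    return 1;
}
```

In the demo firmware, the point where handle_sample returns 1 corresponds to where the LED code would be updated for the newly recognized class.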
Fan Condition Monitor Firmware Overview
For a description of the demo firmware included with this project, including operation, usage, and benchmarks, see the README file in the GitHub repository.
That's it! You now have a basic understanding of how to develop a fan condition monitoring application with SensiML and the AVR DA Curiosity Nano.
For an in-depth guide on the data-driven design process, see SensiML's "Building Smart IoT Devices with AutoML" whitepaper.
To learn more about the SensiML Analytics Toolkit, including tutorials for other machine learning applications, go to the SensiML "Getting Started" page.