The insideBIGDATA IMPACT 50 List for Q1 2021

insideBIGDATA 05 January 2021

The team here at insideBIGDATA is deeply entrenched in following the big data ecosystem of companies from around the globe. We’re in close contact with most of the firms making waves in the technology areas of big data, data science, machine learning, AI and deep learning. Our in-box is filled each day with new announcements, commentaries, and insights about what’s driving the success of our industry so we’re in a unique position to publish our quarterly IMPACT 50 List of the most important movers and shakers in our industry. These companies have proven their relevance by the way they’re impacting the enterprise through leading edge products and services. We’re happy to publish this evolving list of the industry’s most impactful companies!

Read the full article here:


Big Data Industry Predictions for 2021

InsideBigData 29 December 2020

2020 has been year for the ages, with so many domestic and global challenges. But the big data industry has significant inertia moving into 2021. In order to give our valued readers a pulse on important new trends leading into next year, we here at insideBIGDATA heard from all our friends across the vendor ecosystem to get their insights, reflections and predictions for what may be coming. We were very encouraged to hear such exciting perspectives. Even if only half actually come true, Big Data in the next year is destined to be quite an exciting ride. Enjoy!

Read the full article here:


LETU Machine Learning Contest Video

KLTV 12 November 2020

Click HERE to view the video

Full URL to video source:


Can a piece of drywall be smart? Bringing machine learning to everyday objects with TinyML

Diginomica 11 November 2020

So-called smart devices like Amazon Echo and Google Nest made early headway into our homes. But will devices as small as a vibration sensor soon outsmart an Echo? Here’s a look under the hood of “TinyML.”

Read the full article at:


LeTourneau University students design artificial intelligence projects for contest

Longview News-Journal 09 November 2020

Since mid-September, 10 teams of LeTourneau University engineering students have been working on projects involving artificial intelligence to enter into a contest. Winners of that contest were announced Friday in the lobby of the Glaske Engineering Center after demonstrations from students.

The teams were challenged by contest sponsors Qeexo, the maker of the machine learning platform AutoML, and Arduino, an open-source electronics hardware and software platform, to provide solutions for real world problems using embedded machine learning. Students’ projects include a device that monitors hand movements to allow it to be almost unbeatable in a game of rock, paper, scissors to a another device that helps fly fishermen perfect their cast.

Link to the full article:


Live Classification Analysis

Sidharth Gulati, Dr. Rajen Bhatt 03 November 2020

Qeexo AutoML enables machine learning application developers to do analysis of different performance metrics for their use-cases and equip them to make decisions regarding ML models like tweaking some training parameters, adding more data etc. based on those real-time test data metrics. In this article, we will discuss in detail regarding live classification analysis module.

Figure 1: Live Classification Analysis

Once the user clicks on Live Classification Analysis for a particular model, they will be directed to the Live Classification Analysis module that would resemble below screenshot.

Figure 2: Live Classification Analysis

In this module we won’t be discussing Sensitivity analysis. To refer to details regarding sensitivity analysis, please read this blog.

Figure 3: Confusion Matrix

For the purpose of this blog, we will use a use-case which aims to classify a few musical air gestures: Drums, Violin and Background. These datasets can be found here.

Live Data Collection

Qeexo AutoML supports live data collection module which can be used to collect data to do analysis on. Data

collection requires a Data collection library to be pushed to the respective hardware. A user can push the library by clicking the “Push To Hardware” button shown below.

Figure 4: Push to Hardware Screen

Once, they click the button and the library flashing is successful, the user will be able to record the data for trained classes in the model for analysis purpose. The user can select any number of seconds of data to do the analysis on. For this particular use-case, we have 3 Classes: Drums, Background and Violin as shown below.

Figure 5: Data Recording Input Screen

Once the user clicks “Record”, they will be redirected to Data Collection page as shown below. This module is same as the Data Collection module which is used to collect training data.

Figure 6: Recording Screen

As the user collects data for respective classes, they will be able to able to see the data in tabular format shown below. They can see the dataset information, delete data and re-record based on their preference.

Figure 7: Dataset Collection

Once, the user has collected the data, they can select whichever data they want to do analysis on by selecting the checkbox as shown above. Once, the user has selected atleast 1 dataset, they will see the Analyze button is activated and as we say, with Qeexo AutoML, “a click is all you need to do Machine Learning”, they will be able to analyze different performance metrics!

Figure 8: Analyze

Performance Metrics

Qeexo AutoML supports 5 different types of performance metrics listed below:

  1. Confusion Matrix: Represents True Labels and Predicted Labels in square matrix. Diagonal (upper left to lower right) elements indicates instances correctly classified. Off-diagonal elements indicate instances mis-classified. Summing instances over each row should sum to total instances for the respective class.
  2. F-1 Score: Measures the 1st harmonic mean of Precision and Recall. Computed as 2 * (Precision * Recall)/(Precision + Recall). Precision measures out of all the samples detected of a given class, how many are relevant. Recall measures out of all the relevant samples of a given class, how many are detected.
  3. Matthews Correlation Coefficient: Measure of discriminative power for binary classifiers. In the multi-class classification case, it quantifies which combinations of classes are the least distinguished by the model. The values can range between -1 and 1, although most often in AutoML the values will be between 0 and 1. A value of 0 means that the model is not able to distinguish between the given pair of classes at all, and a value of 1 means that the model can perfectly make this distinction.
  4. ROC Curve: Plots the False Positive Rate (FPR, x-axis) vs. True Positive Rate (TPR, y-axis) for each class in the classification problem. The dotted line indicates flip-of-the-coin performance where the model has no discriminative ability to distinguish among the classes. The greater the area under the curve (AUC), the better the model.
  5. Kernel Density Estimation plots: This will result in n plots, where n = Number of trained classes. This plot shows the estimated probability density function for each class vs rest of the classes.

For the use case of this blog, please find respective metrics below:

Confusion Matrix

Figure 9: Confusion Matrix

ROC Curve

Figure 10: ROC Curve

Matthews Correlation Coefficient

Figure 11: Matthews Correlation Coefficient

F-1 Score

Figure 12: F-1 Score

Kernel Density Estimation (KDE)

Figure 13: KDE for Background vs Rest of the labels
Figure 14: KDE for Drums vs Rest of the labels
Figure 15: KDE for Violin vs Rest of the labels

With these performance metrics, a user can determine how “well” the model is performing on test data or in live classification scenario. With the help of this module, a user can decide different aspects of a ML pipeline like whether to retrain a model with different parameters, whether more data will help improving the performance or different sensitivities for different classes should be considered. In a nutshell, Live Classification Analysis enables the user to take more control over ML model development cycle based on performance analysis on test data.


Sensitivity Analysis with Qeexo AutoML

Qifan He, Dr. Rajen Bhatt 29 October 2020


For machine learning models, Sensitivity parameter reflects on how sensitive the model is for classes under consideration. Sensitivity Analysis is generally performed before deployment of ML models in the real world application. The primary objective of the Sensitivity Analysis is to make ML model lean more towards certain class(es) than the other(s). Often the sensitivity analysis is also related to the study of the tolerance for misclassifying instances of certain class(es) against the other(s). For example, consider the machine learning model designed to detect faults in the industrial equipment. Generally, an operator want to always make sure that defects, if any, are detected almost always. In this case, an operator is OK (even though it is not ideal) if some non-defects are being recognized as defects. Because the cost of defects not being recognized as defects is very high as this may damage the equipment permanently. While there are also costs of classifying non-defects as defects, these costs are comparatively very less and can be filtered manually as false alarms. In general, ML algorithms should try to reduce the false alarms as well. In this blog, we will discuss how to perform sensitivity analysis on Qeexo AutoML platform.

Sensitivity Analysis

Qeexo AutoML performs the sensitivity analysis using Class Weights. For the classification problem having C, C >= 2, number of classes, class weights is a C-dimensional array of integers > 0. During the model training phase, Qeexo AutoML assigns the weight of 1 to each classes, i.e., the initial (or default) weight vector is C-dimensional array of 1’s which can be represented as {w_1, w_2, ..., w_C}. This results in the initial sensitivity value of 1/C for each class represented as s_1, s_2, ..., s_C such that

    \[ \sum _{i=1}^{C}s_i = 1\ \]

For example, for binary classification problem, i.e., 2-class classification problem, the default sensitivity array is s_1, s_2 = 0.5, 0.5. If we start lowering one of the numbers, the model becomes more sensitive to that particular class. Lowering the sensitivity number of particular class is equivalent to increasing the weight of that class. All the model performance metrics such as Confusion matrix, Learning Curves, ROC curves, F-1 Score, and MCC are all computed with the default sensitivity value 1/C and class weights 1.

After training models on the Qeexo AutoML platform, you will be guided towards the Models page. Here you can see the details of each model and perform the live classification. You can go to the Live Classification Analysis to analyze the sensitivity of each class and update their influence on the model performance.

Figure 1: model details

When you click into Live Classification Analysis icon, you can see the following page.

Figure 2: Live Classification Analysis

In the first tab, you can see the description of the model, such as the classes used in training and the date it has been created. The second section Compiled History will save the history of the weights you have tried; when you click this for the very first time, it will show the default weights of 1 for each class. It also allows you to select and delete any of the weights combination in your history. The selected weights will be updated on your device once you click Selected on this page and push the library to the device with the button in the Live-data Collection tab or on the Model page.

The bigger the weight is, the more sensitive the model is to that class. In other words, the model is more likely to output class with the higher weight. Even though you can assign a weight to each class, only the relative differences between the class weights matter. That is, for three-class classification, weights {1,1,1} will have the same effect as {3, 3, 3} because this simply means each class has equal weights.

In the third tab, you can try different combinations of weights and see the simulation of their effects on model performance. The model performance is shown through two metrics. One is the bar chart showing the the accuracy of each class with the chosen weight combination. The second one is through the confusion matrix. The y-axis of this table is True Label, the x-axis is Predicted Label. The values on diagonal from top left to bottom right of this table is where predicted label matches true label. The perfect performance shown by confusion matrix should have zeros everywhere except the diagonal cells.

The last part of this page offers a chance to collect some new testing data and evaluate the model with the selected weights on the testing data.

Some Examples

Let us first take a look at a binary classifier example. For binary classifier, the default classification rule is the following:

    \[ y= \begin{dcases} class_1, & \text{if} P(class_1) > 0.5\\ class_0, & \text{otherwise} \end{dcases} \]

However, in reality, in order to make the classifier more sensitive towards class-0 using this model, we want to make the following classification rule:

    \[ y= \begin{dcases} class_1, & \text{if} P(class_1) > 0.75\\ class_0, & \text{otherwise} \end{dcases} \]

The new classification rule is stricter for class-1 and relaxed for class-0. This may be the better model compared to the default model because we may want to detect class-1 only if probability assignment of class-1 is highly confident, e.g. >=0.75, otherwise we want to classify the incoming signal (or pattern) into class-0. With this classification rule, the model remains the same but becomes more sensitive to one class over the other(s). In order to achieve this classification rule, the weights are computed as given below:

    \begin{align*} weight_0= 0.5/0.25 = 2 \\ weight_1 = 0.5/0.75 = 0.67 \end{align*}

For example, the models assigned the probabilities to each class as {0.4, 0.6}. With new weights, the weighted probabilities are:

    \begin{align*} weighted\_probability= [2, 0.67] * [0.4, 0.6] = [0.80, 0.40] \end{align*}

Now we need to compare weighted_probability with the default thresholds {0.5,0.5}, that results in the classification decision class-0. With the concept of weights for classes, we have achieved the effect as if the sensitivities are {0.25, 0.75} for each class. Please note, without weights, this signal would have classified to class-1. However, with relaxed sensitivity value for class-0 and stricter sensitivity value for class-1, we get the classification outcome as class-0. Note that the sum of probabilities doesn’t equal to 1. We can also normalize the probabilities and get the same prediction.

    \begin{align*} normalized\_weighted\_probability= [0.80/(0.8 + 0.40), 0.40/(0.80 + 0.40)] = [0.666, 0.333] \end{align*}

This weighted probability compute generalize very well with multi-class classification with each class having its own threshold.


For real-world applications, finding the right weights for each class is a matter of trial-and-error or some predefined human knowledge. Qeexo AutoML offers very efficient method to test with different class weights, quickly check the classification performance, and then push the newly determined class weights in order to perform the live testing.


The insideBIGDATA IMPACT 50 List for Q4 2020

insideBIGDATA 13 October 2020

The team here at insideBIGDATA is deeply entrenched in following the big data ecosystem of companies from around the globe. We’re in close contact with most of the firms making waves in the technology areas of big data, data science, machine learning, AI and deep learning. Our in-box is filled each day with new announcements, commentaries, and insights about what’s driving the success of our industry so we’re in a unique position to publish our quarterly IMPACT 50 List of the most important movers and shakers in our industry. These companies have proven their relevance by the way they’re impacting the enterprise through leading edge products and services. We’re happy to publish this evolving list of the industry’s most impactful companies!

Read the full article here:


Qeexo Adds Support for Arm’s Edge Processor

Datanami 12 October 2020

Qeexo, the “tinyML” specialist, said its AutoML platform now supports the smallest Cortex processors from Arm Ltd., making it the first vendor to automate machine learning on the Arm processor used for edge computing in sensors and microcontrollers.

Read the rest of the article here:


Sound Recognition with Qeexo AutoML

Zhongyu Ouyang and Dr. Geoffrey Newman 21 September 2020


Sound Recognition is a technology based on traditional pattern recognition theories and signal analysis methods which is widely used in speech recognition, music recognition and many other research areas such as acoustical oceanography [1]. Generally, microphones are regarded as sufficient sensing modalities as input to machine learning methods within these fields. Microphones are capable of capturing information necessary for the variety of classification tasks that can be performed on lightweight devices. With this type of sensor, Qeexo AutoML provides a diverse feature stack, taking advantage of the physical properties of microphone data to extract information relevant to such classification tasks. This blog will show you how to perform sound recognition with Qeexo AutoML and explain some of the basics concepts of our feature stack.

AutoML Tutorial

Qeexo AutoML offers a general use user-friendly interface for engineers who wants to perform sound recognition, or any other classification task on embedded devices. The processes discussed in this blog are not specific to sound recognition, but are specifically applicable to it. To get started, navigate to the training page and select (or upload) the labeled training data that you want to use to build models for your embedded device. In the Sensor Selection page, you can select the desired sensor types, (in our sound recognition example we utilize the microphone sensor), to choose the collected data, as shown in Figure 1.

Figure 1: Sensor Selection Page

You are also provided with the option of automatic sensor and feature group selection, if you want to use additional sensor modalities or experiment with feature subgroups. If this is selected, Qeexo AutoML will automatically choose the sensor and feature groups that make the classes most distinct. In the Inference Settings page, you can manually set up the instance length and the classification interval, or let Qeexo AutoML determines them by selecting Determine Automatically, as shown in Figure 2.

Figure 2: Inference Settings Page

In the Model Settings page, you can pick the algorithm(s), choose whether to generate learning curve and/or perform hyperparameter tuning and click Start Training button to start. After the training is finished, a binary file will be generated and can be flashed to the device by clicking the Push to Hardware button. Once the process is finished, you can perform live tests on the model that was built, as shown in Figure 3.

Figure 3: Model Details Page

While the process is by design very straightforward, the details of some of the choices may appear ambiguous.
Other blog posts go into some detail on different aspects of the pipeline, but we will focus on some of the feature
choices applicable to sound recognition.

Sound Recognition Highlighted Features

Fast Fourier Transform (FFT)

Signals in the time domain are difficult for humans and computers alike to distinguish among similar sound sources. One of the most popular ways to transform raw sound data is the Fast Fourier Transform (FFT). Due to the constraints of embedded devices, the FFT is an efficient frequency decomposition technique. The process is described in Figure 4.

Figure 4: FFT process

For different classes, the signals differ in their magnitudes for a given frequency bin. E.g., in Figure 5, sounds generated with different instruments have different distributions of the magnitudes among the frequencies 0-800 Hz; even with differences present up to 2000 Hz.

Figure 5: FFT Features for Different Classes

The Qeexo AutoML training methods will take advantage of the increased class separability in this range to train the model through model training. Qeexo AutoML doesn’t just use all of the FFT coefficients as input in training the model, but actually aggregate the coefficients to create sophisticated features. The specific groupings can be hand-picked during the model selection process to accommodate implementation constraints. To select the features groups, simply check the box(es) in the manual feature selection page as shown in Figure 6.

Figure 6: Manual Feature Selection Page

Mel Frequency Cepstral Coefficients (MFCC)

Mel Frequency Cepstral Coefficients (MFCC) is also an important technique for sound recognition. Humans react differently to distinct ranges of frequencies. As a species, we are more capable of telling the difference in frequencies between a 50Hz and a 100 Hz signal, than that between 10050Hz and 10100 Hz. In other words, we are really bad at distinguishing high pitched sounds. Therefore, in situations where you want to replicate a task performed by humans, such as voice separation, the difference when the frequency is low is the most important. The value of the signal properties decreases with increasing frequency. Mel scale comes into place here, by assigning more importance to the low frequency content and less to the high frequency content. The formula for converting from frequency to Mel score is:

    \begin{align*} M(f) = 1125 * ln(1+f/700)\\ \end{align*}

We build a filter bank containing many triangular filters and apply them to our FFT features to rescale the signals again and convert them to the corresponding Mel scales. In the Mel spectrograms shown in Figure 7, we can see that different classes’ Mel spectrograms appear to have many differences, making them ideal inputs for training a classifier.

Figure 7: Mel Spectrograms for Different Classes

Qeexo AutoML also provides features generated from the coefficients of MFCC. The feature groups can also be selected in the manual selection page shown in Figure 3. If desired, you can visualize the selected features through a UMAP plot by clicking the Visualize button shown in the Sensor Selection page and Feature Group Selection page.

Based on this discussion, it should be apparent that MFCC features will work well for tasks involving human speech. Depending on the task, it may be disadvantageous to include these MFCC features if it does not share similarities with human hearing. Qeexo AutoML performs automatic feature reduction, however, when automatic selection is enabled, so this does not need to be an active concern when training models. If the MFCC features are not highly separable for the task, assuming sufficient data is provided, they will be dropped from the final model during this process.


Qeexo AutoML not only provides model building functionality, but also present the details of the trained models. We provide evaluation metrics like confusion matrix, by-fold cross validation, ROC curve, and even support downloading the trained model to test it elsewhere. As mentioned earlier, we provide support for, but do not limit to microphone sensor usage for sound recognition. You are free to select any other provided sensors such as accelerometer and gyroscope. If these additional sensors don’t improve model performance, they won’t be included in the final device library, through the automated sensor selection process.


[1] Wikipedia: Sound Recognition,