Uploaded by Armagan Ergun

01-MachineLearning Beningo

Simplifying Concepts, Delivering Success℠
DESIGNING INTELLIGENT SYSTEMS USING RESOURCE
CONSTRAINED EDGE DEVICES
Jacob Beningo | President
© 2019 Jacob Beningo All Rights Reserved
THE LECTURER
Jacob Beningo
President
Social Media / Contact
jacob@beningo.com
810-844-1522
Jacob_Beningo
Newsletters
Embedded Bytes
Beningo Engineering
JacobBeningo
Embedded Basics
http://bit.ly/1BAHYXm
www.beningo.com
Consulting
• Secure Bootloaders
• Code Reviews
• Architecture Design
• Real-time Software
• Expert Firmware Analysis
• Microcontroller Systems
Embedded Training
• RTOS Workshop
• Bootloader Design
• Debugging Techniques
• Security Fundamentals
• MicroPython
SESSION OVERVIEW
TOPICS
1 Designing Intelligent Systems
2 Machine Learning Basics
3 Intelligence in the Cloud
4 Intelligence at the Edge
5 Datasets, Frameworks and Libraries
6 Example Applications
7 Best Practices
OBJECTIVE
Explore artificial intelligence
applications at the edge on
Cortex-M processors.
INTRODUCTION
The Pillars of Embedded Software Development
For embedded software developers, there are core
skillsets that every developer must master such as:
• Architecture Design
• Code Analysis
• Debug
• Documentation
• Language Skills
• Processes and Standards
• Testing
• Tools
Artificial Intelligence
DESIGNING INTELLIGENT SYSTEMS
Machine Learning
“Machine learning is a field of computer science that often uses statistical
techniques to give computers the ability to ‘learn’ with data, without
being explicitly programmed”
- Wikipedia
DESIGNING INTELLIGENT SYSTEMS
Machine Learning
Why do we need intelligent systems?
• To solve problems that are not easy for humans to code for
• To scale system behaviors and results based on new data and situations
• To perform tasks that are easy for a human but traditionally difficult for computers
• To decrease system costs in certain applications
• Because it’s cool and cutting edge
DESIGNING INTELLIGENT SYSTEMS
Machine Learning
What can machine learning be used for?
• Image recognition
• Speech and audio processing
• Language processing
• Robotics
• Bioinformatics
• Chemistry
• Video Games
• Search
DESIGNING INTELLIGENT SYSTEMS
The Range of “Edge” Applications
Image Courtesy Arm
MACHINE LEARNING BASICS – DEEP LEARNING
Perceptron Neuron
w · x = Σ_j w_j x_j
Inputs: x1 = 0, x2 = 1, x3 = 1
Weights: w1 = 4, w2 = -2, w3 = 1
Bias: b = 3
Output = 0 if w · x + b ≤ 0
Output = 1 if w · x + b > 0
w · x = (0*4) + (1*-2) + (1*1) = -1
w · x + b = -1 + 3 = 2 > 0, so Output = 1
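The perceptron rule can be sketched in a few lines of Python. This is a minimal illustration rather than production code; the function name `perceptron` is chosen here for the example, and the input, weight, and bias values are the ones used in the slide's worked calculation.

```python
def perceptron(x, w, b):
    """Return 1 if the weighted sum plus bias is positive, otherwise 0."""
    weighted_sum = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if weighted_sum + b > 0 else 0


# Values from the worked example: inputs, weights, and bias.
x = [0, 1, 1]
w = [4, -2, 1]
b = 3

output = perceptron(x, w, b)
print(output)  # 1, because w · x + b = -1 + 3 = 2 > 0
```

Changing the bias to 0 leaves w · x + b = -1, which is not greater than 0, so the same inputs would then produce 0.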
MACHINE LEARNING BASICS – DEEP LEARNING
Sigmoid Neuron
Inputs x1, x2, x3 with weights w1, w2, w3
Output = σ(w · x + b)
Fractional value from 0 to 1
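A sigmoid neuron replaces the perceptron's hard threshold with the standard logistic function σ(z) = 1 / (1 + e^(-z)), so the output varies smoothly between 0 and 1. The sketch below assumes that standard definition; the function names are illustrative, and the inputs, weights, and bias are reused from the perceptron example purely for comparison.

```python
import math


def sigmoid(z):
    """Standard logistic function; not hardened against very large |z|."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_neuron(x, w, b):
    """Weighted sum plus bias, squashed into the range (0, 1)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return sigmoid(z)


# Same values as the perceptron example, so z = w · x + b = 2.
value = sigmoid_neuron([0, 1, 1], [4, -2, 1], 3)
print(round(value, 4))  # 0.8808
```

Where the perceptron reports only 1 for these inputs, the sigmoid neuron reports roughly 0.88, which a later layer or a training algorithm can use as a graded signal.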
MACHINE LEARNING BASICS – DEEP LEARNING
Sigmoid Neuron
The Sigmoid Function
MACHINE LEARNING BASICS – DEEP LEARNING
Neural Networks
Input Layer → Hidden Layers → Output Layer
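A forward pass through such a network is just the neuron calculation repeated layer by layer. The sketch below assumes fully connected layers of sigmoid neurons; the layer sizes and every weight and bias are hypothetical, untrained values chosen only to make the example run.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the activations through each (weights, biases) pair in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical parameters: 3 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = ([[0.5, -0.5, 0.25], [-1.0, 1.0, 0.0]], [0.0, 0.5])
output = ([[1.0, -1.0]], [0.0])

result = forward([0.0, 1.0, 1.0], [hidden, output])
print(result)
```

Training would adjust the weights and biases; the forward pass itself stays the same, and it is the only part that has to run on the embedded target.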
INTELLIGENCE IN THE CLOUD
Embedded Architectures
INTELLIGENCE IN THE CLOUD
Cloud Experimentation
Experiment Setup
• STM32F779I-Eval
• Google Cloud Vision APIs
• Express Logic
• X-Ware IoT Platform
  • ThreadX
  • NetX HTTPS Client
  • NetX Secure TLS
  • etc.
Board connections shown: Camera Module, Ethernet, LCD, AC Adapter, ST-Link
INTELLIGENCE AT THE EDGE
Why is ML Moving to the Edge?
Bandwidth
Power
Cost
Latency
Reliability
Security
Image Courtesy Arm
INTELLIGENCE AT THE EDGE
Model Deployment on Cortex-M MCUs
• Running a full ML framework on Cortex-M systems is impractical
• Bare-metal code is needed to use the limited resources efficiently
• Arm NN translates a trained model into code that runs on Cortex-M cores using CMSIS-NN functions
• CMSIS-NN: optimized low-level NN functions for Cortex-M CPUs
• CMSIS-NN APIs may also be used directly in the application code
Image Courtesy Arm
INTELLIGENCE AT THE EDGE
The Intelligent Edge
What do you need to do machine learning at the edge?
• DSP Capable Processor
• ML Libraries
• Enough CPU cycles
• Training Dataset
  • 5,000 labeled examples per category for acceptable performance
  • 10,000,000 labeled examples to achieve human performance
• Time and patience
Image Source: hackernoon
DATASETS, FRAMEWORKS AND LIBRARIES
Datasets
Figure: dataset size (number of samples, log scale from 10^0 to 10^9) plotted against year (1900 to 2015) for Canadian Hansard, WMT, ImageNet 10k, Sports-1M, ImageNet, Public SVHN, ILSVRC 2014, MNIST, Criminals, CIFAR-10, Iris, T vs. G vs. F, and Rotated T vs. G.
Image Courtesy Arm
DATASETS, FRAMEWORKS AND LIBRARIES
Software Frameworks
DistBelief
TensorFlow
MXNet
Theano
Software Libraries
PyLearn2
Torch
Caffe
DATASETS, FRAMEWORKS AND LIBRARIES
CMSIS-NN
CMSIS-NN: collection of optimized neural network functions for Cortex-M CPUs
Key considerations:
§ Improve performance using SIMD instructions
§ Minimize memory footprint
§ NN-specific optimizations: data-layout and offline weight reordering
Image Source: Arm
DATASETS, FRAMEWORKS AND LIBRARIES
CMSIS-NN: Efficient NN Kernels for Cortex-M CPUs
Convolution
§ Boost compute density with GEMM based implementation
§ Reduce data movement overhead with depth-first data layout
§ Interleave data movement and compute to minimize memory footprint
Pooling
§ Improve performance by splitting pooling into x-y directions
§ Improve memory access and footprint with in-situ updates
Activation
§ ReLU: Improve parallelism by branch-free implementation
§ Sigmoid/Tanh: fast table-lookup instead of exponent computation
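To make the pooling bullet concrete, the sketch below shows why max pooling can be split into x and y passes: the maximum over a 2-D window equals the maximum over the per-row maxima. This is a plain Python illustration of the idea only, not the CMSIS-NN implementation (which is written in C and updates buffers in place); all function names here are hypothetical.

```python
def max_pool_1d(row, size, stride):
    """Max over each window of `size` elements, stepping by `stride`."""
    return [max(row[i:i + size]) for i in range(0, len(row) - size + 1, stride)]


def max_pool_2d_separable(image, size, stride):
    """Pool along x within each row, then along y down each column."""
    pooled_x = [max_pool_1d(row, size, stride) for row in image]
    columns = [list(col) for col in zip(*pooled_x)]
    pooled_cols = [max_pool_1d(col, size, stride) for col in columns]
    return [list(row) for row in zip(*pooled_cols)]


def max_pool_2d_direct(image, size, stride):
    """Reference version: scan every element of every 2-D window."""
    rows = range(0, len(image) - size + 1, stride)
    cols = range(0, len(image[0]) - size + 1, stride)
    return [
        [max(image[r + dr][c + dc] for dr in range(size) for dc in range(size))
         for c in cols]
        for r in rows
    ]


image = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
separable = max_pool_2d_separable(image, size=2, stride=2)
print(separable)  # [[6, 5], [7, 9]]
```

Both versions give the same result, but the separable form reads each input element once in the x pass and then works on an already reduced intermediate, which is the kind of saving the slide refers to.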
CMSIS-NN paper: https://arxiv.org/abs/1801.06601
*Baseline uses CMSIS 1D Conv and Caffe-like Pooling/ReLU
Image Source: Arm
DATASETS, FRAMEWORKS AND LIBRARIES
CMSIS-NN: Efficient NN Kernels for Cortex-M CPUs
Many resources are available due to the openness of the ML community:
• DNN: https://research.google.com/pubs/archive/42537.pdf
• CNN: https://research.google.com/pubs/archive/43969.pdf
• CNN-GRU: https://arxiv.org/abs/1703.05390
• LSTM: https://arxiv.org/abs/1705.02411
Need compact models that fit within the Cortex-M system memory
Need models with fewer operations to achieve real-time performance
NN Models from literature trained on
Google speech commands dataset
Image Source: Arm
EXAMPLE APPLICATIONS
Convolutional Neural Network (CNN) on Cortex-M7
• CNN with 8-bit weights and 8-bit activations
• Total memory footprint: 87 kB weights + 40 kB activations + 10 kB buffers (I/O etc.)
• Example code available in the CMSIS-NN GitHub repository
Target board: NUCLEO-F746ZG (216 MHz, 320 KB SRAM)
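Two quick calculations help put these figures in context. The sketch below first shows one common way to store weights in 8 bits, a signed fixed-point format with 7 fractional bits (an assumption for illustration; the actual example may use different per-layer scaling), and then totals the footprint against the quoted SRAM size. The budget assumes all three regions share SRAM and treats kB and KB as the same unit; in practice constant weights may be placed in flash instead. The weight values are hypothetical.

```python
def quantize_q7(values, frac_bits=7):
    """Map floats to signed 8-bit integers with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    # Round to the nearest step, then saturate to the int8 range.
    return [max(-128, min(127, round(v * scale))) for v in values]


# Hypothetical weights, chosen only to show rounding and saturation.
weights = [0.5, -0.25, 0.99, -1.0]
quantized = quantize_q7(weights)
print(quantized)  # [64, -32, 127, -128]

# Footprint figures quoted above, in kB.
footprint_kb = {"weights": 87, "activations": 40, "buffers": 10}
sram_kb = 320  # NUCLEO-F746ZG SRAM as quoted above

total_kb = sum(footprint_kb.values())
fraction = total_kb / sram_kb
print(f"total: {total_kb} kB, {fraction:.1%} of SRAM")  # total: 137 kB, 42.8% of SRAM
```

Storing each weight in one byte instead of a 4-byte float is what brings the weight storage down to the tens of kilobytes, leaving more than half of the SRAM for the rest of the application under these assumptions.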
EXAMPLE APPLICATIONS
OpenMV Camera
OpenMV Cam with a Cortex-M7
Video: https://www.youtube.com/watch?v=PdWi_fvY9Og
BEST PRACTICES
Machine Learning
1 Read Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
2 Start in the cloud or on a PC and then work your way to the embedded target.
3 Create a “Hello World” application that can recognize handwritten digits.
4 Make sure you are using the right data.
5 Try multiple tools to see which one best fits your application and team.
6 Use 80% of your data for training and the last 20% for validating the model.
7 Review the Arm papers on keyword spotting and speech recognition.
8 Purchase a development kit and duplicate an example and then try to scale it.
9 Explore CMSIS-NN and the white papers that surround it.
10 Start early; don’t wait until the last minute to learn how machine vision works.
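The 80% training and 20% validation split from the list above can be sketched with the standard library alone. The function name, the fixed seed, and the stand-in dataset are assumptions made for this example; shuffling first is a common precaution so that the validation set is not just the tail of an ordered file.

```python
import random


def train_validation_split(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples, then cut it into two parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in dataset: 100 sample identifiers.
data = list(range(100))
train, validation = train_validation_split(data)
print(len(train), len(validation))  # 80 20
```

Every sample lands in exactly one of the two sets, so the validation score reflects data the model did not see during training.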
GOING FURTHER
Resources from beningo.com
• Embedded Bytes Newsletter
• Introduction Video: https://www.youtube.com/watch?v=aircAruvnKk
• Online Book: http://neuralnetworksanddeeplearning.com/
• MIT Course: http://introtodeeplearning.com/
• CMSIS-NN paper: https://arxiv.org/abs/1801.06601
• KWS (Keyword Spotting) paper: https://arxiv.org/abs/1711.07128
UPCOMING EVENTS
• RTOS Fundamentals Online
• Advanced RTOS Techniques Online
• Bootloaders Online
• Technology Primers
§ Debugging
§ Security
For more events, visit Beningo.com
THE NEXT SESSION WILL BEGIN SHORTLY
Introduction to the Cortex-M1 and Cortex-M3 using Arm DesignStart FPGA
This session will explore the benefits of using the Cortex-M1 and Cortex-M3 soft cores within Xilinx FPGAs. No FPGA knowledge is necessary to attend. The session will cover the architecture of the device, connecting peripherals, creating and deploying a project, and the available debugging options. Although the time available for these concepts is limited, attendees will take away a good overview of the benefits, the design and development life cycle, and how to address challenges encountered along the way.
Register at: https://www.beningo.com/insights/conferences
or
http://bit.ly/ArmDesignStartFPGA
THANK YOU!
beningo.com
Trademark and copyright statement:
The trademarks featured in this presentation are registered and/or unregistered trademarks of Beningo
Embedded Group (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks
featured may be trademarks of their respective owners.
Copyright © 2019. All rights reserved.