
IoT Inspector Officer: Are You a Smart Speaker or a Smart Camera?
The rapid growth of Internet of Things (IoT) devices is transforming various aspects of daily life, including smart homes, healthcare, industrial automation, and critical infrastructure. These devices, which originate from diverse environments, are built by different vendors and designed for distinct purposes, creating a highly heterogeneous IoT ecosystem. However, from a network security perspective, this diversity presents significant challenges in accurately identifying and verifying the sources of communication between devices and servers.
IoT Device Identification: Challenges of Conventional Detection Techniques
Conventional device identification systems often rely on identity-based features, such as Media Access Control (MAC) addresses. However, this method alone is insufficient because many manufacturers do not follow a standardized approach, and some devices may not support this feature. This blog introduces a behavior-based approach that represents device behavior by combining key features extracted from network traffic data. As a real-life illustration, the traffic generated by smart security cameras differs significantly from that produced by smart speakers. Security cameras continuously record and transmit video data to the edge or cloud for processing. In contrast, smart speakers, such as the Amazon Echo Dot, typically remain idle until activated by a user command.
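To make the camera-versus-speaker contrast concrete, the sketch below summarizes a packet trace by its average throughput and by the fraction of one-second intervals in which nothing was sent. The `activity_profile` helper, the `(timestamp, size)` packet representation, and both synthetic traces are assumptions made for illustration; they are not measurements from real devices.

```python
from collections import Counter

def activity_profile(packets, duration_s):
    """Summarize a trace of (timestamp_seconds, size_bytes) tuples.

    Returns (mean bytes per second, fraction of one-second bins with no packets).
    """
    bytes_per_bin = Counter()
    for ts, size in packets:
        bytes_per_bin[int(ts)] += size  # group packets into one-second bins
    total_bytes = sum(bytes_per_bin.values())
    idle_fraction = 1 - len(bytes_per_bin) / duration_s
    return total_bytes / duration_s, idle_fraction

# Synthetic 60-second traces (assumed values, for illustration only).
# Camera-like: a 1200-byte packet every 0.1 s for the whole minute.
camera = [(i / 10, 1200) for i in range(600)]
# Speaker-like: one short burst of 400-byte packets around t = 30 s.
speaker = [(30 + i / 20, 400) for i in range(20)]

cam_rate, cam_idle = activity_profile(camera, 60)
spk_rate, spk_idle = activity_profile(speaker, 60)
print(f"camera:  {cam_rate:.0f} B/s, idle fraction {cam_idle:.2f}")
print(f"speaker: {spk_rate:.0f} B/s, idle fraction {spk_idle:.2f}")
```

Even these two coarse numbers separate the synthetic traces clearly: the camera-like trace is never idle, while the speaker-like trace is idle for almost the entire window.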
Practical Applications in Network Environments
When analyzing network traffic from various IoT devices, the deployment of an identification model becomes crucial. Access to network data can vary depending on whether packet headers, payloads, or both are available. This blog focuses on the challenging scenario where communication occurs over HTTPS, resulting in encrypted payloads. In such cases, only packet headers are accessible, as illustrated in Figure 1. To address this issue, key features must be extracted from the accessible data sources. Table 1 presents these extracted features, which are categorized as follows:
- Traffic Flow Features
- Network Protocol Features
- Packet Size and Length Features
- Handshake and Encryption Features
- Statistical Features (Descriptive Statistics)
- Entropy-Based Features
- Port-Based Features
- User-Agent Features
Table 1: HTTPS-based features extracted from traffic data
| No | Feature Name | No | Feature Name | No | Feature Name | No | Feature Name |
|---|---|---|---|---|---|---|---|
| 1 | JA3 | 14 | handshake version | 27 | q1 | 40 | max_p |
| 2 | stream | 15 | handshake cipher suites length | 28 | iqr | 41 | med_p |
| 3 | Inter arrival time | 16 | handshake cipher suites | 29 | sum_e | 42 | average_p |
| 4 | time since previously displayed frame | 17 | handshake extensions length | 30 | min_e | 43 | var_p |
| 5 | l4 tcp | 18 | handshake sig hash alg len | 31 | max_e | 44 | q3_p |
| 6 | l4 udp | 19 | payload entropy | 32 | med_e | 45 | q1_p |
| 7 | l7 http | 20 | sum_et | 33 | average_e | 46 | iqr_p |
| 8 | l7 https | 21 | min_et | 34 | var_e | 47 | user agent Browser |
| 9 | ttl | 22 | max_et | 35 | q3_e | 48 | user agent OS |
| 10 | Eth size | 23 | max_et | 36 | q1_e | 49 | user agent Device |
| 11 | Ip size | 24 | average_et | 37 | iqr_e | | |
| 12 | payload length | 25 | var_et | 38 | sum_p | | |
| 13 | tcp window size | 26 | q3 | 39 | min_p | | |
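The statistical and entropy-based entries in Table 1 can be illustrated with a short sketch. It computes the same kinds of summaries the table names (sum, min, max, med, average, var, q1, q3, iqr) over a generic list of numbers, plus a Shannon entropy over raw bytes. The helper names `describe` and `shannon_entropy`, the use of population variance, the "inclusive" quartile method, and the bits-per-byte entropy definition are all assumptions; the blog does not specify how its features are computed or what each suffix in the table refers to.

```python
import math
import statistics
from collections import Counter

def describe(values):
    """Descriptive statistics over a window of numeric values.

    Assumptions: population variance and 'inclusive' quartiles.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "med": statistics.median(values),
        "average": statistics.fmean(values),
        "var": statistics.pvariance(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 0.0 for empty input."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical window of packet sizes in bytes.
window = [60, 60, 120, 1500, 1500]
print(describe(window))

# Repetitive data has low entropy; uniformly spread bytes reach 8 bits per byte.
print(shannon_entropy(b"aaaa"), shannon_entropy(bytes(range(256))))
```

Encrypted or compressed bytes tend toward the high end of the entropy range, which is one reason an entropy value can act as a compact descriptor of traffic content.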
Machine Learning-based Classifiers for Identifying IoT Devices
To classify various IoT devices such as smart cameras, smart speakers, and smart TVs, multiple supervised machine learning techniques can be employed. Classifiers can be trained using all available features or a carefully selected subset. Suitable models include:
- Support Vector Machine (SVM)
- K-Nearest Neighbors (K-NN)
- Logistic Regression
- Extreme Gradient Boosting (XGBoost)
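As a rough illustration of how one of these classifiers uses the extracted features, here is a minimal K-NN sketch written with only the Python standard library. In practice a library such as scikit-learn or XGBoost would more likely be used; the `knn_predict` helper, the two-feature vectors, and the labeled training points are hypothetical and chosen only to show the mechanics.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, pre-scaled feature vectors: (mean packet size, idle fraction).
train = [
    ((0.90, 0.05), "smart camera"),
    ((0.85, 0.10), "smart camera"),
    ((0.95, 0.02), "smart camera"),
    ((0.20, 0.95), "smart speaker"),
    ((0.25, 0.90), "smart speaker"),
    ((0.15, 0.98), "smart speaker"),
]

print(knn_predict(train, (0.88, 0.07)))
print(knn_predict(train, (0.22, 0.93)))
```

Because K-NN relies on distances, features on very different scales (for example, byte counts versus entropy values) would normally be scaled before training; the vectors above are assumed to be scaled already.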
Conclusion
Identifying IoT devices in encrypted network environments presents challenges that conventional methods often fail to address. This blog introduces a behavior-based approach that combines multiple features extracted from network traffic to improve device identification accuracy. Feeding these features into properly trained machine learning classifiers yields more reliable device identification, which in turn strengthens both security and network management within IoT ecosystems.
Edited By: Windhya Rankothge, PhD, Canadian Institute for Cybersecurity