The Role of the Confusion Matrix in Cyber Security

Raja Sharma
4 min read · Jun 6, 2021

What is cybersecurity?

Cybersecurity is the protection of internet-connected systems such as hardware, software and data from cyberthreats. The practice is used by individuals and enterprises to protect against unauthorized access to data centers and other computerized systems.

A strong cybersecurity strategy can provide a good security posture against malicious attacks designed to access, alter, delete, destroy or extort an organization’s or user’s systems and sensitive data. Cybersecurity is also instrumental in preventing attacks that aim to disable or disrupt a system’s or device’s operations.

Why is cybersecurity important?

Getting hacked isn’t just a direct threat to the confidential data companies need. It can also ruin their relationships with customers, and even place them in significant legal jeopardy. With new technology, from self-driving cars to internet-enabled home security systems, the dangers of cyber crime become even more serious.

So, it’s no wonder that international research and advisory firm Gartner Inc. predicts worldwide security spending will hit $170 billion by 2022, an 8% increase in just a year.

What is a Confusion Matrix?

A Confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.

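It is extremely useful for measuring recall, precision, specificity and accuracy. It also underlies the AUC-ROC curve, which is built from the true positive and false positive rates at different classification thresholds.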
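As a minimal sketch of the N x N layout described above, the snippet below tallies actual versus predicted labels into a matrix. The function name `confusion_matrix`, the three class names and the six sample connections are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return an N x N list of lists: rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels for six observed connections (illustration only).
labels = ["benign", "dos", "probe"]
actual = ["benign", "benign", "dos", "dos", "probe", "benign"]
predicted = ["benign", "dos", "dos", "dos", "benign", "benign"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>7}: {row}")
```

Correct predictions land on the diagonal. Every off-diagonal cell is a specific kind of mistake, for example the "probe" connection that was predicted as "benign".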

Let’s understand TP, FP, FN and TN.

True Positive (TP)

  • The predicted value matches the actual value
  • The actual value was positive and the model predicted a positive value

True Negative (TN)

  • The predicted value matches the actual value
  • The actual value was negative and the model predicted a negative value

False Positive (FP)

  • The predicted value does not match the actual value
  • The actual value was negative but the model predicted a positive value

False Negative (FN)

  • The predicted value does not match the actual value
  • The actual value was positive but the model predicted a negative value
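The four definitions above can be sketched as a small counting routine for the two-class case. The function `count_outcomes`, its `positive` parameter and the sample labels are assumptions made for this example. "safe" is treated as the positive class to match the convention used in the rest of this article.

```python
def count_outcomes(actual, predicted, positive):
    """Count TP, TN, FP and FN for a two-class problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical observations (illustration only).
actual = ["safe", "safe", "attack", "attack", "safe"]
predicted = ["safe", "attack", "safe", "attack", "safe"]

outcomes = count_outcomes(actual, predicted, positive="safe")
print(outcomes)
```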

A confusion matrix exposes two types of error: Type I and Type II. In this article, a “positive” prediction means the model judged the system to be secure.

FP (Type I error): In this case, our model predicts the system is secure but our system is actually under attack. The cybersecurity officers therefore get no warning, and the attack can go undetected and cause major damage.

FN (Type II error): In this case, our model predicts that the system is insecure when it is actually secure. This makes the cybersecurity officers more active in investigating the issue, but it does not lead to any actual harm to the system.

Several standard metrics are defined for the two-class matrix:

  • Accuracy is the proportion of all predictions that were correct. It is calculated as: Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Recall, or the true positive rate (TPR), is the proportion of actual positive cases that the model correctly identified. It is calculated as: Recall = TP / (TP + FN)
  • Precision is the proportion of predicted positive cases that were actually positive. It is calculated as: Precision = TP / (TP + FP)
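The three metrics in this list use the standard two-class definitions, which can be sketched in a few lines of Python. The helper names and the sample counts are hypothetical. Returning 0.0 when a denominator is zero is a choice made for this sketch, not a universal convention.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (a choice made for this sketch)."""
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    # Share of all predictions that were correct.
    return safe_ratio(tp + tn, tp + tn + fp + fn)


def recall(tp, fn):
    # Share of actual positives that the model predicted as positive.
    return safe_ratio(tp, tp + fn)


def precision(tp, fp):
    # Share of predicted positives that were actually positive.
    return safe_ratio(tp, tp + fp)


# Hypothetical counts (illustration only).
tp, tn, fp, fn = 8, 5, 2, 5
print(f"accuracy:  {accuracy(tp, tn, fp, fn):.3f}")
print(f"recall:    {recall(tp, fn):.3f}")
print(f"precision: {precision(tp, fp):.3f}")
```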

Let’s relate this confusion matrix to a real-world example:

Consider a server that received 2000 packets or transmissions in 1 hour. This is our scenario.

Our model evaluated each packet or transmission and predicted whether or not it was dangerous to the server. For each one, we want to know whether it was good or suspicious.

  • True Positive: The model predicted 1350 times that no attack would take place, and in 1200 of those cases no attack actually happened. These are correct predictions.
  • False Positive: In the other 150 of those 1350 cases, an attack actually took place even though the model had predicted that none would. This is also called a Type I error.
  • False Negative: Out of the 650 times the model predicted that an attack would take place, 50 times the attack didn’t happen. This can be considered a “False Alarm” and is also called a Type II error.
  • True Negative: Out of the 650 times the model predicted that an attack would take place, 600 predictions were correct, which means an attack actually took place 600 times. Because of these predictions, the Security Operations Centre (SOC) receives a notification and can prevent the attack.
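Plugging the four counts from this scenario into the standard two-class formulas gives a quick check on the numbers. The sketch below keeps the article's convention that "positive" means "no attack predicted". The variable names, and the "missed attack rate" label for FP / (FP + TN), are choices made for this example.

```python
# Counts taken from the scenario above ("positive" = no attack predicted).
tp, fp, fn, tn = 1200, 150, 50, 600

total = tp + fp + fn + tn                 # all 2000 transmissions
accuracy = (tp + tn) / total              # correct predictions overall
precision = tp / (tp + fp)                # "no attack" predictions that were right
recall = tp / (tp + fn)                   # safe traffic correctly passed as safe
missed_attack_rate = fp / (fp + tn)       # real attacks the model waved through

print(f"total:              {total}")
print(f"accuracy:           {accuracy:.1%}")            # 90.0%
print(f"precision:          {precision:.1%}")           # 88.9%
print(f"recall:             {recall:.1%}")              # 96.0%
print(f"missed attack rate: {missed_attack_rate:.1%}")  # 20.0%
```

Accuracy looks healthy at 90%, yet 150 of the 750 real attacks (20%) went unflagged. This is why the Type I errors in this scenario deserve more attention than the 50 false alarms.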

Conclusion

Machine learning is an important part of the IT industry. It is used across many domains and continues to develop to meet the industry’s needs. We have also discussed how the confusion matrix works and how it helps with real-world problems.

Thanks For Reading!!
