
Cyber Twitter: Using Twitter to generate alerts for Cybersecurity Threats and Vulnerabilities

 

In the broad domain of security, analysts and policymakers need knowledge about the state of the world to make timely, critical decisions, whether operational, tactical, or strategic. This knowledge has to be extracted from a variety of sources and then represented in a form that enables further analysis and decision making. Some of the underlying data sits in textual sources traditionally associated with Open-source Intelligence (OSINT). OSINT is intelligence gathered from publicly available, overt sources such as newspapers, magazines, social networking sites, video sharing sites, wikis, and blogs. In the cybersecurity domain, information available through OSINT can complement data obtained through traditional security systems and monitoring tools such as Intrusion Detection and Prevention Systems (IDPS). Cybersecurity information sources can be divided into two broad groups: formal sources, such as NIST’s National Vulnerability Database (NVD) and the United States Computer Emergency Readiness Team (US-CERT), and informal sources, such as blogs, developer forums, chat rooms, and social media platforms like Twitter, Reddit, and Stack Overflow. Both groups provide information about security vulnerabilities, threats, and attacks. So much is published on these sources every day that it is nearly impossible for a human analyst to comb through it manually, extract the relevant information, and then understand the contextual scenarios in which an attack might take place.

Twitter as an OSINT source


Over the past decade, Twitter has become a vital source of open-source intelligence. Researchers have used the site’s data to gather intelligence about the impact of natural disasters, terrorist attacks, and government elections, and to predict stock markets. In our work, we are interested in using Twitter as a source of information to study cybersecurity events. As and when new vulnerabilities are made public, Twitter users tweet about them (Figures) to spread the information across the network so that others can use it to secure their systems. Individual security experts such as Brian Krebs (an investigative journalist who writes about cybercrime) can be valuable sources on cybersecurity incidents. Established companies like @web security or @intersecww disseminate news, tips, and the latest information on web security, web application protection, hacker incidents, data breaches, penetration testing results, etc.




CyberTwitter Framework


We develop CyberTwitter, a framework that automatically issues cybersecurity vulnerability alerts to users (Figure). CyberTwitter begins by collecting relevant tweets by querying the Twitter API. The Tweet Collection module collects, cleans, and stores the tweets returned by the API. Every tweet is then processed by the Security Vulnerability Concept Extractor (SVCE), which extracts terms and concepts related to security vulnerabilities. Intelligence from these terms and concepts is converted to RDF statements using our intelligence ontology. We use the Unified Cybersecurity Ontology (UCO) to provide our system with cybersecurity domain information. The RDF Linked Data representation is stored in our “Cybersecurity Knowledge Base”, which allows our alert system to reason over the data. Finally, we issue alerts to the end user based on a “User System Profile”. The next few subsections explain the details and sub-modules of the system.
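The last two stages of this pipeline (turning extracted concepts into statements, then matching them against a user's profile) can be illustrated with a minimal Python sketch. It uses plain tuples in place of a real RDF store and does no ontology reasoning. The `EX` namespace, the property names `hasProduct` and `hasVulnerability`, and both function names are hypothetical and are not taken from CyberTwitter or UCO.

```python
# Hypothetical namespace and property names, used for illustration only.
EX = "http://example.org/intel#"


def to_triples(intel_id, concepts):
    """Turn extracted concepts into (subject, predicate, object) triples."""
    subject = EX + intel_id
    triples = [(subject, EX + "type", EX + "Intelligence")]
    for product in concepts.get("products", []):
        triples.append((subject, EX + "hasProduct", product.lower()))
    for vuln in concepts.get("vulnerabilities", []):
        triples.append((subject, EX + "hasVulnerability", vuln.lower()))
    return triples


def alerts_for(knowledge_base, user_profile):
    """Return the subjects whose product appears in the user's system profile."""
    wanted = {item.lower() for item in user_profile}
    return sorted(
        {s for (s, p, o) in knowledge_base
         if p == EX + "hasProduct" and o in wanted}
    )


# A tiny in-memory "knowledge base" built from two made-up tweets.
kb = to_triples("tweet1", {"products": ["OpenSSL"],
                           "vulnerabilities": ["buffer overflow"]})
kb += to_triples("tweet2", {"products": ["WordPress"]})

print(alerts_for(kb, ["openssl", "Ubuntu"]))
```

A real deployment would presumably store the statements in an RDF triple store and let a reasoner infer matches through the ontology; the exact string matching here only stands in for that step.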

Tweet Collection 

CyberTwitter collects data through the Twitter Stream API based on a set of keywords. These keywords are derived from the “User System Profile” and a list of cybersecurity terms (see Figure). For our system, we limit ourselves to tweets in the English language. After collecting the tweets, we clean the data using WordNet, a large lexical database for English.
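As a rough illustration of the keyword step, the sketch below merges a user system profile with a list of security terms and keeps only English tweets that mention at least one keyword. It does not call the Twitter API or WordNet. The sample profile, the term list, the `Tweet` class, and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical inputs: neither list comes from the original system.
USER_SYSTEM_PROFILE = ["Ubuntu", "OpenSSL", "Chrome"]
SECURITY_TERMS = ["vulnerability", "exploit", "zero-day", "buffer overflow"]


@dataclass
class Tweet:
    text: str
    lang: str  # language code attached to the tweet, e.g. "en"


def build_keywords(profile, terms):
    """Merge profile entries and security terms into one lowercase keyword set."""
    return {k.lower() for k in list(profile) + list(terms)}


def is_relevant(tweet, keywords):
    """Keep English tweets whose text mentions at least one keyword."""
    if tweet.lang != "en":
        return False
    text = tweet.text.lower()
    return any(k in text for k in keywords)


def collect(tweets, keywords):
    """Filter a batch of tweets down to the relevant ones."""
    return [t for t in tweets if is_relevant(t, keywords)]
```

Substring matching is deliberately crude: a short keyword could match inside an unrelated word, which is one reason a later cleaning step is needed.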

Cybersecurity Ontologies and Knowledge Bases 


A data feed sent through the Twitter Stream API essentially consists of a stream of strings that computers can process. In the real world, however, strings represent terms and concepts that may be ambiguous, and computers are not programmed to handle ambiguity. For example, the string “Apple” can refer to the company Apple Inc. or to the fruit. Semantic Web technologies help computer systems with this task by representing the real world as concepts, each associated with a Uniform Resource Identifier (URI). These concepts can also have attributes and relations to other concepts.
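The “Apple” example can be made concrete with a small Python sketch in which each sense of the string gets its own URI, and the surrounding words of a tweet decide which one applies. The `example.org` URIs, the context word sets, and the `resolve` function are hypothetical. Real systems would link to an existing knowledge base and use far richer context than word overlap.

```python
# Hypothetical concept table: two senses of the same label, each with its own URI.
CONCEPTS = {
    "http://example.org/concept/Apple_Inc": {
        "label": "Apple",
        "type": "Company",
        "context": {"iphone", "macos", "ios", "safari"},
    },
    "http://example.org/concept/Apple_fruit": {
        "label": "Apple",
        "type": "Fruit",
        "context": {"pie", "orchard", "juice"},
    },
}


def resolve(term, text):
    """Pick the URI whose context words overlap most with the text.

    Returns None when the term is unknown or no context word matches.
    """
    words = set(text.lower().split())
    candidates = [
        (len(info["context"] & words), uri)
        for uri, info in CONCEPTS.items()
        if info["label"].lower() == term.lower()
    ]
    if not candidates:
        return None
    score, uri = max(candidates)
    return uri if score > 0 else None


print(resolve("Apple", "Apple patches a Safari bug in iOS"))
```

Once a string is resolved to a URI, attributes such as `type` and relations to other concepts can be attached to that URI rather than to the ambiguous string.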


By: Anjan Neema

(Tech Intern, WCSF)



To know more about us, please visit: https://www.worldcybersecurities.com/

