Collecting Data with a Custom SIEM System Built on Apache Kafka and Kafka Connect ft. Vitalii Rudenskyi

The best-informed business insights that support better decision-making begin with data collection, ahead of data processing and analytics. Enterprises nowadays are engulfed by data floods, with data sources ranging from cloud services, applications, to thousands of internal servers. The massive volume of data that organizations must process presents data ingestion challenges for many large companies. In this episode, data security engineer, Vitalli Rudenskyi, discusses the decision to replace a vendor security information and event management (SIEM) system by developing a custom solution with Apache Kafka® and Kafka Connect for a better data collection strategy.Having a data collection infrastructure layer is mission critical for Vitalii and the team in helping enterprises protect data and detect security events. Building on the base of Kafka, their custom SIEM infrastructure is configurable and designed to be able to ingest and analyze huge amounts of data, including personally identifiable information (PII) and healthcare data. When it comes to collecting data, there are two fundamental choices: push or pull. But how about both? Vitalii shares that Kafka Connect API extensions are integral to data ingestion in Kafka. The three key components to allow their SIEM system to collect and record daily by pushing and pulling: NettySource Connector: A connector developed to receive data from different network devices to Apache Kafka. It helps receive data using both the TCP and UDP transport protocols and can be adopted to receive any data from Syslog to SNMP and NetFlow.PollableAPI Connector: A connector made to receive data from remote systems, pulling data from different remote APIs and services.Transformations Library: These are useful extensions to the existing out-of-the-box transformations. Approach with “tag and apply” transformations that transform data into the right place in the right format after collecting data.Listen to learn more as Vitalii shares the importance of data collection and the building of a custom solution to address multi-source data management requirements. EPISODE LINKSFeed Your SIEM Smart with Kafka ConnectTo Pull or to Push Your Data with Kafka Connect? That Is the Question.Free Kafka Connect 101 CourseSyslog Source Connector for Confluent PlatformJoin the Confluent CommunityLearn more with Kafka tutorials, resources, and guides at Confluent DeveloperLive demo: Kafka streaming in 10 minutes on Confluent CloudUse 60PDCAST to get an additional $60 of free Confluent Cloud usage (details)

Om Podcasten

Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®️, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.