Axual 2021.2 is here!
We are happy to announce the Axual summer release, version 2021.2. Read on for insights and links to the relevant sections with details about the changes. The blog also contains two feature showcase videos, featuring the Azure Data Lake Gen2 Sink connector and Avro schema download.
Quality and stability improvements
Whatever software you work with, whether it is a simple text editor or a streaming platform like Kafka, everybody knows software is made by humans. At least, it still is. And that means that every now and then a little bug makes its way into a product release. Some of those bugs are just mildly annoying and some of them have a big impact on productivity. For the 2021.2 release we spent a lot of effort on improving the quality of the product.
Better feedback
As of 2021.2, both developers and operators of the Axual Platform receive more extensive feedback whenever an erroneous Avro schema is uploaded.
When using the Avro schema upload functionality, schemas are validated to make sure the syntax is correct. Whenever there is a problem with a schema, the developer receives feedback in the UI and the operator can find error logs in the management API.
This saves a lot of time trying to get the schema right.
In this release we also worked on preventing erroneous situations.
As you know, you can use our Self-Service interface to define environments that are mapped to a physical Kafka cluster. This helps you test-drive your apps on topics without having to create them under different names.
However, changing the environment name after you have created topics in that environment could lead to very weird behaviour. That’s why we added an extra check to prevent this from happening. As long as no topic exists in the environment, you are free to edit whatever you want.
Other bug fixes we worked on are:
- Windows line endings support for certificate and schema files
- Topic ACLs being accidentally removed
- Race condition in the AvroDeserializer used in Stream Browse & search in multi tenant deployments
- A Distributor caching issue causing some offsets not to be distributed correctly, potentially causing duplicate consumption of messages
Read more about bug fixes and improvements in the release notes which can be found in our online documentation.
Facilitating collaboration
Avro Schema Download
When producers and consumers interact through a topic, it is important that both parties speak the same language. That is where schemas can help. A schema contains information about the fields which are present in messages on the topic. The producer uses the schema to serialize and the consumer to deserialize messages. Every message contains a reference to the schema which is stored externally in a schema registry. Apache Avro is an example of a schema format which can be used.
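To make this a bit more concrete, here is a minimal Java sketch of how a schema drives serialization. The `Customer` schema and its fields are made up purely for illustration and are not tied to any Axual topic; in a real application the schema would be the one registered for your topic.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import java.io.ByteArrayOutputStream;

public class AvroSchemaExample {
    // A made-up schema, purely for illustration.
    private static final String SCHEMA_JSON =
        "{ \"type\": \"record\", \"name\": \"Customer\", \"fields\": ["
      + "  { \"name\": \"id\",   \"type\": \"string\" },"
      + "  { \"name\": \"name\", \"type\": \"string\" } ] }";

    public static void main(String[] args) throws Exception {
        // The producer parses the schema and uses it to serialize a record ...
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        GenericRecord record = new GenericData.Record(schema);
        record.put("id", "42");
        record.put("name", "Jane Doe");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();

        // ... while the consumer uses the same schema (resolved via the schema
        // registry) to deserialize the bytes it reads from the topic.
        System.out.println("Serialized " + out.size() + " bytes using schema "
            + schema.getFullName());
    }
}
```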
Through the Self-Service interface you can already upload an Avro schema with a particular version and a description. This way, you prepare your Avro schema to be selected as soon as you create a topic.
Moreover, we allow you to upload different versions of schemas and use different schema versions in different environments. This way you can test a schema in the development environment before promoting it to the acceptance or production environment.
So far so good. As producers and consumers we can now use the same schema in our applications to produce to and consume from the applicable topic.
However, in previous versions of our platform there was no real way to exchange the schemas used in producer or consumer applications. It was the responsibility of either the producer team or the stream team to make the schema available somewhere in the organization.
Avro schema download: allowing collaboration on a streaming platform
As of 2021.2 we make this a bit simpler. If an Avro Schema is used on a topic in a particular environment, you can click the version labels displayed for both the key and value, just below the topic name.
If you click the version label, the schema contents are shown. This allows you to quickly see what the payload of messages on this topic in this environment will look like. And finally, you can download this schema so you can use it in your app.
So things have come full circle when it comes to handling Avro schemas. From the upload to the download. To see this new feature in action, please check the following feature showcase.
How to facilitate collaboration on Apache Kafka using Avro Schema upload and download
Security improvements
For the 2021.2 release we have worked on multiple security improvements. Two of them we would like to highlight in this blog: Schema Registry & Discovery API SSL client authentication, and enabling the Authorisation Code flow with PKCE.
Schema Registry & Discovery API SSL client authentication
Schema Registry and Discovery API are both components in our infrastructure which are accessed by Kafka clients 24/7: Discovery API to return the Kafka cluster coordinates to the client, and Schema Registry to provide schemas or schema IDs to the consumer or producer respectively.
Of course, both the Schema Registry and Discovery API HTTP endpoints are secured by SSL, which means network traffic is encrypted (HTTPS). However, up until now, there was no client authentication happening, which means that any client on the network could access both services.
As these interfaces are read-only, there is not much to fear. But in a multi-tenant context where you cannot really trust the network, better security is a strict requirement. That’s why, as of release 2021.2, we are offering SSL Client Authentication for both Discovery API and Schema Registry.
What does this mean? Both services authenticate client connections based on the application certificate. As long as the service trusts the certificate authority of the connecting client, the connection is accepted.
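To illustrate what SSL client authentication looks like from the application side, here is a minimal, hypothetical Java sketch of an HTTPS call that presents a client certificate from a keystore. The file names, passwords and URL are placeholders; the exact configuration for your Axual clients is described in the client documentation.

```java
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.io.FileInputStream;
import java.net.URL;
import java.security.KeyStore;

public class MutualTlsExample {
    public static void main(String[] args) throws Exception {
        // Keystore holding the application certificate and private key (placeholder path/password).
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream("application-keystore.p12")) {
            keyStore.load(in, "changeit".toCharArray());
        }
        KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, "changeit".toCharArray());

        // Truststore holding the CA(s) the application itself trusts (placeholder path/password).
        KeyStore trustStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream("application-truststore.p12")) {
            trustStore.load(in, "changeit".toCharArray());
        }
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext sslContext = SSLContext.getInstance("TLS");
        sslContext.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

        // The server (e.g. Discovery API or Schema Registry) only accepts the connection
        // if it trusts the certificate authority that signed the client certificate presented here.
        URL url = new URL("https://discovery.example.com/"); // placeholder endpoint
        HttpsURLConnection connection = (HttpsURLConnection) url.openConnection();
        connection.setSSLSocketFactory(sslContext.getSocketFactory());
        System.out.println("Response code: " + connection.getResponseCode());
    }
}
```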
As an operator, you are in control of when to turn on this feature. This allows you to warn all connecting teams in your organization so they are prepared as soon as client authentication is turned on.
As for the compatibility at the client side: we can confirm that the following clients can connect to an authenticating Discovery API and Schema Registry without problems:
- Java client 5.4.4 and up
- .NET client 1.4.0 and up
- Python client 1.0.0
Self-Service: using the more secure PKCE instead of implicit flow
We introduced the Authorisation Code Flow with PKCE. PKCE (Proof Key for Code Exchange) is used to securely perform the OAuth exchange on public clients like a browser. It mitigates the risk of having the authorisation code intercepted. It does this by introducing an extra, dynamically generated secret (the code verifier) that is used when exchanging the authorisation code for an access token.
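To make the mechanism a bit more tangible, here is a minimal Java sketch of how a client could generate the PKCE code verifier and the corresponding S256 code challenge. This is a generic illustration of the RFC 7636 scheme, not Axual-specific code; in practice the Self-Service UI and its OAuth library handle this for you.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class PkceExample {
    public static void main(String[] args) throws Exception {
        // 1. Generate a high-entropy code verifier (kept secret by the client).
        byte[] randomBytes = new byte[32];
        new SecureRandom().nextBytes(randomBytes);
        String codeVerifier = Base64.getUrlEncoder().withoutPadding().encodeToString(randomBytes);

        // 2. Derive the S256 code challenge that is sent along with the authorisation request.
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(codeVerifier.getBytes(StandardCharsets.US_ASCII));
        String codeChallenge = Base64.getUrlEncoder().withoutPadding().encodeToString(digest);

        // 3. The client later sends the code verifier when exchanging the authorisation
        //    code for an access token, so an intercepted code alone is useless.
        System.out.println("code_verifier:  " + codeVerifier);
        System.out.println("code_challenge: " + codeChallenge);
    }
}
```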
Find out more about PKCE by reading this article OAuth 2.0: Implicit Flow is Dead, Try PKCE Instead | Postman Blog
In our operations documentation you can read more on how to enable the PKCE flow in your Axual installation. For a Helm chart deployment, click here. For an Axual CLI deployment, click here. The operations documentation also contains some settings you can tweak to set the correct timeouts.
REST Proxy configurability
REST Proxy is one of the many interfaces that we offer to produce to and consume from Kafka. With the added Avro Schema support it is a good second option besides using a language-specific client.
Whether you are talking about an on-premises installation or a (private) cloud, the environment the interface runs in and its latency to Kafka have a big impact on whether REST Proxy can work reliably.
As you might know, the Apache Kafka Java client library offers a lot of possibilities to configure the producer and consumer. As REST Proxy is using this library under the hood, we thought it made sense to offer this configurability to the proxy as well.
As of Axual REST Proxy 1.3.0 you can tweak the producer and consumer configurations by passing environment variables with a certain structure. In the documentation you will find examples of how to pass these configurations, depending on whether you deploy using Axual CLI or Helm charts. Moreover, we have added sensible defaults for some important settings like METADATA_MAX_AGE_MS and CONNECTION_MAX_IDLE_MS.
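As an illustration of the kind of settings involved, the sketch below shows the underlying Apache Kafka Java client properties that defaults such as METADATA_MAX_AGE_MS and CONNECTION_MAX_IDLE_MS correspond to. The exact environment variable naming and structure for REST Proxy is described in our documentation, so treat the values here as placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ProducerTuningExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // How often the client refreshes cluster metadata, even without topology changes.
        props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "60000");         // placeholder value
        // How long idle connections are kept open before being closed.
        props.put(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "180000"); // placeholder value

        // REST Proxy passes settings like these to the Kafka producers and consumers it
        // manages under the hood; in a deployment they are supplied as environment
        // variables rather than Java code.
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```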
Offloading Kafka topics to Azure Data Lake Gen2
Azure Data Lake Gen2 Sink Connector
Kafka is great for event-based and stream processing and is often used as a data source to feed analytics engines or data lakes.
Together with this platform release we are announcing the availability of the new Azure Data Lake Storage Gen2 Sink connector. It provides developers and analysts with a simple solution to load records from Kafka Topics into an Azure Data Lake Storage file. Other Azure services can then load this file for further processing and analysis or archiving. The connector uses the Avro Object Container format to store multiple records into a single file.
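As an illustration of the Avro Object Container format the connector writes, here is a hypothetical Java sketch that reads such a file back with the standard Avro library. The file name is a placeholder, and downstream Azure services would typically use their own Avro readers instead.

```java
import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class ReadContainerFileExample {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; in practice this would be a file written by the
        // ADLS Gen2 Sink Connector and fetched from the data lake.
        File file = new File("topic-partition-offset.avro");

        // The container file embeds the writer schema, so a generic reader can
        // deserialize every record without any external schema lookup.
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
            System.out.println("Writer schema: " + reader.getSchema().getFullName());
            for (GenericRecord record : reader) {
                System.out.println(record);
            }
        }
    }
}
```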
We are open sourcing this connector which means you can use it free of charge in any environment. However, we can imagine you need development or production support to make sure you set up your connector properly. Please contact us if you want to find out more.
The ADLS Gen2 Sink Connector is well documented on GitLab, with a complete list of possible configurations and ready-to-use examples. GitLab is also where you can find the source code.
Be sure to check out the feature showcase below if you want to learn how you can use the Azure Data Lake Gen2 Sink Connector yourself.
How to use the ADLS Gen2 Sink Connector to offload Kafka data into Azure Data Lake Gen2
What’s next?
Thanks for diving into the 2021.2 release. You can find more details about the release in our release notes, which are part of the documentation here. But the best thing is to try it out for yourself. We invite you to request a trial at the bottom of this page.