Apache Fury Serialization Java Example

Serialization is a crucial process in software engineering that enables efficient storage, retrieval, and transmission of data structures or objects across systems. Apache Fury is designed to provide fast serialization with minimal overhead, making it ideal for performance-critical applications. Let us delve into understanding Java Apache Fury serialization and its advantages.

1. Overview

Serialization involves converting an object into a format (like binary) that can be easily stored or transmitted. The reverse process, deserialization, reconstructs the object from this format. Apache Fury is a modern serialization framework optimized for high performance. It aims to reduce the CPU and memory overhead commonly associated with traditional serialization tools. The following are the advantages:

  • High Performance: Apache Fury provides fast serialization and deserialization with low latency.
  • Cross-Language Support: It supports multiple languages like Java, Python, and more.
  • Compact Data Encoding: Efficient binary encoding reduces storage and bandwidth usage.
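  • Thread-Safe: Supports thread-safe serialization in concurrent environments through ThreadSafeFury (created with buildThreadSafeFury()); a plain Fury instance is not meant to be shared across threads.
  • No Class Registration Requirement: Can serialize objects without pre-registering classes once registration is turned off with requireClassRegistration(false). Registration is on by default as a security measure, so disable it only for trusted data.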
  • Flexible Schema Handling: Works well with schema evolution, allowing for changes in object structure over time.
  • Customizable Serialization: Offers options for customizing serialization to fit specific application needs.
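
To make the serialize/deserialize round trip described above concrete, here is a minimal sketch that uses only the JDK's built-in object streams rather than Fury. The RoundTripSketch class, its Point type, and the helper names are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class RoundTripSketch {

  // Hypothetical value class used only for this illustration.
  public static class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    public final int x;
    public final int y;

    public Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

  // Serialization: turn an object into a byte array.
  public static byte[] toBytes(Serializable value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  // Deserialization: rebuild the object from the byte array.
  public static Object fromBytes(byte[] bytes) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bytes = toBytes(new Point(3, 4));
    Point copy = (Point) fromBytes(bytes);
    System.out.println("Bytes written: " + bytes.length);
    System.out.println("Restored point: " + copy.x + ", " + copy.y);
  }
}
```

A serialization framework such as Fury replaces the two helper methods with its own encoder and decoder; the shape of the round trip stays the same.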

2. Serialization With Apache Fury

Apache Fury offers several benefits, such as:

  • Low Latency: Provides optimized serialization with extremely low latency.
  • Cross-Language Support: Apache Fury supports Java, Python, and other major languages.
  • Compact Encoding: Data is encoded efficiently, reducing bandwidth and storage needs.

2.1 Installation

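To use Apache Fury in your Java project, add the following dependency to your pom.xml file if you’re using Maven. The coordinates below are for the Apache release line, whose classes live in the org.apache.fury package; the older pre-Apache releases were published as org.furyio:fury-core and use the io.fury package. Check Maven Central for the latest version.

<dependency>
  <groupId>org.apache.fury</groupId>
  <artifactId>fury-core</artifactId>
  <version>0.7.1</version>
</dependency>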

3. Code Sample

3.1 Java Code Example

Let’s walk through a Java example where we serialize and deserialize an object using Apache Fury:

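package com.jcg.example;

// These imports match the Apache releases (org.apache.fury:fury-core);
// pre-Apache releases used the io.fury package instead.
import org.apache.fury.Fury;
import org.apache.fury.config.Language;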

import java.io.Serializable;

public class User implements Serializable {
  private String name;
  private int age;

  // Constructor
  public User(String name, int age) {
    this.name = name;
    this.age = age;
  }

  // Getters and setters (omitted for brevity)

  public static void main(String[] args) {
    // Create a Fury instance
    Fury fury = Fury.builder()
      .withLanguage(Language.JAVA)
      .requireClassRegistration(false)
      .build();

    // Create a User object
    User user = new User("Alice", 30);

    // Serialize the user object
    byte[] serializedData = fury.serialize(user);

    // Deserialize back to an object
    User deserializedUser = (User) fury.deserialize(serializedData);

    // Print results
    System.out.println("Serialized Data Length: " + serializedData.length);
    System.out.println("Deserialized Object: " + deserializedUser.name + ", " + deserializedUser.age);
  }
}

3.1.1 Code Breakdown

The code consists of the following parts:

  • Import Statements: We import the necessary classes from the Fury library.
  • User Class: We define a simple User class that implements Serializable.
  • Fury Instance: We create a Fury instance using the builder pattern, set the language to Java, and turn off mandatory class registration so that User can be serialized without being registered first.
  • Serialization: We serialize the User object using fury.serialize(), which returns a byte array.
  • Deserialization: We deserialize the byte array back to a User object using fury.deserialize().
  • Output: We print the length of the serialized data and the deserialized object’s properties.

3.1.2 Code Output

The output resembles the following, where the length depends on the Fury version and configuration:

Serialized Data Length: [Some Byte Array Length]
Deserialized Object: Alice, 30

4. Comparing Apache Fury

Let us compare Apache Fury with other popular serialization frameworks such as Java’s built-in serialization, Kryo, and Protobuf.

4.1 Performance Comparison

  • Apache Fury vs Java Serialization: Java’s built-in serialization is comparatively slow and produces large payloads, because each stream carries class descriptors along with the field values. Apache Fury is designed to improve both speed and serialized size.
  • Apache Fury vs Kryo: Kryo is faster than Java serialization but can take more effort to configure. The Fury project’s published benchmarks report better performance, and the builder-based setup shown earlier is short.
  • Apache Fury vs Protobuf: Protobuf requires a schema defined up front in .proto files and code generated from it, which makes ad hoc changes less convenient. Apache Fury aims for similar or better performance while serializing plain Java objects without a predefined schema.
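
To see where the size overhead of built-in serialization comes from, the sketch below encodes the same two fields twice: once with ObjectOutputStream and once by writing only the field values with DataOutputStream. The hand-written encoding is not Fury's wire format; it is just an assumed lower bound that shows how much of the built-in payload is class metadata. The PayloadSizeSketch class and its Person type are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class PayloadSizeSketch {

  // Hypothetical value class used only for this illustration.
  public static class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    public final String name;
    public final int age;

    public Person(String name, int age) {
      this.name = name;
      this.age = age;
    }
  }

  // Built-in serialization: writes a class descriptor plus the field values.
  public static byte[] javaSerialized(Person person) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(person);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  // Hand-written encoding: writes only the two field values.
  public static byte[] fieldsOnly(Person person) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeUTF(person.name); // 2-byte length prefix + encoded characters
      out.writeInt(person.age);  // 4 bytes
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  public static void main(String[] args) {
    Person alice = new Person("Alice", 30);
    System.out.println("Built-in serialization: " + javaSerialized(alice).length + " bytes");
    // "Alice" is 5 ASCII characters, so this is 2 + 5 + 4 = 11 bytes.
    System.out.println("Field values only: " + fieldsOnly(alice).length + " bytes");
  }
}
```

The gap between the two numbers is the metadata that compact binary formats try to shrink or avoid repeating.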

4.2 Benchmark Example

Here’s an example of benchmarking Apache Fury against Java’s built-in serialization:

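package com.jcg.performance;

// User lives in the example package from section 3.1.
import com.jcg.example.User;
// These imports match the Apache releases (org.apache.fury:fury-core);
// pre-Apache releases used the io.fury package instead.
import org.apache.fury.Fury;
import org.apache.fury.config.Language;

import java.io.*;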

public class SerializationBenchmark {

  public static void main(String[] args) throws IOException, ClassNotFoundException {
    User user = new User("Bob", 40);

    // Benchmarking with Apache Fury
    Fury fury = Fury.builder()
      .withLanguage(Language.JAVA)
      .requireClassRegistration(false)
      .build();

    long furyStartTime = System.nanoTime();
    for (int i = 0; i < 100000; i++) {
      byte[] serializedData = fury.serialize(user);
      User deserializedUser = (User) fury.deserialize(serializedData);
    }
    long furyEndTime = System.nanoTime();
    double furyTime = (furyEndTime - furyStartTime) / 1e9;
    System.out.printf("Apache Fury Time: %.4f seconds%n", furyTime);

    // Benchmarking with Java's built-in serialization
    long javaStartTime = System.nanoTime();
    for (int i = 0; i < 100000; i++) {
      // Serialize
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(bos);
      out.writeObject(user);
      out.flush();
      byte[] serializedData = bos.toByteArray();

      // Deserialize
      ByteArrayInputStream bis = new ByteArrayInputStream(serializedData);
      ObjectInputStream in = new ObjectInputStream(bis);
      User deserializedUser = (User) in.readObject();
    }
    long javaEndTime = System.nanoTime();
    double javaTime = (javaEndTime - javaStartTime) / 1e9;
    System.out.printf("Java Serialization Time: %.4f seconds%n", javaTime);
  }
}

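In this example, we compare the speed of Apache Fury and Java’s built-in serialization by serializing and deserializing the same object 100,000 times with each. Apache Fury typically outperforms Java serialization thanks to its optimized binary format. Treat a hand-rolled loop like this as a rough indication only: JIT compilation and Fury’s runtime code generation take time in the first iterations, so reliable numbers need a warm-up phase or a harness such as JMH.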

4.2.1 Code Breakdown

  • Benchmark Setup: We create a User object to be serialized repeatedly in the benchmark.
  • Apache Fury Benchmark: We measure the time taken to serialize and deserialize the object 100,000 times using Apache Fury.
  • Java Serialization Benchmark: We perform the same operation using Java’s built-in serialization mechanisms.
  • Time Calculation: We calculate the elapsed time in seconds for both methods and print the results.

4.2.2 Code Output

Sample output from one run; actual timings depend on hardware, JVM version, and warm-up:

Apache Fury Time: 0.8500 seconds
Java Serialization Time: 1.5000 seconds
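
The timings above come from a single cold loop for each framework. A small helper that runs the workload untimed before measuring gives steadier figures; the sketch below uses only the JDK, and WarmupTimingSketch, averageNanos, and the placeholder workload are hypothetical names introduced for illustration. It is not a substitute for a proper harness such as JMH.

```java
import java.util.function.Supplier;

public class WarmupTimingSketch {

  // Results are stored here so the JIT cannot discard the measured work.
  private static volatile Object sink;

  // Runs the task warmupIterations times untimed, then returns the average
  // nanoseconds per call over measuredIterations timed calls.
  public static double averageNanos(Supplier<?> task, int warmupIterations, int measuredIterations) {
    for (int i = 0; i < warmupIterations; i++) {
      sink = task.get();
    }
    long start = System.nanoTime();
    for (int i = 0; i < measuredIterations; i++) {
      sink = task.get();
    }
    long elapsed = System.nanoTime() - start;
    return (double) elapsed / measuredIterations;
  }

  public static void main(String[] args) {
    // Placeholder workload; in the benchmark above this would be the
    // serialize-then-deserialize call for one framework.
    Supplier<String> workload = () -> new StringBuilder("user").reverse().toString();
    double average = averageNanos(workload, 20_000, 100_000);
    System.out.printf("Average time per call: %.1f ns%n", average);
  }
}
```

Passing each framework's round trip as its own Supplier keeps the warm-up and measurement logic identical for both sides of the comparison.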

5. Conclusion

Apache Fury is a fast, efficient, and versatile serialization framework suited to modern Java applications where performance is critical. Its low latency, cross-language support, and compact encoding make it a strong alternative to traditional options such as Java’s built-in serialization, Protobuf, or Kryo. For projects that demand high throughput, it can speed up serialization and deserialization and so reduce overall application latency. Whether you’re working on a distributed system, a real-time application, or large-scale data processing, it is worth benchmarking Fury against your own workload.

Yatin Batra

An experienced full-stack engineer well versed in Core Java, Spring/Spring Boot, MVC, Security, AOP, Frontend (Angular & React), and cloud technologies (such as AWS, GCP, Jenkins, Docker, K8s).

1 Comment
Shawn
2 months ago

Hi, could you use apache fury 0.7.1 instead? The fury you are using is too old. And for benchmark, could you warm up for a while before collecting statistics? The codegen in fury will take some time. Or you can call Fury.register(classxxx, true) to generate serializer ahead.
