Enterprise Java

How To Stream/Serialize JPA Result As JAX-RS Response For Large Data

There are times that retrieving a large data set through JPA is necessary (e.g. more than 1,000,000 records) and having them stuffed into a single instance of java.util.List is just risky (memory barrier). So, here’s a quick solution of how a JAX-RS REST resource end-point could still give us a timely Response without breaking the memory constrain through streaming or serialization of JPA entity by “pages”.

Example Database Table And JPA Entity

Database Table

To demonstrate how we could achieve outputting of large data, here’s an example MySQL database table that we could use.

01
02
03
04
05
06
07
08
09
10
create database large_data_test_db;
use large_data_test_db;
 
create table generated_uuids (
    record_no bigint not null auto_increment,
    uuid varchar(100) not null,
    datetime_generated datetime not null,
    primary key(record_no),
    unique(uuid)
);

JPA Entity

Next, define the JPA entity class which represents the table structure of the above.

Codes for GeneratedUuidEntity.java

001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060
061
062
063
064
065
066
067
068
069
070
071
072
073
074
075
076
077
078
079
080
081
082
083
084
085
086
087
088
089
090
091
092
093
094
095
096
097
098
099
100
101
102
103
package com.developerscrappad;
 
import java.io.Serializable;
import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.NamedQueries;
import javax.persistence.NamedQuery;
import javax.persistence.Table;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;
 
@Entity
@Table( name = "generated_uuids" )
@NamedQueries( {
    @NamedQuery( name = "GeneratedUuidEntity.listAll", query = "SELECT u FROM GeneratedUuidEntity u" ),
    @NamedQuery( name = "GeneratedUuidEntity.queryRecordsSize", query = "SELECT count(u) FROM GeneratedUuidEntity u" )
} )
public class GeneratedUuidEntity implements Serializable {
 
    private static final long serialVersionUID = 12312312234234123L;
 
    @Id
    @GeneratedValue( strategy = GenerationType.IDENTITY )
    @Column( name = "record_no" )
    private Long recordNo;
 
    @Column( name = "uuid" )
    private String uuid;
 
    @Column( name = "datetime_generated" )
    @Temporal( TemporalType.TIMESTAMP )
    private Date datetimeGenerated;
 
    public GeneratedUuidEntity() {
    }
 
    public GeneratedUuidEntity( Long recordNo ) {
        this.recordNo = recordNo;
    }
 
    public GeneratedUuidEntity( Long recordNo, String uuid, Date datetimeGenerated ) {
        this.recordNo = recordNo;
        this.uuid = uuid;
        this.datetimeGenerated = datetimeGenerated;
    }
 
    public Long getRecordNo() {
        return recordNo;
    }
 
    public void setRecordNo( Long recordNo ) {
        this.recordNo = recordNo;
    }
 
    public String getUuid() {
        return uuid;
    }
 
    public void setUuid( String uuid ) {
        this.uuid = uuid;
    }
 
    public Date getDatetimeGenerated() {
        return datetimeGenerated;
    }
 
    public void setDatetimeGenerated( Date datetimeGenerated ) {
        this.datetimeGenerated = datetimeGenerated;
    }
 
    @Override
    public int hashCode() {
        int hash = 0;
        hash += ( recordNo != null ? recordNo.hashCode() : 0 );
 
        return hash;
    }
 
    @Override
    public boolean equals( Object object ) {
        // TODO: Warning - this method won't work in the case the id fields are not set
        if ( !( object instanceof GeneratedUuidEntity ) ) {
            return false;
        }
 
        GeneratedUuidEntity other = ( GeneratedUuidEntity ) object;
 
        if ( ( this.recordNo == null && other.recordNo != null ) || ( this.recordNo != null && !this.recordNo.equals( other.recordNo ) ) ) {
            return false;
        }
 
        return true;
    }
 
    @Override
    public String toString() {
        return "com.developerscrappad.GeneratedUuidEntity[ recordNo=" + recordNo + " ]";
    }
}

There are two named queries defined in GeneratedUuidEntity. The GeneratedUuidEntity.queryRecordsSize is to query the total record number of the table, whereas the GeneratedUuidEntity.listAll is to retrieve all of the records in the table.

Implementing The JAX-RS REST Resource (The Java EE way)

Let’s have a JAX-RS REST resource class by the name JPAStreamingRESTResourcewith an available JPA EntityManager (Persistence Unit Name: JPAStreamingPU) to be injected and to be obtained through a protected method getEntityManager().

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
@Path( "generated-uuids" )
@Stateless( name = "JPAStreamingRESTResource", mappedName = "ejb/JPAStreamingRESTResource" )
public class JPAStreamingRESTResource {
 
    @PersistenceContext( unitName = "JPAStreamingPU" )
    private EntityManager entityManager;
 
    protected EntityManager getEntityManager() {
        return entityManager;
    }
     
    /**
     * Say "NO" to response caching
     */
    protected Response.ResponseBuilder getNoCacheResponseBuilder( Response.Status status ) {
        CacheControl cc = new CacheControl();
        cc.setNoCache( true );
        cc.setMaxAge( -1 );
        cc.setMustRevalidate( true );
 
        return Response.status( status ).cacheControl( cc );
    }
}

Besides, we have a method name getNoCacheResponseBuilder(), which is to obtain a non-caching javax.ws.rs.core.Response.ResponseBuilder, so that we don’t get weird cached results later.

The JPA Invocation Methods

Next, let’s define two methods within the resource class, namely:

queryGeneratedUuidRecordsSize() – to retrieve the total number of records in the table

1
2
3
4
private int queryGeneratedUuidRecordsSize() {
    return getEntityManager().createNamedQuery( "GeneratedUuidEntity.queryRecordsSize", Long.class )
        .getSingleResult().intValue();
}

listAllGeneratedUuidEntities() – to retrieve the all of the data from the table, but with certain limitation criteria such as the start position of the record (recordPosition) and the maximum number of records per round trip to the database (recordsPerRoundTrip). The intention is to “page” the results so that the result list will not be overly bloated. We’ll see this in action later.

1
2
3
4
5
6
private List<GeneratedUuidEntity> listAllGeneratedUuidEntities( int recordPosition, int recordsPerRoundTrip ) {
    return getEntityManager().createNamedQuery( "GeneratedUuidEntity.listAll" )
        .setFirstResult( recordPosition )
        .setMaxResults( recordsPerRoundTrip )
        .getResultList();
}

Let The Streaming Begin

Now, let’s implement the resource end-point method, which retrieves the data without compromising size, at least in theoretical speaking. This method will return a JSON response with the data format of:

01
02
03
04
05
06
07
08
09
10
{
    "result": [
        {
            "record_no": 1,
            "uuid": "34d99089-3e36-4f00-ab93-846b61771eb3",
            "datetime_generated": "2015-06-28 21:02:23"
        },
    
    ]
}
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
@GET
@Path( "list-all" )
@Produces( "application/json" )
@TransactionAttribute( TransactionAttributeType.NEVER )
public Response streamGeneratedUuids() {
 
    // Define the format of timestamp output
    SimpleDateFormat df = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );
 
    return getNoCacheResponseBuilder( Response.Status.OK ).entity( new StreamingOutput() {
 
        // Instruct how StreamingOutput's write method is to stream the data
        @Override
        public void write( OutputStream os ) throws IOException, WebApplicationException {
            int recordsPerRoundTrip = 100;                      // Number of records for every round trip to the database
            int recordPosition = 0;                             // Initial record position index
            int recordSize = queryGeneratedUuidRecordsSize();   // Total records found for the query
 
            // Start streaming the data
            try ( PrintWriter writer = new PrintWriter( new BufferedWriter( new OutputStreamWriter( os ) ) ) ) {
 
                writer.print( "{\"result\": [" );
 
                while ( recordSize > 0 ) {
                    // Get the paged data set from the DB
                    List<GeneratedUuidEntity> generatedUuidEntities = listAllGeneratedUuidEntities( recordPosition, recordsPerRoundTrip );
 
                    for ( GeneratedUuidEntity generatedUuidEntity : generatedUuidEntities ) {
                        if ( recordPosition > 0 ) {
                            writer.print( "," );
                        }
 
                        // Stream the data in Json object format
                        writer.print( Json.createObjectBuilder()
                            .add( "record_no", generatedUuidEntity.getRecordNo() )
                            .add( "uuid", generatedUuidEntity.getUuid() )
                            .add( "datetime_generated", df.format( generatedUuidEntity.getDatetimeGenerated() ) )
                            .build().toString() );
 
                        // Increase the recordPosition for every record streamed
                        recordPosition++;
                    }
 
                    // update the recordSize (remaining no. of records)
                    recordSize -= recordsPerRoundTrip;
                }
 
                // Done!
                writer.print( "]}" );
            }
        }
    } ).build();
}

Cord Explanation:

This is quite simple actually. The trick is to define the expression of the anonymous class StreamingOutput by overriding the write() method, which in it, first query the total record size through queryGeneratedUuidRecordsSize(), then retrieves the records page by page through listAllGeneratedUuidEntities(). This method will make several round trips to the database, depending on the recordsPerRoundTrip value defined.

Full source codes for JPAStreamingRESTResource.java:

001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060
061
062
063
064
065
066
067
068
069
070
071
072
073
074
075
076
077
078
079
080
081
082
083
084
085
086
087
088
089
090
091
092
093
094
095
096
097
098
099
100
101
102
103
104
105
106
107
108
109
110
111
112
package com.developerscrappad;
 
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.text.SimpleDateFormat;
import java.util.List;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.json.Json;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.CacheControl;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.StreamingOutput;
 
@Path( "generated-uuids" )
@Stateless( name = "JPAStreamingRESTResource", mappedName = "ejb/JPAStreamingRESTResource" )
public class JPAStreamingRESTResource {
 
    @PersistenceContext( unitName = "JPAStreamingPU" )
    private EntityManager entityManager;
 
    private List<GeneratedUuidEntity> listAllGeneratedUuidEntities( int recordPosition, int recordsPerRoundTrip ) {
        return getEntityManager().createNamedQuery( "GeneratedUuidEntity.listAll" )
            .setFirstResult( recordPosition )
            .setMaxResults( recordsPerRoundTrip )
            .getResultList();
    }
 
    private int queryGeneratedUuidRecordsSize() {
        return getEntityManager().createNamedQuery( "GeneratedUuidEntity.queryRecordsSize", Long.class )
            .getSingleResult().intValue();
    }
 
    protected EntityManager getEntityManager() {
        return entityManager;
    }
     
    /**
     * Say "NO" to response caching
     */
    protected Response.ResponseBuilder getNoCacheResponseBuilder( Response.Status status ) {
        CacheControl cc = new CacheControl();
        cc.setNoCache( true );
        cc.setMaxAge( -1 );
        cc.setMustRevalidate( true );
 
        return Response.status( status ).cacheControl( cc );
    }
 
    @GET
    @Path( "list-all" )
    @Produces( "application/json" )
    @TransactionAttribute( TransactionAttributeType.NEVER )
    public Response streamGeneratedUuids() {
 
        // Define the format of timestamp output
        SimpleDateFormat df = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );
 
        return getNoCacheResponseBuilder( Response.Status.OK ).entity( new StreamingOutput() {
 
            // Instruct how StreamingOutput's write method is to stream the data
            @Override
            public void write( OutputStream os ) throws IOException, WebApplicationException {
                int recordsPerRoundTrip = 100;                      // Number of records for every round trip to the database
                int recordPosition = 0;                             // Initial record position index
                int recordSize = queryGeneratedUuidRecordsSize();   // Total records found for the query
 
                // Start streaming the data
                try ( PrintWriter writer = new PrintWriter( new BufferedWriter( new OutputStreamWriter( os ) ) ) ) {
 
                    writer.print( "{\"result\": [" );
 
                    while ( recordSize > 0 ) {
                        // Get the paged data set from the DB
                        List<GeneratedUuidEntity> generatedUuidEntities = listAllGeneratedUuidEntities( recordPosition, recordsPerRoundTrip );
 
                        for ( GeneratedUuidEntity generatedUuidEntity : generatedUuidEntities ) {
                            if ( recordPosition > 0 ) {
                                writer.print( "," );
                            }
 
                            // Stream the data in Json object format
                            writer.print( Json.createObjectBuilder()
                                .add( "record_no", generatedUuidEntity.getRecordNo() )
                                .add( "uuid", generatedUuidEntity.getUuid() )
                                .add( "datetime_generated", df.format( generatedUuidEntity.getDatetimeGenerated() ) )
                                .build().toString() );
 
                            // Increase the recordPosition for every record streamed
                            recordPosition++;
                        }
 
                        // update the recordSize (remaining no. of records)
                        recordSize -= recordsPerRoundTrip;
                    }
 
                    // Done!
                    writer.print( "]}" );
                }
            }
        } ).build();
    }
}

Watch Out

Do remember to tune the application server’s response connection timeout value to prevent java.io.IOException Premature EOF exception being thrown by the REST or Http Client.

Testing It

To test whether this works, just get the table loaded with just 567 records. Then, have the unit test invoke the end-point URL and save the retrieved JSON data to a file with the below unit test codes (Apache HttpClient is used):

Codes for JPAStreamingUnitTest.java:

001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
032
033
034
035
036
037
038
039
040
041
042
043
044
045
046
047
048
049
050
051
052
053
054
055
056
057
058
059
060
061
062
063
064
065
066
067
068
069
070
071
072
073
074
075
076
077
078
079
080
081
082
083
084
085
086
087
088
089
090
091
092
093
094
095
096
097
098
099
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
package com.developerscrappad;
 
import java.io.File;
import java.io.FileInputStream;
import static org.junit.Assert.*;
 
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.UUID;
import javax.json.Json;
import javax.json.JsonArray;
import javax.json.JsonObject;
import javax.json.JsonReader;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
 
public class JPAStreamingUnitTest {
 
    private static final String dbDriverClassname = "com.mysql.jdbc.Driver";
    private static final String dbUrl = "jdbc:mysql://localhost:3306/large_data_test_db";
    private static final String username = "username";
    private static final String password = "password";
    private static final int numberOfRecords = 567;
    private static final String jsonResultOutputFilename = "testing123.json";
 
    @BeforeClass
    public static void setUpClass() {
        try {
            Class.forName( dbDriverClassname );
 
            try ( Connection conn = DriverManager.getConnection( dbUrl, username, password ) ) {
                String insertSQL = "insert into generated_uuids (uuid, datetime_generated) values (?, now())";
 
                try ( PreparedStatement stmt = conn.prepareStatement( insertSQL ) ) {
                    for ( int i = 0; i < numberOfRecords; i++ ) {
                        System.out.println( "Inserting row: " + i );
 
                        stmt.setString( 1, UUID.randomUUID().toString() );
                        stmt.executeUpdate();
                    }
                }
            }
        } catch ( final Exception ex ) {
            ex.printStackTrace();
            fail( ex.getMessage() );
        }
    }
 
    @AfterClass
    public static void tearDownClass() {
        try {
            Class.forName( dbDriverClassname );
 
            try ( Connection conn = DriverManager.getConnection( dbUrl, username, password ) ) {
                String truncateSQL = "truncate generated_uuids";
                conn.createStatement().executeUpdate( truncateSQL );
            }
 
            new File( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ).delete();
        } catch ( final Exception ex ) {
            ex.printStackTrace();
            fail( ex.getMessage() );
        }
    }
 
    @Test
    public void testJPAStreaming() {
 
        try {
            CloseableHttpClient httpclient = HttpClients.createDefault();
            HttpGet httpGet = new HttpGet( url );
 
            try ( CloseableHttpResponse response1 = httpclient.execute( httpGet ) ) {
                System.out.println( response1.getStatusLine() );
                HttpEntity entity1 = response1.getEntity();
                Files.copy( entity1.getContent(), FileSystems.getDefault().getPath( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ) );
            }
 
            // Validate
            try ( JsonReader jsonReader = Json.createReader( new FileInputStream( new File( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ) ) ) ) {
                JsonObject jsonObj = jsonReader.readObject();
                assertTrue( jsonObj.containsKey( "result" ) );
 
                JsonArray jsonArray = jsonObj.getJsonArray( "result" );
                assertEquals( numberOfRecords, jsonArray.size() );
 
                SimpleDateFormat validationDF = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );
 
                for ( int i = 0; i < jsonArray.size(); i++ ) {
                    JsonObject generatedUuidJsonObj = jsonArray.getJsonObject( i );
                    int recordNumber = generatedUuidJsonObj.getInt( "record_no" );
                    assertTrue( recordNumber > 0 );
 
                    try {
                        UUID.fromString( generatedUuidJsonObj.getString( "uuid" ) );
                    } catch ( IllegalArgumentException ex ) {
                        fail( "Invalid UUID format at record number: " + recordNumber );
                    }
 
                    try {
                        validationDF.parse( generatedUuidJsonObj.getString( "datetime_generated" ) );
                    } catch ( final NullPointerException | ParseException ex ) {
                        fail( "datetime_generated field must not be null and must be of format yyyy-MM-dd HH:mm:ss" );
                    }
                }
            }
 
        } catch ( final Exception ex ) {
            ex.printStackTrace();
            fail( ex.getMessage() );
        }
    }
}

And we are done. Thanks for reading and hope this helps.

Max Lam

Born and currently resides in Malaysia, a seasoned Java developer whom had held positions as Senior Developer, Consultant and Technical Architect in various enterprise software development companies.
Subscribe
Notify of
guest


This site uses Akismet to reduce spam. Learn how your comment data is processed.

1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Seb Trap
Seb Trap
7 years ago

The JSR-352 and here the javax.batch.api.chunk was introduced for the handling of large operations.
There are connectors available for JDBC, JPA, and others based on the needs.

Back to top button