Async REST API Using Spring Boot

In this post, we will look at two of the most fundamental means of communication in the world of software development. Synchronous communication and asynchronous communication. We will then explore the various means of achieving asynchronous communication in applications. And to demonstrate the topic, we will build a fully functional Async REST API using Spring Boot.

Synchronous Communication

Whenever an HTTP client invokes an HTTP endpoint it expects a rather instantaneous response. i.e. it expects the result of request will be available either immediately or within a tolerable time limit (approx. 60 seconds.) If the client doesn’t receive a response within this time the client will usually time-out waiting for response and throw an error. This mode of communication is perfectly fine for any request that needs data which is available quickly ex: a trivial database lookup or some light-weight computation, etc… These requests are known as synchronous requests and in general we refer to it as synchronous client-server communication. In the REST architectural pattern, we typically use synchronous communication.

The dilemma..!

What happens when the submitted request needs a rather long time to process? Ex: submitting a file with payroll information for processing or adding a long list of users in the system with each receiving an email to confirm their account, etc… These requests need minutes to hours depending on the complexity of the systems involved.

Asynchronous Communication

The polling model of asynchronous communication

The communication pattern where a client drops off a job for background processing and walks away with a token to check back (poll) later on the status of the job, is what a typical asynchronous communication looks like. Usually these are long running jobs or long running tasks.

Here the client submits a job by sending a request to the service and it immediately receives a token associated with the submitted job. The service then spawns off another thread to process the submitted job. The client can use the token to check the status of the submitted job over and over again i.e. the client keeps polling the service to keep track of the updates to the submitted job.

Pro’s and Con’s of the polling model of asynchronous communication

As clearly evident, the client is solely responsible for keeping a track of the job status and to fetch the response/output of the job once the job is completed. An immediately noticeable drawback of this mechanism is that clients need to keep polling the service to keep track of submitted jobs and to get its output.

However, this scheme is favourable where the client sits in a network setup where inbound traffic is not allowed whereas there are no restrictions on the amount of outbound traffic. This is the most common network setup in the most usual cases that involve going across the organizational firewall i.e. network admins will allow outbound requests towards the internet however would not allow any inbound requests from the internet. Moreover, in the event of any outage on the client side, the submitted job’s status/output is not lost as the client can resume polling for the jobs that it submitted.

Being the most common model of asynchronous communication, we will focus more on the polling model in this post. However, before we move on, let us also take a look at…

The callback model of asynchronous communication

The communication pattern where a client drops off a job for background processing with a callback URI where it can be informed about the outcome of the job is another variation of asynchronous communication. Usually these are also long running jobs or long running tasks.

Here the client submits a job by sending a request to the service and doesn’t expect anything immediately (except for an HTTP Status 200 or 201.) The service then spawns off another thread to process the submitted job. Once the job is completed, it’s the responsibility of the service to inform the client about the outcome of the job.

Pro’s and Con’s of the callback model of asynchronous communication

Here the client is not responsible for tracking the job status. It can send a request and sit-back, relax and wait for a response to hit it at the specified callback URI. However, it means that the client needs to expose a service end-point and ensure it’s up & available to the service as the callback URI. i.e. the client is no longer a simple client, it’s now a server as well! Therefore any outage on the client side may potentially impact the service if it tries to reach the callback URI and finds it unavailable which may result in a missed callback. There are other mechanisms that can be put in place to overcome these limitations, however, as you can easily see, this starts bloating up the entire schematic resulting in maintenance over-heads. Also the client now needs to be accessible from the service even across organizational firewalls and if needed across the internet.

An application adhering to the polling model of asynchronous communication

We are going to build an application that exposes a RESTful API which offers endpoints that facilitate the polling model of asynchronous communication. Broadly, the app shall provide: an endpoint to submit (POST) a job with a file, an endpoint to poll for (GET) the status of the submitted job, an endpoint to fetch (GET) the response of the submitted job and an endpoint to clear (DELETE) the submitted job and any data associated with it.

We will use Spring Boot to create this application. Spring Boot allows us to quickly churn out fully working Proof-of-Concept applications to demonstrate key objectives. We will limit this post to highlight only those pieces of application as necessary so that you know what to look out for while building for a polling model of asynchronous communication.

The entire working Proof-of-Concept application can be accessed at: https://github.com/sanketdaru/async-jobs-over-restful-api

Async configuration

At the heart of the entire asynchronous processing in within Spring Boot, is the thread pool task executor which needs to be configured and exposed as a Spring bean:

@Configuration
@EnableAsync
public class AsyncConfig{
	@Bean(name = "asyncTaskExecutor")
	public Executor getAsyncExecutor() {
		ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
		executor.setCorePoolSize(2);
		executor.setMaxPoolSize(10);
		executor.setQueueCapacity(10);
		executor.setThreadNamePrefix("MyAsyncThread-");
		executor.initialize();
		return executor;
	}
}

Don’t forget to annotate the class with @Configuration and also @EnableAsync which enables Spring’s asynchronous method execution capability.

This configuration is the first foundation for implementing Spring Boot long running tasks.

Async service

Next, we mark as @Async the service method that will accept an input file and a job-id, and it will return a Future handle using a CompletableFuture.

        @Async("asyncTaskExecutor")
	public CompletableFuture<SimpleResponse> postJobWithFile(String jobId, File file) {
		LOGGER.info("Received request with job-id {} and file {}", jobId, file);

		CompletableFuture<SimpleResponse> task =  new CompletableFuture<SimpleResponse>();
		
		try {
			int numberOfVowels = 0;
			String fileContents = FileHelper.fetchFileContents(file);
			
			// Trivial loop to demonstrate a long-running task
			for (int i=0; i<fileContents.length(); i++) {
				switch(fileContents.charAt(i)){
					case 'a':
					case 'A':
					case 'e':
					case 'E':
					case 'i':
					case 'I':
					case 'o':
					case 'O':
					case 'u':
					case 'U': numberOfVowels++;
				}
				
				Thread.sleep(100);
			}
			
			StringBuilder outputFileContents =  new StringBuilder();
			outputFileContents.append("The job file contained ");
			outputFileContents.append(fileContents.length());
			outputFileContents.append(" chars, of which ");
			outputFileContents.append(numberOfVowels);
			outputFileContents.append(" were vowels.");
			
			File outputFile = getOutputFile(jobId);
			FileHelper.writeToFile(outputFile, outputFileContents.toString());
			task.complete(new SimpleResponse(jobId, RequestStatus.COMPLETE, outputFile));
		} catch (IOException e) {
			LOGGER.error("Error during file operation.", e);
			task.completeExceptionally(e);
		} catch (InterruptedException e) {
			LOGGER.error("Error while counting characters in file.", e);
			task.completeExceptionally(e);
		} finally {
			file.delete();
		}
		
		LOGGER.info("Completed processing the request.");
		return task;
	}

Don’t forget to annotate the method with @Async("asyncTaskExecutor") which indicates to Spring that this is an asynchronous method and provides it the executor configuration using bean name “asyncTaskExecutor”. The @Async annotation is the next foundation for implementing Spring Boot long running tasks.

The task is an instance of a CompletableFuture. During the processing of the job, it may complete normally or it may fail. We track the completion stage of the task using the methods complete or completeExceptionally for normal completion or failure respectively.

The job itself is fairly trivial. We count the total number of characters in the file and sleep for 100 milliseconds for each character. We also count the total number of vowels in the file. Finally we delete the uploaded file. 

Remember, this can be any real-life, long-running job.

The POST endpoint

We use a regular REST controller with a POST end-point that accepts a multipart/form-data.

        @PostMapping(consumes = "multipart/form-data", produces = "application/json")
	public SimpleResponse postJobWithFile(@RequestParam("file") MultipartFile file) 
			throws Throwable {
		LOGGER.info("Received request for asynchronous file processing.");

		String jobId = UUID.randomUUID().toString();
		LOGGER.info("Generated job-id {} for this request.", jobId);
		
		if (null != jobsService.fetchJob(jobId)) {
			throw new ErrorWhileProcessingRequest("A job with same job-id already exists!", true);
		}

		File uploadedFile = multipartFileHelper.saveUploadedFile(file, jobId);
		if (null == uploadedFile) {
			throw new ErrorWhileProcessingRequest("Error occurred while reading the uploaded file.");
		}

		CompletableFuture<SimpleResponse> completableFuture = jobsService.postJobWithFile(jobId, uploadedFile);

		asyncJobsManager.putJob(jobId, completableFuture);

		LOGGER.info("Job-id {} submitted for processing. Returning from controller.", jobId);
		return new SimpleResponse(jobId, RequestStatus.SUBMITTED);
	}

We use a random UUID as a token to track every incoming request, save the file sent with request to the local disk, kick-off the asynchronous service, put the CompletableFuture in an in-memory map and respond to the caller with the token.

The controller itself is a normal Spring Boot REST Controller. However, the service it invokes is an Asynchronous service and hence this scheme is also more commonly referred to as Spring Boot Async REST Controller.

Keeping track of jobs

We need to keep a track of all the jobs that the service is processing so as to be in a position to respond as appropriate to every subsequent request for status or output from clients. To do this, we need an AsyncJobsManager which essentially just wraps a ConcurrentMap. The limitation of this design is that once our service is killed, all job progress details are lost.

@Service
public class AsyncJobsManager {

	private final ConcurrentMap<String, CompletableFuture<? extends BaseResponse>> mapOfJobs;

	public AsyncJobsManager() {
		mapOfJobs = new ConcurrentHashMap<String, CompletableFuture<? extends BaseResponse>>();
	}

	public void putJob(String jobId, CompletableFuture<? extends BaseResponse> theJob) {
		mapOfJobs.put(jobId, theJob);
	}

	public CompletableFuture<? extends BaseResponse> getJob(String jobId) {
		return mapOfJobs.get(jobId);
	}

	public void removeJob(String jobId) {
		mapOfJobs.remove(jobId);
	}
}

The response

We have a very trivial response structure that has a job-id, status and an output file URI (applicable only for jobs that are completed.) The class SimpleResponse extends from BaseResponse to encapsulate the needed data points.

The other endpoints

In our rest controller, we need a GET endpoint where the clients can poll for job status, another GET endpoint using which the clients can retrieve the job output file and optionally a DELETE endpoint that can be leveraged by the clients to inform the service to destroy the job details including any output.

        @GetMapping(path = "/{job-id}", produces = "application/json")
	public SimpleResponse getJobStatus(@PathVariable(name = "job-id") String jobId) throws Throwable {
		LOGGER.debug("Received request to fetch status of job-id: {}", jobId);

		return jobsService.getJobStatus(jobId);
	}

	@GetMapping(path = "/{job-id}/output-file", produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
	public ResponseEntity<Resource> getJobOutputFile(@PathVariable(name = "job-id") String jobId) throws Throwable {
		LOGGER.debug("Received request to fetch output file of job-id: {}", jobId);

		File outputFile = jobsService.getJobOutputFile(jobId);
		InputStreamResource resource = new InputStreamResource(new FileInputStream(outputFile));

		return ResponseEntity.ok()
				.contentLength(outputFile.length())
				.contentType(MediaType.APPLICATION_OCTET_STREAM)
				.body(resource);
	}

	@DeleteMapping(path = "/{job-id}", produces = "application/json")
	public SimpleResponse deleteJobAndAssociatedData(@PathVariable(name = "job-id") String jobId) throws Throwable {
		LOGGER.debug("Received request to delete job-id: {}", jobId);

		return jobsService.deleteJobAndAssociatedData(jobId);
	}

Configuration

We use application.properties file as a standard means of managing our application configuration. The property app-configs.jobFilesLocation can be used to control where the uploaded job files and resultant output files need to be stored.

Code download

The entire working Proof-of-Concept application can be accessed at: https://github.com/sanketdaru/async-jobs-over-restful-api

Running the application

Being a gradle application, we can leverage gradle-wrapper to run the Spring Boot application. It will start a server bound to localhost and listen on port 8080 for incoming connections.

Use your favourite REST client to issue requests and check the output. For the simplistic use-case that it is, curl can also be used.

POST a new job

curl --location --request POST 'http://localhost:8080/api/v1/jobs' \
--form 'file=@"sample.txt"'

As soon as a new job is posted by a client a response is received immediately. Notice the nio-8080-exec- thread name which belongs to Tomcat handling the incoming request…

[nio-8080-exec-1] c.s.p.a.controller.JobsController        : Received request for asynchronous file processing.
[nio-8080-exec-1] c.s.p.a.controller.JobsController        : Generated job-id 26e62d65-3e6a-4aab-8d79-a64f1eeee783 for this request.
[nio-8080-exec-1] c.s.p.a.controller.JobsController        : Job-id 26e62d65-3e6a-4aab-8d79-a64f1eeee783 submitted for processing. Returning from controller.

… however the actual job is executed asynchronously in another thread named MyAsyncThread-

[MyAsyncThread-1] c.s.poc.asyncjob.service.JobsService     : Received request with job-id 26e62d65-3e6a-4aab-8d79-a64f1eeee783 and file /var/tmp/26e62d65-3e6a-4aab-8d79-a64f1eeee783.in
[MyAsyncThread-1] c.s.poc.asyncjob.helper.FileHelper       : Reading from file: /var/tmp/26e62d65-3e6a-4aab-8d79-a64f1eeee783.in
[MyAsyncThread-1] c.s.poc.asyncjob.helper.FileHelper       : Writing to file: /var/tmp/26e62d65-3e6a-4aab-8d79-a64f1eeee783.out
[MyAsyncThread-1] c.s.poc.asyncjob.service.JobsService     : Completed processing the request.

GET job status

GET the status of the posted asynchronous job. Use job_id received as response from posting a new job command as an input to next command

curl --location --request GET 'http://localhost:8080/api/v1/jobs/{job_id}'

Once the job status is COMPLETE, you can check the location configured by app-configs.jobFilesLocation property to inspect the job output file.

GET job output file

GET the output file produced as a result of completion of the posted asynchronous job. Use the output_file_uri received as response from get the status command as an input to next command

curl --location --request GET 'http://localhost:8080/{output_file_uri}'

DELETE job and output file

curl --location --request DELETE 'http://localhost:8080/api/v1/jobs/{job_id}'

You can check the location configured by app-configs.jobFilesLocation property to ensure the job output file is really deleted.

Conclusion

This was a short primer on how to get started with the polling model for asynchronous communication. We used Spring Boot to develop an Asynchronous (async) API using a scheme which is more commonly known as Spring Boot async REST controller. However do remember that the controller is a regular REST controller and the service it invokes is an @Async service. In future posts we will explore the other models of asynchronous jobs.

Hope you enjoyed reading the post and got to learn something. If this post helped you in any way, I would be thrilled to hear about it. Please leave a comment, your inputs inspire me to write more. Cheers!

3 Comments

  1. Eduardo September 12, 2022
    • Sanket Daru September 12, 2022
  2. Ged February 15, 2022

Leave a Reply