Building a Private Local AI Agent with Java 21 and Spring Boot
I've just released minimal-java-agent, an open-source template for building AI-powered agents using Java 21 and Spring Boot 4.0.3. The focus is on keeping it minimal, performant, and 100% local.
[Image: AI chat using Postman on localhost]
Tech Stack
- Java 21 + Spring Boot 4.0.3: An LTS Java release paired with a production-ready framework
- Virtual Threads: Lightweight concurrency for high-throughput scenarios
- Spring WebFlux: Non-blocking reactive stack for optimal resource utilization
- Ollama + Spring AI: Integration with local LLMs via Spring's ChatClient abstraction
- 100% Test Coverage: Comprehensive unit tests with JaCoCo reporting
- Multi-stage Dockerfile: Optimized container builds for deployment
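To make the virtual-threads item concrete, here is a minimal plain-Java sketch that is independent of the template's code. The class `VirtualThreadDemo` and the method `runTasks` are hypothetical names; the only thing it relies on is Java 21's `Executors.newVirtualThreadPerTaskExecutor()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of minimal-java-agent.
class VirtualThreadDemo {

    // Runs `count` blocking tasks, one virtual thread per task, and
    // returns how many of them reported running on a virtual thread.
    static int runTasks(int count) throws Exception {
        // try-with-resources: close() waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                Callable<Boolean> task = () -> {
                    Thread.sleep(10); // simulated blocking I/O
                    return Thread.currentThread().isVirtual();
                };
                futures.add(executor.submit(task));
            }
            int virtualCount = 0;
            for (Future<Boolean> future : futures) {
                if (future.get()) {
                    virtualCount++;
                }
            }
            return virtualCount;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Tasks on virtual threads: " + runTasks(1_000));
    }
}
```

Each task blocks in `Thread.sleep`, yet a thousand of them can run concurrently because a blocked virtual thread does not hold on to an OS thread.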
Architecture
The project follows a clean separation of concerns:
- controller/: REST endpoints for agent interaction
- component/: Core services (instance ID, thread info)
- model/: Data transfer objects
Code Example
The reactive controller uses WebFlux to handle requests asynchronously. It streams the model output and joins the chunks into a single response:
@PostMapping("/api/chat")
public Mono<ChatResponse> chat(@RequestBody String message) {
    return chatClient.prompt(message)
            .stream()
            .content()                      // Flux<String> of response chunks
            .collect(Collectors.joining())  // join the chunks into one String
            .map(llmResponse -> new ChatResponse(
                    agentInfo.getInstanceId(),
                    agentInfo.getAgentName(),
                    agentInfo.getThreadInfo(),
                    llmResponse
            ));
}
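The controller builds a ChatResponse from three AgentInfo getters plus the model output. The template's actual classes are not shown in this post, so the following is only a sketch of what they might look like. The field names, the String types, the constructor, and the thread-info format are all assumptions.

```java
import java.util.UUID;

// Hypothetical demo entry point, not part of minimal-java-agent.
class AgentInfoDemo {
    public static void main(String[] args) {
        AgentInfo info = new AgentInfo("minimal-java-agent");
        ChatResponse response = new ChatResponse(
                info.getInstanceId(),
                info.getAgentName(),
                info.getThreadInfo(),
                "Hello!");
        System.out.println(response);
    }
}

// Assumed shape of the DTO in model/, inferred from the four
// constructor arguments used in the controller.
record ChatResponse(String instanceId, String agentName, String threadInfo, String response) {}

// Assumed shape of the service in component/. In the real project this
// would presumably be a Spring-managed bean.
class AgentInfo {
    private final String instanceId = UUID.randomUUID().toString();
    private final String agentName;

    AgentInfo(String agentName) {
        this.agentName = agentName;
    }

    String getInstanceId() {
        return instanceId;
    }

    String getAgentName() {
        return agentName;
    }

    // Describes the thread that is executing the current call.
    String getThreadInfo() {
        Thread current = Thread.currentThread();
        return (current.isVirtual() ? "virtual" : "platform") + ":" + current.getName();
    }
}
```

A record keeps the DTO immutable and gives it a readable `toString()` with no extra code, which suits a minimal template.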
Local LLM Integration
The agent connects to Ollama running locally. All processing happens on your machine, so no data leaves your environment.
One-time setup:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama3.2:1b-q4_0
Test the agent:
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: text/plain" \
  -d "Hello"
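The same request can be sent from Java with the JDK's built-in `java.net.http.HttpClient`. This is a minimal sketch, assuming the agent is listening on `http://localhost:8080` as in the curl command. The class `ChatRequestDemo` and the helper `buildChatRequest` are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client class, not part of minimal-java-agent.
class ChatRequestDemo {

    // Builds the equivalent of the curl command: a POST with a text/plain body.
    static HttpRequest buildChatRequest(String baseUrl, String message) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat"))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(message))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = buildChatRequest("http://localhost:8080", "Hello");
        // Needs the agent running locally; otherwise send() throws an IOException.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```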
Performance on Mac Mini M4
- Model: llama3.2:1b-q4_0 (~600 MB)
- Response speed: 30-40 tokens/second
- Memory usage: ~1 GB total (Ollama + Spring Boot)
- CPU usage: < 10% during inference
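To put the token rate in terms of waiting time, a quick back-of-the-envelope calculation helps. The 200-token reply length below is an assumed figure for illustration, not a measurement from the project, and the class name `LatencyEstimate` is hypothetical.

```java
// Hypothetical helper for rough latency estimates.
class LatencyEstimate {

    // Seconds needed to generate `tokens` tokens at `tokensPerSecond`.
    static double secondsFor(int tokens, double tokensPerSecond) {
        return tokens / tokensPerSecond;
    }

    public static void main(String[] args) {
        int replyTokens = 200; // assumed reply length, not from the post
        System.out.println("At 30 tokens/s: " + secondsFor(replyTokens, 30) + " s");
        System.out.println("At 40 tokens/s: " + secondsFor(replyTokens, 40) + " s");
    }
}
```

At the reported 30-40 tokens/second, a reply of that length would take roughly 5 to 7 seconds to generate in full, not counting prompt processing.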
Key Features
- ✅ 100% local, no cloud dependencies
- ✅ Virtual threads for efficient concurrency
- ✅ Reactive streaming responses
- ✅ Full test coverage
- ✅ Docker-ready with multi-stage builds
- ✅ Clean, minimal architecture
Repository
GitHub: github.com/carlosquijano/minimal-java-agent
License: Apache 2.0
#Java #SpringBoot #AI #Ollama #OpenSource #VirtualThreads #WebFlux