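To partition rate limits by user or API key in a horizontally scaled Spring Gen AI application, you can keep the rate-limit state in a distributed store such as Redis, under keys uniquely tied to each user or API key. Because every application instance reads and updates the same keys, the limit applies across the whole cluster rather than per instance.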
Here is code illustrating this approach:
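As a minimal sketch of the idea, the class below implements a sliding-window limiter partitioned by API key. It is not a Redis client: to stay self-contained it uses an in-memory map as a stand-in for one Redis sorted set per key, and the comments name the Redis commands (`ZREMRANGEBYSCORE`, `ZCARD`, `ZADD`) each step would correspond to. The class name `SlidingWindowRateLimiter`, the helper `keyFor`, and the parameters `maxRequests` and `windowMillis` are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sliding-window rate limiter partitioned by API key.
 * ASSUMPTION: the in-memory map stands in for Redis; each Deque plays the
 * role of one sorted set (ZSET) whose scores are request timestamps.
 */
final class SlidingWindowRateLimiter {
    private final int maxRequests;    // allowed requests per window
    private final long windowMillis;  // window length in milliseconds

    // One timestamp log per partition key, e.g. "rate_limit:abc123".
    private final Map<String, Deque<Long>> windows = new ConcurrentHashMap<>();

    SlidingWindowRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Builds the partition key in the rate_limit:{apiKey} form. */
    static String keyFor(String apiKey) {
        return "rate_limit:" + apiKey;
    }

    /**
     * Returns true if the request is allowed at time nowMillis.
     * Timestamps are assumed to be non-decreasing for a given key.
     */
    boolean tryAcquire(String apiKey, long nowMillis) {
        Deque<Long> log = windows.computeIfAbsent(keyFor(apiKey), k -> new ArrayDeque<>());
        // The lock stands in for the atomicity Redis would provide through
        // a MULTI/EXEC transaction or a server-side script.
        synchronized (log) {
            // Like ZREMRANGEBYSCORE: drop timestamps that left the window.
            long cutoff = nowMillis - windowMillis;
            while (!log.isEmpty() && log.peekFirst() <= cutoff) {
                log.pollFirst();
            }
            // Like ZCARD: count the requests still inside the window.
            if (log.size() >= maxRequests) {
                return false;
            }
            // Like ZADD: record this request's timestamp.
            log.addLast(nowMillis);
            return true;
        }
    }
}
```

With a limit of 2 requests per 1000 ms, a third call for the same API key inside the window is rejected, while a call for a different API key is still allowed because it lands in its own partition. In a real deployment the map would be replaced by Redis sorted-set calls (for example through Spring Data Redis) so that all instances share the same state.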
This approach relies on three things: Redis key partitioning, where a key of the form rate_limit:{apiKey} gives each API key its own rate-limit partition; the ZSET (sorted set) data structure, which efficiently manages a sliding window by tracking request timestamps; and atomic transactions, which keep updates consistent when several instances modify the same key concurrently.
With this approach, you can partition rate limits by user or API key in a horizontally scaled Spring Gen AI application.