How can I save the application state of a Node.js application that consists mostly of HTTP requests?
I have a script in Node.js that works with a RESTful API to import a large number (10,000+) of products into an e-commerce application. The API limits the number of requests that can be made, and we are starting to brush up against that limit. On a previous run the script exited with an Error: connect ETIMEDOUT, probably due to exceeding the API limits. I would like to be able to try connecting 5 times and, if that fails, resume after an hour when the limit has been restored.
It would also be beneficial to save the progress as the script runs, in case of a crash (power goes down, network fails, etc.), and to be able to resume the script from the point where it left off.
I know that Node.js operates as a giant event-queue, all http requests and their callbacks get put into that queue (together with some other events). This makes it a prime target for saving the state of current execution. Another nice-to-have (not strictly necessary for this project) would be the ability to distribute the work among several machines on different networks to increase throughput.
So is there an existing way to do this? A framework, perhaps? Or do I need to implement it myself? In that case, any useful resources on how this can be done would be appreciated.
Problem courtesy of: Michael Yagudaev
I’m not sure what you mean when you say:
"I know that Node.js operates as a giant event-queue, all http requests and their callbacks get put into that queue (together with some other events). This makes it a prime target for saving the state of current execution."
Please feel free to comment or expound on this if you find it relevant to the answer.
That said, if you’re simply looking for a persistence mechanism for this particular task, I might recommend Redis, for a few reasons:
- It allows atomic operations on many data types; for example, if you had an entry in Redis called num_requests_made that represented the number of requests made, you could increment this number easily in Redis using INCR num_requests_made, and it’s guaranteed to be atomic, making it easier to scale to multiple workers.
- It has several data types that could prove useful for your needs; for example, a simple string could represent the number of API requests made during a certain period of time (as in the previous bullet point); you might store details of failed API requests that need to be resubmitted in a list; etc.
- It provides pub/sub mechanisms which would allow you to communicate easily between multiple instances of the program.
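As a rough illustration of the first two points, here is a minimal sketch of a request counter and a list of failed requests. It assumes a client object exposing promise-based incr, rPush and lPop methods (the method names used by node-redis v4); the key name failed_requests, the helper function names, and the InMemoryClient stand-in are all made up for this example. The stand-in exists only so the sketch runs without a Redis server, and it does not provide Redis's cross-process atomicity.

```javascript
// In-memory stand-in so the sketch runs without a Redis server.
// It mimics only the three calls used below; it is NOT a real Redis client.
class InMemoryClient {
  constructor() {
    this.strings = new Map();
    this.lists = new Map();
  }

  // Like INCR: add one to the integer stored at `key` (missing keys start at 0).
  async incr(key) {
    const next = Number(this.strings.get(key) || 0) + 1;
    this.strings.set(key, String(next));
    return next;
  }

  // Like RPUSH: append `value` to the list at `key`, return the new length.
  async rPush(key, value) {
    const list = this.lists.get(key) || [];
    list.push(value);
    this.lists.set(key, list);
    return list.length;
  }

  // Like LPOP: remove and return the first element, or null if the list is empty.
  async lPop(key) {
    const list = this.lists.get(key) || [];
    return list.length > 0 ? list.shift() : null;
  }
}

// Count one API request. With a real Redis server INCR is atomic,
// so several workers can safely share the same key.
async function recordRequest(client) {
  return client.incr('num_requests_made');
}

// Remember a product whose import failed so it can be resubmitted later.
// 'failed_requests' is a hypothetical key name.
async function rememberFailure(client, product) {
  return client.rPush('failed_requests', JSON.stringify(product));
}

// Take the oldest failed product off the list, or return null when none remain.
async function nextFailure(client) {
  const raw = await client.lPop('failed_requests');
  return raw === null ? null : JSON.parse(raw);
}
```

With a real server you would pass a connected Redis client instead of new InMemoryClient(). Because the counter and the list live outside the Node.js process, a restarted script can read them back and carry on from where it stopped.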
If this sounds interesting or useful and you’re not already familiar with Redis, I highly recommend trying out the interactive tutorial, which introduces you to a few data types and commands for them. Another good piece of reading material is A fifteen minute introduction to Redis data types.
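As for the retry behaviour described in the question (five attempts, then an hour's pause), that part does not need Redis at all. The following is only a sketch under assumptions: withRetries, task, maxAttempts, cooldownMs and wait are names invented here, and it assumes the failure surfaces as an error whose code property is 'ETIMEDOUT', as Node.js network errors normally do. Note that it keeps retrying indefinitely, so you may want an upper bound on the number of rounds.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `task` until it succeeds. After `maxAttempts` consecutive timeouts,
// wait `cooldownMs` (one hour by default) before starting a new round.
// All option names are illustrative; `wait` can be swapped out for testing.
async function withRetries(
  task,
  { maxAttempts = 5, cooldownMs = 60 * 60 * 1000, wait = sleep } = {}
) {
  for (;;) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await task();
      } catch (err) {
        // Only connection timeouts are retried; anything else is rethrown.
        if (err.code !== 'ETIMEDOUT') throw err;
      }
    }
    // Every attempt in this round timed out: pause until the limit resets.
    await wait(cooldownMs);
  }
}
```

Each API call would be wrapped as withRetries(() => importProduct(product)), where importProduct stands for whatever function performs the HTTP request. Combined with progress stored in Redis, a crash or a long pause then costs nothing more than the request that was in flight.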
Solution courtesy of: Michelle Tilley