= ApplicationPool algorithm == Introduction For efficiency reasons, Passenger keeps a pool spawned Rails applications. Please read the C++ API documentation for the ApplicationPool class for a full introduction. This document describes an algorithm for managing the pool. The algorithm should strive to keep spawning to a minimum. TODO: check whether the algorithm has thrashing behavior. == Definitions === Types Most of the types that we use in this document are pretty standard. But we explicitly define some special types: - list A doubly linked list which contains elements of type SomeType. It supports all the usual list operations that one can expect from a linked list, like add_to_back(), etc. The following operations deserve special mention: * remove(iterator) Removes the specified element from the list. _iterator_ is a linked list iterator: it probably contains the links and a reference to the actual list element, depending on the list implementation. This operation can be done in O(1) time. * move_to_front(iterator) Moves the specified element to the front of the list. _iterator_ is an iterator, as described earlier. - AppContainer A compound type which contains an application instance, as well as iterators for various linked lists. These iterators make it possible to perform actions on the linked list in O(1) time. An AppContainer has the following members: * app - An Application object, representing an application instance. * last_used (time) - The last time a session for this application instance was opened or closed. * sessions (integer) - The number of open sessions for this application instance. Invariant: (sessions == 0) == (This AppContainer is in inactive_apps.) * iterator - The iterator for this AppContainer in the linked list apps[app.app_root] * ia_iterator - The iterator for this AppContainer in the linked list inactive_apps. This iterator is only valid if this AppContainer really is in that list. === Special functions - spawn(app_root) Spawns a new instance of the application at the given application root. Throws an exception if something went wrong. This function is thread-safe. Note that application initialization can take an arbitrary amount of time. === Instance variables The algorithm requires the following instance variables for storing state information: - lock: mutex This lock is used for implementing thread-safetiness. We assume that it is non-recursive, i.e. if a thread locks a mutex that it has already locked, then it will result in a deadlock. - apps: map[string => list] Maps an application root to a list of AppContainers. Thus, this map contains all application instances that are in the pool. Invariant: for all values v in app: v is nonempty. for all 0 <= i < v.size() - 1: if v[i].app is active: v[i + 1].app is active An active application is one that has more than 0 active sessions. - max: integer The maximum number of AppContainer objects that may exist in 'apps'. - max_per_app: integer The maximum number of concurrent AppContainer objects a single application may spawn. - count: integer The current number of AppContainer objects in 'apps'. Since 'max' can be set dynamically during the life time of an application pool, 'count > max' is possible. - active: integer The number of application instances in 'apps' that are active. Invariant: active <= count - inactive_apps: list A linked list of AppContainer objects. All application instances in this list are inactive. Invariant: inactive_apps.size() == count - active for all c in inactive_apps: c is in apps. c.sessions == 0 - restart_file_times: map[string => time] Maps an application root to the last known modification time of 'restart.txt'. Invariant: for all keys app_root in restart_times: apps.has_key(app_root) - app_instance_count: map[string => unsigned int] Maps an application root to the number of spawned applications. Invariant: app_instance_count.keys == apps.keys for all (app_root, app_count) in app_instance_count: app_instance_count[app_root] < count app_count == apps[app_root].size() (sum of all values in app_instance_count) == count. == Algorithm in pseudo code # Thread-safetiness notes: # - All wait commands are to unlock the lock during waiting. function get(app_root): MAX_ATTEMPTS = 10 attempt = 0 time_limit = now() + 5 seconds lock.synchronize: while (true): attempt++ container, list = spawn_or_use_existing(app_root) container.last_used = current_time() container.sessions++ try: return container.app.connect() on exception: container.sessions-- if (attempt == MAX_ATTEMPTS): propagate exception else: # The app instance seems to have crashed. # So we remove this instance from our data # structures. list.remove(container.iterator) if list.empty(): apps.remove(app_root) app_instance_count.remove(app_root) count-- active-- # Returns a pair of [AppContainer, list] that matches the # given application root. If no such AppContainer exists, then it is created # and a new application instance is spawned. All exceptions that occur are # propagated. function spawn_or_use_existing(app_root): list = apps[app_root] if (list != nil) and (needs_restart(app_root)): for all container in list: if container.sessions == 0: inactive_apps.remove(container.ia_iterator) else: active-- list.remove(container.iterator) count-- apps.remove(app_root) app_instance_count.remove(app_root) list = nil Tell spawn server to reload code for app_root. if list != nil: # There are apps for this app root. if (list.front.sessions == 0): # There is an inactive app, so we use it. container = list.front list.move_to_back(container.iterator) inactive_apps.remove(container.ia_iterator) active++ else if (count >= max) or ( (max_per_app != 0) and (app_instance_count[app_root] >= max_per_app) ): # All apps are active, and the pool is full. # -OR- # All apps are active and the number of max instances # spawned for this application has been reached. # # We're not allowed to spawn a new application instance. # So we connect to an already active application. This connection # will be put into that application's queue. container = a container in _list_ with the smallest _session_ value list.move_to_back(container.iterator) else: # All apps are active, but the pool hasn't reached its # maximum yet. So we spawn a new app. container = new AppContainer # TODO: we should add some kind of timeout check for spawning. container.app = spawn(app_root) container.sessions = 0 iterator = list.add_to_back(container) container.iterator = iterator app_instance_count[app_root]++ count++ active++ else: # There are no apps for this app root. wait until (active < max) and (max_per_app == 0 or app_instance_count[app_root] < max_per_app) if count == max: # Here we are in a though situation. There are several # apps which are inactive, and none of them have # application root _app_root_, so we must kill one of # them in order to free a spot in the pool. But which # one do we kill? We want to minimize spawning. # # It's probably a good idea to keep some kind of # statistics in order to decide this. We want the # application root that gets the least traffic to be # killed. But for now, we kill a random application # instance. container = inactive_apps.pop_front list = apps[container.app.app_root] list.remove(container.iterator) if list.empty(): apps.remove(container.app.app_root) restart_file_times.remove(container.app.app_root) app_instance_count.remove(container.app.app_root) else: app_instance_count[container.app.app_root]-- count-- container = new AppContainer # TODO: we should add some kind of timeout check for spawning. container.app = spawn(app_root) container.sessions = 0 list = apps[app_root] if list == nil: list = new list apps[app_root] = list app_instance_count[app_root] = 1 else: app_instance_count[app_root]++ iterator = list.add_to_back(container) container.iterator = iterator count++ active++ return [container, list] # The following function is to be called when a session has been closed. # _container_ is the AppContainer that contains the application for which a # session has been closed. function session_has_been_closed(container): lock.synchronize: list = apps[container.app.app_root] if list != nil: container.last_used = current_time() container.sessions-- if container.sessions == 0: list.move_to_front(container.iterator) container.ia_iterator = inactive_apps.add_to_back(container.app) active-- function needs_restart(app_root): restart_file = "$app_root/tmp/restart.txt" s = stat(restart_file) if s != null: delete_file(restart_file) if (deletion was successful) or (file was already deleted): restart_file_times.remove(app_root) result = true else: last_restart_file_time = restart_file_times[app_root] if last_restart_time == null: result = true else: result = s.mtime != last_restart_file_time restart_file_times[app_root] = s.mtime else: restart_file_times.remove(app_root) result = false return result # The following thread will be responsible for cleaning up idle application # instances, i.e. instances that haven't been used for a while. thread cleaner: lock.synchronize: done = false while !done: Wait until CLEAN_INTERVAL seconds have expired, or until the thread has been signalled to quit. if thread has been signalled to quit: done = true break now = current_time() for all container in inactive_apps: app = container.app app_list = apps[app.app_root] if now - container.last_used > MAX_IDLE_TIME: app_list.remove(container.iterator) inactive_apps.remove(iterator for container) app_instance_count[app.app_root]-- count-- if app_list.empty(): apps.remove(app.app_root) app_instance_count.remove(app.app_root) restart_file_times.remove(app.app_root)