In a previous article, I detailed the open source projects that I used to implement a PaaS infrastructure.

Since then, the number of instances in the infrastructure has grown by 2.5x, and several of the components needed to be rethought.

Capacity/Performance Management

Previous: Collectd/Visage
Replacement: Collectd/Graphite
Reasons: The collectd backend was too slow and I/O heavy
Graphite graphs are easy to embed in dashboard applications
Ability to easily transform metrics, such as average CPU across a cluster of servers
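
As a rough illustration of the metric transformation mentioned above, the sketch below builds a Graphite render API URL that wraps a wildcard metric path in the `averageSeries` function. The hostname `graphite.example.com` and the metric path `collectd.web-*.cpu-0.cpu-user` are placeholders I made up, not names from this infrastructure, and the helper functions are hypothetical.

```python
from urllib.parse import urlencode


def cluster_average(metric_glob):
    """Wrap a wildcard metric path in Graphite's averageSeries function."""
    return f"averageSeries({metric_glob})"


def graphite_render_url(base_url, target, time_from="-1h", fmt="json"):
    """Build a render API URL for one target expression.

    base_url is an assumed Graphite web host; adjust it to your setup.
    """
    query = urlencode({"target": target, "from": time_from, "format": fmt})
    return f"{base_url.rstrip('/')}/render?{query}"


if __name__ == "__main__":
    # Hypothetical metric naming: one series per web server.
    target = cluster_average("collectd.web-*.cpu-0.cpu-user")
    print(graphite_render_url("http://graphite.example.com", target))
```

Requesting `format="png"` instead of `"json"` returns a rendered image, which is one way a graph could be embedded in a dashboard page.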

Continuous Integration

Previous: Selenium
Replacement: Custom Tests
Reasons: Selenium tests failed too often for indiscernible reasons
False positives slowed development too often

Log Collection

Previous: Rsyslog/Graylog2
Replacement: Logstash/ElasticSearch/Kibana
Reasons: MongoDB was too slow in EC2 for storing and searching logs
Logstash offers better parsing and indexing of logs with powerful filters
ElasticSearch is super fast and scales horizontally on EC2
Kibana is simple to use and allows developers to quickly find the relevant information
All of these components are easily integrated into our dashboard application
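
To make the parsing point concrete, here is a minimal sketch, in plain Python rather than Logstash configuration, of the kind of field extraction a parsing filter performs: a raw log line goes in and a structured document suitable for indexing comes out. The line format, the field names, and the `parse_failure` tag are all assumptions chosen for illustration, not the filters actually used here.

```python
import json
import re

# Assumed line shape: "<timestamp> <host> <program>[<pid>]: <message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<host>\S+) "
    r"(?P<program>[\w\-/]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)"
)


def parse_line(line):
    """Turn one raw log line into a dict of named fields."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep unparsed lines searchable, flagged with a hypothetical tag.
        return {"message": line, "tags": ["parse_failure"]}
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    if "pid" in fields:
        fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    sample = "2013-05-01T12:00:00Z web-01 nginx[1234]: GET /health 200"
    print(json.dumps(parse_line(sample), indent=2))
```

Once lines are split into fields like `host` and `program`, a search for one server or one service becomes a field query instead of a full-text scan.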

These changes not only allow the infrastructure to scale, but also provide APIs that allow easy integration with custom dashboards.