On Fri 18 Jan we had a severe system incident that affected BidTheatre services. This incident report details what occurred and what measures have been taken to guard against similar future incidents.
On the afternoon of Friday the 18th, an intruder gained access to our database backend, which consists of a master/slave replica set. The replicated setup allows for fast failover should something happen to the master database. However, the intrusion rendered both master and replica unusable, and we had to resort to off-site backups to restore the database service. All in all, this caused most services to be unavailable from Friday evening to midday Saturday.
Fri 18 12:14 - Monitoring indicates that the standby replica database is inaccessible
Fri 18 15:00 - DB replica is established to have been intruded upon
Fri 18 16:00 - Intrusion is deemed likely to have been made possible by a weakness in the Microsoft RDP protocol for remote access
Fri 18 17:30 - Clean system setup of slave is initiated
Fri 18 18:15 - Unplanned maintenance of master db is initiated to secure it against the same vulnerability as the slave
Fri 18 18:20 - An incident issue to inform clients is created
Fri 18 18:55 - Master db is established to have been intruded upon. Clean system setup of server is initiated
Sat 19 09:50 - Backup restored on master db
Sat 19 18:00 - Full replication of database master - slave is restored
Bidding service: Halted from Fri 18 18:15 to Sat 19 13:10
API service: Unavailable from Fri 18 18:15 to Sat 19 10:55
Administration UI: Unavailable from Fri 18 18:15 to Sat 19 10:55
Adserving service: Operational
Data loss: From approx. 16:14 (time of last backup) to 18:20 on Fri 18 (time bidding was switched off). Please contact our platform team at email@example.com if you have experienced data loss, and we will try to remedy it to the best of our ability.
The intrusion was made possible by a weakness in the Microsoft RDP protocol used for remote management of servers. Since the attack, we have closed RDP access and will use other means to control the servers remotely.
Servers are now patched with the latest software updates within three days at most, instead of weekly.
We are deeply sorry for this incident; an outage of this sort is simply unacceptable. We have made it a first priority to employ state-of-the-art security measures such as two-factor authentication, encryption and anti-virus software, along with documented security policies and other procedures to enforce security. Nevertheless, the work of guarding Internet services against malicious entities is ongoing and never-ending.
We are committed to offering an uninterrupted, always-on 24/7 service of programmatic advertising, and I sincerely hope that we will have your continued trust in doing so.
Marcus Johansson, Founder & CEO BidTheatre
Jan 21, 19:05 CET