What to do when things go wrong?


I blogged earlier about blameless post-mortems and how one gets to a point that they are able to do blameless post-mortems – by having an operational rigor and observability. This is more of a lessons learnt post about what do you do and what you don’t when things go wrong?

Focusing on the Who?

A lot of times focusing on the “who reported the issue?” can be focusing on a wrong thing. If a report comes from a penetration test or a bug bounty researcher or an internal security engineering resource you need to make sure that the impact and likelihood is clearly understood. There are sometimes where customers (who pay or intend to pay for your service) report problems – these are obviously more important.

Focusing on the How?

How a security issue gets reported is important. As examples where you learn about a security issue via a bug report(1), or where you learn about it via your own telemetry(2) or you learn about it on Twitter! There is a potential for legal ramifications in each of these cases and the risks might be different. When things become public without your knowledge where you were not notified and the information is now public you do have a role to instill confidence in your current customers. The best approach here tends to be of sticking to facts without any speculations. If you are working on incident say so. Don’t say we are most secure when you are the subject of a breach discussion especially because you already have the data that you are not as secure. Identification and Containment of the security issue are top priorities – do not take resources that are doing these actions away to ensure Public Relations are good – doing that will eventually make public relations bad! Involve lawyers in your communications process and mark communications with right legal tags (“attorney client privileged material”) so that if a litigation happens you can clearly demarcate evidence that can be or cannot be part of a discovery.

Focusing on the What?

“What” needs to be done has to be clear with the help of an incident manager. The incident manager is the person who is most well read, subject matter expert, and leads the response process. Having this single-threaded ownership of leading the incident is incredibly important. The role of the incident manager is to ensure they have all the information that they need to make decisions. This also streamlines the process of public relations, legal needs, incident cleanup (eradication and recovery), and helps with swift and focused decision-making. This can sometimes be crisis management depending on impact and otherwise it can be just another day in the Security operations office. The key trait here is focus and goal-based decision making. Adrenaline can run high, tempers can flare – that typically happens when you are unprepared to handle security incidents. The tempers and nervousness can be avoided by being proactive in doing tabletop exercises, incident dry-runs and having good runbooks. But all practice games do is prepare you for the real thing – the real thing is how you handle a true incident. Use the help of key stakeholders to derive best decisions – there often tend to be situations where no answer looks good – and therein comes the customer focus – if you focus on the well being of customers you will rarely go wrong.

Focusing on the Why?

Capture incident response logs in tickets and communications so all the timeline and actions get captured properly with documentation. After the recovery is completed, do a blameless post-mortem of how you got there. Ensure you put a timeline of taking on agreed-upon corrective actions on a timeline that is agreed and don’t waiver – this is a part of operational rigor one needs to follow to really avoid incidents from happening in future. Typically, the reason why issues happen is because something was not prioritized as it should have been. Reprioritize to make sure you can reassess. Sometimes the size of the incident makes it your reprioritization almost coerced – it’s ok to be coerced in that direction. You will find that coercion is simply an acceleration of the actions that you should have taken up earlier. No one is perfect – just come out of it better!

Focusing on the Where?

Where you discuss the issue is important. When sizable incidents happen discuss is openly with the business leaders so that full awareness and feedback is provided in “powerful forums”. This obviously does not mean that you break your attorney client privilege – it just means discuss with the highest leaders in a manner where action items, impact and post-mortem results are provided. This enables business to become resilient over time and develop confidence in the security teams. If you need to do public releases then ensure that lawyers read it and security SMEs read it as well as business leaders read it – only then do such releases. Don’t let the “left hand meet right” situation ever occur. This instills customer confidence in your process.


This was just an attempt for me to pen-down my thoughts as they appeared in my brain. I am sure I forgot a lot such as stress of handling, avoiding knee-jerk reactions, etc. but these are top most important things that I felt were necessary to share. Remember, incident handling gets better with practice – you want the practice be done in practice games not in the olympics! 🙂


The historical evolution of Cross-Site Request Forgery


Having been in application security for more than 2 decades now and officially completing my 18th year now of being meaningfully employed in that space there is just a lot of crud that I have gathered in my brain. Most of that is history of how things came about to be. That stuff is likely not interesting to most but I find it intriguing as to how some seemingly minor decisions of one software vendor can have massive impact to the web application security industry.

Oh the dreaded IE…
Internet Explorer 4, 5 and 6 that started in the Windows XP days (or even earlier, can’t recall) had a setting – the cookie jar was not shared – i.e., if you opened a new window to a site, you would have to log in again unless you used “Ctrl+N” key to open a new window from an existing session. Each new process would have its own cookie jar. For the uninitiated, the “cookie jar” is the internal browser storage of cookies. Cookies are random looking strings that indicate a “trust token” that a web server places in the browser. Since HTTP is a connectionless protocol, this cookie is what preserves the “state” and this is exactly what authorization decisions in HTTP context are typically based on. These cookies are stored in a web browser data storage called cookie jar where each cookie gets stored with the name, value, domain, path (and today, there are few other attributes but that wasn’t the case back in 2004-2005). The browser gets all these parameters from the HTTP response header Set-Cookie. Microsoft, the vendor for Internet Explorer, made a decision that each new window of IE should have its own set of stored cookies that were not shared. Mozilla Firefox and Google Chrome always had a shared cookie jar if I recall correctly.

Along came a Cross-Site Request Forgery (CSRF)…

Jesse Burns from iSecPartners (an NYC-based security consultancy that was acquired by NCC group) back then wrote a paper which I think was the seminal paper on Cross-Site Request Forgery. They called it “XSRF” back then because “XSS” was already in parlance back-then. Thereafter, there were presentations in 2006 about the same by Microsoft. The whole attack was simple. The victim has a browser tab open in which they are logged into a site that has issued that session a cookie value. Due to the browser same-origin policy (a concept that Netscape designed in 1995) that cookie would be resent by the browser in the request as a Cookie HTTP request header whenever an HTTP request was sent to the same domain, protocol (“scheme”) and port. There were few idiosyncracies of IE (which made it infamous back then) such as if the port number did not match IE did not complain and would think that access was allowed per the Same-Origin Policy (SOP). What does that mean? http://example.com and http://example.com:81 would be treated as the same origin! Weird right? It wasn’t the case with other browsers. This was also documented in Michal Zalewski’s book Tangled Web in 2011. Where am I going with this? So while IE did some weird things, it did one good thing – isolate cookie jars. So if you opened up a new window where the attacker ran a payload that sent a request to the site which had handed you a cookie, the new IE window would have no interesting cookies to share with that site – inadvertently protecting the user from being a victim to a CSRF issue. Yes, the hated IE protected the users from being victim to CSRF! Who would have thought? That’s how weird 2005 was 🙂

Fast forward…

Since all browser vendor today have concept of shared cookie jars because who doesn’t like opening new tabs of their favorite cloud consoles without having to re-login right? So what did we the people do? We came up with another attribute that could be added to a Set-Cookie HTTP response header – SameSite attribute which restricted the cookie from being sent unless the request originated from a page on the same site as the cookie issuer.

So there you have it… the history of SameSite and how one of the most hated browsers of the day (IE) did one good thing for users – protect them from CSRF! 🙂