Saturday, April 22, 2017

Putting my node.js-toolbelt aside...

So I'm about to head into a new challenge, which does not include much - if any - node.js development. Therefore I'm trying to conserve as much of my current knowledge as possible for others and potentially a future me (hey tom! :)).


Ubuntu is the one and only operating system to use if you ask me, but I'm sure you can get stuff done with any of the operating systems nowadays. As an editor I love Atom for its simplicity, ecosystem and performance (see: how to install Atom on Ubuntu). Chrome is the way to go for its incredible developer tools of course.

Installing npm properly is surprisingly hard because of file permissions. If you don't follow the steps below you'll have to install global dependencies as root (and also some local dependencies if I remember correctly?): 

Frequently used modules (server-side)

When creating a new project I see myself using the same modules over and over again:
  • moment: I haven't touched a Date-object in a year or so...
  • express: it took me months to finally give it a try, but it really makes things a lot easier!
  • Q: I'm trying to use native Promises, but for advanced promise-handling (e.g. wait for several promises to finish) this library is still the way to go
  • bunyan: must-have if you're serious about your application. Crank up your logging for easy error-detection. Also incredibly helpful if you're using multithreading (cluster)
  • request-promise: for requests...
  • standard: You're doing it wrong if you're not using a linter. It took me way too long to figure this out, since I was too lazy to figure out the correct rules, but that is exactly the point of standardjs: just use a fairly standard ruleset across all projects and get coding!
Some more notes on standardjs: I hate single quotes and I was absolutely shocked when I saw that they do not use semicolons. However, it really does not matter at all. You'll get used to it, I promise. It's not religion, we just want to get stuff done...
There are also Atom-plugins to show and automatically "fix" style violations. That way I didn't even care if I accidentally used double quotes or a semicolon out of habit, because the linter fixed it for me.
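
For the "wait for several promises to finish" case mentioned in the Q bullet above, native Promise.all may already be enough in simple situations. Here is a minimal sketch; fakeRequest is a hypothetical stand-in for a real asynchronous call and not part of any library. Q offers more than this (Q.allSettled, for example), so treat it as an illustration only.

```javascript
// Hypothetical stand-in for a real asynchronous call (e.g. an HTTP request).
function fakeRequest (name, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(name + ' done') }, delayMs)
  })
}

// Resolves once all promises have resolved; rejects as soon as one rejects.
function loadAll () {
  return Promise.all([
    fakeRequest('users', 20),
    fakeRequest('orders', 10)
  ])
}

loadAll().then(function (results) {
  // Results keep the order of the input array, not the order of completion.
  console.log(results) // [ 'users done', 'orders done' ]
})
```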

This is usually enough to get started. Of course, depending on the project you might include a few more related modules, but I generally try to keep the number of dependencies low. Most importantly, I try to avoid frameworks like lodash, etc. because they're usually used like a black box that magically does things, without the developer knowing about the performance or security implications. That's why it took me so long to finally use express, but in that case it saves you from lots of stupid boilerplate code (request routing) without being an unnecessarily thick layer.

Someone has to be able to use the application after all (aka client-side)

I'm really not a designer, and would never dare to call myself a Frontend Developer. I can get decent things done, but using CSS is like black magic for me, which is why I'm still using br-tags instead of setting margin on elements...

Anyway, I recently started using Material Design Web Components for all webapps I'm working on! The documentation is far from perfect, which makes getting started really hard for an inexperienced Web Developer. In fact it took me three tries (over the span of a few months) until I finally understood how to use it properly and felt comfortable using it in my own projects. The thing that was not obvious to me from the beginning was that there is additional documentation for each component inside the respective folder (doh!). So after checking out the Getting Started Guide, make sure to look at each component's documentation.

For small webapps which are only used internally inside a company it's usually fine to include each Javascript- and CSS-file in your HTML and call it a day. If the project gets any bigger and you care about the nerves of the poor guy who has to maintain your project after you quit, you should give webpack a try. I love that it allows you to declare dependencies between Javascript, CSS and even HTML if need be. Moreover, you can start using npm for installing frontend dependencies, which is a huge step forward for everyone involved. Again, the documentation is really hard to follow for inexperienced developers and it took me quite some time to figure out the basic setup I needed. I recommend taking a look at the obvious "Getting Started" for Javascript and "Code Splitting CSS" for CSS.


It's still hard for me to put the way I create applications into words, but I share a lot of articles related to that on Google+. Here are a few things to know:
  • classes ending with "Bridge" are generally wrappers for some kind of logic, but nothing that involves UI. TimeBridge, SqlBridge, etc.; you get the idea.
  • classes ending with "Component" wrap a UI component and all its logic.
Without webpack, every class is wrapped in an IIFE in order to not pollute the global namespace by exposing internal methods and variables. With webpack the class is simply exported using standard syntax.
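
To make the IIFE pattern a bit more concrete, here is a minimal sketch of what a "Bridge" without webpack could look like. The contents of TimeBridge (pad, formatTime) are made up for illustration and are not taken from the sample project.

```javascript
// Without webpack: the IIFE keeps helpers private and exposes only the public API.
// With webpack, the wrapper would be dropped and the object exported instead
// (e.g. module.exports = { formatTime: formatTime }).
const TimeBridge = (function () {
  // Private helper, not reachable from outside the IIFE.
  function pad (value) {
    return value < 10 ? '0' + value : String(value)
  }

  // Public method: formats a Date as HH:MM in local time.
  function formatTime (date) {
    return pad(date.getHours()) + ':' + pad(date.getMinutes())
  }

  return { formatTime: formatTime }
})()

console.log(TimeBridge.formatTime(new Date(2017, 3, 22, 9, 5))) // 09:05
```

Only TimeBridge ends up in the enclosing scope; pad stays invisible to the rest of the page.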

Those are the most important things to know when reading my code. There are some more subtle things which I won't go into in detail now (instead, take a look at the articles posted on Google+).


I love Google Cloud, and I enjoyed working with AppEngine a year ago. But for node.js development Heroku is still the way to go if you ask me. The deployment process via git is too good to be true!
(Please note that I've only used Heroku for single-instance hobby projects. For anything serious I'd probably not recommend Heroku but a multi-instance setup with a load balancer and autoscaler.)

Putting it all together

You can find a sample project which uses all of the above on GitHub: either with webpack or without it.

The apps are hosted here (with webpack) and here (without webpack) for demonstration purposes.

Sunday, April 2, 2017

How to use ZAP proxy to find vulnerabilities in WebGoat

So you're starting to become a "Penetration Tester" or "Web Application Security Expert"? Maybe you first look at how to discover an XSS vulnerability? That's great and all of it is necessary, but learn to use existing tools to your advantage and save yourself from repetitive work. Introducing ZAP proxy from the magnificent OWASP community. It "is one of the world’s most popular free security tools", so you'd better know how to use it! Another very popular alternative is Burp Suite, but I prefer to use open-source tools where possible.

Let's take a look at how to use a tool like ZAP to find vulnerabilities in a purposefully vulnerable demo project: WebGoat is another project by OWASP, which is "designed to teach web application security lessons". In this case we're using NodeGoat since I'm currently focusing on Node.js development.

Install and start ZAP, which automatically starts a local proxy at port 8080. Configure your browser or your operating system to use that for both HTTP and HTTPS.
[Screenshot: configuring a proxy on Ubuntu 16.10]
Now we're ready to rumble. Before ZAP can start with the hard work we'll have to show it around the webapp we want to attack. Open your own instance of NodeGoat in a browser and trigger all available pages and features once: signup, login, logout, changing profile, etc. You should end up with a "sitemap" in ZAP now. If not, you didn't configure your browser to use the proxy!
[Screenshot: sitemap generated by ZAP after visiting NodeGoat in a browser]
While you were browsing NodeGoat, ZAP inspected all traffic and looked for possible vulnerabilities. This passive scan should mostly uncover some missing security-related HTTP headers. You can see those under "Alerts" in ZAP.
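
To illustrate the kind of finding a passive scan reports, here is a rough sketch of a check for missing security-related headers. The list of header names is my own assumption for illustration and is not ZAP's actual rule set; findMissingSecurityHeaders is a made-up helper.

```javascript
// Assumed list of headers worth checking; names are lower-cased,
// as Node's http module reports them for incoming messages.
const EXPECTED_HEADERS = [
  'x-frame-options',
  'x-content-type-options',
  'content-security-policy',
  'strict-transport-security'
]

// Returns the expected security headers that are absent from a response.
function findMissingSecurityHeaders (responseHeaders) {
  const present = Object.keys(responseHeaders).map(function (name) {
    return name.toLowerCase()
  })
  return EXPECTED_HEADERS.filter(function (name) {
    return present.indexOf(name) === -1
  })
}

// Only X-Frame-Options is set here, so the other three are reported.
console.log(findMissingSecurityHeaders({
  'Content-Type': 'text/html',
  'X-Frame-Options': 'DENY'
}))
```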

Our part is done now, let's get ZAP working! In order to start an active scan, right-click on the entry of your webapp in the sitemap, click "Attack" and "Active Scan". You can fine-tune the scan by clicking "Show advanced options" and looking at all available options. Since we're using our own instance of NodeGoat it's safe to crank up all settings. Most importantly, enable "HTTP Headers" and "Cookie Data" under "Input Vectors". This tells ZAP to try known combinations of headers in order to exploit a vulnerability.
[Screenshot: starting an active scan in ZAP]
[Screenshot: configuring an active scan in ZAP]

Again, inspect the "Alerts" tab for an explanation of all problems discovered so far.

What we found so far are mostly vulnerabilities on the public-facing side of our webapp (read: unauthenticated sections). This is great for starters, but there's so much more potential to find a vulnerability in authenticated sections of a webapp! Unfortunately this is quite hard to scan in ZAP because it involves some more configuration on our end. If you're interested in how to do that, check out this video: ZAP Tutorial - Ajax Spidering authenticated websites

Sunday, October 9, 2016

Compute Engine 101: Auto-updating instance

We're using Google Compute Engine to run one-off tasks which require strong machines (32 cores), then shut down automatically after completing the task. Previously I started those machines manually, connected to them, ran the task and waited for it to complete in order to stop the instance afterwards. It took some time until it annoyed me enough to automate all the things, but here's how it works now.

After following the instructions you'll also be able to view your instance's logs via Google Cloud Logs Viewer, which is a huge advantage! You could say we're creating a more flexible version of Heroku (for one-off tasks) here. :)

Compute Engine

Source Repository

Mirror your existing git repository using Google Cloud Repositories so you can easily access it from within Compute Engine without fiddling around with SSH keys. The code in this example assumes you call the repository "worker", but you can call it whatever you want and adapt the code accordingly.

Instance Setup

Create your instance as you would normally do, but be careful when configuring "Identity and API access". Default access does not include access to Google Cloud Source Repositories, so you have two options:
  1. choose "Allow full access to all Cloud APIs". This might be dangerous because it, well, gives the instance access to all APIs and therefore allows a hacker to wreak havoc if they get access to that instance.
  2. create a new service account and give it access to "Compute Instance Admin", "Logs Writer" and "Source Repositories Reader" 

After creating the instance connect to it and install your dependencies. Here's my script installing Java, Maven and Google Cloud Logging:

Afterwards shut down your instance and configure a startup-script. Here's mine, which fetches the latest code from git and executes the program via Maven:

That's it! Your program now runs at its latest version every time you start the instance.


Implement a servlet on AppEngine, which starts the Compute Engine instance using google-cloud-java:

You could, for example, trigger that servlet via a cronjob every day if need be. Definitely make sure to properly authenticate requests to this servlet as you don't want strangers / hackers to start your instance without you knowing.

Wednesday, August 31, 2016

Bugs From Hell: Web Developer Edition

Every now and then I have the pleasure of encountering bugs which leave me absolutely clueless even after thinking about all sane possibilities. Here's one of those:

We have a landing page where the user has to enter their zip code (which is stored in cookies) and is forwarded to the webshop afterwards. In the webshop we read the zip code from the cookies at page load and fetch data accordingly (some products are not available in some areas). Every now and then no data was fetched. Why? Because the zip code was not set in cookies. Wait what? I made sure the cookie was set correctly after entering the zip code, but after page load it was not there anymore. As soon as I set a breakpoint before the cookie was accessed the problem disappeared, so I knew it was some timing problem. The cookie was always accessible after the page was fully loaded. Was it the browser persisting the cookie "too slowly"? No way, it's all synchronous.
Suddenly the solution struck me (it was one of those Matrix-moments where I was not in control of my mind)! On the landing page we're using prerender. This had to cause the problem (I still had no clue how exactly at this point)! Some googling revealed: yes it does. The browser tries to resolve cookie conflicts for you, but in this case it seems it only made things worse.

Here are some thoughts from the Chrome developers on that topic.

And here's someone else experiencing a similar (the same?) issue.

While debugging this insanity I came across a really cool library which allowed me to cross out some possibilities quite fast. It allows you to halt your code whenever a variable is accessed or modified (without changing your code!). This way I ruled out that some third-party code was modifying my cookies.
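
The general idea behind such a tool can be sketched with Object.defineProperty. This is my own simplified illustration and not the library's actual code; watchProperty and the settings object are made up. Something like document.cookie is an accessor property on the prototype, so a real implementation has to do more than this.

```javascript
// Replaces a data property with a getter/setter pair that reports every access.
function watchProperty (obj, prop, onAccess) {
  let value = obj[prop]
  Object.defineProperty(obj, prop, {
    configurable: true,
    enumerable: true,
    get: function () {
      onAccess('read', value) // a `debugger` statement here would halt execution
      return value
    },
    set: function (newValue) {
      onAccess('write', newValue)
      value = newValue
    }
  })
}

const settings = { zipCode: '1010' }
const accessLog = []
watchProperty(settings, 'zipCode', function (type, value) {
  accessLog.push(type + ':' + value)
})

settings.zipCode = '8010'
console.log(settings.zipCode) // 8010
console.log(accessLog) // [ 'write:8010', 'read:8010' ]
```

The code that reads or writes settings.zipCode does not have to change, which is what makes this approach handy for catching third-party code in the act.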

Generally speaking, when debugging the most important thing is to reproduce the problem yourself. You might not be able to reproduce it consistently (I couldn't either), but you get a good idea of what could possibly be wrong. Afterwards you have to come up with possible causes for that error. Make sure not to rule something out because you think "it's impossible". Everything is possible, it's an unforeseen bug after all! Next, work through the possible causes you came up with, one after another. Using a lot of knowledge about the parts involved (libraries, browser, etc.) you will eventually come up with a solution. :)

PS: In case you have a similar problem but don't use prerender, it's probably someone who is reading the whole cookie string (and potentially modifying it) and writing it all back at once afterwards, instead of adding / modifying just their own cookie. Again, the above library quickly ruled that possibility out for me.

Friday, June 3, 2016

AppEngine 101: Datastore Consistency

The beauty of cloud solutions like AppEngine and its database, called Datastore: it just scales. It does indeed scale very well, but it does so by applying a few restrictions. In the case of Datastore that is "eventual consistency", something unfamiliar if you're used to conventional databases like MySQL.

What does eventual consistency mean?

Here's a really simple example to describe it: You have a table called Messages where you store messages sent by the users of your website (a chatroom, or guestbook, etc). When the page is reloaded you query all data from the Messages-table and display it. Someone enters a message and it's stored in the database. The page is reloaded moments later and all Messages are looked up using a query. The recently stored message does not show up though. Because eventual consistency!

After changing (creating, updating, deleting) data in your database, queries executed moments later might (!) not return the latest data in some cases. Eventually, though, it is going to return those changes. It might be nanoseconds, seconds, ... later.
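
Here's a toy model of that behaviour. It is only meant to make the effect tangible and says nothing about how Datastore works internally; createEventuallyConsistentStore and its methods are made up for this sketch.

```javascript
// Toy model: writes go to a primary copy, queries read from a replica
// that only catches up when replicate() runs.
function createEventuallyConsistentStore () {
  const primary = []
  const replica = []
  return {
    put: function (message) { primary.push(message) },
    query: function () { return replica.slice() },
    replicate: function () {
      // Copy over everything the replica has not seen yet.
      for (let i = replica.length; i < primary.length; i++) {
        replica.push(primary[i])
      }
    }
  }
}

const messages = createEventuallyConsistentStore()
messages.put('hello')
console.log(messages.query()) // [] (the write is not visible yet)
messages.replicate()
console.log(messages.query()) // [ 'hello' ]
```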

Your first thought might be that this is awful; however, it isn't. It is what allows us to scale virtually infinitely. Eventual consistency is completely fine for lots of use cases: Facebook News Feed (who cares if those status updates show up a few moments earlier or later?), or even static data (a shop which changes its product assortment only once a week during a maintenance timeframe).
Of course there are times where consistent data is crucial: everything involving real money flowing, mission-critical data used for real-time status monitoring, etc. This is why Datastore has "Transactions". Every database action executed within a transaction is consistent. However, if consistency can't be assured, for example because someone else is changing that data at the same time, the transaction fails and you have to retry it.
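
The "retry" part could look roughly like this. It's a generic, synchronous sketch and does not use the Datastore API; runWithRetry and the simulated transaction are made up for illustration.

```javascript
// Runs `transaction` until it succeeds or `maxAttempts` is reached.
function runWithRetry (transaction, maxAttempts) {
  let lastError
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return transaction(attempt)
    } catch (err) {
      lastError = err // e.g. a conflict with a concurrent write
    }
  }
  throw lastError
}

// Hypothetical transaction that conflicts twice before it succeeds.
const result = runWithRetry(function (attempt) {
  if (attempt < 3) throw new Error('conflict')
  return 'committed on attempt ' + attempt
}, 5)

console.log(result) // committed on attempt 3
```

In a real application you'd probably also wait a little between attempts so that competing writers don't collide again right away.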

For a much more detailed explanation check out this article: Balancing Strong and Eventual Consistency with Google Cloud Datastore