Migration from PhantomJS to Chrome DevTools Protocol

Being a web interface, Cockpit has a comprehensive integration test suite which exercises all of its functionality in a real web browser that is driven by the tests. Until recently we used PhantomJS for this, but there was ever-increasing pressure to replace it.

Why replace PhantomJS?

Phantom’s engine has become really outdated: it does not understand even simple ES6 constructs like Set, arrow functions, or promises, which have been in real browsers for many years; this currently blocks pulling in some new code from the welder project. It also does not understand reasonably modern CSS, which is particularly important for building mobile-friendly pages, so we had to put workarounds for crashes and other misbehaviour into our code. On top of that, development was officially declared abandoned last April.

So about two months ago I started researching possible replacements. Fortunately, Cockpit’s tests are not written directly in JavaScript against the PhantomJS API; instead they use an abstract Browser Python class with methods like open(url), wait_visible(selector), and click(selector). So I “only” needed to reimplement that Browser class and did not have to rewrite the entire test suite.
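To illustrate, here is a minimal sketch of such an abstraction layer. The method names open(), wait_visible(), and click() are the ones mentioned above; the class body and the recording dummy backend are hypothetical, not Cockpit’s actual code:

```python
from abc import ABC, abstractmethod

class Browser(ABC):
    """Abstract driver: tests only talk to this interface, so the
    backend (PhantomJS, CDP, ...) can be swapped out underneath."""

    @abstractmethod
    def open(self, url):
        """Navigate the browser to the given URL."""

    @abstractmethod
    def wait_visible(self, selector):
        """Block until the element matching the CSS selector is visible."""

    @abstractmethod
    def click(self, selector):
        """Click the element matching the CSS selector."""

class RecordingBrowser(Browser):
    """Dummy backend that just records calls; handy for unit tests."""

    def __init__(self):
        self.calls = []

    def open(self, url):
        self.calls.append(("open", url))

    def wait_visible(self, selector):
        self.calls.append(("wait_visible", selector))

    def click(self, selector):
        self.calls.append(("click", selector))
```

A CDP-based implementation then only needs to fill in these methods, while the test suite on top stays untouched.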

Candidates

The contenders currently in the ring that are popular and likely to stay supported for a fair while, with their pros and cons:

  1. Electron. This combines the rendering engine of Chromium with the Node.js runtime. It is widely adopted and used, and relatively compact (much smaller than Chromium itself).

    • pro: It has a built-in REPL for using it interactively (node_modules/.bin/electron -i), and this API is relatively simple and straightforward to use if your test suite drives the browser from an external process, as our Python tests do.
    • pro: If your tests are written in JS, there is Nightmare as a high-level API for Electron. It is really nice and super-easy to get started with: npm install nightmare, write your first test in five lines of JS, done.
    • pro: It has nice features such as verbose debug logging to watch every change, signal, and action that is going on. You can also enable the graphical window to see your test actions fly by, click around, and use the built-in inspector/debugger/console.
    • con: It lags behind the latest Chromium a bit; e. g. the latest chromium-browser in Fedora 27 is v62, while the latest Electron is based on v58. (This might be a good or a bad thing depending on your project; sometimes you actually don’t want to require the very latest browser.)
    • con: It is not currently packaged in Fedora or Debian, so you need to install it through npm (~ 130 MB uncompressed), i. e. almost twice as big as PhantomJS, although the difference in the compressed download is much smaller.
    • con: It does not represent a “real-life” browser, as it runs in its own customized browser environment rather than a stock one. While this should not make much of a difference in theory, there are always little quirks and bugs to be aware of.

  2. Use a real browser (Chromium, Firefox, Edge) with Selenium

    • pro: Gives very realistic results
    • pro: Long-established standard, so most likely will continue to stay around for a while
    • con: Much harder to set up than the other options
    • con: The API is low-level, so you need some helper API on top of it to write tests in a sensible manner.

  3. Use Chromium itself with the DevTools Protocol, possibly in the stripped-down headless variant. You have to use a library on top of the raw protocol: chrome-remote-interface seems to be the standard one, but it is tiny and straightforward.

    • pro: This is becoming an established standard which other browsers are starting to support as well (e. g. Edge)
    • pro: By nature it gives very realistic test results, and you can choose which Chrome version to test against.
    • pro: Chromium is packaged in all distros, so this doesn’t require a big npm download for running the tests.
    • con: Relatively hard to set up compared to Electron or Phantom: you need to manually control the actual chromium process plus your own chrome-remote-interface controller process, and allocate port numbers in a race-free manner (to run tests in parallel).
    • con: The protocol is relatively low-level (roughly comparable to Selenium), so it is not directly suitable for writing tests; you need to build your own high-level library on top of it. (But in Cockpit we already have that.)

  4. puppeteer is a high-level JS library on top of the Chromium DevTools Protocol.

    • pro: Comfortable and abstract API, comparable to Nightmare.
    • pro: It takes care of launching and controlling Chromium, so it is similarly simple to set up as Nightmare or Phantom.
    • con: It does not work with an already installed/packaged Chromium; it bundles its own copy.

After evaluating all of these, my conclusion is that for a new project I can recommend puppeteer. If you can live with pulling in the browser through npm for every test run (CI services like Semaphore cache your node_modules directory, so it might not be a big issue) and are fine with writing your tests in JavaScript, then puppeteer provides the easiest setup and a comfortable, abstract API.

For our existing Cockpit project, however, I eventually went with option 3, i. e. the Chrome DevTools Protocol directly. puppeteer’s own abstraction does not actually help our tests, as we already have the Browser class abstraction; and for our CI system and the convenience of running tests locally it does make a difference whether you can use the already installed/packaged Chromium or have to download an entire copy. I also suspect that my troubles with SSL certificates (see below) would have been much harder or even impossible to solve or work around with puppeteer.
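For illustration, the race-free port allocation mentioned above can be done by letting the kernel pick an unused port. This is a hypothetical sketch, not our actual launcher code; it assumes a chromium-browser binary in $PATH:

```python
import socket
import subprocess

def pick_free_port():
    """Let the kernel choose an unused TCP port by binding to port 0.
    Each parallel test gets its own port without explicit coordination."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def launch_chromium(port):
    """Start a headless Chromium that listens for CDP connections on `port`."""
    return subprocess.Popen([
        "chromium-browser", "--headless",
        "--remote-debugging-port=%d" % port,
        "about:blank",
    ])
```

There is still a small window between closing the probe socket and Chromium binding the port, so a robust runner should retry with a fresh port if the launch fails.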

Interacting with Chromium

The API documentation is excellent, and one can tinker around in the REPL in a simple and straightforward way and watch the results in an interactive Chromium that runs with a temporary $HOME (to avoid interfering with your real configuration):

$ rm -rf /tmp/h; HOME=/tmp/h chromium-browser --remote-debugging-port=9222 about:blank &
$ mkdir /tmp/test; cd /tmp/test
$ npm install chrome-remote-interface
$ node_modules/.bin/chrome-remote-interface inspect

In the chrome-remote-interface shell one can use the CDP commands directly; for example: open Google’s search page, focus the search input, type a query, and check the current URL afterwards:

>>> Page.navigate({url: "https://www.google.de"})
{ frameId: '4521.1' }
>>> Runtime.evaluate({expression: "document.querySelector('input[name=\"q\"]').focus()"})

>>> // type in the search term and Enter key by key
>>> "cockpit\r".split('').map(c => Input.dispatchKeyEvent({type: "char", text: c}))

>>> Runtime.evaluate({expression: "window.location.toString()"})
{ result:
   { type: 'string',
     value: 'https://www.google.de/search?source=hp&ei=T5...&q=cockpit&oq=cockpit&gs_l=[...]' } }

The porting process

After getting an initial idea and feel for how the DevTools Protocol works, the actual porting went in a pretty typical Pareto way: after two days I had around 150 of our ~ 180 tests working, and porting most of the API from PhantomJS to CDP invocations was straightforward. A lot of the remaining test failures were due to “ordinary” flakes and bugs in the tests themselves, and a series of four PRs fixed them.

There were three major issues on which I spent the “other 90%” of the time, though; perhaps this blog post and my upstream bug reports will help other people avoid the same traps:

  • Frame handling: Cockpit is built around the concept of iframes, with each frame representing an “application” in your “server Linux session”. To make an assertion or run a query in an iframe, you need to “drill through” into the desired iframe from the root page DOM. I started with a naïve JavaScript-only solution:

    // current_frame holds the name of the iframe to query, or is unset for the top-level page
    if (current_frame)
      frame_doc = document.querySelector(`iframe[name="${current_frame}"]`).contentDocument.documentElement;
    else
      frame_doc = document;
    

    and then do queries on frame_doc. This actually works well for all but one of our tests, which checks embedding a Cockpit page into a custom HTML page; there this approach (rightfully) fails due to the browser’s same-origin policy.

    So I went ahead and implemented a solution using the DevTools “mirror” DOM and API. It took me three different attempts to get that right, and neither the API documentation nor a Google search was particularly instructive in that regard. This is an area where the protocol really could be improved. I posted my solution and a few suggestions to devtools-protocol issue #72.
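For illustration only (my actual solution is in the linked issue): cross-origin frames can be reached by evaluating in the frame’s own JavaScript execution context. CDP announces contexts via Runtime.executionContextCreated events, whose auxData carries the owning frameId, and Page.getFrameTree maps frame names to frame ids; the resulting context id can then be passed to Runtime.evaluate. The lookup helper below is a hypothetical sketch that operates on already-collected event payloads:

```python
def context_for_frame(frame_tree, contexts, frame_name):
    """Given a Page.getFrameTree result and a list of
    Runtime.executionContextCreated payloads, return the execution
    context id belonging to the iframe with the given name."""
    def find_frame(node):
        # walk the frame tree recursively, matching on the frame's name
        if node["frame"].get("name") == frame_name:
            return node["frame"]["id"]
        for child in node.get("childFrames", []):
            found = find_frame(child)
            if found:
                return found
        return None

    frame_id = find_frame(frame_tree["frameTree"])
    if frame_id is None:
        raise KeyError("no frame named %r" % frame_name)
    for ctx in contexts:
        if ctx["context"]["auxData"].get("frameId") == frame_id:
            return ctx["context"]["id"]
    raise KeyError("no execution context for frame %r" % frame_name)
```

The returned id would go into Runtime.evaluate({expression: ..., contextId: ...}), which works even when the iframe is cross-origin.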

  • SSL client certs: Our OpenShift tests kept failing when the OAuth page came up, but only when using headless mode. I initially thought this was because the OAuth server had an invalid SSL certificate, as the initial error message suggested something like that. But all approaches with --ignore-certificate-errors, a more elaborate use of the Security API, or even actually installing the OAuth server’s certificate didn’t work, which was quite frustrating.

    It finally helped to enable a third kind of log (besides console messages and --enable-logging --v=1), which finally revealed what Chromium was complaining about: OAuth was sending a request to present a client-side SSL certificate, and that simply causes Chromium Headless to throw its hands into the air. As there is no workaround for this in headless mode, I had to bite the bullet and install the full graphical Chromium (plus half a metric ton of X/mesa dependencies) and Xvfb into our test containers, and write the logic to bring these up and down in an orderly and parallel-safe fashion.

  • Silently broken pushState API: One of our tests was reproducibly failing on the infrastructure, and only sometimes locally; the screenshot showed that it clearly was on the wrong page, although the previous navigation requests had caused no error, and single-stepping through them also worked. Peter and I spent about three days debugging this and figuring out why adding a simple sleep(3) at a random place in the test made it succeed.

    It turned out that a few months ago the window.history.pushState() method changed behaviour: when there are too many calls to it (more than 50 within 10 seconds), it ignores the call without returning or logging an error. This was by far the biggest and most frustrating time sink, but after finally discovering it we had a good justification why a static sleep() is actually warranted in this case. (Related upstream bug reports: #794923 and #769592)
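Knowing the limit, the pacing could even be made explicit on the test-driver side instead of a blind sleep(). The following helper is a hypothetical sketch; the 50-calls-per-10-seconds numbers come from the bug above, everything else is an assumption:

```python
import time
from collections import deque

class PushStatePacer:
    """Delay navigation-triggering actions so the page never issues more
    than `limit` history.pushState() calls per `window` seconds, below
    the rate at which Chromium starts silently dropping them."""

    def __init__(self, limit=50, window=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.stamps = deque()  # timestamps of recent calls

    def _purge(self, now):
        # forget calls that have left the rolling window
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()

    def pace(self):
        """Call before each navigation; blocks if the budget is used up."""
        now = self.clock()
        self._purge(now)
        if len(self.stamps) >= self.limit:
            # wait until the oldest recorded call leaves the window
            self.sleep(self.window - (now - self.stamps[0]))
            now = self.clock()
            self._purge(now)
        self.stamps.append(now)
```

Injecting clock and sleep keeps the helper unit-testable without real waiting.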

After figuring all that out, the final patch turned out to be reasonably small and readable. Most of the commits are minor test adjustments for things that could not be implemented exactly as before with the new API. Of course this was preceded by half a dozen preparatory commits to adjust dependencies in containers, fix test races, and the like.

Now that this has landed, we could clean up a bunch of PhantomJS-related hacks, it is now possible to write tests for the mobile navigation, and we can also test ES6 code (such as welder-web). Debugging tests is much more fun now, as you can run them in an interactive graphical browser, watch widgets and pages fly around, and interactively mess around with or inspect them.