Finding Secrets In Web Archives

Web archives such as archive.org and old cached content are an important trove of data for vulnerability research and bug hunting. Not only can we find old endpoints that may not be maintained as well as their newer versions, but archived pages may also contain sensitive information that, at the time the snapshot was taken, did not represent a big risk or whose risk was not fully understood.

In this tutorial we will show you a very simple technique for finding secrets across many web pages from various web archives, and for doing so efficiently.

Getting Pown

The first step is to get Pownjs on your system. With Node.js and npm already installed, you simply need to run the following command:

$ npm install -g pown

This command will install Pownjs globally so that we can use it in future tutorials. Upgrade frequently to get the latest patches and features, especially for this tutorial. This is easily done with the following command:

$ pown update
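
Before moving on, it can help to confirm that the required executables are actually on your PATH. The snippet below is only a minimal sketch: the require_cmd helper is our own illustration and is not part of Pownjs.

```shell
#!/bin/sh
# require_cmd is a hypothetical helper (not part of Pownjs): it succeeds
# only if the named executable can be found on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check the tools this tutorial relies on; report problems but do not abort.
for tool in node npm pown; do
  require_cmd "$tool" || true
done
```

If any tool is reported as missing, install it before continuing with the rest of the tutorial.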

Exploring Pown Lau

Pown Lau, part of the default distribution of Pownjs, is a library and a command line tool for listing old content from the archive.org, Common Crawl, and AlienVault indexes. The command has a pretty simple syntax. To list all archived URLs from secapps.com we need the following command:

$ pown lau secapps.com

[Image: listing web archives]

You will notice that some of the URLs are duplicated. This is because they come either from different databases or from different archive snapshots. The list can be cleaned up with some shell scripting:

$ pown lau secapps.com | sort | uniq

Alternatively, use the built-in flag for that:

$ pown lau secapps.com -u

Either way, you can use this list as the target for the next step.
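
If you prefer the shell route, the cleanup can be wrapped in a small function and the result saved for later. This is a sketch under our own assumptions: dedupe_urls is a hypothetical helper name, and the sample URLs stand in for real pown lau output.

```shell
#!/bin/sh
# dedupe_urls is a hypothetical helper: it reads URLs on stdin, drops
# blank lines, and prints each unique URL once in sorted order.
dedupe_urls() {
  grep -v '^[[:space:]]*$' | sort -u
}

# Example with made-up sample data in place of live `pown lau` output.
printf '%s\n' \
  'https://example.com/b.json' \
  'https://example.com/a.js' \
  '' \
  'https://example.com/a.js' | dedupe_urls

# With the real tool, the list could be saved for the next step like this:
#   pown lau secapps.com | dedupe_urls > urls.txt
```

Here sort -u does the same job as sort | uniq in a single process, and saving the list to a file avoids querying the archives again on every run.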

Finding Secrets with Pown Leaks

Pown Leaks, also part of the default distribution of Pownjs, is a secrets detection tool. It supports multiple mechanisms to search for strings that look like passwords, tokens and so on. The tool does this searching efficiently and it comes with its own community-supported database.

On the command line you can point the tool either to a directory for recursive search or to a URL whose content will be fetched and searched. For example, the following command will find the test harness leaks we have in the tool database:

$ pown leaks https://raw.githubusercontent.com/pownjs/pown-leaks/master/lib/db/aws.json

[Image: pown leaks cli]

Combining Both Tools

Now that we know how to use each tool individually, let’s combine them to search for leaks across many URLs extracted from the web archives. Our command is very simple:

$ pown lau secapps.com | pown leaks -u -s -

[Image: using pown lau with leaks]

As you can see, even our own domain contains things that look interesting and potentially require further exploration. Don’t worry, we are safe.

Let’s try this technique with a few more targets from popular bug bounty programs:

$ pown lau mail.ru | pown leaks -u -s -

[Image: using the technique on mail.ru]

Here is another one:

$ pown lau uber.com | pown leaks -u -s -

[Image: using the technique on uber]
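
Running the same pipeline by hand for each target quickly gets repetitive. The sketch below loops over a list of domains and stores each result in its own file. The scan_domains function, the OUTDIR variable and the leaks directory name are our own assumptions for illustration; only the pown lau and pown leaks invocation is taken from the commands above.

```shell
#!/bin/sh
# scan_domains is a hypothetical wrapper: for each domain passed as an
# argument it runs the pipeline from this tutorial and writes the output
# to <outdir>/<domain>.txt. OUTDIR defaults to ./leaks (an assumed name).
scan_domains() {
  outdir="${OUTDIR:-leaks}"
  mkdir -p "$outdir" || return 1
  for domain in "$@"; do
    echo "scanning $domain" >&2
    # Same command as in the tutorial; a failure on one target is
    # reported and the loop moves on to the next one.
    pown lau "$domain" | pown leaks -u -s - > "$outdir/$domain.txt" \
      || echo "scan failed for $domain" >&2
  done
}

# Usage:
#   scan_domains secapps.com mail.ru uber.com
```

Keeping one file per target makes it easier to review the findings later and to compare them between runs.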

For best results, you need to experiment with the command line options and select a subset of the available leak detectors. As you may have noticed, some rules are too verbose, and while they can often point to interesting areas of research, in some situations the noise may be too much for the task at hand.
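
Separately from the tool's own options, noisy results can also be trimmed after the fact with standard shell utilities. The following is a minimal sketch that assumes the output is plain, line-oriented text. The filter_noise helper and the noisy.txt file are hypothetical names of our own, not features of Pown Leaks.

```shell
#!/bin/sh
# filter_noise is a hypothetical helper: it reads text on stdin and drops
# every line containing any of the strings listed (one per line) in the
# file given as the first argument. Matching is case-insensitive.
# Assumption: the tool output being filtered is line-oriented text.
filter_noise() {
  # -i: ignore case, -v: invert match, -F: fixed strings, -f: read patterns
  # grep exits with 1 when nothing is left to print; treat that as success.
  grep -i -v -F -f "$1" || [ "$?" -eq 1 ]
}

# Usage, with noisy.txt holding one unwanted string per line:
#   pown lau secapps.com | pown leaks -u -s - | filter_noise noisy.txt
```

This kind of post-filter is a blunt instrument, so it is worth reviewing what it removes before relying on it.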

Conclusion

In this tutorial we have learned about lau and leaks, two tools that come by default with Pownjs. Pownjs is the open source version of much of the technology we have built into the SecApps tools and services. By combining both tools you can hunt for secrets and leaks across many interesting URLs simultaneously.