Tiny WebCrawler for Laravel using Playwright.
Version 2 has been reworked as a simple package that depends on Playwright. It only implements minimal functionality, since you can use playwright-php/playwright directly.
In addition, version 2.2 now supports the Vercel agent-browser.
- PHP >= 8.3
- Laravel >= 11.x
composer require revolution/salvager
Install Playwright browsers:
vendor/bin/playwright-install --browsers
Or install Playwright browsers with OS dependencies:
vendor/bin/playwright-install --with-deps
Install agent-browser and Chromium globally and run it as a Laravel Process.
Warning
This doesn't work with Vercel or Laravel Cloud. See below.
npm install -g agent-browser
agent-browser install
# Linux
agent-browser install --with-deps
# .env
SALVAGER_AGENT_BROWSER_PATH=/path/to/agent-browser
SALVAGER_AGENT_BROWSER_OPTIONS=
If you want to use a custom Chromium binary such as @sparticuz/chromium, you can specify it in shell environment variables.
AGENT_BROWSER_EXECUTABLE_PATH=/tmp/chromium
# .env
SALVAGER_INSTALL_CHROMIUM="node ./scripts/install-chromium.js"
You can also install agent-browser locally and use it with a cloud provider such as Browserbase or Browser Use.
This should work on Vercel and Laravel Cloud, which cannot install OS deps.
Install in your Laravel project. Requires agent-browser v0.7.6 or later.
npm install agent-browser
# .env
SALVAGER_AGENT_BROWSER_PATH="npx agent-browser"
SALVAGER_AGENT_BROWSER_OPTIONS=
Set the provider configuration in shell environment variables instead of .env.
AGENT_BROWSER_PROVIDER=browserbase
BROWSERBASE_PROJECT_ID="your-project-id"
BROWSERBASE_API_KEY="your-api-key"
AGENT_BROWSER_PROVIDER=browseruse
BROWSER_USE_API_KEY="your-api-key"
Vercel also requires AGENT_BROWSER_SOCKET_DIR.
AGENT_BROWSER_SOCKET_DIR=/tmp/
I have confirmed that it works with Vercel and Browserbase.
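Because these provider variables live in the shell environment rather than in .env, a missing one can be hard to spot. The following is a minimal sketch of a pre-flight check, not part of Salvager: the helper name `missingProviderEnv()` is hypothetical, and the variable names are the ones listed above. It only inspects the current PHP process environment and says nothing about whether agent-browser accepts the values.

```php
<?php

declare(strict_types=1);

/**
 * Return the provider variables that are missing or empty in the current
 * process environment. Hypothetical helper, not part of Salvager.
 *
 * @return list<string>
 */
function missingProviderEnv(string $provider): array
{
    // Variable names as listed in this README for each cloud provider.
    $required = [
        'browserbase' => ['BROWSERBASE_PROJECT_ID', 'BROWSERBASE_API_KEY'],
        'browseruse' => ['BROWSER_USE_API_KEY'],
    ];

    $missing = [];
    // An unknown or empty provider has no known requirements, so nothing is reported.
    foreach ($required[$provider] ?? [] as $name) {
        $value = getenv($name);
        if ($value === false || $value === '') {
            $missing[] = $name;
        }
    }

    return $missing;
}

// Report any gaps for the provider selected in the environment.
$provider = getenv('AGENT_BROWSER_PROVIDER') ?: '';
foreach (missingProviderEnv($provider) as $name) {
    fwrite(STDERR, "Missing environment variable: {$name}\n");
}
```

Running this from the same shell that starts the application is one way to check that the variables are visible to PHP before calling Salvager.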
The browser will be terminated when you exit Salvager::browse(), so please obtain any necessary data within the Salvager::browse() closure. The Page object cannot be used outside of Salvager::browse().
use Revolution\Salvager\Facades\Salvager;
use Playwright\Page\Page;
class SalvagerController
{
    public function __invoke()
    {
        Salvager::browse(function (Page $page) use (&$url, &$text) {
            $page->goto('https://example.com/');
            $page->screenshot(config('salvager.screenshots').'example.png');
            $url = $page->url();
            $text = $page->locator('p')->first()->innerText();
        });
        dump($url);
        dump($text);
    }
}
If you want more control, just launch the browser with Salvager::launch().
use Playwright\Browser\BrowserContextInterface;
use Revolution\Salvager\Facades\Salvager;
/** @var BrowserContextInterface $browser */
$browser = Salvager::launch();
$page = $browser->newPage();
$page->goto('https://example.com/');
// Do something...
// Don't forget to close the browser
$browser->close();
use Revolution\Salvager\AgentBrowser;
use Revolution\Salvager\Facades\Salvager;
Salvager::agent(function (AgentBrowser $agent) use (&$url, &$text, &$html) {
    $agent->userAgent('Chromium');
    $agent->open('https://example.com/');
    $agent->screenshot(config('salvager.screenshots').'agent-test.png');
    $url = $agent->url();
    $text = $agent->text('xpath=//p[1]', '--json');
    $html = $agent->html('css=html');
    // Run any agent-browser command
    $result = $agent->run(command: '', args: '', options: '');
    $agent->close();
});
Since text() and html() use Playwright's page.locator(), using a CSS selector will result in an error if multiple elements are found. If you want to specify one of multiple elements, use XPath.
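One detail when picking a single element with XPath: `//p[1]` selects every `p` that is the first `p` child of its parent, so it can still match several elements on pages where paragraphs sit in different containers. Wrapping the path in parentheses, as in `(//p)[1]`, selects the n-th match in document order instead. The sketch below builds such a selector; `nthElement()` is a hypothetical helper and not part of Salvager, and it assumes the selector string is passed through to Playwright unchanged.

```php
<?php

declare(strict_types=1);

/**
 * Build an XPath selector that matches only the n-th element (1-based)
 * with the given tag name, counted in document order.
 * Hypothetical helper, not part of Salvager.
 */
function nthElement(string $tag, int $n): string
{
    if ($n < 1) {
        throw new InvalidArgumentException('XPath positions are 1-based.');
    }

    // The parentheses make [n] apply to the whole result set,
    // not to each parent's children separately.
    return sprintf('xpath=(//%s)[%d]', $tag, $n);
}

echo nthElement('p', 1), PHP_EOL; // prints "xpath=(//p)[1]"
```

Inside the Salvager::agent() closure this could then be used as `$agent->text(nthElement('p', 2))` to read the second paragraph on the page.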
MIT