Issue created Sep 19, 2019 by Administrator (@root, Contributor)

crawler.response becomes null when connecting to specific chrome instance

Created by: carlkentor

What is the current behavior?

Okay, I don't know whether this should be classified as a bug or not. But since I can't find any trace of it in the docs, here it is.

When connecting to a specific Chrome instance with await HCCrawler.connect({ browserWSEndpoint: wsChromeEndpointurl, ...more options go here }), the call to await crawl() fails because the response is null. The issue lies within:

\node_modules\headless-chrome-crawler\lib\crawler.js

async crawl() {
  await this._prepare();
  const response = await this._request(); // 1. Results in null
  console.log(response); // 2. ======> null indeed
  await this._waitFor();
  const [
    result,
    screenshot,
    cookies,
    links,
  ] = await Promise.all([
    this._scrape(),
    this._screenshot(),
    this._getCookies(),
    this._collectLinks(response.url), // 3. ====> Exception gets thrown here
  ]);
  return {
    options: this._options,
    depth: this._depth,
    previousUrl: this._previousUrl,
    response: this._reduceResponse(response),
    redirectChain: this._getRedirectChain(response),
    result,
    screenshot,
    cookies,
    links,
  };
}
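
The exception at step 3 comes from reading response.url while response is null. As a rough illustration only, and not the library's actual fix, a guard along these lines would avoid the dereference. The helper name safeResponseUrl and its fallbackUrl parameter are hypothetical, and the plain objects below merely stand in for whatever _request() returns:

```javascript
// Hypothetical guard, not part of headless-chrome-crawler.
function safeResponseUrl(response, fallbackUrl) {
  // _request() resolved to null in the report above, so check before reading .url.
  if (response === null || response === undefined) {
    return fallbackUrl;
  }
  return response.url;
}

// Plain objects standing in for the value returned by _request().
console.log(safeResponseUrl({ url: 'https://example.com/' }, 'https://fallback.test/'));
console.log(safeResponseUrl(null, 'https://fallback.test/'));
```

A guard like this only hides the symptom; it does not explain why the request yields no response when the crawler is attached to an existing Chrome instance.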

Steps to reproduce

  1. Close all Chrome instances.
  2. Open CMD and run the following to start Chrome with remote debugging enabled:

start chrome.exe --remote-debugging-port=9222

  3. Navigate to http://localhost:9222/json/version and copy the value of "webSocketDebuggerUrl".
  4. Set browserWSEndpoint to the value you just copied.
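
Steps 3 and 4 can also be scripted instead of copying the value by hand. The following is a minimal sketch, assuming Chrome is listening on port 9222 as in step 2; the helper names extractWsEndpoint and fetchWsEndpoint are hypothetical and not part of headless-chrome-crawler:

```javascript
const http = require('http');

// Pull the WebSocket endpoint out of the /json/version payload.
function extractWsEndpoint(body) {
  const info = JSON.parse(body);
  if (typeof info.webSocketDebuggerUrl !== 'string') {
    throw new Error('webSocketDebuggerUrl missing from /json/version response');
  }
  return info.webSocketDebuggerUrl;
}

// Fetch /json/version from a Chrome started with --remote-debugging-port.
function fetchWsEndpoint(port = 9222) {
  return new Promise((resolve, reject) => {
    http.get(`http://localhost:${port}/json/version`, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(extractWsEndpoint(body));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}

// Usage (requires the Chrome instance from step 2 to be running):
//   const HCCrawler = require('headless-chrome-crawler');
//   const browserWSEndpoint = await fetchWsEndpoint();
//   const crawler = await HCCrawler.connect({ browserWSEndpoint });
```

Nothing runs until fetchWsEndpoint() is called, so the lookup can be dropped into an existing script ahead of the HCCrawler.connect call.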

What is the expected behavior?

When launching a clean instance with the

await HCCrawler.launch({
    ...options
})

command, everything works fine and the result comes back as expected. What am I missing here?

Please tell us about your environment:

  • Version: 1.8.0
  • Platform / OS version: Windows 10 / 1803
  • Node.js version: v10.16.3