Investigate navigation errors when navigation gets redirected or interrupted
Categories
(Remote Protocol :: WebDriver BiDi, task, P2)
Tracking
(Not tracked)
People
(Reporter: whimboo, Unassigned)
References
(Depends on 1 open bug, Blocks 4 open bugs)
Details
(Whiteboard: [webdriver:m17], [wptsync upstream])
Attachments
(4 files, 2 obsolete files)
Originally filed at: https://github.com/puppeteer/puppeteer/issues/12929
We should check what exactly is causing us to fail the navigation command.
Reporter | ||
Comment 1•11 months ago
|
||
The problem here seems to be the redirect to https://twitter.com. Not sure how this is done but it looks like it's not a usual redirect, but maybe they manually call window.location
as well?
Reporter | ||
Comment 2•11 months ago
|
||
Actually using DevTools network monitor the raw response from loading https://x.com shows indeed that window.location.href
is used but there is also a call to replaceState
. Not sure which exactly is executed here.
Reporter | ||
Comment 3•11 months ago
|
||
This bug seems to rely on the same underlying discussion as bug 1914405.
Updated•11 months ago
|
Reporter | ||
Updated•11 months ago
|
Comment 4•10 months ago
|
||
I tried to investigate a bit why nsIRequest.canceledReason didn't give any extra info here.
When the load is interrupted by a window.location update, as pointed out by :Nika, the docshell will stop the load at https://searchfox.org/mozilla-central/rev/cc01f11adfacca9cd44a75fd140d2fdd8f9a48d4/docshell/base/nsDocShell.cpp#9388-9401
// Stop any current network activity.
// Also stop content if this is a zombie doc. otherwise
// the onload will be delayed by other loads initiated in the
// background by the first document that
// didn't fully load before the next load was initiated.
// If not a zombie, don't stop content until data
// starts arriving from the new URI...
if ((mDocumentViewer && mDocumentViewer->GetPreviousViewer()) ||
LOAD_TYPE_HAS_FLAGS(aLoadState->LoadType(), LOAD_FLAGS_STOP_CONTENT)) {
rv = Stop(nsIWebNavigation::STOP_ALL);
} else {
rv = Stop(nsIWebNavigation::STOP_NETWORK);
}
And interestingly, we actually cancel the requests with a proper reason "nsDocLoader::Stop". While this wouldn't be enough to differentiate between a stop due to a navigation and one due to the user canceling the navigation (eg clicking on the cancel icon or pressing escape), it would still be interesting to have this reason available to our Navigation helper.
However the request available from the WebProgress in the parent process is a RemoteWebProgressRequest, which is not the real request, and this request doesn't have the proper canceledReason
set.
Nika, Valentin: do you know if we could get the canceledReason for the real request and set it correctly on the RemoteWebProgressRequest?
[On the other hand, I tried to check if we could reuse our NavigationManager to detect the new navigation, but sadly in the parent process we seem to get the notification about the new navigation after we get NS_ABORTED.]
Comment 5•10 months ago
|
||
(In reply to Julian Descottes [:jdescottes] from comment #4)
Nika, Valentin: do you know if we could get the canceledReason for the real request and set it correctly on the RemoteWebProgressRequest?
RemoteWebProgressRequest
is a tiny type used for web progress listeners in the parent process, so doesn't have any extra details on it beyond the basic mURI
, mOriginalURI
, and mMatchedList
(https://searchfox.org/mozilla-central/rev/90dc3743ea284c258148124dc54dfd123e82586f/dom/ipc/RemoteWebProgressRequest.h#29-32). While the mCanceledReason
exists because of being on nsIRequest
, it is not included in the RequestData
struct sent over IPC (https://searchfox.org/mozilla-central/rev/90dc3743ea284c258148124dc54dfd123e82586f/dom/ipc/PBrowser.ipdl#125-130), and therefore isn't transferred across processes.
If this is something which you think would be valuable to have available on remote web progress notifications, we could add it as an extra piece of information being sent down, but we unfortunately can't easily have a "real" channel available on these progress notifications.
Comment 6•10 months ago
|
||
Thanks for the feedback, let's discuss this during triage today. I think that would allow us to at least craft a better error message, and potentially wait for the next navigation (although our spec currently doesn't support this).
Comment 7•10 months ago
|
||
After discussing we agree it would be great to have the canceled reason on this request, and based on this we will emit a better error message. We should also add a wdspec test for this.
Also we should probably write a script to check how BiDi behaves when navigating to eg the top 1000 sites.
Updated•10 months ago
|
Updated•10 months ago
|
Comment 8•10 months ago
|
||
Interestingly, when writing a wdspec test to reproduce the issue, I also get a failure with chrome:
future = <Future finished exception=UnknownErrorException(unknown error, navigation canceled, as new navigation is requested by scriptInitiated, Error
at new UnknownErrorException (<anonymous>:81:4591)
at <anonymous>:597:6717
at <anonymous>:1:1010
at Array.map (<anonymous>)
at Object.emit (<anonymous>:1:993)
at MapperCdpClient.emit (<anonymous>:17:387)
at #onMessage (<anonymous>:823:2977)
at WindowCdpTransport.window.cdp.onmessage (<anonymous>:888:4763)
at <anonymous>:1:18)>
Maybe this doesn't occur in puppeteer because of specific feature which gets enabled in Chrome by puppeteer?
Comment 9•10 months ago
|
||
And I also get the same when connecting the client from https://juliandescottes.github.io/bidi-web-client/web/ to a regular Chrome instance and doing a navigation to mozilla.org.
So at least the behavior seems consistent in both browsers, although arguably the error message is much more helpful in Chrome.
Comment 10•10 months ago
|
||
The canceledReason is added to the RequestData used to build the
RemoteWebProgressRequest. This canceledReason is useful for consumers such
as WebDriverBiDi which need to emit different error messages depending on
the reason why a specific navigation request was canceled.
Comment 11•10 months ago
|
||
Depends on D223174
Comment 12•10 months ago
|
||
Comment on attachment 9426551 [details]
Bug 1914407 - Add canceledReason to RemoteWebProgressRequest
Revision D223174 was moved to bug 1921119. Setting attachment 9426551 [details] to obsolete.
Reporter | ||
Updated•10 months ago
|
Updated•10 months ago
|
Updated•10 months ago
|
Comment 13•10 months ago
|
||
Cool! so this is an update on firefox that will help me run puppeteer with it as announced here? https://hacks.mozilla.org/2024/08/puppeteer-support-for-firefox/
Updated•10 months ago
|
Reporter | ||
Comment 14•10 months ago
|
||
(In reply to amunizp from comment #13)
Cool! so this is an update on firefox that will help me run puppeteer with it as announced here? https://hacks.mozilla.org/2024/08/puppeteer-support-for-firefox/
You are already able to use Puppeteer with Firefox via the Webdriver BiDi protocol. It's just that in some cases websites do quirky stuff like setting window.location
to a different URL itself to trigger kinda redirect. That currently fails and bug 1921973 got filed for that. Here we are only updating the error message to make it more clear to the user why the call to the browsingContext.navigate
command failed.
Comment 15•9 months ago
|
||
As mentioned at https://github.com/puppeteer/puppeteer/issues/12929#issuecomment-2426520361, the issue actually also affects Puppeteer with Chrome + WebDriver BiDi. Only Chrome + CDP seems to avoid this issue.
Comment 16•9 months ago
|
||
Even with the canceledReason properly added on the request seen from the parent WebProgress, we don't always get the actual canceledReason when a load is interrupted with an NS_BINDING_ABORTED.
A simple test is to use a script doing window.location = someOtherUrl
. If the script is an inline script in the page, the document request will have a canceledReason. However if the script is in a different file and loaded from the page, we will still get a NS_BINDING_ABORTED, but the document request is no longer canceled with a reason, and we can't get any extra information about what triggered the failure.
While we still stop the load from nsDocShell as mentioned in comment 4, whether or not this cancelsWithReason the webprogress' request probably depends on the fact that the channel is still open or not. So we can't reliably use the request canceledReason to get the information.
It would be more reliable to add something to the WebProgress here. If we could have a flag/state which is only set when the load is interrupted by a script.
Olli, our goal is to be able to provide more helpful error messages when a navigation is interrupted because of a JS redirect. Do you think we can add a flag on the parent process ProgressListener to do that?
Comment 17•9 months ago
|
||
Yesterday tried to check the result of navigating with wait=complete to a list of top1000 websites. Out of this test, only 12 websites failed (2 being twitter/x).
Failed navigation to twitter.com Error: NS_BINDING_ABORTED
Failed navigation to x.com Error: NS_BINDING_ABORTED
Failed navigation to dailymotion.com Error: NS_ERROR_ABORT
Failed navigation to bizjournals.com Error: NS_BINDING_ABORTED
Failed navigation to www.gov.br Error: A new navigation interrupted an unfinished navigation
Failed navigation to expedia.co.uk Error: NS_ERROR_ABORT
Failed navigation to soccerway.com Error: NS_BINDING_ABORTED
Failed navigation to reclameaqui.com.br Error: Address rejected
Failed navigation to news.com.au Error: NS_BINDING_ABORTED
Failed navigation to gouv.qc.ca Error: NS_BINDING_ABORTED
Interestingly we see that for www.gov.br we have a better error message.
This one actually comes from the NavigationManager, which is monitoring navigations based on the content process webprogress updates.
The NavigationManager will emit navigation-failed events both when receiving error message for failed navigations due to eg content blocking (NS_ERROR_CONTENT_BLOCKED), but also when it detects a new navigation while an ongoing navigation is still in progress, which can be useful here.
The issue is that our parent-process Navigate.sys.mjs will fail the navigation slightly earlier. One option would be to check in the NavigationManager if we have a currently monitored navigation which matches the one we want to fail. In that case we could skip the NS_BINDING_ABORTED and leverage the upcoming failure from the NavigationManager.
Comment 18•9 months ago
|
||
Comment 19•9 months ago
|
||
Depends on D226770
Updated•9 months ago
|
Comment 20•9 months ago
|
||
The script I used for testing navigation to 1000 top sites can be found at https://gist.github.com/juliandescottes/5c6ab10b1d6b574e4a6d80d0ef65c4d3
For comment 17, a few successful navigations were actually incorrect due to Bug 1926894. Doing another run with a 1000ms timeout between each navigation to workaround the issue gave the following results:
Error | URL |
---|---|
Navigation was aborted by another navigation | https://cedars-sinai.org |
Navigation was aborted by another navigation | https://eonline.com |
Navigation was aborted by another navigation | https://espn.com |
Navigation was aborted by another navigation | https://europa.eu |
Navigation was aborted by another navigation | https://fc2.com |
Navigation was aborted by another navigation | https://go.com |
Navigation was aborted by another navigation | https://gouv.qc.ca |
Navigation was aborted by another navigation | https://moneycontrol.com |
Navigation was aborted by another navigation | https://mozilla.org |
Navigation was aborted by another navigation | https://offerup.com |
Navigation was aborted by another navigation | https://planetofhotels.com |
Navigation was aborted by another navigation | https://rakuten.co.jp |
Navigation was aborted by another navigation | https://soccerway.com |
Navigation was aborted by another navigation | https://thetimes.com |
Navigation was aborted by another navigation | https://tutorialspoint.com |
Navigation was aborted by another navigation | https://twitter.com |
Navigation was aborted by another navigation | https://udn.com |
Navigation was aborted by another navigation | https://uptodown.com |
Navigation was aborted by another navigation | https://vaia.com |
Navigation was aborted by another navigation | https://vice.com |
Navigation was aborted by another navigation | https://wattpad.com |
Address rejected | https://bab.la |
Address rejected | https://newegg.com |
NS_BINDING_ABORTED | https://deepl.com |
NS_BINDING_ABORTED | https://discogs.com |
NS_BINDING_ABORTED | https://findlaw.com |
NS_BINDING_ABORTED | https://techtarget.com |
NS_BINDING_ABORTED | https://vrbo.com |
NS_ERROR_ABORT | https://123rf.com |
NS_ERROR_ABORT | https://bbc.co.uk |
NS_ERROR_ABORT | https://byjus.com |
NS_ERROR_ABORT | https://com.be |
NS_ERROR_ABORT | https://linguee.com |
NS_ERROR_ABORT | https://medlineplus.gov |
NS_ERROR_ABORT | https://urbandictionary.com |
NS_ERROR_ABORT | https://zendesk.com |
NS_ERROR_CONNECTION_REFUSED | https://ouest-france.fr |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, | https://company-information.service.gov.uk SSL_ERROR_BAD_CERT_DOMAIN) |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://barnesandnoble.com |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://canada.ca |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://kompass.com |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://liputan6.com |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://state.mn.us |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SEC_ERROR_UNKNOWN_ISSUER) | https://virginia.gov |
NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_SECURITY, SSL_ERROR_BAD_CERT_DOMAIN) | https://streetdirectory.com |
NS_ERROR_NET_INTERRUPT | https://tokopedia.com |
NS_ERROR_NET_RESET | https://nih.gov |
NS_ERROR_NET_TIMEOUT | https://jalan.net |
NS_ERROR_UNKNOWN_HOST | https://army.mil |
NS_ERROR_UNKNOWN_HOST | https://simplyhired.com |
NS_ERROR_UNKNOWN_HOST | https://toppr.com |
NS_ERROR_WONT_HANDLE_CONTENT | https://github.com |
The results are also slightly inconsistent but overall we have ~20 navigations classified with the new error message as aborted navigations, 5 other NS_BINDING_ABORTED, ~10 NS_ERROR_ABORT, ~10 NS_ERROR_GENERATE_FAILURE, and a few other less frequent errors (Address rejected, NS_ERROR_CONNECTION_REFUSED, NS_ERROR_NET_INTERRUPT, NS_ERROR_NET_RESET, NS_ERROR_NET_TIMEOUT, NS_ERROR_UNKNOWN_HOST, NS_ERROR_WONT_HANDLE_CONTENT).
We should investigate those in follow up bugs.
Comment 21•9 months ago
|
||
Reporter | ||
Comment 23•9 months ago
|
||
Thanks Julian for the update. Maybe we should have a meta bug with all the details from comment 20?
Comment 24•9 months ago
|
||
Backed out for causing (part of commit message)
Comment 26•9 months ago
•
|
||
We have another issue, with 204 No content responses. Those are not properly detected by the navigation manager at the moment, and we don't get a navigation failed event for them.
The reason behind this is that we currently ignore STOP notifications in the NavigationListenerChild if coming from an NS_BINDING_ABORTED, because we assume this is due to a process change and we will get other updates soon. However this is not the case for 204 No Content.
Additionally there is absolutely no difference in the flags or status for the update received for a 204 compared to a NS_BINDING_ABORTED for a process change. Maybe I can get some more information on the request...
Comment 27•9 months ago
|
||
Nika, Olli,
Using a WebProgressListener in content processes I'm having trouble to tell apart some StateChange updates with STATE_STOP
and status Cr.NS_BINDING_ABORTED
.
For the most recent example, I will observe this kind of state update when the HTTP response is a 204 No Content, and in this case it will be the last update for this navigation. But I also get it when there is a process change during a navigation (eg navigating from x.com to youtube.com). But in that case we will later get another state change with STATE_STOP but no NS_BINDING_ABORTED. Because of this, I have been ignoring updates with STATE_STOP and NS_BINDING_ABORTED, thinking I should always get "another" update. But it doesn't seem to always be the case, eg with 204 responses. And the request object is not helpful either in case of 204s because it doesn't implement nsIHttpChannel, and we don't have access to the statusCode.
Is there a way we could know whether the STATE_STOP
is really the final event for a given navigation. I know we don't really have a comprehensive view of a "navigation" on the platform side, but maybe you have some suggestions which would help here.
Comment 28•9 months ago
|
||
(In reply to Julian Descottes [:jdescottes] from comment #27)
Using a WebProgressListener in content processes I'm having trouble to tell apart some StateChange updates with
STATE_STOP
and statusCr.NS_BINDING_ABORTED
.
Where are you attaching the web progress listener? Is this being attached in the parent process through the BrowsingContext's WebProgress, or is it being listened to within a content process through the docshell? Theoretically if you're listening in the parent process you shouldn't be seeing any NS_BINDING_ABORTED statuses during process switches, as they should be suppressed internally. Events like that will appear in content processes as the navigation is being aborted there.
Do you have a link to the code which is seeing these mid-navigation NS_BINDING_ABORTED
notifications?
Comment 29•8 months ago
|
||
(In reply to Nika Layzell [:nika] (ni? for response) from comment #28)
(In reply to Julian Descottes [:jdescottes] from comment #27)
Using a WebProgressListener in content processes I'm having trouble to tell apart some StateChange updates with
STATE_STOP
and statusCr.NS_BINDING_ABORTED
.Where are you attaching the web progress listener? Is this being attached in the parent process through the BrowsingContext's WebProgress, or is it being listened to within a content process through the docshell? Theoretically if you're listening in the parent process you shouldn't be seeing any NS_BINDING_ABORTED statuses during process switches, as they should be suppressed internally. Events like that will appear in content processes as the navigation is being aborted there.
Do you have a link to the code which is seeing these mid-navigation
NS_BINDING_ABORTED
notifications?
I am attaching the web progress listener in content processes, mainly because the parent process listener misses to send notifications in some edge cases (data uri iframes for instance, at least that was the case when I tested both approaches 2 years ago at https://bugzilla.mozilla.org/show_bug.cgi?id=1756595#c4).
The code catching mid navigation NS_BINDING_ABORTED is at https://searchfox.org/mozilla-central/rev/f732a1651018b7c32002981a9c8b8613975ffbf9/remote/shared/js-window-actors/NavigationListenerChild.sys.mjs#168-177 (but that's content process so not surprising).
Overall maybe we should just stop using content process listeners here and just try to address the issues we have for edge cases with the parent process approach.
Updated•8 months ago
|
Updated•8 months ago
|
Comment 30•4 months ago
|
||
Unassigning and moving to m16.
Note that based on the conversation on https://github.com/w3c/webdriver-bidi/issues/763#issuecomment-2718673242, the expected behavior is now to not throw when a navigation is aborted by a new navigation.
Maybe with this new behavior it will be ok to simply resolve in case of NS_BINDING_ABORTED, but this requires more investigation.
Updated•4 months ago
|
Comment 32•4 months ago
|
||
Per comment 30, we should rediscuss this bug
Updated•4 months ago
|
Updated•2 months ago
|
Updated•26 days ago
|
Updated•25 days ago
|
Description
•