Just a quick snippet to show how to retrieve the source code of a page from a web extension for Firefox. As a reminder, in web extensions, there are background scripts (that run as a global background process) and content scripts (that run in each tab, along with the loaded HTML pages).
Background scripts cannot directly read the source code of a page: they run in the background and have no access to page content. However, background and content scripts can exchange messages. So the idea is that the background script asks a content script for the source code.
Let’s start with the interesting parts of the manifest:
"background": {
  "scripts": ["path/to/background/script.js"]
},
"content_scripts": [
  {
    "matches": ["<all_urls>"],
    "js": ["path/to/content/script.js"]
  }
],
"permissions": ["tabs"]
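Notice the tabs permission.
A background script cannot reach a content script through the usual runtime messaging: it has to address a specific tab through the tabs API (browser.tabs.sendMessage). Strictly speaking, tabs.query and tabs.sendMessage work without this permission. It is what lets the background script read the URL of the tab (tabs[0].url in the script below).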
Here is the background script.
It requests the source code and then logs it.
// Invoke the function...
getSourceCode();

// ... defined here
function getSourceCode() {
  // Get the page's source code.
  // Background scripts cannot access it directly, so we ask our content
  // script (in the currently active tab) for it, through the tabs API.
  browser.tabs.query({active: true, currentWindow: true}).then(tabs => {
    // Return the inner promise so that its errors reach the catch below.
    return browser.tabs.sendMessage(tabs[0].id, {req: 'source-code'}).then(response => {
      console.log('url = ' + tabs[0].url);
      console.log('source code = ' + response.content);
    });
  }).catch(error => {
    // Happens, for instance, when no content script is running in the active tab.
    console.error('Could not get the source code: ' + error);
  });
}
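The same exchange can also be written with async/await. What follows is only a sketch: fetchSourceCode and its api parameter are hypothetical names, introduced so that the function can be tried outside Firefox with a stand-in object. In the extension, you would pass it the global browser object.

```javascript
// Hypothetical helper (not part of the WebExtensions API).
// `api` is expected to expose tabs.query() and tabs.sendMessage(),
// like the global `browser` object does in a background script.
async function fetchSourceCode(api) {
  // Find the active tab of the current window.
  const tabs = await api.tabs.query({active: true, currentWindow: true});
  if (tabs.length === 0) {
    throw new Error('No active tab found');
  }

  // Ask the content script running in that tab for the source code.
  const response = await api.tabs.sendMessage(tabs[0].id, {req: 'source-code'});
  return {url: tabs[0].url, content: response.content};
}

// In the background script (the guard keeps the snippet harmless elsewhere).
if (typeof browser !== 'undefined') {
  fetchSourceCode(browser)
    .then(page => console.log('url = ' + page.url + ', length = ' + page.content.length))
    .catch(error => console.error('Could not get the source code: ' + error));
}
```

Returning the URL and the content together keeps the logging (or any later processing) out of the messaging code.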
As you can see, we first find the active tab.
Then, we send a request to this tab. Our content script runs in it, as in every tab that matches the manifest. Here is its code. It receives the request and returns the source code, wrapped in a promise.
browser.runtime.onMessage.addListener(request => {
  let response = '';
  if (request.req === 'source-code') {
    response = document.documentElement.innerHTML;
  }
  return Promise.resolve({content: response});
});
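Two details of this listener are worth knowing. innerHTML leaves out the <html> tag itself, and the listener answers every message, even those it does not understand. A possible variant is sketched below. handleRequest is a hypothetical name, and taking the document as a parameter is only a convenience to try the function outside a page. Either way, the result is a serialization of the current DOM, so it can differ from what the server sent, and it does not include the doctype.

```javascript
// Hypothetical helper: builds the answer for a given request.
// Returns undefined for requests it does not recognize, so that this
// listener does not answer in place of another one.
function handleRequest(request, doc) {
  if (request && request.req === 'source-code') {
    // outerHTML includes the <html> element and its attributes.
    return Promise.resolve({content: doc.documentElement.outerHTML});
  }
  return undefined;
}

// In the content script (the guard keeps the snippet harmless elsewhere).
if (typeof browser !== 'undefined') {
  browser.runtime.onMessage.addListener(request => handleRequest(request, document));
}
```

The background script does not change: it still reads response.content.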
That’s it.
The web extension can now work with the source code, for example to analyze it further.