A lazy/optimized expression detection method
I recently put together a Chrome extension for a customer who wanted to detect predefined expressions in web pages. Given the size of the targeted pages, he was concerned about performance and did not want the website or the browser to be slowed down by the extra processing involved.
I designed a triple-blade razor content script based on two web APIs that help with the job:
Blade1: to deal with any web page, including dynamic ones, I configured a MutationObserver to detect newly created DOM elements. I'm only interested in elements that contain text nodes (i.e. child nodes with nodeType === Node.TEXT_NODE).
Blade2: to spend CPU time only on useful things, I configured an IntersectionObserver and passed those elements to it. This observer detects the elements that become visible and feeds them to the third blade.
Blade3: finally, a simple job scheduler scans its pipeline every 5 ms and communicates with the background script, which holds the content processing engine (in this particular case, it looks at the words detected and tries to find expressions in a huge dictionary).
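To make the three blades more concrete, here is a minimal sketch of how they could be wired together in a content script. It is not the code shipped to the customer: the names createScheduler, hasTextNode and startPipeline, as well as the { type: "detect" } message shape, are assumptions made up for illustration.

```javascript
// Sketch only: createScheduler, hasTextNode, startPipeline and the
// { type: "detect" } message shape are hypothetical names.

const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE

// Blade 1 filter: does this element have a text node as a direct child?
function hasTextNode(element) {
  return Array.from(element.childNodes).some((n) => n.nodeType === TEXT_NODE);
}

// Blade 3: a queue that is drained in batches on a fixed interval.
function createScheduler(processBatch, intervalMs = 5) {
  const pending = new Set(); // a Set avoids queuing the same element twice
  let timer = null;
  const flush = () => {
    if (pending.size === 0) return;
    const batch = [...pending];
    pending.clear();
    processBatch(batch);
  };
  return {
    add: (item) => pending.add(item),
    flush,
    start: () => { timer = setInterval(flush, intervalMs); },
    stop: () => clearInterval(timer),
  };
}

// Browser-only wiring of the three blades inside a content script.
function startPipeline(root) {
  const scheduler = createScheduler((elements) => {
    // Hand the visible text over to the background script (assumed message shape).
    chrome.runtime.sendMessage({
      type: "detect",
      texts: elements.map((el) => el.textContent),
    });
  });

  // Blade 2: queue elements once they scroll into view.
  const visibility = new IntersectionObserver((entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      visibility.unobserve(entry.target);
      scheduler.add(entry.target);
    }
  });

  // Blade 1: watch for added nodes and keep the elements holding text.
  const mutations = new MutationObserver((records) => {
    for (const record of records) {
      for (const node of record.addedNodes) {
        if (node.nodeType === TEXT_NODE && node.parentElement) {
          visibility.observe(node.parentElement);
          continue;
        }
        if (node.nodeType !== ELEMENT_NODE) continue;
        // A subtree inserted in one go only reports its root, so walk it.
        for (const el of [node, ...node.querySelectorAll("*")]) {
          if (hasTextNode(el)) visibility.observe(el);
        }
      }
    }
  });
  mutations.observe(root, { childList: true, subtree: true });
  scheduler.start();
}

// Only start when running inside an extension content script.
if (typeof document !== "undefined" && typeof chrome !== "undefined") {
  startPipeline(document);
}
```

Draining a Set in batches keeps each 5 ms tick cheap and avoids sending the same element twice when it is reported by several mutation records.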
Hint: to simplify the code and rely only on the MutationObserver to fuel the pipe (even for static pages), the content script is configured to start before the document is ready, i.e. in the manifest.json file:
"run_at": "document_start",
Next work: a bit of optimization to take care of users who scroll very quickly (page by page). This can be done in blade 2 by removing elements from the scheduler pipeline if they have become invisible again before being processed.
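One possible shape for that optimization, as a rough sketch rather than finished code: keep observing an element while it waits in the pipeline, and drop it as soon as the IntersectionObserver reports it as no longer intersecting. The helper name applyVisibilityChanges and the pending Set (standing in for the scheduler pipeline) are assumptions for illustration.

```javascript
// Hypothetical helper: apply a batch of IntersectionObserver entries to the
// set of elements waiting for the scheduler.
function applyVisibilityChanges(entries, pending) {
  for (const entry of entries) {
    if (entry.isIntersecting) {
      pending.add(entry.target);    // became visible: queue it
    } else {
      pending.delete(entry.target); // scrolled away before processing: drop it
    }
  }
  return pending;
}

// Browser-only wiring: the observer keeps watching queued elements so that
// it can report them again when they leave the viewport.
function watchVisibility(pending) {
  return new IntersectionObserver((entries) =>
    applyVisibilityChanges(entries, pending)
  );
}
```

With this approach an element should only be unobserved once the scheduler has actually processed it, otherwise the "became invisible" notification never arrives.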
EDIT: sample add-on code can be found here on GitHub!