HPass - HTML Parser
📄 Description
HPass - HTML Parser is an extension that can be used for fetching, extracting, and parsing data.
🧩 Documentation (Blocks)
Create Connection
Creates a connection to the specified website URL.
Connection Timeout
Configures the total connection timeout duration to get the document. The timeout is specified in milliseconds. The default timeout is 30 seconds (30,000 milliseconds). A timeout of zero is treated as an infinite timeout.
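The timeout convention described above (milliseconds, with zero meaning no limit) can be sketched in Python. This only illustrates the rule and is not the extension's code. The helper name `to_seconds` is hypothetical, and rejecting negative values is an assumption.

```python
from typing import Optional

DEFAULT_TIMEOUT_MS = 30_000  # default stated above: 30 seconds


def to_seconds(timeout_ms: int = DEFAULT_TIMEOUT_MS) -> Optional[float]:
    """Convert a millisecond timeout to seconds; 0 means wait forever (None)."""
    if timeout_ms < 0:
        # Assumption: negative timeouts are treated as invalid.
        raise ValueError("timeout must not be negative")
    return None if timeout_ms == 0 else timeout_ms / 1000


print(to_seconds())   # 30.0
print(to_seconds(0))  # None
```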
Connection User-Agent
Configures the request User-Agent Header. Some websites do not allow access to their website unless the User-Agent is specified.
Connection Cookies
Adds each of the cookies to the connection. The cookies parameter must be a dictionary that maps each cookie name to its value.
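As a rough illustration of what a cookie dictionary becomes on the wire, the Python sketch below joins name/value pairs into a single Cookie header value. The helper name `cookie_header` and the sample cookies are hypothetical, and no escaping of special characters is attempted.

```python
def cookie_header(cookies: dict) -> str:
    """Join cookie name/value pairs into one Cookie header value."""
    # Assumption: names and values are already safe to send as-is.
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


print(cookie_header({"session": "abc123", "theme": "dark"}))
# session=abc123; theme=dark
```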
Connection Parameters
Adds each of the parameters to the connection. The parameters must be a dictionary that maps each parameter name to its value. Connection parameters are sent in the request as a query string.
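To show how a parameter dictionary turns into a query string, here is a minimal Python sketch using the standard library. The helper name `with_query` and the example URL are hypothetical, and the extension may encode values slightly differently.

```python
from urllib.parse import urlencode


def with_query(url: str, params: dict) -> str:
    """Append params to url as a percent-encoded query string."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)


print(with_query("https://example.com/search", {"q": "html parser", "page": 2}))
# https://example.com/search?q=html+parser&page=2
```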
Connection Follow Redirects
Configures whether the connection should follow server redirects. By default this is true.
Connection Headers
Adds each of the headers to the connection. Headers must be a dictionary with the name of the header and its value.
Connection Ignore HTTP Errors
Configures the connection to not throw errors when an HTTP error occurs (for example, 404 or 500). By default, this is set to false. If set to true, the returned HTML may contain the error and its description, if any.
Connection Ignore Content-Type
Configures the connection to not throw errors when the response content type is not supported. If set to true, the document's content type is ignored and the extension tries to parse the document regardless.
Connection Proxy
Configures the HTTP proxy to use for this request.
Connection Referrer
Configures the referrer header.
Connection Request Body
Configures the request body. It is useful when a server expects a plain request body.
Get Document
Executes a GET request with the specified connection ID. Once the document is received, GotDocument is triggered.
Parse
Parses the given HTML.
Get Body
Returns the body element.
Get Head
Returns the head element.
Get Title
Returns the title of the website.
Is Charset Supported
Checks whether the specified charset is supported for the website.
Pretty Print
Returns whether pretty print is enabled.
Get Whole Text
Returns the (unencoded) text of all children of this element, including any newlines and spaces present in the original.
Get Outer HTML
Returns the outer HTML of this element. For example, on a p element, this may return <p>Kodular</p>.
Tag Name
Returns the name of the tag for this element. Example: div.
Get Parent
Returns this element's parent element.
Get ID
Returns the id attribute of this element.
Own Text
Returns the (normalized) text owned by this element only; does not get the combined text of all children. For example, given HTML <p>Hello <b>Amazing </b>Kodular!</p>, returns "Hello Kodular!".
Note: The text within the b element is not returned, as that text is not a direct child of the p element.
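The own-text rule can be sketched with Python's standard `html.parser`. This is not how the extension is implemented; it is a small illustration that keeps only the text sitting directly inside the outermost element. The names `OwnTextParser` and `own_text` are hypothetical, and the sketch assumes every tag is explicitly closed (no void tags such as br).

```python
from html.parser import HTMLParser


class OwnTextParser(HTMLParser):
    """Collects text that sits directly inside the outermost element."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        if self.depth == 1:  # text owned by the outermost element only
            self.parts.append(data)


def own_text(html: str) -> str:
    parser = OwnTextParser()
    parser.feed(html)
    parser.close()
    # Normalize whitespace: collapse runs and trim the ends.
    return " ".join("".join(parser.parts).split())


print(own_text("<p>Hello <b>Amazing </b>Kodular!</p>"))  # Hello Kodular!
```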
Node Name
Returns the node name of this node. Mainly used for debugging purposes.
Normal Name
Returns the normalized name of this element's tag. This will always be the lowercased version of the tag, regardless of the tag case. For example, <DIV> and <div> both have a normal name of div.
Is Block
Checks whether this element is a block-level element. For example, <div> returns true, while an inline element such as <span> returns false.
Matches
Checks whether this element matches the given query. Also known as matches() in the Web DOM.
Has Attribute
Checks whether the element has the specified attribute. This is case insensitive.
Has Text
Checks whether the element has any text content (that is not just whitespace).
Has Class
Checks whether this element has the specified class. This is case insensitive.
Get Absolute URL
Returns an absolute URL from an element attribute whose URL may be relative (such as an <a href> or <img src>). This is case insensitive.
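Resolving a possibly relative attribute value against the page URL can be illustrated with Python's `urllib.parse.urljoin`. The helper name `absolute_url` and the example URLs are hypothetical; the sketch only shows the general resolution rule, not the extension's internals.

```python
from urllib.parse import urljoin


def absolute_url(base_url: str, attribute_value: str) -> str:
    """Resolve an href/src value against the URL the document came from."""
    return urljoin(base_url, attribute_value)


base = "https://example.com/blog/post.html"
print(absolute_url(base, "images/logo.png"))      # https://example.com/blog/images/logo.png
print(absolute_url(base, "/about"))               # https://example.com/about
print(absolute_url(base, "https://kodular.io/"))  # https://kodular.io/
```

Values that are already absolute pass through unchanged, which is why the block can be applied to any href or src without checking first.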
Get Attribute Value
Returns an attribute’s value by its specified key.
Get Element Sibling Index
Returns the index of this element in its element sibling list.
Note: If this is the first element sibling, returns 0.
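The zero-based element sibling index can be pictured with Python's `xml.etree.ElementTree`, which counts only element children, not text. This is a sketch on a well-formed snippet; the helper name `element_sibling_index` is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<ul><li>one</li><li>two</li><li>three</li></ul>")


def element_sibling_index(parent: ET.Element, child: ET.Element) -> int:
    """Position of child among its parent's element children (first is 0)."""
    return list(parent).index(child)


items = list(root)
print(element_sibling_index(root, items[0]))  # 0
print(element_sibling_index(root, items[2]))  # 2
```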
Get Data
Returns the combined data of this element. Data should be used for the contents of scripts, comments, CSS styles, etc.
Get Data Set
Returns the specified element's HTML5 custom data attributes. Each attribute in the element that has a key starting with data- is included in the dataset. It returns a dictionary with keys and values.
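A small Python sketch of the dataset idea: filter an element's attributes down to those whose key starts with data-. The names `DatasetParser` and `dataset` are hypothetical. Stripping the data- prefix from the returned keys is an assumption here; the description above does not say which form the extension returns.

```python
from html.parser import HTMLParser

PREFIX = "data-"


class DatasetParser(HTMLParser):
    """Captures the data-* attributes of the first element it sees."""

    def __init__(self):
        super().__init__()
        self.dataset = None

    def handle_starttag(self, tag, attrs):
        if self.dataset is None:  # only look at the first element
            self.dataset = {
                name[len(PREFIX):]: value  # assumption: prefix is stripped
                for name, value in attrs
                if name.startswith(PREFIX)
            }


def dataset(html: str) -> dict:
    parser = DatasetParser()
    parser.feed(html)
    parser.close()
    return parser.dataset or {}


print(dataset('<div id="card" data-user-id="42" data-role="admin"></div>'))
# {'user-id': '42', 'role': 'admin'}
```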
Get Number Of Attributes
Returns the number of attributes that the specified element has.
Get Classes
Returns all of the specified element's class names.
Get HTML
Returns the specified element’s inner HTML.
Get Text
Returns the normalized, combined text of this element and all its children. Whitespace is normalized and trimmed.
Note: This returns the textual content that would be presented to a reader.
Get Attributes
Returns a dictionary of attributes with key and value of each attribute.
Convert Document to Element
Converts the document to an element, which is helpful for various functions.
Get Elements By Sibling Index Greater Than
Returns the list of elements whose sibling index is greater than the specified index.
Get Elements By Sibling Index Lesser Than
Returns the list of elements whose sibling index is less than the specified index.
Get Elements By Sibling Index Equals
Returns the list of elements whose sibling index is equal to the specified index.
Get Elements Containing Text
Returns the list of elements that contain the text specified. The search is case insensitive. The text may appear directly in the element, or in any of its descendants.
Get Elements By Class
Returns the list of elements which have the specified class, including or under this element. This is case insensitive.
Get Elements By Attribute Value Prefix
Returns the list of elements that have an attribute with the specified key whose value starts with the specified prefix. This is case insensitive.
Get Elements By Attribute Value Not
Returns the list of elements that either do not have the specified attribute or have it with a different value. This is case insensitive.
Get Elements By Attribute Value Suffix
Returns the list of elements that have the specified attribute which ends with the specified suffix value. This is case insensitive.
Get All Elements
Returns the list of elements under this element (including self, and children of children).
Get Elements By Attribute
Returns the list of elements that have this attribute key.
Get Elements By Attribute Key Prefix
Returns the list of elements which have an attribute name starting with the specified prefix. You may use data- to find elements that have HTML5 datasets.
Get Elements By Attribute Value Containing
Returns the list of elements that have attributes whose value contains the match string. This is case insensitive.
Get Elements By Attribute Key Value
Returns the list of elements that have an attribute with the specified key and value. This is case insensitive.
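The attribute-value blocks above (prefix, suffix, containing, key value, and not) differ only in how the attribute value is compared. The Python sketch below shows those comparisons on a plain dictionary of attributes. The function name `matches_attribute`, the mode names, and the sample attributes are all hypothetical, and attribute keys are assumed to be stored in lowercase.

```python
def matches_attribute(attrs: dict, key: str, expected: str, mode: str) -> bool:
    """Case-insensitive attribute value test; attrs maps lowercase keys to values."""
    actual = attrs.get(key.lower())
    if mode == "not":
        # True when the attribute is missing or has a different value.
        return actual is None or actual.lower() != expected.lower()
    if actual is None:
        return False
    a, e = actual.lower(), expected.lower()
    if mode == "equals":
        return a == e
    if mode == "prefix":
        return a.startswith(e)
    if mode == "suffix":
        return a.endswith(e)
    if mode == "containing":
        return e in a
    raise ValueError(f"unknown mode: {mode}")


link = {"href": "https://Kodular.io/docs/index.HTML", "rel": "nofollow"}
print(matches_attribute(link, "href", "HTTPS://", "prefix"))     # True
print(matches_attribute(link, "href", ".html", "suffix"))        # True
print(matches_attribute(link, "href", "kodular", "containing"))  # True
print(matches_attribute(link, "target", "_blank", "not"))        # True
```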
Get Element By ID
Returns element whose id is equal to the specified id, including or under this element.
Note: This finds the first matching ID, starting with this element. A document may contain more than one element with the specified id.
Get First Element In Siblings
Returns the first element sibling of this element. This may be the same element as specified.
Get Elements By Tag Name
Returns the list of elements including and recursively under the specified element, with the specified tag name.
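The "including and recursively under" behavior can be seen with Python's `xml.etree.ElementTree`, whose `iter()` also starts at the element itself. This is a sketch on a well-formed snippet, not the extension's code; the helper name `elements_by_tag` is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<div><p>a</p><div><p>b</p></div></div>")


def elements_by_tag(element: ET.Element, tag: str) -> list:
    """Elements with the given tag, starting at element itself, then descendants."""
    return list(element.iter(tag))


print(len(elements_by_tag(root, "div")))            # 2 (root itself plus the nested div)
print([p.text for p in elements_by_tag(root, "p")])  # ['a', 'b']
print(len(list(root.iter())))                        # 4 (all elements, including self)
```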
Select
Returns the list of elements that match the CSS query, with the specified element as the starting context. Matched elements may include the specified element or any of its children. This is querySelectorAll() in the Web DOM.
Error
Triggered when an error occurs.
Got Document
Triggered when the document has been received.
❔ Usage (How to Apply)
Establishing Connection
Parsing
🎥 Examples
Custom Search Engine
Note: Search Results by Google
Retrieve Favicon & Title
Note: Usually the last element is the largest icon. Websites that load their favicon with JavaScript may not work.
⚠️ Information
Results may not be as desired if the website uses JavaScript to load content, as this extension is only a parser and does not run JavaScript.
The Connection functions take time to get the document, depending on the website.
🔗 Reference
📁 Download
v0.1.0.aix (346.2 KB)